Stream-based Data Compression
based on Hardware

Development of Stream-based Lossless Data Compression Technology

【 Research Outline 】
With the rapid growth of data from sources such as networks, video, and sensors, faster communication data paths are in demand. Driven by big data applications, data migration is approaching its technological limit. Conventional data compression technology takes a blocking approach: it processes a block of data only after that block has been stored in memory. This degrades performance when it is applied to data streams. Data compression technology therefore needs to process the data stream directly, without buffering. However, no existing data compression technology processes a data stream entirely without stalls. This research project develops a new stream-based data compression/decompression technology that processes a data stream directly. It will provide modern computing systems with a reliable technology.

1. Background at the beginning of research

The industry needs compression technology to cope with the explosive increase in the amount of data flowing through data transmission lines. Data transmission lines are standardized by specifications such as PCI Express, HDMI, and USB, and their operating frequencies are reaching the GHz order. At such speeds, designing multiple wiring patterns on the printed circuit board and evaluating them through repeated trials during manufacturing is not a scientific approach. It is extremely difficult to verify a data transmission line before it is implemented. To increase the amount of data transferred without complicating the implementation technology, it is an urgent task to develop a hardware compression technology that compresses the data transferred on the line, reduces the frequency and the amount of data, and thereby simplifies the implementation.

On the other hand, existing compression technology urgently needs to shift from blocking compression to stream compression. Until now, compression has been blocking: data is stored in memory and then compressed or decompressed. This approach degrades as the bandwidth of the data transmission increases. Furthermore, even if the compression is accelerated in hardware, the circuit speed must be raised every time the performance of the transmission line improves. This leads to a “cat-and-mouse” situation, and the approach can be expected to fail eventually. It is therefore necessary to shift to stream compression, which can achieve high data density in proportion to the performance of the transmission line. The technology must also be implementable in hardware in a scalable manner.

2. Research objective

Given the above background, the objective of this research is to develop a stream compression technology that compresses the data flowing through a data communication path in real time. The data handled in this study is continuous and flows without stalling, so there is no opportunity to send the symbol look-up table. The compression side and the decompression side must therefore build the same table in real time in order to compress and decompress. To apply this to a hardware data communication path, this research addresses a real-time compression / decompression mechanism for the stream data communication path that operates at high speed with a small amount of resources and can be implemented in hardware. We aimed to develop a communication protocol and a symbol look-up table update mechanism that ranks the frequency of data appearance in real time. To carry out these processes in hardware in a scalable manner, we aimed to develop a method that registers fixed-length symbol strings in a table and replaces the least used symbol when the table capacity is saturated. With such dynamic management, the effective amount of data carried by the path can be increased even if the entropy of the data stream changes. Additionally, with a hardware-based implementation, stream compression is expected to deliver effective throughput above the peak performance of the physical medium.

3. Research outcome

We proposed a new method, LCA-DLT, which overcomes the problems of conventional data compression described in the background above. The new method can be applied to data streams of unbounded length and operates at high speed in hardware. We achieved good performance in both hardware and software implementations. In this compression method, as shown in Fig. 1, (1) a module that compresses two units of data into one is prepared, (2) the module dynamically creates a look-up table, (3) the compression mechanism is managed dynamically based on the look-up table, and (4) the decompression side reconstructs an equivalent look-up table and restores the data. With this technology, the conversion table does not need to be sent to the decompression side, and decompression can proceed continuously as the compressed data stream is received. That is, as soon as the first data is compressed, the compressed data is passed to the decompression side piece by piece, and the decompression side sequentially reconstructs the look-up table from the received data and decompresses it. Compression and decompression can therefore be performed as a stream.
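The four steps above can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the LCA-DLT implementation itself: the `PairTable` class, the `"hit"` / `"miss"` / `"tail"` token names, the default capacity of 4, and the tie-breaking rule for eviction are all hypothetical. The point it shows is that both sides apply the same table updates in the same order, so the table never has to be transmitted.

```python
from typing import Hashable, Iterable, Iterator, List, Optional, Tuple

Pair = Tuple[Hashable, Hashable]
Token = Tuple[str, object]  # ("hit", index) | ("miss", pair) | ("tail", symbol)


class PairTable:
    """Fixed-capacity table of symbol pairs with usage counts (assumed layout)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.pairs: List[Pair] = []
        self.counts: List[int] = []

    def lookup(self, pair: Pair) -> Optional[int]:
        return self.pairs.index(pair) if pair in self.pairs else None

    def touch(self, index: int) -> None:
        self.counts[index] += 1

    def register(self, pair: Pair) -> None:
        if len(self.pairs) < self.capacity:
            self.pairs.append(pair)
            self.counts.append(1)
        else:
            # Table is full: overwrite the least used entry (lowest index wins ties).
            victim = self.counts.index(min(self.counts))
            self.pairs[victim] = pair
            self.counts[victim] = 1


def compress(symbols: Iterable, capacity: int = 4) -> Iterator[Token]:
    """Consume symbols two at a time and emit one token per pair."""
    table = PairTable(capacity)
    stream = iter(symbols)
    for first in stream:
        try:
            second = next(stream)
        except StopIteration:
            yield ("tail", first)  # odd symbol left over at the end of the stream
            return
        pair = (first, second)
        index = table.lookup(pair)
        if index is None:
            table.register(pair)
            yield ("miss", pair)  # pair is sent as-is and registered
        else:
            table.touch(index)
            yield ("hit", index)  # two symbols replaced by one table index


def decompress(tokens: Iterable[Token], capacity: int = 4) -> Iterator:
    """Rebuild the same table from the token stream and restore the symbols."""
    table = PairTable(capacity)
    for kind, value in tokens:
        if kind == "hit":
            pair = table.pairs[value]
            table.touch(value)
            yield from pair
        elif kind == "miss":
            table.register(value)  # mirrors the compressor's update
            yield from value
        else:
            yield value
```

For example, `list(compress("abab"))` yields `[("miss", ("a", "b")), ("hit", 0)]`: the first pair is sent unchanged and registered, and the second occurrence is replaced by its table index. Because both functions are generators, each token can be consumed as soon as it is produced, without buffering the whole input.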

Figure 1: Stream data compression technology LCA-DLT

Additionally, since a compression module compresses two units of data into one, a single module can reduce the data to at most 50% of its original size. By cascading the modules, we developed an innovative compression method that can theoretically compress the data to (1/2)^4 = 1/16 = 6.25% of its original size with 4 stages. As for the academic significance of this research: until now, the mainstream has been data compression methods in which a processor randomly accesses data held in memory. However, random access is impossible for a data stream. The novel mechanism can be expected to improve processing performance for stream data such as sensor data. A new academic field of stream data compression can be expected in the future.
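The best-case arithmetic behind cascading can be checked with a short sketch. The function name `best_case_ratio` is hypothetical, and the figure assumes that every stage replaces every pair with a single symbol, which real data will rarely allow.

```python
from fractions import Fraction


def best_case_ratio(stages: int) -> Fraction:
    """Best-case output/input size ratio when each stage halves the data."""
    return Fraction(1, 2) ** stages


if __name__ == "__main__":
    for stages in range(1, 5):
        ratio = best_case_ratio(stages)
        print(f"{stages} stage(s): {ratio} = {float(ratio) * 100}%")
```

With 1 stage the bound is 1/2 (50%), and with 4 stages it is 1/16 (6.25%).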

4. Further improvements: ASE Coding

Based on the know-how of real-time data compression obtained from LCA-DLT, we have developed a new data compression technique called ASE Coding. ASE Coding is a new method that continuously performs further compression based on the entropy of a data stream in real time. Like LCA-DLT, ASE Coding can be implemented in compact hardware. We have confirmed that ASE Coding can be implemented with about one-fifth the hardware resources of LCA-DLT. Currently, we are developing an application processor for AI and IoT using this hardware.

5. Related papers

[Journal papers]

  1. Shinichi Yamagiwa, Yuma Ichinomiya, Stream-Based Visually Lossless Data Compression Applying Variable Bit-Length ADPCM Encoding, Sensors 21(13) 4602, July 2021.
  2. Shinichi Yamagiwa, Koichi Marumo, Suzukaze Kuwabara, Exception Handling Method Based on Event from Look-Up Table Applying Stream-Based Lossless Data Compression, Electronics 10(3) 240, January 2021.
  3. Shinichi Yamagiwa, Suzukaze Kuwabara, Autonomous Parameter Adjustment Method for Lossless Data Compression on Adaptive Stream-Based Entropy Coding, IEEE Access 8 186890-186903, October 2020.
  4. Shinichi Yamagiwa, Eisaku Hayakawa, Koichi Marumo, Stream-Based Lossless Data Compression Applying Adaptive Entropy Coding for Hardware-Based Implementation, Algorithms 13(7) 159, June 2020.
  5. Koichi Marumo, Shinichi Yamagiwa, Ryuta Morita and Hiroshi Sakamoto, Lazy Management for Frequency Table on Hardware-Based Stream Lossless Data Compression, Information 7(4) 63, MDPI, October 2016.

[Conference proceedings]

  1. Shinichi Yamagiwa, Eisaku Hayakawa, Koichi Marumo, Adaptive entropy coding method for stream-based lossless data compression, In Proceedings of the 17th ACM International Conference on Computing Frontiers 265-268, May 2020.
  2. Shinichi Yamagiwa, Ryuta Morita and Koichi Marumo, Reducing Symbol Search Overhead on Stream-based Lossless Data Compression, In Proceedings of ICCS 2019, LNCS 11540, Springer 619-626, June 2019.
  3. Shinichi Yamagiwa, Ryuta Morita and Koichi Marumo, Bank Select Method for Reducing Symbol Search Operations on Stream-based Lossless Data Compression, Data Compression Conference 2019, March 2019.
  4. Koichi Marumo, Shinichi Yamagiwa, Time-sharing Multithreading on Stream-based Lossless Data Compression, In Proceedings of The Fifth International Symposium on Computing and Networking 305-310, IEEE, November 2017.
  5. Shinichi Yamagiwa, Koichi Marumo and Hiroshi Sakamoto, Stream-based Lossless Data Compression Hardware using Adaptive Frequency Table Management, In Proceedings of VLDB2015/BPOE-6, Springer 133-146, September 2015.
  6. Shinichi Yamagiwa, Hiroshi Sakamoto, A reconfigurable stream compression hardware based on static symbol-lookup table, In Proceedings of the 2013 IEEE International Conference on Big Data 86-93, October 2013.