LZO data compression functions and improvements in Intel® Integrated Performance Primitives

ID 688890
Updated 8/10/2017
Version Latest
Public

Introduction

This document describes the Intel IPP data compression functions that implement the LZO (Lempel-Ziv-Oberhumer) compressed data format. The format and algorithm use a 64 KB compression dictionary and do not require additional memory for decompression. (See the original code of the LZO library at http://www.oberhumer.com.)

Lempel–Ziv–Oberhumer (LZO) is a well-known lossless data compression algorithm that is designed for decompression speed. It is among the fastest compression and decompression algorithms.
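
To make the point about decompression memory more concrete, here is a minimal sketch of LZ77-style decoding, the family of algorithms LZO belongs to. It does not implement the real LZO bitstream: the Token struct and the toy_lz_decode function are hypothetical names invented for this illustration. What it shows is that a match is resolved by copying bytes the decoder has already written, so the output buffer itself serves as the dictionary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token layout, for illustration only.
   Real LZO packs literals and matches into a compact byte stream. */
typedef struct {
    unsigned char literal;  /* byte to emit when length == 0 */
    size_t distance;        /* how far back in the output the match starts */
    size_t length;          /* 0 means literal, otherwise bytes to copy */
} Token;

/* Decodes tokens into out. Returns the number of bytes written,
   or 0 if a token is malformed or the output buffer is too small. */
size_t toy_lz_decode(const Token *tokens, size_t count,
                     unsigned char *out, size_t out_cap)
{
    size_t pos = 0;
    assert(tokens != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        const Token *t = &tokens[i];
        if (t->length == 0) {
            if (pos >= out_cap)
                return 0;
            out[pos++] = t->literal;
        } else {
            /* A match may only refer to bytes that were already produced */
            if (t->distance == 0 || t->distance > pos)
                return 0;
            if (t->length > out_cap - pos)
                return 0;
            /* Copy byte by byte so that overlapping matches
               (distance < length) repeat the pattern correctly */
            for (size_t k = 0; k < t->length; k++) {
                out[pos] = out[pos - t->distance];
                pos++;
            }
        }
    }
    return pos;
}
```

For example, two literals 'a' and 'b' followed by a match with distance 2 and length 4 decode to "ababab" without any state other than the output buffer.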


LZO Example in IPP

Intel IPP implements one of the many LZO methods. It offers a medium compression ratio and very high decompression performance with a low memory footprint.

The code example below shows how to use the Intel IPP functions for LZO compression. It includes both compression and decompression procedures.

Before looking at the Intel IPP LZO functions, review the parameter that Intel IPP defines specifically for LZO functionality.

The LZO coding initialization functions take a special parameter, method. This parameter specifies the level of parallelization and the compatibility with generic LZO to be used in LZO encoding. The table below lists the possible values of the method parameter and their meanings.

Parameter method for the LZO Compression Functions
Value | Description
IppLZO1XST | Compression and decompression are performed sequentially in single-threaded mode, with full binary compatibility with generic LZO libraries and applications.
IppLZO1XMT | Compression and decompression are performed in parallel (multi-threaded mode). This is faster, but not compatible with generic LZO.

Intel IPP provides five functions for LZO. Refer to the links below for details on each supported function.

To learn how to find Intel IPP, set environment variables, integrate with your compiler, and build Intel IPP applications, see Getting Started With Intel IPP.

/* Simple example of file compression using IPP LZO functions */
#include <stdio.h>
#include "ippdc.h"
#include "ipps.h"

#define BUFSIZE 1024
/* Compresses pInFileName into pOutFileName block by block.
   Each block is stored as: original length, compressed length, compressed data. */
void CompressFile(const char* pInFileName, const char* pOutFileName)
{
    FILE *pIn, *pOut;
    IppLZOState_8u *pLZOState;
    Ipp8u src[BUFSIZE];
    /* For incompressible data the output can be larger than the input */
    Ipp8u dst[BUFSIZE + BUFSIZE/10];
    Ipp32u srcLen, dstLen, lzoSize;

    pIn = fopen(pInFileName, "rb");
    if (pIn == NULL)
        return;
    pOut = fopen(pOutFileName, "wb");
    if (pOut == NULL) {
        fclose(pIn);
        return;
    }
    /* Query the state size, then allocate and initialize the encoder state */
    ippsEncodeLZOGetSize(IppLZO1XST, BUFSIZE, &lzoSize);
    pLZOState = (IppLZOState_8u*)ippsMalloc_8u((int)lzoSize);
    if (pLZOState != NULL) {
        ippsEncodeLZOInit_8u(IppLZO1XST, BUFSIZE, pLZOState);
        while ((srcLen = (Ipp32u)fread(src, 1, BUFSIZE, pIn)) > 0) {
            if (ippsEncodeLZO_8u(src, srcLen, dst, &dstLen, pLZOState) != ippStsNoErr)
                break;
            fwrite(&srcLen, 1, sizeof(srcLen), pOut);
            fwrite(&dstLen, 1, sizeof(dstLen), pOut);
            fwrite(dst, 1, dstLen, pOut);
        }
        ippsFree(pLZOState);
    }
    fclose(pIn);
    fclose(pOut);
}
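/* Example of using the DecodeLZO function to decompress the file */
void DecompressFile(const char* pInFileName, const char* pOutFileName)
{
    FILE *pIn, *pOut;
    size_t allocSizeSrc = 0;
    size_t allocSizeDst = 0;
    Ipp32u srcLen, dstLen;
    Ipp8u *pSrc = NULL, *pDst = NULL;

    pIn = fopen(pInFileName, "rb");
    if (pIn == NULL)
        return;
    pOut = fopen(pOutFileName, "wb");
    if (pOut == NULL) {
        fclose(pIn);
        return;
    }
    while (1) {
        /* Header order matches CompressFile: original length first,
           then compressed length */
        if (fread(&dstLen, 1, sizeof(dstLen), pIn) != sizeof(dstLen))
            break;
        if (fread(&srcLen, 1, sizeof(srcLen), pIn) != sizeof(srcLen))
            break;
        /* Grow the buffers only when a block needs more room */
        if (srcLen > allocSizeSrc) {
            if (pSrc != NULL)
                ippsFree(pSrc);
            pSrc = ippsMalloc_8u((int)srcLen);
            allocSizeSrc = (pSrc != NULL) ? srcLen : 0;
        }
        if (dstLen > allocSizeDst) {
            if (pDst != NULL)
                ippsFree(pDst);
            pDst = ippsMalloc_8u((int)dstLen);
            allocSizeDst = (pDst != NULL) ? dstLen : 0;
        }
        if (pSrc == NULL || pDst == NULL)
            break;
        if (fread(pSrc, 1, srcLen, pIn) != srcLen)
            break;
        if (ippsDecodeLZO_8u(pSrc, srcLen, pDst, &dstLen) != ippStsNoErr)
            break;
        fwrite(pDst, 1, dstLen, pOut);
    }
    if (pSrc != NULL)
        ippsFree(pSrc);
    if (pDst != NULL)
        ippsFree(pDst);
    fclose(pIn);
    fclose(pOut);
}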
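
The example stores each block as a small record: the original length, the compressed length, and then the compressed bytes. The sketch below reproduces that layout in memory without calling Intel IPP, so the record format can be checked in isolation. The function names put_block and get_block are hypothetical, and the lengths are copied in native byte order on the assumption that this matches what fwrite does with an Ipp32u in the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Two 32-bit lengths precede every payload, as in the file example */
#define HEADER_SIZE (2 * sizeof(uint32_t))

/* Writes one record (raw length, packed length, payload) at dst.
   Returns the number of bytes written, or 0 if the record does not fit. */
size_t put_block(unsigned char *dst, size_t dst_cap, uint32_t raw_len,
                 const unsigned char *payload, uint32_t packed_len)
{
    assert(dst != NULL && payload != NULL);
    if (dst_cap < HEADER_SIZE || packed_len > dst_cap - HEADER_SIZE)
        return 0;
    memcpy(dst, &raw_len, sizeof raw_len);
    memcpy(dst + sizeof raw_len, &packed_len, sizeof packed_len);
    memcpy(dst + HEADER_SIZE, payload, packed_len);
    return HEADER_SIZE + packed_len;
}

/* Parses one record from src. Returns the number of bytes consumed,
   or 0 if the header or the payload is truncated. */
size_t get_block(const unsigned char *src, size_t src_len, uint32_t *raw_len,
                 const unsigned char **payload, uint32_t *packed_len)
{
    assert(src != NULL && raw_len != NULL);
    assert(payload != NULL && packed_len != NULL);
    if (src_len < HEADER_SIZE)
        return 0;
    memcpy(raw_len, src, sizeof *raw_len);
    memcpy(packed_len, src + sizeof *raw_len, sizeof *packed_len);
    /* Reject a length field that claims more data than is available */
    if (*packed_len > src_len - HEADER_SIZE)
        return 0;
    *payload = src + HEADER_SIZE;
    return HEADER_SIZE + *packed_len;
}
```

Checking the compressed length against the bytes actually available, as get_block does, is a cheap guard worth considering whenever the lengths come from a file that may be truncated or corrupted.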

 

LZO Improvements compared to the previous version

Intel IPP 2018 brings significant improvements in LZO compression performance compared with the previous version. See the compression data below.

Compression performance is measured in MB/s, so higher is better. With the newest IPP version, compression is about 50% faster on average.
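
As a small worked illustration of how such figures are derived, the sketch below converts a byte count and an elapsed time into MB/s and expresses the difference between two measurements as a percentage. The function names are hypothetical, and treating 1 MB as 10^6 bytes is an assumption, since the article does not say which definition the measurements use.

```c
#include <assert.h>
#include <stddef.h>

/* Throughput in MB/s, assuming 1 MB = 10^6 bytes.
   Returns 0.0 when the elapsed time is not positive. */
double throughput_mb_per_s(size_t bytes, double seconds)
{
    if (seconds <= 0.0)
        return 0.0;
    return (double)bytes / 1e6 / seconds;
}

/* Relative speedup of the new measurement over the old one, in percent.
   Both arguments are throughputs in the same unit. */
double speedup_percent(double old_mb_s, double new_mb_s)
{
    assert(old_mb_s > 0.0);
    return (new_mb_s - old_mb_s) / old_mb_s * 100.0;
}
```

With purely illustrative numbers, going from 200 MB/s to 300 MB/s is a 50% speedup by this definition.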

Decompression performance has not changed much; on several files, decompression in 2018 is even slower than in 2017.

The reason is that 2017 Update 2 introduced support for decompressing LZO-999 compressed data, which requires additional checks during decompression and therefore lowers performance.

The IPP team managed to restore decompression performance in 2018, but not for all files from the Calgary corpus.

Additionally, the source code was rewritten from assembly to C, which opened up some optimization potential.

Refer to the test data below for the decompression results.
