TTA lossless audio compressor algorithms
The basic principles of lossless audio data compression
It is well known that there is no effective algorithm that can compress of arbitrary input data without a loss. The high compression level for audio data can be achieved only with algorithm that account for data characteristic features. TTA audio compressor is meant for lossless compression of multichannel 8,16 and 24 bit audio data, stored in a popular WAV format files.
Lossless data compression is not new technology. This principle is incorporated in well known compression utilities such as PkZip, Compress or Gzip for text and binary files compression. This is a compression without loss, because files after decompression are identical to original ones. Unfortunately, these programs, which usually uses Lempel-Ziv's algorithms variants, does not shows good results for multimedia data compression. Though majority of text files can be compressed with ratio greater than 2:1, the sizes of multimedia files remains practically the same.
Compression utilities that based on Lempel-Ziv's algorithm substitutes for symbol group with pointer to that text fragment when the group occurs earlier. For audio files the symbols corresponds to signal samples. Unfortunately, the sequences of repeating signal samples are very rare and so the compression ratio is very low. In addition, these compression utilities does not use substantial feature of multimedia files - strong dependence between neighbor signal samples. Because of that all modern algorithms for multimedia data compression includes data decorrelation stage for lowering of this statistical dependence. The modern compressors usually has two preliminary stages: intra-channel decorrelation and a prognostic modeling.
The residual decorrelated signal usually compress without loss by one of the well-known entropy coding technique. In fact, all modern audio compressors uses one of following techniques: Huffman coding, run length encoding (RLE) or Rice coding.
Mostly all of modern lossless multichannel audio compressors are constructed by general plan which includes four main stages:
- Signal decomposition by blocks;
- Inter-channel decorellation;
- Signal modeling (prognosing);
- Entropy coding.
TTA lossless audio compressor architecture includes all of these canonical stages. After block decomposition the data are subjected to intra-channel decorrelation and then transferred to prognosis stage. On this stage the modeling of input signal is realized by 2-step adaptive filter. The difference between original and prognosis signal is transferred to entropy encoder. The residue signal is compressed by Rice coding. Let us consider the TTA audio compressor architecture in detail.
The optimal block size depends greatly on chosen compression algorithm for modeling stage. In general case for the sake of opportunity to restore the defective files the block size must be sufficiently small. Yet the reduction of block size leads to expanding of frame headers amount and so compression level is reduced. The increasing of block size tends to more difficult editing of compressed digital stream. In the case of statistical modeling the block size also must not be large since effective signal modeling becomes impossible. Our compressor uses dynamic (adaptive) signal modeling. The algorithms of this type are effective only for big frames. For audio data compression TTA compressor uses a frames with a duration about 1 sec.
The multichannel incoming data can be subjected to intra-channel decorrelation. As an example, for two-channel data the two of original channels are transformed to mean and difference channels by the next formulas: mean = (first + second)/2, difference = first - second. For the sake an excluding of data loss these formulas should be converted to: difference = first - second, mean = first - difference/2.
For multichannel audio data with good correlation between neighbor channels, this procedure usually gives considerable enlargement of compression rate.
On modeling stage the TTA lossless audio compressor tries to approximate the signal with such function that the result of substraction of this function from original signal (which is called residue, difference or error) has the minimal value. In contrast to the other stages (intra-channell decorrelation and residue coding), which remains practically the same, the modeling stage differs a great deal. Below we give the main of methods which have been tested in the TTA compressor constructing activity:
- Signal approximation with a set of polynomial predictors;
- Signal modeling by linear prediction method (LPC);
- Signal modeling with adaptive filters.
The best results have been received by method of signal modeling with adaptive filters.
This method uses IIR (Infinite Impulse Response model) filters, which parameters are changing adoptively during the work process. The base system element is the p-dimensional nonrecursive filter, which in general case is described by the following expression:
where x'[n] - is a predicted value of the new sample x[n]; v[n,k] - next value of the filter weight coefficients; r - signal delay. Filter weight coefficients are defined by the formulae:
where m - is an coefficient which defines the filter's adaptation speed; e[n] - next value of output signal (error signal); sgn - sign of the signal [.];
The minimization of the residue e[n] = x[n] — x'[n] can be achieved with a help of various algorithms, such as Widrow-Hoff's least mean square error algorithm (LMS) which is based on statistical approach or recursive least squares algorithm (RLS). Though the second algorithm has greater convergence speed, it demands sufficiently more processor resources. Thus we use LMS algorithm in the TTA compressor. The TTA compressor algorithm convergence is realized by quickest descent method, moreover, for calculation simplification we use Widrow-Hoff's gradient stochastic approximation. As a optimal criterion for convergence speeding we use minimum of module of the filter error. Independently from the value of filter coefficients matrix v, which can be arbitrary, the algorithm converges at the average and remains stable while the parameter m satisfies the condition 1/λmax > m > 0, where a λmax - is a maximum of autocorrelation matrix eigenvalue for input signals. Output residue filter signal is defined as a difference between real signal and prognostic value which is computed as a convolution of real signal with weight coefficients of transversal filter according to (1). The impulse characteristic of this filter (of weight coefficient vector of dimension p) is renewed on each time discrete moment according to (2).
The TTA lossless audio compressor algorithm differs insignificantly from described above. The input signal undergoes 2-stage filtration. On the first stage we use zero level predictor as a filter and so the error signal is defined by the next formulae:
where k sufficiently close to 1.
On the next, a final stage of a filtration we use a set of analogous to described above, wide-banded filters.
This technique is most effective among examined modeling signal techniques both by prognostic precision and processing speed. It's shortcomings is ineffectiveness for blocks of small size and as corollary the difficulties in editing of compressed stream.
When the model is chosen, the encoder subtracts the approximation from original. If the model is incorrect, then the difference between original signal and prognostic one is residue signal, which after that is encoded without loss. We use the fact that difference signal usually has Laplas distribution and there is a set of Huffman special codes, which is called Rice codes, that can effectively and quickly encode these signal without a dictionary. Rice encoding consists of finding the one parameter that corresponds to signal distribution, which then is used for code composition. If we change the distribution the optimal parameter also changes, so there is technique that can evaluate it if necessary. If the prognosis is effective, then the residue signal will have less bits than original one. Further, usually the residue signal is parted by smaller blocks , which has his personal Rice parameter. The smaller blocks size choice also influences on encoding effectiveness. TTA lossless audio compressor uses adaptive encoding technique with dynamic definition of Rice parameter for residue encoding. In this case we do not carry out the partition of residue signal by smaller blocks.
- A.V.Dzhurik, P.G.Zhilin. Siberian Solar Radiotelescope: The PCA format for the SSRT data compression. Numerical methods and programming, The Lomonosov Moscow State University, 4 (2003), 278-282.
- M.Hans, R.Schafer. Lossless Audio Coding. Technical Report. CSIP TR-97-07. Atlanta, 1997.
- T.Robinson. SHORTEN: Simple lossless and near-lossless waveform compression. Technical report, Cambridge University Engineerig Department. Cambridge, December 1994.
- P.C.Craven, M.J.Law, and J.R.Stuart. Lossless Compression using IIR Prediction Filters. 102nd AES Convention. Munich, March 1997.
- R.F.Rice, Some practical universal noiseless coding techniques. Technical Report. JPL-79-22, Jet Propulsion Laboratory. Pasadena, 1979.