Modified Discrete Cosine Transform

The modified discrete cosine transform (MDCT) is a Fourier-related transform based on the type-IV discrete cosine transform (DCT-IV), with the additional property of being lapped. It is applied to consecutive blocks of a larger dataset, and subsequent blocks overlap so that the second half of one block coincides with the first half of the next. This overlap, together with the energy-compaction properties of the DCT, makes the MDCT particularly useful for signal compression, because it helps to avoid artifacts stemming from the block boundaries.

For this reason the MDCT is used for audio compression in formats such as MP3, AC-3, Vorbis and AAC.

The MDCT was proposed by Princen, Johnson and Bradley in 1987^[1], following earlier work (1986) by Princen and Bradley^[2] that developed the underlying principle of time-domain aliasing cancellation (TDAC), described below. (There is also an analogous transform, the MDST, based on the discrete sine transform, as well as other, rarely used forms of the MDCT based on different combinations of DCT and DST types.)

In MP3, the MDCT is not applied to the audio signal directly, but to the output of a 32-band polyphase quadrature filter (PQF) bank. The output of this MDCT is post-processed by an alias-reduction formula to reduce the typical aliasing of the PQF filter bank. Such a combination of a filter bank with an MDCT is called a hybrid filter bank or a subband MDCT. AAC, in contrast, normally uses a pure MDCT; only the (rarely used) MPEG-4 AAC-SSR variant (by Sony) uses a four-band PQF bank followed by an MDCT. ATRAC uses stacked quadrature mirror filters (QMF) followed by an MDCT.

Definition

As a lapped transform, the MDCT differs somewhat from other Fourier-related transforms: it has half as many outputs as inputs, whereas the others have the same number of outputs as inputs.

In particular, it is a linear function $F\colon \mathbb{R}^{2N}\to \mathbb{R}^{N}$ (where $\mathbb{R}$ denotes the set of real numbers).

The 2N real numbers $x_{0},\ldots ,x_{2N-1}$ are transformed into the N real numbers $X_{0},\ldots ,X_{N-1}$ according to the formula:

$X_{k}=\sum _{n=0}^{2N-1}x_{n}\cos \left[{\frac {\pi }{N}}\left(n+{\frac {1}{2}}+{\frac {N}{2}}\right)\left(k+{\frac {1}{2}}\right)\right]$

(The normalization coefficient in front of this transform, here unity, is an arbitrary convention and differs between treatments. Only the product of the normalizations of the MDCT and of the inverse transform, given below, is constrained.)
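The formula can be evaluated directly as a matrix-vector product. The following is a minimal sketch in Python with NumPy, not a reference implementation; the function name `mdct` is chosen here for illustration. It uses the unit normalization of the formula above and costs O(N²) operations.

```python
import numpy as np

def mdct(x):
    """Direct O(N^2) MDCT: 2N real samples in, N real coefficients out."""
    x = np.asarray(x, dtype=float)
    if x.size == 0 or x.size % 2:
        raise ValueError("input must contain 2N samples with N >= 1")
    N = x.size // 2
    n = np.arange(2 * N)   # input index, 0 .. 2N-1
    k = np.arange(N)       # output index, 0 .. N-1
    # basis[k, n] = cos[(pi/N) (n + 1/2 + N/2) (k + 1/2)]
    basis = np.cos(np.pi / N * np.outer(k + 0.5, n + 0.5 + N / 2))
    return basis @ x

X = mdct([1.0, 2.0, 3.0, 4.0])  # 2N = 4 inputs
print(X.shape)                   # (2,), i.e. N = 2 outputs
```

The basis matrix has N rows and 2N columns, which makes the halving of the number of values explicit.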

Inverse transform

The inverse MDCT is known as the IMDCT. Because the numbers of inputs and outputs differ, it might seem at first glance that the MDCT cannot be inverted. However, perfect invertibility is achieved by adding the overlapping IMDCTs of subsequent overlapping blocks, which causes the errors to cancel and the original data to be recovered. This technique is known as time-domain aliasing cancellation (TDAC).

The IMDCT transforms the N real numbers $X_{0},\ldots ,X_{N-1}$ into the 2N real numbers $y_{0},\ldots ,y_{2N-1}$ according to the formula:

$y_{n}={\frac {1}{N}}\sum _{k=0}^{N-1}X_{k}\cos \left[{\frac {\pi }{N}}\left(n+{\frac {1}{2}}+{\frac {N}{2}}\right)\left(k+{\frac {1}{2}}\right)\right]$

(As with the DCT-IV, which is an orthogonal transform, the inverse has the same form as the forward transform.)

If the MDCT is used with the usual window normalization (see below), the normalization coefficient in front of the IMDCT must be multiplied by 2 (that is, it becomes 2/N).
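The cancellation of errors between overlapping blocks can be checked numerically. The sketch below, in Python with NumPy, assumes the unwindowed case with the 1/N normalization of the inverse formula above. The helper names `_basis`, `mdct` and `imdct` (for the inverse MDCT) are illustrative, and the test signal is random data generated only for this demonstration.

```python
import numpy as np

def _basis(N):
    """Cosine basis of shape (N, 2N) shared by the forward and inverse formulas."""
    n = np.arange(2 * N)
    k = np.arange(N)
    return np.cos(np.pi / N * np.outer(k + 0.5, n + 0.5 + N / 2))

def mdct(x):
    """Forward transform: 2N samples -> N coefficients (unit normalization)."""
    x = np.asarray(x, dtype=float)
    return _basis(x.size // 2) @ x

def imdct(X):
    """Inverse transform: N coefficients -> 2N samples (1/N normalization)."""
    X = np.asarray(X, dtype=float)
    N = X.size
    return _basis(N).T @ X / N

N = 4
rng = np.random.default_rng(0)
signal = rng.standard_normal(5 * N)   # arbitrary test data

# Transform blocks of length 2N taken every N samples, then overlap-add.
recon = np.zeros_like(signal)
for start in range(0, signal.size - 2 * N + 1, N):
    block = signal[start:start + 2 * N]
    recon[start:start + 2 * N] += imdct(mdct(block))

# A single block on its own comes back aliased, not as the original data.
single = imdct(mdct(signal[:2 * N]))
print(np.allclose(single, signal[:2 * N]))       # False

# Samples covered by two overlapping blocks are recovered.
print(np.allclose(recon[N:-N], signal[N:-N]))    # True
```

The first and last N samples are covered by only one block each, so they are excluded from the comparison; in practice the signal is padded so that every sample lies in two blocks.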

Calculation

Although direct application of the MDCT formula would require O(N²) operations, the same result can be computed with only O(N log N) complexity by recursively factorizing the computation, as in the fast Fourier transform (FFT). It is also possible to compute the MDCT via other transforms, typically an FFT or a DCT, combined with O(N) pre- and post-processing steps.

In addition, because the MDCT is based on the DCT-IV, any algorithm for the DCT-IV immediately provides a method for computing the MDCT and IMDCT of even size.
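One way to see the connection with the DCT-IV is to fold the 2N inputs into N values and apply a DCT-IV to the result. The sketch below, in Python with NumPy, assumes N is even and splits the input into four quarters (a, b, c, d); the names `dct4`, `mdct_via_dct4` and `mdct_direct` are illustrative. The DCT-IV here is evaluated directly for clarity, so the sketch itself is still O(N²); a fast DCT-IV routine could be substituted for `dct4` to obtain O(N log N).

```python
import numpy as np

def dct4(u):
    """Unnormalized DCT-IV, evaluated directly in O(N^2)."""
    u = np.asarray(u, dtype=float)
    idx = np.arange(u.size) + 0.5
    return np.cos(np.pi / u.size * np.outer(idx, idx)) @ u

def mdct_via_dct4(x):
    """MDCT of 2N samples (N even) as an O(N) fold followed by a DCT-IV."""
    x = np.asarray(x, dtype=float)
    if x.size == 0 or x.size % 4:
        raise ValueError("input length must be a positive multiple of 4")
    a, b, c, d = np.split(x, 4)          # four quarters of length N/2
    # Fold (a, b, c, d) into the N values (-c_r - d, a - b_r),
    # where the subscript r denotes reversal.
    folded = np.concatenate((-c[::-1] - d, a - b[::-1]))
    return dct4(folded)

def mdct_direct(x):
    """Reference: direct evaluation of the defining MDCT sum."""
    x = np.asarray(x, dtype=float)
    N = x.size // 2
    n = np.arange(2 * N)
    k = np.arange(N)
    return np.cos(np.pi / N * np.outer(k + 0.5, n + 0.5 + N / 2)) @ x

x = np.random.default_rng(1).standard_normal(16)   # arbitrary test data
print(np.allclose(mdct_via_dct4(x), mdct_direct(x)))  # True
```

The fold uses only additions and reversals, so its cost is O(N) and the overall cost is dominated by the DCT-IV.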

Notes

[1] J. P. Princen, A. W. Johnson and A. B. Bradley: Subband/transform coding using filter bank designs based on time domain aliasing cancellation. IEEE Proc. Intl. Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2161–2164, 1987. Initial description of the MDCT.

[2] John P. Princen, Alan B. Bradley: Analysis/synthesis filter bank design based on time domain aliasing cancellation. IEEE Trans. Acoust. Speech Signal Processing, ASSP-34 (5), 1153–1161, 1986. Described a precursor to the MDCT using a combination of discrete cosine and sine transforms.