4F8 Revisions

Frequency responses for image processing

Typical questions:

Directly taking the inverse Fourier transform of an ideal frequency response gives an impulse response with infinite support. To obtain a finite filter, we multiply the ideal impulse response by a window function, which forces the coefficients to be zero outside the window. The actual frequency response is then the convolution of the desired frequency response with the Fourier transform of the window function.
This is because multiplication in the spatial domain corresponds to convolution in the frequency domain.
The effect of the window function is to smooth the desired frequency response. We would prefer the mainlobe of the window's spectrum to be narrow, so that the desired frequency response is preserved and its transitions stay sharp. The sidelobes should also have small amplitude, so that ripples in the frequency response outside the region of interest are small.
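
The window method can be sketched in a few lines. The example below is a 1-D illustration only (a separable 2-D image filter would apply the same idea along rows and columns); the cutoff of π/2, the half-width of 10 and the choice of a Hamming window are assumptions made for the example, and all function names are hypothetical.

```python
import math

def ideal_lowpass(n, cutoff):
    """Sample n of the ideal low-pass impulse response (infinite support)."""
    if n == 0:
        return cutoff / math.pi
    return math.sin(cutoff * n) / (math.pi * n)

def rectangular(n, half_width):
    """Plain truncation: 1 inside the window, 0 outside."""
    return 1.0 if abs(n) <= half_width else 0.0

def hamming(n, half_width):
    """Hamming window centred on n = 0, zero outside |n| <= half_width."""
    if abs(n) > half_width:
        return 0.0
    return 0.54 + 0.46 * math.cos(math.pi * n / half_width)

def design_lowpass(cutoff, half_width, window):
    """Multiply the ideal impulse response by the window (spatial domain)."""
    return [ideal_lowpass(n, cutoff) * window(n, half_width)
            for n in range(-half_width, half_width + 1)]

def response(h, omega):
    """Zero-phase frequency response of a symmetric filter centred on n = 0."""
    half_width = len(h) // 2
    return sum(c * math.cos(omega * (i - half_width)) for i, c in enumerate(h))

def peak_ripple(h, start, stop=math.pi, points=200):
    """Largest response magnitude on a frequency grid inside the stopband."""
    return max(abs(response(h, start + (stop - start) * k / (points - 1)))
               for k in range(points))

h_rect = design_lowpass(math.pi / 2, 10, rectangular)
h_hamming = design_lowpass(math.pi / 2, 10, hamming)

# Smaller sidelobes (Hamming) should give smaller stopband ripple.
print("rectangular stopband ripple:", peak_ripple(h_rect, 0.75 * math.pi))
print("Hamming stopband ripple:    ", peak_ripple(h_hamming, 0.75 * math.pi))
```

The Hamming window has lower sidelobes than the rectangular window but a wider mainlobe, so it trades a wider transition band for smaller ripples.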

RGB is a device-dependent colour space, so it is not a good choice for image processing.
YUV, or a closely related luminance-chrominance encoding (YIQ in NTSC, YDbDr in SECAM), has been the colour encoding system used for analogue television worldwide (PAL, NTSC and SECAM) since the 1950s.
Converting between RGB and YUV is a linear transformation (a 3×3 matrix multiplication per pixel), so it is easy to implement.
Y is the luminance; U and V are the chrominance components.
The human eye is more sensitive to luminance than to chrominance, so we can subsample U and V to reduce the amount of image data without much loss of perceived quality.
- A common subsampling scheme is 4:2:2, which means that the U and V components are subsampled by a factor of 2 in the horizontal direction.
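
A minimal sketch of the conversion and of 4:2:2 subsampling follows. It assumes the BT.601 luma weights and the analogue U, V scale factors (0.492 and 0.877); averaging each horizontal pair is just one simple way to subsample, and the function names are hypothetical.

```python
def rgb_to_yuv(r, g, b):
    """Linear RGB -> YUV conversion (BT.601 luma weights assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v

def yuv_to_rgb(y, u, v):
    """Inverse of rgb_to_yuv, obtained by solving the same linear equations."""
    b = y + u / 0.492
    r = y + v / 0.877
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b

def subsample_422(chroma_row):
    """Halve the horizontal chroma resolution by averaging each pixel pair.

    Assumes the row has an even number of samples.
    """
    return [(chroma_row[i] + chroma_row[i + 1]) / 2
            for i in range(0, len(chroma_row) - 1, 2)]

# One row of four RGB pixels with components in the range 0..1.
rgb_row = [(1.0, 0.0, 0.0), (0.9, 0.1, 0.0), (0.0, 0.0, 1.0), (0.1, 0.0, 0.9)]
yuv_row = [rgb_to_yuv(*pixel) for pixel in rgb_row]

y_row = [p[0] for p in yuv_row]                 # kept at full resolution
u_row = subsample_422([p[1] for p in yuv_row])  # half horizontal resolution
v_row = subsample_422([p[2] for p in yuv_row])  # half horizontal resolution

samples_before = 3 * len(rgb_row)
samples_after = len(y_row) + len(u_row) + len(v_row)
print(samples_before, "->", samples_after)
```

With 4:2:2, every pair of pixels carries two Y samples but only one U and one V sample, so the number of samples falls to two thirds of the original.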

After the DCT, the coefficients are quantised with optimised quantisation steps.
The quantisation steps are predetermined and tailored to each subband (DCT frequency) through experiments with many natural images.
Thus, JPEG works well for natural images, but less well for images with sharp edges (artificial images).
Entropy coding is then applied: the run length of zeros preceding each non-zero coefficient and the size of that coefficient's amplitude are combined into a single Huffman code.
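
The quantisation and run-length stages can be sketched as below. This is an illustration under stated assumptions, not the JPEG standard's procedure: the step table (growing with frequency) is hypothetical rather than a standard JPEG table, the level shift, DC prediction, the 15-zero run limit and the Huffman tables themselves are omitted, and all function names are made up for the example.

```python
import math

N = 8  # block size

def dct_matrix(n=N):
    """Orthonormal DCT-II matrix: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dct2(block):
    """2-D DCT of a square block, computed as C * block * C^T."""
    c = dct_matrix(len(block))
    c_t = [list(col) for col in zip(*c)]
    return matmul(matmul(c, block), c_t)

def quantise(coeffs, steps):
    """Divide each coefficient by its own step size and round to an integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, steps)]

def zigzag_order(n=N):
    """(row, col) positions from low to high frequency along anti-diagonals."""
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def run_length_symbols(quantised):
    """Turn the AC coefficients into (zero run, size, value) triples.

    The (run, size) pair is what would receive a single Huffman code;
    size is the number of bits needed for the amplitude.
    """
    ac = [quantised[i][j] for i, j in zigzag_order(len(quantised))][1:]
    symbols, run = [], 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            symbols.append((run, abs(value).bit_length(), value))
            run = 0
    if run:
        symbols.append("EOB")  # end of block: only zeros remain
    return symbols

# Hypothetical step table: coarser quantisation at higher frequencies.
steps = [[8 + 4 * (i + j) for j in range(N)] for i in range(N)]

# A smooth block (horizontal ramp), typical of natural image content.
block = [[10 * j for j in range(N)] for _ in range(N)]
symbols = run_length_symbols(quantise(dct2(block), steps))
print(symbols)
```

For a smooth block most high-frequency coefficients quantise to zero, so the symbol list is short and ends early with an end-of-block marker; a block containing a sharp edge would leave many more non-zero coefficients.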