Panteltje's paper on low pass filtering before resizing

A horizontal scaler avoiding aliasing by low pass filtering before resizing


It seems open source resize tools just average over several samples when sub-sampling the picture.
Basic information theory however shows that if you sample a signal at f,
and there are frequencies higher than f/2 present in this signal, you will
get a 'beat signal' (interference with the sampling frequency) known as 'aliasing'.
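
As a toy illustration (my own example, not part of the resizer code):
resampling a 720 sample line down to 352 samples is roughly equivalent to
sampling at 352 samples per line, so anything above 176 cycles per line
folds back into the pass band. In C:

#include <stdio.h>
#include <math.h>

/* Frequency a component shows up at after sampling at rate fs:
   anything above fs/2 folds back to |f - k*fs| for the nearest k. */
static double alias_freq(double f, double fs)
{
    double k = round(f / fs);
    return fabs(f - k * fs);
}

int main(void)
{
    /* a 300 cycles/line component in a 352 sample line becomes a 52 cycle 'beat' */
    printf("%.0f\n", alias_freq(300.0, 352.0));
    return 0;
}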

I tried two well known open source packages, mjpegtools and bbmpeg; both use
averaging and seem to be content with the interference this causes.
Because I needed to resize DVB-S 720 x 576 PAL high quality images either
to SVCD 480 x 576 or CVD 352 x 576 format, I looked a bit deeper into this
and tried to make a better resizer, without the aliasing artefacts.
The solution in theory is simple:
Filter out everything (low pass) starting just below half the sample frequency.
(For a correct digitization process according to Nyquist we need to sample
at at least 2 x f, where f is the maximum frequency in our signal.)

Now since we are talking video, we need to be fast, and I wanted to have
more possibilities to do things in the frequency domain (for other reasons).

A low pass can be made by doing a Fourier transform followed by a reverse
Fourier transform, where in the frequency domain we set some components to
zero (the high frequency components we do not want).
The big problem with the fft and reverse fft (fft = fast Fourier transform)
is always the speed.
Since a lot of computations are required for every line in every frame,
the processing could become unacceptably slow.
I was lucky to find the fftw package for Linux (Unix), which does a 720
point fft on my system in about 190 microseconds.
Since in the image processing chain the next thing is a (very slow)
mpeg2 encoder, the extra overhead in time the fft and reverse fft cause
is not a real problem (8 hours of processing time is normal for a > 1 hour
movie on an Athlon, perhaps more on a Pentium with the same clock speed).
Only a small part of this time is spent in the resizer code, most in the
mpeg2 coder.
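
To give an idea of what the per-line low pass looks like in code, here is a
minimal sketch of my own, written against the current FFTW3 real-transform
interface (the actual rescaler-yuv code, and the fftw version it uses, may
differ; the function name and the hard cutoff index are just for the example):

#include <fftw3.h>

/* Low pass one line of n samples: transform, zero the bins at and above
   `cutoff`, transform back, and undo FFTW's scaling by n. */
static void lowpass_line(double *line, int n, int cutoff)
{
    fftw_complex *spec = fftw_malloc(sizeof(fftw_complex) * (n / 2 + 1));
    fftw_plan fwd = fftw_plan_dft_r2c_1d(n, line, spec, FFTW_ESTIMATE);
    fftw_plan rev = fftw_plan_dft_c2r_1d(n, spec, line, FFTW_ESTIMATE);
    int k;

    fftw_execute(fwd);
    for (k = cutoff; k <= n / 2; k++) {     /* kill the high frequencies */
        spec[k][0] = 0.0;
        spec[k][1] = 0.0;
    }
    fftw_execute(rev);
    for (k = 0; k < n; k++)                 /* the c2r output is scaled by n */
        line[k] /= n;

    fftw_destroy_plan(fwd);
    fftw_destroy_plan(rev);
    fftw_free(spec);
}

For the 720 to 352 case, lowpass_line(line, 720, 176) keeps only the
frequencies that survive in a 352 sample line; the actual resize then follows
as a separate step. In real code the plans would of course be created once
and reused for every line instead of being re-planned per call.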

Also, as the tendency is toward ever faster systems, the speed issue will
become less and less important.
For one frame of 720 x 576 we need to apply the fft and reverse fft once
per line for the luminance Y signal, and once per line for each chrominance
signal, U and V.
U and V however contain only 1/4 of the info, so for these we do an fft and
reverse fft of (720 / 2) = 360 points, (576 / 2) = 288 times each.
For Y we do a 720 point fft and reverse fft 576 times.
So per picture we do 576 + 288 + 288 = 1152 fft plus reverse fft operations.
These are some test values on a Duron 950 (from the fftw test suite):
720 points forward 189.592896 us
720 points backward 178.331604 us
360 points forward 88.463379 us
360 points backward 83.996216 us

So, as an estimate, for one picture, the time taken is:
576 x (190 + 178) + 2 x 288 x (88 + 84) = 311040 us, or 311 ms
(slightly more than 3 frames per second).

To give an idea, for a 90 minute movie at 25 pictures / second, it takes
90 x 60 x 25 x 311 ms = 41985 seconds, or 699 minutes or 11.7 hours.
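
The same estimate in a few lines of C (my own check, using the rounded
per-transform times from the table above; the small difference with the
41985 seconds comes from rounding 311040 us to 311 ms):

#include <stdio.h>

int main(void)
{
    double y  = 576 * (190.0 + 178.0);    /* Y: 720 point fwd + rev, 576 lines         */
    double uv = 2 * 288 * (88.0 + 84.0);  /* U, V: 360 point fwd + rev, 288 lines each */
    double frame_us = y + uv;             /* about 311040 us per picture               */
    double movie_s  = 90 * 60 * 25 * frame_us / 1e6;    /* 90 minutes at 25 fps        */

    printf("per frame %.0f us, movie %.0f s (%.1f hours)\n",
           frame_us, movie_s, movie_s / 3600.0);
    return 0;
}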

This is WITHOUT optimizing anything in the software; fftw can do
multidimensional array processing and has a 'plan' function that can find
the best fft algorithm for your system.
These rather long processing times (add another few hours for the mpeg2
coder) are not that big of a problem; normally I leave the computer running
overnight making the resized copy.

All this shows that using Fourier transforms is now within the reach
of the average computer owner, and better video processing techniques
are just around the corner (think also of object recognition, filtering
of specific interference, special effects, etc.).

Drawbacks:
Speed and quality are a tradeoff, and there ARE artefacts introduced; in the
code I had to limit the outcome of the reverse fft so it stays within range.
When presented with a luminance range of 0-255, after the reverse fft negative
values were found on some frames, and I have seen values greater than 265.
These rather large errors can perhaps be reduced by using a more precise
fft / reverse fft algorithm.
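
The limiting mentioned above amounts to clamping each output sample back
into the valid range, something like this (a sketch, not the exact code):

#include <stdio.h>

/* Clamp a reverse fft output sample back into the 0..255 luminance range
   (the real code may round or scale differently). */
static unsigned char clamp_to_byte(double v)
{
    if (v < 0.0)   return 0;
    if (v > 255.0) return 255;
    return (unsigned char)(v + 0.5);   /* round to nearest */
}

int main(void)
{
    printf("%d %d\n", clamp_to_byte(-3.2), clamp_to_byte(265.7));   /* prints: 0 255 */
    return 0;
}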

New:
Added an option to set cutoff frequency and slope.
Selecting a gradual slope reduces 'ripples' in black at low cutoff frequencies.
See the .lsm text on the download page for details.
This all needs more investigation.
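
One way such a slope could be implemented is to taper the bins around the
cutoff instead of zeroing them abruptly; the half-cosine roll-off below is
only an illustration of the idea, and the actual curve and option names in
the program may differ:

#include <stdio.h>
#include <math.h>

/* Gain applied to frequency bin k: bins below `cutoff` pass untouched,
   bins from `cutoff + width` up are zeroed, and the bins in between are
   tapered with a half cosine. A small width approximates the hard cutoff. */
static double bin_gain(int k, int cutoff, int width)
{
    if (k < cutoff)          return 1.0;
    if (k >= cutoff + width) return 0.0;
    return 0.5 * (1.0 + cos(M_PI * (k - cutoff) / (double)width));
}

int main(void)
{
    int k;
    for (k = 170; k < 186; k++)     /* show the transition around bin 176 */
        printf("%d %.3f\n", k, bin_gain(k, 176, 8));
    return 0;
}
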
In some cases the picture looks 'hairy' (all these small, seemingly random
errors give the impression you are looking at a lot of irregular dots,
like register overflows?).
Will I use it? YES.

Original 720x576 testcard.

mjpegtools yuvrescaler 352x576,
note the aliasing appearing as low frequency vertical bands.

rescaler-yuv 352x576,
the high part of the spectrum set to zero before the reverse transform.
So just AT Nyquist f / 2!

Processor load when using rescaler-yuv as a pipe between mpeg2dec and mpeg2enc
(decoding, then re-encoding as a resized mpeg2 video stream).
Values obtained from top:
mpeg2enc     CPU usage 51.8%   memory usage 16.5%
resizer-yuv  CPU usage 42.9%   memory usage  0.4%
mpeg2dec     CPU usage  4.1%   memory usage  0.6%


Download resizer-yuv and other programs' source code (C) from the download page.


References:
The fft is based on the fftw package by Matteo Frigo and Steven G. Johnson.
mjpegtools by many authors.
The bbmpeg rescaler is based on the work of Brent Beyeler.
The Linux port of bbmpeg to transcode is by Gerhard Monzel.
Transcode by Dr. Östreich, with contributions by many others and me.


