By an image can be understood a projection of the real world on the retina in the eye; for instance: a photograph, a page of paper, or a scene rendered on the monitor. The mathematical model of a picture is a continuous function of two variables,   - the so-called. image function

        z=f(x, y).         (1)

    The determination domain of an image function (1) is the continuous interval I from R^2, where x a y are coordinates of a point in 2D in which the function acquires values from its determination domain [2]. From the mathematical point of view, this value can reach any number, but in practice, this value is determined by the specific application. Most often, this value is represented by a single figure (e.g., brightness, intensity of color, ?). In the field of computer graphics, the most frequently used values of colors are of the models   RGB, CMY,   HSV or XYZ. A value represents important information about the given point. If we have a sufficiently large value domain of the function then we can select a more precise value to describe a point. For instance, if I have an image in true color this can provide us more information than if such point is only a 16 color point. We most frequently use the following color pallets: 256, 65565 and 16.7 million colors, or 256 dyes of gray.

In creating an image, the value domain of the function is limited not only by the own application, but also by the physical characteristics of the technical equipment used for it. One disadvantage arising from the mathematical model and image characteristics is the fact that we almost never have a continuous determination domain of the function. Mostly, we use a point raster comprised of the basic image elements: pixels. In a 3D system, the basic element used is a - voxel Perhaps, you have met already with some PC games using a voxel technology (e.g., Comanche, Outcast, ?). In the case of a texture, the basic element is texel.

The image function (1) is defined on a sub-set of a two-dimensional area. However, for simplicity it is sometimes more advantageous to show the principle in an one-dimension determination domain. Therefore, we will consider in the majority of cases a one-dimension function. 


In the course of obtaining a digital image there occurs the transfer from the continuous function f(x,y) to the discrete function D(x,y), both in the determination domain of function f(x,y)   as well as in its value domain. This process is called digitalization. It comprises two independent parts: quantization and sampling.


The essence of quantization is the determination of discrete values of the image function. The image function is split into intervals, and for each such interval it is assigned a single value, quantity - a representative. If the length of interval is constant, then it is a uniform quantization. In the case of a non-uniform quantization the length of intervals vary. It is appropriate for transfer of a picture comprising a large number of colors into a color image with a lower number of colors. The way of determining a representative of the interval will depend on the use of the relevant image. We can count an average value of the interval, to assess median, averaged limit values, and others.

Through using quantization a so-called quantization error results. The cause is a sudden change in color: if on an interval, which is under the quantization process, there occurs a smooth change of colors, we substitute this with a calculated value. If we have a smooth transit of colors and we carry out representation by a representative, we obtain a jump transit to which the human eye is sensitive, and therefore, this error gives a troublesome effect. Partially, we can eliminate it through non-uniform quantization by an appropriate representative.


Applet no. 5.1 : Color transit on one side of RGB cube (left) and CMY cube (right), where we can see a quantization error due to jump transits


By sampling of a continuous function f(x,y) is understood the finding out of the values of samples in regular intervals. We can understand it also as being a transition from a continuous to a discrete case. In this way, we obtain a new function: V(x,y). Concurrently, we will use the fact that 'pixel' is not a point, but an area of the non-zero value. This area is represented by a certain value. We call this 'point sampling' such sampling as where we obtain values at one point. In the case of 'area sampling' we assign to the area a single value, in most cases a variable value of the area. But this method is demanding on calculation, and therefore, we try to approximate area sampling by several point samples, and then to average them.


Sampling of the continuous function (left) into the discrete one (right).

Fourier transformation

For simplicity, we show the Fourier transformation on the one-dimension function f(x). By a Fourier image of the one-dimension function f(x), defined on the determination domain, is understood a complex function F(u), obtained as follows:

    The inverse Fourier transformation is defined as:


    where u is a frequency variable, and x is an area variable.

    If by chance you do not know what is "the character between the lying eight characters",(it is called the integral character), then, please, skip this section and click on   discrete Fourier transformation..

    Because the function F(u) is a complex function, we can write it in the form: F(u) = R(u) + i.I(u), where  R(u) and I(u) are the real functions, representing the real and the imaginary parts.

    Often, we use the exponential form of the Fourier transformation:

,       , where i is an imaginary unit,

is is the Fourier spectrum of the function  f(x), and

is the phase angle.

    The function for the quadrate of the spectrum is.

    Because the exponential form of the Fourier transformation is rather difficult to calculate, in particular in the case of "e to something", we therefore use for arrangement the Euler formula:

    exp (-i2piux) = cos(2piux) -i.sin(2piux)

    Now, let's expand the Fourier transformation for the function of two variables f(x,y). Let f(x,y) be a continuous and integrable, and F(u,v) is also integrable, then the Fourier transformation is bude

    where u and v are frequency variables and x,y are area variable then

the spectrum is: ,

the phase angle is: ,

the quadrate of spectrum is: .

Discrete Fourier transformation

With the calculation of the one-dimension Fourier transformation we used a continuous one-dimension function f(x) with the variable x, on the basis of that we integrated from the minus infinity to the plus infinity. However, in practice we have given a discrete sequence of function values (a subsequence of sampled values), and for calculation, we use the discrete Fourier transformation.

For easier understanding, we show it on one example. Assume that we have a continuous function f(x). We take from it a sequence of function values with the regular step dx, {(f(x0). f(x0+1.dx), ..., f(x0+N.dx)}.In the case of a graphics image this is a sequence of function values of one row. In the case of the continuous function it will be regular sampling. The resulting sequence can be expressed as:: 

f(x) = f(x0 + x.dx)
, kde x = 0,1, ..., N,.

Then, the Discrete Fourier transformation is:


where u and x = 0, 1, ..., N.

For the function of two variables we define the discrete Fourier transformation as:
where u = 0, 1, ..., M, and v = 0, 1, ..., N,

where x = 0, 1, ..., M, and y = 0, 1, ..., N.

Alias and antialiasing

In graphics the word 'alias' means an unwanted optical phenomenon, a defect in the image [1]. We meet it while drawing abscissas into a raster. A diagonal abscissa in a square raster comprises the equal number of points as its horizontal projection in the axis x which causes the diagonal abscissa to be thinner in brightness, and not look smooth. This is due to making a raster of the image.


Aliasing of the abscissa


Aliasing of the abscissa

In computer graphics 'antialiasing' means smoothing edges. For this purpose we use the method of balancing and smoothing, used for the first time by Catmull and Crow. The idea of the method lies in the following: any point from an image has a non-zero area in the display, so thus also the dot displayed in the area of a model has also a non-zero area. In the following applet, is displayed a non-zero thickness abscissa put into the raster. Every square of this raster represents a one display pixel.

The image point overlapping an abscissa of given thickness has its brightness given by the range of their conjunction. Time-consumption is, however, relatively more demanding, and therefore, in drawing in an application this option is mostly either accepted or disallowed, so deciding whether the application should use antialiasing or not.

Raster image compression

RLE coding

RLE encoding is a simple, but effective method of data compression used for instance with coding   BMP and PCX files.

The method of coding a binary sequence, when it is coded the distances of transit of a sequence of zeros and units. The simple and effective method based on assumptions of frequently repeating adjacent pixels. For instance, consider the sequence: 7,7,7,7,7,7,7,7,7,7,7,7. Would not be it better to write this sequence as only 12,7, what should mean the number of repeating and the character or figure which is repeated?

Example :

We have to code the given sequence : 001111110000001100000110011011100011111.

At the beginning of the sequence is 0, therefore we proceed further until we find the first 1. Then into a result chain we write down the length of zeros, in our case: number 2.

Actual stage of coding: we are at the bit no.3, and we have to code the first two bits. We go further in the sequence until we encounter the first zero. This is at the eight bit. In this sequence, there are 5 units,   and we give this number into the outcome chain.

In the end we have the outcome chain:265352221335. The biggest number here is number 5. Therefore we transform all the numbers  of the outcome chain into the binary system, and we exclude 3 bits for every number. The outcome sequence will be: 010 101 101 010 101 010 010 001 011 011 101. Length of this sequence is all together 36 bits. The original sequence had a length of 38 bits.

After understanding this example, we show advantages and disadvantages of this method. Note that if in the previous example we added 1 or 0 so that the length of subsequent 1 or 0 did not exceed 7, then the length of the outcome sequence will not be changed, while the length of the original sequence would grow. In the case if the sequence contains a few of repeating values, we receive the outcome sequence with a length longer than of that of the original. So, this sequence would have a "negative compression", meaning such coding is rather ineffective.

Huffman code

The Huffman code belongs between basic loss-free codes where no losses of data occur [4]. Because it is based on entropy, it is often arranged among the category of entropy codes. It uses a similar principle   as the Morse alphabet, where the most frequently used sounds of the English language  is assigned the shortest code, and less frequently ones have a longer code. The Huffman code aims for an optimum distribution of the binary symbol sequence in accordance with the frequency of the occurrence of the coded elements, concurrently with an effort to approximate to their entropy.


Entropy H is defined for random variables f1, f2, .., fk and probability p (fi) = pi as follows: 


Let's have the values q1, q2, .., q8 with the following probabilities: p3=0,08    p4=0,2    p5=0,12    p6=0,08    p7=0,04    p8=0.

For binary coding of eight values we would need 3 bits for every value.

Entropy is H = -[0,4.log 2 (0,4) + 3 x 0.08. log 2 (0.08) + 
                           + 0.2 x log 2 (0.2) + 0.12 x log 2 (0.12) +
                           + 0.04 x log 2 (0.04)] =
                        = 2.42 bit

The procedure of coding is the following [4]:

  1. To find two lowest frequencies, and to total them
  2. In the result of proceeding step to find two lowest frequencies and to sum them
  3. To proceed in previous step until the relative frequency 1
  4. To code words either comprising of 0 and 1 in accordance with the tree arising through summing so that we continue in a reverse way from the frequency one where 1 is left and 0 is right, which are the higher bits, and using the same procedure, we count lower the bits.

You can see the procedure on the following applet.


Applet Huffman coding

Lempel - Ziv - Welch

This method and its modifications are used in the majority of compressing programs (e.g., ZIP, RAR, ARJ, LHA, other) and in the graphics format GIF.

In 1997, Lempel and Ziv discovered a new method of data coding, which was improved in 1984 by Welch. The principle lies in the replacement of samples of input data by binary codes of a variable length (mostly, gradually growing). The input samples are coded according to a dictionary (or code table) gradually being supplemented. The length of this dictionary is given by the number of bits used in coding. In the case where the dictionary is full, we increase the number of input data bits by 1. So we reach a capacity of this dictionary to the double. In processing a picture with the color resolution one byte for one pixel, we have a dictionary with the capacity of 256 (8 bits). The first 256 items of such dictionary (so-called basic dictionary) does not require writing of output data because it is equal to the values 0 to 255.

Discrete cosine transformation (DCT)

In coding images with a large amount of color transits and colors, RLE and LZW methods do not give good results. Quality images, however, have 24 colors or 32 bits per point, where there are only few adjacent equal points (meaning such points which differ only a little in color or in brightness). For such images it is suitable to use loss coding, since in such number of colors we do not notice greatly a lowering of quality. The coding using DTC gives very good results. Therefore also for this reason it is used in the graphics format JPEG.

In coding an image in JPEG with 75 percent maintenance of original image data, we obtain compression rates 1 : 20 up to 1 : 25. DCT is in fact the discrete Fourier transformation, where the function values sequence is the image data sequence. For processing, it is necessary first to transform the used color model (e.g., RGB, ?) into the model YCBCR.

JPEG image with the compression quality 85 percent (above) and 15 percent (below).

As you can see in above, when we increase the degree of compression a square effect occurs here. This is due to splitting the image into regular squares, with the dimension 8x8, and their subsequent processing. You can learn more about this in [2] and [4].

Fractal compression

One of the most progressive types of modern compression is the fractal. The father of this method is Michael F. Barnsley, discovered when he studying fractal algorithms. This method belongs among non-symmetrical compressions because the time between compression and decompression is significantly different in favor of compression. The method is based on a search for similarities, either in shape or color parts, in different sections of a compressed image. Mathematically, it is possible to write this by the linear transformation:

    T[x,z] = [ax + by + e, cx + dy + f ],

whereby there is in this  expression a rotation, scaling, and shifting of the found similar part

It is striking how many similar and repeating samples and details can be found in images of real world as well as in those generated by computers. The standard file for keeping such compressed images is the FIF (Fractal Image Format) where there are stored the coefficients of similarity in compressed form. In compression the algorithm aims to split an image into unequal sections (domains) and these are then transformed suitably into the result picture.

The overall range of the compressed file depends on the amount of domains and transformations used for the composition of this image. This depends mainly on the time used for searching for the best transformations. The time complexity of searching for an appropriate transformation is extraordinarily demanding and therefore mostly in the case of compression by fractal compression there is also set up a maximum time until when it is possible to search for the best transformations. It is rather difficult to search for such transformations also due to the fact that every image has different features and it is not possible to apply uniform procedures as it was in the case of the preceding types of coding. The overall time range of fractal compression takes in the order of tens of minutes. However, decompression is very fast and comparable to previous types of coding. In coding using fractal compression we receive an image which can easily be zoomed because transformations are set by functions and those can be iterated (zoomed, scaled) to any range, virtually without a loss of quality.