MLT Report Vfinal2


  • 8/4/2019 MLT Report Vfinal2


    Modulated Lapped Transform based image compression

    Course : Data Compression

    Students: Paula Popa, Cosmin Caba

    2011


    Table of Contents

    1. Introduction

    2. Transform block

    3. Quantization block

    4. Entropy Encoding Block

    4.1 Entropy Encoding Part 1: Huffman Encoding

    4.2 Entropy Encoding Part 2

    5. Measurements

    5.1. Code length measurements and entropy estimation

    5.2. Objective and Subjective assessment

    6. Conclusion

    7. Bibliography

    8. Appendix


    1. Introduction

    The general purpose of the project is to implement a coder and a decoder and to test the implemented modules on some test data. The implementation follows the block diagram presented in figure 1:

    Figure 1: Block diagram of the encoder (Input image -> T -> Q -> E -> Binary bit stream)

    The first module is the Transform block (T). The purpose of this block is to decorrelate the image. The second block (Q) includes both the quantization of the transformed matrix and the zigzag traversal of the quantized matrix. The result of the zigzag traversal is coded as a run length vector. The last block is the Entropy encoding module (E). The aim of this block is to losslessly code the run length vector into the final bit stream. More details about how these techniques have been implemented are given in the next chapters.

    After the encoding and decoding have been performed, we evaluate the quality of the reconstructed image in comparison to the original one. The report also includes several measurements related to code length and entropy estimation.

    We have chosen the Modulated Lapped Transform (MLT) to decorrelate the initial image. This decision is based on the fact that, as has been shown in many articles, lapped transforms perform very well in reducing the blocking effect in the reconstructed image. For quantization we use a standard quantization matrix, similar to the JPEG standard. For the entropy encoding the Huffman algorithm is used.

    The goal of this project is to assess the coding in terms of compression ratio, coding rate and how close the reconstructed image is to the original one (objective and subjective assessment). We do not intend to measure the performance of the implemented algorithms in terms of speed, complexity or memory, therefore the implementation is not focused on optimizing the algorithms but only on providing the right results.

    The report is structured as follows: first we describe the coding principle according to the diagram in figure 1, then the implementation and the tests that were carried out, afterwards the results are discussed, and finally the conclusions are presented.


    2. Transform block

    As previously mentioned, the transform used in this block is the Modulated Lapped Transform. The MLT is basically a block transform and it operates on the image similarly to the DCT. A lapped transform is considered an extension of the regular block transform. The difference is that for lapped transforms the basis functions are longer than for the DCT, leading to an overlap between neighboring blocks. Also, the basis functions of the MLT decay to 0 very smoothly, thus reducing the block artifacts. We have decided to use an 8x16 transform matrix. This is a compromise between quality and complexity.

    Next we show how the transformation matrix has been built. Let N be the number of basis functions in the matrix and L the length of one function. We therefore have N=8 and L=16 in our case. We denote the matrix by P; its elements are computed as follows:

    where k=1..8 and n=1..16. h(n) is a window that shapes (modulates) the basis functions in such a way that both ends have a continuous transition to 0. h(n) is defined as:

    The next picture depicts the basis functions computed with the previous formulas for the case of N=8 and L=16:

    Figure 2: MLT basis functions for N=8, L=16

    The obtained functions (figure 2) are similar to those provided in references [1] and [2].
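    As an illustration of how such a matrix can be built, the following Python sketch assumes the standard sine-window MLT definition from the literature, P(k,n) = h(n) sqrt(2/N) cos[(n + (N+1)/2)(k + 1/2) pi/N] with h(n) = sin[(n + 1/2) pi/(2N)], written with 0-based indices k=0..N-1 and n=0..2N-1. Whether this matches the report's formulas term by term is an assumption, and the names mlt_matrix and gram are hypothetical. The sketch also checks that the basis functions are orthonormal.

```python
import math

def mlt_matrix(n_basis=8):
    """Return the N x 2N MLT matrix as a list of rows (one basis function per row).

    Assumes the standard sine-window definition; indices are 0-based here,
    whereas the report counts k = 1..N and n = 1..L.
    """
    length = 2 * n_basis
    scale = math.sqrt(2.0 / n_basis)
    rows = []
    for k in range(n_basis):
        row = []
        for n in range(length):
            window = math.sin((n + 0.5) * math.pi / length)   # h(n), decays to 0 at both ends
            phase = (n + (n_basis + 1) / 2.0) * (k + 0.5) * math.pi / n_basis
            row.append(window * scale * math.cos(phase))
        rows.append(row)
    return rows

def gram(rows):
    """Inner products between all pairs of basis functions."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]

P = mlt_matrix(8)
G = gram(P)   # close to the 8x8 identity if the basis functions are orthonormal
```

    Under this assumption the Gram matrix G equals the identity up to rounding, which is the property that later allows the inverse transform to use the transposed matrix.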


    achieved because of the distortion that the transform introduces in the image. Only 4 pixels at each of the four edges are affected by this distortion. We have tried to eliminate, or at least to reduce as much as possible, the effect of the distortion, but this is the best we obtained. The technique used is to mirror 4 of the pixels around the edge. This problem appears only for lapped transforms and not for block transforms like the DCT. Although this may not affect human perception, and thus the subjective assessment of the reconstructed image, it still affects the objective assessment. When the distortion is evaluated we will take this imperfection at the borders into consideration as well.

    The next figure illustrates the reasons behind this behavior.

    Figure 5: Mapping of pixels using MLT

    Figure 5 depicts the case of a one-dimensional transform. Each block of 8 pixels from the original data is transformed into a block of 8 coefficients. But the input of the transform is 16 pixels, so we additionally take 4 pixels from the left side and 4 from the right side. For the first and the last block of 8 pixels there is a need to add 4 extra pixels on the outer side (the two red blocks). In our implementation these 4 pixels are mirrored from the first 4 and the last 4 pixels. The blue blocks are the overlapping ones. They partially overlap with the neighboring 8-pixel blocks. The problem is that, with the added pixels (red blocks), we introduce some additional information into the transformed image which cannot be extracted back when doing the inverse transformation.

    As for the inverse transformation, we compute the inverse transform matrix as the transpose of the forward transform matrix (we take advantage of the fact that we deal with an orthonormal transform). This time the input of the transform module is a block of 8 coefficients and the output is 16 pixels. We still need to map the input to an output of 8 pixels, thus at the output we overlap the 16-pixel results of neighboring blocks and sum the overlapped values in order to get the output sequence. Another technique that we have used in order to minimize the distortion at the borders is to duplicate the last and the first 8-pixel block before doing the inverse transform and to take into account the influence of these replicated blocks when computing the reconstructed data stream.
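    The mapping described above (16 input samples per block, 4 mirrored samples at each border, transpose plus overlap-add for the inverse) can be sketched in Python as follows. This is a minimal illustration, not the report's MLTenc1D/MLTdec1D: it assumes the standard sine-window MLT matrix, uses the hypothetical names mlt_forward_1d and mlt_inverse_1d, and leaves out the extra block duplication used to soften the border distortion.

```python
import math

N = 8                      # coefficients per block
L = 2 * N                  # samples seen by each basis function
PAD = N // 2               # 4 samples borrowed from each neighbor

# MLT matrix, assuming the standard sine-window definition (not confirmed by the report)
P = [[math.sin((n + 0.5) * math.pi / L) * math.sqrt(2.0 / N)
      * math.cos((n + (N + 1) / 2.0) * (k + 0.5) * math.pi / N)
      for n in range(L)] for k in range(N)]

def mlt_forward_1d(signal):
    """Map each 8-sample block to 8 coefficients using a 16-sample input window."""
    assert len(signal) % N == 0
    signal = list(signal)
    # mirror the first and last 4 samples to extend the signal at both borders
    padded = signal[:PAD][::-1] + signal + signal[-PAD:][::-1]
    blocks = []
    for start in range(0, len(signal), N):
        window = padded[start:start + L]
        blocks.append([sum(p * x for p, x in zip(row, window)) for row in P])
    return blocks

def mlt_inverse_1d(blocks):
    """Apply the transposed matrix to each block and overlap-add the 16-sample outputs."""
    out = [0.0] * (len(blocks) * N + 2 * PAD)
    for b, coeffs in enumerate(blocks):
        for n in range(L):
            out[b * N + n] += sum(P[k][n] * coeffs[k] for k in range(N))
    return out[PAD:-PAD]   # drop the positions that held the mirrored samples

signal = [float((7 * i) % 23 + 100) for i in range(32)]
restored = mlt_inverse_1d(mlt_forward_1d(signal))
errors = [abs(a - b) for a, b in zip(signal, restored)]
```

    In this sketch the interior samples are recovered up to rounding, while the first and last 4 samples are not, which is consistent with the border distortion discussed above.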

    The property of separability has been used to implement the two-dimensional transform (2-D transform): first a one-dimensional transform is applied to each row of the input image, and then a one-dimensional transform is applied to every column of the result.

    The functions that can be used for the forward transform are MLTenc1D (one-dimensional transform) and MLT_enc (2-D transform), and for the inverse transform MLTdec1D and MLT_dec.
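    The separable scheme can be illustrated independently of the MLT itself. The Python sketch below applies an arbitrary length-preserving 1-D transform to all rows and then to all columns; the helper names and the toy sum/difference transform are hypothetical and only stand in for the 1-D MLT.

```python
def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]

def transform_rows(matrix, transform_1d):
    """Apply a 1-D transform to every row."""
    return [transform_1d(row) for row in matrix]

def transform_2d(image, transform_1d):
    """Separable 2-D transform: rows first, then columns of the row result."""
    rows_done = transform_rows(image, transform_1d)
    cols_done = transform_rows(transpose(rows_done), transform_1d)
    return transpose(cols_done)

def pairwise_sum_diff(values):
    """Toy 1-D transform used only for the demonstration: (a, b) -> (a + b, a - b)."""
    out = []
    for i in range(0, len(values), 2):
        out.extend([values[i] + values[i + 1], values[i] - values[i + 1]])
    return out

result = transform_2d([[1, 2], [3, 4]], pairwise_sum_diff)
```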

    The transformed matrix that we obtained for the input image Lenna has almost all the information concentrated in the DC coefficient (the first value in an 8x8 block); the rest of the coefficients decay to 0. The next figure shows the first transformed block of the image Lenna:


    Figure 6: 8x8 pixels transformed block

    The purpose of the transformation process has been achieved. The coefficients are not as correlated as the pixels of the original image. There is still some redundancy left, which is dealt with in the next block of the encoder, namely the Quantization block.

    3. Quantization block

    The actual compression starts with quantization. Each 8x8 block of the picture, after the MLT has been applied, is quantized by dividing each element of the 8x8 matrix by the corresponding element of the quantization matrix and rounding to the nearest integer value. The image quality is controlled by selecting a specific quantization table. The quality levels range between 1 and 100, where 1 gives the poorest image quality and the highest compression and 100 gives the best quality and the lowest compression. In the project we use the quality level 50 matrix because it is a trade-off between image quality and compression level, providing a very good image quality and high compression. The quantization matrix used is:

    (8x8 matrix Q50)

    To obtain a different quality level the matrix Q50 is scaled. For a quality level above 50, Q50 is multiplied by (100 - quality level)/50; for a quality level below 50, it is multiplied by 50/(quality level). Thus for a quality level of 90 the Q50 matrix is multiplied by 10/50, and for a quality level of 10 it is multiplied by 50/10. The elements of the new matrix are then rounded and clipped so that they are positive integers in the interval from 1 to 255. For the measurements in the project we used the Q10, Q30, Q70 and Q90 quantization matrices to compare the differences in image quality with the image compressed with the standard Q50 matrix.
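    A possible Python sketch of the table scaling and of the quantization step is given below. It assumes that Q50 is the standard JPEG luminance table (the report only says the matrix is similar to the JPEG standard) and uses the usual JPEG-style scaling rule: factor (100 - quality)/50 above 50, and 50/quality below. The function names are hypothetical.

```python
import math

# Assumption: the standard JPEG luminance quantization table stands in for Q50.
JPEG_Q50 = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def round_half_up(x):
    """Round to the nearest integer, with halves rounded up."""
    return int(math.floor(x + 0.5))

def scaled_table(quality):
    """Scale Q50 to another quality level and clip the entries to 1..255."""
    if not 1 <= quality <= 100:
        raise ValueError("quality must be between 1 and 100")
    factor = (100 - quality) / 50.0 if quality > 50 else 50.0 / quality
    return [[min(255, max(1, round_half_up(q * factor))) for q in row] for row in JPEG_Q50]

def quantize(block, table):
    """Divide each coefficient by its table entry and round to the nearest integer."""
    return [[round_half_up(c / q) for c, q in zip(brow, qrow)]
            for brow, qrow in zip(block, table)]

def dequantize(block, table):
    """Decoder side: multiply the quantized values back by the table entries."""
    return [[c * q for c, q in zip(brow, qrow)] for brow, qrow in zip(block, table)]

Q10 = scaled_table(10)
Q90 = scaled_table(90)
```

    With this rule the Q90 entries are smaller than the Q50 entries (finer quantization), while the Q10 entries are larger and saturate at 255 for the highest frequencies.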

    The other quantization matrices, Q10 and Q90, are then equal to:

    (8x8 matrices Q10 and Q90)

    After quantization the coefficients situated in the upper-left corner of the image block have the biggest values; they correspond to the lower frequencies, to which the human eye is more sensitive. Most of the remaining coefficients become zero and are not used for the reconstruction of the image (this is the lossy part of the compression). Measurements have been done for all of the above quantization matrices, and the difference in image quality can be seen in the pictures.

    The quantized coefficients are now prepared for coding. The coefficients are read in zigzag order, which groups the large number of zeros together so that they compress much better. The DC coefficients are differentially coded and the AC coefficients are run-length coded. The DC coefficients are treated separately because they contain a big fraction of the image energy. In the differential coding the DC coefficient of the previous block is subtracted from the DC coefficient of the current block. The difference obtained is then encoded, which exploits the spatial correlation between the DC coefficients of neighboring blocks. Because the differences between DC coefficients are small, fewer bits are needed to encode them. After the run length coding we have pairs of the form (symbol, frequency of symbol), where the frequency of the symbol is the number of consecutive occurrences of a value. The encoded values are stored in a single vector and then Huffman encoded. This process of zigzag ordering, differential coding and run-length coding makes the entropy coding easier and more efficient.

    Figure 7: Left: zigzag ordering for an 8x8 block; Right: differential coding [1]
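    The three steps (zigzag scan, differential coding of the DC value, run-length coding of the AC values) can be sketched in Python as shown below. The exact layout of the report's run length vector is not specified, so the (value, count) pairs and the function names used here are assumptions made for illustration.

```python
def zigzag_indices(n=8):
    """Return (row, col) pairs in JPEG-style zigzag order for an n x n block."""
    order = []
    for s in range(2 * n - 1):                      # s = row + col, one anti-diagonal at a time
        cells = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        order.extend(cells if s % 2 else reversed(cells))
    return order

def run_length(values):
    """Collapse a sequence into [value, count] pairs of consecutive repeats."""
    pairs = []
    for v in values:
        if pairs and pairs[-1][0] == v:
            pairs[-1][1] += 1
        else:
            pairs.append([v, 1])
    return pairs

def encode_blocks(blocks):
    """Per block: DC difference to the previous block, then run-length coded AC values."""
    order = zigzag_indices(len(blocks[0]))
    encoded, prev_dc = [], 0
    for block in blocks:
        scan = [block[r][c] for r, c in order]
        encoded.append((scan[0] - prev_dc, run_length(scan[1:])))
        prev_dc = scan[0]
    return encoded

# Two quantized blocks that are mostly zero, as is typical after quantization
block_a = [[0] * 8 for _ in range(8)]
block_a[0][0], block_a[0][1], block_a[1][0] = 50, 3, -2
block_b = [[0] * 8 for _ in range(8)]
block_b[0][0] = 47
encoded = encode_blocks([block_a, block_b])
```

    For the two example blocks the second DC value is stored as the small difference -3 instead of 47, and each long tail of zeros collapses into a single pair.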


    4. Entropy Encoding Block

    After quantization, which is the lossy part of the compression, this block aims to compress the data in a lossless manner. The biggest part of this block is the Huffman encoding. We have chosen Huffman instead of arithmetic coding because we found it easier to implement, although arithmetic coding can in general get somewhat closer to the entropy. A disadvantage of Huffman coding is that the codebook may become too large, but this is not our case. The Huffman encoding is applied to the run length vector that is obtained in the Quantization phase.

    4.1 Entropy Encoding Part 1 : Huffman Encoding

    For the entropy coding we use the Huffman algorithm, which produces variable length codes that use fewer bits for the most probable (more frequent) symbols and more bits for the less probable symbols. The reason for using the Huffman algorithm is that it is much easier to implement and not as complex as the arithmetic algorithm. The arithmetic algorithm is slower and needs more time for decoding than Huffman, which only requires looking up the codeword corresponding to each symbol in a table. The Huffman algorithm generates compact and optimal codes, which are stored in a codebook. The encoded data and the codebook are needed at the decoder to obtain the initial symbols. The symbols are represented with a prefix code: the bit sequence that represents a symbol is never a prefix of the bit sequence that represents another symbol.

    In the project we use the static Huffman algorithm. We generate a binary tree by taking the two least probable symbols and merging them into a new symbol with a probability equal to the sum of the probabilities of the two. The process continues until only one symbol remains (a single node, the root node). As a part of the Huffman algorithm we use a function that generates the binary tree and the code words used to encode the symbols. The Huffman dictionary is created from the binary tree by appending zeros and ones to the code words of the two least probable symbols at each step, until no symbols are left. The code words are stored in a vector whose size equals the number of symbols, sorted by the frequency of the corresponding symbol. The lowest weights are in the first positions. Since the least probable symbol is in the first position, the decoder evaluates the vector with the code words in descending order, from the end towards the beginning, so that the most frequent symbols are tested first, which saves time.

    At the receiver side the decoding process takes place. The bit stream is decoded using the codebook in order to recover the initial symbols. From the received bit stream we take a number of bits equal to the length of a codeword from the dictionary and compare them with that codeword. If the values are the same, we take the corresponding symbol and store it in a vector. The code words in the vector are evaluated in decreasing order, to be more efficient. Once we find a symbol we start again from the last values of the vector, because it is more probable to find the most frequent symbols at the end of the vector. After we obtain the Huffman decoded values, the symbols are differentially and run-length decoded and then the zigzag ordering is inverted.
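    The construction and use of the codebook can be sketched in Python as follows. This is a generic static Huffman coder built on a priority queue, not the report's Matlab functions: the names are hypothetical, and the decoder uses a dictionary lookup instead of scanning a sorted code word vector from the end.

```python
import heapq
from collections import Counter

def huffman_codebook(frequencies):
    """Build {symbol: bit string} by repeatedly merging the two least frequent nodes."""
    if len(frequencies) == 1:
        return {symbol: "0" for symbol in frequencies}
    # heap entries: (weight, tie breaker, {symbol: partial code})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(frequencies.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w0, _, low = heapq.heappop(heap)
        w1, _, high = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in low.items()}       # prepend a bit on each side
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (w0 + w1, counter, merged))
        counter += 1
    return heap[0][2]

def huffman_encode(symbols, codebook):
    """Concatenate the code words of all symbols into one bit string."""
    return "".join(codebook[s] for s in symbols)

def huffman_decode(bits, codebook):
    """Read bits until they match a code word; the prefix property makes this unambiguous."""
    lookup = {code: s for s, code in codebook.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            decoded.append(lookup[current])
            current = ""
    return decoded

data = [0, 0, 0, 0, 0, 3, 3, -1, 0, 0, 7]
codebook = huffman_codebook(Counter(data))
bits = huffman_encode(data, codebook)
decoded = huffman_decode(bits, codebook)
```

    Because the decoder rebuilds the same codebook from the same symbol frequencies, the encoder only has to transmit those frequencies, which is the approach described in the next subchapter.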

    After encoding the run length vector we must provide the receiver side with some extra

    information in order to be able to reconstruct the Huffman codebook identically to the one used in the

    encoding side. The algorithm and techniques used for this are described in the next subchapter.


    4.2 Entropy Encoding Part 2

    At this point we have our picture encoded using the Huffman algorithm. In order to decode this data at the receiver, we need to have the same codebook. There are several ways to make the codebook available at the receiver side. One of them is to send the codebook from the encoder to the decoder in such a way that the decoder knows how to read it from the file. Another option is to send some data that helps the decoder to reconstruct the same codebook that has been used to encode the run length vector. A data format and a protocol have to be established for this case as well.

    The solution we have chosen is to send the symbols and the frequencies associated with them to the decoder, so that the codebook can be rebuilt. Even if this solution might be slower than sending the codebook directly, we are not concerned with the speed of the algorithms.

    First we describe how the file looks when all the necessary information has been written to it. We may think of it as a packet which has a header and a payload. The header contains the information necessary to rebuild the Huffman codebook and the payload contains the data that needs to be decoded using this codebook.

    Figure 8: File structure (Header | Payload)

    Basically the header contains the information regarding the symbols and frequencies taken from the run length vector. The next step is to encode this information in such a way that the decoder knows how to interpret it. For this we have used a protocol.

    Because each symbol might need a different number of bits to be represented, a fixed symbol length would not be appropriate. Thus we represent each symbol with a variable number of bits, and to delimit consecutive values we use a standard sequence of bits: 0 1 1 1 1 0.

    Before being encoded using this protocol the information is structured as in the next figure:

    Figure 9: Header structure before encoding

    The header consists of two rows: one with the symbols and the other with the frequencies. The columns are sorted so that the symbols are in increasing order, with the value 0 being somewhere in the middle.

    We start by taking the first symbol and transforming it to its binary representation. The result is written to a vector and then the delimiter 0 1 1 1 1 0 is added to the vector. Next the frequency associated with this symbol is transformed to binary and added, followed by the delimiter. After this we move to the next column and keep doing this until the end of the header. We based our protocol on the fact that the delimiter will not appear in the middle of the data very often. If the delimiter 0 1 1 1 1 0 would otherwise appear inside the data during the binary transformation, a 0 is added after each sequence of three consecutive 1s. At the receiver side, whenever the decoder finds three consecutive 1s it removes the next bit, which should be the 0 we have inserted previously, and whenever it finds the delimiter it reads the bit sequence and transforms it from binary to decimal, reconstructing the header as illustrated in figure 9.

    We do not encode the sign of the symbol. For this we take advantage of the header being sorted. The decoder adds a negative sign in front of the symbols, starting from the first one, until it finds 0 or it discovers that the order of the magnitudes switches from decreasing to increasing.
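    The header protocol can be illustrated with the Python sketch below. It is one possible reading of the description, with two stated assumptions: a 0 is inserted after every run of three 1s (so four consecutive 1s can only occur inside the delimiter), and the symbol row contains the value 0 or at least one negative symbol. All function names are hypothetical.

```python
DELIMITER = "011110"

def stuff(bits):
    """Insert a 0 after every run of three consecutive 1s."""
    out, ones = [], 0
    for b in bits:
        out.append(b)
        ones = ones + 1 if b == "1" else 0
        if ones == 3:
            out.append("0")
            ones = 0
    return "".join(out)

def unstuff(bits):
    """Drop the bit that follows every run of three consecutive 1s."""
    out, ones, skip = [], 0, False
    for b in bits:
        if skip:
            skip = False
            continue
        out.append(b)
        ones = ones + 1 if b == "1" else 0
        if ones == 3:
            skip, ones = True, 0
    return "".join(out)

def encode_header(symbols, frequencies):
    """Write |symbol|, delimiter, frequency, delimiter for each column of the header."""
    parts = []
    for s, f in zip(symbols, frequencies):
        for value in (abs(s), f):
            parts.append(stuff(format(value, "b")) + DELIMITER)
    return "".join(parts)

def decode_header(bits):
    """Split at the delimiters, undo the stuffing and restore the signs."""
    values, start = [], 0
    while start < len(bits):
        end = bits.index("1111", start) - 1      # four 1s occur only inside a delimiter
        values.append(int(unstuff(bits[start:end]), 2))
        start = end + len(DELIMITER)
    magnitudes, frequencies = values[0::2], values[1::2]
    symbols, negative, prev = [], True, None
    for m in magnitudes:
        if negative and (m == 0 or (prev is not None and m > prev)):
            negative = False                     # reached 0 or the magnitudes started to grow
        symbols.append(-m if negative else m)
        prev = m
    return symbols, frequencies

header_bits = encode_header([-15, -7, -1, 0, 2, 9], [1, 3, 12, 200, 15, 7])
symbols, frequencies = decode_header(header_bits)
```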

    At this point we have two vectors with binary values. The first one is encoded using the Huffman algorithm and represents the picture itself. The second one represents the information needed to reconstruct the Huffman codebook and is encoded using the protocol described earlier.

    The next logical step is to write these two vectors to a file. When writing to a file in Matlab we have to specify what data type we are writing (int8, int16, int32 etc.). Each value we write is then represented in the file using the precision of that specific data type. The smallest precision is that of uint8, with which each value is represented using only 8 bits. The problem is that the values in our two vectors are bits, so there is a huge amount of wasted capacity if we represent 1 bit using 8 bits.

    In order to address this issue we came up with a solution that takes the bits in blocks of 8 and transforms each block of 8 bits into the corresponding integer value. The resulting integer is then written to the file as a uint8, thus using 8 bits. In this way we take 8 bits and represent them in the file using 8 bits. Therefore, before writing the binary header to the file, we transform it into a vector of integers. The same goes for the binary vector coded using the Huffman algorithm. The two resulting vectors of integers are finally written to the file.

    The decoder has to take the vectors from the file and represent the integer values in binary, obtaining the binary header and the Huffman binary vector. Then, from the binary header, using the protocol presented above, it computes the header with the symbols and the frequencies associated with the symbols. Using this header information the codebook is easily reconstructed, and the Huffman decoding can be applied in order to obtain the run length vector.
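    The packing of 8 bits into one byte can be sketched in Python as follows. The report does not say how a bit vector whose length is not a multiple of 8 is handled, so the zero padding of the last byte and the explicit bit count passed to the unpacking function are assumptions, as are the function names.

```python
def pack_bits(bits):
    """Group a bit string into bytes; the last byte is zero-padded on the right."""
    padded = bits + "0" * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))

def unpack_bits(data, bit_count):
    """Expand bytes back into a bit string and drop the padding bits."""
    bits = "".join(format(byte, "08b") for byte in data)
    return bits[:bit_count]

bit_vector = "1011001110001"            # 13 bits, stored in 2 bytes instead of 13
packed = pack_bits(bit_vector)
unpacked = unpack_bits(packed, len(bit_vector))
```

    The resulting bytes object can be written to a file in binary mode, which corresponds to writing a vector of uint8 values.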


    5. Measurements

    The first part of the measurements chapter comprises the results about code length and the estimates of the entropy at different points of the encoding process. The measurements are performed on the Lenna image.

    5.1. Code length measurements and entropy estimation

    In the quantization stage we have used 5 different quantization tables, thus we expect the encoder/decoder modules to perform differently in each case. The most common quantization table, which is also used in the JPEG standard, is Q50. First we perform some measurements on the test data for this case (with Q50) and then we compare and relate them to the other cases (Q10, Q30, Q70 and Q90).

    The code length is computed as the final file size expressed in bits divided by the total number of pixels in the compressed image. The result is the following:

    Code length = 215872 bits / 262144 pixels = 0.8235 bits/pixel

    Using the code length we may compute the compression ratio as well:

    Compression ratio = 8 / 0.8235 = 9.7146
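    The two figures above can be reproduced with a few lines of Python. The sketch uses the file size and pixel count quoted in the text, assumes 8 bits per pixel for the original image, and rounds the code length to four decimals before computing the ratio, as the text does.

```python
file_size_bits = 215872          # Q50 file size quoted in the text
pixel_count = 262144             # number of pixels quoted in the text
original_bits_per_pixel = 8      # assumed representation of the original image

code_length = round(file_size_bits / pixel_count, 4)               # bits/pixel
compression_ratio = round(original_bits_per_pixel / code_length, 4)

print(code_length, compression_ratio)
```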

    We assumed that the initial image needs 8 bits to represent a pixel. In order to evaluate how good our compression algorithm is, we have to relate the code length to the entropy estimate. There are many ways of estimating the entropy. Here the entropy is estimated from the probability model of the source by applying the following formula:

    $$H = -\sum_{i} p_i \log_2 p_i$$

    where p_i is the relative frequency of the i-th distinct value in the data.

    First we estimate the entropy of the original image without taking into consideration any correlation between pixels, that is, we make the assumption of i.i.d. pixels. The entropy in this case is H=7.2185. It is not relevant to compare this entropy to the code length, but if we assume that initially each pixel is coded individually using 8 bits (no correlation is exploited), so that the rate is 8 bits/pixel, then the entropy estimate is very close to this value.
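    A first-order entropy estimate of this kind can be computed from a histogram of the values. The Python sketch below shows the idea on small toy sequences; the function name entropy_bits is hypothetical, and the same function could be applied to the pixels, the transform coefficients or the quantized coefficients.

```python
import math
from collections import Counter

def entropy_bits(values):
    """First-order entropy estimate in bits per value, treating the values as i.i.d."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

uniform = entropy_bits([0, 1, 2, 3])                  # four equally likely values
skewed = entropy_bits([0] * 14 + [1, 2])              # mostly zeros, as after quantization
constant = entropy_bits([5] * 10)                     # a single value carries no information
```

    The skewed sequence has a much lower estimate than the uniform one, which is the effect that quantization has on the coefficient statistics.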

    The purpose of the transform block is to decorrelate the initial image, thus improving the coding. The entropy estimate of the matrix containing the transform coefficients is H=4.3889. The entropy has decreased significantly because the coefficients are not as correlated to each other as the pixels of the original image. Still, the difference between the code length and this entropy estimate is very big. The explanation is that the Quantization block discards a part of the information contained in the image. Quantization is not about improving the coding gain but only about discarding information that is not very relevant for a good reconstruction of the original image.

    The entropy estimate of the quantized coefficients is H=0.8591. This time the entropy estimate is very close to the code length. However, the entropy should be a lower limit, so the code length should be close to but higher than the entropy, whereas here the code length (0.8235) is slightly lower. The reason is that there is still some correlation that our estimate has not captured, and this correlation is exploited by the run length coding: the coding algorithm uses this technique, but our entropy estimate does not capture its effect.

    The code length that we have computed takes into account the run length coding and the extra information added to the file to help reconstruct the codebook at the receiver, thus it is not a very precise measure if we intend to evaluate the performance of our Huffman algorithm alone. In order to focus only on the Huffman coding part we may consider the run length vector as the input and calculate the code length relative to the symbols in the run length vector. The next picture depicts the idea:

    Figure 10: Part of the encoder for the Q50 quantization case

    The entropy estimate for the run length vector is written in the figure. The code length may be computed in two ways: the first is to take into account the header information, and thus the whole file size, and the second is to take into account only the run length vector transformed to binary by the Huffman algorithm. For the first case the code length is:

    C1 = 215872 bits / 64338 symbols = 3.3553 bits/symbol

    And for the second case we have:

    C2 = 3.3097 bits/symbol

    (These two values are listed in Table 1 as C2 and C3, because there C1 denotes the code length in bits/pixel.)

    As we notice, this time the measurement and the entropy estimate comply with the general rule for Huffman encoding:

    $$H \le L < H + 1$$

    where L is the average length of the code. The code length C1 is bigger than C2 because of the header information that is written to the file. C1 is more relevant as a comparison term than C2 if we want to evaluate the encoder as a whole. If we think that there are better ways of compressing the header, and we only want to evaluate how the Huffman algorithm performs without taking into account how well we have compressed the extra information needed for decoding, then it makes more sense to compare the entropy with C2.

    The next table presents the measurements for all the quantization tables used:

                        Q10      Q30      Q50      Q70      Q90
    File size (bits)    77384    140904   215872   387992   568632
    H1                  0.3486   0.5849   0.8591   1.4653   2.0290
    H2                  2.9145   3.1902   3.2596   3.2641   3.2935
    C1 (bits/pixel)     0.2952   0.5375   0.8235   1.4801   2.1691
    C2 (bits/symbol)    3.0078   3.2994   3.3553   3.3334   3.3644
    C3 (bits/symbol)    2.9671   3.2556   3.3097   3.2894   3.3126
    Comp. ratio         27.1     14.9     9.7      5.4      3.6

    Table 1.

    H1 - entropy estimate for the quantized coefficients;


    5.2. Objective and Subjective assessment

    This subchapter deals with measuring the distortion between the original and the reconstructed image. This is a way of evaluating how closely the reconstructed image resembles the initial picture. Another way is to perform a subjective assessment by comparing the two images. Subjective assessment usually gives better results, in the sense that it reflects more closely the quality, and especially the difference in quality, between the two images. Because such subjective quality assessments are much more difficult to carry out, objective measurements are often considered sufficient.

    Here we have performed an objective assessment for the several compression cases that we have, but also a subjective assessment. In order to be relevant and precise, a subjective assessment must be made with a large number of persons. This is not the case for the project we are working on.

    In order to objectively assess the difference between a reconstructed image and the original Lenna image we chose to compute the PSNR. There are other methods that can be used, but they are outside the scope of this report and project. The PSNR reflects the difference in quality between two pictures. A bigger value of the PSNR means that the reconstruction is closer to the original image. The formula used to compute the PSNR is:

    $$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}}$$

    where MSE is the mean square error between individual pixels of the two images, computed for two images x and y of M x N pixels as:

    $$\mathrm{MSE} = \frac{1}{M N} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( x(i,j) - y(i,j) \right)^2$$
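    A direct Python sketch of the PSNR computation for two equally sized grayscale images is given below. It assumes the usual definition with a peak value of 255 for 8-bit pixels, PSNR = 10 log10(255^2 / MSE), and the function names mse and psnr are hypothetical.

```python
import math

def mse(original, reconstructed):
    """Mean square error between two images given as lists of rows."""
    rows, cols = len(original), len(original[0])
    total = sum((original[i][j] - reconstructed[i][j]) ** 2
                for i in range(rows) for j in range(cols))
    return total / (rows * cols)

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(original, reconstructed)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)

original = [[100, 110], [120, 130]]
reconstructed = [[101, 109], [120, 132]]
error = mse(original, reconstructed)
quality_db = psnr(original, reconstructed)
```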

    A good way of observing the tradeoff between the distortion and the rate of the code is to create a rate-distortion graph. The next figure (figure 12) illustrates the rate-distortion curve for the five different quantization tables that are used in the compression algorithm (Q10 to Q90).

    The graph shows that at low rates, and thus poor quality, the PSNR is smaller than at higher rates, which is what we expected. An important thing is the logarithmic shape of the curve. This tells us that the PSNR increases faster when we are in the low rate (low quality) zone of the curve, until it reaches a saturation point where, even if the rate, and thus the quality, is increased, the PSNR increases at a slower pace. This somehow resembles the human perception of quality. We shall discuss this after the subjective assessment.

    The following PSNR values were calculated for the different quantization matrices for the image Lenna:

    Quality level   PSNR [dB]   Rate [bits/pixel]
    10              31.48       0.259
    30              34.89       0.537
    50              36.88       0.822
    70              39.10       1.48
    90              40.81       2.169

    Table 2


    Figure 12: Rate-distortion graph

    As for the subjective quality assessment, the next figures depict the five reconstructed images with different quality levels:

    Figure 13: Lenna reconstruction : left -> Q10, right ->Q30


    Figure 14: Lenna reconstruction : left -> Q50, right ->Q70

    Figure 15: Left: Lenna reconstruction with Q90, Right: Lenna

    Looking at the reconstructed images we notice the same pattern of low quality when we use a

    lower rank quantization table (Q10 or Q30). The quality of the decoded picture is reflected by the rate of

    the code, therefore low rate leads to poor quality. It can be seen that reconstruction with Q90 is almost

    perfect, the compression is lower and the file size is bigger. When we use Q10 and Q30 quantization

    matrices for the reconstruction of the image, the quality is poor, but the compression is higher and the

    size of the file is smaller. The use of higher quality levels matrices gives better quality images since after

    quantization with these matrices, doesnt result so many values of zeros which dont participate at the

    reconstruction of the image.

    If we are to trace a curve with the subjective assessment it would have a logarithmic shape,

    because the quality (in or perception, though it may not be the case for other people) increases faster

    when the rate is low. This means that the human visual system is more sensitive to changes when
    the quality is poor. If the quality is good (i.e. the Q70 or Q90 reconstructions), we cannot notice
    the difference in quality.

    As a final conclusion for this subchapter, our PSNR measurement follows at least the trend of
    the subjective assessment. To obtain a precise graph for the subjective assessment we would need
    to develop a grading system and have many test subjects grade the images. In any case, the
    resulting curve would probably look like the PSNR graph.

    There is an issue that we have not addressed so far, concerning our transform block. We must not
    forget that this block, in spite of the fact that it should offer perfect reconstruction, introduces
    some distortion at the borders of the reconstructed image. When we computed the PSNR above, we did
    not take this effect into account. If we plot the difference between the new PSNR (PSNR2) and the
    old PSNR (PSNR1) against the rate, we get the following graph:

    Figure 16: Difference between the PSNRs versus rate

    For low quality images the difference is small, but it becomes more relevant for high quality (when
    we quantize the transform coefficients using smaller values). Thus the distortion induced by the
    transform block becomes relevant when moving towards higher quality of the reconstructed image.
    This is valid only for the objective assessment, because in the subjective assessment the four
    pixels affected by distortion at the borders of the image are not noticeable by the eye.
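
    The two PSNR measurements can be sketched as follows. This is a minimal Python illustration, not
    the project code; it assumes 8-bit images (peak value 255) and a border of 4 pixels on each side,
    and the function name and the border parameter are our own.

```python
import numpy as np


def psnr(original, reconstructed, peak=255.0, border=0):
    """PSNR in dB; if border > 0, that many pixels are skipped on every edge."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(reconstructed, dtype=np.float64)
    if border > 0:
        a = a[border:-border, border:-border]
        b = b[border:-border, border:-border]
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)


# Toy example: the reconstruction differs from the original only in a 4-pixel frame.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
decoded = image.copy()
decoded[:4, :] += 10
decoded[-4:, :] += 10
decoded[:, :4] += 10
decoded[:, -4:] += 10

psnr_full = psnr(image, decoded)               # border pixels included
psnr_cropped = psnr(image, decoded, border=4)  # border pixels excluded
print(psnr_full, psnr_cropped)
```

    In this toy case the whole error sits in the border, so the cropped PSNR is infinite while the
    full-image PSNR is finite. With real quantization noise the two values differ by a finite amount.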

    6. Conclusion

    For the decorrelation of the image, the project used the Modulated Lapped Transform (MLT). This
    technique is better than the DCT because it eliminates the blocking effect. This is done by using
    basis functions that decay smoothly to 0 and are longer than in the DCT case, yielding an overlap
    of the samples between neighboring blocks. In our case the basis functions have a length of 16
    samples and thus the overlap between neighboring blocks is 4 samples on each side. We showed that
    even if there are some imperfections at the borders of the reconstructed image (because we add
    some samples), it is almost identical to the original one: only 4 pixels at each edge are affected
    by the distortion. By applying the MLT transform we decorrelate the pixels.
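
    The overlap and the border effect can be reproduced in one dimension with the Python sketch
    below. It is not the project code: it assumes a block size of 8, the usual sine-window definition
    of the MLT basis, and zero padding of 4 samples on each side (the report only says that some
    samples are added). All function names are hypothetical.

```python
import numpy as np

M = 8  # assumed block size; each basis function then has 2 * M = 16 samples


def mlt_matrix(m=M):
    """Return the 2m x m matrix whose columns are the MLT basis functions."""
    n = np.arange(2 * m).reshape(-1, 1)
    k = np.arange(m).reshape(1, -1)
    window = np.sin((n + 0.5) * np.pi / (2 * m))
    return window * np.sqrt(2.0 / m) * np.cos((n + (m + 1) / 2.0) * (k + 0.5) * np.pi / m)


def mlt_forward(signal, p):
    """Transform overlapping frames of length 2m taken every m samples."""
    m = p.shape[1]
    padded = np.pad(np.asarray(signal, dtype=np.float64), m // 2)  # zero padding
    n_blocks = len(signal) // m
    return np.array([p.T @ padded[i * m: i * m + 2 * m] for i in range(n_blocks)])


def mlt_inverse(coeffs, p):
    """Overlap-add the inverse-transformed frames and remove the padding."""
    m = p.shape[1]
    out = np.zeros((coeffs.shape[0] + 1) * m)
    for i, c in enumerate(coeffs):
        out[i * m: i * m + 2 * m] += p @ c
    return out[m // 2: -(m // 2)]


P = mlt_matrix()
rng = np.random.default_rng(1)
x = rng.uniform(1, 255, size=64)          # one image row, length a multiple of 8
rec = mlt_inverse(mlt_forward(x, P), P)

interior_error = np.max(np.abs(rec[4:-4] - x[4:-4]))
border_error = np.max(np.abs(rec[:4] - x[:4]))
print(interior_error, border_error)
```

    The interior samples are recovered up to rounding error because every one of them is covered by
    two overlapping frames. The first and last 4 samples are covered by only one frame, so they are
    distorted.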

    For the quantization we used the standard quality level 50, for which we obtained good results.
    The measurements were also done for the 10, 30, 70 and 90 quality levels. For the higher quality
    levels, 70 and 90, we obtained a good quality image but less compression and thus a bigger file
    size. That is because the quantization at the higher quality levels does not discard as much of
    the information needed for the reconstruction of the image as the lower levels do.

    For the entropy encoding we used the Huffman algorithm because it is simpler and faster than
    arithmetic coding. After the MLT transform, quantization, run length coding and Huffman coding,
    the data was put in a file and sent to the decoder. In the header of the file we kept the
    information necessary to rebuild the codebook, so that the symbols can be decoded. The average
    code length obtained with the Huffman algorithm was comparable with the entropy. The entropy was
    estimated on the original image, after the MLT transform, after quantization and on the run
    length vector. The code length was measured relative to the image, relative to the run length
    vector and relative to the run length vector without the header. We noticed that after each step
    the entropy estimate improved and the Huffman code length came closer to the entropy. That is
    because after each step the pixels are more decorrelated, leading to lower entropy and better
    compression.
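
    The comparison between the entropy estimate and the average Huffman code length can be sketched
    in Python as below. This is an illustration under our own assumptions, not the project code: the
    entropy is the first-order estimate from symbol frequencies, the header is ignored, and the
    function names are hypothetical.

```python
import heapq
from collections import Counter
from math import log2


def entropy_bits(symbols):
    """First-order entropy estimate in bits/symbol from relative frequencies."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * log2(c / total) for c in counts.values())


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code built on the data."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}  # a single symbol still needs one bit
    # Heap entries: (frequency, unique tie-breaker, symbols in this subtree).
    heap = [(c, i, [s]) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(counts, 0)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, s1 = heapq.heappop(heap)
        c2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # every merge adds one bit to the symbols below it
        heapq.heappush(heap, (c1 + c2, tie, s1 + s2))
        tie += 1
    return lengths


def average_code_length(symbols):
    """Average Huffman code length in bits/symbol."""
    counts = Counter(symbols)
    lengths = huffman_code_lengths(symbols)
    return sum(counts[s] * lengths[s] for s in counts) / len(symbols)


run_length_vector = [0, 0, 0, 0, 1, 1, 2, 3]  # toy data, not taken from the report
print(entropy_bits(run_length_vector), average_code_length(run_length_vector))
```

    For any source the average Huffman length lies between the entropy and the entropy plus one bit;
    the two are equal when all symbol probabilities are powers of 1/2, as in the toy vector above.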

    The second part of the measurements comprises the rate-distortion graphs. The measurements
    were performed for each of the quality levels used. The code rate is higher for the higher
    quality levels, and the PSNR grows as the number of bits/pixel increases. The PSNR was measured
    in both situations: omitting the imperfections at the borders (the first and last four pixels)
    and taking them into account.

    For the objective assessment, when we plotted the difference between the two PSNRs (the
    error) against the rate, the difference grew faster at the higher rates. This means that the
    border distortion weighs more in the measured quality at the higher quality levels than at the
    lower ones.

    The compression algorithm used is similar to JPEG, but the MLT transform is used instead of the
    DCT to eliminate the blocking effects. The results obtained are good and the quality of the
    reconstructed image is close to the original picture.

    As for future work, improvements have to be made in order to achieve perfect reconstruction
    at the borders of the image. It would also be very interesting to compare the MLT technique
    against the DCT to observe how well the overlapping reduces the blocking effect.

    7. Bibliography

    1) Til Aach: Fourier, Block and Lapped Transforms, Institute for Signal Processing, University of Lubeck;

    2) Henrique S. Malvar: Extended Lapped Transforms: Properties, Applications, and Fast Algorithms, IEEE Transactions on Signal Processing, vol. 40, no. 11, 1992;

    3) Khalid Sayood: Introduction to Data Compression, Third edition, 2005;

    4) Gregory K. Wallace, Multimedia Engineering, Digital Equipment Corporation, Maynard, Massachusetts: The JPEG still compression standard, IEEE Transactions on Consumer Electronics, 1991;

    5) Tinku Acharya, Ajoy K. Ray: Image processing Principles and Applications, 2005;

    6) Cornelius T. Leondes: Database and Data Communication Network Systems, Volume 1, Academic Press, 2002.

    8. Appendix

    The source code is uploaded in electronic format. Here we give just a short introduction to each
    function we used to encode and decode the image.

    The implementation is structured in 2 main parts: encoding part and decoding part.

    There is a program that computes the transform matrix and plots the basis functions:
    MLTmatrix.m.

    We have defined the five quantization tables under the names Q10, Q30, Q50, Q70, Q90.
    Each time someone wishes to run the encoder or the decoder, these matrices must be loaded into
    the workspace.

    8.1 Encoding part

    The main script in the encoding part is encoding1.m. In this script we call individual functions
    that are related to the different blocks in the encoder. These are described in the following:

    MLT_enc.m function : inputs (, );
    output : coefficients matrix;

    quant_enc.m function : inputs (,);
    output : run length vector of the quantized coefficients matrix;

    buildmat.m function : input ();
    output : header information (symbols and frequencies from run length vector);

    huffman_dict.m function : input ();
    outputs : codes and symbols associated to each code;

    runl2bin.m function : inputs (,,);
    output : binary representation of run length vector coded with Huffman algorithm;

    linecode_enc.m function : input ();
    output : binary coded header information using the protocol described in section 4.2;

    putinteger2.m function : input ();
    output : uint8 vector;

    filewrite.m function : input (,,);

    8.2 Decoding part

    The main script in this part is decoding1.m, which calls the following functions:

    fileread.m function : input (,);
    output : header and run length vector in uint8 data format;

    putbits.m function : input ();
    output : binary representation of the input vector;

    linecode_dec.m function : input ();

    output : header information (symbols and frequencies);

    huffman_dict_rebuild.m function : input ();
    output : codes and symbols for Huffman algorithm;

    bin2runl.m function : input ( ,< codes>, );
    output : run length vector;

    quant_dec.m function : input (,);
    output : transform coefficients;

    MLT_dec.m function : input ();
    output : reconstructed image;