OBTAINING SUPER-RESOLUTION IMAGES BY COMBINING LOW-RESOLUTION IMAGES WITH HIGH-FREQUENCY INFORMATION DERIVED FROM TRAINING IMAGES


    International Journal of Computer Science & Information Technology (IJCSIT) Vol 5, No 2, April 2013

DOI : 10.5121/ijcsit.2013.5202

OBTAINING SUPER-RESOLUTION IMAGES BY COMBINING LOW-RESOLUTION IMAGES WITH HIGH-FREQUENCY INFORMATION DERIVED FROM TRAINING IMAGES

Emil Bilgazyev1, Nikolaos Tsekos2, and Ernst Leiss3

1,2,3Department of Computer Science, University of Houston, TX, USA

[email protected], [email protected], [email protected]

    ABSTRACT

In this paper, we propose a new algorithm to estimate a super-resolution image from a given low-resolution image, by adding high-frequency information that is extracted from natural high-resolution images in the training dataset. The selection of the high-frequency information from the training dataset is accomplished in two steps: a nearest-neighbor search algorithm, which can be implemented on the GPU, is used to select the closest images from the training dataset, and a sparse-representation algorithm is used to estimate a weight parameter to combine the high-frequency information of the selected images. This simple but very powerful super-resolution algorithm produces state-of-the-art results. Qualitatively and quantitatively, we demonstrate that the proposed algorithm outperforms existing state-of-the-art super-resolution algorithms.

    KEYWORDS

    Super-resolution, face recognition, sparse representation.

    1. INTRODUCTION

    Recent advances in electronics, sensors, and optics have led to a widespread availability of

    video-based surveillance and monitoring systems. Some of the imaging devices, such as cameras,

    camcorders, and surveillance cameras, have limited achievable resolution due to factors such as

    quality of lenses, limited number of sensors in the camera, etc. Increasing the quality of lenses or

    the number of sensors in the camera will also increase the cost of the device; in some cases the

    desired resolution may be still not achievable with the current technology. However, many

applications, ranging from security to broadcasting, are driving the need for higher-resolution images or videos for better visualization [1].

    The idea behind super-resolution is to enhance the low-resolution input image, such that the

    spatial-resolution (total number of independent pixels within the image) as well as pixel-resolution

    (total number of pixels) are improved.

In this paper, we propose a new approach to estimate super-resolution by combining a given low-resolution image with high-frequency information obtained from training images (Fig. 1). A nearest-neighbor-search algorithm is used to select the closest images from the training dataset, and a sparse representation algorithm is used to estimate a weight parameter to combine the high-frequency information of the selected images. The main motivation of our approach is that the high-frequency information helps to obtain sharp edges on the reconstructed images (see Fig. 1).

(a) (b) (c)

Figure 1: Depiction of (a) the super-resolution image obtained by combining (b) a given low-resolution image and (c) high-frequency information estimated from a natural high-resolution training dataset.

The rest of the paper is organized as follows: Previous work is presented in Section 2, a description of our proposed method is presented in Section 3, the implementation details are presented in Section 4, and experimental results of the proposed algorithm as well as other algorithms are presented in Section 5. Finally, Section 6 summarizes our findings and concludes the paper.

2. PREVIOUS WORK

In this section, we briefly review existing techniques for super-resolution of low-resolution images for general and domain-specific purposes. In recent years, several methods have been proposed that address the issue of image resolution. Existing super-resolution (SR) algorithms can be classified into two classes: multi-frame-based and example-based algorithms [8]. Multi-frame-based methods compute a high-resolution (HR) image from a set of low-resolution (LR) images from any domain [6]. The key assumption of multi-frame-based super-resolution methods is that the set of input LR images overlap and each LR image contains additional information than the other LR images. Multi-frame-based SR methods then combine these sets of LR images into one image so that all information is contained in a single output SR image. Additionally, these methods perform super-resolution with the general goal of improving the quality of the image so that the resulting higher-resolution image is also visually pleasing. The example-based methods compute an HR counterpart of a single LR image from a known domain [2, 13, 18, 10, 14]. These methods learn observed information targeted to a specific domain and thus, can exploit prior knowledge to obtain superior results specific to that domain. Our approach belongs to this category, where we use a training database to improve the reconstruction output.

Moreover, the domain-specific SR methods targeting the same domain differ considerably from each other in the way they model and apply a priori knowledge about natural images. Yang et al. [22] introduced a method to reconstruct SR images using a sparse representation of the input LR images. However, the performance of these example-based SR methods degrades rapidly if the magnification factor is more than 2. In addition, the performance of these SR methods is highly dependent on the size of the training database.


Freeman et al. [8] proposed an example-based learning strategy that applies to generic images, where the LR to HR relationship is learned using a Markov Random Field (MRF). Sun et al. [17] extended this approach by using Primal Sketch priors to reconstruct edge regions and corners by deblurring them. The main drawback of these methods is that they require a large database of LR and HR image pairs to train the MRF. Chang et al. [3] used the Locally Linear Embedding (LLE) manifold learning approach to map the local geometry of HR images to LR images, with the assumption that the manifolds between LR and HR images are similar. In addition, they reconstructed an SR image using K neighbors. However, the manifold of synthetic LR images generated from HR images is not similar to the manifold of real-scenario LR images, which are captured under different environments and camera settings. Also, using a fixed number of neighbors to reconstruct an SR image usually results in blurring effects such as artifacts in the edges, due to over- or under-fitting.

Figure 2: Depiction of the pipeline for the proposed super-resolution algorithm.

Another approach is derived from a multi-frame-based approach to reconstruct an SR image from a single LR image [7, 12, 19]. These approaches learn the co-occurrence of a patch within the image, where the correspondence between LR and HR is predicted. These approaches cannot be used to reconstruct an SR image from a single LR facial image, due to the limited number of similar patches within a facial image.

SR reconstruction based on wavelet analysis has been shown to be well suited for reconstruction, denoising and deblurring, and has been used in a variety of application domains including biomedical [11], biometrics [5], and astronomy [20]. In addition, it provides an accurate and sparse representation of images that consist of smooth regions with isolated abrupt changes [16]. In our method, we propose to take advantage of the wavelet decomposition-based approach in conjunction with compressed sensing techniques to improve the quality of the super-resolution output.


    3. METHODOLOGY

Table 1. Notations used in this paper

Symbol            Description
$X$               Collection of training images
$X_i$             $i$-th training image ($X_i \in R^{m \times n}$)
$x_{i,j}$         $j$-th patch of the $i$-th image ($x_{i,j} \in R^{k \times l}$)
$\epsilon$        Threshold value
$\alpha_i$        Sparse representation of the $i$-th patch
$\|\cdot\|_p$     $l_p$-norm
$D$               Dictionary
$W$, $W^{-1}$     Forward and inverse wavelet transforms
$\psi$, $\phi$    High- and low-frequencies of an image
$[\,\cdot\,]$     Concatenation of vectors or matrices

Let $X_i \in R^{m \times n}$ be the $i$-th image of the training dataset $X = \{X_i : i = 0 \ldots N\}$, and $x_{i,j} \in R^{k \times l}$ be the $j$-th patch of an image $X_i = \{x_{i,j} : j = 0 \ldots M\}$. The wavelet transform of an image patch $x$ will return low- and high-frequency information:

    $W(x) = [\phi(x), \psi(x)]$ ,    (1)

where $W$ is the forward wavelet transform, $\phi(x)$ is the low-frequency information, and $\psi(x)$ is the high-frequency information of an image patch $x$. Taking the inverse wavelet transform of the high- and low-frequency information of the original image (without any processing on them) will result in the original image:

    $x = W^{-1}([\phi(x), \psi(x)])$ ,    (2)

where $W^{-1}$ is the inverse wavelet transform. If we use the Haar wavelet transform with its coefficients being 0.5 instead of $1/\sqrt{2}$ (non-quadrature-mirror filter), then the low-frequency information $\phi(x)$ of an image $x$ will actually be a low-resolution version of the image $x$, where four neighboring pixels are averaged; in other words, it is similar to down-sampling the image $x$ by a factor of 2 with nearest-neighbor interpolation, and the high-frequency information $\psi(x)$ of the image $x$ will be similar to the horizontal, vertical and diagonal gradients of $x$.

Assume that, for a given low-resolution image patch $y_i$, which is the $i$-th patch of an image $y$, we can find a similar patch $x_j \in \{x_j : j = 0 \ldots NM\}$ among the natural image patches; then by combining $y_i$ with the high-frequency information $\psi(x_j)$ of a high-resolution patch $x_j$, and taking the inverse wavelet transform, we will get the super-resolution patch $y_i^*$ (see Fig. 1):

    $y_i^* = W^{-1}([y_i, \psi(x_j)])$ ,  $\|y_i - \phi(x_j)\|_2^2 \leq \epsilon_0$ ,    (3)

where $\epsilon_0$ is a small nonnegative value.
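To make the transform concrete, here is a minimal NumPy sketch (ours, not the authors' code) of a one-level Haar-like decomposition with 0.5 coefficients: the low band averages each 2x2 block, exactly like nearest-neighbor down-sampling by 2, and the three detail bands behave like horizontal, vertical and diagonal gradients; the inverse recovers the input exactly, as in Eq. (2).

```python
import numpy as np

def haar_decompose(x):
    """One-level 2-D Haar-like transform with 0.5 coefficients.

    Returns (phi, psi): phi averages each 2x2 block (a half-size,
    low-resolution version of x); psi holds the horizontal, vertical
    and diagonal difference bands.
    """
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    phi = 0.25 * (a + b + c + d)    # low-frequency: 4-pixel average
    psi_h = 0.25 * (a - b + c - d)  # horizontal detail
    psi_v = 0.25 * (a + b - c - d)  # vertical detail
    psi_d = 0.25 * (a - b - c + d)  # diagonal detail
    return phi, (psi_h, psi_v, psi_d)

def haar_reconstruct(phi, psi):
    """Inverse transform. The synthesis coefficients are 1 instead of
    0.5, which plays the role of the x4 compensation the paper applies
    for these non-quadrature-mirror filters."""
    psi_h, psi_v, psi_d = psi
    x = np.empty((2 * phi.shape[0], 2 * phi.shape[1]))
    x[0::2, 0::2] = phi + psi_h + psi_v + psi_d
    x[0::2, 1::2] = phi - psi_h + psi_v - psi_d
    x[1::2, 0::2] = phi + psi_h - psi_v - psi_d
    x[1::2, 1::2] = phi - psi_h - psi_v + psi_d
    return x
```

Applying `haar_decompose` to an image and then `haar_reconstruct` to the two outputs returns the original image, which is the identity in Eq. (2).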

It is not guaranteed that we will always find an $x_j$ such that $\|y_i - \phi(x_j)\|_2^2 \leq \epsilon_0$; thus, we introduce an approach to estimate a few closest low-resolution patches $\phi(x_j)$ from the training dataset and then estimate a weight for each patch $\phi(x_j)$, which will be used to combine the high-frequency information $\psi(x_j)$ of the training patches.

To find the closest matches to the low-resolution input patch $y_i$, we use a nearest-neighbor search algorithm:

    $c = \{c_i : \|y_i - \phi(x_{c_i})\|_2^2 \leq \epsilon_1 , \ x_{c_i} \in X\}$ ,    (4)

where $c$ is a vector containing the indexes ($c_i$) of the training patches that are the closest matches to the input patch $y_i$, and $\epsilon_1$ is the radius threshold of the nearest-neighbor search. After selecting the closest matches to $y_i$, we build two dictionaries from the selected patches $x_j$; the first dictionary is the joint of the low-frequency information of the training patches $\phi(x_j)$, which will be used to estimate a weight parameter, and the second dictionary is the joint of the high-frequency information of the training patches $\psi(x_j)$:

    $D_i^{\phi} = \{\phi(x_j) : j \in c\}$ ,  $D_i^{\psi} = \{\psi(x_j) : j \in c\}$ .    (5)

We use a sparse representation algorithm [21] to estimate the weight parameter. The sparse representation $\alpha_i$ of an input image patch $y_i$ with respect to the dictionary $D_i^{\phi}$ is used as a weight for the fusion of the high-frequency information of the training patches ($D_i^{\psi}$):

    $\alpha_i = \arg\min_{\alpha_i} \|y_i - D_i^{\phi}\alpha_i\|_2 + \lambda\|\alpha_i\|_1$ .    (6)

The sparse representation algorithm (Eq. 6) tries to estimate $y_i$ by fusing a few atoms (columns) of the dictionary $D_i^{\phi}$, by assigning non-zero weights to these atoms. The result is the sparse representation $\alpha_i$, which has only a few non-zero elements. In other words, the input image patch $y_i$ can be represented by combining a few atoms of $D_i^{\phi}$ ($y_i \approx D_i^{\phi}\alpha_i$) with a weight parameter $\alpha_i$; similarly, the high-frequency information of the training patches $D_i^{\psi}$ can also be combined with the same weight parameter $\alpha_i$, to estimate the unknown high-frequency information of an input image patch $y_i$:

    $y_i^* = W^{-1}([y_i, D_i^{\psi}\alpha_i])$ ,    (7)

where $y_i^*$ is the output (super-resolution) image patch, and $W^{-1}$ is the inverse wavelet transform.
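The selection and fusion steps of Eqs. (4)-(7) can be sketched as follows. This is an illustrative NumPy version, not the authors' implementation: a tiny orthogonal matching pursuit stands in for the paper's l1 sparse coder [21], and `Phi`/`Psi` are assumed to hold the vectorized low- and high-frequency training patches as columns (the names are ours).

```python
import numpy as np

def omp(D, y, n_nonzero=3):
    """Tiny orthogonal matching pursuit, a stand-in for the l1 sparse
    coder of Eq. (6); returns a sparse weight vector alpha."""
    residual, idx = y.copy(), []
    alpha = np.zeros(D.shape[1])
    for _ in range(min(n_nonzero, D.shape[1])):
        j = int(np.argmax(np.abs(D.T @ residual)))  # best-matching atom
        if j not in idx:
            idx.append(j)
        coef, *_ = np.linalg.lstsq(D[:, idx], y, rcond=None)
        residual = y - D[:, idx] @ coef
    alpha[idx] = coef
    return alpha

def fuse_high_freq(y_i, Phi, Psi, eps1=0.5):
    """Select neighbors within radius eps1 (Eq. 4), build the two
    dictionaries (Eq. 5), estimate sparse weights (Eq. 6), and return
    the fused high-frequency estimate D_psi @ alpha used in Eq. (7)."""
    d2 = np.sum((Phi - y_i[:, None]) ** 2, axis=0)  # squared l2 distances
    c = np.where(d2 <= eps1)[0]                     # neighbor indexes
    if c.size == 0:                                 # no match: add no detail
        return np.zeros_like(y_i)
    D_phi, D_psi = Phi[:, c], Psi[:, c]
    alpha = omp(D_phi, y_i)
    return D_psi @ alpha
```

In the paper the patches are normalized before the distance computation; that step is omitted here for brevity.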

Figure 2 depicts the pipeline for the proposed algorithm. For the training step, we extract patches from the high-resolution training images, then we compute the low-frequency information (which will become the low-resolution training image patches) and the high-frequency information for each patch in the training dataset. For the reconstruction step, given an input low-resolution image $y$, we extract a patch $y_i$ and find the nearest neighbors $c$ within the given radius $\epsilon_1$ (this can be sped up using a GPU); then from the selected neighbors $c$, we construct the low-frequency dictionary $D_i^{\phi}$ and the high-frequency dictionary $D_i^{\psi}$, where the low-frequency dictionary is used to estimate the sparse representation $\alpha_i$ of the input low-resolution patch $y_i$ with respect to the selected neighbors, and the atoms (columns) of the high-frequency dictionary $D_i^{\psi}$ are fused with the sparse representation $\alpha_i$ serving as the weight parameter. Finally, by taking the inverse wavelet transform ($W^{-1}$) of the given low-resolution image patch $y_i$ with the fused high-frequency information, we get the super-resolution patch $y_i^*$. Iteratively repeating the reconstruction step (red-dotted block in Fig. 2) for each patch in the low-resolution image $y$, we obtain the super-resolution image $y^*$.

    4. IMPLEMENTATION DETAILS

In this section we explain the implementation details. As pointed out in Sec. 3, we extract patches for each training image $X_i = \{x_{i,j} : j = 0 \ldots M\}$ with $x_{i,j} \in R^{k \times l}$. The number $M$ depends on the window function, which determines how we would like to select the patches. There are two ways to select the patches from the image: one is by selecting distinct patches from an image, where two consecutive patches don't overlap, and the other is by selecting overlapping patches (sliding window), where two consecutive patches overlap. Since the $l_2$-norm in the nearest-neighbor search is sensitive to shifts, we slide the window by one pixel in the horizontal or vertical direction, so that two consecutive patches overlap each other by $(k-1) \times l$ or $k \times (l-1)$ pixels, with $x_{i,j} \in R^{k \times l}$. To store these patches we would require an enormous amount of storage space, $N(m-k)(n-l)kl$, where $N$ is the number of training images and $X_i \in R^{m \times n}$. For example, if we have 1,000 natural images in the training dataset, each with a resolution of $1000 \times 1000$ pixels, then to store the patches of $40 \times 40$ pixels we would require 1.34 TB of storage space, which would be inefficient and computationally expensive. To reduce the number of patches, we removed patches which don't contain any gradients, or contain very few gradients, $\|\nabla x_{i,j}\|_2^2 \leq \epsilon_2$, where $\nabla$ is the sum of gradients along the vertical and horizontal directions ($\nabla x_{i,j} = \frac{\partial x_{i,j}}{\partial x} + \frac{\partial x_{i,j}}{\partial y}$), and $\epsilon_2$ is the threshold value to filter out the patches with little gradient variation. Similarly, we calculate the gradients on the input low-resolution patches ($y_i$), and if they are below the threshold $\epsilon_2$, we upsample them using bicubic interpolation, and no super-resolution reconstruction is performed on that patch. To improve the computation speed, the nearest-neighbor search can be calculated on the GPU, and since all given low-resolution patches are processed independently of each other, multi-threaded processing can be used for each super-resolution patch reconstruction.
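The 1.34 TB figure in the text can be checked directly; the calculation below assumes one byte per pixel (the paper does not state the pixel depth) and interprets TB as binary terabytes (TiB).

```python
# Back-of-the-envelope check of the storage estimate in Section 4:
# N = 1000 training images of 1000 x 1000 pixels, 40 x 40 patches
# extracted with a 1-pixel sliding window, 1 byte per pixel (assumed).
N, m, n, k, l = 1000, 1000, 1000, 40, 40
num_patches = N * (m - k) * (n - l)  # sliding-window positions per dataset
bytes_total = num_patches * k * l    # k*l bytes per stored patch
print(round(bytes_total / 2**40, 2))  # -> 1.34 (TiB), matching the text
```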

In the wavelet transform, for the low-pass filter and the high-pass filter we used $[0.5, 0.5]$ and $[0.5, -0.5]$, from which the 2D filters for the wavelet transform are created. These filters are not quadrature-mirror filters (they are nonorthogonal); thus, during the inverse wavelet transform we need to multiply the output by 4. The reason for choosing these values for the filters is that the low-frequency information (analysis part) of the forward wavelet transform will be the same as down-sampling the signal by a factor of 2 with nearest-neighbor interpolation, which is what is used in the nearest-neighbor search. During the experiments, all color images are converted to YCbCr, where only the luminance component ($Y$) is used. For display, the blue- and red-difference chroma components ($Cb$ and $Cr$) of an input low-resolution image are up-sampled and combined with the super-resolution image to obtain the color image $y^*$ (Fig. 2).
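As a small illustration (ours, not the authors' code), the separable 2D filters can be built from the 1D pair by outer products; note that the minus sign in the high-pass filter is our reconstruction, since the extracted text appears to have lost it.

```python
import numpy as np

lo = np.array([0.5, 0.5])   # low-pass filter from the text
hi = np.array([0.5, -0.5])  # high-pass filter (sign assumed)

# Separable 2-D filters: outer products of the 1-D pair.
LL = np.outer(lo, lo)  # low-frequency band: averages a 2x2 block
LH = np.outer(lo, hi)  # horizontal detail
HL = np.outer(hi, lo)  # vertical detail
HH = np.outer(hi, hi)  # diagonal detail
print(LL)  # [[0.25 0.25] [0.25 0.25]] -> the 4-pixel average
```

Because each analysis coefficient is 0.5 rather than 1/sqrt(2), a round trip through analysis and synthesis scales the signal by 1/4, which is why the inverse transform multiplies by 4.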

Note that we can reduce the storage space for the patches to zero, by extracting the patches of the training images during reconstruction. This can be accomplished by changing the neighbor-search algorithm, and can be implemented on the GPU. During the neighbor search, each GPU thread is assigned to extract the low-frequency $\phi(x_{l,j})$ and high-frequency information $\psi(x_{l,j})$ at an assigned position $j$ of a training image $X_l$; we compute the distance to the input low-resolution image patch $y_i$, and if the distance is less than the threshold $\epsilon_1$, then the GPU thread returns the high-frequency information $\psi(x_{l,j})$, which is used to construct $D_i^{\psi}$:

    $D_i^{\psi} = \{\psi(x_{c_i}) : \|y_i - \phi(x_{c_i})\|_2^2 \leq \epsilon_1 , \ x_{c_i} \in X\}$ .    (8)

As the threshold (radius) value for the nearest-neighbor search algorithm we used 0.5 for natural images, and 0.3 for facial images. Both the low-frequency information of the training image patches and the input image patches are normalized before calculating the Euclidean distance. We selected these values experimentally, as at these values we obtain the highest SNR and the lowest MSE. The Euclidean distance (in the nearest-neighbor search) is known to be sensitive to noise, but in our approach the main goal is only to reduce the number of training patches that are close to the input patch. Thus, we take a higher threshold value for the nearest-neighbor search, select the closest matches, and then perform the sparse representation on them. Note that the sparse representation estimation (Eq. 6) tends to estimate the input patch from the training patches, where noise is taken care of [14]. Reducing the storage space will slightly increase the super-resolution reconstruction time, since the wavelet transform will be computed during the reconstruction.


5. EXPERIMENTAL RESULTS

We performed experiments on a variety of images to test the performance of our approach (HSR) as well as other super-resolution algorithms: BCI [1], SSR [22] and MSR [8]. We conducted two types of experiments.

For the first one, we performed the experiment on the Berkeley Segmentation Dataset 500 [15]. It contains natural images, which are divided into two groups; the first group of images (Fig. 3(a)) is used to train the super-resolution algorithms (except BCI), and the second group of images (Fig. 3(b)) is used to test the performance of the super-resolution algorithms. To measure the performance of the algorithms, we use the mean-square-error (MSE) and the signal-to-noise ratio (SNR) as distance metrics. These metrics measure the difference between the ground truth and the reconstructed images.
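For reference, a minimal implementation of the two metrics is sketched below; note that the paper does not spell out its exact SNR formula, so the dB definition used here (signal power over error power) is an assumption.

```python
import numpy as np

def mse(ref, est):
    """Mean squared error between a ground-truth and a reconstructed image."""
    ref, est = np.asarray(ref, float), np.asarray(est, float)
    return float(np.mean((ref - est) ** 2))

def snr_db(ref, est):
    """Signal-to-noise ratio in dB: mean signal power over error power.
    (One common definition; the paper does not give its exact formula.)"""
    power = float(np.mean(np.asarray(ref, float) ** 2))
    return 10.0 * np.log10(power / mse(ref, est))
```

Both metrics are computed against the ground-truth high-resolution image; a higher SNR and a lower MSE indicate a better reconstruction.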

The second type of experiment is performed on facial images (Fig. 4), where a face recognition system is used as the evaluation metric to demonstrate the performance of the super-resolution algorithms.

    (a) (b)

    Figure 3: Depiction of Berkeley Segmentation Dataset 500 images used for (a) training and (b) testing the

    super-resolution algorithms.

    (a) (b)

    Figure 4: Depiction of facial images used for (a) training and (b) testing the super-resolution algorithms.


5.1 Results on Natural Images

In Figure 5, we show the output of the proposed super-resolution algorithm, BCI, SSR, and MSR. The red rectangle is zoomed-in and displayed in Figure 6. In this figure we focus on the effect of super-resolution algorithms on low-level patterns (the fur of the bear). Most super-resolution algorithms tend to improve the sharpness of the edges along the border of the objects, which looks good to human eyes, while the low-level patterns are ignored. One can see that the output of BCI is smooth (Fig. 5(b)), and from the zoomed-in region (Fig. 6(b)) it can be noticed that the edges along the border of the object are smoothed; similarly, the pattern inside the regions is also smooth. This is because BCI interpolates the neighboring pixel values in the lower-resolution image to introduce a new pixel value in the higher-resolution image. This is the same as taking the inverse wavelet transform of a given low-resolution image with its high-frequency information set to zero; thus the reconstructed image will not contain any sharp edges. The result of MSR has sharp edges; however, it contains block artifacts (Fig. 5(c)). One can see that the edges around the border of an object are sharp, but the patterns inside the region are smoothed, and block artifacts are introduced (Fig. 6(c)). On the other hand, the result of SSR doesn't contain sharp edges along the border of the object, but it contains sharper patterns compared to BCI and MSR (Fig. 5(d)). The result of the proposed super-resolution algorithm has sharp edges, sharp patterns, as well as fewer artifacts compared to the other methods (Fig. 5(e) and Fig. 6(e)), and visually it looks more similar to the ground truth image (Fig. 5(f) and Fig. 6(f)).

Figure 7 shows the performance of the super-resolution algorithms on a different image with fewer patterns. One can see that the output of BCI is still smooth along the borders, and inside the region it is clearer. The output of MSR looks better for the images with fewer patterns, where it tends to reconstruct the edges along the borders.

    (a) (b) (c)

(d) (e) (f)

Figure 5: Depiction of low-resolution, super-resolution and original high-resolution images. (a) Low-resolution image, (b) output of BCI, (c) output of SSR, (d) output of MSR, (e) output of the proposed algorithm, and (f) original high-resolution image. The solid red rectangle represents the region that is magnified and displayed in Figure 6 for better visualization. One can see that the output of the proposed algorithm has sharper patterns compared to the other SR algorithms.


    (a) (b) (c)

(d) (e) (f)

Figure 6: Depiction of a region (red rectangle in Figure 5) for (a) the low-resolution image, and the output of (b) BCI, (c) SSR, (d) MSR, (e) the proposed algorithm, and (f) the original high-resolution image. Notice that the proposed algorithm produces sharper patterns compared to the other SR algorithms.

    (a) (b) (c)

(d) (e) (f)

Figure 7: Depiction of low-resolution, super-resolution and original high-resolution images. (a) Low-resolution image, (b) output of BCI, (c) output of SSR, (d) output of MSR, (e) output of the proposed algorithm, and (f) original high-resolution image. The solid yellow and red rectangles represent the regions that were magnified and displayed on the right side of each image for better visualization. One can see that the output of the proposed algorithm has better visual quality compared to the other SR algorithms.


In the output of SSR, one can see that the edges on the borders are smooth, and inside the regions it has ringing artifacts. The SSR algorithm builds dictionaries from high-resolution and low-resolution image patches by reducing the number of atoms (columns) of the dictionaries under the constraint that these dictionaries can represent the image patches in the training dataset with minimal difference. This is similar to compression or dimensionality reduction, where we try to preserve the structure of the signal, not the details of the signal, and sometimes we get artifacts during the reconstruction1.

We also computed the average SNR and MSE to quantitatively measure the performance of the super-resolution algorithms. Table 5.1 depicts the average SNR and MSE values for BCI, MSR, SSR, and HSR. Notice that the proposed algorithm has the highest signal-to-noise ratio and the lowest mean-square-error.

Table 5.1: Experimental Results

Dist. Metric \ SR Algorithm   BCI     SSR     MSR     HSR
SNR (dB)                      23.08   24.76   18.46   25.34
MSE                           5.45    5.81    12.01   3.95

1 The lower frequencies of the signal affect the difference between the original and reconstructed signals more than the higher frequencies do. For example, if we remove the DC component (0 Hz) from one of the signals, original or reconstructed, the difference between them will be very large. Thus keeping the lower frequencies of the signal helps to preserve the structure and to have a minimal difference between the original and reconstructed signals.

5.2. Results on Facial Images

We conducted experiments on surveillance camera facial images (SCFace) [9]. This database contains 4,160 static images from 130 subjects. The images were acquired in an uncontrolled indoor environment using five surveillance cameras of various qualities and ranges. For each of these cameras, one image from each subject was acquired at three distances: 4.2 m, 2.6 m, and 1.0 m. Another set of images was acquired by a mug-shot camera. Nine images per subject provide nine discrete poses ranging from left to right profile in equal steps of 22.5 degrees, including a frontal mug-shot image at 0 degrees. The database contains images in the visible and infrared spectrum. Images from cameras of different quality mimic real-world conditions. The high-resolution images are used as a gallery (Fig. 4(a)), while the images captured by a visible-light camera from a 4.2 m distance are used as a probe (Fig. 4(b)). Since the SCFace dataset consists of two types of images, high-resolution images and surveillance images, we used the high-resolution images to train the SR methods and the surveillance images as a probe.

We used the Sparse Representation-based Face Recognition proposed by Wright et al. [21] to test the performance of the super-resolution algorithms. It has been shown that the performance of face recognition systems relies on low-level information (high-frequency information) of the facial images [4]. The high-level information, which is the structure of the face, affects the performance of face recognition systems less than the low-level information does, unless we compare human faces with other objects such as a monkey, a lion, a car, etc., where the structures are very different. Most human faces have similar structures: two eyes, one nose, two eyebrows, etc., and in low-resolution facial images the edges (high-frequency information) around the eyes, eyebrows, nose, mouth, etc., are lost, which decreases the performance of face recognition systems [21]. Even for humans it is very difficult to recognize a person from a low-resolution image (see Fig. 4). Figure 8 depicts the given low-resolution images and the output super-resolution images. The rank-1 face recognition accuracies for LR, BCI, SSR, MSR, and our proposed algorithm are 2%, 18%, 13%, 16%, and 21%, respectively. Overall, the face recognition accuracy is low, but compared to the face recognition performance on the low-resolution images, we can conclude that the super-resolution algorithms improve the recognition accuracy.

6. CONCLUSION

We have proposed a novel approach to reconstruct super-resolution images for better visual quality as well as for better face recognition, which can also be applied to other fields. We presented a sparse representation-based SR method to recover the high-frequency components of an SR image.

(a) (b) (c) (d)

Figure 8: Depiction of the LR image and the output of the SR algorithms. (a) Low-resolution image, and the output of (b) BCI, (c) SSR, and (d) the proposed method. For this experiment we used a patch size of 10x10 pixels; when we increase the patch size we introduce ringing artifacts, which can be seen in the reconstructed image (d). Quantitatively, in terms of face recognition our proposed super-resolution algorithm outperforms the other super-resolution algorithms.

We demonstrated the superiority of our method over existing state-of-the-art super-resolution methods for the task of face recognition in low-resolution images obtained from real-world surveillance data, as well as better performance in terms of MSE and SNR. We conclude that by having more than one training image per subject we can significantly improve the visual quality of the proposed super-resolution output, as well as the recognition accuracy.

    REFERENCES

    [1] M. Ao, D. Yi, Z. Lei, and S. Z. Li. Handbook of Remote Biometrics, chapter Face Recognition at a Distance: System Issues, pages 155–167. Springer London, 2009.

    [2] S. Baker and T. Kanade. Hallucinating faces. In Proc. IEEE International Conference on Automatic Face and Gesture Recognition, pages 83–88, Grenoble, France, March 28-30, 2002.

    [3] H. Chang, D. Y. Yeung, and Y. Xiong. Super-resolution through neighbor embedding. In Proc. IEEE International Conference on Computer Vision and Pattern Recognition, pages 275–282, Washington DC, 27 June-2 July 2004.

    [4] G. Chen and W. Xie. Pattern recognition using dual-tree complex wavelet features and SVM. In Proc. Canadian Conference on Electrical and Computer Engineering, pages 2053–2056, 2008.


    [5] A. Elayan, H. Ozkaramanli, and H. Demirel. Complex wavelet transform-based face recognition. EURASIP Journal on Advances in Signal Processing, 10(1):1–13, Jan. 2008.

    [6] S. Farsiu, M. D. Robinson, M. Elad, and P. Milanfar. Fast and robust multiframe super resolution. IEEE Transactions on Image Processing, 13(10):1327–1344, 2004.

    [7] W. T. Freeman, T. R. Jones, and E. C. Pasztor. Learning low-level vision. International Journal of Computer Vision, 40(1):25–47, 2000.

    [8] W. T. Freeman and C. Liu. Markov Random Fields for Super-Resolution and Texture Synthesis, chapter 10, pages 1–30. MIT Press, 2011.

    [9] M. Grgic, K. Delac, and S. Grgic. SCface - surveillance cameras face database. Multimedia Tools and Applications Journal, 51(3):863–879, 2011.

    [10] P. H. Hennings-Yeomans. Simultaneous Super-Resolution and Recognition. PhD thesis, Carnegie Mellon University, Pittsburgh, PA, USA, 2008.

    [11] J. T. Hsu, C. C. Yen, C. C. Li, M. Sun, B. Tian, and M. Kaygusuz. Application of wavelet-based POCS super-resolution for cardiovascular MRI image enhancement. In Proc. International Conference on Image and Graphics, pages 572–575, Hong Kong, China, Dec. 18-20, 2004.

    [12] K. I. Kim and Y. Kwon. Single-image super-resolution using sparse regression and natural image prior. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(6):1127–1133, 2010.

    [13] C. Liu, H. Y. Shum, and C. S. Zhang. A two-step approach to hallucinating faces: global parametric model and local nonparametric model. In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 192–198, San Diego, CA, USA, Jun. 20-26, 2005.

    [14] J. Mairal, M. Elad, and G. Sapiro. Sparse representation for color image restoration. IEEE Transactions on Image Processing, 17(1):53–69, Jan. 2008.

    [15] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proc. 8th International Conference on Computer Vision, volume 2, pages 416–423, July 2001.

    [16] G. Pajares and J. M. Cruz. A wavelet-based image fusion tutorial. Pattern Recognition, 37(9):1855–1872, Sep. 2004.

    [17] J. Sun, N. N. Zheng, H. Tao, and H. Shum. Image hallucination with primal sketch priors. In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages II-729–736, vol. 2, Madison, WI, Jun. 18-20, 2003.

    [18] J. Wang, S. Zhu, and Y. Gong. Resolution enhancement based on learning the sparse association of image patches. Pattern Recognition Letters, 31(1):1–10, Jan. 2010.

    [19] Q. Wang, X. Tang, and H. Shum. Patch based blind image super-resolution. In Proc. IEEE International Conference on Computer Vision, Beijing, China, Oct. 17-20, 2005.

    [20] R. Willet, I. Jermyn, R. Nowak, and J. Zerubia. Wavelet based super resolution in astronomy. In Proc. Astronomical Data Analysis Software and Systems, volume 314, pages 107–116, Strasbourg, France, 2003.

    [21] J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma. Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(2):210–227, February 2009.

    [22] J. Yang, J. Wright, T. Huang, and Y. Ma. Image super-resolution as sparse representation of raw image patches. In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Anchorage, AK, Jun. 23-28, 2008.