

IEEE SIGNAL PROCESSING LETTERS, VOL. 21, NO. 1, JANUARY 2014

Low-Rank Neighbor Embedding for Single Image Super-Resolution

    Xiaoxuan Chen and Chun Qi, Member, IEEE

Abstract—This letter proposes a novel single image super-resolution (SR) method based on low-rank matrix recovery (LRMR) and neighbor embedding (NE). LRMR is used to explore the underlying structures of subspaces spanned by similar patches. Specifically, the training patches are first divided into groups. Then the LRMR technique is utilized to learn the latent structure of each group. The NE algorithm is performed on the learnt low-rank components of HR and LR patches to produce the SR results. Experimental results suggest that our approach can reconstruct high-quality images both quantitatively and perceptually.

Index Terms—Low-rank matrix recovery, neighbor embedding, super-resolution.

    I. INTRODUCTION

HIGH-RESOLUTION (HR) images are needed in many practical applications [1]. Super-resolution (SR) image reconstruction is a software technique to generate an HR image from multiple input low-resolution (LR) images or from a single LR image. In recent years, learning-based SR methods have received a lot of attention and many methods have been developed [2], [3], [4], [5]. They place their focus on the training examples, with the help of which an HR image is generated from a single LR input. Freeman et al. [2] utilized a Markov network to model the relationships between LR and HR patches to perform SR. Inspired by the locally linear embedding (LLE) approach in manifold learning, Chang et al. [3] proposed the neighbor embedding (NE) algorithm. It assumes that the two manifolds constructed by the LR and HR patches respectively have similar local structures, and that an HR patch can be reconstructed by a linear combination of its neighbors. Li et al. [4] proposed to project pairs of LR and HR patches from the original manifolds into a common manifold with a manifold regularization procedure for face image SR. For generic image SR, Gao et al. [5] proposed a joint learning method via a coupled constraint.

In learning-based methods, how to utilize the training set is

very crucial. Patches vary in appearance. Thus it is necessary to divide the whole training set into groups by certain strategies [4], [5] such that the patches in each group are highly related. Then the subspace spanned by them is low-dimensional. However, how to learn the low-dimensional structure of such a subspace is also a challenge. In this letter, we employ a robust PCA approach, the low-rank matrix recovery (LRMR) [6], to learn the underlying structures of subspaces. LRMR has been successfully applied to various applications, such as face recognition [7] and background subtraction [8]. Given a data matrix $X$ whose columns come from the same pattern, these columns are linearly correlated in many situations, and the matrix should be approximately low-rank. However, the data may be influenced by noise in practical applications. LRMR can decompose such a data matrix into the sum $X = Z + E$ of a low-rank matrix $Z$ and a sparse error matrix $E$. $Z$ is the low-rank approximation of $X$ and has the capability of describing the underlying structure of the subspace spanned by the vectors in $X$ [6]. The columns in $Z$ are more correlated with each other than they are in $X$.

Manuscript received August 17, 2013; accepted October 04, 2013. Date of publication October 18, 2013; date of current version November 22, 2013. This work was supported by the National Natural Science Foundation of China under Grant 60972124, by the National High-tech Research and Development Program of China (863 Program) under Grant 2009AA01Z321, and by the Specialized Research Fund for the Doctoral Program of Higher Education under Grant 20110201110012. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Gustavo Kunde Rohde. The authors are with the Department of Information and Communication Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China (e-mail: [email protected], [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/LSP.2013.2286417

Fig. 1. (a) Distributions of standard correlation coefficients between reconstruction weights of pairs of LR and HR patches for the NE algorithm and LRMR, respectively. (b) Average PSNR for the ten test images with different values of patch size and overlap.

According to the NE assumption, the reconstruction weights

of one LR patch should be extremely similar to those of its HR counterpart. Unfortunately, this is not always the case, due to the one-to-many mappings from LR to HR patches [4]. In this letter, we overcome this problem by using LRMR, since the linear correlation of patches is enhanced through LRMR and thus the local structure of the manifold constructed by the LR or HR patches is more compact. The NE assumption that the manifolds of LR and HR patches have similar local structures is better satisfied after the LRMR procedure. In Fig. 1(a), we draw the distributions of the standard correlation coefficients between the reconstruction weights of pairs of LR and HR patches [4] for the original NE algorithm and the LRMR method, respectively. The reconstruction weights of LR and HR patches for LRMR are more consistent with the NE assumption than those of the original NE algorithm, which means the LRMR procedure can improve the performance of the NE-based SR method. Therefore, we propose to apply the LRMR technique to perform NE-based SR in this letter. The proposed method is easy to perform and achieves excellent results.

The remainder of this letter is organized as follows. In Section II, we describe the proposed method in detail. The experimental results are given in Section III. Finally, we conclude this letter in Section IV.

1070-9908 © 2013 IEEE

    II. THE LOW-RANK NEIGHBOR EMBEDDING METHOD

In this section, we first describe how to divide the training set into groups. Then we stack the vectors in each group as columns to construct a matrix and employ LRMR to learn the underlying low-rank component of this matrix. Finally, we perform NE reconstruction on the computed vectors to obtain the initial SR estimate. We further improve the quality of the SR result by enforcing the global reconstruction constraint.

    A. Grouping of the Training Set

In this subsection, we select suitable examples from the training set for each input patch. Patches in the whole training set vary in appearance, so the structure of the manifold constructed by all these patches is very complicated. However, we are only concerned with the subspace spanned by the patches that are related to the input patch. Therefore, a good solution is to divide the huge and complicated set into groups such that the patches in each group are correlated with each other.

To be specific, the training set is constructed by pairs of LR and HR patch features, i.e., the LR image patch gradient feature set $X_l = \{x_i\}_{i=1}^{N}$ and the HR image patch intensity feature set $X_h = \{y_i\}_{i=1}^{N}$, where $x_i$ is a feature vector for the $i$-th LR image patch obtained by concatenating the first- and second-order gradients in the horizontal and vertical directions respectively, $y_i$ is a feature vector for the $i$-th HR image patch obtained by vectorizing the pixel intensities with the patch mean value subtracted, and $N$ is the number of patches. Because the feature representations for LR patches do not contain the absolute luminance information, we subtract the mean value from each HR intensity vector. All HR and LR feature vectors are normalized to unit $\ell_2$-norm respectively.

For each vector $x_i$ in $X_l$, we choose its $K$ nearest neighbors ($K$-NNs) from $X_l$ and put them into a group together with $x_i$,

$$\mathcal{G}_i = \{x_i\} \cup \{x_j \mid j \in \mathcal{N}_i\}, \qquad (1)$$

where $\mathcal{G}_i$ denotes the group related with $x_i$ and $\mathcal{N}_i$ is the index set of the $K$-NNs of $x_i$. The vectors specified by $\mathcal{G}_i$ can be seen to lie in the same local region of $X_l$. In order to save storage space, we only need to save the indices of the vectors specified by $\mathcal{G}_i$ instead of saving the vectors themselves [5]. The formulation is rewritten as

$$\mathcal{G}_i = \{i\} \cup \mathcal{N}_i. \qquad (2)$$

This stage can be performed off-line to reduce the computation time.
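As a concrete illustration, the off-line grouping stage described above can be sketched in a few lines of NumPy (the data layout and function names here are ours, not the authors' code):

```python
import numpy as np

def build_groups(X_l, K=128):
    """Off-line grouping: for each LR feature vector, store the index
    set of its K nearest neighbors (Euclidean distance).

    X_l : (N, d) array of LR gradient features, rows assumed l2-normalized.
    Returns an (N, K) array of neighbor indices, i.e. the sets N_i of Eqs. (1)-(2).
    """
    # Pairwise squared Euclidean distances via the Gram matrix.
    sq = np.sum(X_l ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X_l @ X_l.T)
    np.fill_diagonal(d2, np.inf)          # exclude the patch itself
    # K smallest distances per row -> index sets N_i (indices stored, not vectors).
    return np.argsort(d2, axis=1)[:, :K]
```

In practice only the returned index array needs to be stored, which is exactly the saving expressed by Eq. (2).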

    B. Low-Rank Matrix Recovery

The input LR image is also divided into patches. For an input patch $x_t$, we find its nearest neighbor $x_i$ in the training set $X_l$. From the above subsection, we can get the group information $\mathcal{G}_i$ related with $x_i$. The indices in $\mathcal{G}_i$ specify $K+1$ pairs of LR and HR feature vectors. These LR feature vectors are related with the input patch $x_t$. We stack these LR gradient feature vectors as columns to form a matrix $X_L$, i.e., $X_L = [x_i, x_{j_1}, \dots, x_{j_K}]$. Similarly, we have a matrix $X_H$ by stacking the HR intensity feature vectors, i.e., $X_H = [y_i, y_{j_1}, \dots, y_{j_K}]$.

The LR feature vectors in $X_L$ are located close to each other in space, and they form a low-dimensional subspace. But their corresponding HR feature vectors in $X_H$ may have larger variety in appearance due to the one-to-many mappings existing between one LR image and many HR images. To deal with this case, we apply the LRMR technique [6] to learn the underlying low-dimensional versions of these features. LRMR decomposes a matrix consisting of correlated vectors into the sum of a low-rank component and a sparse component. The column vectors in the low-rank matrix are the low-dimensional versions of the original vectors. They are more related with each other than before. The sparse component is a matrix representing the noise or differences among the original vectors [6]. Specifically, we attach $x_t$ to the LR gradient feature matrix $X_L$ and get the augmented matrix $B = [x_t, X_L]$. Then we perform the low-rank matrix decomposition on $B$ and $X_H$ respectively. The optimization functions [6] are formulated as

$$\min_{Z_l, E_l} \|Z_l\|_* + \lambda \|E_l\|_1 \quad \text{s.t.} \quad B = Z_l + E_l, \qquad (3)$$

$$\min_{Z_h, E_h} \|Z_h\|_* + \lambda \|E_h\|_1 \quad \text{s.t.} \quad X_H = Z_h + E_h, \qquad (4)$$

where $Z_l$ and $E_l$ are the low-rank component and sparse component of $B$, while $Z_h$ and $E_h$ are the low-rank component and sparse component of $X_H$; the nuclear norm $\|\cdot\|_*$ (i.e., the sum of the singular values) approximates the rank of a matrix, and the $\ell_1$-norm $\|\cdot\|_1$ approximates the sparsity of a matrix. To solve the minimization problems of Eqns. (3) and (4), the inexact augmented Lagrange multipliers (ALM) technique [9] is applied since it has excellent computational efficiency.

We get four components after the optimization process. The computed $Z_l$ can be divided into two parts, i.e., one is the low-rank component $z_t$ of the input patch $x_t$, and the other is the low-rank component $Z_L$ of the LR gradient feature matrix $X_L$,

$$Z_l = [z_t, Z_L], \qquad (5)$$

because the low-rank matrix decomposition does not change the identities of the columns.
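Problems (3) and (4) are instances of the standard robust PCA program, so the inexact ALM solver [9] can be sketched compactly as follows (the parameter choices are common defaults and are our assumptions, not values from the letter):

```python
import numpy as np

def soft_threshold(M, tau):
    """Entrywise shrinkage operator for the l1 term."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def rpca_ialm(D, lam=None, tol=1e-7, max_iter=500):
    """Inexact ALM for  min ||Z||_* + lam*||E||_1  s.t.  D = Z + E."""
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))            # standard RPCA weight
    norm_fro = np.linalg.norm(D)
    spec = np.linalg.norm(D, 2)                   # largest singular value
    Y = D / max(spec, np.abs(D).max() / lam)      # dual variable init
    E = np.zeros_like(D)
    mu, rho = 1.25 / spec, 1.5
    for _ in range(max_iter):
        # Z-step: singular value thresholding of D - E + Y/mu
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        Z = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # E-step: entrywise shrinkage
        E = soft_threshold(D - Z + Y / mu, lam / mu)
        R = D - Z - E
        Y = Y + mu * R
        mu = min(mu * rho, 1e7)
        if np.linalg.norm(R) / norm_fro < tol:
            break
    return Z, E
```

Applying `rpca_ialm` to $B$ and $X_H$ yields the four components $Z_l$, $E_l$, $Z_h$, $E_h$ used above.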

C. Neighbor Embedding

For the low-rank component $z_t$ of the input patch $x_t$, we find its $k$ nearest neighbors $z_{n_1}, \dots, z_{n_k}$ in the matrix $Z_L$. The


TABLE I
THE IMPROVEMENT OF SR PERFORMANCE INDUCED BY IBP FOR JLSR AND OUR METHOD

optimal weights are computed by minimizing the reconstruction error when using these neighbors to reconstruct $z_t$,

$$w^* = \arg\min_{w} \Big\| z_t - \sum_{j=1}^{k} w_j \, z_{n_j} \Big\|_2^2 \quad \text{s.t.} \quad \sum_{j=1}^{k} w_j = 1. \qquad (6)$$

After obtaining the weights, the HR intensity feature $y_t$ is reconstructed with these optimal weights and the low-rank components $z^h_{n_j}$ of the corresponding HR intensity features,

$$y_t = \sum_{j=1}^{k} w_j^* \, z^h_{n_j}. \qquad (7)$$

Finally, we scale the HR intensity feature and add the mean value $m$ of the LR patch to generate the HR patch $\hat{x}_h$,

$$\hat{x}_h = c \, y_t + m, \qquad (8)$$

where $c$ is a scale factor to further improve the SR quality [10] and is set to 1.7 empirically. Having obtained all HR patches, the initial HR image $X_0$ is produced by merging all HR patches, averaging the multiple predictions for the overlapping pixels between adjacent patches.
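Since Eq. (6) is the same constrained least-squares problem solved in LLE, it admits a closed-form solution via the local Gram matrix. A minimal sketch (variable names are ours):

```python
import numpy as np

def ne_weights(z_t, Z_N, reg=1e-8):
    """Solve Eq. (6): min ||z_t - sum_j w_j z_j||^2  s.t.  sum_j w_j = 1.

    z_t : (d,) low-rank feature of the input patch.
    Z_N : (d, k) matrix whose columns are its k neighbors.
    Uses the standard LLE trick: under the sum-to-one constraint the
    objective equals w^T C w for the Gram matrix C of centered neighbors.
    """
    k = Z_N.shape[1]
    diff = Z_N - z_t[:, None]            # center the neighbors on z_t
    C = diff.T @ diff                    # local Gram (covariance) matrix
    C += reg * np.trace(C) * np.eye(k)   # regularize for numerical stability
    w = np.linalg.solve(C, np.ones(k))
    return w / w.sum()                   # enforce the sum-to-one constraint

def ne_reconstruct(weights, Z_H_N):
    """Eq. (7): combine the HR low-rank features with the same weights."""
    return Z_H_N @ weights
```

`ne_reconstruct` then implements Eq. (7) with the weights returned for the LR side.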

    D. Post-Processing Procedure

The above-mentioned algorithm is performed on patches. Usually, the resultant image does not satisfy the global reconstruction constraint. Thus, we apply the iterative back-projection (IBP) algorithm [11] on the initial output to enforce the global reconstruction constraint as well as to maintain the consistency between the initial HR image and the final outcome. Let $X_0$ denote the initial estimation and $X$ represent the underlying HR image, which is assumed to yield the observed LR image $Y$ after being degraded by the operators of blurring $H$ and downsampling $D$, i.e., $Y = DHX$. The final reconstructed image $X^*$ is obtained from

$$X^* = \arg\min_{X} \|DHX - Y\|_2^2 + \tau \|X - X_0\|_2^2, \qquad (9)$$

where $\tau$ is a balancing parameter. The gradient-descent method is utilized to solve the above formulation,

$$X_{t+1} = X_t + \nu \left[ (DH)^T (Y - DHX_t) + \tau (X_0 - X_t) \right], \qquad (10)$$

where $X_t$ denotes the HR image estimation after the $t$-th iteration, and $\nu$ is the step size of the gradient descent. The improvement of SR performance induced by IBP is given in Table I, where the result for JLSR [5] is also given because it is also performed on patches. From Table I, we see that the IBP procedure improves the SR performance of both methods.
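A toy sketch of the IBP iteration (10), with a periodic 3×3 box blur standing in for $H$ and decimation for $D$ (both operators and all parameter values here are illustrative assumptions, not the letter's settings):

```python
import numpy as np

def blur(X):
    """Periodic 3x3 box blur; symmetric, so it equals its own adjoint H^T."""
    out = np.zeros_like(X)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            out += np.roll(np.roll(X, di, axis=0), dj, axis=1)
    return out / 9.0

def down(X, s=3):
    """Decimation operator D: keep every s-th pixel."""
    return X[::s, ::s]

def down_T(Y, shape, s=3):
    """Adjoint D^T: place LR pixels back on the HR grid, zeros elsewhere."""
    X = np.zeros(shape)
    X[::s, ::s] = Y
    return X

def ibp(Y, X0, tau=0.1, nu=0.2, iters=50, s=3):
    """Gradient-descent iterations of Eq. (10) starting from X0."""
    X = X0.copy()
    for _ in range(iters):
        r = Y - down(blur(X), s)                          # LR residual Y - DHX_t
        X = X + nu * (blur(down_T(r, X.shape, s)) + tau * (X0 - X))
    return X
```

Because the box blur is symmetric and decimation's adjoint is zero-filling, `blur(down_T(r))` realizes the $(DH)^T$ term of Eq. (10) exactly for these toy operators.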

    III. EXPERIMENTAL RESULTS AND DISCUSSION

In our experiments, the training HR images are selected from the software package of [12].¹ The HR images are downsampled by bicubic interpolation to generate the LR images. We only perform SR reconstruction on the luminance component, while the chrominance components are merely upscaled to the desired size using bicubic interpolation. Instead of using the LR images directly, a common practice is to first magnify the LR images by a factor of two using bicubic interpolation, because the middle-frequency information of LR images has greater correlation with the high-frequency information than the low-frequency information does [5], [12]. The gradient features are obtained using the strategy in [12]. The number of nearest neighbors $K$ for grouping is 128. The neighborhood size $k$ for the NE procedure is five. In this letter, the magnification factor is 3 for all experiments.

First, we perform experiments with different values of patch size and overlap. The PSNR results are shown in Fig. 1(b). They show that the PSNR values get larger as the overlap gets larger with the patch size fixed, and that the PSNR values decrease as the patch size increases with the overlap fixed. The highest PSNR value is achieved when the patch size is three and the overlap is two. We find that the optimal parameters remain unchanged when the magnification factor is four (please refer to the supplemental material for details). But as the overlap increases, the execution time of the algorithm also increases because the number of patches to be processed gets larger. To balance the SR performance and the computation time, we choose to divide the original LR images into 3×3 patches with an overlap of one pixel between adjacent patches. Correspondingly, the patch size is 9×9 with an overlap of three pixels for the HR images.

We compare our method with NESR [3], SAI [13], JLSR [5]

and the IBP algorithm [11]. The peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) are utilized to evaluate the performance of the SR results objectively. Table II shows the PSNR and SSIM values of the SR results obtained by the different algorithms. We can see that the proposed method gets larger PSNR and SSIM values than the other methods, which suggests the effectiveness of the proposed method.

Figs. 2 and 3 show the SR reconstruction images to compare our algorithm with other SR methods visually. We can observe that there are artifacts for NESR and that the edges are blurry for SAI. There are jaggy artifacts along the edges in the IBP reconstructed images. In contrast, our algorithm can produce clean images without artifacts and gets sharper edges than JLSR. This suggests that the subjective quality of the SR images obtained by our proposed method is also superior to the other methods.

To comprehensively validate the SR capability of the proposed method, we conduct more experiments on 20 images selected from the Berkeley Segmentation Database [14]. The average values of PSNR and SSIM for the SR results are reported in Table III. From the table, we can see that our approach also achieves the highest PSNR and SSIM values. This shows the stability and robustness of our method.

¹http://www.ifp.illinois.edu/~jyang29/
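For reference, the PSNR figures reported in the tables follow the standard definition over the mean squared error (for 8-bit images the peak value is 255); a minimal implementation:

```python
import numpy as np

def psnr(ref, est, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference and an estimate."""
    mse = np.mean((ref.astype(np.float64) - est.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```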


Fig. 2. Visual comparison with different methods on the Butterfly image magnified by a factor of 3. (a) NESR. (b) SAI. (c) JLSR. (d) IBP. (e) Our proposed method. (f) Original image. (Please refer to the electronic version and zoom in for better comparison.)

Fig. 3. Visual comparison with different methods on the Girl image magnified by a factor of 3. (a) NESR. (b) SAI. (c) JLSR. (d) IBP. (e) Our proposed method. (f) Original image. (Please refer to the electronic version and zoom in for better comparison.)

TABLE II
PSNRS AND SSIMS FOR THE RECONSTRUCTED IMAGES BY DIFFERENT METHODS

TABLE III
AVERAGE PSNRS AND SSIMS FOR THE SR RESULTS OF THE 20 IMAGES FROM THE BERKELEY SEGMENTATION DATABASE

    IV. CONCLUSION

This letter proposes a low-rank neighbor embedding method to perform single image SR reconstruction. Experiments compared with several SR algorithms validate the effectiveness of our approach both quantitatively and qualitatively. It is worth mentioning that we use only the low-rank components of HR patches when performing NE reconstruction. We do not use the sparse components, and how to take advantage of their information will be studied in future work. Moreover, self-learning approaches [15] have been widely studied in recent years because they do not require collecting or selecting an additional training set. We will consider applying a self-learning strategy in our further investigation.

REFERENCES

[1] S. Baker and T. Kanade, "Limits on super-resolution and how to break them," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 9, pp. 1167–1183, 2002.
[2] W. T. Freeman, T. R. Jones, and E. C. Pasztor, "Example-based super-resolution," IEEE Comput. Graph. Appl., vol. 22, no. 2, pp. 56–65, 2002.
[3] H. Chang, D.-Y. Yeung, and Y. Xiong, "Super-resolution through neighbor embedding," in IEEE Conf. CVPR, 2004, pp. 275–282.
[4] B. Li, H. Chang, S. Shan, and X. Chen, "Aligning coupled manifolds for face hallucination," IEEE Signal Process. Lett., vol. 16, no. 11, pp. 957–960, 2009.
[5] X. B. Gao, K. B. Zhang, D. C. Tao, and X. L. Li, "Joint learning for single-image super-resolution via a coupled constraint," IEEE Trans. Image Process., vol. 21, no. 2, pp. 469–480, 2012.
[6] E. J. Candès, X. D. Li, Y. Ma, and J. Wright, "Robust principal component analysis?," J. ACM, vol. 58, no. 3, Article 11, 2011.
[7] L. Ma, C. Wang, B. Xiao, and W. Zhou, "Sparse representation for face recognition based on discriminative low-rank dictionary learning," in IEEE Conf. CVPR, 2012, pp. 2586–2593.
[8] X. Cui, J. Huang, S. Zhang, and D. Metaxas, "Background subtraction using low rank and group sparsity constraints," in ECCV, vol. 7572, 2012, pp. 612–625.
[9] Z. Lin, M. Chen, and Y. Ma, "The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices," arXiv preprint arXiv:1009.5055, 2010.
[10] J. Yang, Z. Wang, Z. Lin, S. Cohen, and T. S. Huang, "Coupled dictionary training for image super-resolution," IEEE Trans. Image Process., vol. 21, no. 8, pp. 3467–3478, 2012.
[11] M. Irani and S. Peleg, "Motion analysis for image enhancement: Resolution, occlusion, and transparency," J. Vis. Commun. Image Represent., vol. 4, no. 4, pp. 324–335, 1993.
[12] J. C. Yang, J. Wright, T. S. Huang, and Y. Ma, "Image super-resolution via sparse representation," IEEE Trans. Image Process., vol. 19, no. 11, pp. 2861–2873, 2010.
[13] X. Zhang and X. Wu, "Image interpolation by adaptive 2-D autoregressive modeling and soft-decision estimation," IEEE Trans. Image Process., vol. 17, no. 6, pp. 887–896, 2008.
[14] D. Martin, C. Fowlkes, D. Tal, and J. Malik, "A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics," in IEEE 8th ICCV, 2001, vol. 2, pp. 416–423.
[15] D. Glasner, S. Bagon, and M. Irani, "Super-resolution from a single image," in IEEE 12th ICCV, 2009, pp. 349–356.