Choice of low resolution sample sets for efficient super-resolution signal reconstruction

Meghna Singh a, Cheng Lu a, Anup Basu b, Mrinal Mandal a,*
a Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada T6G 2V4
b Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada T6G 2V4

J. Vis. Commun. Image R. 23 (2012) 194–207. doi:10.1016/j.jvcir.2011.09.009. © 2011 Elsevier Inc. All rights reserved.
Article history: Received 13 March 2010; Accepted 23 September 2011; Available online 1 October 2011.
Keywords: Temporal registration; Recurrent non-uniform sampling; Confidence measure; Event dynamics; Super-resolution; Signal reconstruction; Iterative ranking; MR imaging.
* Corresponding author. Fax: +1 780 492 1811. E-mail addresses: [email protected] (M. Singh), [email protected] (C. Lu), [email protected] (A. Basu), [email protected], [email protected] (M. Mandal).

Abstract

In applications such as super-resolution imaging and mosaicking, multiple video sequences are registered to reconstruct video with enhanced resolution. However, not all computed registrations are reliable. In addition, not all sequences contribute useful information towards reconstruction from multiple non-uniformly distributed sample sets. In this paper we present two algorithms that help determine which low resolution sample sets should be combined in order to maximize reconstruction accuracy while minimizing the number of sample sets. The first algorithm computes a confidence measure derived as a combination of two objective functions. The second algorithm is an iterative rank-based reconstruction method that uses the confidence measure to assign priority to sample sets that maximize information gain while minimizing reconstruction error. Experimental results with real and synthetic sequences validate the effectiveness of the proposed algorithms. Applications of our work in medical visualization and super-resolution reconstruction of MRI data are also presented.

1. Introduction

Temporal registration, which is the computation of a correspondence in time between two sequences, is an important component in applications such as mosaicking [1], multiview surveillance [2], sprite generation [3], 3D visualization [4], medical imaging [5], time-series alignment [6] and super-resolution imaging [7,8]. In spatio-temporal super-resolution (SR), for example, low resolution (LR) videos from single or multiple sources are combined to generate a super-resolution video. These LR videos can differ from each other in terms of various parameters, such as viewpoint, frame rate (and therefore sampling instances) and spatial resolution. Assuming that the spatial viewpoint remains the same or can be estimated using stereo registration algorithms [9], a critical step in SR is to register the LR videos in time. When the frame rates of the videos are low, the accuracy of the temporal registration process becomes crucial.

In order to address the abovementioned issues, we use concepts developed in the field of recurrent non-uniform sample (RNUS) reconstruction. In RNUS reconstruction, a signal is reconstructed from multiple sample sets which are offset from each other by a known time interval [10]. Although RNUS was developed for applications where accurate time stamp information is available, and it is assumed that the sample sets are from the same continuous time signal, this assumption does not always hold true for SR reconstruction. However, it still provides useful insights into some of the factors that must be considered for SR reconstruction.

Given that in SR reconstruction multiple LR sequences are available, it is important to differentiate between LR sequences that have better registration and contribute more towards reconstruction, versus LR sequences that have higher uncertainty associated with their registration and may not contribute at all to the reconstruction process. One approach to making this differentiation is to associate with each pair of LR sequences a certain level of confidence, so that higher confidence indicates better reconstruction. Confidence measures have been proposed in a variety of fields in the past. In signal processing and pattern recognition, confidence measures have been computed extensively for speech recognition [11,12], where they are used to reliably assess the performance of speech recognition systems. These confidence measures are mostly based on probability distributions of likelihood functions of speech utterances, which are derived from Hidden Markov Models. In image processing, confidence measures have been proposed in motion estimation [13,14], stereo matching [15] and moving object extraction [16]. For example, in [15], spatial, temporal and directional confidence measures are developed based on the premise that good motion vectors are those that do not change drastically; hence, a confidence measure based on gradient information is computed that favors smooth gradients.



Fig. 1. Illustration of recurrent non-uniform sampling with two sample sets.


Our work is unique as it introduces the concept of a confidence measure in temporal registration and reconstruction from recurrent non-uniform samples. The formulation criterion for the confidence measure is two-fold: (1) it provides an estimate of how much confidence we have in the registration, and (2) it also provides an estimate of how much new information is added to the reconstruction process by the inclusion of a particular sample set. We also present an iterative ranking method that not only prioritizes the sample sets but, given that some registration may be inaccurate, also introduces a threshold limit beyond which adding more sample sets becomes redundant. Preliminary results of our work have appeared in [17,18]. In this paper, we present a detailed examination of the confidence measure and the various factors that influence it. We also address previously unanswered questions about determining weights for the confidence measure and present quantitative and subjective results of the application of this work in super-resolution magnetic resonance (MR) imaging.

The rest of this paper is organized as follows. In Section 2, we provide some preliminary definitions and review some approaches that are used in this work. In Section 3, we present our confidence measure (along with a detailed discussion of various influencing factors) and an iterative greedy rank based reconstruction method. Evaluation of the confidence measure and ranking algorithm with 1D (synthetic and audio) and 2D (real video) data is presented in Section 4. In Section 5, we discuss the application of the proposed method in SR MR imaging and present performance results for this application. Lastly, conclusions of this work and ideas for future work are presented in Section 6.

2. Preliminary definitions and review

In this section we review some preliminary definitions and methods with regard to recurrent non-uniform sampling, feature extraction, event modeling and super-resolution reconstruction that are used in this work.

2.1. Recurrent non-uniform sampling

Recurrent non-uniform sampling (RNUS) describes the sampling strategy where a signal is sampled below its Nyquist rate, but multiple such sample sets offset by a time delay are available, i.e. the sampling frequency is fixed, but the sampling time is randomly initialized. Fig. 1 illustrates such recurrent non-uniform sampling, where x(t) is a 1D continuous time signal which is sampled at a sampling rate of T, giving rise to samples at T, 2T, ..., MT.

Fig. 2. (a) Reconstruction from uniform samples using sinc kernels. (b) Illustration of non-uniform samples which can be expressed as linear combinations of samples from (a).

Another sample set is also acquired at a sampling rate of T; however, this sample set is offset by a timing offset τ.

Direct reconstruction of a continuous signal from its N non-uniformly sampled sequences [19] can be done as follows:

x(t) = \sum_{n=1}^{N} \sum_{k \in \mathbb{Z}} x(kT + M_n)\, \phi_n\big(t - (kT + M_n)\big),   (1)

where M_n = (n-1)T/N + \tau_n and \phi_n represents the reconstruction kernels, such as splines, Lagrange polynomials and the cardinal series. An indirect approach to reconstruction from RNUS is to derive uniformly separated samples from the non-uniform signal instances, and then reconstruct using the standard interpolation formula in (2). Suppose a bandlimited signal x(t) is sampled at the Nyquist rate to obtain uniform samples x(kT); then x(t) can be reconstructed from the samples using the interpolation formula:

x(t) = \sum_{k=-\infty}^{+\infty} x(kT)\, \frac{\sin(\Omega(t - kT)/2)}{\Omega(t - kT)/2}, \qquad T = 2\pi/\Omega.   (2)

Let x_0, x_1 and x_2 correspond to three discrete samples of x(t) taken with a uniform time interval at times t_0, t_1 and t_2 (see Fig. 2(a)). Assuming a finite window of reconstruction (instead of the infinite samples in (2)), an approximate reconstructed signal can be computed as:

\hat{x}(t) = x_0\, \mathrm{sinc}(t - t_0) + x_1\, \mathrm{sinc}(t - t_1) + x_2\, \mathrm{sinc}(t - t_2).   (3)

If x(t) was also sampled at non-uniform time instances t'_0, t'_1 and t'_2, as shown in Fig. 2(b), by substituting t with t'_i (0 ≤ i ≤ 2) in (3) we can write the following linear equations:

x(t'_i) = x_0\, \mathrm{sinc}(t'_i - t_0) + x_1\, \mathrm{sinc}(t'_i - t_1) + x_2\, \mathrm{sinc}(t'_i - t_2), \qquad 0 ≤ i ≤ 2,   (4)

where x(t'_i) are the known non-uniform samples. Eq. (4) can be expressed as a system of linear equations:



B = A \cdot x,   (5)

where

A = \begin{bmatrix} \mathrm{sinc}(t'_0 - t_0) & \mathrm{sinc}(t'_0 - t_1) & \mathrm{sinc}(t'_0 - t_2) \\ \mathrm{sinc}(t'_1 - t_0) & \mathrm{sinc}(t'_1 - t_1) & \mathrm{sinc}(t'_1 - t_2) \\ \mathrm{sinc}(t'_2 - t_0) & \mathrm{sinc}(t'_2 - t_1) & \mathrm{sinc}(t'_2 - t_2) \end{bmatrix}, \quad B = [x'_0, x'_1, x'_2]^T, \quad x = [x_0, x_1, x_2]^T.   (6)

These linear equations can be solved using standard methods (such as LU decomposition [20]) to calculate the sample values at the uniform sampling instances (x in (5)). By plugging the solution for x (the sample values at uniform instances of time) into (3), an approximate reconstruction of the original signal can be obtained.
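As an illustration, the indirect reconstruction route above can be sketched in a few lines of Python. This is a toy sketch, not the authors' implementation: `np.linalg.solve` stands in for the LU decomposition mentioned above, and `np.sinc` assumes a unit sampling interval (T = 1).

```python
import numpy as np

def solve_uniform_samples(t_uniform, t_nonuniform, x_nonuniform):
    """Recover sample values at the uniform instants from non-uniform samples:
    form the sinc matrix A of Eq. (6), A[i, j] = sinc(t'_i - t_j), and solve
    B = A x as in Eq. (5).  Note np.sinc(u) = sin(pi u)/(pi u), i.e. T = 1."""
    A = np.sinc(np.subtract.outer(t_nonuniform, t_uniform))
    return np.linalg.solve(A, x_nonuniform)

def sinc_reconstruct(t, t_uniform, x_uniform):
    """Finite-window approximate reconstruction, as in Eq. (3)."""
    return np.sinc(np.subtract.outer(t, t_uniform)) @ x_uniform
```

With the uniform samples recovered, plugging them back into the finite sinc sum of Eq. (3) gives the approximate continuous reconstruction.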

These sampling and reconstruction formulations can be easily extended to 2D time varying signals such as video sequences. Consider the case when multiple cameras (with fixed frame rates) capture a scene at a fixed spatial resolution. If the acquisition time of the cameras is not hardware controlled, the sample instances of the video sequences are offset from each other by unknown non-uniform offsets. Thus, reconstruction of a high resolution video sequence from multiple LR video sequences can be equated to reconstructing from 2D RNUS.

2.2. Feature extraction and event modeling

Prior to SR reconstruction, registration is done either by using all the pixels in the video frames (which can be computationally expensive) or by extracting features and using the feature trajectories for alignment. Let {S_i, 1 ≤ i ≤ N} denote N video sequences that are acquired at a constant frame rate and are offset from each other by a random time interval τ_n. Each sequence S_i has M frames, such that I_{i,k} denotes the kth frame of the ith video sequence. Features are extracted in all sequences to generate discrete trajectories X_{i,k,p} (1 ≤ p ≤ P, where P is the number of features extracted). Features can be extracted based on point characteristics such as corners [21,22], on region characteristics such as shape and color [23], or on a combination of both. In this work we implement a region based feature extraction method where we extract a single blob region based on motion and color information and use the centroid of the blob as a feature. If multiple features are extracted, they can be tracked using algorithms such as the KLT tracker [24], the Kalman filter [25] or the RANSAC algorithm [26] to generate feature trajectories. For brevity, in the following discussion we will ignore the subscript p and assume that X_{i,k} refers to all the extracted features.

On their own, the feature trajectories are discrete representations of an event or activity in the scene, and we need to interpolate between the discrete representations for sub-frame registration. An efficient approach to generating continuous representations of the discrete trajectories is to build event models. In our past work [4], we built a continuous time event model (X_{i,t}) of the discrete feature space (X_{i,k}) as follows:

X_{i,t} = X_{i,k}\, \beta_i + \epsilon_i,   (7)

where \beta_i is the regression parameter and \epsilon_i is the model error term. An approximate regression parameter \hat{\beta}_i is iteratively computed such that the following weighted residual error is minimized:

\epsilon_i = \sum_{k=1}^{M} w_k \left\| X_{i,t} - \hat{X}_{i,t} \right\|^2, \qquad t = k, \quad \hat{X}_{i,t} = X_{i,k}\, \hat{\beta}_i.   (8)

The method of computing the weights w_k is described in [4]. Using event models results in a more accurate estimate of the subframe temporal offset compared to the commonly used linear interpolation approach [7]. Once the event models X_{i,t} are available, the temporal offset (τ_n) between the ith and jth sequences is computed by minimizing the following function:

\tau_n = \arg\min_{\tau_n} \left[ \sum_t \left\| X_{i,t} - X_{j,t+\tau_n} \right\|^2 \right], \qquad i \neq j.   (9)

The above minimization formulation deals with event models derived from the entire sequence length and therefore results in more accurate computation of the offset τ_n.
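A minimal sketch of the offset search in (9), under the assumption that the fitted event models are available as callables; the grid of candidate offsets and the function names are illustrative, not from the paper (the authors do not specify their optimizer).

```python
import numpy as np

def estimate_offset(model_i, model_j, t, candidates):
    """Grid-search estimate of the temporal offset tau_n in Eq. (9): pick the
    candidate offset minimizing the SSE between the two continuous event
    models evaluated over a common time grid t."""
    sse = [np.sum((model_i(t) - model_j(t + tau)) ** 2) for tau in candidates]
    return candidates[int(np.argmin(sse))]
```

For example, if the second event model is the first delayed by 0.3 of a sample period, the search recovers an offset of 0.3 (up to the candidate grid spacing).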

2.3. Super-resolution reconstruction

Super-resolution reconstruction is usually solved using iterative methods such as projection onto convex sets (POCS) [27,28], the iterated back-projection algorithm [7] and stochastic methods [29,30]. In these methods an image generative model is created, which models the various phenomena that cause LR acquisition, such as finite aperture time, motion blurring and spatial downsampling. From an initial guess of the SR image, LR image estimates are generated via the image-generative model. These LR images are then compared to the actual available LR images, and the SR image is iteratively updated so that the difference between the LR estimates and the actual LR images is minimized.

Similar resolution enhancement techniques have been used in medical image analysis. However, image generative models are not easy to create in medical imaging as the acquisition methodology is very different. For example, in magnetic resonance (MR) imaging, data is acquired in k-space (the frequency domain) and the acquisition protocols can be as varied as spiral, cartesian or radial acquisition. Thus, for MR images, reconstruction is usually performed in the frequency space. Consider the case when the LR MR images are acquired as undersampled radial projections. A registration algorithm is used to determine which radial projections from multiple LR MR images correspond to the same instance of the event [31]. These projections can be combined to increase the sampling resolution in k-space. The inverse Fourier transform of these radial projection lines cannot be directly computed using standard inverse Fourier transform implementations. Therefore, these LR radial samples are regridded to a cartesian representation by weighting the data samples (based on their distances from the cartesian coordinates) and convolving with a finite kernel. A symmetric Kaiser–Bessel window (β = 3) is usually used to interpolate frequency information in between the radial projections [32]. Computing the inverse Fourier transform of the regridded and interpolated data yields the super-resolved MR images. This approach to SR reconstruction of MRI data is often used in imaging of dynamic physiological phenomena such as cardiac imaging [5], and is the method we have used for validating the confidence measure in the SR MRI application.
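To make the regridding step concrete, here is a 1D toy sketch of convolution gridding with a Kaiser–Bessel kernel (β = 3 as in the text; the kernel width of 4 grid cells and the simple weight-sum density compensation are our assumptions, not the paper's).

```python
import numpy as np

def kaiser_bessel(u, width=4.0, beta=3.0):
    """Kaiser-Bessel gridding kernel, zero outside |u| <= width/2.
    np.i0 is the zeroth-order modified Bessel function of the first kind."""
    u = np.asarray(u, dtype=float)
    r = 2.0 * u / width
    inside = np.abs(r) <= 1.0
    out = np.zeros_like(u)
    out[inside] = np.i0(beta * np.sqrt(1.0 - r[inside] ** 2)) / np.i0(beta)
    return out

def regrid_1d(coords, values, grid_size, width=4.0, beta=3.0):
    """Convolution gridding (1D toy): each non-cartesian k-space sample
    deposits its value onto nearby cartesian grid points with Kaiser-Bessel
    weights; accumulated weights provide a simple density compensation."""
    grid = np.zeros(grid_size, dtype=complex)
    wsum = np.zeros(grid_size)
    cells = np.arange(grid_size, dtype=float)
    for c, v in zip(coords, values):
        w = kaiser_bessel(cells - c, width, beta)
        grid += w * v
        wsum += w
    nonzero = wsum > 0
    grid[nonzero] /= wsum[nonzero]
    return grid
```

In the actual MRI pipeline this convolution is done in 2D k-space, followed by an inverse FFT and deapodization; the sketch only shows the kernel-weighted deposit-and-normalize idea.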

3. Proposed method

In standard SR reconstruction, as shown in Fig. 3(a), all available input sequences are registered and a HR sequence is reconstructed. One of the motivations behind this work is to develop an enhanced SR system, as shown in Fig. 3(b), which receives multiple low resolution videos of the same (or related) scene as input and delivers as output a ranking of the sequences which should be used for further reconstruction. The system discards those sequences which either do not provide any new information for reconstruction or whose registration is unreliable. The main modules of this system include the computation of a confidence measure and an iterative greedy ranking algorithm. In this section, we first discuss the various factors that influence reconstruction from multiple sample sets. We then present an algorithm to compute a confidence measure which


Fig. 3. Simplified flowcharts of (a) the standard SR reconstruction process, and (b) the enhanced SR reconstruction process based on the computed confidence measure and iterative greedy ranking algorithm.

Fig. 4. Illustration of the decrease in reconstruction error with increasing range of τ_n (reported as a normalized number in [0,1], where 1 corresponds to the sampling rate T).


is representative of these factors, followed by an algorithm to iteratively rank the sequences.

3.1. Factors affecting sample confidence

When reconstructing signals from multiple sample sets, two factors need to be kept in mind: (1) the uniformity (or lack thereof) of the sample data, and (2) the accuracy with which the datasets have been registered. In the following sections we look at the influence of both these factors on the development of an efficient approach to super-resolution registration.

3.1.1. Non-uniformity of sample sets

Consider the reconstruction formulation in Section 2.1 and the system of linear equations (5), which can be used to derive approximations of samples at uniform instances from the known non-uniform samples. Due to the finite window of reconstruction, the system of equations represents only an approximate linear relation between uniform and non-uniform samples, and this approximation can result in an ill-conditioned linear system. Also, if the non-uniform sampling instances are close to each other, then due to finite precision, round-off error or erroneous computation of the offset τ_n, the system of equations can become singular. By maximizing the row-wise difference between the coefficients of matrix A in (5) we can reduce the chances of A becoming ill-conditioned or singular. Maximizing the difference between the coefficients of A translates into maximizing the distance between the closest sampling time instances of the ith and jth recurrent non-uniform sample sets as follows:

\text{maximize} \left[ (t'_i - t_0) - (t'_j - t_0) \right], \quad i \neq j
\Rightarrow \text{maximize} \left[ t'_i - t'_j \right] \Rightarrow \text{maximize} \left[ \tau_{ij} \right].   (10)

One interpretation of (10) is that for optimal reconstruction, the sampling instances of the recurrent sample sets should be as far away from each other as possible. Intuitively, without any a priori information about the signal, this allows for sampling of the major trends in the signal. Proponents of non-uniform sampling argue that sampling such that a higher number of samples are taken in high frequency regions would be an optimal sampling approach.

However, note that most acquisition methods have fixed sampling rates with little user control over the sampling process. The validity of the criterion in (10) can be demonstrated experimentally as follows. We generate random HR signals bandlimited to a user-controlled frequency band (by applying a low-pass filter), and subsample them to create multiple LR sample sets. We assume that there is no temporal registration error, and that the location of the sample sets with respect to each other is known accurately. We add up to 10 sample sets iteratively during reconstruction, repeating the experiment with 100 signals with different ranges of temporal offset τ_n, i.e., τ_n ∈ [0,0.1], τ_n ∈ [0,0.2], τ_n ∈ [0,0.3], τ_n ∈ [0,0.4], τ_n ∈ [0,0.5], respectively. The performance is shown in Fig. 4. Observe the decrease in the reconstruction error as more and more sample sets are combined (for different ranges of τ_n). It can be seen that sample sets that have a large offset range between each other result in lower reconstruction errors with fewer added sample sets, when compared to sample sets that


have a smaller offset from each other. This implies that being able to measure the non-uniformity between sample sets is important when reconstructing from RNUS.

3.1.2. Error in temporal registration

Computing the temporal offset between sample sets is a non-trivial task and there is a possibility of error in the computation. In order to show the effect of temporal error on the reconstruction error, we performed the following experiment. First, we generated random HR signals bandlimited to a user-controlled frequency (by applying a low-pass filter), and sampled them to create multiple LR sample sets. We then intentionally introduced temporal errors of 5%, 15%, 25%, 35%, 45% and 55% of the sampling rate T. Finally, we added up to 10 sample sets iteratively during reconstruction and calculated the reconstruction errors, repeating the experiment with 100 signals for each temporal error. The experimental results are shown in Fig. 5. Note that when the temporal error is above 25%, adding more sample sets does not improve the reconstruction results; in fact, the reconstruction results deteriorate. Since error in temporal registration determines a threshold limit on the number of sample sets needed to achieve a certain reconstruction efficiency, we need to include a suitable representation of this error in our confidence measure. In conclusion, we found that error in temporal registration has two effects.

• As the error increases, more and more sample sets are needed to achieve the same reconstruction efficiency as with fewer, more accurately registered sample sets.
• For a given distribution of error, there exists a threshold number of sample sets beyond which adding more sample sets does not affect the reconstruction error, and adding more sample sets is redundant.
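The effect of registration error can be illustrated with a toy version of this experiment, using the sinc linear system of Section 2.1: the second sample set is truly offset by τ, but registration reports τ + δ. All parameter values here (τ = 0.35, n = 6 samples, unit sampling interval) are illustrative, not the paper's.

```python
import numpy as np

def sinc_interp(t, t_u, x_u):
    # Finite-window sinc sum, as in Eq. (3), generalized to len(t_u) samples.
    return np.sinc(np.subtract.outer(t, t_u)) @ x_u

def recon_error(delta, tau=0.35, n=6, seed=1):
    """Sum-of-squared reconstruction error when the assumed offset of a
    recurrent sample set is wrong by `delta` (in units of T = 1)."""
    rng = np.random.default_rng(seed)
    t_u = np.arange(n, dtype=float)       # uniform sampling instants
    x_u = rng.standard_normal(n)          # target uniform sample values
    b = sinc_interp(t_u + tau, t_u, x_u)  # observed non-uniform samples
    # Build the sinc matrix of Eq. (6) from the (mis)registered times
    A = np.sinc(np.subtract.outer(t_u + tau + delta, t_u))
    x_hat = np.linalg.solve(A, b)         # solve Eq. (5) with wrong times
    return float(np.sum((x_hat - x_u) ** 2))
```

With δ = 0 the recovery is exact up to numerical precision; a misregistration of δ = 0.3 of the sample period produces a substantially larger error, mirroring the trend in Fig. 5.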

3.2. Confidence measure

3.2.1. Computing the confidence measure

We have discussed two factors related to recurrent non-uniform samples that affect the reconstruction process. Due to lack of correct time-stamp information, we can neither accurately determine the non-uniformity of sample sets, nor the error in temporal registration. We can, however, determine other parameters which are indicative of non-uniformity and temporal registration error. We define two such parameters in the form of objective functions U_g and U_l, which are presented next.

Given two sample sets x(kT) and x(kT + τ_n) (as shown in Fig. 1) and their respective feature spaces X_{i,k,p} (defined in Section 2), we

Fig. 5. Effects of temporal errors in signal reconstruction.

define an objective function that estimates the non-uniformity of the samples using the following equation:

U_g = \sum_{p=1}^{P} \sum_{k=1}^{M} \left\| X_{i,kT,p} - X_{j,kT+\tau_n,p} \right\|^2, \qquad i, j \in (1 \ldots N).   (11)

Intuitively, U_g represents the global registration error of the discrete trajectories. Discrete samples that are closer to each other have relatively smaller differences in sample values compared to samples that are farther apart in time from each other. Thus, the formulation of U_g in (11), as a sum of differences between the discrete feature trajectories (after they have been approximately registered), reflects how far apart the sample sets are in time.

We also propose the following objective function that estimates the error in temporal registration (subsequent to computing the continuous event models \hat{X}_{i,t} [4]):

U_l = \sum_{p=1}^{P} \sum_t \left\| \hat{X}_{i,t,p} - \hat{X}_{j,t+\tau_n,p} \right\|^2, \qquad i, j \in (1 \ldots N).   (12)

Intuitively, U_l represents the local registration error of the event models, and event models that have been incorrectly registered result in larger values of U_l. These objective functions, defined in (11) and (12), give a general idea of the confidence in the temporal registration. However, they do not relate to the confidence by a simple proportionality, i.e. a large U_g does not imply poor confidence in the registration. It has been experimentally observed that a large U_g (along with a small U_l) indicates a more uniform distribution of the sample sets and hence better signal reconstruction, or higher confidence in the choice of sample sets. Therefore, we define the confidence measure as a linear weighted sum of U_g and U_l as follows:

v = w_g (U_g)^q + w_l (U_l)^r,   (13)

where w_g and w_l are weights assigned to the contributions of the two objective functions to the overall confidence measure. The relational parameters q and r, which define whether the objective functions and the confidence measure are directly or inversely related, are examined shortly. A method to compute the weights is discussed later in this section. We now present two hypotheses with respect to U_g and U_l which are intuitively supported by our discussion in Section 3.1 and by the experimental validation that follows.

Hypothesis 1. U_g is an indicator of τ_n, and a value of τ_n which places the sample sets as far apart from each other as possible results in better reconstruction. Hence, an increase in U_g should increase the confidence measure.

Hypothesis 2. U_l is an indicator of the overall error in registration, and a large U_l results in poorer reconstruction. Hence, an increase in U_l should decrease the confidence measure.

We validate the above hypotheses on 1D synthetic data. For each experiment a pseudo-random high resolution (HR) 1D signal is generated at a user specified bandwidth using a modified version of Marsaglia's subtract-with-borrow algorithm [33]. LR sample sets are generated by sampling the HR signal with a fixed sampling rate and a uniformly distributed temporal offset τ_n. U_g and U_l are computed for various combinations of the LR sample sets. An approximate HR signal is also reconstructed from the LR combinations using the code provided in [10] for Feichtinger's algorithm [34]. Reconstruction error is computed as the sum of squared errors (SSE) between the reconstructed signal and the original signal. Fig. 6(a) plots the reconstruction error versus U_g computed for the synthetic test signals. It can be seen that U_g demonstrates a linear relationship with the reconstruction error: as U_g increases the


Fig. 6. (a) Relationship between reconstruction error and objective function Ug. (b) Relationship between reconstruction error and objective function Ul.

M. Singh et al. / J. Vis. Commun. Image R. 23 (2012) 194–207 199

reconstruction error decreases, i.e. the confidence measure associated with Ug should increase in direct proportion. Therefore 'q' in (13) can be approximated as '1'. We also fitted the reconstruction error versus Ug curve with a quadratic function, and it can be seen from Fig. 6(a) that a linear fit is a sufficiently good approximation of the curve. The same analysis is applied to the relationship between the reconstruction error and Ul, which is shown in Fig. 6(b). It is observed that as Ul increases, the reconstruction error increases. Hence, 'r' in (13) can be approximated as '−1'. The scales of the values of Ug and Ul^(−1) are different, hence they are normalized to lie between [0, 1]. The normalization of Ug (and similarly of Ul^(−1)) is computed as follows:

Ũg = (Ug − min(Ug)) / (max(Ug) − min(Ug)).    (14)

The proposed confidence measure v can therefore be expressed as follows:

v = wg Ũg + wl Ũl^(−1).    (15)
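As an illustration, the normalization of (14) and the confidence measure of (15) can be sketched as follows. This is a minimal sketch, not the authors' code: the function and array names, the default weights, and the guard against a zero range are our own choices.

```python
import numpy as np

def normalize(u):
    """Min-max normalization of (14): scales values to lie in [0, 1]."""
    u = np.asarray(u, dtype=float)
    rng = u.max() - u.min()
    # Guard (our addition): a constant array normalizes to all zeros.
    return (u - u.min()) / rng if rng > 0 else np.zeros_like(u)

def confidence(U_g, U_l, w_g=0.5, w_l=0.5):
    """Confidence measure of (15): v = w_g * ~U_g + w_l * ~(U_l^-1).

    U_g, U_l -- objective function values for each candidate pair of
    sample sets; larger U_g and smaller U_l mean higher confidence.
    Assumes U_l has no zero entries (U_l^-1 must exist).
    """
    Ug_t = normalize(U_g)                                     # ~U_g, q = 1
    Ul_inv_t = normalize(1.0 / np.asarray(U_l, dtype=float))  # ~(U_l^-1), r = -1
    return w_g * Ug_t + w_l * Ul_inv_t

# Example: three candidate pairs; the pair with the highest v is preferred.
v = confidence([0.2, 0.5, 0.9], [4.0, 2.0, 1.0])
```

Here the third pair has both the largest Ug and the smallest Ul, so it receives the highest confidence.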

Fig. 7. Confidence measure values computed for two different sets of weights, (wg1, wl1) and (wg2, wl2), where the horizontal axis represents combinations of sample sets and the vertical axis corresponds to normalized values of the confidence measure. The linear fit for (wg2, wl2) results in a smaller residual, and as the reconstruction error increases along the x-axis, the confidence measure decreases. Hence the confidence measure with weights (wg2, wl2) is more accurate than the confidence measure with weights (wg1, wl1).

3.2.2. Computing weights wg and wl

In this section, we propose a weight optimization strategy to compute the confidence measure accurately.

It can be observed in Fig. 6(a) that the reconstruction error decreases as the objective function Ug increases. Conversely, in Fig. 6(b), the reconstruction error increases as the objective function Ul increases. A linear fit is an adequate way to represent such relationships, although the fit has some residual error. Ideally, we want the confidence measure to increase linearly with a decrease in the associated reconstruction error. This requirement is a key factor in determining the weights (wg, wl) that control the effect of sampling non-uniformity and offset estimation. The confidence measure (v) computed using arbitrary values of the weights may not follow a linear curve. Therefore, in order to maintain a linear relationship between the confidence measure (which consists of Ũg and Ũl) and the reconstruction error, the goal of tuning the weights is to reduce the residual of a linear fit of the confidence measure. In Fig. 7, for example, we generated five low resolution sample sets from a high resolution signal, and computed the


Table 1
Pseudocode for computing optimal weights.

For wg = 0.1 : Δwg : 0.9
  (i) Compute v(wg) = wg·Ũg + wl·Ũl^(−1) = wg·Ũg + (1 − wg)·Ũl^(−1)
  (ii) Perform a quick sort based on either Ug or Ul
  (iii) Construct a linear fit to estimate v̂(wg)
  (iv) Compute the residual of the linear fit as: R(wg) = ||v̂(wg) − v(wg)||²
End
(v) Optimize the weight as: wg_opt = argmin_wg R(wg) = argmin_wg ||v̂(wg) − v(wg)||²
(vi) Compute wl_opt = 1 − wg_opt
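The grid search of Table 1 can be sketched as below, assuming the normalized objective values Ũg and Ũl^(−1) have already been computed for each pair of sample sets. The sketch is ours: it uses an ordinary least-squares line (numpy.polyfit) for the linear fit in step (iii), and the variable names are not from the paper.

```python
import numpy as np

def optimal_weights(Ug_t, Ul_inv_t, dw=0.1):
    """Grid search of Table 1: pick w_g in (0, 1) minimizing the residual
    of a linear fit to the sorted confidence measure."""
    order = np.argsort(Ug_t)[::-1]              # step (ii): sort by decreasing U_g
    best_wg, best_res = None, np.inf
    for wg in np.arange(0.1, 0.91, dw):
        v = wg * Ug_t[order] + (1 - wg) * Ul_inv_t[order]  # step (i)
        x = np.arange(len(v))
        a, b = np.polyfit(x, v, 1)              # step (iii): linear fit v_hat
        v_hat = a * x + b
        res = np.sum((v_hat - v) ** 2)          # step (iv): residual R(w_g)
        if res < best_res:                      # step (v): argmin over the grid
            best_wg, best_res = wg, res
    return best_wg, 1 - best_wg                 # step (vi): w_l = 1 - w_g
```

For example, if Ũg happens to decrease exactly linearly in sorted order while Ũl^(−1) oscillates, the residual (1 − wg)² · R shrinks as wg grows and the grid search returns wg = 0.9.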


confidence measures with two sets of weights, (wg1, wl1) and (wg2, wl2), for all 10 (5C2) possible sample set combinations. We then sorted the confidence measures based on decreasing Ug. Note that the reconstruction errors of the sample set combinations are in ascending order as Ug decreases, i.e. the reconstruction error increases along the x-axis. Ideally, if the confidence measures lie exactly on the fitted line, the confidence measure decreases linearly as the reconstruction error increases. In the case of Fig. 7, the weight set (wg2, wl2) results in a smaller residual error (i.e. it maintains linearity better) in the linear estimation of v as compared to (wg1, wl1), and is hence more suited for computing the confidence measure.

In reality, we cannot estimate the reconstruction error of a set of samples, since the original signal is not available to us. However, the objective functions Ug and Ul can be computed and, as validated previously, these functions are linearly related to the reconstruction error. An additional assumption made with respect to the weights is that they sum to unity, i.e., wg + wl = 1. Pseudocode for computing the optimal weights is presented in Table 1. Let there be N sample sets of an event from which we compute Ug and Ul for all NC2 pairs of sample sets. We sort the sample set pairs based on either Ug or Ul (we found experimentally that the choice of the objective function does not affect the computation of optimal weights). For incremental increases Δwg in the value of wg (with wl = 1 − wg), we iterate over steps (i)–(iv) described in Table 1. The weight corresponding to the minimal residual value is chosen as the optimal weight.

In order to test the proposed weight optimization strategy we generated five sample sets from a high resolution signal. We considered all 10 (5C2) combinations of these sample sets and reconstructed an estimate of the original signal. The aim of the experiment was to optimize the weights (based on the linearization strategy discussed) such that the highest confidence measure for the combinations corresponds to the lowest reconstruction error. The results for all combinations (shown in column one) with respect to all possible weights (ranging from 0.1 to 0.9) are shown in Table 2. The corresponding reconstruction error, shown in the last column, is also presented in Table 2. Note that the optimal weight in this case is wg = 0.9, and the results are sorted in descending order according to the reconstruction error, i.e. SSE. It can be seen that optimizing the weights results in the desired correspondence between a high confidence measure and a low reconstruction error, whereas the same relationship does not hold true for other weights in the confidence measure.

Table 2
Reconstruction error values for optimized and sub-optimal weights wg.

Combinations  wg                                                                   SSE
              0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
S1–S4         1.000  1.000  1.000  1.000  1.000  0.336  0.065  0.000  0.000  1578.329
S1–S3         0.420  0.370  0.313  0.240  0.132  0.000  0.000  0.076  0.148  1305.350
S4–S5         0.242  0.187  0.131  0.069  0.000  0.009  0.075  0.177  0.261  1210.869
S2–S4         0.192  0.149  0.116  0.099  0.132  0.153  0.215  0.306  0.378  1196.593
S2–S3         0.072  0.029  0.000  0.000  0.078  0.188  0.290  0.395  0.472  1241.879
S3–S4         0.049  0.014  0.000  0.027  0.168  0.278  0.375  0.471  0.541  1264.508
S1–S2         0.022  0.000  0.010  0.082  0.321  0.424  0.508  0.588  0.645  1303.088
S2–S5         0.012  0.005  0.040  0.159  0.501  0.580  0.644  0.703  0.745  1254.277
S3–S5         0.000  0.009  0.073  0.244  0.701  0.756  0.797  0.833  0.858  1195.009
S1–S5         0.008  0.041  0.144  0.387  1.000  1.000  1.000  1.000  1.000  1135.742

Fig. 8. Flowchart of the iterative ranking method based on the proposed confidence measure. FR* indicates a RNUS reconstruction algorithm from [10].

3.3. Iterative rank-based algorithm

In reconstructing from multiple sample sets we need to order the sample sets such that the information added for reconstruction is maximized and the error in the reconstruction is minimized. This can be accomplished by ranking the possible combinations of the sample sets based on the proposed confidence measure. We use ranking instead of directly using the numerical confidence measure scores, as the scale of the confidence may change over each iteration, while ranking is a more consistent relative measure. We assume that in each iteration the number of distinct ranks decreases by 1. In practice, however,



confidence measure scores may result in ties. In such cases a weighted measure of the previous rank score can be added to the current rank to break the tie. This weighted addition of the previous rank incorporates prior rank information rather than arbitrarily choosing one sample set over another.

A flowchart of the iterative rank-based reconstruction (IRBR) algorithm is shown in Fig. 8. The IRBR method is implemented as a greedy algorithm and consists of four steps, as explained below.

(i) In the first iteration the algorithm computes confidence measures between all possible combinations of two sample sets.

(ii) Sample set combinations are then ranked based on the confidence measure. The sample set combination with the highest confidence measure is combined to reconstruct the first sample set of a new sample set array. The remaining sample sets are then added to this new sample set array in no particular order.

(iii) Next, confidence measures are computed between the sample set reconstructed in the previous iteration and all other remaining sample sets in the current iteration. This step is what defines IRBR as a greedy algorithm, since the reconstruction error minimum computed in the very first iteration determines the path that the following iterations take.

(iv) If the absolute difference between the signals reconstructed at the current and previous iterations is less than a threshold (empirically determined), or if all the sample sets have been combined, the iterations are stopped.
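The four steps above can be sketched as a greedy loop. This is our own illustrative skeleton: `confidence`, `reconstruct` and `change` are hypothetical callables standing in for the pairwise confidence measure of Section 3.2, the RNUS reconstruction of [10], and the stopping criterion of step (iv), respectively.

```python
from itertools import combinations

def irbr(sample_sets, confidence, reconstruct, change, tol):
    """Greedy iterative rank-based reconstruction (steps (i)-(iv))."""
    pool = list(sample_sets)
    # (i)-(ii): rank all pairs and fuse the most confident pair first.
    i, j = max(combinations(range(len(pool)), 2),
               key=lambda ij: confidence(pool[ij[0]], pool[ij[1]]))
    current = reconstruct(pool[i], pool[j])
    pool = [s for k, s in enumerate(pool) if k not in (i, j)]
    # (iii)-(iv): greedily fuse the most confident remaining set; stop early
    # once fusing another set changes the reconstruction by less than tol.
    while pool:
        k = max(range(len(pool)), key=lambda m: confidence(current, pool[m]))
        candidate = reconstruct(current, pool.pop(k))
        if change(candidate, current) < tol:
            break
        current = candidate
    return current
```

As a toy usage, treating sample sets as Python sets with union as "reconstruction" and coverage gain as the confidence shows the greedy ordering and the early stop, without any signal processing.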

4. Performance evaluation of the proposed method

In this section we present the experimental setup and validation of each module of the proposed system. We first evaluate each objective function independently and present a representative result that illustrates why a weighted measure of both objective functions (Ug and Ul) is more suitable than using either objective function independently. We then evaluate the confidence measure (v) with synthetic and real (video) data sets. An evaluation of the iterative ranking algorithm is also presented with synthetic and real (audio) data sets. Lastly, we discuss the computational complexity of the system.

Table 3
Experimental results for independent evaluation of objective functions Ug and Ul.

         Combinations       Ũg       Ũl^(−1)   v        SSE
Case 1   sample1–sample2    1.0      0.0       0.2      47.70
         sample2–sample3    0.161    0.0165    0.0454   318.85
         sample1–sample3    0        1.0       0.8      31.86
Case 2   sample4–sample5    0        1.0       0.2      1469.31
         sample5–sample6    0.8983   0.0066    0.7199   162.33
         sample4–sample6    1.0      0         0.8      148.41

Table 4
Confidence measure v and corresponding reconstruction error for three different synthetic signals.

        Ũg1   Ũl1^(−1)  v1    SSE1   Ũg2   Ũl2^(−1)  v2    SSE2   Ũg3   Ũl3^(−1)  v3    SSE3
Com.1   0.00  1.00      0.00  96.79  0.00  1.00      0.00  19.71  0.00  1.00      0.00  53.28
Com.2   0.30  0.14      0.23  30.38  0.23  0.23      0.16  6.79   0.28  0.31      0.23  48.14
Com.3   0.47  0.21      0.43  12.66  0.44  0.11      0.38  2.99   0.48  0.17      0.44  48.68
Com.4   0.69  0.02      0.66  7.74   0.68  0.05      0.64  1.23   0.70  0.12      0.68  44.41
Com.5   0.87  0.02      0.85  4.44   0.86  0.02      0.84  0.84   0.86  0.06      0.85  46.06
Com.6   1.00  0.00      1.00  3.03   1.00  0.00      1.00  0.51   1.00  0.00      1.00  43.28

4.1. Independent evaluation of Ug and Ul

In order to understand the complementary nature of Ug and Ul, we evaluate each objective function independently. We set up our experiments such that six synthetic sample sets are divided into two experimental cases of three sample sets each. The objective of the experiment is to determine which pair of the three sample sets (i.e. sample1–sample2, sample2–sample3 or sample1–sample3) will result in the minimum reconstruction error when combined. Since the sample sets are generated synthetically, the actual signal is known and the reconstruction error (SSE) can be computed. The SSE is only used to validate the decisions that we take with respect to the sample set combinations.

In the experiments, we compute Ug, Ul^(−1) and v for all possible combinations of sample sets. These values, along with the SSE for the sample set combinations in Case 1 and Case 2, are shown in Table 3, where the values of Ug and Ul^(−1) are normalized within each test case to lie between [0, 1]. If we observe the values of Ug for the three combinations in Case 1, and choose the combination corresponding to the highest Ug as the best combination (sample1–sample2), that decision will be incorrect, as the sample1–sample2 combination does not correspond to the lowest SSE. However, if we were to choose based on the highest value of Ul^(−1), the decision would be correct. Now, consider the results for Case 2. For Case 2, choosing based on Ug results in the correct choice, while Ul^(−1) results in an incorrect answer. Thus, using only Ug or Ul^(−1) as a metric to choose sample set combinations results in unreliable decisions. However, it can be seen that the confidence measure v accurately determines the best combination in both cases.

4.2. Evaluation of confidence measure

We evaluated the proposed confidence measure on both synthetic and real data. Synthetic data was generated as a high resolution random signal band-limited to a user-controlled frequency. This high resolution data was then sampled at a low sampling rate. For example, a 25 Hz band-limited signal was sampled at 2 Hz. Four sample sets at a fixed low sampling rate were generated by initializing the starting point of each sample set randomly with a uniform distribution. Temporal registration was then computed using the methods described in Section 2.2. Finally, we fused each pair of sample sets (6 possible combinations in total), and computed the confidence measure and reconstruction error. The reconstruction algorithm in [34] was used to reconstruct a


Table 5
Confidence measure v and corresponding reconstruction error for real video sequences.

Scene     Sequence    Ũg       Ũl^(−1)  v        SSE
Scene 1   seq1–seq2   0.5072   0.3943   0.4733   1700.0
          seq2–seq3   1.0      1        1        420.0
          seq1–seq3   0        0        0        8120.0
Scene 2   seq1–seq2   0        1        0        20.9
          seq2–seq3   1        0        1        4.0
          seq1–seq3   0.4553   0.2604   0.25     14.7

Fig. 9. (a) Sample frames from real data sequence, (b) sample trajectory from real data sequence.


signal from the fused sample sets. The confidence measure and corresponding reconstruction error for three different synthetic signals are shown in Table 4.

For our real test cases, we used video sequences of an individual swinging a ball tied to the end of a string. The video sequences were captured at 30 frames per second, and the trajectory of the ball was extracted via background subtraction and motion tracking. This trajectory was then used as a high resolution signal which was further down-sampled at low sampling rates, as shown in Fig. 9(a)–(b). An event model [4] was used to compute the temporal registration between the undersampled signals. In each experiment, we arbitrarily chose one sample set as the parent against which the other recurrent sample sets were registered. The two objective functions defined in (11) and (12), and the confidence measure (15), were computed for these sample sets. These values, along with the reconstruction error (SSE) for synthetic and real data, are presented in Tables 4 and 5, respectively. A higher confidence measure indicates that the corresponding recurrent set is a better candidate for reconstruction, as corroborated by the corresponding reconstruction error. It can be seen that the proposed confidence measure is a suitable indicator of the reconstruction error. Further results with MRI data are presented in Section 5.
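The down-sampling of a tracked trajectory into recurrent LR sample sets can be illustrated as below. This is a sketch with our own stand-in numbers: the sine trajectory, the 2 fps LR rate and the frame offsets are illustrative, not the paper's exact experimental values.

```python
import numpy as np

# "High resolution" timeline at 30 fps over 10 s (300 samples).
t = np.arange(0, 10, 1 / 30)
# Stand-in for the tracked ball trajectory (one coordinate over time).
traj = np.sin(2 * np.pi * 0.8 * t)

# Keep every 15th frame -> 2 fps recurrent LR sample sets, each starting
# at a different frame offset (the temporal offset s_n between sets).
step = 15
offsets = [0, 4, 9]
lr_sets = [(t[o::step], traj[o::step]) for o in offsets]
```

Each element of `lr_sets` is one recurrent non-uniform sample set: the same underlying signal, the same rate, but a different temporal offset.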

4.3. Evaluation of the overall system

We evaluated the rank-based reconstruction system on two synthetic high resolution signals with different bandwidths. For each high resolution synthetic signal, 10 sample sets were created by sampling the original signal with random initial points. Thus, each signal had 10 recurrent non-uniform low resolution sample sets. The IRBR algorithm was used to reconstruct the high resolution signal from the 10 low resolution sample sets. We repeated this process 20 times for each synthetic high resolution signal. To the best of our knowledge, no other technique in the literature solves the same problem we consider. Therefore, in our experiments, we compare the IRBR algorithm with the conventional approach in which sample sets are selected randomly (referred to as the RO algorithm hereafter).

In the experiment, the sample sets are temporally aligned and reconstructed based on the proposed IRBR algorithm. Figs. 10(a) and 11(a) show two different synthetic high resolution signals. The normalized reconstruction errors corresponding to the IRBR and RO algorithms are shown in Figs. 10(b) and 11(b). The x-axis and y-axis represent the number of sample sets used for signal reconstruction and the normalized reconstruction error, respectively. It can be seen that the proposed confidence measure and ranking system successfully order the sample sets such that fewer sample sets are needed to reconstruct the same signal, compared to a random ordering of the low resolution sample sets. In Fig. 10(c), the signal reconstructed from 3 low resolution sample sets by the IRBR and RO algorithms is compared to the original signal. It is observed that the proposed IRBR algorithm provides a better reconstruction with the same number of sample sets. Fig. 11(c) shows the original signal and the signals reconstructed from 7 low resolution sample sets by the IRBR and RO algorithms, respectively. Fig. 11(d) shows a zoomed-in plot over the interval [200, 280] of the reconstructed and original signals for better visual comparison. It is clear that the proposed IRBR algorithm indeed helps to choose good sample set combinations that achieve lower reconstruction error.

In our controlled experiments, it is observed that using the proposed IRBR algorithm and the confidence measure, we can use a subset of all the sample sets to reconstruct the original signal.


Fig. 10. Comparison of the IRBR and RO algorithms on synthetic high resolution signal 1. (a) High resolution signal 1, (b) comparison results, and (c) comparison of the reconstructed signals when 3 sample sets are used (note that the signal reconstructed by the IRBR algorithm is close to the original signal).


We also show a promising application of the proposed system to super-resolution magnetic resonance (MR) imaging in Section 5.

4.4. Complexity analysis

In order to derive the worst case complexity of the proposed method, we first define a few terms:

N   number of sample sets or LR video sequences available
M   number of samples (or video frames) in each sample set
P   number of features extracted in each frame
R   resolution factor by which event models are created

The complexity of computing Ug and Ul between two sequences is O(PM) and O(RPM), respectively. The computational complexity of computing the weights can be derived as follows. Given N values of Ug and Ul, computing v is of complexity O(N), a quick sort operation has complexity O(N log N), and weighted linear regression has a worst case complexity of O(N³ log N). These three operations are performed over L iterations (Δwg = 1/(L − 1)) as indicated in Table 1. Computing the minimum of the residual, which is computed as part of the linear regression, is of complexity O(L). Thus the worst case complexity of computing the weights is primarily dependent on the complexity of the linear regression, O(LN³ log N). These weights are computed only once per experiment and do not add a significant overhead to the SR reconstruction process. The weighted addition of the two objective functions is of complexity O(M). In total, the complexity of computing the confidence measure for two sample sets is O(RPM) + O(PM) + O(M) ≈ O(RPM).

The complexity of the IRBR method can be derived as follows. In the first step, the algorithm forms all possible NC2 combinations. It then computes the confidence measure for all these combinations, and performs a quick sort to rank the combinations based on the computed confidence measure. The complexity of computing the confidence measure for NC2 combinations is O(((N² − N)/2)(RPM)) ≈ O(N²RPM). The quick sort operation is of complexity O(n log n), where n = (N² − N)/2, i.e. O(N² log N²). The overall complexity of the first iteration is therefore O(N² log N²) + O(N²RPM). In the second iteration, the algorithm only compares n = (N − 2) combinations, hence its complexity is O(n log n) + O(nRPM), i.e. O(N log N) + O(NRPM). Every subsequent iteration reduces the number of combinations by 1. Thus the overall worst case complexity of the algorithm is determined by the first iteration: O(N² log N²) + O(N²RPM).

5. Application to super-resolution MR imaging

In this section, we present the application of this work to super-resolution magnetic resonance (MR) imaging. We target the specific problem of diagnosing swallowing disorders (dysphagia) via MR imaging. As events such as swallows occur at high speed compared to the temporal resolution of MRI scanners, we are able to sample only a few representative frames of the swallow event in a single acquisition. However, multiple swallows can be captured, thereby generating multiple LR videos that can be fused to enhance the spatial as well as the temporal resolution of the acquired data. We validate the proposed algorithms by showing that LR MR video combinations with a high confidence measure yield better SR reconstruction in terms of improvement in both spatial and temporal resolution.


Fig. 11. Comparison of IRBR algorithm and RO algorithm on synthetic high resolution signal 2.


5.1. Acquisition

MRI data is acquired via radial acquisition as a series of radial projections through the center of k-space (an animated video of radial acquisition is available online at [35]). For short scan times, radial acquisition results in higher spatial resolution of undersampled data compared to conventional Cartesian acquisition, which suffers from aliasing and streaking artifacts. The MRI scan was conducted at the University of Alberta at the Centre for the NMR Evaluation of Human Function and Disease. All image data was acquired with subjects lying supine in a Siemens Sonata™ 1.5 T MRI scanner. Measured amounts of water (bolus) were delivered to the subject via a system of tubing, and the swallow event was captured in the mid-sagittal plane. As the current work deals with a prototype system, we captured only three repetitions of the swallow (available as video1, video2 and video3 at [35]). The data was acquired as 96 radial projections of 192 points and reconstructed to an image size of 384 × 384. The acquisition time for each image with the


Fig. 12. (a) Illustrative frame from LR video2, (b) closest corresponding frame in LR video3, (c) intermediate frame reconstructed using (a) and (b).

Table 6
SNR values for 4 ROIs in LR and SR video sequences, and confidence measures.

Sequence    SNR                                      Confidence measure
            ROI 1    ROI 2    ROI 3    ROI 4
vid1        7.4419   17.445   19.576   27.329
vid2        6.6593   14.794   15.642   36.559
vid3        6.1011   15.472   16.683   29.99
vid1–vid2   10.672   24.478   39.766   58.35         0
vid2–vid3   20.257   95.31    97.632   154.44        1
vid1–vid3   13.645   27.96    39.785   69.543        0.046


above configuration is 0.138 s, which corresponds to a frame rate of 7.2 fps. A few representative frames of the LR MRI sequences are shown in Fig. 12(a) and (b).

5.2. Processing

The first few frames of the LR MRI videos are used to spatially register the video frames to each other. This step ensures that even small movements of the subject in the MRI scanner are compensated for. Next, the progression of the bolus (water) down the oral-pharyngeal tract is segmented using standard background subtraction techniques. Centroid coordinates are computed from this moving blob region to generate feature trajectories for all three LR MRI videos. The multiple centroid trajectories can be considered LR sample sets acquired from the same continuous event – the swallow. These centroid trajectories are used to compute the temporal registration and the confidence measure between the LR videos. The confidence measures computed between the three LR sequences are presented in Table 6.
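The centroid-trajectory extraction above can be sketched with simple background subtraction. This is a minimal stand-in for the segmentation step, not the authors' implementation: the function name, the thresholding scheme and the NaN convention for empty masks are our own.

```python
import numpy as np

def centroid_trajectory(frames, background, thresh=20):
    """Per-frame centroid of the moving (bolus) region obtained by
    thresholding the difference to a background frame."""
    traj = []
    for frame in frames:
        # Foreground mask: pixels that differ from the background.
        mask = np.abs(frame.astype(float) - background.astype(float)) > thresh
        ys, xs = np.nonzero(mask)
        # Centroid (x, y) of the foreground blob; NaN if nothing moved.
        traj.append((xs.mean(), ys.mean()) if xs.size else (np.nan, np.nan))
    return np.array(traj)
```

Applied to each LR MRI video, the returned (x, y) sequence plays the role of the feature trajectory used for temporal registration.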

5.3. Reconstruction

Subsequent to computing the confidence measure, reconstruction of a higher resolution MRI video is performed in the frequency domain. Even and odd undersampled projection lines from corresponding frames of the registered LR videos are combined to form a higher resolution radially sampled dataset in frequency space. The SR reconstruction method described in Section 2.3 is used to reconstruct the SR videos.
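The interleaving step can be illustrated on arrays shaped like the acquisition (96 projections of 192 k-space points per frame). This is only a sketch of the data layout: the random complex data and the even/odd angular indexing convention are our assumptions, and the actual angular ordering of the combined projections depends on the scanner's acquisition scheme.

```python
import numpy as np

# Two corresponding, temporally registered LR frames, each acquired as
# 96 radial projections of 192 complex k-space points (synthetic data).
rng = np.random.default_rng(0)
frame_a = rng.standard_normal((96, 192)) + 1j * rng.standard_normal((96, 192))
frame_b = rng.standard_normal((96, 192)) + 1j * rng.standard_normal((96, 192))

# Interleave: even-indexed projections from one frame, odd-indexed from the
# other, giving a denser 192-projection radial dataset for SR reconstruction.
hr_kspace = np.empty((192, 192), dtype=complex)
hr_kspace[0::2] = frame_a
hr_kspace[1::2] = frame_b
```

The combined `hr_kspace` would then be passed to the gridding/reconstruction stage of Section 2.3.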

5.4. Results

Super-resolution reconstruction of LR MR video sequences results in improvement in both the spatial and the temporal resolution of the data. However, some combinations of LR input videos result in better reconstruction than others. This can be validated using the confidence measure, as well as quantitatively measured by computing signal-to-noise ratios (SNR). The SNR is computed as follows. For each video sequence, two consecutive frames (with no bolus motion) are used to compute a difference image. A region of interest (ROI), corresponding to homogeneous tissues, is manually chosen in one of the frames and the mean pixel intensity μ is computed. For the same ROI in the difference image, the standard deviation σ of the pixel intensities is computed. The SNR of the video sequence is then measured as:

SNR = (√2 · μ) / σ.    (16)

Fig. 13. ROIs used to compute SNR values in Table 6.
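The SNR computation of (16) can be sketched as below; the function name and the ROI-as-slices convention are our own.

```python
import numpy as np

def roi_snr(frame1, frame2, roi):
    """SNR of (16): sqrt(2) * mean(ROI in frame1) / std(ROI in difference image).

    frame1, frame2 -- two consecutive frames with no bolus motion;
    roi -- (row slice, column slice) over homogeneous tissue.
    """
    diff = frame1[roi].astype(float) - frame2[roi].astype(float)
    mu = frame1[roi].astype(float).mean()   # mean intensity in the ROI
    sigma = diff.std()                      # noise std from the difference image
    return np.sqrt(2) * mu / sigma
```

The √2 factor accounts for the noise variance doubling when two independently noisy frames are subtracted.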

This method of computing SNR in MR images is commonly used when image homogeneity is poor [36]. The improvement in spatial resolution after SR reconstruction can be seen in Table 6, where the SNR values computed for four different ROIs in the LR and SR sequences are presented. These ROIs are highlighted in Fig. 13. From these SNR values it can be seen that, while SR reconstruction improves the signal-to-noise ratio of all the video combinations, the vid2–vid3 combination has the highest SNR for all four ROIs, which agrees with the computed confidence measure. Visually, the improvement in the spatial resolution of the data can be seen in Fig. 12(c), where a reconstructed SR frame is presented. The LR video frames that contributed to the SR frame are shown in Fig. 12(a) and (b). It can be seen that the SR frame has much less noise compared to either of the LR frames.


Fig. 14. Representative frames of SR MRI videos: (a) vid1–vid2, v = 27.64, zoomed position of the tongue shows incorrect registration; (b) vid2–vid3, v = 38.7, zoomed position of the tongue shows correct registration.

Fig. 15. Representative frames of SR MRI videos: (a) vid1–vid2, (b) vid2–vid3 and (c) vid1–vid3. The position of the epiglottis has been highlighted with arrows.


The improvement in the temporal resolution of the data is demonstrated by the reduction in motion blurring in the SR video sequences, which can be viewed online at [35]. Fig. 14 shows two SR frames from the sequence combinations vid1–vid2 and vid2–vid3. A zoomed section of the tongue is also shown in order to highlight the two visibly distinct positions of the tip of the tongue in vid1–vid2, which are caused by poor temporal registration. The zoomed section in Fig. 14(b) shows that this spatial distinction is less visible for the sequence combination vid2–vid3, which also has a higher confidence measure. Another illustrative result is shown in Fig. 15 (zoomed sections shown in Fig. 16), where, after the oesophageal stage, the first frame in which the epiglottis becomes visible is shown. The position of the epiglottis has been highlighted in each frame with an arrow. It can be seen that for the vid2–vid3 combination the spatial detail of the epiglottis is the

Fig. 16. Zoomed in sections of SR MRI frames shown in Fig. 15: (a) vid1–vid2, (b)vid2–vid3 and (c) vid1–vid3. The position of the epiglottis has been highlightedwith arrows.

clearest, while for the other video combinations two distinct positions of the epiglottis are visible. Thus, fusing vid2–vid3 results in better registration and reconstruction as compared to vid1–vid2 or vid1–vid3. From the confidence measures listed in Table 6 it can be seen that the confidence measure for the vid2–vid3 combination is indeed the highest, thus corroborating the subjective evaluation of the reconstruction.

6. Conclusions and future work

In this paper we presented a confidence measure based strategy that allows us to choose recurrent non-uniform sample sets such that the overall signal reconstruction error is minimized. The confidence measure was developed as a linear weighted sum of two objective functions based on two precepts: (i) sample sets that are placed farther apart from each other result in better reconstruction, and the proposed objective function Ug is a suitable estimate of this relation; and (ii) the proposed objective function Ul can be used to determine the reliability of the computed temporal registration. We independently evaluated the objective functions to highlight their complementary nature. We also presented a method to determine the optimal weights for the linear weighted sum of the objective functions. An iterative ranking system was also proposed that updates the rank assigned to sample sets and fuses two sample sets to optimize reconstruction. Such a ranking system based on the confidence measure is shown to outperform a random ordering of the sample sets, which would



otherwise have been used when no prior information about the sample-set order is known. We demonstrated the applications of this work in three areas, namely, super-resolution MR imaging, enhanced video reconstruction and enhanced audio generation. In future work we plan to extend the confidence measure to 3-D and 4-D super-resolution MR imaging techniques.

References

[1] R. Hess, A. Fern, Improved video registration using non-distinctive local imagefeatures, IEEE Conf. Comput. Vis. Pattern Recognit. (2007) 1–8, doi:10.1109/CVPR.2007.382989.

[2] L. Lee, R. Romano, G. Stein, Monitoring activities from multiple video streams:establishing a common coordinate frame, IEEE Trans. Pattern Anal. Mach.Intell. 22 (8) (2000) 758–767, doi:10.1109/34.868678.

[3] N. Grammalidis, D. Beletsiotis, M. Strintzis, Sprite generation and coding in multiview image sequences, IEEE Trans. Circ. Syst. Video Technol. 10 (2) (2000) 302–311, doi:10.1109/76.825729.

[4] M. Singh, A. Basu, M. Mandal, Event dynamics based temporal registration, IEEE Trans. Multimedia 9 (5) (2007) 1004–1015, doi:10.1109/TMM.2007.898937.

[5] R. Thompson, E. McVeigh, High temporal resolution phase contrast MRI with multiple echo acquisitions, Magn. Reson. Med. 47 (2002) 499–512.

[6] J. Listgarten, R. Neal, S. Roweis, A. Emili, Multiple alignment of continuous time series, Adv. Neural Inf. Process. Syst. 17 (2005) 817–824.

[7] M. Irani, S. Peleg, Image sequence enhancement using multiple motions analysis, IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. (1992) 216–221, doi:10.1109/CVPR.1992.223272.

[8] Y. Caspi, M. Irani, Spatio-temporal alignment of sequences, IEEE Trans. Pattern Anal. Mach. Intell. 24 (11) (2002) 1409–1424, doi:10.1109/TPAMI.2002.1046148.

[9] T. Tuytelaars, L.V. Gool, Matching widely separated views based on affine invariant regions, Int. J. Comput. Vis. 59 (1) (2004) 61–85. <http://dx.doi.org/10.1023/B:VISI.0000020671.28016.e8>.

[10] F. Marvasti (Ed.), Nonuniform Sampling: Theory and Practice, Kluwer Academic, 2001.

[11] E. Mortensen, W. Barrett, A confidence measure for boundary detection and object selection, IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. 1 (2001) I-477–I-484, doi:10.1109/CVPR.2001.990513.

[12] N. Moreau, D. Jouvet, Use of a confidence measure based on frame level likelihood ratios for the rejection of incorrect data, in: Sixth European Conference on Speech Communication and Technology, 1999.

[13] M. Hemmendorff, M. Andersson, H. Knutsson, Phase-based image motion estimation and registration, IEEE Int. Conf. Acoustics Speech and Signal Processing 6 (1999) 3345–3348, doi:10.1109/ICASSP.1999.757558.

[14] J. Magarey, N. Kingsbury, Motion estimation using complex wavelets, IEEE Int. Conf. Acoustics Speech Signal Process. (1996) 2371–2374.

[15] E. Izquierdo, Stereo matching for enhanced telepresence in three-dimensional videocommunications, IEEE Trans. Circ. Syst. Video Technol. 7 (4) (1997) 629–643, doi:10.1109/76.611174.

[16] R. Wang, H.-J. Zhang, Y.-Q. Zhang, A confidence measure based moving object extraction system built for compressed domain, IEEE Int. Symp. Circ. Syst. 5 (2000) 21–24, doi:10.1109/ISCAS.2000.857353.

[17] M. Singh, M. Mandal, A. Basu, A confidence measure and iterative rank-based method for temporal registration, IEEE Int. Conf. Acoustics Speech Signal Process. (2008) 1289–1292, doi:10.1109/ICASSP.2008.4517853.

[18] M. Singh, M.K. Mandal, A. Basu, Confidence measure for temporal registration of recurrent non-uniform samples, Int. Conf. Pattern Recognit. Mach. Intell. (2007) 608–615.

[19] T. Strohmer, J. Tanner, Fast reconstruction algorithms for periodic nonuniform sampling with applications to time-interleaved ADCs, IEEE Int. Conf. Acoustics Speech Signal Process. 3 (2007) III-881–III-884, doi:10.1109/ICASSP.2007.366821.

[20] W. Press, S. Teukolsky, W. Vetterling, B. Flannery, Numerical Recipes in C, second ed., Cambridge University Press, Cambridge, UK, 1992.

[21] C. Harris, M. Stephens, A combined corner and edge detector, in: Alvey Vision Conference, 1988, pp. 147–152.

[22] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, L.V. Gool, A comparison of affine region detectors, Int. J. Comput. Vis. 65 (1–2) (2005) 43–72. <http://dx.doi.org/10.1007/s11263-005-3848-x>.

[23] T. Lindeberg, Detecting salient blob-like image structures and their scales with a scale-space primal sketch: a method for focus-of-attention, Int. J. Comput. Vis. 11 (3) (1993) 283–318.

[24] B. Lucas, T. Kanade, An iterative image registration technique with an application to stereo vision, in: IJCAI'81, 1981, pp. 674–679.

[25] G. Welch, G. Bishop, An Introduction to the Kalman Filter, Technical Report, Chapel Hill, NC, USA, 1995.

[26] M. Fischler, R. Bolles, Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Commun. ACM 24 (6) (1981) 381–395.

[27] A. Patti, M. Sezan, A.M. Tekalp, Superresolution video reconstruction with arbitrary sampling lattices and nonzero aperture time, IEEE Trans. Image Process. 6 (8) (1997) 1064–1076, doi:10.1109/83.605404.

[28] H. Stark, P. Oskoui, High-resolution image recovery from image-plane arrays using convex projections, J. Opt. Soc. Am. A 6 (1989) 1715–1726.

[29] R. Schultz, R. Stevenson, Extraction of high-resolution frames from video sequences, IEEE Trans. Image Process. 5 (6) (1996) 996–1011, doi:10.1109/83.503915.

[30] D. Capel, A. Zisserman, Computer vision applied to super resolution, IEEE Signal Process. Mag. 20 (3) (2003) 75–86, doi:10.1109/MSP.2003.1203211.

[31] M. Singh, R. Thompson, A. Basu, J. Rieger, M. Mandal, Image based temporal registration of MRI data for medical visualization, IEEE Int. Conf. Image Process. (2006) 1169–1172, doi:10.1109/ICIP.2006.312765.

[32] J. Jackson, C. Meyer, D. Nishimura, A. Macovski, Selection of a convolution function for Fourier inversion using gridding (computerised tomography application), IEEE Trans. Med. Imaging 10 (3) (1991) 473–478, doi:10.1109/42.97598.

[33] G. Marsaglia, A. Zaman, A new class of random number generators, Ann. Appl. Probab. 3 (1991) 462–480.

[34] H.G. Feichtinger, T. Werther, Improved locality for irregular sampling algorithms, IEEE Int. Conf. Acoustics Speech Signal Process. (2000) 3834–3837.

[35] Authors. <www.ece.ualberta.ca/~meghna/J2008.html>.

[36] O. Dietrich, J. Raya, S. Reeder, M. Reiser, S. Schoenberg, Measurement of signal-to-noise ratios in MR images: influence of multi-channel coils, parallel imaging and reconstruction filters, Magn. Reson. Imaging 26 (2) (2007) 375–385.