Motion adaptive scan rate up-conversion

Multidimensional Systems and Signal Processing, 3, 113-130 (1992) © 1992 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.

Motion Adaptive Scan Rate Up-conversion

PETRI HAAVISTO AND YRJÖ NEUVO Signal Processing Laboratory, Tampere University of Technology, P.O. Box 553, SF-33101 Tampere, Finland

JANNE JUHOLA Nokia Research Center, Kanslerinkatu 8, SF-33720 Tampere, Finland

Received September £ 1990, Revised April 22, 1991 Invited Paper

Abstract. The interlaced scan format and the low frame rate in current television systems cause visible degradation in picture quality. To improve the picture, scan rate up-conversion can be implemented in the receiver. Typically, the up-conversion algorithms needed in stationary scenes are different from those needed in nonstationary scenes. In this paper we discuss problems related to scan rate up-conversion and motion detection. We present an algorithm that adapts to the motion in the picture and yet effectively eliminates most of the artifacts caused by imperfect motion detection. The algorithm is based on a weighted median filter structure and a simple motion detector. The weights of the median filter are adapted according to the motion detector output. All algorithms discussed have been tested with real sequences using a video sequencer.

Key Words: Television, scan rate conversion, median filtering, weighted median, motion detection

1. Introduction

The European television standards and also the future HDTV transmission standard employ an interlaced scan format and a field rate of 50 Hz. These properties give rise to visible artifacts in the TV picture. The low field rate causes large area flicker, which is very disturbing on large and bright picture areas. In this respect the 60 Hz field rate is considerably better. However, as the size of the picture increases, these artifacts become quite visible in the 60 Hz standard, too. Line flicker arises due to the interlaced display format and occurs at the 25-Hz frame rate in the areas where there are high vertical frequencies. Another disturbance caused by the interlaced scan format is line crawling, which is sometimes visible in vertically moving objects. The above-mentioned problems are widely recognized. For a more detailed discussion see, e.g., [1], [2].

The problems themselves suggest two obvious solutions to reduce the flickering: the field rate must be increased, or the interlaced scan format must be converted to progressive. Bandwidth restrictions do not allow a progressive format or a higher field rate to be used in TV transmissions, leaving the elimination of the display flicker for the receiver. When the received interlaced signal is converted to progressive (sequential) format, the artifacts caused by the interlaced scan can be removed. Subjective tests have shown considerable improvements, by a factor of 1.5, for scenes with little or no movement [3]. In these scenes the temporal information can be utilized in the conversion, and the perceived vertical resolution of the picture is improved. One case where interlaced to progressive conversion will be needed increasingly often is when the TV screen is used for viewing text.

Large area flicker, however, is not affected by the interlaced to progressive conversion. The screen is still refreshed only 50 times per second. When the field rate of a 50-Hz television system is doubled to 100 Hz, the artifacts can be translated to high enough frequencies where they are no longer visible. This procedure eliminates the large area flicker. The importance of removing large area flicker will further increase when large screens are used for viewing HDTV transmissions. It should be stressed that the problems of interlaced scan in a 50-Hz TV system can be significantly reduced by field rate doubling. An extreme example of line flicker can be seen when a test picture consisting of 1-pixel-wide horizontal black and white lines following each other is created. In an interlaced system this means that one of the two fields is white and the other is black. When these fields are displayed using a 50-Hz field rate, the line structure is invisible and all the viewer can see is vigorous flickering. If the display format is changed to a 50-Hz progressive or a 100-Hz interlaced format, the line structure becomes visible.
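The extreme line flicker example can be reproduced with a few lines of code. The following is only an illustrative sketch: the picture size, the luminance values, and the helper name `split_fields` are assumptions made here and do not come from the experiments reported in this paper.

```python
def split_fields(frame):
    """Split a frame (a list of rows) into its odd and even fields.

    Lines are numbered from 1, so the odd field holds lines 1, 3, 5, ...
    """
    odd_field = frame[0::2]   # lines 1, 3, 5, ...
    even_field = frame[1::2]  # lines 2, 4, 6, ...
    return odd_field, even_field


WHITE, BLACK = 255, 0     # assumed 8-bit luminance extremes
height, width = 8, 4      # assumed (tiny) picture size

# Test picture of 1-pixel-wide horizontal lines: odd lines white, even lines black.
frame = [[WHITE if line % 2 == 1 else BLACK] * width
         for line in range(1, height + 1)]

odd_field, even_field = split_fields(frame)
odd_mean = sum(map(sum, odd_field)) / (len(odd_field) * width)
even_mean = sum(map(sum, even_field)) / (len(even_field) * width)
print(odd_mean, even_mean)
```

One field is entirely white and the other entirely black, so an interlaced 50-Hz display alternates between a bright and a dark screen every field period instead of showing the line structure.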

Scan rate up-conversion is a special problem of interpolation. The special circumstances arise from the differences between static and moving scenes and from the fact that the algorithms have to operate at video rates. From the implementation point of view they must not require too much memory. This limits the amount of temporal information available. In interlaced to progressive conversion the task is to interpolate the missing lines in every received field, whereas when the field rate is doubled the problem is to create a new field for every received field. The difference between the two is that in the latter it is possible to interpolate different motion phases between the existing fields. These motion-compensated algorithms, however, require quite complex processing, which is not typically considered feasible in a commercial TV receiver. The two problems become equivalent if computation of motion vectors and the use of motion compensation are considered too complex to be implemented. The algorithm we propose in this paper is used to double the scan rate of the received picture sequence. The resulting up-converted sequence can be displayed using either a progressive scan format or a higher field rate.

We first review some simple up-conversion algorithms for scan rate doubling. The problems involved are brought up. In Section 3 we present an interpolation filter based on the weighted median operation [4] and discuss its properties. The effects of varying the filter weights are shown, and the use of the filter for scan rate doubling is described. Section 4 presents a novel idea of adjusting the weights of a median filter according to a motion detector output. When simple motion detectors based on the absolute frame difference and a threshold value are used, it is common that new artifacts appear in the picture. With the adaptive weighted median filter we have been able to almost completely remove the disturbances caused by erroneous motion detection.

2. Scan rate up-conversion algorithms

Let us first define the notation that will be used. The received frames are numbered consecutively, and the frame is referred to by its number. The two fields in the frame $t$ are denoted as $t_o$ and $t_e$, where the subscripts $o$ and $e$ stand for odd and even. Note that we will talk about the fields of the frame also in the case of progressive scan. Furthermore, $l$ and $c$ will be used to denote the line and the column of the pixel position in the frame. The odd and even values of $l$ are in the odd and even fields of the frame, respectively. The image sequence can thus be represented as a function of three variables: $x = x(t, l, c)$.

The received sequence of fields is

$$\{\ldots, (t-1)_o, (t-1)_e, t_o, t_e, (t+1)_o, (t+1)_e, \ldots\}.$$

When the scan rate is doubled, two new fields are created between the fields $t_o$ and $t_e$. These are denoted as $t_e^*$ and $t_o^*$. The up-converted sequence is

$$\{\ldots, (t-1)_e, t_o, t_e^*, t_o^*, t_e, (t+1)_o, \ldots\}.$$

With the above notation, the up-conversion algorithms can be presented by giving an algorithm for calculating the fields $t_e^*$ and $t_o^*$. The sequences involved are shown in Figure 1. Now it is easy to see from Figure 1 that if the up-converted sequence is displayed using an interlaced format, the time axis is distorted and some motion artifacts occur if motion compensation is not used. These inevitable artifacts can be seen in scenes with uniform motion at constant velocity. If the sequence is displayed using a progressive scan format, the time axis is not distorted and there are no such motion artifacts.

The most obvious way to double the field rate is to use repetition of existing information. Field repetition,

$$t_e^* = t_o, \qquad t_o^* = t_e, \tag{1}$$

is easy to implement and has been adopted by several TV manufacturers in commercial 100-Hz receivers. The algorithm has a reasonably good performance with moving scenes, but since temporal information is not used the loss of resolution in static areas is inevitable. Also, the algorithm does not reduce line flicker or line crawl. Frame repetition,

$$t_e^* = t_e, \qquad t_o^* = (t+1)_o, \tag{2}$$

is optimal for stationary scenes but fails badly when there is motion in the picture. This is because the temporal order of the displayed fields is different from that of the received fields. In moving scenes it is necessary to use spatial information. Obviously, simple repetition alone is not satisfactory for scan rate doubling.

Figure 1. Principle of up-conversion. The upper four fields are from the received sequence, and the lower fields are from the up-converted sequence. The fields marked by a * are created by interpolation.

Line averaging is a linear interpolation algorithm that produces good results with moving scenes. This algorithm can be described by the equations

$$
\begin{aligned}
x(t_e^*, l, c) &= \tfrac{1}{2}\bigl(x(t_o, l-1, c) + x(t_o, l+1, c)\bigr), \\
x(t_o^*, l, c) &= \tfrac{1}{2}\bigl(x(t_e, l-1, c) + x(t_e, l+1, c)\bigr).
\end{aligned}
\tag{3}
$$

The missing pixels are created by averaging the 2 pixels immediately above and below the missing pixel position. Line averaging is a purely spatial operation and results in a loss of resolution with nonmoving scenes. It, too, has the advantage of simplicity.
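As a minimal sketch of (3), the following function interpolates the lines lying between consecutive lines of one field. The function name, the list-of-rows representation, and the decision to leave the picture border uninterpolated are assumptions made for this illustration.

```python
def line_average_field(field):
    """Interpolate the missing lines of one field by line averaging, as in (3).

    `field` is a list of rows holding the existing lines of the field.
    The result holds one interpolated row between each pair of consecutive
    existing rows; lines at the picture border are not handled here.
    """
    missing = []
    for above, below in zip(field, field[1:]):
        # Each missing pixel is the mean of the pixels directly above and below.
        missing.append([(a + b) / 2 for a, b in zip(above, below)])
    return missing


field = [[10, 10],
         [30, 50],
         [30, 50]]
interpolated = line_average_field(field)
print(interpolated)
```

Because only the current field is consulted, detail that exists solely in the other field of a static picture cannot be recovered, which is the loss of resolution noted above.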

Median filters have been proposed for scan rate up-conversions [5] because of their edge preserving properties [6]. In the three-point vertical median filter pixels from three spatially consecutive lines are compared in amplitude, and the second largest value (the median value) is taken to be the interpolated pixel:

$$
\begin{aligned}
x(t_e^*, l, c) &= \operatorname{med}[x(t_o, l-1, c),\ x(t_o, l+1, c),\ x(t_e, l, c)], \\
x(t_o^*, l, c) &= \operatorname{med}[x(t_e, l-1, c),\ x(t_e, l+1, c),\ x((t+1)_o, l, c)].
\end{aligned}
\tag{4}
$$

Here, $\operatorname{med}[\,\cdot\,]$ denotes the median operation. This filter utilizes both temporal and spatial information, and the interpolation result is clearly better than with any of the earlier mentioned methods. The median filter inherently adapts to motion so that in moving scenes the filter output typically comes from above or below the current pixel position (spatial information), and in stationary scenes the output is usually the pixel in the same location at a different time (temporal information). The three-point median filter has already been used in prototype IDTV receivers [7].
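The inherent motion adaptation of the three-point vertical median can be seen from a single-pixel sketch of (4). The function name and the luminance values below are hypothetical and chosen only to illustrate the two typical cases.

```python
def vertical_median(above, below, temporal):
    """Three-point vertical median of (4) for a single missing pixel.

    above, below: pixels on lines l-1 and l+1 of the current field.
    temporal:     pixel at the same position (l, c) in the adjacent field.
    """
    return sorted((above, below, temporal))[1]


# Stationary area: the temporal pixel fits between its vertical neighbours,
# so the temporal information is selected.
static_out = vertical_median(100, 120, 110)

# Moving area: the temporal pixel belongs to different picture content,
# so one of the spatial neighbours is selected instead.
moving_out = vertical_median(100, 120, 20)
print(static_out, moving_out)
```

The selection is made pixel by pixel without any explicit motion detector, which is why the filter is said to adapt to motion inherently.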

The problem with the three-point median filter is that it is not able to preserve diagonal narrow lines, and, more importantly, it causes disturbing serration effects on moving diagonal edges. These serration artifacts are clearly visible in real sequences. The magnified part of one picture of the sequence "car" illustrates this problem in Figure 2(a). The original interlaced sequence was converted to progressive format using three-point vertical median filtering. The diagonal edge formed by the windscreen of the car appears staircased as if it were made of blocks bigger than 1 pixel.

To improve the vertical median filter, Doyle introduced the concept of direction dependent processing in [5] and further developed it in [8]. The basic idea of the direction dependent median filter is that it operates in a 3 × 3 window and is based on a correlation measure where the absolute difference is calculated over the center point vertically and in the two diagonal directions. The three-point median is computed in the direction where the absolute difference is the smallest. The actual algorithm proposed in [8] is a combination of averaging and median filtering. The computation of the correlation measure involves low-pass filtering and special handling in cases where clear correlation is not found. This procedure effectively eliminates the serration effects.

From the above discussion some conclusions can be drawn. An interpolation algorithm should be able to use both spatial and temporal information. Also, diagonal information should be included to prevent serration effects. It has become clear that some algorithms perform better in stationary areas, others in nonstationary areas. This suggests that a good interpolation algorithm should be able to adapt according to motion information if it is available.

Figure 2(a). Serration effects in moving diagonal edges caused by vertical three-point median filtering. Extracted from the up-converted "car" sequence (see Figure 4(b)). Figure 2(b). The same picture when the proposed filter was used for interpolation. The pixel size appears smaller than in Figure 2(a).

3. Weighted median filter for scan rate conversion

One restriction of median filters is that the samples in the filter window are always given the same importance. The weighted median filter was suggested in [9] as an extension to the standard median filter. The output $y(n)$ of the weighted median filter of length $N = 2k + 1$ associated with integer weights $w_1, \ldots, w_N$ and applied to sequence $x(n)$ is

$$
y(n) = \operatorname{med}[\underbrace{x(n-k), \ldots, x(n-k)}_{w_1\ \text{times}},\ \underbrace{x(n-k+1), \ldots, x(n-k+1)}_{w_2\ \text{times}},\ \ldots,\ \underbrace{x(n+k), \ldots, x(n+k)}_{w_N\ \text{times}}],
\tag{5}
$$

where $n$ is the time index. In this paper we will use the symbol $\diamond$ to denote replication, i.e., the weighting operation:

$$n \diamond x = \underbrace{x, \ldots, x}_{n\ \text{times}}. \tag{6}$$

With this notation, the previous equation becomes

$$y(n) = \operatorname{med}[w_1 \diamond x(n-k),\ w_2 \diamond x(n-k+1),\ \ldots,\ w_N \diamond x(n+k)]. \tag{7}$$

The weighted median filter is an analogous extension to the median filter as the weighted average, or the FIR filter, is to the averaging filter [10]. The weighted median filter has been analyzed in [11], [12], [10].
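The replication definition in (5) to (7) translates directly into code. The sketch below is a literal, unoptimized rendering for one window position; the function name and the error handling are assumptions of this illustration, and practical implementations would avoid building the replicated list.

```python
def weighted_median(samples, weights):
    """Weighted median by replication, following (5)-(7).

    samples: the window values x(n-k), ..., x(n+k).
    weights: non-negative integer weights w_1, ..., w_N.
    """
    if len(samples) != len(weights):
        raise ValueError("one weight is needed per sample")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative integers")
    expanded = []
    for value, weight in zip(samples, weights):
        expanded.extend([value] * weight)   # the replication operation of (6)
    if len(expanded) % 2 == 0:
        raise ValueError("the total weight must be odd for a unique median")
    expanded.sort()
    return expanded[len(expanded) // 2]


plain = weighted_median([1, 5, 9], [1, 1, 1])    # standard median
biased = weighted_median([1, 5, 9], [3, 1, 1])   # first sample dominates
print(plain, biased)
```

With unit weights the function reduces to the standard median, while giving one sample a weight that exceeds half of the total weight forces that sample to the output.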

In [4] a weighted median filter structure was proposed for scan rate up-conversion. The filter operates in a 7-pixel window shown in Figure 3. Six of the pixels come from the current field (spatial information), and the center pixel is taken from the previous, or the following, field (temporal information). For the window in Figure 3, the filter output y is given by the following equation:

$$
\begin{aligned}
y = \operatorname{med}[\,& x(t_o, l-1, c-1),\ x(t_o, l-1, c),\ x(t_o, l-1, c+1), \\
& x(t_o, l+1, c-1),\ x(t_o, l+1, c),\ x(t_o, l+1, c+1), \\
& \alpha \diamond x(t_e, l, c),\ \beta \diamond \tfrac{1}{2}\bigl(x(t_o, l-1, c) + x(t_o, l+1, c)\bigr)\,].
\end{aligned}
\tag{8}
$$

The weights of the pixels in the previous and the following lines, i.e., the lines $l - 1$ and $l + 1$, are fixed to the value 1, and the weight of the center pixel is $\alpha$ and can be adjusted to change the filter properties. Increasing $\alpha$ will increase the amount of temporal information in the interpolation process and adjust the filter to stationary scenes. The other weighting coefficient $\beta$ controls the amount of spatial information; therefore, increasing $\beta$ will adjust the filter for nonstationary scenes. Note that the value controlled by $\beta$ is the average of the 2 pixels immediately above and below the current pixel and gives an estimate of $y$ based on spatial information only. If $\beta$ is made infinite, the filter will operate as a line averaging filter given by (3). Filters combining linear substructures with the median operation, FIR-median hybrid filters, are studied in [13]. Due to use of the diagonal points, the filter does not suffer from serration effects like the vertical three-point median filter.

Figure 3. The weighted median filter window. The pixel in the middle is taken from an adjacent field from the same position. The others are from the current field. Window layout: line $l-1$ (field $t_o$): $x(t_o, l-1, c-1)$, $x(t_o, l-1, c)$, $x(t_o, l-1, c+1)$; line $l$ (field $t_e$): $x(t_e, l, c)$; line $l+1$ (field $t_o$): $x(t_o, l+1, c-1)$, $x(t_o, l+1, c)$, $x(t_o, l+1, c+1)$.

Even though the number of different weight combinations is infinite, many of the weighted median filters produce exactly the same output [11]. The exact number of weight combinations producing different filtering results is not generally known, but for simple cases it can be found easily. Table 1 shows all 13 different combinations for the weights $\alpha$ and $\beta$ in (8). Only combinations where $\alpha + \beta$ is odd are considered, so the median is unique. Any combination of integer weights that is not in the table can be reduced to one of those given in the table. For example, the filter with weights $\alpha = 7$ and $\beta = 6$ is identical to the filter with $\alpha = 3$ and $\beta = 2$.

Note also that the weight combination $\alpha = 7$ and $\beta = 0$ always produces the center point as the output value, whereas the combination $\alpha = 0$ and $\beta = 5$ converts the filter into the line averaging filter. The weight combinations $\alpha = 0$, $\beta = 1$ and $\beta = 0$, $\alpha = 1$ can be useful in some applications. The latter in fact is identical to the seven-point median filter discussed in [14]. These structures, however, were not included in this study.
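The limiting cases mentioned above can be checked with a single-pixel sketch of (8). The function name `wm_interpolate`, the tuple-based window representation, and the sample luminance values are assumptions introduced for this illustration only.

```python
def wm_interpolate(upper, lower, center, alpha, beta):
    """Weighted median of (8) for one missing pixel.

    upper, lower: three pixels each from lines l-1 and l+1 of the current
                  field (columns c-1, c, c+1), all with weight 1.
    center:       pixel at (l, c) taken from the adjacent field (weight alpha).
    alpha, beta:  non-negative integer weights with alpha + beta odd; beta
                  weights the average of the pixels directly above and below.
    """
    if (alpha + beta) % 2 == 0:
        raise ValueError("alpha + beta must be odd for a unique median")
    average = (upper[1] + lower[1]) / 2
    samples = list(upper) + list(lower) + [center] * alpha + [average] * beta
    samples.sort()
    return samples[len(samples) // 2]


upper, lower = (10, 20, 30), (40, 60, 50)

pure_temporal = wm_interpolate(upper, lower, 200, alpha=7, beta=0)  # center pixel
pure_spatial = wm_interpolate(upper, lower, 200, alpha=0, beta=5)   # line average
outlier_case = wm_interpolate(upper, lower, 200, alpha=2, beta=1)   # center rejected
static_case = wm_interpolate(upper, lower, 35, alpha=2, beta=1)     # center accepted
print(pure_temporal, pure_spatial, outlier_case, static_case)
```

In this example the compromise weights $\alpha = 2$, $\beta = 1$ reject a center pixel that is far from all spatial neighbours but accept one that fits in among them.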

Table 1. All the different weight combinations for the weighted median interpolator of (9). Also shown is how the selection of the output depends on the weight combinations.

$\alpha$   $\beta$   $\alpha+\beta$   Middle value    Average value   Middle value = average
                                      selected (%)    selected (%)    value selected (%)

1          0         1                36              16              6
0          1         1                11              58              6
3          0         3                64              13              9
2          1         3                43              30              9
1          2         3                17              72              9
0          3         3                10              88              9
5          0         5                83              11              9
4          1         5                66              18              9
3          2         5                45              35              9
2          3         5                19              76              9
1          4         5                12              93              9
0          5         5                9               100             9
7          0         7                100             9               9



Figure 4(a). The test picture "home": 512 x 512 pixels, 256 gray levels.

An issue of interest is how often the filter selects the center point or the line average to be the output value. This is the property that we want to change by varying the weights. The individual percentage values for each weight combination are heavily image dependent, but the effects of altering the weights are consistent. The results in Table 1 were calculated using the test picture "home" shown in Figure 4(a). The results show that the operation of the filter can be quite drastically changed by adjusting the weights. For the set of weight combinations where $\alpha + \beta = 3$, starting from $\alpha = 3$, $\beta = 0$ up to $\alpha = 0$, $\beta = 3$, the percentage of occurrences of the middle value in the output of the filter decreases from 64% to 10%. At the same time the corresponding value for the line average increases from 13% to 88%.

The equations describing how the weighted median interpolator creates the missing fields $t_e^*$ and $t_o^*$ are

$$
\begin{aligned}
x(t_e^*, l, c) = \operatorname{med}[\,& x(t_o, l-1, c-1),\ x(t_o, l-1, c),\ x(t_o, l-1, c+1), \\
& x(t_o, l+1, c-1),\ x(t_o, l+1, c),\ x(t_o, l+1, c+1), \\
& \alpha \diamond x(t_e, l, c),\ \beta \diamond \tfrac{1}{2}\bigl(x(t_o, l-1, c) + x(t_o, l+1, c)\bigr)\,], \\
x(t_o^*, l, c) = \operatorname{med}[\,& x(t_e, l-1, c-1),\ x(t_e, l-1, c),\ x(t_e, l-1, c+1), \\
& x(t_e, l+1, c-1),\ x(t_e, l+1, c),\ x(t_e, l+1, c+1), \\
& \alpha \diamond x((t+1)_o, l, c),\ \beta \diamond \tfrac{1}{2}\bigl(x(t_e, l-1, c) + x(t_e, l+1, c)\bigr)\,].
\end{aligned}
\tag{9}
$$



Figure 4(b). Two pictures from the test sequence "car." The sequence that was used for testing consists of 100 pictures of 720 × 576 pixels with 256 gray levels. The pictures in this figure show a 512 × 512-size part of the original pictures. The rectangles mark the details used in the following figures.

To achieve best possible results, motion information must be available in the interpolation process. The weight combinations suitable for controlling the amount of spatial and temporal information were chosen using the above reasoning and experiments and are shown in Figure 5. The combinations with $\alpha + \beta = 5$ that are not included increase the performance very little. This is already indicated by Table 1, which shows that, e.g., the weight combination $\alpha = 3$ and $\beta = 2$ is very similar to that of $\alpha = 2$ and $\beta = 1$.

If no motion information is available, a good choice for fixed weights is $\alpha = 2$ and $\beta = 1$. This combination performs reasonably well in both stationary and moving areas. Visually the results are clearly better than obtained with the three-point vertical median filter. A VLSI implementation of this filter is described in [15], and an implementation using PGAs in [16].

4. Combining interpolation with motion detection

Some interpolation algorithms are well suited for stationary scenes, whereas some others perform well when there is motion in the picture. Even though there are good compromise solutions, like direction-dependent processing and the weighted median filter with weights $\alpha = 2$ and $\beta = 1$, these are not adequate in stationary highly detailed scenes. If the original sequence consists of identical pictures, the up-converted sequence will consist of two frames repeating after each other. No matter how good the interpolation algorithm is, the two frames will always differ from each other, causing disturbing flickering. Figure 6 shows magnified parts of the two frames resulting when the picture "home" is repeated in a sequence and the sequence is then up-converted. The up-conversion method used was the weighted median filter with weights $\alpha = 2$ and $\beta = 1$. Differences can be seen in several places, and even the smallest differences cause visible flickering if the scene remains static.

Figure 5. Different combinations of weighted median filter weights and their effect on the interpolation. The figure illustrates how the amount of temporal and spatial information varies with the filter weights. A motion detector can be used to control the choice of weights. Weight pairs shown, from mostly temporal to mostly spatial information: $(\alpha, \beta) = (7, 0), (5, 0), (3, 0), (2, 1), (1, 2), (0, 3)$.

Figure 6. Static sequence after up-converting without motion detection. The original sequence was created by repeating the test picture "home." The up-converted sequence consists of two frames repeating after each other. Parts of the two frames are shown. Differences can be easily seen, e.g., inside the rectangular area. These differences are seen as flickering.

Obviously, a motion detector is needed to control the interpolation. The required motion detector must be spatially precise so that detailed static areas are detected to be nonmoving even if there is motion in the immediate neighborhood of the area. Restrictions must be made also concerning the complexity of the detector. Computation of motion vectors in the receiver is not considered feasible here. Motion-compensated algorithms have been studied, e.g., in [17], [18], [19]. Here we will only consider simple detectors based on the absolute difference between two consecutive frames.

The basic procedure in combining an interpolator with a motion detector is to compute the motion detector output d and to compare it with a predetermined threshold value h. If motion is detected, i.e., d > h, the interpolator is applied, otherwise temporal pixel repetition is used:

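$$
\begin{aligned}
x(t_e^*, l, c) &=
\begin{cases}
\text{interpolator output}, & \text{if } d > h, \\
x(t_e, l, c), & \text{otherwise},
\end{cases} \\
x(t_o^*, l, c) &=
\begin{cases}
\text{interpolator output}, & \text{if } d > h, \\
x((t+1)_o, l, c), & \text{otherwise}.
\end{cases}
\end{aligned}
\tag{10}
$$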

If the picture does not change at all, the algorithm reduces to the frame repetition given by (2).

The simplest frame-difference-based motion detector is the local movement detector, called the one-point detector in this paper, whose output is given as

$$
d =
\begin{cases}
|x((t+1)_o, l, c) - x(t_o, l, c)|, & \text{when } l \text{ is odd}, \\
|x(t_e, l, c) - x((t-1)_e, l, c)|, & \text{when } l \text{ is even}.
\end{cases}
\tag{11}
$$

The poor performance of this detector is well documented (e.g., [20]). However, since knowing the kind of errors it produces is important for the understanding of our algorithm, we discuss this simple detector here in some detail.

The one-point detector easily makes errors when the scene contains, e.g., vertical bars (Figure 7) that move at least their own width between two consecutive frames [3]. The moving bar is not detected, and the output will be as in Figure 8(a). The moving vertical line in the background wall does not appear at all in the interpolated field. These errors are very common in nonstationary sequences if the one-point detector is used.

To alleviate the moving bar problem, it is necessary to utilize motion information from the other field of the frame and spread the detector spatially. Figure 7 also illustrates how the moving bar can be detected if the frame difference is computed also from the position below the current pixel position. A three-point motion detector that uses the pixel positions above and below the current position is given by the following equation for odd values of $l$ (for even $l$ exchange $(t+1)_o$ and $t_e$, and $t_o$ and $(t-1)_e$):

$$
\begin{aligned}
d = \max[\,& |x((t+1)_o, l, c) - x(t_o, l, c)|, \\
& |x(t_e, l-1, c) - x((t-1)_e, l-1, c)|, \\
& |x(t_e, l+1, c) - x((t-1)_e, l+1, c)|\,].
\end{aligned}
\tag{12}
$$

Figure 7. The problem of a moving bar. Four consecutive fields are shown. The dotted lines denote the missing lines, and the X shows the current pixel that is being interpolated. The one-point detector does not detect the bar (arrows above the fields), but the three-point one does (arrows below the fields).

Figure 8(a). Artifacts caused by motion detector errors. The one-point detector does not detect the moving vertical line in the background. Figure 8(b). The artifacts in Figure 8(a) can be corrected if the three-point detector is used. Pictures in Figure 8 were extracted from the sequence "car." The exact place is shown in Figure 4(b).

The maximum operator guarantees that if motion is detected in any of the three pixel positions it will be reflected in the output. This detector already dramatically reduces detection errors. Figure 8(b) shows the improvement in up-conversion compared to the one-point detector. A single moving bar is detected by the three-point detector.

Some problems still remain in fast-moving sequences. The three-point detector fails when two bars having the same orientation move in the same direction. Figure 9 shows a case where none of the frame differences is nonzero. This kind of pattern does not appear very often, but appears often enough to create a problem. Figures 10(a) and 10(b) show extracts from the "car" sequence up-converted using the one- and three-point detectors. The patterns in the background also make the three-point detector fail. By complicating the detector it would be possible to detect also this pattern. However, more memory would be needed, and we would find new patterns that could not be detected. In fact, certain motion types can never be distinguished from certain stationary patterns [21]. We will take another approach and accept occasional motion detector errors and modify the algorithm to prevent them from showing in the up-conversion result.
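The single-bar and two-bar cases can be reproduced on synthetic data. The sketch below is an assumption-laden illustration: the luminance values, bar positions, bar speed, and helper names are invented here, and every field is stored as a full-height picture because the bars are vertical and the line parity therefore does not affect the result.

```python
BG, BAR = 50, 200  # hypothetical background and bar luminance values


def picture(bar_cols, height=5, width=13):
    """Picture with vertical 1-pixel-wide bars at the given columns."""
    return [[BAR if col in bar_cols else BG for col in range(width)]
            for _ in range(height)]


def one_point(fields, l, c):
    """One-point detector of (11): frame difference at (l, c) only."""
    prev_other, cur_same, cur_other, next_same = fields
    return abs(next_same[l][c] - cur_same[l][c])


def three_point(fields, l, c):
    """Three-point detector of (12): adds the positions above and below."""
    prev_other, cur_same, cur_other, next_same = fields
    return max(
        one_point(fields, l, c),
        abs(cur_other[l - 1][c] - prev_other[l - 1][c]),
        abs(cur_other[l + 1][c] - prev_other[l + 1][c]),
    )


# Fields in temporal order: (t-1)_e, t_o, t_e, (t+1)_o.
# The bars move two columns per field, i.e., four columns per frame.
single_bar = [picture({1}), picture({3}), picture({5}), picture({7})]
double_bar = [picture({1, 5}), picture({3, 7}), picture({5, 9}), picture({7, 11})]

l, c = 2, 5  # pixel being interpolated; a bar is at column 5 in field t_e
results = {
    "single": (one_point(single_bar, l, c), three_point(single_bar, l, c)),
    "double": (one_point(double_bar, l, c), three_point(double_bar, l, c)),
}
print(results)
```

For the single bar only the three-point detector responds. For the two bars, the second bar occupied column 5 exactly one frame earlier, so every frame difference that the detectors examine is zero even though the scene is moving.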

The choice of the threshold value $h$ has a critical effect on the operation of the algorithm. Generally finding an optimal threshold value is very difficult. If $h$ is chosen very small, the motion detector errors can be almost completely eliminated. This is because in real sequences the luminance values of the two bars in Figure 9 would always be different from each other. Also the luminance level of the background is not constant. In general, nonstationary sequences require a small threshold value. However, very small threshold values eliminate the advantage of the motion detector since the interpolator is active almost all the time. If the scene is static with a lot of detail, the small amount of noise that is always present will activate the interpolator and result in flickering that the detector was supposed to remove. Small changes in the average luminance level of a static picture can also activate the interpolator. For these reasons it is not possible to achieve good results by varying between line averaging and frame repetition according to the motion detector output. To achieve good results, we must use several threshold values and gradually adjust the interpolation algorithm. The weighted median filter offers possibilities for this approach.

Figure 9. The problem of two moving bars. Four consecutive fields are shown. The dotted lines denote the missing lines, and the X shows the current pixel that is being interpolated. Neither the one-point detector nor the three-point detector is able to detect the motion.

Figure 10. Some results of up-converting algorithms using different motion detection schemes. The first three-picture sequence, Figure 10(a), was interpolated using the one-point detector with a single threshold of 20. In the second sequence, Figure 10(b), the three-point detector was used with the same threshold. The third sequence, Figure 10(c), is the result of the proposed adaptive scheme. Pictures in Figure 10 were extracted from the sequence "car." The exact place is shown in Figure 4(b).

4.1. The proposed interpolator

We propose the following scheme for scan rate up-conversion by a factor of 2. As a motion detector, we will use the seven-point detector described by the equation

$$
\begin{aligned}
d = \max[\,& |x((t+1)_o, l, c) - x(t_o, l, c)|, \\
& |x(t_e, l-1, c-1) - x((t-1)_e, l-1, c-1)|, \\
& |x(t_e, l+1, c-1) - x((t-1)_e, l+1, c-1)|, \\
& |x(t_e, l-1, c) - x((t-1)_e, l-1, c)|, \\
& |x(t_e, l+1, c) - x((t-1)_e, l+1, c)|, \\
& |x(t_e, l-1, c+1) - x((t-1)_e, l-1, c+1)|, \\
& |x(t_e, l+1, c+1) - x((t-1)_e, l+1, c+1)|\,].
\end{aligned}
\tag{13}
$$

The above equation is for the odd values of $l$, and, again, for even values we exchange $(t+1)_o$ and $t_e$, and $t_o$ and $(t-1)_e$. This detector is not much more complicated than the three-point one. Only the four diagonal points have been added to slightly improve the performance where there are moving diagonally oriented bars.
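A minimal per-pixel sketch of (13) is given below. The argument names and the small synthetic fields are assumptions of this illustration; for odd $l$ the four arguments stand for the fields $(t+1)_o$, $t_o$, $t_e$ and $(t-1)_e$, each stored here as a full-height list of rows.

```python
def seven_point(next_same, cur_same, cur_other, prev_other, l, c):
    """Seven-point detector of (13) for the pixel position (l, c).

    One frame difference is taken at (l, c) itself, and six more are taken
    on lines l-1 and l+1 at columns c-1, c and c+1 of the other field.
    """
    diffs = [abs(next_same[l][c] - cur_same[l][c])]
    for dl in (-1, 1):
        for dc in (-1, 0, 1):
            diffs.append(abs(cur_other[l + dl][c + dc]
                             - prev_other[l + dl][c + dc]))
    return max(diffs)


def flat(value, height=3, width=3):
    """Uniform synthetic field used only for this example."""
    return [[value] * width for _ in range(height)]


prev_other = flat(50)
cur_other = flat(50)
cur_other[0][2] = 90  # a change only at the diagonal position (l-1, c+1)

d_diagonal = seven_point(flat(50), flat(50), cur_other, prev_other, l=1, c=1)
d_static = seven_point(flat(50), flat(50), flat(50), flat(50), l=1, c=1)
print(d_diagonal, d_static)
```

In this example the only change lies at a diagonal position, so it is seen by the seven-point detector but would be missed by the three-point detector of (12).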

The output of the detector is compared to four threshold values, and the weights depend on the comparison results as follows:

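$$
\begin{aligned}
d \le h_1 \ &\Rightarrow\ \alpha = 7,\ \beta = 0; \\
h_1 < d \le h_2 \ &\Rightarrow\ \alpha = 3,\ \beta = 0; \\
h_2 < d \le h_3 \ &\Rightarrow\ \alpha = 2,\ \beta = 1; \\
h_3 < d \le h_4 \ &\Rightarrow\ \alpha = 1,\ \beta = 2; \\
d > h_4 \ &\Rightarrow\ \alpha = 0,\ \beta = 3.
\end{aligned}
\tag{14}
$$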

The interpolation is performed according to (9). This procedure allows us to set the lowest threshold $h_1$ very low to eliminate motion detector errors. When $d > h_1$, the interpolator is activated, but now if $d \le h_2$ the weights $\alpha = 3$ and $\beta = 0$ guarantee very good performance in these nearly static areas, and practically no flicker is produced. These weights, however, are able to filter out incorrect values caused by the motion detector errors. When the output of the motion detector grows, the interpolator gradually approaches the line averaging filter. An example of this is shown in Figure 10. Three pictures extracted from the up-converted "car" sequence are shown with three different motion detection schemes explained in this paper. The extracts are from a difficult part of the sequence, and the one-point and the three-point detectors both leave serious artifacts. These artifacts are not visible with the proposed adaptive method.
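The weight selection of (14) combined with the weighted median of (9) can be sketched for a single pixel as follows. The function names, the default threshold values (chosen here as plausible placeholders for 8-bit luminance), and the test values are assumptions of this illustration rather than a description of the tested implementation.

```python
def select_weights(d, thresholds=(4, 10, 15, 20)):
    """Map the detector output d to the weights (alpha, beta) as in (14)."""
    h1, h2, h3, h4 = thresholds  # assumed values, h1 < h2 < h3 < h4
    if d <= h1:
        return 7, 0   # equivalent to temporal repetition
    if d <= h2:
        return 3, 0
    if d <= h3:
        return 2, 1
    if d <= h4:
        return 1, 2
    return 0, 3       # equivalent to line averaging for this window


def adaptive_pixel(upper, lower, center, d, thresholds=(4, 10, 15, 20)):
    """Interpolate one pixel with the weighted median of (9) and weights from (14)."""
    alpha, beta = select_weights(d, thresholds)
    average = (upper[1] + lower[1]) / 2
    samples = sorted(list(upper) + list(lower)
                     + [center] * alpha + [average] * beta)
    return samples[len(samples) // 2]


background = (50, 50, 50)
# The center pixel holds a stale value (200) although the neighbourhood is
# uniform, as happens when motion goes almost undetected.
repeated = adaptive_pixel(background, background, 200, d=0)    # error is visible
filtered = adaptive_pixel(background, background, 200, d=6)    # error is masked
averaged = adaptive_pixel(background, background, 200, d=100)  # spatial estimate
print(repeated, filtered, averaged)
```

As soon as $d$ exceeds the lowest threshold, the weights $\alpha = 3$, $\beta = 0$ already reject a center value that disagrees with all six spatial neighbours, which is why the lowest threshold can be set very low without introducing flicker.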

This adaptive weighted median algorithm has been tested with several real image sequences. It provides noticeably better results than any of the other methods described in this paper. In the sequences where there is fast motion, the algorithms with a single threshold either cause serious motion artifacts or flicker in the stationary areas. Our approach is free of motion artifacts even in difficult sequences. The performance in stationary areas is very close to that of frame repetition. The filter does not suffer from serration effects; see Figure 2(b).

In the proposed algorithm the weights $\alpha = 5$ and $\beta = 0$ are not used. This is because no additional visible advantage was achieved with the real sequences that were used for testing. If the picture source is of very high quality, or if computer-generated sequences are used, the lowest threshold may still be too high to eliminate all motion detector errors. In this case (14) can be modified so that the weights $\alpha = 7$ and $\beta = 0$ are replaced by $\alpha = 5$ and $\beta = 0$. In static scenes the performance of this weight combination is only slightly worse than that of repetition, but since repetition is not used, motion detector errors are eliminated.

The luminance values of the test sequences were between 0 and 255, and suitable threshold values were found to be $h_1 = 3$ to $5$, $h_2 = 10$, $h_3 = 15$, and $h_4 = 20$. For chrominance components no interpolation algorithm was found superior to others. Most of the tests have been carried out with black-and-white sequences. With color sequences the chrominance components were interpolated using only spatial information.

Conclusions

In this paper an algorithm for television scan rate up-conversion was introduced. The algorithm employs a weighted median filter and a motion detector. The motion detector is based on the absolute difference between consecutive frames and uses several threshold values to control the median filter weights. This scheme efficiently decreases the possibility of visible artifacts that otherwise would arise due to errors caused by the simple motion detector. The interpolation filter does not suffer from serration effects like the vertical three-point median filter and is able to preserve fine details, including diagonal lines. The algorithm has been tested using a VTE DVSR 100 video sequencer and with real-life input sequences. The results with all sequences are quite good. The artifacts caused by the low scan rate could be removed, and no visible artifacts were introduced by the interpolator.

Acknowledgments

This project was partly supported by Nokia Consumer Electronics. The authors wish to thank Vesa Lundén for performing many of the simulations using the video sequencer.

References

1. H.P. Gaggioni, "The Evolution of Video Technologies," IEEE Congnunications Magazine, vol. 25, Nov. 1987, pp. 20-36.

2. C. Hentchel, "Comparison between Median Filtering and Vertical Edge Controlled Interpolation for Flicker Reduction" IEEE Transactions on Consumer Electronics, vol CE-35, Aug. 1989, pp. 279-289.

3. G.J. Tonge, "Image Processing for Higher Definition Television," IEEE Transactions on Circuits and Systems, vol. CAS-34, Nov. 1987, pp. 1385-1398.

4. J. 2uhola, A. Nieminen, J. Salo, and Y. Neuvo, "Scan Rate Conversions Using Weighted Median Filtering," in Proceedings of the 1989 IEEE International Symposium Circuits and Systems, Portland, OR, 1989, pp. 433436.

5. T. Doyle, "Interlaced to Sequential Conversion for EDTV Applications" in Proceedings of the Second Inter- national Workshop on HDTV, 19iquila, Italy, 1988.

6. N.C. Gallagher, Jr., and G.L. Wise, '7i Theoretical Analysis of the Properties of Median Filters" IEEE Tran- sactions on Acoustics, Speech, and Signal Processing, vol. ASSP-29, Dec. 1981, pp. 1136-1141.

7. S. Naimpally, L. Johnson, T. Darby, R. Meyer, L. Phillips, and L Vantrease, "Integrated Digital ffYI'V Receiver with Features," IEEE Transactions on Consumer Electronics, vot. CE-34, Aug. 1988, pp. 4104-419.

8. T. Doyle and M. Looymans, "Progressive Scan Conversion Using Edge Information," in Proceedings of the Third International Workshop on HDTV, Torino, Italy, 1989.

9. B.I. Justusson, "Median Filtering: Statistical Properties" in Topics in Applied Physics, vol. 43, Two-Dimensional Digital Signal Processing 11, (T.S. Huang, ed.), Berlin: Springer-Verlag, 1981, pp. 161-196.

10. O. Yli-Harja, L Astola, and Y. Neuvo, '9tnalysis of the Properties of Median and Weighted Median Filters Using Threshold Logic and Stack Filter Representation;' IEEE Transactions on Signal Processing, vol. SP-39, Feb. 1991, pp. 395410.

11. D.R.K. Brownrigg, "The Weighted Median Filter" Communications of the Association of Computing Machinery, vol. 27, Aug. 1984, pp. 807-818.

12. M.K. Prasad and Y.H. Lee, "Weighted Median Filters: Generation and Properties,' in Proceedings of the 1989 IEEE International Symposium on Circuits and Systems, Portland, OR, 198% pp. 425-428.

21

130 P. HAAVIS~, J. JUHOLA AND Y. NEUVO

13. E Heinonen and Y. Neuvo, "FIR-Median Hybrid Filters;' IEEE Transactions on Acoustics, Speech, and Signal Processing, voL ASSP-35, June 1987, pp. 832-838.

14. L Salt, Y. Neuvo, and V. HSaneenaho, "Improving TV Picture Quality with Linear-Median Type Operators" IEEE Transactions on Consumer Electronics, vol. CE-34, Aug. 1988, pp. 373-379.

15. J. Juhola, P. Haavisto, O. Vainio, T. Raita-Aho, and Y.Neuvo, "On VLSI Implementation of Median Based Field Rate Up-conversion" in Proceedings of the IEEE International Symposium on Circuits and Systems, New Orleans, LA, 1990, pp. 3042-3045.

16. H. Rantanen and A. Nieminen, "Prototyping IDTV Functions Using PGAs" IEEE Transactions on Con- sumer Electronics, vol. CE-36, Aug. 1990, pp. 296-300.

17. M. Haghiri and P. Guichard, ' ~ Motion Compensated Field Rate Conversion Algorithm," in Proceedings of the Third International Workshop on HDTK, Torino, Italy, 1989.

18. E-M. Wang, D. Anastassiou, and A.N. Netravali, "Tune-t/.ecursive Motion Compensated Deinterlacing" in Proceedings of the Third International Workshop on HD'I~, Torino, Italy, 1989.

19. G.M.X. Fernando, D.W. Parker, and P.T. Rogers, "Motion Compensated Display Rate Conversion of Band- width Compressed HD-MAC Pictures" in Proceedings of the Third International Workshop on HDTE, Torino, Italy, 1989.

20. DJ. Cormor, B.G. Haskell, and F.W. Mounts, "A Frame-to-Frame Picturephone Coder for Signals Contain- ing Differential Quantizing Noise" The Bell System Technical Journal, vol. 52, no. 1, 1973, pp. 35-51.

21. N. Suzuki, M. Kageyanaa, H. Yoshigi, and T. Fukinuki, "Improved Synthetic Motion Signal for Perfect Motion- Adaptive Pro-Scan Conversion in IDTV-Receivers;' IEEE Transactions on Consumer Electronics, vol. CE-35, Aug. 1989, pp. 266-271.

22
