
Image and Vision Computing 31 (2013) 658–672

Contents lists available at SciVerse ScienceDirect

Image and Vision Computing

journal homepage: www.elsevier.com/locate/imavis

Using texture to complement color in image matting☆

Ehsan Shahrian, Deepu Rajan ⁎
Center for Multimedia and Network Technology, School of Computer Engineering, Nanyang Technological University, 639798, Singapore

☆ This paper has been recommended for acceptance by Philippos Mordohai.
⁎ Corresponding author. Tel.: +65 67904933; fax: +65 67926559.
E-mail addresses: [email protected] (E. Shahrian), [email protected] (D. Rajan).
0262-8856/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.imavis.2013.06.002

Article history: Received 6 April 2012; Received in revised form 4 January 2013; Accepted 3 June 2013

Keywords: Alpha matting; Texture matte

Abstract

Current image matting methods based on color sampling use color to distinguish between foreground and background pixels. However, they fail when the corresponding color distributions overlap. Other methods that define correlations between neighboring pixels based on color aim to propagate the opacity parameter α from known pixels to unknown pixels. However, strong edges of textured regions may block the propagation of α. In this paper, a new matting strategy is proposed that delivers an accurate matte by considering texture as a feature that can complement color, even if the foreground and background color distributions overlap and the image is a complex one with highly textured regions. The texture feature is extracted in such a way as to increase the distinction between foreground and background regions. An objective function containing color and texture components is optimized to find the best foreground and background pair among a set of candidate pairs. The effectiveness of the proposed method is compared quantitatively as well as qualitatively with other matting methods by evaluating their results on a benchmark dataset and a set of complex images. The evaluations show that the proposed method performs best among state-of-the-art matting methods.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

Digital matting is the accurate extraction of foreground regions from images and videos and is useful for image and video editing operations. The compositing equation is given by [1]

I_p = \alpha_p F_p + (1 - \alpha_p) B_p \qquad (1)

where F_p and B_p are the foreground and background colors of pixel p, which are linearly combined using α_p to represent its observed color I_p. The alpha values lie in [0, 1], with pixels having α = 1 and α = 0 belonging to foreground and background, respectively. For a three-channel color image, the digital matting task (also known as alpha matting) involves the estimation of seven unknowns from three compositing equations for each pixel. Typically, the solution space is constrained through the availability of trimaps, which partition the image into three regions: known foreground, known background and unknown (i.e., a mixture of foreground and background colors). Trimaps are usually user-defined, but can also be guided [2] or generated automatically [3]. Assumptions on image statistics are also used to constrain the solution of Eq. (1) [4-6]. The algorithm for matte estimation in this paper is based on trimaps. Once the alpha matte is estimated, an editing operation can combine the extracted foreground image with a new background using the compositing equation.


The accuracy of alpha matte estimation is largely dependent on the accuracy of the trimap. If the trimap is coarse, i.e., it partitions the foreground and background sparsely, then the quality of alpha matting deteriorates. One of the main approaches to alpha matting is to select the best known samples of the foreground and background regions that contribute to an accurate matte estimation. The α value for an unknown pixel p is calculated from the selected foreground and background samples by

\alpha_p = \frac{(I_p - B) \cdot (F - B)}{\| F - B \|^2} \qquad (2)
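As an illustration of Eqs. (1) and (2), the following minimal NumPy sketch (ours, not from the paper) composites a pixel from a known (F, B) pair and recovers its alpha by projecting the observed color onto the F-B line:

```python
import numpy as np

def composite(F, B, alpha):
    """Compositing equation (1): I = alpha*F + (1 - alpha)*B."""
    return alpha * F + (1.0 - alpha) * B

def estimate_alpha(I, F, B):
    """Eq. (2): project the observed color I onto the line joining
    the candidate foreground and background colors."""
    FB = F - B
    denom = float(np.dot(FB, FB))
    if denom < 1e-12:              # degenerate pair: F and B coincide
        return 0.0
    return float(np.clip(np.dot(I - B, FB) / denom, 0.0, 1.0))

# A pixel mixed 60/40 from F and B is recovered exactly.
F = np.array([0.9, 0.2, 0.1])
B = np.array([0.1, 0.3, 0.8])
print(estimate_alpha(composite(F, B, 0.6), F, B))   # -> 0.6
```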

The other approach is to assume that neighboring pixels are correlated under some image statistics. This involves the estimation of α over a small neighborhood, followed by propagation of the α towards unknown pixels guided by the gradient of the image. This approach overcomes the disadvantage of a coarse trimap, from which reliable samples of foreground and background colors cannot be obtained for unknown pixels that are far away from the trimap. Yet another approach casts the matting problem as an optimization problem where the cost function consists of a data term representing color sampling and a smoothness term representing alpha propagation. Thus, these methods can be seen as a combination of sampling and α-propagation methods.

Color sampling based matting methods rely on the color values of selected known pixels to estimate the alpha values of unknown pixels. These approaches work well when dealing with images that have well-separated foreground and background color distributions. However, they fail to pull the alpha matte when the foreground and background


color distributions are close together and have some overlap, since the color feature can no longer discriminate between foreground and background colors. Also, α-propagation based matting methods fail to estimate alpha accurately when dealing with complex images, because the textured background or foreground of complex images may contain strong edges that block the propagation of alpha. The above two drawbacks are illustrated in Fig. 1. The alpha mattes extracted by the color sampling approaches of Bayesian matting [7] and shared matting [8], the alpha propagation based approach of closed-form matting [4], and the combined approach of robust matting [9] (which takes advantage of color sampling and alpha propagation techniques) show that parts of the book are also considered as foreground, as seen in the zoomed version of window 2. Due to the similarity between the color of the hair and part of the letters in window 2, which causes the local foreground and background color distributions to overlap, even careful selection of known foreground and background pairs results in some portions of the background being wrongly estimated as belonging to the matte. At the same time, strong edges in the background prevent propagation of alpha, which also contributes to the erroneous matte. In fact, the edges corresponding to some of the letters in the title of the book are visible in the results of these methods.

In this paper, we propose to alleviate the above problems of existing matting methods through two contributions: (i) the first is the development of a color sampling based alpha matting algorithm in which several properties of potential candidates for foreground and background are considered to select a suitable foreground-background pair from which the alpha matte can be calculated; (ii) the second is an evaluation of the role that texture plays in an accurate estimation of alpha. As mentioned, color features fail when there is overlap between the foreground and background distributions. In such cases, we show that the texture feature can be used to discriminate between foreground and background. When the alpha matte from the texture features is combined with that obtained from the color features, we achieve a more accurate estimate of the unknown alpha matte. We show the result of our combined matting approach in Fig. 1, where much of the background that appeared in the other methods has been eliminated. Compared to shared sampling [8], our method performs well, as seen in windows 1 and 2, where more of the book region has been correctly identified as belonging to the background.

The paper is organized as follows: in Section 2, we present a brief review of some important matting methods. In Section 3, a new color based matting method is proposed, and the role of texture in alpha matting is studied in Section 4. The combination of color and texture features to arrive at the final alpha matte is discussed in Section 5. Experimental results are presented in Section 6 and, finally, conclusions are presented in Section 7.

2. Related work

A comprehensive survey on image and video matting is presented in [10], where the various techniques are categorized into color sampling methods, alpha propagation methods and their combinations.

Fig. 1. Comparison of the proposed method of combining color and texture information for matte extraction with other state-of-the-art methods (panels: input image with windows 1 and 2, zoomed windows, closed-form matting, robust matting, proposed method, Bayesian matting, shared matting). Top row and bottom row show zoomed versions of windows 1 and 2, respectively.

Statistically, neighboring pixels with similar color often have similar α values. Color sampling based methods use this local correlation between unknown pixels and their nearby known foreground and background pixels. A set of known foreground and background candidate samples is selected and an analysis of their spatial, photometric and textural properties is carried out to identify the best foreground-background pair. Different sample collection approaches are used. Robust matting [9] collects known samples that are spatially close to unknown pixels, while shared matting [8] collects samples that lie along rays emanating from unknown pixels. Global sampling matting [11] captures all known boundary samples to construct the set of F-B pairs for unknown pixels. The earliest sampling approaches were the blue screen matting idea of Mishima [12] and the interpolation of color distributions from known foreground and background regions [6]. Bayesian matting uses locally oriented Gaussian distributions to construct foreground and background color models and formulates alpha estimation in a Bayesian framework solved using the maximum a posteriori (MAP) technique [7]. Local color based matting methods work well when the trimap is accurate and the unknown region is a narrow band around the known regions. For coarse trimaps, an iterative matting approach is proposed in [3], where a user scribbles an initial (coarse) trimap and the algorithm builds a global color model by training Gaussian Mixture Models (GMMs) on known foreground and background colors from which samples are drawn; the known regions are then extended gradually based on a belief propagation framework. In robust matting [9], high confidence foreground-background pairs that linearly explain the color of pixels are used to estimate the alpha matte. This idea is extended by [8,13,11] with respect to spatial and gradient properties of the image. Once the best foreground (F) and background (B) samples are selected, α can be computed using Eq. (2).

Matting methods based on alpha propagation use affinities of neighboring pixels to propagate alpha values of known pixels toward unknown ones. In [14], an extension of spectral segmentation is proposed to generate a basis set of fuzzy matting components; spectral segmentation obtains an unsupervised segmentation of the image into hard components based on the smallest eigenvectors of the graph Laplacian matrix of the image. A random walk on a graph whose edge weights reflect affinity between pixels is used to extract the matte in [15]: for every unknown pixel, the probability that a random walker starting from the unknown pixel reaches the foreground before the background indicates its alpha value, and is obtained by solving a system of linear equations. In [16], the alpha value is based on a weighted geodesic distance covered by a random walker traveling from an unknown pixel to a foreground pixel. An assumption of local color smoothness in foreground and background regions is made by Poisson matting [17] to estimate the gradient of the alpha matte using the image gradient. A closed form solution for alpha matting is proposed in [4] by minimizing a cost function based on local smoothness assumptions on foreground (F) and background (B) colors; the authors show analytically that it is possible to remove F and B from the cost function and define it as a quadratic function of α. The alpha matting problem is treated as a semi-supervised learning task in [18], in which global and local learning processes



model the dependencies between alpha and color in the neighborhood of a pixel.

The color sampling and alpha propagation approaches have been combined to form a third category of matte extraction techniques. Here, matte estimation is formulated as an optimization problem consisting of a data term and a smoothness term. The iterative matting technique [3] referred to earlier can also be seen as belonging to this category, since the matte is modeled as a Markov random field and an energy function is minimized using loopy belief propagation. In robust matting [9], the optimization is solved as a graph labeling problem, with the data term of the energy function determined by high confidence foreground-background pairs obtained from a robust sampling scheme, and a smoothness term similar to the cost function of closed form matting. This idea is improved by Rhemann et al. [19], where a new confidence metric is proposed to weight the data term of the energy function with respect to the smoothness term for every pixel, and the energy function is minimized by solving a sparse set of linear equations. In [20], the prior probability of the alpha matte is modeled as the convolution of a high-resolution binary segmented image with the point spread function of the imaging process, and the result is applied to [13] to obtain an improved matte. In [11], all known foreground and background pixels on the unknown region's boundary are used to construct sets of global F-B samples for unknown pixels; a simple cost function and an efficient random search are used to find the best F-B pair. The estimated alpha matte is used as the data term in a global optimization problem to refine the α estimated from the F-B pairs, with the matting Laplacian matrix [4] as the smoothness term.

The color-based sampling techniques discussed above assume smoothness of color in a neighborhood. Moreover, since they rely only on color, if the local distributions of foreground and background color overlap, the methods tend to fail.

The problem is clearly shown in Fig. 2. The foreground and background regions of the image (Fig. 2a) have similar color distributions, as shown in Fig. 2b. Four foreground-background pairs (F1, B1), (F2, B2), (F3, B2) and (F3, B1) are collected to estimate the α of an unknown pixel p with observed gray value 100. The unknown pixel p is close to the foreground component of pair (F1, B1) and its α is estimated as 0.75, while it is also close to the background component of (F2, B2) and its α is estimated as 0.25; the α estimated from the other F-B pairs falls outside the range [0, 1]. In such cases, the color feature is unable to select appropriate samples, and the question is which pair can accurately estimate α. The proposed method uses the texture feature along with color to alleviate the problem by minimizing an objective function that contains a component especially targeted at handling overlapped distributions.

Fig. 2. Illustration of the problem of sample selection when foreground and background have similar color. (a) Original image and its alpha matte. (b) Color distribution of foreground and background of the image in (a). (c) Effect of overlapped distributions on α, with (F1, B1) = (110, 70), (F2, B2) = (130, 90), (F3, B2) = (80, 90) and (F3, B1) = (80, 70). (d) Estimated alpha for the different F-B pairs: 0.75, 0.25, -1 and 3.
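To make the failure mode concrete, here is the arithmetic behind the four estimates reported in Fig. 2d. For grayscale values, Eq. (2) reduces to \alpha_p = (I_p - B)/(F - B), so with I_p = 100:

\alpha_{(F_1,B_1)} = \frac{100 - 70}{110 - 70} = 0.75, \qquad \alpha_{(F_2,B_2)} = \frac{100 - 90}{130 - 90} = 0.25,
\alpha_{(F_3,B_2)} = \frac{100 - 90}{80 - 90} = -1, \qquad \alpha_{(F_3,B_1)} = \frac{100 - 70}{80 - 70} = 3,

and the last two pairs are rejected because their estimates fall outside [0, 1], while the first two are both plausible yet disagree.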

Alpha propagation methods assume that neighboring pixels are correlated under some image statistics. The strong edges of highly textured regions may block the propagation of alpha, since the correlations are defined based on color similarities. Therefore, we propose to use texture features together with color information to overcome this problem. The use of texture features for matting has not been studied extensively. In [21], good background samples for unknown regions are found by texture synthesis using the known background and the observed color of the unknown pixel; a similar procedure is carried out to gather good foreground samples. With this collection of F and B samples, an alpha estimation method is proposed. While this method uses texture information of known pixels to synthesize texture in unknown regions, our approach is to extract texture features that aid the color matting method proposed in this paper by delineating foreground and background regions.

Although previous combined matting methods are successful to some degree in complementing poor sample selection methods with better alpha propagation strategies, and vice versa, they may not be successful on challenging images like the one in Fig. 1, where robust matting does not perform as well as the proposed matting method that combines color and texture features.

3. Proposed method

The new matting method based on color and texture features is proposed to illustrate the ability of texture to complement color in estimating a more accurate matte for textured regions whose foreground and background have similar color distributions. The proposed method collects color and texture samples for every unknown pixel and then estimates a texture matte using the texture features; the estimated texture matte is then used with the color features to estimate a more accurate alpha for unknown pixels. In this context, first the sample selection is presented, then the objective function to choose the best F-B pair followed by the Laplacian post-processing is described, and finally the texture feature extraction and its combination with the color feature in the objective function are explained.



3.1. Sample selection

The objective of the proposed color sampling scheme is to ensure that foreground and background samples are judiciously chosen, so that the estimated α, when used in conjunction with the α extracted from the texture feature, results in an improved matte. Our sampling approach is inspired by the sample-gathering stage of [8], which performs best in a benchmark database [22].

The input image is divided into blocks of size 5 × 5 and a mixture of Gaussian distributions is assigned to each block to model its color distribution. A set of color models for unknown blocks is constructed by choosing models from some known blocks. This process is illustrated in Fig. 3.

For each unknown block, m line segments are drawn at angular increments of θ_inc = 2π/m, with the first line segment having an orientation of 0°. These line segments choose the model of the first known block that they encounter. Thus, m line segments collect at most m color models of background blocks and at most m color models of foreground blocks. The selected blocks are shown in Fig. 3 for eight line segments. In the experiments, we use 20 line segments to construct a set of foreground and background color models for each unknown block. Each component in a model is then sampled individually and the generated samples are paired together to construct a set of foreground-background (F-B) pairs. Such a sampling ensures that F-B pairs are obtained for every block, so that samples are generated even for small clusters, which might be missed if random sampling were employed. This method of sample generation differs from [8] in that, in the latter, the F-B pair is formed from each pixel, as opposed to sampling from a distribution in a block. The obvious advantage of the proposed method is that it is not affected by noisy pixels, since the information in an entire block is captured in the distribution from which samples are generated.
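A minimal sketch of one plausible reading of this gathering step follows; the data layout (a block-label array, with per-block GMMs looked up separately) and the rule that each ray keeps the first foreground and first background block it crosses are our assumptions, not details given in the paper:

```python
import numpy as np

def gather_known_blocks(labels, blk, m=20):
    """From unknown block `blk` (row, col), walk m rays at angular
    increments of 2*pi/m and record the first known foreground ('F')
    and background ('B') block hit by each ray. `labels` is a 2-D
    array over the 5x5 blocks with values 'F', 'B' or 'U'."""
    H, W = labels.shape
    fg, bg = set(), set()
    for k in range(m):
        theta = 2.0 * np.pi * k / m           # first ray at 0 degrees
        dy, dx = np.sin(theta), np.cos(theta)
        y, x = float(blk[0]), float(blk[1])
        hit_f = hit_b = False
        while 0 <= y < H and 0 <= x < W and not (hit_f and hit_b):
            lab = labels[int(y), int(x)]
            if lab == 'F' and not hit_f:
                fg.add((int(y), int(x))); hit_f = True
            elif lab == 'B' and not hit_b:
                bg.add((int(y), int(x))); hit_b = True
            y += dy; x += dx                   # step one block along the ray
    return fg, bg   # each ray contributes at most one F and one B block
```

The GMMs of the returned blocks are then sampled per component and paired to form the candidate F-B set.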

3.2. Best (F, B) pair selection

Once the set of F-B pairs for unknown blocks is collected, the task is to select the pair that best represents the true foreground and background colors for each pixel in the block and to estimate its α using Eq. (2). The selection is done through a brute-force optimization of an objective function based on photometric and spatial image statistics. It consists of three main parts:

O^1_p(F_i, B_j) = D_p(F_i, B_j) \cdot E_p(F_i, B_j) \cdot S_p(F_i, B_j) \qquad (3)

where D indicates chromatic distortion and E and S use spatial and color statistics of the image to find a high-quality F-B pair for every pixel. Before discussing each of these terms, we present some notation related to an unknown pixel p. The block of pixel p is Blk_p and the set of generated F-B pairs for the block is denoted by S^{FB}_{Blk_p}. The i-th foreground sample and j-th background sample are F_i and B_j, respectively. The sets of foreground and background components of the F-B pairs are S^F_{Blk_p} and S^B_{Blk_p}, respectively. The number of F-B pairs and the numbers of foreground and background components are N^{FB}_{Blk_p}, N^F_{Blk_p} and N^B_{Blk_p}, respectively.

Fig. 3. Illustration of the process of known block selection using 8 line segments: rays from an unknown block pick known foreground and background blocks, whose mixture-of-Gaussian color models are sampled to form the set of foreground samples, the set of background samples and the set of F-B pairs.

D: The linear model of the compositing equation successfully explains the color of a pixel as a convex combination when the estimated color is close to its observed color. Hence, D is defined to account for chromatic distortion. For a certain F-B pair, the estimated color of the unknown pixel is obtained by the compositing equation. The distortion between the estimated color and the observed color is given by

D_p(F_i, B_j) = e^{-\| I_p - (\alpha F_i + (1 - \alpha) B_j) \|} \qquad (4)

where I_p is the observed color of pixel p. D has a high value for F-B pairs for which the estimated color is close to the observed color.

E: This term involves the distance between an F-B pair and the unknown pixel at block resolution in the spatial domain. If an F-B pair can easily propagate its information to the unknown pixel due to its low spatial distance, then E should be high. Thus,

E_p(F_i, B_j) = \exp\left( \frac{-\| Blk_{F_i} - Blk_p \|}{\frac{1}{N^F_{Blk_p}} \sum_{F_k \in S^F_{Blk_p}} \| Blk_{F_k} - Blk_p \|} \right) \cdot \exp\left( \frac{-\| Blk_{B_j} - Blk_p \|}{\frac{1}{N^B_{Blk_p}} \sum_{B_k \in S^B_{Blk_p}} \| Blk_{B_k} - Blk_p \|} \right) \qquad (5)

where \| Blk_{F_i} - Blk_p \| is the Euclidean distance in the spatial domain between the block of foreground sample F_i and the block containing the unknown pixel p, and indicates the spatial cost required for F_i to reach the pixel.

S: This term is defined with respect to the main problem of current color sampling based approaches, viz., the overlapping of foreground and background distributions. It is biased towards those F-B pairs that come from well-separated distributions and is formulated as

S(F_i, B_j) = e^{-\frac{1}{d(F_i, B_j)}} \qquad (6)

where

d(F_i, B_j) = \frac{\mu_{F_i} - \mu_{B_j}}{\sqrt{\frac{(N_{B_j} - 1)\,\sigma^2_{B_j} + (N_{F_i} - 1)\,\sigma^2_{F_i}}{N_{B_j} + N_{F_i} - 2}}} \qquad (7)

is the Cohen's d value of the distributions [23] that generated F_i and B_j. \mu_{F_i}, \sigma^2_{F_i} and N_{F_i} are the mean, variance and population size of the distribution that generated sample F_i. Cohen's d value is inversely proportional to the overlap of the distributions. When F-B pairs are generated by highly overlapped foreground and background distributions (d → 0), then



S → 0 to reduce their influence; when they are generated from well-separated distributions, their influence is increased since S → 1.
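The following sketch assembles the three terms of Eq. (3) for a single candidate pair, under assumed precomputed inputs (the block distances of Eq. (5) and the parameters of the generating distributions for Eq. (7)); the small guards against division by zero are our own additions:

```python
import numpy as np

def cohens_d(mu_f, var_f, n_f, mu_b, var_b, n_b):
    """Cohen's d between the two generating distributions, Eq. (7)."""
    pooled = ((n_b - 1) * var_b + (n_f - 1) * var_f) / (n_b + n_f - 2)
    return (mu_f - mu_b) / np.sqrt(pooled + 1e-12)

def objective_O1(Ip, F, B, dist_F, dist_B, mean_dist_F, mean_dist_B, d_fb):
    """O1 = D * E * S, Eqs. (3)-(6).
    dist_F/dist_B: block distances of this pair to the unknown block;
    mean_dist_F/mean_dist_B: mean block distances over all candidates;
    d_fb: Cohen's d of the distributions that generated F and B."""
    FB = F - B
    alpha = np.clip(np.dot(Ip - B, FB) / (np.dot(FB, FB) + 1e-12), 0, 1)
    D = np.exp(-np.linalg.norm(Ip - (alpha * F + (1 - alpha) * B)))    # Eq. (4)
    E = np.exp(-dist_F / mean_dist_F) * np.exp(-dist_B / mean_dist_B)  # Eq. (5)
    S = np.exp(-1.0 / (abs(d_fb) + 1e-12))                             # Eq. (6)
    return D * E * S
```

The best pair for a pixel is then simply the candidate maximizing this product, found by brute force over the pair set, as the text describes.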

3.3. Refinement process

The alpha matte obtained by estimating α for each pixel using the best (F, B) pair in Eq. (2) is further refined to obtain a smooth matte by considering correlation between neighboring pixels. In particular, we adopt the post-processing method of [8], where a cost function consisting of a data term (the estimated matte α̂ with a confidence value f) together with a smoothness term consisting of the matting Laplacian [4] is minimized with respect to α. The confidence value is the value of the objective function in Eq. (3) for the selected (F, B) pair. The cost function is given by [8]

\alpha = \arg\min_{\alpha}\; \alpha^T L \alpha + \lambda\,(\alpha - \hat{\alpha})^T D\,(\alpha - \hat{\alpha}) + \gamma\,(\alpha - \hat{\alpha})^T \hat{\Gamma}\,(\alpha - \hat{\alpha}) \qquad (8)

where L is the matting Laplacian, λ is a large weighting parameter that enforces the estimated alpha α̂ (with its associated confidence f) at known pixels, and γ is a constant (10^{-1}) that indicates the relative importance of the data and smoothness terms. D is a diagonal matrix with value 1 for known foreground and background pixels and 0 for unknown ones, while the diagonal matrix Γ̂ has value 0 for known foreground and background pixels and f (the confidence value) for unknown pixels.
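Setting the gradient of Eq. (8) to zero gives the sparse linear system (L + λD + γΓ̂)α = (λD + γΓ̂)α̂. A sketch with SciPy follows; it assumes the matting Laplacian L is available from an implementation of [4], and the default magnitude of λ is our assumption:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def refine_matte(L, alpha_hat, known_mask, confidence, lam=100.0, gamma=0.1):
    """Minimize Eq. (8) in closed form by solving
    (L + lam*D + gamma*Gamma) alpha = (lam*D + gamma*Gamma) alpha_hat.
    L: matting Laplacian (n x n, sparse); alpha_hat: initial matte,
    flattened; known_mask: 1 at known F/B pixels, 0 elsewhere;
    confidence: per-pixel objective value f, 0 at known pixels."""
    D = sp.diags(known_mask.astype(float))
    Gamma = sp.diags(np.where(known_mask, 0.0, confidence))
    A = L + lam * D + gamma * Gamma
    b = (lam * D + gamma * Gamma) @ alpha_hat
    return np.clip(spsolve(A.tocsr(), b), 0.0, 1.0)
```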

4. Texture feature for matting

As mentioned, when there is significant overlap in the color distributions of foreground and background regions, current color sampling methods may fail to estimate a reliable matte. Moreover, in alpha propagation based methods, since the correlation between neighboring pixels is defined based on color similarities, strong edges of textured regions may block the propagation of alpha, precluding the extraction of an accurate matte. In such cases, color information does not suffice, and hence we turn to texture features that may define the correlation between neighbors more robustly. Thus, our solution is to extract a texture feature in such a way as to increase the discrimination between foreground and background regions. Any of the texture feature extraction techniques available in the literature can be used for the purpose [24].

The proposed texture feature is generated from the chromatic and structural content of an image. Texture properties are considered over an image patch in which the chromatic information is as homogeneous as possible and only the dominant color in the patch is retained. To this end, we smooth the image. However, we need to ensure that strong edges that participate in the texture do not get smoothed inadvertently. Hence, the first step is the application of a set of three edge-preserving bilateral filters to obtain high, medium and low smoothing. The parameters associated with the bilateral filters are the Gaussian window sizes (5, 15, 25), geometric spreads (10, 15, 25) and photometric spreads (1, 1.5 and 3). The absolute difference between the original image and the bilateral filtered images is denoted by I^{AbsDiff}_i = |I^0 - I^{BF}_i| for i = 1, 2 and 3, and its intensities are scaled to [0, 1]. The final smoothed image, which removes details of color variation, is obtained as

I^{Smooth} = w_0 \otimes I^0 + \frac{1}{3} \sum_{i=1}^{3} \left( I^{AbsDiff}_i \otimes I^{BF}_i \right) \qquad (9)

where w_0 = 1 - \frac{1}{3} \sum_{i=1}^{3} I^{AbsDiff}_i and ⊗ is the element-wise product operator. w_0 takes values in the range [0, 1] and indicates how much information is taken from the input and bilateral filtered images with respect to the absolute differences between them.

The original image of Net along with its trimap and the three bilateral smoothed images are shown in the first row of Fig. 4. The absolute differences between the original image and the bilateral filtered images are shown in the first three columns of the second row of Fig. 4. The final smoothed image, in which the color values of pixels come from both the original as well as the bilateral smoothed images, is shown in the fourth column of the second row of Fig. 4. If the difference between the original and bilateral filtered images is small, the contribution to the smoothed image comes from the former; otherwise the contribution is from the bilateral filtered images. Therefore, the color details in the smoothed image are replaced by the overall chromatic information in the texture. The chromatic component of the texture feature vector for a pixel is, thus, given by

FV^{Texture}_{S1} = \left\{ I^{Smooth},\, I^{AbsDiff}_1,\, I^{AbsDiff}_2,\, I^{AbsDiff}_3 \right\} \qquad (10)

which is a 12-dimensional vector.

The structural content of the texture is obtained through a Haar wavelet decomposition of the image into 4 sub-images called the Approximation (I_A), Horizontal (I_H), Vertical (I_V) and Diagonal (I_D) details. Each of the sub-images is resized to that of the original image and the mean of the coefficients over a neighborhood of size 3 × 3 is computed for each pixel. Similarly, the variance of the coefficients over a similar neighborhood is calculated along four directions: horizontal, vertical, major and minor diagonals. Thus, the structural component of the texture feature vector for a pixel is given by

FV^{Texture}_{S2} = \left\{ I^{var}_{(k,i)},\, I^{mean}_k \right\}, \quad k \in \{A, H, V, D\},\; i = 1, \ldots, 4, \qquad (11)

and its dimension is 20. The chromatic and structural contents of a texture are concatenated to form a 32-dimensional texture feature vector given by

FV^{Texture} = \left\{ FV^{Texture}_{S1},\, FV^{Texture}_{S2} \right\}. \qquad (12)
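A condensed sketch of the feature construction with OpenCV and PyWavelets is given below. The exact mapping of the stated filter parameters onto cv2.bilateralFilter arguments is an assumption, and for brevity the structural part computes an isotropic 3 × 3 variance instead of the four directional variances, so it yields 8 of the 20 structural dimensions:

```python
import numpy as np
import cv2
import pywt
from scipy.ndimage import uniform_filter

def chromatic_features(img):
    """Eqs. (9)-(10): three bilateral filters, scaled absolute
    differences, and the blended smooth image (12 channels)."""
    img = img.astype(np.float32)
    feats, diff_sum = [], np.zeros_like(img)
    smooth = np.zeros_like(img)
    for d, s_space, s_range in [(5, 10, 1.0), (15, 15, 1.5), (25, 25, 3.0)]:
        bf = cv2.bilateralFilter(img, d, s_range, s_space)
        ad = np.abs(img - bf)
        ad = (ad - ad.min()) / (ad.max() - ad.min() + 1e-12)  # scale to [0, 1]
        smooth += ad * bf / 3.0                               # Eq. (9), sum term
        diff_sum += ad
        feats.append(ad)
    w0 = 1.0 - diff_sum / 3.0
    smooth += w0 * img                                        # Eq. (9), w0 term
    return np.concatenate([smooth] + feats, axis=2)

def structural_features(gray):
    """Eq. (11), reduced: Haar sub-bands resized to image size, with
    3x3 local mean and an isotropic 3x3 variance per band."""
    cA, (cH, cV, cD) = pywt.dwt2(gray, 'haar')
    h, w = gray.shape
    feats = []
    for band in (cA, cH, cV, cD):
        band = cv2.resize(band, (w, h))
        mean = uniform_filter(band, 3)
        var = uniform_filter(band ** 2, 3) - mean ** 2
        feats += [mean, var]
    return np.stack(feats, axis=2)
```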

In order to generate the texture features, we reduce the 32-dimensional feature vector through a two-stage dimension-reduction process. The first stage involves the application of principal component analysis (PCA), where 95% of the information is retained. In the second stage, we wish to take advantage of the label information available from the trimap, in terms of the known foreground and background color values of pixels, and apply Linear Discriminant Analysis (LDA) to further reduce the dimensions. Thus, the foreground and background pixels are clustered into m and n clusters, respectively, where the numbers of clusters are determined by the Akaike Information Criterion (AIC). The top projections (eigenvectors corresponding to the three largest eigenvalues) that represent the best separation between clusters constitute the reduced dimensions for the known data, in such a way that 90% of the information is retained. The selected eigenvectors in PCA and projections in LDA are used to reduce the dimensions of the texture feature vector for unknown pixels, to ensure that the same dimension-reduction process is applied to both known and unknown pixels. The extracted texture feature of the Net image, shown in the last column of the second row of Fig. 4, indicates how successfully the chromatic and structural information are combined and transformed so as to increase the distinction between foreground and background regions.
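A sketch of the two-stage reduction with scikit-learn follows; the AIC search range and the mapping of the foreground/background cluster labels into a single LDA problem are our assumptions (note that LDA needs at least four clusters in total to return three projections):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.mixture import GaussianMixture

def reduce_texture_features(X_known, X_unknown, fg_mask, max_clusters=5):
    """PCA keeping 95% variance, then LDA on AIC-chosen cluster
    labels of the known pixels; both fitted transforms are reused
    for the unknown pixels."""
    pca = PCA(n_components=0.95).fit(X_known)
    Zk, Zu = pca.transform(X_known), pca.transform(X_unknown)

    def cluster(Z):
        # pick the GMM whose AIC is lowest
        gmms = [GaussianMixture(k).fit(Z) for k in range(1, max_clusters + 1)]
        best = min(gmms, key=lambda g: g.aic(Z))
        return best.predict(Z)

    labels = np.empty(len(Zk), dtype=int)
    labels[fg_mask] = cluster(Zk[fg_mask])        # m foreground clusters
    labels[~fg_mask] = cluster(Zk[~fg_mask]) + labels[fg_mask].max() + 1
    lda = LinearDiscriminantAnalysis(n_components=3).fit(Zk, labels)
    return lda.transform(Zk), lda.transform(Zu)   # 3-D texture features
```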

Fig. 4. Illustration of the texture components and the final extracted texture feature: original image, trimap and bilateral filtered images I^{BF}_{1-3} (first row); absolute differences I^{AbsDiff}_{1-3}, I^{Smooth} and the estimated texture (second row).

Fig. 5 shows how the texture feature can discriminate between foreground and background with overlapped color distributions. The original images, with foreground and background distributions in the red, green and blue channels, are shown in Fig. 5a, b, c and d, respectively. The images in Fig. 5a are taken from the standard training set proposed by [22]. The first dimensions of the extracted texture features and their histograms are shown in Fig. 5e and f; as expected, the distributions of foreground and background are more separated in texture space than in color space. For example, the foreground and background regions of the first and second images have similar color distributions, as shown in the first and second rows of Fig. 5b, c and d, while their texture distributions are clearly more distinct, as shown in Fig. 5e and f. The same holds for the rest of the images in the figure. The discrimination between foreground and background regions in Fig. 5e and f indicates the performance of the proposed texture feature in dealing with natural images. The proposed method uses the discriminative power of texture along with color information to estimate a robust matte when foreground and background have overlapped color distributions.

Fig. 5. Overlap distributions of foreground and background in color and texture spaces. (a) Original image. Distributions of foreground and background in the (b) red, (c) green and (d) blue channels. (e) First dimension of the texture feature as an image. (f) Texture distribution of foreground and background in (e).

5. Combining texture and color features

The combination of texture and color features for alpha matting entails the inclusion of an additional parameter that tests the compatibility of the color matte with the previously estimated texture matte α^T.



T_p: Although the texture matte by itself is not accurate over the entire image, those pixels for which it is correct should be included in the combination process with a higher weighting. The correctness is measured by its compatibility with the color-based α, given by

T_p(F_i, B_j, \alpha^T_p) = \alpha^T_p \cdot \alpha_p + (1 - \alpha^T_p) \cdot (1 - \alpha_p) \qquad (13)

where \alpha^T_p is the previously estimated alpha for pixel p based on texture features, obtained by optimizing Eq. (3), and \alpha_p is the estimated alpha for pixel p based on the (F_i, B_j) pair using Eq. (2). With the inclusion of the texture alpha, the objective function to select the best F-B pair given the estimated texture matte is modified as

O^2_p(F_i, B_j, \alpha^T_p) = O^1_p(F_i, B_j) \cdot T_p(F_i, B_j, \alpha^T_p). \qquad (14)

The block diagram of the proposed alpha matting system is shown in Fig. 6. The texture matte is obtained by applying the proposed color matting algorithm on the texture features using Eq. (3), followed by Eq. (8) to refine the matte. This texture matte is used in conjunction with the proposed color matting method through Eqs. (14) and (8) to obtain the final matte.

Fig. 6. Block diagram of the proposed alpha matting method: texture feature extraction; color matting on the texture features (Eq. (3)) with Laplacian post-processing to give the texture matte; then color matting (Eq. (14)) with Laplacian post-processing to give the final matte.
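The combined selection then scores each candidate pair by Eq. (14); a minimal sketch:

```python
import numpy as np

def objective_O2(Ip, F, B, alpha_T, O1):
    """Eq. (14): weight the color objective O1 of Eq. (3) by the
    compatibility T_p (Eq. (13)) between this pair's color-based
    alpha and the texture alpha alpha_T estimated earlier."""
    FB = F - B
    alpha = float(np.clip(np.dot(Ip - B, FB) / (np.dot(FB, FB) + 1e-12), 0, 1))
    T = alpha_T * alpha + (1.0 - alpha_T) * (1.0 - alpha)   # Eq. (13)
    return O1 * T
```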

6. Experimental results

The effectiveness of the proposed method in alleviating a major drawback of current matting methods is evaluated through comprehensive experiments. We demonstrate the inadequacy of color as a matting feature when there is significant overlap in F and B colors. The effect of texture resolution on matting quality is evaluated in the first experiment. The advantages of combining color and texture in matting are shown in the second experiment. In the third experiment, the effectiveness of the proposed method is quantitatively compared with other matting methods on the dataset of images developed by Rhemann et al. [22]. It consists of 8 images for which the ground truth alpha mattes are hidden from the public. For each image, three trimaps are provided: small, large (coarse) and user-defined. Examples of images and trimaps are shown in Fig. 7. We use these images to compare the performance of the proposed method with the other methods listed on the website, based on different measures: the mean square error (MSE), gradient error and the sum of absolute differences (SAD) between the extracted matte and the ground truth. The performance of the proposed method and state-of-the-art matting methods is evaluated on highly textured images in the fourth experiment. Applying the proposed method on only color or only texture features using Eq. (3), followed by Laplacian post-processing, is called color matting and texture matting, respectively; combining color and texture features through the proposed method using Eq. (14) is called combined matting in the following experiments.

6.1. Experiment 1: Effect of texture resolution on alpha matte

The proposed method takes advantage of texture information to complement color in the matting process. We show that although texture provides important additional information for the extraction of high quality mattes, the resolution at which the texture descriptors are computed does not affect the extracted matte. However, as expected, the matte extracted using only texture information is dependent on the texture resolution. The objective function defined in Eq. (14) combines texture and color information to find the best F-B pairs for unknown pixels. Alpha values are then estimated based only on the color components of the best F-B pairs, which are defined at pixel resolution. Thus, the resolution of the estimated mattes does not depend on the size of the texture patches.

We compare the raw mattes estimated by the proposed method for different texture patch sizes, without applying any refinement process. The original images with zoomed regions are shown in the first row of Fig. 8. The estimated mattes using only texture information (texture mattes) for the zoomed regions of the Net, Troll, Giraffe and Barbie images are shown in Fig. 8a, c, e and g, respectively. More details, such as hair strands and the structure of the net, are visible in the texture mattes with 3 × 3 patch size, as shown in the second row of Fig. 8a and c. The quality of the estimated texture mattes degrades when the size of the patches is increased, as shown in the third and fourth rows of Fig. 8a, c, e and g for patch sizes of 13 × 13 and 21 × 21, respectively.

The proposed method uses color and texture information to estimate mattes at pixel resolution regardless of the patch size of the texture descriptors. Results of the proposed method (combined mattes) with different texture patch sizes are shown in Fig. 8b, d, f and h for the Net, Troll, Giraffe and Barbie images, respectively. Fine details like hair strands or the net structure are clearly visible in the combined mattes, as shown in the second row of Fig. 8b, d, f and h. The quality of the combined mattes remains high even when detail information is lost with large texture patches, as shown in the fourth row of Fig. 8b, d, f and h. Thus, even when the texture patch sizes are changed, there is no marked change in the quality of the extracted matte.

6.2. Experiment 2: Analysis of the contribution of different components of the algorithm

This experiment illustrates, qualitatively and quantitatively, the improvement when the texture matte is combined with color to obtain the final matte. We also analyze the contributions of the spatial (E) and color (S) statistics terms of Eq. (3). In the original images shown in Fig. 9a, the backgrounds are designed so that there is significant overlap between the foreground and background distributions. The first dimension of the texture feature of the images is shown in Fig. 9b. The proposed alpha matting is applied on texture and color features individually to estimate the texture and color mattes shown in Fig. 9c and d. As expected, the texture matte is not as good as the color matte, because texture features are defined over a patch of pixels while color features are defined for single pixels. This is clear from the estimated mattes for the image in row 1 containing hair strands, where some parts of the background are wrongly estimated as foreground (Fig. 9c, d). The alpha matte for the background is more accurately estimated by combining color and texture through the proposed method, as shown in Fig. 9e. The estimated mattes for the plant leaves in row 4 are not accurate and some of the leaves are considered as foreground, while they are accurately estimated by combining the color and texture features.

A quantitative evaluation of the effect of combining the color and texture mattes is shown in Table 1, where the sum of absolute differences (SAD) between the generated matte and the ground truth is calculated. These results have been generated from the alpha matting website¹ using images in the database. It is evident that the combined matting method consistently outperforms either color matting or texture matting alone. The only exception is the Troll image, for which the color matte performs best. When the foreground consists of pixel-wide regions, e.g., hair strands, color plays a more important role than texture and hence color matting has a lower SAD, as seen in the Troll image.

¹ www.alphamatting.com.

Fig. 7. (a) The Troll image and (b) its trimaps (small, large and user-defined). (c) Images from the dataset.

The contributions of the E and S terms in the objective function of Eq. (3) are quantitatively evaluated in the last two columns of Table 1. When the E term, which is a function of the distance of a foreground or background block from the block containing the unknown pixel, is removed from the objective function, the bias towards larger separation of color distributions reflected in the S term plays an important part in addition to the chromatic distortion. The results without the E term show that the performance is poorer than when the E term is included. Similarly, when the S term is excluded, the objective function contains only the chromatic distortion term and the E term. In this case also, the performance is poorer for most of the images than when the S term is included. The exception is the Doll image, and the reason could be that in this particular image the distance factor is more relevant than the separation of color distributions, because the colors are already well separated for most parts of the unknown trimap.

Fig. 8. Effect of texture patch sizes on matte resolution: texture mattes and combined mattes for the Net, Troll, Giraffe and Barbie images with window sizes 3 × 3, 13 × 13 and 21 × 21.

Fig. 9. Effect of combining the color and texture mattes to obtain an improved alpha matte. (a) Original image, (b) visualization of the first dimension of the extracted texture features, (c) estimated mattes using only texture features, (d) estimated mattes using only color features, (e) estimated mattes using texture and color features by the proposed method.

Table 1. Sum of absolute differences for the texture, color, combined, combined-without-E-term and combined-without-S-term matting methods.

Image        Trimap   Texture   Color   Combined   w/o E   w/o S
Troll        Small    49.1      14.5    15.1       16.7    15.3
             Large    56.2      19.2    19.4       21.2    19.6
             User     51.7      17.3    18.0       19.0    18.4
Doll         Small    20.4       8.6     7.6        9.1     7.7
             Large    27.0      12.3    10.7       12.2    10.4
             User     20.1       8.7     8.0        9.6     7.9
Donkey       Small    12.6       5.1     4.5        5.0     4.5
             Large    14.9       7.1     5.9        6.6     6.0
             User     13.1       4.8     4.1        4.5     4.2
Elephant     Small     8.9       3.9     3.1        3.6     3.2
             Large    15.7       9.3     7.2        9.0     7.3
             User     10.2       5.5     4.3        4.6     4.5
Plant        Small    23.4       6.3     6.7        6.8     6.8
             Large    37.6      10.8    10.5       11.4    10.7
             User     41.1      14.6    14.3       14.7    14.3
Pineapple    Small    13.6       6.9     6.0        6.5     6.3
             Large    17.8      13.2    11.3       11.8    11.6
             User     14.5       8.0     7.2        7.1     7.5
Plastic bag  Small    26.8      24.4    24.1       25.1    24.3
             Large    32.2      26.5    25.8       28.6    27.0
             User     26.2      23.2    22.9       25.1    23.9
Net          Small    60.3      20.7    20.0       26.0    20.8
             Large    64.5      25.2    23.1       30.1    23.6
             User     60.4      22.9    21.8       26.5    22.6


6.3. Experiment 3: Evaluation on standard data set

For further evaluation, the proposed combined matting method is compared with the other techniques listed on the alpha matting website. Table 2 shows the MSE for all the images and all trimaps; the number in parentheses next to each MSE value indicates the method's rank for that particular image and trimap. The mattes estimated by the proposed combined method for the Donkey, Plastic bag, Plant, Net and Pineapple images with small trimaps are ranked 1st among all matting methods, and the method also achieves very good ranks for the other images with different trimaps. The average ranks of the proposed combined method for the small, large and user trimaps are 2.75, 2.5 and 2.5, respectively, and the overall rank of the proposed method for all images with different trimaps is 2.58, which is the best among all matting methods. The results of the different matting methods are also evaluated by SAD and gradient error, as shown in Table 3. The proposed combined method has an overall rank of 3.21 with respect to gradient error, which is the best, and an overall rank of 3.92 based on SAD, which is second after shared matting, whose overall rank is 3.13.

Once the foreground and background colors along with the alpha values of unknown pixels are estimated, the foreground object can be seamlessly composed with a new background by using the compositing equation and replacing the estimated background colors of pixels with the new background colors. Some of the images with trimaps are shown in the first column of Fig. 10, in which red and blue colors indicate the boundaries of the known foreground and background regions. The alpha and foreground colors estimated by the proposed combined method are shown in the second and third columns of Fig. 10. The composition of the foreground objects with a new background, together with a zoomed region, is shown in the last column of Fig. 10. The hair strands of Lady Face 1, Lady Face 2 and Doll 2 are smoothly combined with the new background, as shown in the zoomed regions in Fig. 10. The solid foreground of Doll 1 is seamlessly pasted onto the new background, as shown in the zoomed regions in the third row of Fig. 10.


Table 2. Comparison of some matting methods with the proposed one on the benchmark set of images with respect to MSE. For each image, the three values are the MSE for the small/large/user trimaps; the number in parentheses is the method's rank for that image and trimap.

1. Proposed matting (overall rank 2.58; avg. small 2.75, large 2.5, user 2.5): Troll 0.8(5) 1.2(2) 1.1(4); Doll 0.4(5) 0.5(6) 0.4(1); Donkey 0.2(1) 0.3(1) 0.2(1); Elephant 0.1(7) 0.3(5) 0.2(7); Plant 0.4(1) 0.5(2) 1.0(3); Pineapple 0.3(1) 0.6(2) 0.4(1); Plastic bag 1.5(1) 1.6(1) 1.4(2); Net 0.8(1) 0.9(1) 1.8(1).
2. Shared matting (overall 3.83; 3.63/4.25/3.63): Troll 0.5(1) 1.6(4) 0.9(1); Doll 0.5(7) 0.9(7) 0.5(6); Donkey 0.3(2) 0.4(2) 0.3(6); Elephant 0.1(5) 0.4(7) 0.2(4); Plant 0.4(2) 0.6(3) 0.9(1); Pineapple 0.4(2) 0.6(1) 0.5(2); Plastic bag 2.9(8) 2.8(7) 2.7(7); Net 1.0(2) 1.3(3) 1.0(2).
3. Segmentation-based matting (overall 4.42; 4.5/4.63/4.13): Troll 0.5(3) 2.2(8) 1.1(3); Doll 0.3(3) 0.4(1) 0.4(2); Donkey 0.3(7) 0.4(7) 0.3(3); Elephant 0.1(1) 0.2(1) 0.2(1); Plant 0.6(6) 0.5(1) 1.2(5); Pineapple 0.4(4) 0.7(3) 0.8(5); Plastic bag 2.7(6) 3.3(10) 3.6(9); Net 1.4(6) 1.9(6) 1.6(5).
4. Improved color matting (overall 4.67; 4.38/5.38/4.25): Troll 0.8(6) 2.4(10) 1.5(8); Doll 0.3(2) 0.5(5) 0.5(5); Donkey 0.3(3) 0.4(8) 0.3(2); Elephant 0.1(6) 0.3(4) 0.2(6); Plant 0.7(8) 0.7(4) 0.9(2); Pineapple 0.4(3) 0.7(4) 0.7(4); Plastic bag 2.0(4) 1.9(4) 1.4(3); Net 1.3(3) 1.5(4) 1.5(4).
5. Learning based matting (overall 6.04; 6.5/5.25/6.38): Troll 0.8(7) 1.6(3) 1.3(7); Doll 0.4(6) 0.4(3) 0.5(4); Donkey 0.3(8) 0.4(6) 0.3(5); Elephant 0.1(4) 0.2(2) 0.2(3); Plant 0.5(5) 1.2(7) 2.0(8); Pineapple 0.9(11) 1.7(10) 2.0(12); Plastic bag 1.6(3) 1.7(2) 1.0(1); Net 2.2(8) 2.7(9) 4.2(11).
6. Closed-form matting (overall 6.17; 5.5/5.5/7.5): Troll 0.5(2) 1.8(7) 1.1(5); Doll 0.3(1) 0.4(2) 0.6(7); Donkey 0.3(5) 0.4(5) 0.3(4); Elephant 0.1(2) 0.3(3) 0.2(5); Plant 1.2(11) 1.4(10) 2.3(11); Pineapple 0.8(10) 1.6(9) 1.6(10); Plastic bag 3.0(9) 2.7(6) 1.9(5); Net 1.3(4) 1.2(2) 5.0(13).
7. Shared matting (real time) (overall 6.33; 6.75/6.38/5.88): Troll 0.6(4) 1.7(6) 1.0(2); Doll 0.7(10) 1.2(9) 0.8(9); Donkey 0.3(6) 0.4(3) 0.3(9); Elephant 0.2(11) 0.6(10) 0.2(9); Plant 0.5(3) 0.8(5) 1.1(4); Pineapple 0.4(5) 0.8(5) 0.6(3); Plastic bag 3.0(10) 3.0(8) 3.0(8); Net 1.3(5) 1.9(5) 1.5(3).
8. Large kernel matting (overall 6.88; 7.5/6.5/6.63): Troll 1.2(10) 1.6(5) 1.7(9); Doll 0.3(4) 0.5(4) 0.4(3); Donkey 0.4(10) 0.5(10) 0.3(10); Elephant 0.1(8) 0.5(9) 0.2(8); Plant 0.9(9) 1.1(6) 1.4(6); Pineapple 0.6(7) 1.1(6) 1.0(6); Plastic bag 2.7(5) 2.4(5) 1.7(4); Net 2.1(7) 2.0(7) 2.8(7).
9. Robust matting (overall 7.67; 6.5/8.25/8.25): Troll 1.1(9) 2.8(12) 1.7(10); Doll 0.7(9) 1.5(10) 0.9(10); Donkey 0.3(4) 0.4(9) 0.3(7); Elephant 0.1(9) 0.5(8) 0.3(10); Plant 0.5(4) 1.2(8) 1.9(7); Pineapple 0.5(6) 1.5(8) 1.2(8); Plastic bag 1.5(2) 1.8(3) 2.6(6); Net 2.4(9) 2.3(8) 2.9(8).
10. High res matting (overall 8.71; 8.38/9.5/8.25): Troll 1.3(11) 2.2(9) 2.2(11); Doll 0.5(8) 1.1(8) 0.8(8); Donkey 0.3(9) 0.4(4) 0.3(8); Elephant 0.1(3) 0.7(11) 0.2(2); Plant 0.6(7) 1.2(9) 2.2(9); Pineapple 0.8(8) 2.0(12) 1.4(9); Plastic bag 3.2(11) 3.4(12) 4.2(13); Net 2.6(10) 4.3(11) 2.2(6).
11. Iterative BP matting (overall 11.4; 10.8/11.4/12): Troll 1.7(12) 2.6(11) 2.3(12); Doll 1.5(13) 2.6(13) 2.3(14); Donkey 0.5(11) 0.7(12) 0.4(11); Elephant 0.2(10) 0.8(12) 0.4(12); Plant 1.1(10) 2.0(11) 3.1(12); Pineapple 1.0(12) 2.0(11) 1.6(11); Plastic bag 2.8(7) 3.3(11) 4.5(14); Net 3.0(11) 3.8(10) 3.6(10).
12. Random walk matting (overall 11.5; 12.4/10.5/11.5): Troll 1.0(8) 1.1(1) 1.2(6); Doll 1.0(11) 1.7(11) 1.1(11); Donkey 0.5(12) 0.6(11) 0.6(12); Elephant 0.2(12) 0.4(6) 0.3(11); Plant 2.0(15) 3.4(14) 4.2(14); Pineapple 1.6(13) 2.3(13) 2.1(13); Plastic bag 4.6(14) 4.4(14) 4.0(11); Net 8.3(14) 9.4(14) 8.5(14).
13. Geodesic matting (overall 12.3; 12.8/12/12.1): Troll 2.4(14) 4.6(14) 3.4(15); Doll 1.5(12) 1.8(12) 1.9(12); Donkey 1.6(15) 2.1(16) 1.1(14); Elephant 0.8(15) 1.9(14) 0.9(15); Plant 1.8(12) 2.5(12) 2.2(10); Pineapple 0.8(9) 1.4(7) 1.1(7); Plastic bag 3.3(12) 3.2(9) 4.1(12); Net 3.8(13) 4.3(12) 4.2(12).
14. Bayesian matting (overall 13.7; 13.8/14.3/13.1): Troll 3.0(15) 4.6(15) 3.4(14); Doll 2.3(14) 3.2(14) 2.1(13); Donkey 1.4(14) 1.5(14) 1.2(16); Elephant 0.7(14) 2.1(15) 0.6(13); Plant 1.8(13) 4.2(15) 5.2(15); Pineapple 2.2(15) 4.6(15) 2.9(15); Plastic bag 3.4(13) 3.9(13) 3.9(10); Net 3.7(12) 8.6(13) 3.5(9).
15. Easy matting (overall 14; 14/14/14): Troll 2.2(13) 3.7(13) 3.3(13); Doll 2.4(15) 3.2(15) 2.9(15); Donkey 0.7(13) 0.9(13) 0.6(13); Elephant 0.5(13) 1.4(13) 0.6(14); Plant 1.8(14) 2.5(13) 4.1(13); Pineapple 1.7(14) 2.7(14) 2.4(14); Plastic bag 5.4(15) 5.4(15) 4.7(15); Net 10.0(15) 15.9(16) 16.3(15).
16. Poisson matting (overall 15.9; 16/15.8/15.9): Troll 6.9(16) 7.5(16) 7.1(16); Doll 4.7(16) 7.7(16) 5.3(16); Donkey 1.7(16) 1.9(15) 1.1(15); Elephant 1.6(16) 2.5(16) 1.4(16); Plant 3.5(16) 6.3(16) 11.5(16); Pineapple 3.6(16) 5.7(16) 3.9(16); Plastic bag 6.1(16) 9.4(16) 6.8(16); Net 19.4(16) 11.0(15) 21.6(16).


Table 3. Ranks of different matting methods for different trimaps based on SAD and gradient error. Each row gives the average rank for the small, large and user trimaps, followed by the overall rank.

Sum of absolute differences (avg. small / avg. large / avg. user / overall):
1. Shared matting: 3.125 / 3.75 / 2.5 / 3.13
2. Proposed matting: 4.5 / 3.75 / 3.5 / 3.92
3. Segmentation-based matting: 4.125 / 4.375 / 4.0 / 4.17
4. Improved color matting: 4.375 / 4.625 / 4.875 / 4.63
5. Shared matting (real time): 5.25 / 5.625 / 5.125 / 5.33
6. Learning based matting: 5.75 / 5.25 / 6.25 / 5.75
7. Closed-form matting: 5.625 / 5.125 / 6.875 / 5.88
8. Large kernel matting: 7.75 / 6.75 / 7.0 / 7.17
9. Robust matting: 6.75 / 8.75 / 8.5 / 8.0
10. High-res matting: 8.875 / 10.25 / 8.875 / 9.33
11. Random walk matting: 12.25 / 10.25 / 11.75 / 11.4
12. Geodesic matting: 12.63 / 11.75 / 11.75 / 12.0
13. Iterative BP matting: 11.88 / 12.75 / 13.0 / 12.5
14. Easy matting: 13.13 / 12.875 / 13.0 / 13.0
15. Bayesian matting: 14.0 / 14.5 / 13.25 / 13.9
16. Poisson matting: 16.0 / 15.625 / 15.75 / 15.8

Gradient error (avg. small / avg. large / avg. user / overall):
1. Proposed matting: 3.38 / 3.0 / 3.25 / 3.21
2. Shared matting: 3.13 / 3.75 / 3.13 / 3.33
3. Segmentation-based matting: 4.38 / 3.13 / 3.0 / 3.5
4. Improved color matting: 4.25 / 3.75 / 3.38 / 3.79
5. Closed-form matting: 6.63 / 6.38 / 7.0 / 6.67
6. Learning based matting: 6.38 / 6.63 / 7.13 / 6.71
7. Robust matting: 6.13 / 6.75 / 7.88 / 6.92
8. High-res matting: 6.75 / 8.88 / 6.75 / 7.46
9. Large kernel matting: 8.0 / 7.88 / 7.38 / 7.75
10. Shared matting (real time): 8.0 / 7.25 / 7.75 / 7.67
11. Random walk matting: 11.9 / 10.9 / 11.6 / 11.5
12. Iterative BP matting: 10.8 / 11.5 / 11.8 / 11.3
13. Geodesic matting: 13.3 / 12.6 / 12.6 / 12.8
14. Bayesian matting: 13.4 / 14.1 / 13.6 / 13.7
15. Easy matting: 13.8 / 13.9 / 13.8 / 13.8
16. Poisson matting: 16.0 / 15.6 / 16.0 / 15.9


The quality of the composition image of the peacock, with its complicated feather structure, shown in the zoomed regions in the first row of Fig. 10, indicates how effectively foreground colors are estimated for pixels using the proposed method.

A visual comparison of the proposed combined matting method with color sampling based matting methods like shared [8], robust [9] and Bayesian [7] matting, and with alpha propagation based methods like closed-form matting [4], is shown in Fig. 11. Original images and their zoomed portions are shown in Fig. 11a and b. The extracted mattes for the zoomed areas using the proposed combined, shared, robust, closed-form and Bayesian matting methods are shown in Fig. 11c-g, respectively. In the Troll image, the foreground and background have similar color distributions, and therefore some parts of the bridge are wrongly estimated as foreground by the different matting methods. This problem is alleviated by combining color and texture through the proposed method.

In the Pineapple image, the color distribution of the pineapple leaves overlaps heavily with that of the background, causing parts of the leaves to be classified as background by the shared, robust and Bayesian matting methods. The propagation of alpha is also blocked by the strong edges of the leaves, so closed-form matting wrongly estimates background areas as foreground. The proposed method leverages the texture distinction between the leaves and the background, along with the color information, to estimate a more accurate alpha matte. Another challenging image is the Plastic bag, in which the metallic wire tied around the bag has a different color from the foreground, making it hard for color sampling based methods to estimate it as foreground. In addition, closed-form matting performs poorly because alpha propagation is blocked by the strong edges of the metallic wire. The proposed combined method performs well and estimates the matte for both the Plastic bag and the metallic wire as foreground.

6.4. Experiment 4: Evaluation on highly textured images

This experiment is designed to illustrate the effectiveness of the proposed combined matting method on highly textured images. To this end, a new set of 15 images (some taken from the training set proposed by [22]) is synthesized in such a way that foreground and background have similar color distributions (Fig. 12). The proposed combined method is compared with the shared [8], robust [9] and closed-form [4] matting methods over the new image set with three types of trimaps, called small, large and very large. In rows 1 and 2 of Fig. 13, the foreground regions have color distributions similar to the background ones; e.g., the petals of the flower in the first image have colors similar to the textured background, and the color sampling methods of shared and robust matting perform poorly (Fig. 13d, e) compared to the proposed combined method, which uses the texture feature to complement color (Fig. 13c). The foreground object in the image in the second row (Fig. 13a) contains holes through which the background, which has a similar color, is visible. The propagation of alpha is blocked by the strong edges of the foreground, and the holes are considered foreground by closed-form matting (Fig. 13f), whereas they are visible in the ground truth matte (Fig. 13b). The color similarity of foreground and background makes it hard for color sampling based matting methods to estimate accurate alpha mattes, while combining color and texture information through the proposed method leads to the most accurate mattes (Fig. 13c).

For further evaluation, the matting methods are tested on more challenging images, such as a hairy doll in front of textured regions, as shown in the third and fourth rows of Fig. 13a. The alpha is not accurately propagated through the background regions that are partially occluded by the doll's hairs, as shown in Fig. 13f for closed-form matting. Moreover, the mattes estimated by the color sampling based methods are not accurate: some parts of the textured background are wrongly estimated as foreground, as shown in Fig. 13d and e for shared and robust matting. The proposed method still estimates the most accurate mattes, as shown in the zoomed regions in row 4 of Fig. 13c.

The most challenging images are fuzzy images, in which foreground regions blend gradually into the background, as shown in rows 5 and 6 of Fig. 13a. The color and texture of foreground and background are blended together, producing a new color/texture in the fuzzy region that differs from both foreground and background, as seen in the zoomed window of Fig. 13a. Here, the background has vertical red lines, the foreground has diagonal red lines, and the fuzzy region has both vertical and diagonal red lines, generating a different texture pattern. This phenomenon makes fuzzy images challenging for alpha matting, especially when foreground and background have similar color distributions. Robust and shared matting cannot properly estimate alpha mattes for fuzzy regions, as shown in Fig. 13d and e. Alpha propagation also does not yield a good matte, due to the blended textures in the fuzzy regions, as shown in Fig. 13f. The proposed method smoothly estimates alpha mattes for fuzzy regions, as shown in rows 5 and 6 of Fig. 13c.

The quality of the mattes estimated for the 15 synthesized images over the three types of trimaps (small, large and very large) by the shared, robust, closed-form and proposed matting methods is quantitatively evaluated according to the MSE of [9], as shown in Fig. 14. The proposed combined method performs best, with the least variance in MSE compared to the other methods, irrespective of the size of the trimaps. The performance of the proposed method in estimating high quality mattes for textured images shows the potential of texture when it is used as a feature complementary to color.
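For readers reimplementing this evaluation, a minimal sketch of an MSE measure restricted to the unknown trimap region is given below. The 0/128/255 trimap encoding and the function name are assumptions of this sketch rather than the precise protocol of [9].

import numpy as np

def matte_mse(estimated, ground_truth, trimap, unknown_value=128):
    # Restrict the error to the unknown (gray) region of the trimap,
    # since pixels outside it are fixed by the trimap itself.
    unknown = trimap == unknown_value
    diff = estimated[unknown].astype(np.float64) - ground_truth[unknown]
    return float(np.mean(diff ** 2))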


Fig. 11. (a) Original image, (b) zoomed window, (c) estimated mattes by the proposed combined matting method, (d) estimated mattes by shared matting [8], (e) estimated mattes by robust matting [9], (f) estimated mattes by closed-form matting [4], (g) estimated mattes by Bayesian matting [7]. Rows (top to bottom): Plastic Bag, Troll, Pineapple.

Fig. 10. (a) Original images, (b) estimated matte by the proposed method, (c) estimated foreground color by the proposed method, (d) composited image, (e) zoomed region of the composited image. Rows (top to bottom): Peacock, Gandalf, Doll #1, Doll #2, Lady Face #1, Lady Face #2.


Fig. 12. New database of 15 synthesized images with three types of trimap (small, large and very large) and ground truth.

Fig. 13. Qualitative evaluations of the shared [8], robust [9], closed-form [4] and combined matting methods on synthesized images with similar color distributions in foreground and background. Columns: (a) image, (b) ground truth, (c) combined, (d) shared, (e) robust, (f) closed-form.


Fig. 14. Quantitative evaluations of the shared, robust, closed-form and combined matting methods on the synthesized images of Fig. 12 with (a) small, (b) large and (c) very large trimaps, with respect to MSE calculated according to [9].

Fig. 15. (a) Original low contrast images (Donkey and Rabbit), (b) trimap, (c) ground truth matte, (d) estimated matte by the proposed combined method.



7. Limitations

The proposed matting method takes advantage of texture information to discriminate between foreground and background regions when they have similar colors. However, its quality degrades when both texture and color are unable to discriminate between the F and B regions, owing to similarity in both the color and texture spaces. To illustrate the problem, the contrast of two images from [22] is reduced, as shown in the first column of Fig. 15. The rabbit and the donkey have colors similar to the background; moreover, the texture information is unreliable due to the low contrast of the images. Thus, the proposed method fails to estimate high quality mattes, as shown in the last column of Fig. 15. In such cases, the result can be improved by refining the trimap, which requires more user effort.
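One rough way to anticipate such failures is to measure how strongly the known foreground and background distributions overlap before matting. The sketch below computes a Bhattacharyya coefficient between per-channel intensity histograms; this diagnostic is our illustration, not part of the proposed method, and the histogram settings are arbitrary choices.

import numpy as np

def channel_overlap(fg_pixels, bg_pixels, bins=32):
    # Bhattacharyya coefficient between foreground and background
    # histograms of one channel: 0 = fully separable, 1 = identical.
    # High values on every channel (e.g. after the contrast reduction
    # of Fig. 15) signal that color sampling is likely to fail.
    h_f, edges = np.histogram(fg_pixels, bins=bins, range=(0, 256))
    h_b, _ = np.histogram(bg_pixels, bins=edges)
    h_f = h_f / (h_f.sum() + 1e-12)  # normalize to probability mass
    h_b = h_b / (h_b.sum() + 1e-12)
    return float(np.sum(np.sqrt(h_f * h_b)))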

8. Conclusion

This paper addresses the problem of matting when the color distributions of the foreground and background regions overlap. The overlap causes methods belonging to the color sampling approach to select samples that may not be representative of the foreground or background region. Further, in the presence of the strong edges of highly textured regions, the propagation of α from known pixels to similar unknown pixels can be blocked. Our approach consists of judiciously choosing samples so that these problems are alleviated. First, we present a color matting method in which an objective function incorporating the color and spatial statistics of the image is minimized in a brute-force fashion; the same procedure, applied to the extracted texture features, yields a texture matte. The texture matte is then leveraged to improve the matte by combining color and texture information through a new objective function. Experiments show that combining color and texture information leads to improved matte extraction. Comparison with other state of the art methods on a standard benchmark dataset reveals that the proposed method obtains the first ranks with respect to mean square and gradient errors, and the second rank, after shared matting, with respect to the sum of absolute differences. However, we have shown that shared matting does not perform well on complex images that contain highly textured regions.
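As a pointer for reimplementation, the sketch below shows the sampling core in its simplest form: a brute-force search over candidate (F, B) pairs scored only by the color fit implied by Eq. (2). This is a deliberate simplification under our own assumptions; the paper's actual objective additionally includes spatial and texture components, and all names here are hypothetical.

import numpy as np

def best_pair_alpha(pixel, fg_candidates, bg_candidates):
    # pixel, F, B are length-3 color vectors; candidates are iterables
    # of such vectors gathered from the known trimap regions.
    best = (None, None, 0.0, np.inf)
    for F in fg_candidates:
        for B in bg_candidates:
            d = F - B
            denom = np.dot(d, d)
            if denom < 1e-12:  # degenerate pair (F == B), skip it
                continue
            # Eq. (2): project the pixel color onto the F-B line.
            alpha = np.clip(np.dot(pixel - B, d) / denom, 0.0, 1.0)
            # Color fit: distance of the pixel from its reconstruction.
            residual = np.linalg.norm(pixel - (alpha * F + (1 - alpha) * B))
            if residual < best[3]:
                best = (F, B, alpha, residual)
    return best  # (F, B, alpha, fit residual)

In the full method, the residual above would be one term of the objective, combined with the spatial and texture terms before the best pair is selected.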

References

[1] T. Porter, T. Duff, Compositing digital images, Proc. SIGGRAPH, vol. 18, 1984, pp. 253–259.

[2] P. Lee, Y. Wu, Nonlocal matting, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011, pp. 2193–2200.

[3] J. Wang, M. Cohen, An iterative optimization approach for unified image segmentation and matting, Proc. International Conference on Computer Vision, vol. 2, 2005, pp. 936–943.

[4] A. Levin, D. Lischinski, Y. Weiss, A closed-form solution to natural image matting, IEEE Trans. Pattern Anal. Mach. Intell. 30 (2007) 228–242.

[5] S. Yu, J. Shi, Multiclass spectral clustering, Proc. International Conference on Computer Vision, vol. 1, 2003, pp. 313–319.

[6] M. Ruzon, C. Tomasi, Alpha estimation in natural images, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, 2000, pp. 18–25.

[7] Y. Chuang, B. Curless, D. Salesin, R. Szeliski, A Bayesian approach to digital matting, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, 2001, pp. 7–15.

[8] E. Gastal, M. Oliveira, Shared sampling for real time alpha matting, Proc. Eurographics, vol. 29, 2010, pp. 575–584.

[9] J. Wang, M. Cohen, Optimized color sampling for robust matting, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2007, pp. 1–8.

[10] J. Wang, M. Cohen, Image and video matting: a survey, Found. Trends Comput. Graph. Vis. 3 (2007) 97–175.

[11] K. He, C. Rhemann, C. Rother, X. Tang, J. Sun, A global sampling method for alpha matting, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011, pp. 2049–2056.

[12] Y. Mishima, Soft edge chroma-key generation based upon hexoctahedral color space, U.S. Patent 5,355,174, 1994.


[13] C. Rhemann, C. Rother, A. Rav-Acha, T. Sharp, High resolution matting via interactive trimap segmentation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008, pp. 1–8.

[14] A. Levin, R. Acha, D. Lischinski, Spectral matting, IEEE Trans. Pattern Anal. Mach. Intell. 30 (2008) 1699–1712.

[15] L. Grady, T. Schiwietz, S. Aharon, R. Westermann, Random walks for interactive alpha-matting, Proc. Visualization, Imaging, and Image Processing, 2005, pp. 423–429.

[16] X. Bai, G. Sapiro, A geodesic framework for fast interactive image and video segmentation and matting, Proc. International Conference on Computer Vision, 2007, pp. 1–8.

[17] J. Sun, J. Jia, C. Tang, H. Shum, Poisson matting, ACM Trans. Graph. (TOG) 23 (2004) 315–321.

[18] Y. Zheng, C. Kambhamettu, Learning based digital matting, Proc. 12th International Conference on Computer Vision, 2009, pp. 889–896.

[19] C. Rhemann, C. Rother, M. Gelautz, Improving color modeling for alpha matting, Proc. British Machine Vision Conference, 2009, pp. 1155–1164.

[20] C. Rhemann, C. Rother, P. Kohli, M. Gelautz, A spatially varying PSF-based prior for alpha matting, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, pp. 2149–2156.

[21] Y. Shen, Single complex image matting, MSc. Dissertation, University of Alberta, 2010.

[22] C. Rhemann, C. Rother, J. Wang, M. Gelautz, P. Kohli, P. Rott, A perceptually motivated online benchmark for image matting, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009, pp. 1826–1833.

[23] J. Cohen, Statistical Power Analysis for the Behavioral Sciences, Lawrence Erlbaum, 1988.

[24] M. Mirmehdi, X. Xie, J. Suri, Handbook of Texture Analysis, Imperial College Press, 2008.