ORIGINAL ARTICLE
Multiscale Bayesian texture segmentation using neural networksand Markov random fields
Tae Hyung Kim · Il Kyu Eom · Yoo Shin Kim
Received: 21 December 2006 / Accepted: 20 December 2007 / Published online: 8 January 2008
© Springer-Verlag London Limited 2008
Abstract This paper presents a wavelet-based texture
segmentation method using multilayer perceptron (MLP)
networks and Markov random fields (MRF) in a multi-scale
Bayesian framework. Inputs and outputs of MLP networks
are constructed to estimate a posterior probability. The
multi-scale features produced by multi-level wavelet
decompositions of textured images are classified at each
scale by maximum a posteriori (MAP) classification and the
posterior probabilities from MLP networks. An MRF
model is used in order to model the prior distribution of
each texture class, and a factor, which fuses the classifi-
cation information through scales and acts as a guide for
the labeling decision, is incorporated into the MAP clas-
sification of each scale. By fusing the multi-scale MAP
classifications sequentially from coarse to fine scales, our
proposed method gets the final and improved segmentation
result at the finest scale. In this fusion process, the MRF
model serves as the smoothness constraint and the Gibbs
sampler acts as the MAP classifier. Our texture segmen-
tation method was applied to segmentation of gray-level
textured images. The proposed segmentation method
shows better performance than texture segmentation using
the hidden Markov trees (HMT) model and the HMTseg
algorithm, which is a multi-scale Bayesian image seg-
mentation algorithm.
Keywords Neural networks · Multi-scale Bayesian texture segmentation · Wavelet transform · Markov random fields · Gibbs sampler
1 Introduction
Visual textures play important roles in many computer-vision and image-processing applications, since most real images contain texture. Texture segmentation involves partitioning an image into differently textured regions and
identifying the boundaries between different textures in an
image. Applications for texture segmentation include aerial
photo segmentation, document segmentation and analysis,
texture-based object detection and segmentation, recovery
of shape information from an image, image compression,
and content-based image retrieval. Approaches to texture
segmentation are analogous to methods for image
segmentation [1]. Therefore, approaches to texture seg-
mentation can be categorized into region-based and
boundary-based methods [2]. In texture analysis, region-
based analysis is effective in terms of information extrac-
tion and noise suppression. Therefore, researchers have
concentrated on region-based approaches to texture seg-
mentation. The texture description or feature extraction
method is one of the important factors for texture seg-
mentation [1]. The texture segmentation method depends
on the texture description method. Of all methods regard-
ing texture feature extraction, signal processing methods
T. H. Kim
Agency for Defense Development, Technology 3-3,
YuSeong, PO Box 35-3, DaeJeon 305-600, South Korea
e-mail: [email protected]
I. K. Eom (&)
Department of Electronics Engineering, Pusan National
University, 30 Jangjeon-dong, Geumjeong-gu,
Pusan 609-735, South Korea
e-mail: [email protected]
Y. S. Kim
Research Institute of Computer, Information
and Communication, Pusan National University,
30 Jangjeon-dong, Geumjeong-gu, Pusan 609-735, South Korea
e-mail: [email protected]
Neural Comput & Applic (2009) 18:141–155
DOI 10.1007/s00521-007-0167-x
are attractive due to their simplicity. In addition, for signal
processing methods, the multi-channel filtering methods
are supported by psychophysical research. This research
has demonstrated that the human brain does a frequency
analysis of the image [3]. Multi-channel filtering theory
was inspired by the multi-channel filtering mechanism in
neurophysiology [4, 5]. The concept of multi-channel fil-
tering, which is referred to as multi-resolution processing
in the literature, was further refined and developed in
multi-scale processing.
A long-term trend has been the incorporation of multi-
scale techniques into the texture segmentation algorithm,
and there have been evident advantages in utilizing multi-
scale texture analysis [6–13]. The use of both global and
local information of images through multi-scale processing
offers improved segmentation accuracy. In addition,
Bayesian approaches to texture segmentation have become
popular because they form a natural framework for inte-
grating both statistical models of image behavior and
contextual information [6–8, 11–15]. Not surprisingly,
there has been considerable interest in combining both
Bayesian and multi-scale techniques into a single frame-
work [6, 7, 11–13, 16, 17]. This framework can be called
the multi-scale Bayesian framework. Multi-scale Bayesian
approaches to texture segmentation have attracted
increasing attention because they form a natural framework
for integrating both global and local information of image
behavior, together with contextual information. Texture segmentations in multi-scale images trade off reliability against boundary detail according to scale. The
segmentation in coarse scales is accurate for large, homo-
geneous regions but poor along the boundaries between
regions. The segmentation in fine scales is accurate near
the boundaries between regions but has poor classification
reliability due to the paucity of statistical information. A
recent study demonstrated that texture segmentation in the
finest scale is improved by using a multi-scale Bayesian
image segmentation algorithm. This is called the HMTseg,
which fuses segmentations in multi-scale images [13].
Texture characterization based on the discrete wavelet
transform (DWT), which integrates the above aspects, has
attracted much attention and was found useful for a variety
of texture applications [1, 13, 14, 16, 18–21]. The complex
wavelet transform with a better shift-invariant property
than the decimated DWT has recently begun to receive
attention for texture segmentation [17].
In the multi-scale Bayesian framework for texture seg-
mentation, the probability model of the features extracted
from texture images is very important. A number of sta-
tistical models have been developed for modeling textures.
In one important study [20], the non-Gaussian wavelet
coefficient marginal statistics observed in natural images
were characterized by the wavelet-domain hidden Markov
trees (HMT) model. In the wavelet-domain HMT model,
the histogram distribution of wavelet coefficients is mod-
eled by using a two-component Gaussian mixture model
and a Markov chain. Neural networks can represent any
distribution of inputs without complicated modeling
methods [22–25]. It is possible to compose neural networks so that their outputs represent a posteriori probabilities when they are trained in a supervised mode [23–25]. Maximum a posteriori (MAP) classification can then use the neural network outputs. A method using neural
networks and the HMTseg algorithm [13] in the multi-scale
wavelet domain was studied for texture segmentation [12].
In this paper, based on Bayesian estimation and the
wavelet-based multi-scale analysis of images, we present a
new texture segmentation method using multilayer per-
ceptron (MLP) networks, Markov random fields (MRF),
and a Gibbs sampler. In the multi-scale Bayesian frame-
work, a factor, which fuses the classification information
through scales and acts as a guide for the labeling decision,
is also incorporated into our proposed texture segmentation
algorithm. We consider the so-called supervised segmen-
tation problem, in which the number of textures and the
parameters of their associated model are known or estimated before segmentation. MLP networks have generally shown good classification performance on many kinds of data. In the multi-scale wavelet domain, MLP networks can
show better classification performance than the HMT
model [12]. We incorporate MLP networks in the multi-
scale Bayesian texture segmentation. Our proposed method
uses a posterior probability obtained from MLP networks,
which are used for modeling the statistics of multi-scale
wavelet coefficients. The statistical dependencies across the three DWT subbands (HL, LH, and HH) are taken into account in designing the input structure of the MLP networks. By fusing the classification information of all
scales sequentially from coarse to fine scales, our method
gets the final and improved segmentation result at the finest
scale. The sequential estimation methods through scales
have been frequently studied [7, 13]. Our sequential seg-
mentation method is motivated by sequential estimation
methods. In the concept of sequential estimation through
scales, we present a method for fusing the information
between scales. Our method formulates the interscale
dependency by defining the context vector, which has
contextual information extracted from the adjacent coarser
scale segmentation. The fusion process is executed by
computing the MAP classification given the probability
values of texture features at one scale and a priori knowl-
edge regarding contextual information which is extracted
from the adjacent coarser scale segmentation. In this fusion
process of each scale, our method uses an MRF prior model
as the local smoothness constraint, and a Gibbs sampler as
the MAP classifier. A Gibbs sampler has been used as the
MRF parameter estimator, or used as the texture synthe-
sizer for generating a texture from an MRF texture model
[26, 27]. But our method uses a Gibbs sampler as the MAP
classifier. In addition, the Gibbs sampler integrates the
above mentioned factors; i.e., a posterior probability from
MLP networks, an MRF smoothness prior model, and
a priori knowledge regarding contextual information which
is extracted from the adjacent coarser scale segmentation.
Our texture segmentation method is applied to segment
gray-level textured images. Texture segmentation by the
proposed method performs better than the segmentation by
the HMT model and the HMTseg, and performs better than
texture segmentation by MLP networks and the HMTseg.
Experimental results also show that our fusion method for
obtaining more improved segmentation results at the finest
scale performs better than the HMTseg algorithm. The
organization of the paper is as follows. In Sect. 2, the
Bayesian approach to texture segmentation is reviewed,
and a brief description of our texture segmentation system
is provided. The input structures, output structures, and the use of MLP networks for wavelet-based texture segmentation are presented in Sect. 3. Texture segmentation using an MRF prior model, context model, and Gibbs sampler is presented in Sect. 4. Results on mosaic
textures are reported in Sect. 5. Section 6 is devoted to a
final discussion and conclusion.
2 Bayesian approach to texture segmentation
In this section, we explain the texture segmentation prob-
lem in the Bayesian framework. Then, matters regarding
Bayesian classifiers for texture segmentation are summa-
rized.
2.1 Texture segmentation in the Bayesian framework
Pixels or blocks of an image are formed on a lattice. Sites on a rectangular lattice of size $n \times n$ can be denoted by $\mathcal{M} = \{(x, y) \mid 1 \le x, y \le n\}$. Also, the location $(x, y)$ can be conveniently re-indexed by a single number $i$, where $i$ takes on values in $\{1, 2, \ldots, m\}$ with $m = n \times n$. Therefore, sites on the lattice can be indexed in $\mathcal{M} = \{1, 2, \ldots, m\}$. When
the texture segmentation problem is formulated in the
Bayesian framework, pixels or blocks of an image can be
regarded as a realization of a random field on the lattice
with distinct statistical behavior in different regions. Texture segmentation problems can be considered as a process of assigning class labels from a set of class labels, i.e., $\mathcal{L} = \{1, 2, \ldots, N_c\}$, to each of the sites (as indexed in $\mathcal{M}$) on the rectangular lattice on which the realization (pixels or blocks of an image) of the random field is formed. The labeling $C$ records the class label of each site on the lattice. Each element (a label) $c_i$ ($i \in \mathcal{M}$) in the labeling $C$ can be regarded as a mapping from $\mathcal{M}$ to $\mathcal{L}$; i.e., for all $i \in \mathcal{M}$, $c_i: \mathcal{M} \to \mathcal{L}$. In the terminology of random fields, a labeling $C$ is called a configuration. Therefore, texture segmentation problems can be rephrased as a process of finding the optimal configuration of the sites on the lattice on which pixels or blocks of an image are formed.
In the Bayesian framework, the optimal solution for
texture segmentation is given by a feasible configuration
$C^{*}$ of an image $G$ as follows:

$$C^{*} = \arg\max_{C \in \Omega} p(C \mid G), \qquad (1)$$

where $\Omega$ is the configuration space, which is the set of all possible configurations in terms of the class labels in $\mathcal{L}$, and $G = \{g_i \mid \forall i \in \mathcal{M}\}$ is the collection of all pixels (or blocks) $g_i$. In practice, because of the high complexity of
random variables on both G and C, it is computationally
intractable to calculate a posterior p(C|G). Therefore, we
must impose some presupposition on both the image G
and a posterior p(C|G). It is assumed that all ci are
conditionally independent, given G, and all gi are
independent and identically distributed (i.i.d.). It is also
assumed that each ci is distributed based on the distribution
$p(c_i \mid g_i)$ independently of all other $c_k$ and $g_k$, $k \ne i$. It
follows that the p(C|G) of Eq. 1 can be rewritten as
follows:
$$p(C \mid G) = \prod_{i \in \mathcal{M}} p(c_i \mid g_i). \qquad (2)$$
Equation 2 shows that the global posterior probability
p(C|G) is determined by the local posterior probabilities
p(ci |gi). In this study, we will use MLP networks to esti-
mate the local posterior probabilities (more details will be
provided in Sect. 3), and by the naive Bayes rule [28], prior
models as a guide for the labeling decision are integrated
into our segmentation method.
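Under these independence assumptions, the global MAP labeling decomposes into independent per-site decisions. The sketch below is only an illustration of this decomposition; the `posteriors` array layout is our own assumption, not the paper's code:

```python
import numpy as np

def map_configuration(posteriors):
    """Per-site MAP labeling under the factorization of Eq. (2).
    `posteriors` is a hypothetical (m, Nc) array whose row i holds
    the local posteriors p(c_i | g_i) for the Nc texture classes."""
    # argmax over classes at every site i yields the labeling C*
    return np.argmax(posteriors, axis=1)

# toy example: 3 sites, 2 texture classes
p = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.4, 0.6]])
labels = map_configuration(p)  # -> array([0, 1, 1])
```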
2.2 Prior models and multi-scale processing
Bayesian approaches to texture segmentation are able to
incorporate knowledge concerning the properties of the
class labels themselves into a prior p(C). In addition,
external factors, which act as a guide for the labeling
decision, can be incorporated into a prior p(C) of the
Bayesian classifier. Most segmentation algorithms employ
a classification window of some size in the hope that all
pixels in the window belong to the same class. Texture segmentations with large or small windows trade off reliability against boundary detail according to the size of the windows. The segmentation with large windows
is accurate for large, homogeneous regions but poor along
the boundaries between regions. The segmentation with
small windows is accurate near the boundaries between
regions but has poor classification reliability due to the
paucity of statistical information. To obtain a high-quality
segmentation result, clearly we should combine both the
large and small scale behaviors of images (both global and
local information of images) to benefit from both the
robustness of large windows and the resolution of small
windows. In segmentation algorithms using multi-scale
processing, the results of many classification windows of
different sizes are combined to obtain an accurate seg-
mentation at fine scales [6, 7, 11, 13, 16, 17]. Though the
orthogonal DWT can decorrelate an image with almost
uncorrelated wavelet coefficients, it is widely understood
that considerable high-order dependencies still exist in
wavelet coefficients, as observed from the characteristics of
wavelet coefficient distribution. In fact, considerable
attention has been given to the statistical image modeling
of high-order dependencies in the wavelet-domain with
applications for image coding [29], compression [30],
restoration [31], and segmentation. In multi-scale Bayesian segmentation, these high-order dependencies of wavelet coefficients can be captured by using context models.
Context models can be roughly categorized into three
groups: the interscale models [20], the intrascale models
[21], and the hybrid inter and intra scale models [31].
These models allow for more accurate image modeling and
more effective processing than other methods which
assume wavelet coefficients to be independent.
It is reasonable to assume that image pixels which are
spatially close are likely to be of the same texture. For this reason, most segmentation algorithms either implicitly or
explicitly impose some form of smoothness in the resulting
segmentation [8]. MRFs can be used to incorporate a priori
knowledge concerning the properties of the labels them-
selves [26, 27, 32, 33]. In our method, a prior probability
p(C) is modeled by using an MRF model, where the MRF
model forms an intrascale context model and acts as a
smoothing factor. In texture segmentation based on multi-
scale processing, some segmentation algorithms constructed
a prior model which can capture the interscale dependencies
between the image blocks and their class labels [7, 11, 13].
One study used the multi-scale prior model as a guide for the
decision of a labeling [13]. In the multi-scale Bayesian
texture segmentation of our study, a multi-scale context
model, which can be considered as a multi-scale prior
model, is constructed. These proposed multi-scale Bayesian
context models can capture complex aspects of both local
and global contextual behavior. More details will be pro-
vided in Sect. 4.
2.3 A brief description of our texture segmentation
system
As shown in Fig. 1, our proposed system has two steps for
texture segmentation, i.e., the generation of texture features
and the processing of texture features. In the generation of
texture features, a textured image is transformed by the three-level Haar wavelet transform, and from the multi-level wavelet coefficients, feature vectors are formed at each scale.

Fig. 1 A brief description of our texture segmentation system

In the processing of texture features, feature
vectors are used as inputs of MLP networks of each class
and each scale, and from MLP networks, posterior values
are estimated at each scale and each class. In addition,
texture segmentation by intrascale guidance (an MRF model) and interscale guidance (the context vector) is
performed from coarse to fine scales. A block in the image
is supported by its neighborhoods using an MRF model
(the intrascale context model), and the coarse scale pro-
vides context information to the fine scale through the
interscale context model (more details will be explained in
Sect. 4.2). These posterior, intrascale, and interscale
models are integrated in the Bayesian framework. The Gibbs sampler acts as the MAP classifier at each scale. Finally,
our system gets the improved segmentation results at the
finest scale by fusing the multi-scale segmentations
sequentially from coarse to fine scales.
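The fusion described above combines three sources of evidence at each scale: the posterior from the MLP networks, the MRF smoothness of neighboring labels, and the context from the adjacent coarser scale. As a rough illustration only (the actual energies and context model are defined in Sect. 4), a single Gibbs-sampling sweep over the block labels of one scale might look like the following, where `beta`, `gamma`, and the simple agreement bonuses are our own hypothetical stand-ins:

```python
import numpy as np

def gibbs_sweep(labels, log_post, parent_labels, beta=1.0, gamma=1.0,
                rng=None):
    """One Gibbs sweep over an (H, W) label map. For each block it
    combines: the MLP log-posterior (log_post, shape (H, W, Nc)), a
    smoothness bonus for agreeing 4-neighbors (weight beta, a
    hypothetical parameter), and a bonus for agreeing with the
    coarser-scale label of the parent block (weight gamma)."""
    rng = rng or np.random.default_rng(0)
    h, w, nc = log_post.shape
    for i in range(h):
        for j in range(w):
            score = log_post[i, j].copy()
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    score += beta * (np.arange(nc) == labels[ni, nj])
            score += gamma * (np.arange(nc) == parent_labels[i // 2, j // 2])
            # sample the new label from the conditional distribution
            p = np.exp(score - score.max())
            labels[i, j] = rng.choice(nc, p=p / p.sum())
    return labels
```

Repeating such sweeps from coarse to fine scales mimics the sequential fusion of the multi-scale MAP classifications.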
In order to assess the performance and the usefulness of
our proposed algorithm, we will experimentally compare it with other texture segmentation algorithms and show the superiority of our method through figures and tables of experimental results. Texture
segmentation problems are classified as unsupervised seg-
mentation, semisupervised segmentation, and supervised
segmentation. This research belongs to the methodology
for the supervised segmentation problem. Supervised texture segmentation requires training and test data sets, and the image databases available for performance tests are limited. In
order to assess the performance of our method in the
environment of complex and various textures, our method
is tested on mosaic texture images which contain complex
and various textures. Our method performs texture seg-
mentation by using Haar wavelet coefficients as texture
features, in the multi-scale Bayesian framework, and in
terms of the supervised segmentation problem. The texture segmentation method of one study [13] also performed
texture segmentation by using Haar wavelet coefficients as
texture features, in the multi-scale Bayesian framework,
and in terms of the supervised segmentation problem. The
method of Ref. [13] can be directly compared with our
method. In Ref. [13], the HMT model is used as the statistical model of Haar wavelet coefficients, and the HMTseg
algorithm is used for the sequential segmentation through
scales. In order to assess our approaches to texture seg-
mentation, our statistical model based on MLP networks is
experimentally compared with the HMT model, and the
HMTseg algorithm is experimentally compared with our
method for sequential segmentation and the fusion of
classification information through scales.
3 Structures of multi-layer perceptron networks
for wavelet-based texture segmentation
In this section, we explain structures of MLP networks for
estimating a posterior probability in a multi-scale wavelet
domain, and their application to texture segmentation.
3.1 Inputs of MLP networks for wavelet-based texture
segmentation
In this study, we employ blocks of different sizes to implement classification windows of different sizes. The most natural method to obtain blocks of different sizes is to start with an initial image, corresponding to blocks of size $1 \times 1$ (pixel level); next, these are grouped into $2 \times 2$ blocks. Repeating this process with increasing $n$ leads to blocks of size $2^n \times 2^n$. Given an initial square image (for example, a $2^4 \times 2^4$ square image), the blocks of different sizes are obtained simply by recursively grouping subblocks of size $2^{n-1} \times 2^{n-1}$ into square blocks of size $2^n \times 2^n$ (see Fig. 2).
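The recursive grouping can be sketched with array reshaping; `group_blocks` is our own illustrative helper, not from the paper:

```python
import numpy as np

def group_blocks(image, n):
    """Group a square image into non-overlapping 2**n x 2**n blocks.
    Returns an array of shape (rows, cols, 2**n, 2**n), where
    [r, c] is the block at grid position (r, c)."""
    b = 2 ** n
    h, w = image.shape
    assert h % b == 0 and w % b == 0
    return image.reshape(h // b, b, w // b, b).swapaxes(1, 2)

# a 16 x 16 image grouped into a 4 x 4 grid of 4 x 4 blocks
img = np.arange(16 * 16).reshape(16, 16)
blocks = group_blocks(img, 2)
```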
We use the multi-level Haar wavelet transform to
implement classification windows of different sizes through
scales. The multi-level Haar wavelet transform forms a
pyramid structure through all scales. Coefficients of Haar
wavelets are computed by using four filters as follows:
$$h_{LL} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \quad g_{LH} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}, \quad g_{HL} = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix}, \quad g_{HH} = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \qquad (3)$$
where $h_{LL}$ is a local smoother, $g_{LH}$ is a horizontal edge detector, $g_{HL}$ is a vertical edge detector, and $g_{HH}$ is a diagonal edge detector [34].

Fig. 2 The blocks of different sizes (1 × 1, 2 × 2, 4 × 4, 8 × 8, and 16 × 16 blocks; scales s = 0 to s = 4) of the multi-scale images obtained from an initial square image of 16 × 16 size. Each square block can be associated with a subtree of Haar wavelet coefficients

Most real-world images, especially gray-scale texture images, are well characterized by their singularity (edges and ridges) structures. The
wavelet transform of a discrete image $x$ ($= u_{s=0}$; scale $s = 0$) forms four subband images: a smoothed image $u_{s=1}$, a horizontal edge image $w^{LH}_{s=1}$, a vertical edge image $w^{HL}_{s=1}$, and a diagonal edge image $w^{HH}_{s=1}$. The wavelet transform process can now be continued on the image $u_{s=1}$, and the wavelet transform of the smoothed images $u_s$, $s = 0, 1, \ldots, S-1$, is repeated up to scale $S-1$. Afterwards, pixel-level ($1 \times 1$ blocks), $2 \times 2$ blocks, $2^2 \times 2^2$ blocks, …, and $2^{S-1} \times 2^{S-1}$ blocks resolution images are formed at each different scale $s = 0, 1, 2, \ldots, S-1$. Haar wavelet transforms for the three
levels are shown in Fig. 3a. As can be seen, the coarse-scale
coefficient $w_{s=3}$ corresponds to four coefficients in the
next finer scale and the dependency of these coefficients
across scales has a quad-tree structure. The multi-scale
wavelet coefficients, which analyze a common sub-region
of an image, have persistence across the scale. The depen-
dency between these coefficients can be represented as the
dependency between parent and child nodes of a wavelet
quad-tree (see Fig. 3b). In Fig. 3b, a black node represents a
wavelet coefficient, and Ti is a sub-tree rooted at location
i (node i). Node i (or sub-tree Ti) analyzes a sub-region of an
image. Node i at scale s will be classified according to
texture class. That is, the classification of node i at scale s
will assign a texture class to the sub-image block which is
analyzed by node $i$ (or sub-tree $T_i$) at scale $s$. The classification will be accomplished for pixels, $2 \times 2$ blocks, $2^2 \times 2^2$ blocks, …, and $2^{S-1} \times 2^{S-1}$ blocks at each different scale $s = 0, 1, 2, \ldots, S-1$. In this paper, we will often drop the scale ($S, S-1, \ldots$) and direction (LH, HL, and HH) indices when doing so does not hinder understanding.
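As an illustration of Eq. (3), one level of the Haar analysis and its repetition on the smoothed image can be sketched as follows (a minimal sketch; variable names are ours, not the paper's):

```python
import numpy as np

def haar_level(u):
    """One level of the 2x2 Haar analysis of Eq. (3).
    Returns the smoothed image LL and the detail subbands LH, HL, HH."""
    a = u[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = u[0::2, 1::2]  # top-right
    c = u[1::2, 0::2]  # bottom-left
    d = u[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # local smoother h_LL
    lh = (a + b - c - d) / 2.0  # horizontal edge detector g_LH
    hl = (a - b + c - d) / 2.0  # vertical edge detector g_HL
    hh = (a - b - c + d) / 2.0  # diagonal edge detector g_HH
    return ll, lh, hl, hh

def haar_pyramid(x, levels):
    """Repeat the transform on the smoothed image, forming the
    multi-level pyramid described in the text."""
    subbands = []
    u = np.asarray(x, dtype=float)
    for _ in range(levels):
        u, lh, hl, hh = haar_level(u)
        subbands.append((lh, hl, hh))
    return u, subbands

# example: a constant image has zero detail coefficients at all scales
x = np.ones((8, 8))
u, subbands = haar_pyramid(x, 3)
```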
The input vector of the MLP network is determined by
considering the dependency (persistency) of wavelet
coefficients across scale. The coefficients of the sub-tree Ti
(see Fig. 3b) and the pixel intensities (gray-level values) of
the sub-image (analyzed by node i) are used as inputs of the
MLP network for node i of a wavelet quad-tree. The input
vector of the MLP network is also determined by consid-
ering the statistical dependencies across three wavelet
subbands (HL, LH, and HH). $T_i$ is composed of $T_i^{HL}$, $T_i^{LH}$, and $T_i^{HH}$ for each direction. By considering all sub-trees on
node i of three wavelet subband quad-trees, the input
vector gi of the MLP network is defined as
$$g_i = \{T_i, p_i\}, \qquad (4)$$

where $g_i$ is the input vector for the block of the sub-image analyzed by sub-tree $T_i$; the sub-trees $T_i = \{T_i^{HL}, T_i^{LH}, T_i^{HH}\}$ represent the wavelet coefficients of the sub-trees rooted at node $i$ in the three wavelet sub-band quad-trees; and $p_i$ represents the pixel intensities of the sub-image analyzed by node $i$.
Wavelet coefficients of elements of $g_i$ can only be obtained down to a $2 \times 2$ block resolution (scale $s = 1$). At scale
s = 0, the input vector gi only has the component pi. It
does not have the component Ti.
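A sketch of assembling $g_i = \{T_i, p_i\}$ from per-scale subband pyramids follows; the list-indexed-by-scale layout is our own assumption for illustration, not the paper's data structure:

```python
import numpy as np

def subtree_coeffs(subband, s, i, j):
    """Collect the wavelet coefficients of the sub-tree rooted at
    grid position (i, j) of scale s, down to scale 1, for one
    subband pyramid (`subband[s]` is the coefficient array at scale s)."""
    coeffs = [subband[s][i, j]]
    if s > 1:  # recurse into the four children at the next finer scale
        for di in range(2):
            for dj in range(2):
                coeffs.extend(subtree_coeffs(subband, s - 1,
                                             2 * i + di, 2 * j + dj))
    return coeffs

def input_vector(lh, hl, hh, pixels, s, i, j):
    """Form g_i = {T_i, p_i}: sub-tree coefficients of the three
    subbands plus the pixel intensities of the analyzed block."""
    t_i = (subtree_coeffs(lh, s, i, j) +
           subtree_coeffs(hl, s, i, j) +
           subtree_coeffs(hh, s, i, j))
    b = 2 ** s
    p_i = pixels[b * i:b * (i + 1), b * j:b * (j + 1)].ravel()
    return np.concatenate([t_i, p_i])

# toy pyramids for one 8x8 image decomposed to scale 2 (hypothetical)
lh = [None, np.ones((4, 4)), np.ones((2, 2))]
hl = [None, np.ones((4, 4)), np.ones((2, 2))]
hh = [None, np.ones((4, 4)), np.ones((2, 2))]
pixels = np.arange(64.0).reshape(8, 8)
g = input_vector(lh, hl, hh, pixels, 2, 0, 0)  # 15 coefficients + 16 pixels
```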
3.2 Output structure and training of MLP networks
The input and output structures of MLP networks can be
constructed so that a posterior probability is estimated
[23–25]. Our wavelet-based texture segmentation system
has one group of MLP networks for each wavelet scale. In
each scale, one MLP network is assigned for each texture
class. In addition, at pixel level, one group of MLP net-
works is also assigned. If the wavelet decomposition level
is L and the number of texture classes is C, then the number
of MLP networks is $(L + 1) \times C$. One MLP network has
one output node. For the MLP network of texture class
$c \in \mathcal{L}$, the target output for training is set up as follows:

$$t_i = \begin{cases} 1, & g_i \in c \\ 0, & g_i \notin c. \end{cases} \qquad (5)$$
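Constructing the training targets of Eq. (5) is straightforward; the helper below is our own illustration:

```python
import numpy as np

# Per Eq. (5), the MLP network for texture class c is trained with
# target 1 for blocks of class c and 0 otherwise. With wavelet
# decomposition level L and C texture classes, (L + 1) * C such
# networks are trained in total (one per class at each scale,
# including the pixel level).
def targets_for_class(block_labels, c):
    """block_labels: hypothetical integer class label per training block."""
    return (np.asarray(block_labels) == c).astype(float)

labels = [0, 2, 1, 2, 2]
t = targets_for_class(labels, 2)  # -> array([0., 1., 0., 1., 1.])
```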
In this study, the resilient back-propagation algorithm
is used for training MLP networks [35]. The purpose of
the resilient back-propagation training algorithm is to
eliminate the faults of the back-propagation algorithm,
which uses the magnitudes of the partial derivatives of the
cost function (in this paper, the sum of mean squared errors
between targets and outputs of MLP networks is used as
the cost function). Compared to several different back-
propagation training algorithms, the resilient back-
propagation algorithm is reasonably fast and very useful
for large problems. It also has the nice property that it requires only a modest increase in memory requirements [35, 36].

Fig. 3 a The three-level Haar wavelet transform and the quad-tree structure of the wavelet coefficients; b the quad-tree structure of the wavelet coefficients in a wavelet sub-band, and sub-tree $T_i$ rooted at node $i$ and its elements
After training the MLP networks, we compute a pos-
terior probability from outputs of MLP networks of one
scale as follows:
$$p(c_i \mid g_i) = \frac{M(g_i; w_c)}{\sum_{k=1}^{N_c} M(g_i; w_k)}, \qquad (6)$$

where $c_i$ is the class label for input vector $g_i$, and $M(g_i; w_c)$ is the output of the MLP network model with weight vector $w_c$ for class $c \in \{1, 2, \ldots, N_c\}$ given input vector $g_i$.
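Eq. (6) is a simple normalization over the per-class network outputs; a minimal sketch, with the `outputs` vector layout being our own assumption:

```python
import numpy as np

def posteriors_from_outputs(outputs):
    """Normalize per-class MLP outputs into posteriors, as in Eq. (6).
    `outputs` is a hypothetical length-Nc vector where outputs[k] is
    the output M(g_i; w_k) of the class-k network for input g_i."""
    m = np.asarray(outputs, dtype=float)
    return m / m.sum()

p = posteriors_from_outputs([0.2, 0.7, 0.1])
```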
3.3 Wavelet-based texture segmentation
by MLP networks
Suppose that for each texture class $c \in \{1, 2, \ldots, N_c\}$ and
each scale we have trained an MLP network. Now, given
the wavelet coefficients of an image consisting of a mon-
tage of these textures, applying Eq. (6) to the MLP
networks yields the posterior probabilities $p(c_i \mid g_i)$, where $c_i \in \{1, 2, \ldots, N_c\}$, for the $g_i$ of each block. The classification $c_i$
of an input vector gi at scale s is performed as follows:
$$c_i = \arg\max_{c_i \in \{1, 2, \ldots, N_c\}} p(c_i \mid g_i) = \arg\max_{c_i \in \{1, 2, \ldots, N_c\}} \frac{M(g_i; w_{c_i}^s)}{\sum_{k=1}^{N_c} M(g_i; w_k^s)}, \qquad (7)$$

where $w_c^s$ is the weight vector of the MLP network of a
texture class c at scale s. Node i of the wavelet quad-tree
corresponds to site i on the random field, which is com-
posed of pixels or blocks of an image at one scale. By
Eq. (7) and assumptions which are imposed in Eq. (2), we
can obtain a possible configuration $C^s$ at scale $s$, where $C^s$ is the configuration on the blocks of scale $s$. We will call this classification process the initial MAP segmentation. The initial MAP segmentation yields a set of $S$ different texture segmentations $C^s$, one for each different scale $s$.
Thereafter, the multi-scale texture segmentation maps are
obtained by using the initial MAP segmentation.
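Combining Eqs. (6) and (7), the initial MAP segmentation at every scale can be sketched as follows (the per-scale array layout is our own assumption):

```python
import numpy as np

def initial_map_segmentation(outputs_by_scale):
    """Initial MAP segmentation, Eq. (7): at each scale s, classify
    every block by the argmax of its normalized MLP outputs.
    `outputs_by_scale[s]` is a hypothetical (rows, cols, Nc) array
    holding M(g_i; w_k^s) for each block of scale s."""
    maps = []
    for m in outputs_by_scale:
        m = np.asarray(m, dtype=float)
        post = m / m.sum(axis=-1, keepdims=True)  # Eq. (6)
        maps.append(np.argmax(post, axis=-1))     # segmentation map C^s
    return maps

# toy example: one scale of 2x2 blocks with 2 texture classes
out = [np.array([[[0.9, 0.1], [0.2, 0.8]],
                 [[0.6, 0.4], [0.3, 0.7]]])]
segs = initial_map_segmentation(out)
```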
Figure 4 illustrates the multi-scale texture segmentations
by the initial MAP segmentation. After training MLP net-
works on the D94 and D103 textures (Fig. 7), we performed
the initial MAP segmentation on the test image displayed to
the left of Fig. 4a to obtain the multi-scale texture seg-
mentation maps (c) at various scales. To compare our initial
MAP segmentation with other segmentation methods, we
represent the multi-scale texture segmentation maps (b)
obtained by the HMT model and ML classification [12, 13].
The segmentation method of Ref. [13] uses multi-level
wavelet coefficients as features of texture images, and
employs the HMT model as a statistical model for the
marginal and joint wavelet coefficients statistics. As shown
in Fig. 4b, c, the texture segmentation performance
obtained by using MLP networks is better than that obtained
by using the HMT model. Texture segmentation results by
MLP networks have less noise (noise due to the classifica-
tion error). Such results have been confirmed in [12].
While quick and easy, as Fig. 4c attests, the initial MAP segmentation suffers from the classical "blockiness versus robustness" tradeoff. These phenomena can also be found in the multi-scale texture segmentation obtained by using the HMT model, as shown in Fig. 4b.

Fig. 4 The multi-scale texture segmentation maps; a Left: a 128 × 128 D94/D103 mosaic test image to be segmented, Right: ideal texture segmentation map for the mosaic test image; b multi-scale texture segmentations $C^s$ of the mosaic test image by the HMT model (for 8 × 8, 4 × 4, 2 × 2 block resolutions and pixel-level resolution); c multi-scale texture segmentations $C^s$ of the mosaic test image by MLP networks (for 8 × 8, 4 × 4, 2 × 2 block resolutions and pixel-level resolution); d the application of the HMTseg algorithm to multi-scale texture segmentations by the HMT model; e the application of the HMTseg algorithm to multi-scale texture segmentations by MLP networks

Classification
accuracy increases with block size (toward the coarser
scale), because more statistical information is available for
the class label decision. Classification of large blocks
(segmentation at coarse scales) produces accurate seg-
mentations in large, homogeneous regions. However, this
comes at the cost of reduced boundary resolution; that is,
poor segmentations along the boundaries between regions.
A small block (segmentation at fine scales) sacrifices
classification reliability due to the paucity of statistical
information. However, a small block is more appropriate
near the boundaries between regions. Therefore, texture segmentations in multi-scale images trade off reliability against boundary detail according to scale. To
obtain a high-quality segmentation, clearly we should
combine the multi-scale results to benefit from both the
robustness of large blocks and the resolution of small
blocks. The study of Ref. [13] also demonstrated that
image segmentation in the finest scale is improved by using
a multi-scale Bayesian image segmentation algorithm
called the HMTseg algorithm that fuses the multi-scale
image segmentations obtained by the HMT model. Figure 4d shows the application of the HMTseg algorithm to
multi-scale texture segmentations obtained by using the
HMT model. In Ref. [12], the application of the HMTseg
algorithm to multi-scale texture segmentations obtained by
MLP networks has been shown. In the same way as [12],
we apply the HMTseg algorithm to multi-scale texture
segmentations obtained by MLP networks, with the results
shown in Fig. 4e. The segmentation performance of
the HMTseg algorithm depends on multi-scale texture
segmentations before the application of the HMTseg.
Therefore, the better the texture segmentation result of
each scale before the application of the HMTseg, the better
the texture segmentation result of the finest scale after the
application of the HMTseg. As confirmed in Ref. [12], the texture segmentation result of Fig. 4e shows better performance than that of Fig. 4d, because MLP networks outperform the HMT model in the texture segmentation of each scale before the HMTseg is applied.
We constructed a multi-scale context model, which can
be considered as a multi-scale prior model, to combine the
benefits from both the robustness of large blocks and the
resolution of small blocks.
4 Multi-scale Bayesian texture segmentation
using MRFs and Gibbs sampler
The multi-scale texture segmentation maps produced by the initial MAP segmentation contain noise caused by classification errors. Since finer scale blocks nest inside
coarser scale blocks, the blocks will be statistically
dependent across scale for images consisting of fairly
large, homogeneous regions. Hence, coarse-scale infor-
mation should be able to help guide finer-scale decisions.
In this section, we will explain the reduction of noise in the segmentation maps by using an MRF model (the MRF smoothness prior), the interscale decision fusion
by defining the context vector v which has contextual
information extracted from the adjacent coarser scale
segmentation, and MAP classification by using a Gibbs
sampler.
4.1 The context vector v and the Bayesian interscale
decision fusion
As Sect. 3.1 and Figs. 2 and 3 show, the dependency of wavelet coefficients across scales has a quad-tree structure, and blocks at the locations (nodes) of the wavelet quad-tree are dependent across scales. Therefore, if a block g_i at scale s was classified as class label c, then its four child blocks at scale s − 1 are quite likely to belong to the same class, especially when s is small (at fine scales). Hence, we guide the classification decisions for the child blocks by the decision made for their parent block. In addition to the parent block, we can also use the neighbors of the parent to guide the decision process. Similar multi-scale decision ideas have been used in Refs. [7, 13, 16, 17].
To use the segmentation result of the adjacent coarser scale as a guide for segmentation of the current scale, we define the context vector v_i^s for site i at scale s:

v_i^s ≡ [c^{s+1}_{q(i)}, c^{s+1}_{N_1 q(i)}, c^{s+1}_{N_2 q(i)}, c^{s+1}_{N_3 q(i)}, c^{s+1}_{N_4 q(i)}, c^{s+1}_{N_5 q(i)}, c^{s+1}_{N_6 q(i)}, c^{s+1}_{N_7 q(i)}, c^{s+1}_{N_8 q(i)}],   (8)

where c^{s+1}_{q(i)} is the class label of the parent site q(i) of site i, and c^{s+1}_{N_1 q(i)}, ..., c^{s+1}_{N_8 q(i)} are the class labels of the eight neighbor sites of the parent site q(i). It follows that the random field V^s of the context vectors at scale s is V^s = {v_i^s | ∀i ∈ M^s}, for the set M^s of the sites at scale s. V^s contains the MAP classification information of the previous coarser scale. In this study, the configurations of each scale are combined by a Bayesian classifier while simultaneously considering the random field V^s, which carries contextual information from the adjacent coarser scale.
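To make Eq. (8) concrete, the context vector can be assembled from the coarser-scale label map as in the sketch below. The helper name, the 0-based indexing, and the edge-replication padding at lattice borders are our own illustrative choices, not part of the paper:

```python
import numpy as np

def context_vector(coarse_labels, row, col):
    """Context vector v_i^s for the fine-scale site (row, col): the class
    label of its parent site q(i) in the coarser label map C^{s+1},
    followed by the labels of the parent's eight neighbours.  The map is
    edge-padded so that border parents also have eight neighbours."""
    padded = np.pad(np.asarray(coarse_labels), 1, mode="edge")
    pr, pc = row // 2 + 1, col // 2 + 1           # parent q(i), padded coords
    block = padded[pr - 1:pr + 2, pc - 1:pc + 2]  # parent and its 8 neighbours
    parent = block[1, 1]
    neighbours = np.delete(block.ravel(), 4)      # drop the centre entry
    return np.concatenate(([parent], neighbours))
```

For a 4 × 4 fine scale over a 2 × 2 coarse map, site (3, 0) has parent (1, 0), so the first entry of its context vector is that parent's label.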
We consider maximizing the posterior p(C^s | G^s, V^s), where G^s ≡ {g_i^s | ∀i ∈ M^s} is the collection of all g_i^s at scale s. Assuming that G^s and V^s are conditionally independent given the configuration C^s, the posterior p(C^s | G^s, V^s) can be rewritten by the naive Bayes rule [28]:

p(C^s | G^s, V^s) ∝ p(C^s | G^s) p(C^s | V^s).   (9)
Despite its manifest simplicity, the naive Bayes rule works quite well in practice; the naive Bayes classifier is robust enough to tolerate serious deficiencies in its underlying independence assumption [28]. At scale s, texture segmentation based on the naive Bayes rule is performed by finding the optimal configuration C^{s*} as follows:

C^{s*} ≡ argmax_{C^s ∈ X^s} p(C^s | G^s, V^s)
      = argmax_{C^s ∈ X^s} {p(C^s | G^s) p(C^s | V^s)}
      = argmax_{C^s ∈ X^s} {p(C^s | G^s) p(V^s | C^s) p(C^s)},   (10)
where X^s is the configuration space at scale s (the set of all possible configurations of the sites of scale s in terms of the class labels in L). The probability p(C^s | G^s) carries the information obtained from all blocks g_i^s of scale s for segmentation. The distribution p(V^s | C^s) carries the information obtained from the random field V^s, which is extracted from the configuration C^{s+1} of the adjacent coarser scale s + 1. The factor p(V^s | C^s) models the dependencies between the square blocks across scale in a Markov-1 fashion [13, 26, 27], where the square blocks of scale s are assumed to depend only on the square blocks at scale s + 1. Using Eq. (10), we perform the Bayesian interscale decision fusion between scales s and s + 1 by classifying G^s based on the posterior p(C^s | G^s) and the guidance p(V^s | C^s) from the adjacent coarser scale s + 1. The Bayesian interscale decision fusion of Eq. (10) computes a MAP estimate of the configuration C^s of scale s. Markov modeling leads to a simple scale-recursive classification of the square blocks.
In this study, the multi-scale Bayesian texture segmentation is performed by applying Eq. (10) sequentially from coarse to fine scales. Starting from an initial coarse scale and halting the fusion at scale s, we obtain the MAP segmentation C^s_MAP. This multi-scale decision fusion greatly improves the robustness and accuracy of the segmentation (we will see the results of the fusion in Sect. 5). Halting the fusion at the final scale (pixel level), we obtain the final, improved MAP segmentation C^{s=0}_MAP.
4.2 Factors for MAP classification at one scale
The MAP classification derived from Eq. (10) has three factors: the posterior probability p(C^s | G^s), the distribution p(V^s | C^s), and the prior p(C^s). Below, these factors are modeled and explained.
4.2.1 Neighborhood system and Markov–Gibbs
equivalence
The MRF theory is a branch of probability theory for
analyzing the spatial or contextual dependencies of
physical phenomena [26, 27]. It is used in a labeling to
establish probabilistic distributions of interacting class
labels. This subsection introduces notations for MRFs,
and the use of an MRF model for constructing a prior
model.
The sites in M are related to one another via a neighborhood system. A neighborhood system for M is defined as N = {N_i | ∀i ∈ M}, where N_i is the set of sites neighboring i. For a regular rectangular lattice, M = {(x, y) | 1 ≤ x, y ≤ n}, and in the first-order neighborhood system (also called the "4-neighborhood system") the set of neighbors of (x, y) is defined as N_{x,y} = {(x−1, y), (x+1, y), (x, y−1), (x, y+1)}. When the location (x, y) is conveniently re-indexed by a single number i, where i takes values in {1, 2, ..., m} with m = n × n, the N_i of the 4-neighborhood system is represented as N_i = {i−1, i+1, i−n, i+n} (a site is not a neighbor of itself: i ∉ N_i). In the second-order neighborhood system, also called the 8-neighborhood system, every (interior) site of a regular rectangular lattice M = {i | 1 ≤ i ≤ m, m = n × n} has eight neighbors: N_i = {i−1, i+1, i−n, i+n, i−n−1, i−n+1, i+n−1, i+n+1}.

We modeled the prior p(C^s) as an MRF. There are two approaches to specifying an MRF: in terms of the conditional probabilities p(c_i^s | c^s_{N_i}), and in terms of the joint probability p(C^s), where c^s_{N_i} denotes the class labeling on the set of neighbors of site i. A theoretical result on the equivalence between MRFs and Gibbs distributions [26, 27] provides a mathematically tractable means of specifying the joint probability of an MRF. In a Gibbs distribution p(C^s) = Z^{−1} e^{−U(C^s)}, U(C^s) is the energy function, and Z is a normalizing constant called the partition function: Z = Σ_{C^s ∈ X^s} e^{−U(C^s)}. Writing out the Gibbs distribution in terms of the conditional probabilities p(c_i^s | c^s_{N_i}) gives

p(C^s) = ∏_{i ∈ M^s} p(c_i^s | c^s_{N_i}).   (11)

One can specify the joint probability p(C^s) by specifying an appropriate energy function U(C^s) for the desired system behavior. In this way, one encodes the a priori knowledge or preference about the interaction between class labels.
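The single-index neighborhood sets above can be sketched as follows; we use 0-based row-major indices rather than the 1-based indices of the text, and simply drop off-lattice candidates so that border sites get fewer neighbours:

```python
def neighbours8(i, n):
    """8-neighbourhood N_i of site i on an n-by-n lattice whose sites are
    numbered 0 .. n*n-1 in row-major order.  Off-lattice candidates are
    dropped, so border sites have fewer than eight neighbours."""
    r, c = divmod(i, n)
    offsets = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]
    return sorted((r + dr) * n + (c + dc)
                  for dr, dc in offsets
                  if 0 <= r + dr < n and 0 <= c + dc < n)
```

On a 4 × 4 lattice, the interior site 5 has neighbours [0, 1, 2, 4, 6, 8, 9, 10], matching {i−1, i+1, i−n, i+n, i−n−1, i−n+1, i+n−1, i+n+1}.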
4.2.2 The MRF smoothness prior
Physical properties in a neighborhood of space present
some coherence and generally do not change abruptly. It is
reasonable to assume that image pixels (or "blocks") that are spatially close are likely to belong to the same texture. Therefore, the smoothness constraint can be used as a
contextual constraint between class labels of sites. In an
MRF, each point on a lattice is statistically dependent only
on its neighbors so that the complexity of the model is
restricted. Smoothness constraints can be expressed as the
prior probability. Our proposed method uses an MRF
smoothness prior model for local smoothness constraint.
The MRF smoothness prior is characterized by the multi-
level logistic (MLL) model which tends to create a smooth
solution (or, to prefer uniform class labels) [27]. In this
paper, the energy function U(C^s) of the MLL model takes the following form:

U(C^s) = Σ_{i ∈ M^s} α_i + Σ_{i ∈ M^s} Σ_{i′ ∈ N_i} V_2(c_i^s, c_{i′}^s),   (12)

where α_i = 0 for all i ∈ M^s, the set N_i is the 8-neighborhood system, and V_2(c_i^s, c_{i′}^s) is defined as follows:

V_2(c_i^s, c_{i′}^s) = −β if sites i and i′ have the same label, and +β otherwise.   (13)

Here, β has a predetermined positive value. The conditional probability p(c_i^s | c^s_{N_i}) is then

p(c_i^s | c^s_{N_i}) = exp(−Σ_{i′ ∈ N_i} V_2(c_i^s, c_{i′}^s)) / Σ_{c_i^s ∈ L} exp(−Σ_{i′ ∈ N_i} V_2(c_i^s, c_{i′}^s)),   (14)

where L is the set of class labels, i.e., L = {1, 2, ..., N_c}. The MRF smoothness prior model p(C^s) is then constructed by using Eqs. (11) and (14). This MRF model, p(c_i^s | c^s_{N_i}), is the intrascale context model.
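The local conditional of Eqs. (12)–(14) is cheap to evaluate site by site. The sketch below is our own illustrative helper, scoring every candidate label for one site given its neighbours' labels:

```python
import numpy as np

def mll_conditional(neighbour_labels, n_classes, beta=1.0):
    """p(c_i^s | c_{N_i}^s) under the MLL smoothness prior: a neighbour
    contributes -beta to the energy when it shares the candidate label
    and +beta otherwise (Eq. (13)); the energies are then turned into a
    normalised Gibbs distribution (Eq. (14))."""
    nb = np.asarray(neighbour_labels)
    energy = np.array([np.where(nb == c, -beta, beta).sum()
                       for c in range(n_classes)])
    weights = np.exp(-energy)
    return weights / weights.sum()
```

With neighbours [0, 0, 0, 1] and β = 1, label 0 has energy −2 and label 1 has energy +2, so label 0 receives probability e²/(e² + e⁻²) ≈ 0.982, illustrating the preference for uniform labels.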
4.2.3 Factors for interscale dependency between class labels

In Eq. (10), p(V^s | C^s) represents the interscale dependency between class labels, and can be rewritten as follows:

p(V^s | C^s) = ∏_{i ∈ M^s} p(v_i^s | c_i^s).   (15)

In Eq. (15), it is assumed that all v_i^s are conditionally independent given C^s, and that each v_i^s is distributed according to p(v_i^s | c_i^s), independently of all other c_k^s and v_k^s, k ≠ i. Our proposed method estimates p(v_i^s | c_i^s) as follows:

p(v_i^s | c_i^s) ≡ (the number of elements of vector v_i^s with the same value as label c_i^s) / (the number of all elements of vector v_i^s)
              = (δ_{c_i^s, c^{s+1}_{q(i)}} + δ_{c_i^s, c^{s+1}_{N_1 q(i)}} + δ_{c_i^s, c^{s+1}_{N_2 q(i)}} + ... + δ_{c_i^s, c^{s+1}_{N_8 q(i)}}) / 9,   (16)

where δ_{m,n} is the Kronecker delta function. The distribution p(v_i^s | c_i^s) is an important factor for the multi-scale decision fusion. The distribution p(v_i^s | c_i^s), as formulated in Eq. (16), is the interscale context model.

4.2.4 A posterior probability from MLP networks

As mentioned in Sect. 2.1, it is assumed that all c_i^s are conditionally independent given G^s and that all g_i^s (the pixels or blocks of an image G^s) are independent and identically distributed (i.i.d.). It is also assumed that each c_i^s is distributed according to p(c_i^s | g_i^s), independently of all other c_k^s and g_k^s, k ≠ i. It follows that the p(C^s | G^s) of Eq. (10) can be rewritten as follows:

p(C^s | G^s) = ∏_{i ∈ M^s} p(c_i^s | g_i^s).   (17)

Here, the probability p(c_i^s | g_i^s) is obtained from the outputs of the MLP networks of scale s for each class by the method explained in Eq. (6):

p(c_i^s | g_i^s) = M(g_i^s; w_c^s) / Σ_{k=1}^{N_c} M(g_i^s; w_k^s),   (18)

where w_c^s is the weight vector of the MLP network for class c ∈ {1, 2, ..., N_c} at scale s.
4.3 Multi-scale Bayesian texture segmentation
and Gibbs sampler
From Eqs. (11), (15), and (17), we can rewrite Eq. (10) as follows:

C^{s*} = argmax_{C^s ∈ X^s} {p(C^s | G^s) p(V^s | C^s) p(C^s)}
       = argmax_{C^s ∈ X^s} ∏_{i ∈ M^s} p(c_i^s | g_i^s) p(v_i^s | c_i^s) p(c_i^s | c^s_{N_i}).   (19)

We can perform MAP segmentation at scale s by using the MAP classification of Eq. (19), and a Gibbs sampler can carry out this classification. A Gibbs sampler has previously been used as an MRF parameter estimator, or as a texture synthesizer for generating a texture from an MRF texture model [17, 26]. In this multi-scale Bayesian approach to segmentation, the optimal configuration C^{s*} is determined by a Gibbs sampler; that is, the Gibbs sampler acts as the MAP classifier, and texture segmentation at scale s is performed by the Gibbs sampler. If texture segmentation is performed sequentially from coarse to fine scales by using Eq. (19), then the texture segmentation results of each scale are fused sequentially from coarse to fine scales; in other words, the multi-scale decision fusion is performed. Finally, in the full-resolution image (at the finest scale), an improved texture segmentation result is obtained through the multi-scale decision fusion and the MRF smoothness prior, which reduces the noise caused by classification errors. When Eq. (19) is used at the initial coarse scale, the context vector does not exist; however, we can assume that the configuration C^s is statistically independent of the context vectors at the initial coarse scale. In other words, at the initial coarse scale, p(C^s | V^s) = p(C^s).
The procedure of texture segmentation at scale s is
shown in Fig. 5. The proposed segmentation algorithm of
Fig. 5 uses the Gibbs sampler as the MAP classifier. The
procedure of sequential texture segmentations from a
coarse to fine scale is shown in Fig. 6. The final texture
segmentation result in the full resolution image is obtained
by the proposed algorithm of Fig. 6.
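As a rough sketch of the procedure of Figs. 5 and 6 at a single scale, the pass below resamples each site from the local conditional implied by Eq. (19). The array layout — `post[r, c, k]` for p(c_i^s | g_i^s) and `context_p[r, c, k]` for p(v_i^s | c_i^s), both precomputed — is our own simplification; the paper's actual procedure is the one shown in Fig. 5:

```python
import numpy as np

def gibbs_sweep(labels, post, context_p, beta=1.0, n_iter=3, seed=0):
    """MAP classification at one scale per Eq. (19).  Each site is visited
    in turn and resampled from the local conditional, which multiplies the
    MLP posterior p(c_i|g_i), the interscale factor p(v_i|c_i), and the
    MLL smoothness term exp(-sum of V_2 potentials over the 8 neighbours)."""
    rng = np.random.default_rng(seed)
    n, _, n_classes = post.shape
    for _ in range(n_iter):
        for r in range(n):
            for c in range(n):
                energy = np.zeros(n_classes)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (dr, dc) != (0, 0) and 0 <= rr < n and 0 <= cc < n:
                            # V_2 of Eq. (13): -beta on agreement, +beta otherwise
                            energy += np.where(
                                np.arange(n_classes) == labels[rr, cc],
                                -beta, beta)
                score = post[r, c] * context_p[r, c] * np.exp(-energy)
                labels[r, c] = rng.choice(n_classes, p=score / score.sum())
    return labels
```

Starting from the per-site argmax of `post`, a few sweeps remove isolated misclassified sites while strong posteriors keep region boundaries in place.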
5 Experiments and results
Our proposed texture segmentation method has been tested
on mosaic texture images. To assess the performance and
the usefulness of our approaches to texture segmentation,
our statistical model based on MLP networks is experi-
mentally compared with the HMT model, and the HMTseg algorithm [13] (a multi-scale Bayesian image segmentation algorithm that fuses the multi-scale image segmentations obtained by the HMT) is experimentally compared with our method for sequential segmentation and the fusion of classification information through scales.
5.1 Data set and configurations of texture segmentation
systems
In our experiments, 20 Brodatz textures were used. These textures are shown in Fig. 7. From each 512 × 512 Brodatz texture image, we randomly picked ten (overlapping)
Fig. 5 MAP segmentation at scale s using a Gibbs sampler
Fig. 6 The multi-scale decision fusion by sequential texture segmentation from coarse to fine scales
64 × 64 blocks. Then, the multi-level (three-level) wavelet coefficients of those image blocks were used as training data. All MLP networks have one hidden layer, and the hidden layer of each MLP network has 20 hidden nodes. The inputs of the MLP networks of scale s are (((4^s − 4)/3 + 1) × 2 + ((4^{s+1} − 4)/3 + 1))-dimensional vectors [refer to Eq. (4)]. The MLP networks were trained using these training data. HMT models were also trained for each texture using training data identical to those of the MLP networks. However, while an MLP network for one texture class is trained using the training data of all texture classes, an HMT model for one texture class is trained only on the training data of that texture class. The weights of the MLP networks were updated using the MATLAB neural network toolbox [36] (with the resilient back-propagation algorithm). The parameters of the HMT were estimated using an EM algorithm (a threshold value of 10^{−7} was used to determine model convergence) with an intelligent parameter initialization [37]. For image segmentation using the HMT model, a pixel-brightness pdf model at the pixel level is required [13]. Therefore, to obtain pixel-level segmentation, 2-density Gaussian mixture models were used and trained as the pixel-brightness pdf model.
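As a quick check of the input-dimension formula quoted above (Eq. (4) itself lies outside this excerpt), a hypothetical helper:

```python
def mlp_input_dim(s):
    """Dimensionality of the scale-s MLP input vector, per the formula
    quoted in the text: ((4^s - 4)/3 + 1) * 2 + ((4^(s+1) - 4)/3 + 1)."""
    coeffs = lambda k: (4 ** k - 4) // 3 + 1
    return coeffs(s) * 2 + coeffs(s + 1)
```

This gives 7-, 31-, and 127-dimensional inputs for scales s = 1, 2, 3.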
The initial coarse scale (the starting scale for the multi-scale decision fusion) is set at s_0 = 3, so that the coarsest texture segmentations have sufficient reliability. At a very coarse scale, the blocks g_i are very large and hence likely to contain several differently textured regions; therefore, we ignore the information at very coarse scales. The test texture images for the experiments are shown in Fig. 8. Since the MLP networks are trained on the multi-level wavelet coefficients of 64 × 64 image blocks, the MLP networks can handle 64 × 64 image blocks. When a test texture image is larger than the 64 × 64 blocks that the MLP networks can handle, we apply the MLP networks repeatedly to the 64 × 64 test-image subblocks, assuming that the blocks are independent. Since the size of an image that the trained HMT model can handle is also smaller than the size of the test images of Fig. 8, we applied the HMT model repeatedly to the test-image subblocks under the same assumption.

In the proposed algorithm of Fig. 5, the Gibbs sampler was executed for three iterations. In the HMTseg algorithm, a threshold value of 10^{−4} was used to determine the convergence of the algorithm.
5.2 Texture segmentation results and comparisons
As shown in Figs. 5 and 6, our proposed segmentation
algorithm contains the multi-scale Bayesian framework.
The HMTseg algorithm is also a multi-scale Bayesian
image segmentation technique. To compare our proposed
method (Fig. 6) and the HMTseg, we experimented with
the test images of Fig. 8. The procedure of Fig. 6 contains
the multi-scale decision fusion, the MRF smoothness prior,
and the MAP segmentation using a Gibbs sampler. Let the
procedure of Fig. 6 be called the Gibbs segmentation.
Figure 9 shows the texture segmentation results for the 2-textures test image (diagonal2). Figure 9a shows the multi-scale texture segmentations obtained by using MLP networks. As shown in Fig. 9, the Gibbs segmentation of our proposed method is applied to the multi-scale texture segmentations of Fig. 9a, with the result shown in Fig. 9c; the HMTseg is likewise applied to the multi-scale texture segmentations of Fig. 9a, with the result shown in Fig. 9b. Figure 9c displays the texture segmentation result of each scale in the process of the multi-scale decision fusion by using the
Fig. 7 The training images (512 × 512 pixels) of the 20 Brodatz textures used in the experiment
Fig. 8 Mosaic test texture images and ideal texture segmentation maps for the mosaic test images; a The 64 × 64 diagonal2 mosaic, a two-textures image (D24/D84); b The 64 × 64 blocks3 mosaic, a three-textures image (D24/D68/D84); c The 64 × 64 cross4 mosaic, a four-textures image (D22/D24/D68/D84); d The 192 × 192 squares9 mosaic, a 9-textures image (D15/D19/D22/D24/D26/D38/D49/D68/D103); e Ideal texture segmentation map for diagonal2; f Ideal texture segmentation map for blocks3; g Ideal texture segmentation map for cross4; h Ideal texture segmentation map for squares9
Gibbs segmentation. Figure 9b also displays the texture
segmentation result of each scale in the process of the
multi-scale decision fusion by using the HMTseg algorithm. When Fig. 9b and c are compared, the texture segmentation results of the Gibbs segmentation have less noise (noise due to classification errors), and the Gibbs segmentation of our proposed method exhibits superior performance.
Texture segmentations for test images of Fig. 8 using our
proposed method (which is represented in Figs. 5 and 6)
were compared with those using the HMT model (at scale
s = 0, a 2-density Gaussian mixture model is used) and the
HMTseg (or, MLP networks and the HMTseg) in Fig. 10.
Considering the segmentations of all scales, the segmenta-
tion of the finest scale is the final texture segmentation
which we would get. Final texture segmentations of the
finest scale obtained by each segmentation method are
shown in Fig. 10. The texture segmentation results of our proposed method have less noise (noise due to classification errors); our proposed method therefore exhibits superior performance. The segmentation error rate between the ideal segmentation (Fig. 8) and the final texture segmentation (Fig. 10) at the finest scale is
the number of misclassified pixels to the total number of
pixels in a test image. As shown in Table 1, our proposed
segmentation method performs better than the segmentation
method using the HMT, Gaussian mixtures, and the
HMTseg. Table 1 also shows that our proposed segmenta-
tion method performs better than the segmentation method
using MLP networks and the HMTseg. In addition, in terms of texture segmentation using the multi-scale decision fusion process and the reduction of noise caused by classification errors, the segmentation error rates of Table 1 reconfirm the effectiveness of the Gibbs segmentation.
Fig. 9 Segmentations of diagonal2; a the texture segmentation C of each scale by using MLP networks; b the texture segmentation C of each scale after the application of the HMTseg to a; c the texture segmentation C of each scale after the application of the Gibbs segmentation to a
Fig. 10 Texture segmentation results by each segmentation method for the test images of Fig. 8; in a–d the first column shows results by the HMT and the HMTseg, the second column by MLP networks and the HMTseg, and the third column by our proposed method; a For diagonal2; b For blocks3; c For cross4; d For squares9
Table 1 The segmentation error rate for 2-, 3-, 4-, and 9-textures test images

Test images   The HMT, Gaussian mixtures,   MLP networks and   MLP networks and the Gibbs segmentation
              and the HMTseg (%)            the HMTseg (%)     (our proposed method) (%)
Diagonal2     6.81                          3.86               1.20
Blocks3       13.72                         9.11               2.56
Cross4        11.62                         5.32               1.56
Squares9      8.07                          5.16               0.98

Segmentation error rate = (the number of misclassified pixels of the test image)/(the number of all pixels of the test image) × 100%
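The error rate defined beneath Table 1 can be computed as follows (the helper name is our own):

```python
import numpy as np

def segmentation_error_rate(seg, ideal):
    """Segmentation error rate of Table 1: misclassified pixels divided
    by the total number of pixels, expressed as a percentage."""
    seg, ideal = np.asarray(seg), np.asarray(ideal)
    return 100.0 * np.count_nonzero(seg != ideal) / seg.size
```

A 2 × 2 map with one mislabelled pixel yields 25.0%.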
6 Discussion and conclusion
Multi-scale Bayesian approaches to texture segmentation
form a natural framework for integrating both global and
local information of image behavior, together with con-
textual information. Motivated by the merit of multi-scale
Bayesian approaches, we proposed a novel method of
supervised texture segmentation using MLP networks, an
MRF model, and Gibbs sampler within a multi-scale
Bayesian framework. Multi-scale DWT coefficients, that
are suitable for multi-scale image processing, were used as
inputs for MLP networks. Texture segmentation was per-
formed by using outputs of MLP networks. Our proposed
multi-scale Bayesian texture segmentation method used the
MLL model (an MRF smoothness prior model) to reduce noise caused by classification errors; in addition, it defined the context vector v and modeled the interscale
dependency for the multi-scale decision fusion. A Gibbs
sampler integrated the above factors (the posterior probability from MLP networks, the MRF smoothness prior model, and the interscale context model) and acted as the MAP
classifier. We called the procedure of integrating the above
factors (Fig. 6) the Gibbs segmentation.
In experiments to compare the Gibbs segmentation and
the HMTseg algorithm (which is a multi-scale Bayesian
image segmentation technique), the Gibbs segmentation
displayed outstanding performance. The results of texture
segmentation by our proposed method were compared with
those using other methods (the HMT and the HMTseg [13];
or, MLP networks and the HMTseg [12]). Through these
experiments, we can see that our proposed method is
superior to other methods in terms of texture segmentation
performance. The reasons why our proposed method performs so well can be summarized as follows: our method integrates the classification power of MLP networks into the Bayesian framework, and, through appropriately constructed prior models, it reduces the noise due to classification errors. In addition, the MAP estimation by a Gibbs sampler contributes to the strong performance of our method.
References
1. Tuceryan M, Jain AK (1998) Texture analysis. In: Chen CH, Pau
LF, Wang PSP (eds) The handbook of pattern recognition and
computer vision, 2nd edn. World Scientific Publishing Co., pp.
207–248
2. Vaidyanathan G, Lynch PM (1990) Edge based texture seg-
mentation. In: IEEE proceedings of Southeastcon 90’ 3:1110–
1115
3. Georgeson MA (1979) Spatial Fourier analysis and human vision, chap 2. In: Sutherland NS (ed) Tutorial essays in psychology, a guide to recent advances, vol 2. Lawrence Erlbaum Associates, Hillsdale
4. Devalois RL, Albrecht DG, Thorell LG (1982) Spatial-frequency
selectivity of cells in macaque visual cortex. Vis Res 22:545–559
5. Silverman MS, Grosof DH, De Valois RL, Elfar SD (1989) Spatial-frequency organization in primate striate cortex. Proc Natl Acad Sci USA 86
6. Fan G, Xia XG (2001) A joint multicontext and multiscale
approach to Bayesian image segmentation. IEEE Trans Geosci
Remote Sens 39(12):2680–2688
7. Cheng H, Bouman CA (2001) Multiscale Bayesian segmentation
using a trainable context model. IEEE Trans Image Process
10(4):511–525
8. Bouman C, Liu B (1991) Multiple resolution segmentation of tex-
tured images. IEEE Trans Pattern Anal Mach Intell 13(2):99–113
9. Ng I, Kittler J, Illingworth J (1993) Supervised segmentation
using a multiresolution data representation. Signal Process
31:133–163
10. Meyer Y (1993) Wavelets: algorithms and applications. SIAM, Philadelphia
11. Li J, Gray RM, Olshen RA (2000) Multiresolution image clas-
sification by hierarchical modeling with two-dimensional hidden
Markov models. IEEE Trans Inf Theory 46(5):1826–1841
12. Kim TH, Eom IK, Kim YS (2005) Texture segmentation using
neural networks and multi-scale wavelet feature. Lecture Notes in
Computer Science 3611. Springer, Berlin, Heidelberg, pp 395–
404
13. Choi HK, Baraniuk RG (2001) Multiscale image segmentation
using wavelet-domain hidden Markov models. IEEE Trans Image
Process 10(9):1309–1321
14. Unser M (1995) Texture classification and segmentation using
wavelet frames. IEEE Trans Image Process 4:1549–1560
15. Weldon TP, Higgins WE (1996) Design of multiple Gabor filters
for texture segmentation. In Proceedings of international con-
ference acoustic speech, signal proceeding, Atlanta, pp 2243–
2246
16. Fan G, Xia XG (2003) Wavelet-based texture analysis and syn-
thesis using hidden Markov models. IEEE Trans Circuits Syst
Fundam Theory Appl 50(1):106–120
17. Sun J, Gu D, Zhang S, Chen Y (2004) Hidden Markov Bayesian
texture segmentation using complex wavelet transform. IEE Proc Vis Image Signal Process 151(3):215–223
18. Randen T, Husoy JH (1999) Filtering for texture classification: a
comparative study. IEEE Trans Pattern Anal Mach Intell
21(4):291–310
19. Wouwer GV, Scheunders P, Dyck DV (1999) Statistical texture
characterization from discrete wavelet representations. IEEE
Trans Image Process 8(4):592–598
20. Crouse M, Nowak R, Baraniuk RG (1998) Wavelet-based sta-
tistical signal processing using hidden Markov models. IEEE
Trans Signal Process 46(4):886–902
21. Fan G, Xia XG (2001) Image denoising using a local contextual
hidden Markov model in the wavelet domain. IEEE Signal Pro-
cess Lett 8(5):125–128
22. Duda RO, Hart PE, Stork DG (2001) Pattern classification, 2nd
edn. A Wiley Interscience Publication, London. pp 51–63, 161–
192, 576–582
23. Gish H (1990) A probabilistic approach to the understanding and
training of neural network classifiers. In: Proceedings of IEEE
international conference on acoustics, speech and signal pro-
cessing. Albuquerque, pp 1361–1364
24. Richard MD, Lippmann RP (1991) Neural network classifiers
estimate Bayesian a posteriori probabilities. Neural Comput
3:461–483
25. Rojas R (1996) Short proof of the posterior probability property
of classifier neural networks. Neural Comput 8:41–43
26. Li SZ (1995) Markov random field modeling in computer vision.
Springer, New York
27. Li SZ (2001) In: Kunii TL (eds) Markov random field modeling
in image analysis, 2nd edn. Computer science workbench.
Springer, Berlin
28. Duda RO, Hart PE, Stork DG (2002) Pattern classification, 2nd
edn. Wiley Interscience Publication, London. Revised chapter
section 2.11
29. Shapiro JM (1993) Embedded image coding using zerotrees of
wavelet coefficients. IEEE Trans Signal Process 41(12):3445–3462
30. Shapiro JM (1996) Image compression by texture modeling in the
wavelet domains. IEEE Trans Signal Process 5(1):26–36
31. Simoncelli EP (1997) Statistical models for images: compression,
restoration and synthesis. In: Proceedings of 31st Asilomar con-
ference on signals, systems and computers. Pacific Grove, pp
673–678
32. Derin H, Elliot H (1987) Modeling and segmentation of noisy and
textured images using Gibbs random fields. IEEE Trans Pattern
Anal Mach Intell 9(1):39–55
33. Manjunath BS, Simchony T, Chellappa R (1990) Stochastic and
deterministic networks for texture segmentation. IEEE Trans
Acoust Speech Signal Process 38(6):39–55
34. Mallat S (1998) A wavelet tour of signal processing. Academic
Press, New York
35. Riedmiller M, Braun H (1993) A direct adaptive method for
faster backpropagation learning: the Rprop algorithm. In: Pro-
ceedings of the IEEE international conference on neural
networks, San Francisco
36. Demuth H, Beale M. Neural network toolbox for use with MATLAB, user's guide version 4. The MathWorks Inc., pp 137–194
37. Fan G, Xia XG (2001) Improved hidden Markov models in the
wavelet-domain. IEEE Trans Signal Process 49(1):115–120