ORIGINAL ARTICLE
Multiscale Bayesian texture segmentation using neural networksand Markov random fields
Tae Hyung Kim · Il Kyu Eom · Yoo Shin Kim
Received: 21 December 2006 / Accepted: 20 December 2007 / Published online: 8 January 2008
© Springer-Verlag London Limited 2008
Abstract This paper presents a wavelet-based texture
segmentation method using multilayer perceptron (MLP)
networks and Markov random fields (MRF) in a multi-scale
Bayesian framework. Inputs and outputs of MLP networks
are constructed to estimate a posterior probability. The
multi-scale features produced by multi-level wavelet
decompositions of textured images are classified at each
scale by maximum a posteriori (MAP) classification and the
posterior probabilities from MLP networks. An MRF
model is used in order to model the prior distribution of
each texture class, and a factor, which fuses the classifi-
cation information through scales and acts as a guide for
the labeling decision, is incorporated into the MAP clas-
sification of each scale. By fusing the multi-scale MAP
classifications sequentially from coarse to fine scales, our
proposed method gets the final and improved segmentation
result at the finest scale. In this fusion process, the MRF
model serves as the smoothness constraint and the Gibbs
sampler acts as the MAP classifier. Our texture segmen-
tation method was applied to segmentation of gray-level
textured images. The proposed segmentation method
shows better performance than texture segmentation using
the hidden Markov trees (HMT) model and the HMTseg
algorithm, which is a multi-scale Bayesian image seg-
mentation algorithm.
Keywords Neural networks · Multi-scale Bayesian texture segmentation · Wavelet transform · Markov random fields · Gibbs sampler
1 Introduction
Visual textures play important roles in many computer-vision and image-processing applications, since most real images contain texture. Texture segmentation involves partitioning an image into differently textured regions and
identifying the boundaries between different textures in an
image. Applications for texture segmentation include aerial
photo segmentation, document segmentation and analysis,
texture-based object detection and segmentation, recovery
of shape information from an image, image compression,
and content-based image retrieval. Approaches to texture
segmentation are analogous to methods for image
segmentation [1]. Therefore, approaches to texture seg-
mentation can be categorized into region-based and
boundary-based methods [2]. In texture analysis, region-
based analysis is effective in terms of information extrac-
tion and noise suppression. Therefore, researchers have
concentrated on region-based approaches to texture seg-
mentation. The texture description or feature extraction
method is one of the important factors for texture seg-
mentation [1]. The texture segmentation method depends
on the texture description method. Of all methods regard-
ing texture feature extraction, signal processing methods
T. H. Kim
Agency for Defense Development, Technology 3-3,
YuSeong, PO Box 35-3, DaeJeon 305-600, South Korea
e-mail: [email protected]
I. K. Eom (&)
Department of Electronics Engineering, Pusan National
University, 30 Jangjeon-dong, Geumjeong-gu,
Pusan 609-735, South Korea
e-mail: [email protected]
Y. S. Kim
Research Institute of Computer, Information
and Communication, Pusan National University,
30 Jangjeon-dong, Geumjeong-gu, Pusan 609-735, South Korea
e-mail: [email protected]
Neural Comput & Applic (2009) 18:141–155
DOI 10.1007/s00521-007-0167-x
are attractive due to their simplicity. In addition, for signal
processing methods, the multi-channel filtering methods
are supported by psychophysical research. This research
has demonstrated that the human brain does a frequency
analysis of the image [3]. Multi-channel filtering theory
was inspired by the multi-channel filtering mechanism in
neurophysiology [4, 5]. The concept of multi-channel fil-
tering, which is referred to as multi-resolution processing
in the literature, was further refined and developed in
multi-scale processing.
A long-term trend has been the incorporation of multi-
scale techniques into the texture segmentation algorithm,
and there have been evident advantages in utilizing multi-
scale texture analysis [6–13]. The use of both global and
local information of images through multi-scale processing
offers improved segmentation accuracy. In addition,
Bayesian approaches to texture segmentation have become
popular because they form a natural framework for inte-
grating both statistical models of image behavior and
contextual information [6–8, 11–15]. Not surprisingly,
there has been considerable interest in combining both
Bayesian and multi-scale techniques into a single frame-
work [6, 7, 11–13, 16, 17]. This framework can be called
the multi-scale Bayesian framework. Multi-scale Bayesian
approaches to texture segmentation have attracted
increasing attention because they form a natural framework
for integrating both global and local information of image
behavior, together with contextual information. Texture segmentations in multi-scale images trade off reliability against boundary detail according to scale. The
segmentation in coarse scales is accurate for large, homo-
geneous regions but poor along the boundaries between
regions. The segmentation in fine scales is accurate near
the boundaries between regions but has poor classification
reliability due to the paucity of statistical information. A
recent study demonstrated that texture segmentation in the
finest scale is improved by using a multi-scale Bayesian
image segmentation algorithm. This is called the HMTseg,
which fuses segmentations in multi-scale images [13].
Texture characterization based on the discrete wavelet
transform (DWT), which integrates the above aspects, has
attracted much attention and was found useful for a variety
of texture applications [1, 13, 14, 16, 18–21]. The complex
wavelet transform with a better shift-invariant property
than the decimated DWT has recently begun to receive
attention for texture segmentation [17].
In the multi-scale Bayesian framework for texture seg-
mentation, the probability model of the features extracted
from texture images is very important. A number of sta-
tistical models have been developed for modeling textures.
In one important study [20], the non-Gaussian wavelet
coefficient marginal statistics observed in natural images
were characterized by the wavelet-domain hidden Markov
trees (HMT) model. In the wavelet-domain HMT model,
the histogram distribution of wavelet coefficients is mod-
eled by using a two-component Gaussian mixture model
and a Markov chain. Neural networks can represent any
distribution of inputs without complicated modeling
methods [22–25]. It is possible to compose neural networks so that their outputs represent a posteriori probabilities when they are trained in a supervised mode [23–25]. Maximum a posteriori (MAP) classification can then use the neural network outputs. A method using neural
networks and the HMTseg algorithm [13] in the multi-scale
wavelet domain was studied for texture segmentation [12].
In this paper, based on Bayesian estimation and the
wavelet-based multi-scale analysis of images, we present a
new texture segmentation method using multilayer per-
ceptron (MLP) networks, Markov random fields (MRF),
and a Gibbs sampler. In the multi-scale Bayesian frame-
work, a factor, which fuses the classification information
through scales and acts as a guide for the labeling decision,
is also incorporated into our proposed texture segmentation
algorithm. We consider the so-called supervised segmen-
tation problem, in which the number of textures and the
parameters of their associated model are known or estimated before segmentation. MLP networks have generally shown good classification performance on many kinds of data. In the multi-scale wavelet domain, MLP networks can
show better classification performance than the HMT
model [12]. We incorporate MLP networks in the multi-
scale Bayesian texture segmentation. Our proposed method
uses a posterior probability obtained from MLP networks,
which are used for modeling the statistics of multi-scale
wavelet coefficients. The statistical dependencies across the three DWT subbands (HL, LH, and HH) are taken into account in designing the input structure of the MLP networks. By fusing the classification information of all
scales sequentially from coarse to fine scales, our method
gets the final and improved segmentation result at the finest
scale. The sequential estimation methods through scales
have been frequently studied [7, 13]. Our sequential seg-
mentation method is motivated by sequential estimation
methods. In the concept of sequential estimation through
scales, we present a method for fusing the information
between scales. Our method formulates the interscale
dependency by defining the context vector, which has
contextual information extracted from the adjacent coarser
scale segmentation. The fusion process is executed by
computing the MAP classification given the probability
values of texture features at one scale and a priori knowl-
edge regarding contextual information which is extracted
from the adjacent coarser scale segmentation. In this fusion
process of each scale, our method uses an MRF prior model
as the local smoothness constraint, and a Gibbs sampler as
the MAP classifier. A Gibbs sampler has been used as the
MRF parameter estimator, or used as the texture synthe-
sizer for generating a texture from an MRF texture model
[26, 27]. But our method uses a Gibbs sampler as the MAP
classifier. In addition, the Gibbs sampler integrates the
above mentioned factors; i.e., a posterior probability from
MLP networks, an MRF smoothness prior model, and
a priori knowledge regarding contextual information which
is extracted from the adjacent coarser scale segmentation.
Our texture segmentation method is applied to segment
gray-level textured images. Texture segmentation by the
proposed method performs better than the segmentation by
the HMT model and the HMTseg, and performs better than
texture segmentation by MLP networks and the HMTseg.
Experimental results also show that our fusion method for
obtaining more improved segmentation results at the finest
scale performs better than the HMTseg algorithm. The
organization of the paper is as follows. In Sect. 2, the
Bayesian approach to texture segmentation is reviewed,
and a brief description of our texture segmentation system
is provided. The input structures, output structures, and the use of MLP networks for wavelet-based texture segmentation are presented in Sect. 3. Texture segmentation using an MRF prior model, context model, and Gibbs sampler is presented in Sect. 4. Results on mosaic
textures are reported in Sect. 5. Section 6 is devoted to a
final discussion and conclusion.
2 Bayesian approach to texture segmentation
In this section, we explain the texture segmentation prob-
lem in the Bayesian framework. Then, matters regarding
Bayesian classifiers for texture segmentation are summa-
rized.
2.1 Texture segmentation in the Bayesian framework
Pixels or blocks of an image are formed on a lattice. Sites on a rectangular lattice of size $n \times n$ can be denoted by $\mathcal{M} = \{(x, y) \mid 1 \le x, y \le n\}$. Also, the location $(x, y)$ can be conveniently re-indexed by a single number $i$, where $i$ takes on values in $\{1, 2, \ldots, m\}$ with $m = n \times n$. Therefore, sites on the lattice can be indexed in $\mathcal{M} = \{1, 2, \ldots, m\}$. When
the texture segmentation problem is formulated in the
Bayesian framework, pixels or blocks of an image can be
regarded as a realization of a random field on the lattice
with distinct statistical behavior in different regions. Texture segmentation problems can be considered as a process of assigning class labels from a set of class labels, i.e., $\mathcal{L} = \{1, 2, \ldots, N_c\}$, to each of the sites (as indexed in $\mathcal{M}$) on the rectangular lattice on which the realization (pixels or blocks of an image) of the random field is formed. The labeling $C$ records the class label of each site on the lattice. Each element (a label) $c_i$ ($i \in \mathcal{M}$) in the labeling $C$ can be regarded as a mapping from $\mathcal{M}$ to $\mathcal{L}$; i.e., for all $i \in \mathcal{M}$, $c_i: \mathcal{M} \to \mathcal{L}$. In the terminology of random fields, a labeling $C$ is called a configuration. Therefore, texture segmentation problems can be rephrased as a process of finding the optimal configuration of the sites on the lattice on which pixels or blocks of an image are formed.
In the Bayesian framework, the optimal solution for
texture segmentation is given by a feasible configuration
$C^{*}$ of an image $G$ as follows:

$$C^{*} = \arg\max_{C \in \Omega} p(C \mid G), \qquad (1)$$

where $\Omega$ is the configuration space, which is the set of all possible configurations in terms of the class labels in $\mathcal{L}$, and $G = \{g_i \mid \forall i \in \mathcal{M}\}$ is the collection of all pixels (or blocks) $g_i$. In practice, because of the high complexity of
random variables on both G and C, it is computationally
intractable to calculate a posterior p(C|G). Therefore, we
must impose some presupposition on both the image G
and a posterior p(C|G). It is assumed that all ci are
conditionally independent, given G, and all gi are
independent and identically distributed (i.i.d.). It is also
assumed that each ci is distributed based on the distribution
$p(c_i \mid g_i)$ independently of all other $c_k$ and $g_k$, $k \ne i$. It
follows that the p(C|G) of Eq. 1 can be rewritten as
follows:
$$p(C \mid G) = \prod_{i \in \mathcal{M}} p(c_i \mid g_i). \qquad (2)$$
Equation 2 shows that the global posterior probability
p(C|G) is determined by the local posterior probabilities
p(ci |gi). In this study, we will use MLP networks to esti-
mate the local posterior probabilities (more details will be
provided in Sect. 3), and by the naive Bayes rule [28], prior
models as a guide for the labeling decision are integrated
into our segmentation method.
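Under these independence assumptions, the global MAP labeling decomposes into independent per-site decisions. The sketch below is only an illustration of this decomposition; the `posteriors` array layout is our own assumption, not the paper's code:

```python
import numpy as np

def map_configuration(posteriors):
    """Per-site MAP labeling under the factorization of Eq. (2).
    `posteriors` is a hypothetical (m, Nc) array whose row i holds
    the local posteriors p(c_i | g_i) for the Nc texture classes."""
    # argmax over classes at every site i yields the labeling C*
    return np.argmax(posteriors, axis=1)

# toy example: 3 sites, 2 texture classes
p = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.4, 0.6]])
labels = map_configuration(p)  # -> array([0, 1, 1])
```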
2.2 Prior models and multi-scale processing
Bayesian approaches to texture segmentation are able to
incorporate knowledge concerning the properties of the
class labels themselves into a prior p(C). In addition,
external factors, which act as a guide for the labeling
decision, can be incorporated into a prior p(C) of the
Bayesian classifier. Most segmentation algorithms employ
a classification window of some size in the hope that all
pixels in the window belong to the same class. Texture segmentations with large or small windows trade off reliability against boundary detail according to the size of the windows. The segmentation with large windows
is accurate for large, homogeneous regions but poor along
the boundaries between regions. The segmentation with
small windows is accurate near the boundaries between
regions but has poor classification reliability due to the
paucity of statistical information. To obtain a high-quality
segmentation result, clearly we should combine both the
large and small scale behaviors of images (both global and
local information of images) to benefit from both the
robustness of large windows and the resolution of small
windows. In segmentation algorithms using multi-scale
processing, the results of many classification windows of
different sizes are combined to obtain an accurate seg-
mentation at fine scales [6, 7, 11, 13, 16, 17]. Though the
orthogonal DWT can decorrelate an image with almost
uncorrelated wavelet coefficients, it is widely understood
that considerable high-order dependencies still exist in
wavelet coefficients, as observed from the characteristics of
wavelet coefficient distribution. In fact, considerable
attention has been given to the statistical image modeling
of high-order dependencies in the wavelet-domain with
applications for image coding [29], compression [30],
restoration [31], and segmentation. In multi-scale Bayesian segmentation, these high-order dependencies of wavelet coefficients can be captured by using context models.
Context models can be roughly categorized into three
groups: the interscale models [20], the intrascale models
[21], and the hybrid inter and intra scale models [31].
These models allow for more accurate image modeling and
more effective processing than other methods which
assume wavelet coefficients to be independent.
It is reasonable to assume that image pixels which are
spatially close are likely to be of the same texture. For this reason, most segmentation algorithms either implicitly or
explicitly impose some form of smoothness in the resulting
segmentation [8]. MRFs can be used to incorporate a priori
knowledge concerning the properties of the labels them-
selves [26, 27, 32, 33]. In our method, a prior probability
p(C) is modeled by using an MRF model, where the MRF
model forms an intrascale context model and acts as a
smoothing factor. In texture segmentation based on multi-
scale processing, some segmentation algorithms constructed
a prior model which can capture the interscale dependencies
between the image blocks and their class labels [7, 11, 13].
One study used the multi-scale prior model as a guide for the
decision of a labeling [13]. In the multi-scale Bayesian
texture segmentation of our study, a multi-scale context
model, which can be considered as a multi-scale prior
model, is constructed. These proposed multi-scale Bayesian
context models can capture complex aspects of both local
and global contextual behavior. More details will be pro-
vided in Sect. 4.
2.3 A brief description of our texture segmentation
system
As shown in Fig. 1, our proposed system has two steps for
texture segmentation, i.e., the generation of texture features
and the processing of texture features. In the generation of
texture features, a textured image is transformed by the three-level Haar wavelet transform, and from the multi-level wavelet coefficients, feature vectors are formed at each scale.

Fig. 1 A brief description of our texture segmentation system

In the processing of texture features, feature
vectors are used as inputs of MLP networks of each class
and each scale, and from MLP networks, posterior values
are estimated at each scale and each class. In addition,
texture segmentation by intrascale guidance (an MRF model) and interscale guidance (the context vector) is
performed from coarse to fine scales. A block in the image
is supported by its neighborhoods using an MRF model
(the intrascale context model), and the coarse scale pro-
vides context information to the fine scale through the
interscale context model (more details will be explained in
Sect. 4.2). These posterior, intrascale, and interscale
models are integrated in the Bayesian framework. The Gibbs sampler acts as the MAP classifier at each scale. Finally,
our system gets the improved segmentation results at the
finest scale by fusing the multi-scale segmentations
sequentially from coarse to fine scales.
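The fusion described above combines three sources of evidence at each scale: the posterior from the MLP networks, the MRF smoothness of neighboring labels, and the context from the adjacent coarser scale. As a rough illustration only (the actual energies and context model are defined in Sect. 4), a single Gibbs-sampling sweep over the block labels of one scale might look like the following, where `beta`, `gamma`, and the simple agreement bonuses are our own hypothetical stand-ins:

```python
import numpy as np

def gibbs_sweep(labels, log_post, parent_labels, beta=1.0, gamma=1.0,
                rng=None):
    """One Gibbs sweep over an (H, W) label map. For each block it
    combines: the MLP log-posterior (log_post, shape (H, W, Nc)), a
    smoothness bonus for agreeing 4-neighbors (weight beta, a
    hypothetical parameter), and a bonus for agreeing with the
    coarser-scale label of the parent block (weight gamma)."""
    rng = rng or np.random.default_rng(0)
    h, w, nc = log_post.shape
    for i in range(h):
        for j in range(w):
            score = log_post[i, j].copy()
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    score += beta * (np.arange(nc) == labels[ni, nj])
            score += gamma * (np.arange(nc) == parent_labels[i // 2, j // 2])
            # sample the new label from the conditional distribution
            p = np.exp(score - score.max())
            labels[i, j] = rng.choice(nc, p=p / p.sum())
    return labels
```

Repeating such sweeps from coarse to fine scales mimics the sequential fusion of the multi-scale MAP classifications.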
In order to assess the performance and the usefulness of
our proposed algorithm, we will experimentally compare it with other texture segmentation algorithms and show the superiority of our method through figures and tables of experimental results. Texture
segmentation problems are classified as unsupervised seg-
mentation, semisupervised segmentation, and supervised
segmentation. This research belongs to the methodology
for the supervised segmentation problem. Supervised texture segmentation requires training and test data sets, and the image databases available for performance tests are limited. In
order to assess the performance of our method in the
environment of complex and various textures, our method
is tested on mosaic texture images which contain complex
and various textures. Our method performs texture seg-
mentation by using Haar wavelet coefficients as texture
features, in the multi-scale Bayesian framework, and in
terms of the supervised segmentation problem. The texture segmentation method of one study [13] also performed
texture segmentation by using Haar wavelet coefficients as
texture features, in the multi-scale Bayesian framework,
and in terms of the supervised segmentation problem. The
method of Ref. [13] can be directly compared with our
method. In Ref. [13], the HMT model is used as the statistical model of Haar wavelet coefficients, and the HMTseg
algorithm is used for the sequential segmentation through
scales. In order to assess our approaches to texture seg-
mentation, our statistical model based on MLP networks is
experimentally compared with the HMT model, and the
HMTseg algorithm is experimentally compared with our
method for sequential segmentation and the fusion of
classification information through scales.
3 Structures of multi-layer perceptron networks
for wavelet-based texture segmentation
In this section, we explain structures of MLP networks for
estimating a posterior probability in a multi-scale wavelet
domain, and their application to texture segmentation.
3.1 Inputs of MLP networks for wavelet-based texture
segmentation
In this study, we employ blocks of different sizes to implement classification windows of different sizes. The most natural method to obtain blocks of different sizes is to start with an initial image, corresponding to blocks of size $1 \times 1$ (pixel level); next, these are grouped into $2 \times 2$ blocks. Repeating this process with increasing $n$ leads to blocks of size $2^n \times 2^n$. Given an initial square image (for example, a $2^4 \times 2^4$ square image), the blocks of different sizes are obtained simply by recursively grouping subblocks of size $2^{n-1} \times 2^{n-1}$ into square blocks of size $2^n \times 2^n$ (see Fig. 2).
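The recursive grouping can be sketched with array reshaping; `group_blocks` is our own illustrative helper, not from the paper:

```python
import numpy as np

def group_blocks(image, n):
    """Group a square image into non-overlapping 2**n x 2**n blocks.
    Returns an array of shape (rows, cols, 2**n, 2**n), where
    [r, c] is the block at grid position (r, c)."""
    b = 2 ** n
    h, w = image.shape
    assert h % b == 0 and w % b == 0
    return image.reshape(h // b, b, w // b, b).swapaxes(1, 2)

# a 16 x 16 image grouped into a 4 x 4 grid of 4 x 4 blocks
img = np.arange(16 * 16).reshape(16, 16)
blocks = group_blocks(img, 2)
```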
We use the multi-level Haar wavelet transform to
implement classification windows of different sizes through
scales. The multi-level Haar wavelet transform forms a
pyramid structure through all scales. Coefficients of Haar
wavelets are computed by using four filters as follows:
$$h_{LL} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \quad g_{LH} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}, \quad g_{HL} = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix}, \quad g_{HH} = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \qquad (3)$$
where $h_{LL}$ is a local smoother, $g_{LH}$ is a horizontal edge detector, $g_{HL}$ is a vertical edge detector, and $g_{HH}$ is a diagonal edge detector [34].

Fig. 2 The blocks of different sizes (1 × 1, 2 × 2, 4 × 4, 8 × 8, and 16 × 16 blocks; scales s = 0 to s = 4) of the multi-scale images obtained from an initial square image of 16 × 16 size. Each square block can be associated with a subtree of Haar wavelet coefficients

Most real-world images, especially gray-scale texture images, are well characterized by their singularity (edges and ridges) structures. The
wavelet transform of a discrete image $x$ ($= u_{s=0}$; scale $s = 0$) forms four subband images: a smoothed image $u_{s=1}$, a horizontal edge image $w^{LH}_{s=1}$, a vertical edge image $w^{HL}_{s=1}$, and a diagonal edge image $w^{HH}_{s=1}$. The wavelet transform process can now be continued on the image $u_{s=1}$, and the wavelet transform of the smoothed images $u_s$, $s = 0, 1, \ldots, S-1$, is repeated up to scale $S-1$. Afterwards, pixel-level ($1 \times 1$ blocks), $2 \times 2$ blocks, $2^2 \times 2^2$ blocks, …, and $2^{S-1} \times 2^{S-1}$ blocks resolution images are formed at each different scale $s = 0, 1, 2, \ldots, S-1$. Haar wavelet transforms for the three
levels are shown in Fig. 3a. As can be seen, the coarse-scale
coefficient $w_{s=3}$ corresponds to four coefficients in the
next finer scale and the dependency of these coefficients
across scales has a quad-tree structure. The multi-scale
wavelet coefficients, which analyze a common sub-region
of an image, have persistence across the scale. The depen-
dency between these coefficients can be represented as the
dependency between parent and child nodes of a wavelet
quad-tree (see Fig. 3b). In Fig. 3b, a black node represents a
wavelet coefficient, and Ti is a sub-tree rooted at location
i (node i). Node i (or sub-tree Ti) analyzes a sub-region of an
image. Node i at scale s will be classified according to
texture class. That is, the classification of node i at scale s
will assign a texture class to the sub-image block which is
analyzed by node $i$ (or sub-tree $T_i$) at scale $s$. The classification will be accomplished for pixels, $2 \times 2$ blocks, $2^2 \times 2^2$ blocks, …, and $2^{S-1} \times 2^{S-1}$ blocks at each different scale $s = 0, 1, 2, \ldots, S-1$. In this paper, we will often drop the scale ($S, S-1, \ldots$) and direction (LH, HL, and HH) indices when doing so does not hinder understanding.
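As an illustration of Eq. (3), one level of the Haar analysis and its repetition on the smoothed image can be sketched as follows (a minimal sketch; variable names are ours, not the paper's):

```python
import numpy as np

def haar_level(u):
    """One level of the 2x2 Haar analysis of Eq. (3).
    Returns the smoothed image LL and the detail subbands LH, HL, HH."""
    a = u[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = u[0::2, 1::2]  # top-right
    c = u[1::2, 0::2]  # bottom-left
    d = u[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # local smoother h_LL
    lh = (a + b - c - d) / 2.0  # horizontal edge detector g_LH
    hl = (a - b + c - d) / 2.0  # vertical edge detector g_HL
    hh = (a - b - c + d) / 2.0  # diagonal edge detector g_HH
    return ll, lh, hl, hh

def haar_pyramid(x, levels):
    """Repeat the transform on the smoothed image, forming the
    multi-level pyramid described in the text."""
    subbands = []
    u = np.asarray(x, dtype=float)
    for _ in range(levels):
        u, lh, hl, hh = haar_level(u)
        subbands.append((lh, hl, hh))
    return u, subbands

# example: a constant image has zero detail coefficients at all scales
x = np.ones((8, 8))
u, subbands = haar_pyramid(x, 3)
```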
The input vector of the MLP network is determined by
considering the dependency (persistency) of wavelet
coefficients across scale. The coefficients of the sub-tree Ti
(see Fig. 3b) and the pixel intensities (gray-level values) of
the sub-image (analyzed by node i) are used as inputs of the
MLP network for node i of a wavelet quad-tree. The input
vector of the MLP network is also determined by consid-
ering the statistical dependencies across three wavelet
subbands (HL, LH, and HH). $T_i$ is composed of $T_i^{HL}$, $T_i^{LH}$, and $T_i^{HH}$ for each direction. By considering all sub-trees on
node i of three wavelet subband quad-trees, the input
vector gi of the MLP network is defined as
$$g_i = \{T_i, p_i\}, \qquad (4)$$

where $g_i$ is the input vector for the block of the sub-image analyzed by sub-tree $T_i$; the sub-trees $T_i = \{T_i^{HL}, T_i^{LH}, T_i^{HH}\}$ represent the wavelet coefficients of the sub-trees rooted at node $i$ in the three wavelet sub-band quad-trees; and $p_i$ represents the pixel intensities of the sub-image analyzed by node $i$.
Wavelet coefficients of elements of $g_i$ can only be obtained down to a $2 \times 2$ block resolution (scale $s = 1$). At scale
s = 0, the input vector gi only has the component pi. It
does not have the component Ti.
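A sketch of assembling $g_i = \{T_i, p_i\}$ from per-scale subband pyramids follows; the list-indexed-by-scale layout is our own assumption for illustration, not the paper's data structure:

```python
import numpy as np

def subtree_coeffs(subband, s, i, j):
    """Collect the wavelet coefficients of the sub-tree rooted at
    grid position (i, j) of scale s, down to scale 1, for one
    subband pyramid (`subband[s]` is the coefficient array at scale s)."""
    coeffs = [subband[s][i, j]]
    if s > 1:  # recurse into the four children at the next finer scale
        for di in range(2):
            for dj in range(2):
                coeffs.extend(subtree_coeffs(subband, s - 1,
                                             2 * i + di, 2 * j + dj))
    return coeffs

def input_vector(lh, hl, hh, pixels, s, i, j):
    """Form g_i = {T_i, p_i}: sub-tree coefficients of the three
    subbands plus the pixel intensities of the analyzed block."""
    t_i = (subtree_coeffs(lh, s, i, j) +
           subtree_coeffs(hl, s, i, j) +
           subtree_coeffs(hh, s, i, j))
    b = 2 ** s
    p_i = pixels[b * i:b * (i + 1), b * j:b * (j + 1)].ravel()
    return np.concatenate([t_i, p_i])

# toy pyramids for one 8x8 image decomposed to scale 2 (hypothetical)
lh = [None, np.ones((4, 4)), np.ones((2, 2))]
hl = [None, np.ones((4, 4)), np.ones((2, 2))]
hh = [None, np.ones((4, 4)), np.ones((2, 2))]
pixels = np.arange(64.0).reshape(8, 8)
g = input_vector(lh, hl, hh, pixels, 2, 0, 0)  # 15 coefficients + 16 pixels
```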
3.2 Output structure and training of MLP networks
The input and output structures of MLP networks can be
constructed so that a posterior probability is estimated
[23–25]. Our wavelet-based texture segmentation system
has one group of MLP networks for each wavelet scale. In
each scale, one MLP network is assigned for each texture
class. In addition, at pixel level, one group of MLP net-
works is also assigned. If the wavelet decomposition level
is L and the number of texture classes is C, then the number
of MLP networks is $(L + 1) \times C$. One MLP network has
one output node. For the MLP network of texture class
$c \in \mathcal{L}$, the target output for training is set up as follows:

$$t_i = \begin{cases} 1, & g_i \in c \\ 0, & g_i \notin c. \end{cases} \qquad (5)$$
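Constructing the training targets of Eq. (5) is straightforward; the helper below is our own illustration:

```python
import numpy as np

# Per Eq. (5), the MLP network for texture class c is trained with
# target 1 for blocks of class c and 0 otherwise. With wavelet
# decomposition level L and C texture classes, (L + 1) * C such
# networks are trained in total (one per class at each scale,
# including the pixel level).
def targets_for_class(block_labels, c):
    """block_labels: hypothetical integer class label per training block."""
    return (np.asarray(block_labels) == c).astype(float)

labels = [0, 2, 1, 2, 2]
t = targets_for_class(labels, 2)  # -> array([0., 1., 0., 1., 1.])
```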
In this study, the resilient back-propagation algorithm
is used for training MLP networks [35]. The purpose of
the resilient back-propagation training algorithm is to
eliminate the faults of the back-propagation algorithm,
which uses the magnitudes of the partial derivatives of the
cost function (in this paper, the sum of mean squared errors
between targets and outputs of MLP networks is used as
the cost function). Compared to several different back-
propagation training algorithms, the resilient back-
propagation algorithm is reasonably fast and very useful
for large problems. It also has the nice property that it requires only a modest increase in memory requirements [35, 36].

Fig. 3 a The three-level Haar wavelet transform and the quad-tree structure of the wavelet coefficients; b the quad-tree structure of the wavelet coefficients in a wavelet sub-band, and sub-tree $T_i$ rooted at node $i$ and its elements
After training the MLP networks, we compute a pos-
terior probability from outputs of MLP networks of one
scale as follows:
$$p(c_i \mid g_i) = \frac{M(g_i; w_c)}{\sum_{k=1}^{N_c} M(g_i; w_k)}, \qquad (6)$$

where $c_i$ is the class label for input vector $g_i$, and $M(g_i; w_c)$ is the output of the MLP network model with weight vector $w_c$ for class $c \in \{1, 2, \ldots, N_c\}$ given input vector $g_i$.
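Eq. (6) is a simple normalization over the per-class network outputs; a minimal sketch, with the `outputs` vector layout being our own assumption:

```python
import numpy as np

def posteriors_from_outputs(outputs):
    """Normalize per-class MLP outputs into posteriors, as in Eq. (6).
    `outputs` is a hypothetical length-Nc vector where outputs[k] is
    the output M(g_i; w_k) of the class-k network for input g_i."""
    m = np.asarray(outputs, dtype=float)
    return m / m.sum()

p = posteriors_from_outputs([0.2, 0.7, 0.1])
```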
3.3 Wavelet-based texture segmentation
by MLP networks
Suppose that for each texture class $c \in \{1, 2, \ldots, N_c\}$ and
each scale we have trained an MLP network. Now, given
the wavelet coefficients of an image consisting of a mon-
tage of these textures, applying Eq. (6) to the MLP
networks yields the posterior probabilities $p(c_i \mid g_i)$, where $c_i \in \{1, 2, \ldots, N_c\}$, for the $g_i$ of each block. The classification $c_i$
of an input vector gi at scale s is performed as follows:
$$c_i = \arg\max_{c_i \in \{1, 2, \ldots, N_c\}} p(c_i \mid g_i) = \arg\max_{c_i \in \{1, 2, \ldots, N_c\}} \frac{M(g_i; w_{c_i}^s)}{\sum_{k=1}^{N_c} M(g_i; w_k^s)}, \qquad (7)$$

where $w_c^s$ is the weight vector of the MLP network of a
texture class c at scale s. Node i of the wavelet quad-tree
corresponds to site i on the random field, which is com-
posed of pixels or blocks of an image at one scale. By
Eq. (7) and assumptions which are imposed in Eq. (2), we
can obtain a possible configuration $C^s$ at scale $s$, where $C^s$ is the configuration on the blocks of scale $s$. We will call this classification process the initial MAP segmentation. The initial MAP segmentation yields a set of $S$ different texture segmentations $C^s$, one for each different scale $s$.
Thereafter, the multi-scale texture segmentation maps are
obtained by using the initial MAP segmentation.
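Combining Eqs. (6) and (7), the initial MAP segmentation at every scale can be sketched as follows (the per-scale array layout is our own assumption):

```python
import numpy as np

def initial_map_segmentation(outputs_by_scale):
    """Initial MAP segmentation, Eq. (7): at each scale s, classify
    every block by the argmax of its normalized MLP outputs.
    `outputs_by_scale[s]` is a hypothetical (rows, cols, Nc) array
    holding M(g_i; w_k^s) for each block of scale s."""
    maps = []
    for m in outputs_by_scale:
        m = np.asarray(m, dtype=float)
        post = m / m.sum(axis=-1, keepdims=True)  # Eq. (6)
        maps.append(np.argmax(post, axis=-1))     # segmentation map C^s
    return maps

# toy example: one scale of 2x2 blocks with 2 texture classes
out = [np.array([[[0.9, 0.1], [0.2, 0.8]],
                 [[0.6, 0.4], [0.3, 0.7]]])]
segs = initial_map_segmentation(out)
```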
Figure 4 illustrates the multi-scale texture segmentations
by the initial MAP segmentation. After training MLP net-
works on the D94 and D103 textures (Fig. 7), we performed
the initial MAP segmentation on the test image displayed to
the left of Fig. 4a to obtain the multi-scale texture seg-
mentation maps (c) at various scales. To compare our initial
MAP segmentation with other segmentation methods, we
represent the multi-scale texture segmentation maps (b)
obtained by the HMT model and ML classification [12, 13].
The segmentation method of Ref. [13] uses multi-level
wavelet coefficients as features of texture images, and
employs the HMT model as a statistical model for the
marginal and joint wavelet coefficients statistics. As shown
in Fig. 4b, c, the texture segmentation performance
obtained by using MLP networks is better than that obtained
by using the HMT model. Texture segmentation results by
MLP networks have less noise (noise due to the classifica-
tion error). Such results have been confirmed in [12].
While quick and easy, as Fig. 4c attests, the initial MAP segmentation suffers from the classical "blockiness versus robustness" tradeoff. These phenomena can also be found in the multi-scale texture segmentation obtained by using the HMT model, as shown in Fig. 4b.

Fig. 4 The multi-scale texture segmentation maps; a Left: a 128 × 128 D94/D103 mosaic test image to be segmented, Right: ideal texture segmentation map for the mosaic test image; b multi-scale texture segmentations $C^s$ of the mosaic test image by the HMT model (for 8 × 8, 4 × 4, 2 × 2 block resolutions and pixel-level resolution); c multi-scale texture segmentations $C^s$ of the mosaic test image by MLP networks (for 8 × 8, 4 × 4, 2 × 2 block resolutions and pixel-level resolution); d the application of the HMTseg algorithm to multi-scale texture segmentations by the HMT model; e the application of the HMTseg algorithm to multi-scale texture segmentations by MLP networks

Classification
accuracy increases with block size (toward the coarser
scale), because more statistical information is available for
the class label decision. Classification of large blocks
(segmentation at coarse scales) produces accurate seg-
mentations in large, homogeneous regions. However, this
comes at the cost of reduced boundary resolution; that is,
poor segmentations along the boundaries between regions.
A small block (segmentation at fine scales) sacrifices
classification reliability due to the paucity of statistical
information. However, a small block is more appropriate
near the boundaries between regions. Therefore, texture segmentations in multi-scale images trade off reliability against boundary detail according to scale. To
obtain a high-quality segmentation, clearly we should
combine the multi-scale results to benefit from both the
robustness of large blocks and the resolution of small
blocks. The study of Ref. [13] also demonstrated that
image segmentation in the finest scale is improved by using
a multi-scale Bayesian image segmentation algorithm
called the HMTseg algorithm that fuses the multi-scale
image segmentations obtained by the HMT model. Figure 4d shows the application of the HMTseg algorithm to
multi-scale texture segmentations obtained by using the
HMT model. In Ref. [12], the application of the HMTseg
algorithm to multi-scale texture segmentations obtained by
MLP networks has been shown. In the same way as [12],
we apply the HMTseg algorithm to multi-scale texture
segmentations obtained by MLP networks, with the results
shown in Fig. 4e. The segmentation performance of
the HMTseg algorithm depends on multi-scale texture
segmentations before the application of the HMTseg.
Therefore, the better the texture segmentation result of
each scale before the application of the HMTseg, the better
the texture segmentation result of the finest scale after the
application of the HMTseg. As confirmed in Ref. [12], the texture segmentation result of Fig. 4e shows better performance than that of Fig. 4d, because MLP networks outperform the HMT model in the texture segmentation of each scale before the HMTseg is applied.
We constructed a multi-scale context model, which can
be considered as a multi-scale prior model, to combine the
benefits from both the robustness of large blocks and the
resolution of small blocks.
4 Multi-scale Bayesian texture segmentation
using MRFs and Gibbs sampler
The multi-scale texture segmentation maps produced by the initial MAP segmentation contain noise caused by classification errors. Since finer scale blocks nest inside
coarser scale blocks, the blocks will be statistically
dependent across scale for images consisting of fairly
large, homogeneous regions. Hence, coarse-scale infor-
mation should be able to help guide finer-scale decisions.
In this section, we will explain the reduction of noise in the segmentation maps by using an MRF model (the MRF smoothness prior), the interscale decision fusion
by defining the context vector v which has contextual
information extracted from the adjacent coarser scale
segmentation, and MAP classification by using a Gibbs
sampler.
4.1 The context vector v and the Bayesian interscale
decision fusion
As Sect. 3.1 and Figs. 2 and 3 show, the dependency of wavelet coefficients across scales has a quad-tree structure, and blocks at the locations (nodes) of the wavelet quad-tree are dependent across scales. Therefore, if a block g_i at scale s was classified as class label c, then its four child blocks at scale s − 1 are quite likely to belong to the same class, especially when s is small (at fine scales). Hence, we guide the classification decisions for the child blocks by the decision made for their parent block. In addition to the parent block, we can also use the neighbors of the parent to guide the decision process. Similar multi-scale decision ideas have been used in Refs. [7, 13, 16, 17].
To use the segmentation result of the adjacent coarser scale as a guide for segmentation of the current scale, we define the context vector v_i^s for site i at scale s:

v_i^s ≡ [c^{s+1}_{q(i)}, c^{s+1}_{N_1 q(i)}, c^{s+1}_{N_2 q(i)}, c^{s+1}_{N_3 q(i)}, c^{s+1}_{N_4 q(i)}, c^{s+1}_{N_5 q(i)}, c^{s+1}_{N_6 q(i)}, c^{s+1}_{N_7 q(i)}, c^{s+1}_{N_8 q(i)}],   (8)

where c^{s+1}_{q(i)} is the class label of the parent site q(i) of site i, and c^{s+1}_{N_1 q(i)}, ..., c^{s+1}_{N_8 q(i)} are the class labels of the eight neighbor sites of the parent site q(i). It follows that the random field V^s of the context vectors at scale s is V^s = {v_i^s | ∀i ∈ M^s}, for the set M^s of the sites at scale s. V^s contains the MAP classification information of the previous coarser scale. In this study, the configurations of each scale are combined by a Bayesian classifier while simultaneously considering the random field V^s, which carries contextual information from the adjacent coarser scale.
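To make Eq. (8) concrete, the context vector can be assembled from the coarser-scale label map as in the sketch below. The helper name, the 0-based indexing, and the edge-replication padding at lattice borders are our own illustrative choices, not part of the paper:

```python
import numpy as np

def context_vector(coarse_labels, row, col):
    """Context vector v_i^s for the fine-scale site (row, col): the class
    label of its parent site q(i) in the coarser label map C^{s+1},
    followed by the labels of the parent's eight neighbours.  The map is
    edge-padded so that border parents also have eight neighbours."""
    padded = np.pad(np.asarray(coarse_labels), 1, mode="edge")
    pr, pc = row // 2 + 1, col // 2 + 1           # parent q(i), padded coords
    block = padded[pr - 1:pr + 2, pc - 1:pc + 2]  # parent and its 8 neighbours
    parent = block[1, 1]
    neighbours = np.delete(block.ravel(), 4)      # drop the centre entry
    return np.concatenate(([parent], neighbours))
```

For a 4 × 4 fine scale over a 2 × 2 coarse map, site (3, 0) has parent (1, 0), so the first entry of its context vector is that parent's label.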
We consider maximizing the posterior p(C^s | G^s, V^s), where G^s ≡ {g_i^s | ∀i ∈ M^s} is the collection of all g_i^s at scale s. Assuming that G^s and V^s are conditionally independent given the configuration C^s, the posterior p(C^s | G^s, V^s) can be rewritten by the naive Bayes rule [28]:

p(C^s | G^s, V^s) ∝ p(C^s | G^s) p(C^s | V^s).   (9)
Despite its manifest simplicity, the naive Bayes rule works quite well in practice; the naive Bayes classifier is robust enough to tolerate serious deficiencies in its underlying independence assumption [28]. At scale s, texture segmentation based on the naive Bayes rule is performed by finding the optimal configuration C^{s*} as follows:

C^{s*} ≡ argmax_{C^s ∈ X^s} p(C^s | G^s, V^s)
      = argmax_{C^s ∈ X^s} {p(C^s | G^s) p(C^s | V^s)}
      = argmax_{C^s ∈ X^s} {p(C^s | G^s) p(V^s | C^s) p(C^s)},   (10)
where X^s is the configuration space at scale s (the set of all possible configurations of the sites of scale s in terms of the class labels in L). The probability p(C^s | G^s) carries the information obtained from all blocks g_i^s of scale s for segmentation. The distribution p(V^s | C^s) carries the information obtained from the random field V^s, which is extracted from the configuration C^{s+1} of the adjacent coarser scale s + 1. The factor p(V^s | C^s) models the dependencies between the square blocks across scale in a Markov-1 fashion [13, 26, 27], where the square blocks of scale s are assumed to depend only on the square blocks at scale s + 1. Using Eq. (10), we perform the Bayesian interscale decision fusion between scales s and s + 1 by classifying G^s based on the posterior p(C^s | G^s) and the guidance p(V^s | C^s) from the adjacent coarser scale s + 1. The Bayesian interscale decision fusion of Eq. (10) computes a MAP estimate of the configuration C^s of scale s. Markov modeling leads to a simple scale-recursive classification of the square blocks.
In this study, the multi-scale Bayesian texture segmentation is performed by applying Eq. (10) sequentially from coarse to fine scales. Starting from an initial coarse scale and halting the fusion at scale s, we obtain the MAP segmentation C^s_MAP. This multi-scale decision fusion greatly improves the robustness and accuracy of the segmentation (we will see the results of the fusion in Sect. 5). Halting the fusion at the final scale (pixel level), we obtain the final, improved MAP segmentation C^{s=0}_MAP.
4.2 Factors for MAP classification at one scale
The MAP classification derived from Eq. (10) has three factors: the posterior probability p(C^s | G^s), the distribution p(V^s | C^s), and the prior p(C^s). Below, these factors are modeled and explained.
4.2.1 Neighborhood system and Markov–Gibbs
equivalence
The MRF theory is a branch of probability theory for
analyzing the spatial or contextual dependencies of
physical phenomena [26, 27]. It is used in a labeling to
establish probabilistic distributions of interacting class
labels. This subsection introduces notations for MRFs,
and the use of an MRF model for constructing a prior
model.
The sites in M are related to one another via a neighborhood system. A neighborhood system for M is defined as N = {N_i | ∀i ∈ M}, where N_i is the set of sites neighboring i. For a regular rectangular lattice, M = {(x, y) | 1 ≤ x, y ≤ n}, and in the first-order neighborhood system (also called the "4-neighborhood system") the set of neighbors of (x, y) is defined as N_{x,y} = {(x−1, y), (x+1, y), (x, y−1), (x, y+1)}. When the location (x, y) is conveniently re-indexed by a single number i, where i takes values in {1, 2, ..., m} with m = n × n, the N_i of the 4-neighborhood system is represented as N_i = {i−1, i+1, i−n, i+n} (a site is not a neighbor of itself: i ∉ N_i). In the second-order neighborhood system, also called the 8-neighborhood system, every (interior) site of a regular rectangular lattice M = {i | 1 ≤ i ≤ m, m = n × n} has eight neighbors: N_i = {i−1, i+1, i−n, i+n, i−n−1, i−n+1, i+n−1, i+n+1}.

We modeled the prior p(C^s) as an MRF. There are two approaches to specifying an MRF: in terms of the conditional probabilities p(c_i^s | c^s_{N_i}), and in terms of the joint probability p(C^s), where c^s_{N_i} denotes the class labeling on the set of neighbors of site i. A theoretical result on the equivalence between MRFs and Gibbs distributions [26, 27] provides a mathematically tractable means of specifying the joint probability of an MRF. In a Gibbs distribution p(C^s) = Z^{−1} e^{−U(C^s)}, U(C^s) is the energy function, and Z is a normalizing constant called the partition function: Z = Σ_{C^s ∈ X^s} e^{−U(C^s)}. Writing out the Gibbs distribution in terms of the conditional probabilities p(c_i^s | c^s_{N_i}) gives

p(C^s) = ∏_{i ∈ M^s} p(c_i^s | c^s_{N_i}).   (11)

One can specify the joint probability p(C^s) by specifying an appropriate energy function U(C^s) for the desired system behavior. In this way, one encodes the a priori knowledge or preference about the interaction between class labels.
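The single-index neighborhood sets above can be sketched as follows; we use 0-based row-major indices rather than the 1-based indices of the text, and simply drop off-lattice candidates so that border sites get fewer neighbours:

```python
def neighbours8(i, n):
    """8-neighbourhood N_i of site i on an n-by-n lattice whose sites are
    numbered 0 .. n*n-1 in row-major order.  Off-lattice candidates are
    dropped, so border sites have fewer than eight neighbours."""
    r, c = divmod(i, n)
    offsets = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]
    return sorted((r + dr) * n + (c + dc)
                  for dr, dc in offsets
                  if 0 <= r + dr < n and 0 <= c + dc < n)
```

On a 4 × 4 lattice, the interior site 5 has neighbours [0, 1, 2, 4, 6, 8, 9, 10], matching {i−1, i+1, i−n, i+n, i−n−1, i−n+1, i+n−1, i+n+1}.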
4.2.2 The MRF smoothness prior
Physical properties in a neighborhood of space present
some coherence and generally do not change abruptly. It is
reasonable to assume that image pixels (or "blocks") that are spatially close are likely to belong to the same texture. Therefore, the smoothness constraint can be used as a
contextual constraint between class labels of sites. In an
MRF, each point on a lattice is statistically dependent only
on its neighbors so that the complexity of the model is
restricted. Smoothness constraints can be expressed as the
prior probability. Our proposed method uses an MRF
smoothness prior model for local smoothness constraint.
The MRF smoothness prior is characterized by the multi-
level logistic (MLL) model which tends to create a smooth
solution (or, to prefer uniform class labels) [27]. In this
paper, the energy function U(C^s) of the MLL model takes the following form:

U(C^s) = Σ_{i ∈ M^s} α_i + Σ_{i ∈ M^s} Σ_{i′ ∈ N_i} V_2(c_i^s, c_{i′}^s),   (12)

where α_i = 0 for all i ∈ M^s, the set N_i is the 8-neighborhood system, and V_2(c_i^s, c_{i′}^s) is defined as follows:

V_2(c_i^s, c_{i′}^s) = −β if sites i and i′ have the same label, and +β otherwise.   (13)

Here, β has a predetermined positive value. The conditional probability p(c_i^s | c^s_{N_i}) is then

p(c_i^s | c^s_{N_i}) = exp(−Σ_{i′ ∈ N_i} V_2(c_i^s, c_{i′}^s)) / Σ_{c_i^s ∈ L} exp(−Σ_{i′ ∈ N_i} V_2(c_i^s, c_{i′}^s)),   (14)

where L is the set of class labels, i.e., L = {1, 2, ..., N_c}. The MRF smoothness prior model p(C^s) is then constructed by using Eqs. (11) and (14). This MRF model, p(c_i^s | c^s_{N_i}), is the intrascale context model.
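The local conditional of Eqs. (12)–(14) is cheap to evaluate site by site. The sketch below is our own illustrative helper, scoring every candidate label for one site given its neighbours' labels:

```python
import numpy as np

def mll_conditional(neighbour_labels, n_classes, beta=1.0):
    """p(c_i^s | c_{N_i}^s) under the MLL smoothness prior: a neighbour
    contributes -beta to the energy when it shares the candidate label
    and +beta otherwise (Eq. (13)); the energies are then turned into a
    normalised Gibbs distribution (Eq. (14))."""
    nb = np.asarray(neighbour_labels)
    energy = np.array([np.where(nb == c, -beta, beta).sum()
                       for c in range(n_classes)])
    weights = np.exp(-energy)
    return weights / weights.sum()
```

With neighbours [0, 0, 0, 1] and β = 1, label 0 has energy −2 and label 1 has energy +2, so label 0 receives probability e²/(e² + e⁻²) ≈ 0.982, illustrating the preference for uniform labels.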
4.2.3 Factors for interscale dependency between class labels

In Eq. (10), p(V^s | C^s) represents the interscale dependency between class labels, and can be rewritten as follows:

p(V^s | C^s) = ∏_{i ∈ M^s} p(v_i^s | c_i^s).   (15)

In Eq. (15), it is assumed that all v_i^s are conditionally independent given C^s, and that each v_i^s is distributed according to p(v_i^s | c_i^s), independently of all other c_k^s and v_k^s, k ≠ i. Our proposed method estimates p(v_i^s | c_i^s) as follows:

p(v_i^s | c_i^s) ≡ (the number of elements of vector v_i^s with the same value as label c_i^s) / (the number of all elements of vector v_i^s)
              = (δ_{c_i^s, c^{s+1}_{q(i)}} + δ_{c_i^s, c^{s+1}_{N_1 q(i)}} + δ_{c_i^s, c^{s+1}_{N_2 q(i)}} + ... + δ_{c_i^s, c^{s+1}_{N_8 q(i)}}) / 9,   (16)

where δ_{m,n} is the Kronecker delta function. The distribution p(v_i^s | c_i^s) is an important factor for the multi-scale decision fusion. The distribution p(v_i^s | c_i^s), as formulated in Eq. (16), is the interscale context model.

4.2.4 A posterior probability from MLP networks

As mentioned in Sect. 2.1, it is assumed that all c_i^s are conditionally independent given G^s and that all g_i^s (the pixels or blocks of an image G^s) are independent and identically distributed (i.i.d.). It is also assumed that each c_i^s is distributed according to p(c_i^s | g_i^s), independently of all other c_k^s and g_k^s, k ≠ i. It follows that the p(C^s | G^s) of Eq. (10) can be rewritten as follows:

p(C^s | G^s) = ∏_{i ∈ M^s} p(c_i^s | g_i^s).   (17)

Here, the probability p(c_i^s | g_i^s) is obtained from the outputs of the MLP networks of scale s for each class by the method explained in Eq. (6):

p(c_i^s | g_i^s) = M(g_i^s; w_c^s) / Σ_{k=1}^{N_c} M(g_i^s; w_k^s),   (18)

where w_c^s is the weight vector of the MLP network for class c ∈ {1, 2, ..., N_c} at scale s.
4.3 Multi-scale Bayesian texture segmentation
and Gibbs sampler
From Eqs. (11), (15), and (17), we can rewrite Eq. (10) as follows:

C^{s*} = argmax_{C^s ∈ X^s} {p(C^s | G^s) p(V^s | C^s) p(C^s)}
       = argmax_{C^s ∈ X^s} ∏_{i ∈ M^s} p(c_i^s | g_i^s) p(v_i^s | c_i^s) p(c_i^s | c^s_{N_i}).   (19)

We can perform MAP segmentation at scale s by using the MAP classification of Eq. (19), and a Gibbs sampler can carry out this classification. A Gibbs sampler has previously been used as an MRF parameter estimator, or as a texture synthesizer for generating a texture from an MRF texture model [17, 26]. In this multi-scale Bayesian approach to segmentation, the optimal configuration C^{s*} is determined by a Gibbs sampler; that is, the Gibbs sampler acts as the MAP classifier, and texture segmentation at scale s is performed by the Gibbs sampler. If texture segmentation is performed sequentially from coarse to fine scales by using Eq. (19), then the texture segmentation results of each scale are fused sequentially from coarse to fine scales; in other words, the multi-scale decision fusion is performed. Finally, in the full-resolution image (at the finest scale), an improved texture segmentation result is obtained through the multi-scale decision fusion and the MRF smoothness prior, which reduces the noise caused by classification errors. When Eq. (19) is used at the initial coarse scale, the context vector does not exist; however, we can assume that the configuration C^s is statistically independent of the context vectors at the initial coarse scale. In other words, at the initial coarse scale, p(C^s | V^s) = p(C^s).
The procedure of texture segmentation at scale s is
shown in Fig. 5. The proposed segmentation algorithm of
Fig. 5 uses the Gibbs sampler as the MAP classifier. The
procedure of sequential texture segmentations from a
coarse to fine scale is shown in Fig. 6. The final texture
segmentation result in the full resolution image is obtained
by the proposed algorithm of Fig. 6.
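As a rough sketch of the procedure of Figs. 5 and 6 at a single scale, the pass below resamples each site from the local conditional implied by Eq. (19). The array layout — `post[r, c, k]` for p(c_i^s | g_i^s) and `context_p[r, c, k]` for p(v_i^s | c_i^s), both precomputed — is our own simplification; the paper's actual procedure is the one shown in Fig. 5:

```python
import numpy as np

def gibbs_sweep(labels, post, context_p, beta=1.0, n_iter=3, seed=0):
    """MAP classification at one scale per Eq. (19).  Each site is visited
    in turn and resampled from the local conditional, which multiplies the
    MLP posterior p(c_i|g_i), the interscale factor p(v_i|c_i), and the
    MLL smoothness term exp(-sum of V_2 potentials over the 8 neighbours)."""
    rng = np.random.default_rng(seed)
    n, _, n_classes = post.shape
    for _ in range(n_iter):
        for r in range(n):
            for c in range(n):
                energy = np.zeros(n_classes)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (dr, dc) != (0, 0) and 0 <= rr < n and 0 <= cc < n:
                            # V_2 of Eq. (13): -beta on agreement, +beta otherwise
                            energy += np.where(
                                np.arange(n_classes) == labels[rr, cc],
                                -beta, beta)
                score = post[r, c] * context_p[r, c] * np.exp(-energy)
                labels[r, c] = rng.choice(n_classes, p=score / score.sum())
    return labels
```

Starting from the per-site argmax of `post`, a few sweeps remove isolated misclassified sites while strong posteriors keep region boundaries in place.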
5 Experiments and results
Our proposed texture segmentation method has been tested
on mosaic texture images. To assess the performance and
the usefulness of our approaches to texture segmentation,
our statistical model based on MLP networks is experi-
mentally compared with the HMT model, and the HMTseg algorithm [13] (a multi-scale Bayesian image segmentation algorithm that fuses the multi-scale image segmentations obtained by the HMT) is experimentally compared with our method for sequential segmentation and the fusion of classification information through scales.
5.1 Data set and configurations of texture segmentation
systems
In our experiments, 20 Brodatz textures were used. These textures are shown in Fig. 7. From each 512 × 512 Brodatz texture image, we randomly picked ten (overlapping)
Fig. 5 MAP segmentation at scale s using a Gibbs sampler
Fig. 6 The multi-scale decision fusion by sequential texture segmentation from coarse to fine scales
64 × 64 blocks. Then, the multi-level (three-level) wavelet coefficients of those image blocks were used as training data. All MLP networks have one hidden layer, and the hidden layer of each MLP network has 20 hidden nodes. The inputs of the MLP networks of scale s are (((4^s − 4)/3 + 1) × 2 + ((4^{s+1} − 4)/3 + 1))-dimensional vectors [refer to Eq. (4)]. The MLP networks were trained using these training data. HMT models were also trained for each texture using training data identical to those of the MLP networks. However, while an MLP network for one texture class is trained using the training data of all texture classes, an HMT model for one texture class is trained only on the training data of that texture class. The weights of the MLP networks were updated using the MATLAB neural network toolbox [36] (with the resilient back-propagation algorithm). The parameters of the HMT were estimated using an EM algorithm (a threshold value of 10^{−7} was used to determine model convergence) with an intelligent parameter initialization [37]. For image segmentation using the HMT model, a pixel-brightness pdf model at the pixel level is required [13]. Therefore, to obtain pixel-level segmentation, 2-density Gaussian mixture models were used and trained as the pixel-brightness pdf model.
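As a quick check of the input-dimension formula quoted above (Eq. (4) itself lies outside this excerpt), a hypothetical helper:

```python
def mlp_input_dim(s):
    """Dimensionality of the scale-s MLP input vector, per the formula
    quoted in the text: ((4^s - 4)/3 + 1) * 2 + ((4^(s+1) - 4)/3 + 1)."""
    coeffs = lambda k: (4 ** k - 4) // 3 + 1
    return coeffs(s) * 2 + coeffs(s + 1)
```

This gives 7-, 31-, and 127-dimensional inputs for scales s = 1, 2, 3.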
The initial coarse scale (the starting scale for the multi-scale decision fusion) is set at s_0 = 3, so that the coarsest texture segmentations have sufficient reliability. At a very coarse scale, the blocks g_i are very large and hence likely to contain several differently textured regions; therefore, we ignore the information at very coarse scales. The test texture images for the experiments are shown in Fig. 8. Since the MLP networks are trained on the multi-level wavelet coefficients of 64 × 64 image blocks, the MLP networks can handle 64 × 64 image blocks. When a test texture image is larger than the 64 × 64 blocks that the MLP networks can handle, we apply the MLP networks repeatedly to the 64 × 64 test-image subblocks, assuming that the blocks are independent. Since the size of an image that the trained HMT model can handle is also smaller than the size of the test images of Fig. 8, we applied the HMT model repeatedly to the test-image subblocks under the same assumption.

In the proposed algorithm of Fig. 5, the Gibbs sampler was executed for three iterations. In the HMTseg algorithm, a threshold value of 10^{−4} was used to determine the convergence of the algorithm.
5.2 Texture segmentation results and comparisons
As shown in Figs. 5 and 6, our proposed segmentation
algorithm contains the multi-scale Bayesian framework.
The HMTseg algorithm is also a multi-scale Bayesian
image segmentation technique. To compare our proposed
method (Fig. 6) and the HMTseg, we experimented with
the test images of Fig. 8. The procedure of Fig. 6 contains
the multi-scale decision fusion, the MRF smoothness prior,
and the MAP segmentation using a Gibbs sampler. Let the
procedure of Fig. 6 be called the Gibbs segmentation.
Figure 9 shows the texture segmentation results for the 2-textures test image (diagonal2). Figure 9a shows the multi-scale texture segmentations obtained by using MLP networks. As shown in Fig. 9, the Gibbs segmentation of our proposed method is applied to the multi-scale texture segmentations of Fig. 9a, with the result shown in Fig. 9c; the HMTseg is likewise applied to the multi-scale texture segmentations of Fig. 9a, with the result shown in Fig. 9b. Figure 9c displays the texture segmentation result of each scale in the process of the multi-scale decision fusion by using the
Fig. 7 The training images (512 × 512 pixels) of the 20 Brodatz textures used in the experiment
Fig. 8 Mosaic test texture images and ideal texture segmentation maps for the mosaic test images; a The 64 × 64 diagonal2 mosaic, a two-textures image (D24/D84); b The 64 × 64 blocks3 mosaic, a three-textures image (D24/D68/D84); c The 64 × 64 cross4 mosaic, a four-textures image (D22/D24/D68/D84); d The 192 × 192 squares9 mosaic, a 9-textures image (D15/D19/D22/D24/D26/D38/D49/D68/D103); e Ideal texture segmentation map for diagonal2; f Ideal texture segmentation map for blocks3; g Ideal texture segmentation map for cross4; h Ideal texture segmentation map for squares9
Gibbs segmentation. Figure 9b also displays the texture
segmentation result of each scale in the process of the
multi-scale decision fusion by using the HMTseg algorithm. When Fig. 9b and c are compared, the texture segmentation results of the Gibbs segmentation have less noise (noise due to classification errors), and the Gibbs segmentation of our proposed method exhibits superior performance.
Texture segmentations for test images of Fig. 8 using our
proposed method (which is represented in Figs. 5 and 6)
were compared with those using the HMT model (at scale
s = 0, a 2-density Gaussian mixture model is used) and the
HMTseg (or, MLP networks and the HMTseg) in Fig. 10.
Considering the segmentations of all scales, the segmenta-
tion of the finest scale is the final texture segmentation
which we would get. Final texture segmentations of the
finest scale obtained by each segmentation method are
shown in Fig. 10. The texture segmentation results of our proposed method have less noise (noise due to classification errors); our proposed method therefore exhibits superior performance. The segmentation error rate between the ideal segmentation (Fig. 8) and the final texture segmentation (Fig. 10) at the finest scale is
the number of misclassified pixels to the total number of
pixels in a test image. As shown in Table 1, our proposed
segmentation method performs better than the segmentation
method using the HMT, Gaussian mixtures, and the
HMTseg. Table 1 also shows that our proposed segmenta-
tion method performs better than the segmentation method
using MLP networks and the HMTseg. In addition, in terms of texture segmentation using the multi-scale decision fusion process and the reduction of noise caused by classification errors, the segmentation error rates of Table 1 reconfirm the effectiveness of the Gibbs segmentation.
Fig. 9 Segmentations of diagonal2; a the texture segmentation C of each scale by using MLP networks; b the texture segmentation C of each scale after the application of the HMTseg to a; c the texture segmentation C of each scale after the application of the Gibbs segmentation to a
Fig. 10 Texture segmentation results by each segmentation method for the test images of Fig. 8; in a–d the first column shows results by the HMT and the HMTseg, the second column by MLP networks and the HMTseg, and the third column by our proposed method; a For diagonal2; b For blocks3; c For cross4; d For squares9
Table 1 The segmentation error rate for 2-, 3-, 4-, and 9-textures test images

Test images   The HMT, Gaussian mixtures,   MLP networks and   MLP networks and the Gibbs segmentation
              and the HMTseg (%)            the HMTseg (%)     (our proposed method) (%)
Diagonal2     6.81                          3.86               1.20
Blocks3       13.72                         9.11               2.56
Cross4        11.62                         5.32               1.56
Squares9      8.07                          5.16               0.98

Segmentation error rate = (the number of misclassified pixels of the test image)/(the number of all pixels of the test image) × 100%
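The error rate defined beneath Table 1 can be computed as follows (the helper name is our own):

```python
import numpy as np

def segmentation_error_rate(seg, ideal):
    """Segmentation error rate of Table 1: misclassified pixels divided
    by the total number of pixels, expressed as a percentage."""
    seg, ideal = np.asarray(seg), np.asarray(ideal)
    return 100.0 * np.count_nonzero(seg != ideal) / seg.size
```

A 2 × 2 map with one mislabelled pixel yields 25.0%.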
6 Discussion and conclusion
Multi-scale Bayesian approaches to texture segmentation
form a natural framework for integrating both global and
local information of image behavior, together with con-
textual information. Motivated by the merit of multi-scale
Bayesian approaches, we proposed a novel method of
supervised texture segmentation using MLP networks, an
MRF model, and Gibbs sampler within a multi-scale
Bayesian framework. Multi-scale DWT coefficients, that
are suitable for multi-scale image processing, were used as
inputs for MLP networks. Texture segmentation was per-
formed by using outputs of MLP networks. Our proposed
multi-scale Bayesian texture segmentation method used the
MLL model (an MRF smoothness prior model) to reduce noise caused by classification errors; in addition, it defined the context vector v and modeled the interscale
dependency for the multi-scale decision fusion. A Gibbs
sampler integrated the above factors (the posterior probability from MLP networks, the MRF smoothness prior model, and the interscale context model) and acted as the MAP
classifier. We called the procedure of integrating the above
factors (Fig. 6) the Gibbs segmentation.
In experiments to compare the Gibbs segmentation and
the HMTseg algorithm (which is a multi-scale Bayesian
image segmentation technique), the Gibbs segmentation
displayed outstanding performance. The results of texture
segmentation by our proposed method were compared with
those using other methods (the HMT and the HMTseg [13];
or, MLP networks and the HMTseg [12]). Through these
experiments, we can see that our proposed method is
superior to other methods in terms of texture segmentation
performance. The reasons why our proposed method performs so well can be summarized as follows: our method integrates the classification power of MLP networks into the Bayesian framework, and, through appropriately constructed prior models, it reduces the noise due to classification errors. In addition, the MAP estimation by a Gibbs sampler contributes to the strong performance of our method.
References
1. Tuceryan M, Jain AK (1998) Texture analysis. In: Chen CH, Pau
LF, Wang PSP (eds) The handbook of pattern recognition and
computer vision, 2nd edn. World Scientific Publishing Co., pp.
207–248
2. Vaidyanathan G, Lynch PM (1990) Edge based texture seg-
mentation. In: IEEE proceedings of Southeastcon 90’ 3:1110–
1115
3. Georgeson MA (1979) Spatial Fourier analysis and human vision, chap 2. In: Sutherland NS (ed) Tutorial essays in psychology, a guide to recent advances, vol 2. Lawrence Erlbaum Associates, Hillsdale
4. Devalois RL, Albrecht DG, Thorell LG (1982) Spatial-frequency
selectivity of cells in macaque visual cortex. Vis Res 22:545–559
5. Silverman MS, Grosof DH, De Valois RL, Elfar SD (1989) Spatial-frequency organization in primate striate cortex. Proc Natl Acad Sci USA 86
6. Fan G, Xia XG (2001) A joint multicontext and multiscale
approach to Bayesian image segmentation. IEEE Trans Geosci
Remote Sens 39(12):2680–2688
7. Cheng H, Bouman CA (2001) Multiscale Bayesian segmentation
using a trainable context model. IEEE Trans Image Process
10(4):511–525
8. Bouman C, Liu B (1991) Multiple resolution segmentation of tex-
tured images. IEEE Trans Pattern Anal Mach Intell 13(2):99–113
9. Ng I, Kittler J, Illingworth J (1993) Supervised segmentation
using a multiresolution data representation. Signal Process
31:133–163
10. Meyer Y (1993) Wavelets: algorithms and applications. SIAM, Philadelphia
11. Li J, Gray RM, Olshen RA (2000) Multiresolution image clas-
sification by hierarchical modeling with two-dimensional hidden
Markov models. IEEE Trans Inf Theory 46(5):1826–1841
12. Kim TH, Eom IK, Kim YS (2005) Texture segmentation using
neural networks and multi-scale wavelet feature. Lecture Notes in
Computer Science 3611. Springer, Berlin, Heidelberg, pp 395–
404
13. Choi HK, Baraniuk RG (2001) Multiscale image segmentation
using wavelet-domain hidden Markov models. IEEE Trans Image
Process 10(9):1309–1321
14. Unser M (1995) Texture classification and segmentation using
wavelet frames. IEEE Trans Image Process 4:1549–1560
15. Weldon TP, Higgins WE (1996) Design of multiple Gabor filters
for texture segmentation. In Proceedings of international con-
ference acoustic speech, signal proceeding, Atlanta, pp 2243–
2246
16. Fan G, Xia XG (2003) Wavelet-based texture analysis and syn-
thesis using hidden Markov models. IEEE Trans Circuits Syst
Fundam Theory Appl 50(1):106–120
17. Sun J, Gu D, Zhang S, Chen Y (2004) Hidden Markov Bayesian
texture segmentation using complex wavelet transform. IEE Proc Vis Image Signal Process 151(3):215–223
18. Randen T, Husoy JH (1999) Filtering for texture classification: a
comparative study. IEEE Trans Pattern Anal Mach Intell
21(4):291–310
19. Wouwer GV, Scheunders P, Dyck DV (1999) Statistical texture
characterization from discrete wavelet representations. IEEE
Trans Image Process 8(4):592–598
20. Crouse M, Nowak R, Baraniuk RG (1998) Wavelet-based sta-
tistical signal processing using hidden Markov models. IEEE
Trans Signal Process 46(4):886–902
21. Fan G, Xia XG (2001) Image denoising using a local contextual
hidden Markov model in the wavelet domain. IEEE Signal Pro-
cess Lett 8(5):125–128
22. Duda RO, Hart PE, Stork DG (2001) Pattern classification, 2nd
edn. A Wiley Interscience Publication, London. pp 51–63, 161–
192, 576–582
23. Gish H (1990) A probabilistic approach to the understanding and
training of neural network classifiers. In: Proceedings of IEEE
international conference on acoustics, speech and signal pro-
cessing. Albuquerque, pp 1361–1364
24. Richard MD, Lippmann RP (1991) Neural network classifiers
estimate Bayesian a posteriori probabilities. Neural Comput
3:461–483
25. Rojas R (1996) Short proof of the posterior probability property
of classifier neural networks. Neural Comput 8:41–43
26. Li SZ (1995) Markov random field modeling in computer vision.
Springer, New York
27. Li SZ (2001) In: Kunii TL (eds) Markov random field modeling
in image analysis, 2nd edn. Computer science workbench.
Springer, Berlin
28. Duda RO, Hart PE, Stork DG (2002) Pattern classification, 2nd
edn. Wiley Interscience Publication, London. Revised chapter
section 2.11
29. Shapiro JM (1993) Embedded image coding using zerotrees of
wavelet coefficients. IEEE Trans Signal Process 41(12):3445–3462
30. Shapiro JM (1996) Image compression by texture modeling in the
wavelet domains. IEEE Trans Signal Process 5(1):26–36
31. Simoncelli EP (1997) Statistical models for images: compression,
restoration and synthesis. In: Proceedings of 31st Asilomar con-
ference on signals, systems and computers. Pacific Grove, pp
673–678
32. Derin H, Elliot H (1987) Modeling and segmentation of noisy and
textured images using Gibbs random fields. IEEE Trans Pattern
Anal Mach Intell 9(1):39–55
33. Manjunath BS, Simchony T, Chellappa R (1990) Stochastic and
deterministic networks for texture segmentation. IEEE Trans
Acoust Speech Signal Process 38(6):39–55
34. Mallat S (1998) A wavelet tour of signal processing. Academic
Press, New York
35. Riedmiller M, Braun H (1993) A direct adaptive method for
faster backpropagation learning: the Rprop algorithm. In: Pro-
ceedings of the IEEE international conference on neural
networks, San Francisco
36. Demuth H, Beale M. Neural network toolbox for use with MATLAB, user's guide version 4. The MathWorks Inc., pp 137–194
37. Fan G, Xia XG (2001) Improved hidden Markov models in the
wavelet-domain. IEEE Trans Signal Process 49(1):115–120