
Texture Characterization and Analysis Tutorial I - version 3.0


DESCRIPTION

A personal view on textural feature definitions, parameters and relations with noise, and image processing concepts. Many mathematical definitions for quantification of texture features. Course notes currently used in master courses in the National Autonomous University of Mexico.


Page 1: Texture Characterization and Analysis Tutorial I  - version 3.0

Jorge Marquez - UNAM 2008 Texture Tutorial I 1/90

Texture Characterization and Analysis Tutorial I - version 3.0

Jorge Márquez Flores, PhD - [email protected] Images and Vision Laboratory, CCADET-UNAM Copyright ©2005,2007,2009

Dept. Traitement de Signal et Images - Ecole Nationale Supérieure des Télécommunications, Paris, Copyright ©2000,2007 Texture Analysis Workshop - Technische Universität Hamburg, Copyright ©1996,2001

Diplomado de Teledetección, Facultad de Ciencias, UNAM, Copyright © 2006,2007,2008,2009. This tutorial introduces several concepts and ways to define and to analyze textural features in complex images from various fields of scientific research. It is designed as a preliminary lecture, to be read before attending my short course/workshop on texture characterization and analysis, where further material will be explained, accompanied by many graphic examples. Note that this is copyrighted material, at present belonging to the CCADET-UNAM in Mexico City, Mexico. It is provided only to course attendants and will be submitted for publication as a specialized book in 2010; please do not reproduce it without permission.

Introduction: texture definitions

We begin by recognizing that there is no universally accepted mathematical definition of texture. In the modern framework of fractal analysis (which we will only review very briefly), according to Benoit Mandelbrot,


“Texture is an elusive notion which mathematicians and scientists tend to avoid because they cannot grasp it and much of fractal geometry could pass as an implicit study of texture [Mandelbrot 1983, The Fractal Geometry of Nature].”

A vague but intuitive definition from Wikipedia (July 2007) says:

Texture refers to the properties held and sensations caused by the external surface of objects received through the sense of touch. The term texture is used to describe the feel of non-tactile sensations. Texture can also be termed as a pattern that has been scaled down (especially in case of two dimensional non-tactile textures) where the individual elements that go on to make the pattern are not distinguishable.

Textures are often defined as a class of “patterns” (but note the scaling-down condition above), yet pattern is a notion in itself, where a number of elements are identified and more or less isolated, and their relationships studied in Pattern Analysis. There are also “sound patterns” and “sound textures”, implying complex, varying details. All human senses (and all kinds of information) suggest the existence of domain-specific textural qualities, even in psychological behavior, literature and abstract concepts. A simpler but still limited definition in image analysis is that of gray-level variations following some kind of periodic or quasi-periodic pattern (but there are also textures with no periodicities at all). Here “pattern” stands for a set of rules or an approximate configuration or structure. In addition, a degree of randomness (as opposed to determinism) has to be included. There is also the notion of a mixture of different kinds of variations; that is, a given texture can be constituted (or modeled, or combined, if texture synthesis is desired) from simpler textures. In computer graphics, texture refers to two notions, distinguished by context: the tactile, 3D variations of surfaces, and the color maps of texture mapping (at a pixel level, any bitmap image) wrapped onto computer meshes of objects, to make their rendering easier and faster.

Most models of textures began by treating them as “repeating” patterns, and some old definitions abuse this simplification. Some statistical repetitiveness may indeed be present in many textures, especially at scales where features average out. Complex and even just irregular textures were for a long time ignored, since no mathematical or computer tools could successfully grasp all their properties. The ability to characterize, model and even synthesize a particular pattern or texture has been accompanied by an understanding of the physical, biochemical or other processes giving rise to them. Modern definitions rely on descriptive approaches, mainly statistical and structural. Repetitiveness may or may not be present (or only in some statistical, or filtered, sense), and textures may have textural components at both extremes. Randomness is another element present in natural or “real” textures. The literature on texture analysis reports to date about 2000 textural parameters, but most of them are strongly correlated and may be difficult to interpret and compare. We instead start at a higher level of description, based on a few simple, intuitive notions grounded in human perception.


1. Human-Perceived Measures of Texture

One objective of computer vision is to find an unsupervised, automatic means of extracting texture features that agrees with human sensory perception. Tamura et al. [Tamura, H., Mori, S., and Yamawaki, T. Textural features corresponding to visual perception. IEEE Transactions on Systems, Man, and Cybernetics, 8(6):460, 1978] defined six mathematical measures which are considered necessary to classify a given texture. These measures are:

Coarseness, coarse versus fine. Coarseness is the most fundamental texture measure and has been investigated since early studies in the 1970s by Hayes [IEEE Trans. on Systems, Man and Cybern. 4:467, 1974]. When two patterns differ only in scale, the magnified one is coarser. For patterns containing different structure, the bigger the element size, or the less the elements are repeated, the coarser the pattern is felt to be. Some authors use busy or close for fine. The elements are also known as “the grain of the texture”. They may include the texels of a structural approach (see below); the bigger the texels, the larger the coarseness. When two texture regions are jointly scaled in such a way that they have “the same” coarseness or grain (feature) size, we may think of the other measured features as being invariant to scale, to some degree.
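As a rough numerical illustration of coarseness (not Tamura's exact measure, whose formula is given later in the course), one can estimate grain size from how quickly the texture's autocorrelation decays; the 0.5 threshold and the synthetic stripe textures below are illustrative choices:

```python
import numpy as np

def coarseness_acf(img):
    """Estimate coarseness as the first lag (in pixels, along x) at which
    the normalized autocorrelation of the zero-mean image drops below 0.5.
    A rough proxy for grain size, not Tamura's coarseness formula."""
    img = img.astype(float) - img.mean()
    # row-wise circular autocorrelation via the Wiener-Khinchin theorem
    spec = np.abs(np.fft.fft(img, axis=1)) ** 2
    acf = np.fft.ifft(spec, axis=1).real.mean(axis=0)
    acf /= acf[0]
    below = np.nonzero(acf < 0.5)[0]
    return int(below[0]) if below.size else img.shape[1]

# A coarse texture (period-16 stripes) decorrelates more slowly than a
# fine one (period-4 stripes):
x = np.arange(64)
fine   = np.tile(np.sin(2 * np.pi * x / 4),  (64, 1))
coarse = np.tile(np.sin(2 * np.pi * x / 16), (64, 1))
assert coarseness_acf(coarse) > coarseness_acf(fine)
```

The decay lag grows with the stripe period, matching the intuition that bigger elements mean a coarser texture.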

Contrast, high versus low. The simplest method of varying image contrast is the stretching or shrinking of the grey scale [Rosenfeld and Kak, Digital Picture Processing, 1982]. By changing (globally) the contrast level of an image we alter the image quality, not the image structure. When two patterns differ only in their grey-level distribution, the difference in their contrast can be measured. However, more factors are supposed to influence the contrast differences between two texture patterns with different structures. The following four factors are considered for a definition of contrast:
• Dynamic range of the grey scale;
• Polarisation (bias, skewness) of the distribution of black and white on the grey-level histogram, or the distribution of the ratio of black and white areas;
• Sharpness of edges (against background) and shape sharpness (corners);
• Period of repeating patterns, and their frequency characterisation.
High global contrast is associated with the weakness or strength of a texture, but it may be normalized, for example, to the full dynamic range. This is why most texture features may be thought of as invariant to contrast.
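Tamura's contrast measure combines the first two factors above, the grey-level spread and the polarisation of the histogram; a minimal NumPy sketch (the normalizing exponent 1/4 follows Tamura's published definition):

```python
import numpy as np

def tamura_contrast(img):
    """Tamura's contrast: standard deviation normalized by the fourth
    root of the kurtosis alpha_4 = mu_4 / sigma^4, so that both the
    grey-level spread and the polarisation of the histogram contribute."""
    g = img.astype(float).ravel()
    sigma2 = g.var()
    if sigma2 == 0:
        return 0.0  # a constant image has no contrast
    alpha4 = ((g - g.mean()) ** 4).mean() / sigma2 ** 2
    return float(np.sqrt(sigma2) / alpha4 ** 0.25)

# Stretching the grey scale by 2 doubles the measure (kurtosis is
# scale-invariant), matching the dynamic-range intuition above:
rng = np.random.default_rng(0)
img = rng.normal(128, 20, (32, 32))
assert np.isclose(tamura_contrast(2 * img), 2 * tamura_contrast(img))
```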

Directionality, directional versus non-directional. This is an average property over the given region. Directionality involves both element shape and placement rules. Bajcsy [7] divides directionality into two groups, mono-directional and bi-directional, but Tamura measures only the total degree of directionality. In such a scheme the orientation of the texture pattern does not matter; that is, patterns which differ only in orientation should have the same degree of directionality. A subtle difference exists between directionality and coherence; the latter is a measure of a kind of spatial correlation where local directionality exists. Two directions may co-exist and even be separable. Another tool for studying both orientation and directionality is the local tensor of inertia, whose eigenvalues and eigenvectors (magnitude and angle) give the orientation and an “intensity” or degree of directionality/coherence. Directionality may also be called “anisotropy”, and the absence of a dominant directionality is the characteristic of an isotropic texture.

Since the dominant orientation of a texture region can be obtained, a rotation into the principal axes is always possible, and this constitutes a normalization with respect to orientation: it is best to compare textures once they are aligned (same region). In this sense, the other texture features may be thought of as invariant under rotations.
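The tensor-of-inertia tool mentioned above can be sketched numerically; the gradient structure tensor below is a standard formulation of it, not necessarily the author's exact construction:

```python
import numpy as np

def orientation_and_coherence(img):
    """Dominant orientation and degree of directionality from the 2x2
    structure tensor (tensor of inertia of the gradient field), summed
    over the region. The angle is the dominant gradient direction; the
    coherence (lam1 - lam2) / (lam1 + lam2) is 1 for a perfectly
    oriented texture and near 0 for an isotropic one."""
    gy, gx = np.gradient(img.astype(float))
    jxx, jyy, jxy = (gx * gx).sum(), (gy * gy).sum(), (gx * gy).sum()
    # closed-form eigen-angle and eigenvalue gap of a 2x2 symmetric tensor
    angle = 0.5 * np.arctan2(2 * jxy, jxx - jyy)
    trace = jxx + jyy
    coh = np.hypot(jxx - jyy, 2 * jxy) / trace if trace else 0.0
    return angle, coh

# Vertical stripes vary only along x: angle ~ 0 (gradient along x) and
# coherence ~ 1 (strongly directional texture)
x = np.arange(64)
stripes = np.tile(np.sin(2 * np.pi * x / 8), (64, 1))
angle, coh = orientation_and_coherence(stripes)
assert abs(angle) < 1e-9 and coh > 0.99
```

Rotating the texture by the measured angle implements the orientation normalization described above.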

Line-likeness, line-like versus blob-like. This concept is concerned only with the shape of a texture element. This aspect of texture may supplement the major ones previously mentioned, especially when two patterns cannot be distinguished by directionality. The blob-like nature corresponds to the “mosaic index” used in metallurgy and porosity analysis: it describes the existence of domains or large isotropic grains.

Regularity, regular versus irregular. This can be thought of as variation in a placement rule. For texture elements, however, it can be supposed that variation between elements, especially in the case of natural textures, reduces the overall regularity. Additionally, a fine structure tends to be perceived as regular. Irregularity is not exactly a synonym of randomness or noise.


Roughness, rough versus smooth. This description was originally intended for tactile textures, not for visual textures. However, when we observe natural textures, such as hessian cloth, we are able to compare them in terms of rough and smooth. Tamura asks whether this subjective judgment is due to the total energy of changes in grey level, or to our imaginary tactile sense of touching the texture physically. It is worth noting that roughness is different from contrast, which deals only with gray-level variations (“vertical” in a z = I(x,y) representation), while roughness also includes local shape variations.

Tamura gives mathematical definitions of all these measures, plus results of experiments to find the correlation between these measures and human perception. A final property of textures concerns the degree to which they are regular or random. The vast majority of natural textures are random in nature, which allows any of the above parameters to vary. Formulae for all these techniques are presented in the following paragraphs. Another possible structure is the use of a pyramid holding an image at multiple resolutions, each a factor of 2 or 4 smaller than the previous level. This provides an intuitive way of detecting different textures at different scale levels.

Other properties are sometimes taken into account, such as the presence of spatial gradients (a texture for which one or more features change in one or many directions) and local, regional and scaling properties. A degree of sharpness or smoothness, due to blurring by filtering (digital, or from the PSF of the acquisition system), may also be thought of or treated as a textural property. Again, some degree of invariance with respect to sharpness may be obtained by properly correcting (by deconvolution) or regularizing (smoothing, to obtain the “same focus”) two textures before comparison. Color variations are similar to contrast variations, when two textures change only in hue, saturation, luminosity (HSL) or, equivalently, in individual RGB intensities. Color normalization may be difficult when the goal is to enhance color changes due to textural differences, but a global equalization may be useful.

In relation to the scaling properties of the texture itself, we show in Tutorial 2 that fractal geometry can act as an alternative approach to the problem and provide texture classification through fractal descriptions. It should be noted, however, that these approaches are not mutually exclusive. For example, some recent papers on morphology also include fractal ideas [96]. Ohanian and Dubes [107] compare the performance of four classes of textural features and come to the conclusion that co-occurrence features perform best, followed by fractal features; however, there is no universally best subset of features. The feature selection task has to be performed for each specific problem, to decide which feature of which type one should use. Other qualities have been identified with textures; including those already mentioned, a texture can be coarse, fine, smooth, granulated, rippled, regular, irregular or linear.
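The multi-resolution pyramid mentioned above can be sketched with plain NumPy; the 2×2 box averaging is an assumption (a mean pyramid), since the text does not fix the smoothing kernel:

```python
import numpy as np

def pyramid(img, levels=4):
    """Multi-resolution pyramid: each level is a 2x2 box-averaged,
    factor-2 downsampled copy of the previous one (a mean pyramid;
    Gaussian pyramids use a smoother kernel)."""
    out = [img.astype(float)]
    for _ in range(levels - 1):
        a = out[-1]
        h, w = (a.shape[0] // 2) * 2, (a.shape[1] // 2) * 2
        a = a[:h, :w]  # crop odd sizes so the 2x2 blocks tile exactly
        out.append(0.25 * (a[0::2, 0::2] + a[1::2, 0::2]
                           + a[0::2, 1::2] + a[1::2, 1::2]))
    return out

levels = pyramid(np.zeros((64, 64)), levels=4)
assert [l.shape for l in levels] == [(64, 64), (32, 32), (16, 16), (8, 8)]
```

Texture measures computed at each level probe the image at a different scale, which is the intuition behind scale-space texture detection.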


2. Summary:

• Textural attributes are difficult to define and characterize; there is a certain proximity to the nature of noise (a blurred distinction).

• Textural details are often also close to resolution limits and to the aperture functions (in particular the PSF – Point Spread Function) characterizing the observing system or the methods for processing and analysis.

• There may be one texture at one scale of analysis and another, different texture at another scale of analysis. The two may or may not be related or similar.

• When detail frequencies exceed the Shannon-Nyquist (sampling) limit, observed textures are in fact sub-sampled textures and may exhibit superposed “emergent” textural patterns, such as Moiré and multiplicative noise (speckle). Since details of images of natural phenomena do eventually fall below the resolution limits of any acquisition system, there may be sub-sampled sub-textures.

• Textures are sets of many local characteristics: there is a need to analyze spatial relations, etc., among two or more pixels – co-occurrence, correlation, mutual information…

• Language adds to the confusion with ill-defined terms: roughness, grain, waviness, blobbiness, busyness, bumpiness, energy…

Several textures could be roughly defined (synthesized) as:

texture = ⊕k ( wk · texturek )        (1)


with wk a set of weights; each texturek may itself be composed of simpler, different textures, and “⊕” may stand for: averaging, morphological blending, set union, intersection, and, or, over, overlapping, modulation, or any superposition in general. A texture may be called (linearly) separable if the above decomposition is possible. Most often it is only an approximation (a superposition model, with coefficients weighting components from a “textural” basis). There are other kinds of decomposition, for example at different resolutions, and the time-frequency paradigm is often used (wavelets).
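A toy synthesis in the spirit of eq. (1), with ‘+’ (weighted averaging) and ‘max’ (an over-type overlay) as two choices of the combination operator; the component textures and weights are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
x, y = np.meshgrid(np.arange(64), np.arange(64))

# three simple component textures, all scaled to [0, 1]
stripes = 0.5 * (1 + np.sin(2 * np.pi * x / 8))   # periodic
checker = ((x // 8 + y // 8) % 2).astype(float)   # blocky
noise   = rng.random((64, 64))                    # random

# linear superposition: texture = sum_k w_k * texture_k
w = [0.5, 0.3, 0.2]
texture = w[0] * stripes + w[1] * checker + w[2] * noise

# a non-linear combination rule ('over' / max blending) for the same set
texture_max = np.maximum(np.maximum(stripes, checker), noise)

# with weights summing to 1, the linear blend stays in [0, 1]
assert 0 <= texture.min() and texture.max() <= 1
```

Swapping the operator (sum, max, modulation, …) changes the resulting texture even though the components are identical, which is why separability depends on the combination rule.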

Note: a linear decomposition or linear projection into an orthonormal basis E = {ek} would imply an “E-domain transform TE”, obtaining coefficients as a result of the transformation wk = TE(texture), such that:

texture = Σk wk ek        (2)

This can be interpreted as a weighted average (blending) or superposition (the set W = {wk} being the spectrum in the E-domain), and the set E may also be a set of texture primitives in all possible configurations (positioning rules).
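A minimal numerical check of eq. (2): project a texture onto an orthonormal basis to get the spectrum W = {wk}, then reconstruct it as the weighted superposition. A random orthogonal basis stands in here for a real textural basis (DCT, wavelet, Gabor, …):

```python
import numpy as np

rng = np.random.default_rng(2)

# an orthonormal basis E = {e_k} for flattened 8x8 patches; the columns
# of Q from a QR factorization of a random matrix are orthonormal
E, _ = np.linalg.qr(rng.normal(size=(64, 64)))

texture = rng.random((8, 8)).ravel()

# E-domain transform: w_k = T_E(texture) = <texture, e_k>
w = E.T @ texture

# reconstruction as the weighted superposition texture = sum_k w_k e_k
recon = E @ w
assert np.allclose(recon, texture)
```

Orthonormality is what makes the transform invertible by a simple inner product; for a non-orthogonal texel set one would need a dual (biorthogonal) basis instead.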

A texture may combine (or even be the result of):
• two or more simpler textures (as a superposition or another form of combination);
• gradients (illumination, material changes in darkness, or projection effects such as perspective from 3D orientation);
• noise (of several kinds), distortions, artifacts and structure (not properly textures);
• blurring (from the acquisition-system PSF, or from explicit filtering);
• aliasing effects from several sources (mostly sub-sampling due to interaction with other textures, rather than discretization or resolution reduction).
Coloring or discoloration can be thought of as a specific kind of noise, gradient or artifact (a splotch).
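The sub-sampling/aliasing effect listed above can be demonstrated on a one-dimensional stripe pattern; the specific numbers (period-3 stripes, factor-2 sub-sampling) are illustrative:

```python
import numpy as np

n = 240
t = np.arange(n)
fine = np.sin(2 * np.pi * t / 3)   # stripe period: 3 pixels (80 cycles)

sub = fine[::2]                    # sub-sample by 2: Nyquist period is now 4 px

def peak_cycles(sig):
    """Dominant frequency bin (in cycles per record), ignoring DC."""
    return int(np.argmax(np.abs(np.fft.rfft(sig))[1:]) + 1)

assert peak_cycles(fine) == 80     # 240 px / 3 px-period = 80 cycles
# After sub-sampling, the period-3 detail aliases: the 120 retained
# samples show 40 cycles, i.e. an APPARENT period of 240/40 = 6 pixels,
# a coarser, emergent pattern (a Moire-like effect).
assert peak_cycles(sub) == 40
```

The sub-sampled texture is not a low-resolution copy of the original: it is a different-looking texture, which is exactly why sub-sampled sub-textures matter.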

3. Transitions between structure and texture:

Images are often thought of as comprising shape and texture. Shape is also called structure. The boundary between them may be difficult to establish and is not necessarily implied by sampling limits (Nyquist), but rather by the PSF of the sensing/perceiving system, or by any other aperture size characterizing a particular task. A more general and realistic decomposition also includes background. Texture may or may not include noise elements, while shape may also include artifacts and shape variation. Many analysis approaches also decompose a signal f(t) into at least three raw components:

f(t) ≈ ftrend(t) + finterest(t) + fdetails(t)        (3)


The first, ftrend(t), is also known as the “background” and may be characterized by low frequencies (a relative notion). The second, finterest(t), is also known as the “foreground”; it may include middle and, optionally, high frequencies, and comprises structural or shape components of the signal/image (ftrend(t) may be considered to comprise some structure, too). The third component, fdetails(t), may constitute texture and noise and is characterized by high frequencies. An approximate way to separate the components is to pass the signal through a bank of three filters: a low-pass, a band-pass and a high-pass. In a real application the “information of interest” may possess high intensity or high spatial frequencies (for example, the sharp corners of a triangle, or of any binary shape) and the noise may have a broad frequency spectrum, so a clean separation into three components is more difficult. Models or knowledge of the nature of the noise (its spectrum or statistics), of the background, and of the foreground are then used to better accomplish such an artificial decomposition. Another question is where each “component” begins and where it ends. In one problem a texture may be taken as “noise” (unwanted information affecting the analysis of other features) and in another as the very subject of analysis. Artifacts are defined as objects that are confounded with foreground components, and other criteria must be applied to eliminate them. The transitions also depend on the maximum resolution and on the PSF of the acquisition/processing system.
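The three-filter-bank separation described above can be sketched in one dimension with box filters (a crude stand-in for properly designed low-, band- and high-pass filters); the window widths are illustrative:

```python
import numpy as np

def box_blur(f, w):
    # simple moving-average low-pass filter of width w
    return np.convolve(f, np.ones(w) / w, mode="same")

def decompose(f, w_small=5, w_large=31):
    """Three-way split via a crude filter bank: low-pass -> trend,
    band-pass -> 'information of interest', high-pass -> details/texture.
    By construction the three components sum back to the signal."""
    trend = box_blur(f, w_large)              # low frequencies
    interest = box_blur(f, w_small) - trend   # middle band
    details = f - box_blur(f, w_small)        # high frequencies
    return trend, interest, details

rng = np.random.default_rng(3)
t = np.linspace(0, 1, 500)
# a ramp (trend) + a slow oscillation (interest) + noise (details)
f = 2 * t + np.sin(2 * np.pi * 8 * t) + 0.1 * rng.normal(size=500)
trend, interest, details = decompose(f)
assert np.allclose(trend + interest + details, f)
```

The decomposition is exact by construction (the terms telescope); what is approximate is the claim that each component isolates trend, interest or texture, which holds only when the bands are well separated.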


Texture Classification Approaches (originally a diagram covering periodic, quasi-periodic and random textures):

Statistical:
• Surface analysis • Moments • ACF • Transforms • Edge-ness • Co-occurrence matrices • Texture transforms • PCA • Random field models • Variograms • HVS approach: features from banks of filters (Laws, Gabor, Hermite) and pyramids / scale-space • Spectral analysis • Fuzzy-spectral • Wavelet analysis • Edge density • Extrema density • Run lengths • Codewords, LBP

Structural:
Primitives (texels): • Gray levels and color • Shape • Homogeneity
Placement rules: • Period(s) • Adjacency • Closest distances • Orientation • Scaling • Distortion

Other approaches:
• Hybrid approaches • Mosaic models (Voronoi) • Grammatical models • Autoregressive models • Fractal dimensions • Stereological (sampling) • Porous material properties • Mathematical Morphology

Complex features:
• Flows • Gradients • Color textures • 3D and time-varying textures


[Figure: “The ‘Many’ Dimensions of Textural Features” – a diagram of feature axes: coherence; blobbiness (mosaic index); line-likeness; microstructure; coarseness / grain distribution; busyness; high/low frequencies; nearness of the transition to structure/shape; waviness; randomness / noise (Gaussian, white, etc.); repeating patterns / regularity; self-similarity / scale relations / hierarchies; directionality / orientation / rose of directions / anisotropy; roughness; contrast / intensity variation; color (hue, saturation, luminosity); elemental grain / texel; local correlations / co-occurrence.]


In the above figure, axes do not exactly reflect opposed features; many are considered by some authors as synonyms or special cases of other features, and labels in black refer to those that human vision clearly distinguishes, according to psychophysical experiments. At first sight, some features seem synonymous with others, but one can always find counter-examples where differences are evident; for example, a texture may present directionality with a low line-like character, being isotropic in one sense but not in another (that is, with respect to particular features), while another texture may present anisotropy at one scale while being highly isotropic at another scale. Most features may refer both to spatial and to intensity variations, while some features refer to only one of them.

4. Two (or three) Main Approaches.

Structural Approach – This is the point of view that we can identify (or that there exist) one or more texture primitives (also called tonal primitives), called texels for texture elements (or texons, or textons, for textural atoms, or structural elements). These consist of any pattern which repeats according to some rules:

Texture = texel(s) + positioning rules (from deterministic to loose, or random).

The latter include translation, rotation, scaling, distortion, superposition, blending or other texel-combination methods. Positioning may be periodic, quasi-periodic, anisotropic, or change in complex ways. As discrete sub-images, the texels may be thought of as any small set (array) of pixel values (binary, gray-level intensities or color values), such as the symbol “4”, a logo, or colored dots, as shown in Figure 1.

Note: Any pattern, such as a line or a set of lines or curves, a branch, a blob, a fading Point Spread Function, an image, etc., constitutes a potential texel. In old textbooks the texels are said to “repeat periodically”, which covers only a very small subset of the possible positioning rules, which include non-periodic ones.

Human written language can be viewed as a structural texture. Chinese, Arabic, Roman, Cyrillic, hieroglyphic, mathematical, graffiti or any other symbol-based texts are constituted by texels called characters or symbols, following precise, organized combination rules, called the syntax of the specific language. Some quasi-periodicities may appear, and are more evident in poetry or in tables describing properties of a list of items.
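A minimal structural synthesis in the spirit of “texture = texel + positioning rules”; the cross-shaped texel and the jittered-grid placement rule are invented for illustration:

```python
import numpy as np

def stamp_texture(texel, shape=(64, 64), period=16, jitter=0, seed=0):
    """Repeat a texel on a periodic grid, with optional random jitter of
    each placement. jitter=0 gives a deterministic (strong) texture;
    increasing jitter weakens it toward a random placement."""
    rng = np.random.default_rng(seed)
    canvas = np.zeros(shape)
    th, tw = texel.shape
    for i in range(0, shape[0] - th, period):
        for j in range(0, shape[1] - tw, period):
            di, dj = rng.integers(-jitter, jitter + 1, size=2) if jitter else (0, 0)
            r = np.clip(i + di, 0, shape[0] - th)
            c = np.clip(j + dj, 0, shape[1] - tw)
            canvas[r:r + th, c:c + tw] = texel   # 'over' combination
    return canvas

texel = np.array([[0, 1, 0],
                  [1, 1, 1],
                  [0, 1, 0]], dtype=float)       # a small cross-shaped texel
strong = stamp_texture(texel, jitter=0)
weak   = stamp_texture(texel, jitter=4)
assert strong.sum() == weak.sum()                # same texels, different placement
```

Both images contain exactly the same texels; only the positioning rule differs, which is the structural/hybrid distinction made in the next paragraphs.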

Spectral approach – the texel definition may be extended to include basis functions (sin, cos, exp, polynomials, splines, Bessel functions, etc.), and the texture is then characterized by frequency-domain properties in the case of harmonic basis functions. Note that this is equivalent to texels with non-finite spatial support, and to the particular positioning rule of superposition. Many authors consider that, by its importance and special properties, the Spectral Approach deserves mention as a third approach, since most of the analysis requires obtaining frequency-domain features. Consequently, several spectral domains should be considered, since there is a great number of important basis functions and transforms, for example the Walsh-Hadamard, Haar, Hankel, Radon and Hough transforms, as well as all the wavelet transforms.

Statistical Approach – This refers to the analysis of distributions and statistical properties in general. Unlike in the structural approach, no order or deterministic structure is assumed.


Textures may be treated as instantiations (or “realizations” and sometimes also “snapshots”) of random fields. Local stationarity is defined by local constancy of statistical properties (see below). We review Markov Random Fields applied to texture analysis in Tutorial 2.

Hybrid approach: ½ structural + ½ statistical (or other proportions): when texels may have a random nature or random distribution properties. If texels strictly follow deterministic placement rules, the produced texture is called a strong texture. For purely randomly placed texels, the associated texture is called a weak texture. An example of a strong, deterministic texture is an image (especially a high-resolution computer rendering) of a text in any human language, since placement rules are complex (syntax, orthography, logic) and there is a fixed number of texels called characters or symbols. A less strong texture may be hand-written language, which includes random and stylistic variations of spacing and texel shape. Even errors, loose syntax and bad orthography add a “texture”.

Figure 1. Texel examples, from simple and discrete to complex (from few to many pixels); positioning rules are not shown.


In fingerprints, a structural primitive is difficult to locate, as is a positioning rule, but there are clearly curved lines, which rarely cross or bifurcate; these present some coherence and well-defined average separations. Some specialists consider fingerprints rather as “patterns” (identified components, spatially related), and texture descriptors may miss information that shape analysis and morphometrics capture better. So we may also account for a morphometrical approach to texture analysis, in which a texture is treated as a pattern, provided its components (not exactly texels) can be identified/extracted. Specialists apply statistics at still higher levels, for example to study the distribution of shape features, or configuration properties. The pattern or morphometrical approach is described in Tutorial 2. Fingerprints and other textures (or patterns) introduce a seventh important feature, which can be regarded as “flow”: a changing directionality in an anisotropic texture.

Discrete texels: Better described as texel patterns (or texels represented by pixel patterns), they are the arrays of pixel intensities (or colors) that make up a texture primitive in practical computer images (each is a pattern, since individual pixels are identified). Simple examples are fonts of text characters (for vectorized graphics, there are rules to build continuous patterns, such as splines from a low number of control points; these are derived texels).

Sub-sampled structural textures are those where the primitive(s) are highly “noised”, blurred, or sub-sampled, and may or may not have random components. A human-cell metaphase, seen under the microscope, consists of a set of 46 chromosomes with well-defined shapes similar to “X”, “Y”, “H”, “K”; a low-resolution digitization will produce sub-sampled images of very few pixels, forming a characteristic texture, whose analysis may help to identify metaphases in a spread sample [Corkidi98].

Note that the statistical approach corresponds more to a decision – to choose that very approach in a texture analysis study – than to the nature of the texture: all approaches are based on models. The use of statistics reflects our lack of information about the underlying structures or laws, their excessive complexity or variability, or a very high cardinality of components. The use of structural modeling may be empirical, but it eventually requires deterministic information (how the texture has arisen from physical, mathematical or other causes). If we select the statistical approach to study a texture where texels and positioning rules exist, both the texels and the placement rules are secondary or ignored in such a statistical analysis.

Most textures turn out to fall between the structural and the statistical descriptions.

In the modern, so-called stereological approach (Tutorial 2), the analysis is about sampling the texture with stochastic-geometry probes (point or line processes, given texels, or any other sampling scheme). The analysis may be based on studying the resulting interactions between a texture and the stochastic-geometry probes. This kind of analysis may also be considered an example of “Monte Carlo” methods (the statistical component being part of the measurement tools).

Another sub-approach is the surface-intensity or 3D-approach, where the image gray levels I(x,y) are interpreted as a third dimension, height: z = I(x,y), and we analyze the 3D surface defined by all (x, y, z) coordinates. In this approach, shape and intensity are “married”, and the local normal vectors are sensitive descriptors of local roughness and other textural features. In Section 8, a parameter known as RED (Ratio of Extrema Density) is described, and further properties are given in Tutorial 2. The surface-intensity approach includes, in image processing, the Topographic Primal Sketch [Haralick92], where 3D-relief assessment is based on features like ridge and ravine lines, saddle points, flats, hills, hillsides, peaks and pits, and their properties.

Interactions (superposition) with the acquisition/display system: the PSF (all apertures), the discrete array of sensors, and pixelization. The PSF introduces a superposed structured texture, giving rise to phenomena similar to sub-sampled textures: aliasing and Moiré patterns.

Interactions with the Human Visual System (HVS): interaction with the “sensor texture” of cones and rods, and resolution limits. Tendency to group, to perceive clusters, domains or cells, lines, flux lines, networks and Gestalt configurations (to complete lines, to extrapolate figures, to subitize items, etc.). Integrated Fourier analysis and scale-space perception also extract high-level features.

Note: to subitize (Spanish: “conteo súbito” or “subitemizar”) is an implicit “counting” task of a cognitive system, in which a pattern of N objects is recognized (and the number N associated) without explicitly identifying and counting each object. The input to the recognition system is a pattern; the output is N, or a reaction or behavior related to N in some


way. Usually N < 7 in normal individuals, since the number M of possible configurations of simple items (identified as “dots”) in a planar perceptual space grows roughly as M ~ O(N!). Subliminal subitizing occurs when texture detail is well beyond the Nyquist sampling limits of the HVS. This may lead to an intuitive notion of what is not a texture: precisely that for which subitizing occurs!

Pointillistic paintings (Seurat, ca. 1880) have a local texture of colored dots. A question arises: where lies the transition to perceived patterns and structure (faces, trees, landscape)? The interaction with the HVS at different distances makes the PSF change, and causes adjustments of the bank of neural band-pass filters at the lateral geniculate nucleus, before transmission of visual information to the visual cortex. In simple words, there is a perceived transition: “these are dots, … now they become texture A, … texture B, …, this is now a colored region, … wait! now it is a shape, a human head, … and this is context, … this is art!” “To see the trees and not the wood” and “to see the wood and not the trees”: these opposed points of view refer to perceptual properties of textures, figures or context, and describe the same goal-oriented decomposition discussed in the section Transitions Between Structure and Texture.


5. Texture-related tasks:

- Parameter extraction and characterization or modeling.
- Texture segmentation: segmenting regions by texture, besides intensity.
- Texture classification: given N texture classes, decide where to classify a new texture.
- Identifying texture boundaries, assisted by models of regions.
- Texel modeling/identification: adjusting a structure-based model.
- Texture synthesis: mimicking a given texture, producing a texture from parameters or texture samples, or “growing” a textured region into a non-textured region. Certain (artificial) textures may carry encrypted or coded information, as text does.
- Separating a given textural component (texture filtering). A particular texture may have a single physical cause, but it may also be composed of several textures corresponding to many physical causes (thus explaining away a textural component). Physical phenomena may occur at different scales (e.g. erosion by rain, wind, dust, glaciers, human or animal action, microorganisms, mechanical interaction (wear), and chemicals).

Computer vision tasks and paradigms exist known as “shape from texture”, “depth or 3D orientation (perspective) from texture”, “depth from shading and texture”, “surface wear or damage from texture”, and others. In these tasks an assumption is made about the textural properties of an object or scene (shift-invariance, for example, with no texture gradients), and then a deviation


from this behavior is interpreted as the effect of some transformation that is then recovered (e.g. varying illumination, or orientation).

6. Textured Images as Noisy Two-Dimensional Signals.

An image I, or any sub-region A ⊆ I, considered as a two-dimensional signal, may be described as:

(a) deterministic, exhibiting periodic or quasi-periodic behavior (a few dominant frequencies concentrate most of the energy in Fourier space; there are shift-invariant properties);
(b) non-deterministic (random) but stationary (statistical properties are more or less constant);
(c) random and non-stationary (statistical properties are time- or space-varying);
(d) deterministic or non-deterministic with a transient structure (gradients, discontinuities), with wavelet analysis being a more useful framework than Fourier spatial-frequency analysis; an example is the PQRST complex in the cardiac signal; or
(e) chaotic: a special, complex, deterministic behavior which may be modeled by dynamical-system attractors in a phase space, and by fractal analysis. Periodicities may exist locally or with complex frequency modulation.


Figure 2. Different kinds of noisy signals.


The proper and most used frameworks to characterize, filter or synthesize each kind of noise are the following:

- For deterministic, periodic signals/images: (spatial) frequency-domain techniques, mainly Fourier analysis, power spectra and periodograms.
- For deterministic, quasi-periodic signals/images: wavelets or other space-frequency domain techniques, such as the short-time Fourier transform; variograms.
- For transients: wavelets or other space-frequency domain techniques; impulse/step response analysis and non-linear approaches.
- For stationary stochastic signals/images: statistical analysis, Auto-Regressive Moving Average (ARMA) models, stochastic-process frameworks and Principal Component Analysis.
- For non-stationary stochastic signals/images: Independent Component Analysis, Markov random field analysis and stochastic-process frameworks.
- For chaotic signals/images: dynamical complex-system analysis (linear and non-linear), analysis of attractors in phase space and fractal analysis, since these patterns have scale invariance. An example of a fractal feature that is meaningful in such a signal is the correlation dimension.


A highly complex texture may be a mixture of all six kinds of noise, texels being a case of deterministic, quasi-periodic components; but there may also be texels with highly random positioning rules, falling into the stationary or non-stationary stochastic cases.

7. Color Textures.

In a simplistic approach, a color image may be treated as three different B&W images, corresponding to the RGB components (red, green, blue color channels), to the three HSL channels (Hue-Saturation-Lightness representation), or to those of other color spaces. Thus a color texture may be analyzed channel by channel (within any color space) as three texture images, and textural features may change among color channels. That is, different textures may exist in different color-space channels, but the very nature of color perception makes this a far richer field of study. Inter-channel coupling may require ad-hoc tools to understand and characterize some color textures, especially from the point of view of human perception. Some textural parameters for color textures are described in Tutorial 2.

8. Simple Texture Parameters (Statistical Approach)

With a suite of edge-detection and grouping algorithms at our disposal, the problem of texture segmentation turns into a problem of texture classification. Given a method able to classify every element of an image as a certain type of texture, we can apply a general edge detector to divide the texture-classified areas into the required components. In


Section 3 we outlined a series of six different characteristics that define a texture. We will shortly explore the first five of these in a mathematical sense, and then investigate in Tutorial 2 the use of fractal dimensions in a texture classifier. The sixth characteristic, roughness, is better suited to psychological tests and has been mathematically defined as the sum of coarseness and contrast. We have shown above how grey-scale images can use their intensity values as a simple form of texture measure. If each pixel in an image can be classified with reference to its neighbors and given a numeric value, this value can be substituted for the image's grey-scale value at that point, and all the above techniques can be applied. We now present a short aside showing some simple statistical measures that can be used.

Rank Operators

A rank operator works on ranked values of intensity in a small region. Once ranked, we may extract parameters from the maximum, minimum and median values, as well as percentile intervals (the first quartile spans from the minimum to the 25th percentile; the interquartile range spans from the 25th to the 75th percentile). One of the simplest texture operators for detecting different textures, like those present in microscope images of histological samples, is the range, the difference between the minimum and maximum brightness values in the neighborhood of each pixel. This produces a range image, whose gray levels are the range values. A suitable threshold produces a mask to


separate textures in the original image. The size of the neighborhood must be large enough to include small uniform details (note the similarity to the Nyquist sampling criterion). For a uniform, smooth region the range is small or zero; a surface with larger roughness gives larger values of the range. Another operator producing similar results on the same images is the statistical variance, calculated over a moving neighborhood; it is described below.

Statistical Moments

The first statistical classifiers of this kind are the simple moments. Given an image I, we can calculate its local mean over a region or window W, centered at the image position (i, j), as

\mu_{i,j} = \frac{1}{N_W} \sum_{(k,l)\in W} I_{i-k,\,j-l}   (4)

Alternatively, the mode or the median can be used instead of the mean. The size of the averaging region W defines which details are not texture (being larger, they will not be smoothed out by this low-pass filtering). Thus the textural details will be the residual information, either after subtracting \mu_{i,j} or when considering the variance, below. Robust weighted averages may also be defined, including outlier rejection or outlier penalization. The local variance \sigma^2_{i,j} (with a window neighborhood centered at pixel (i, j)) is a measure that characterizes how the values deviate from the mean value,


\sigma^2_{i,j} = \frac{1}{N_W} \sum_{(k,l)\in W} \left( I_{i-k,\,j-l} - \mu_{i,j} \right)^2   (5)
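As a concrete sketch, the local range of the rank-operator section and the moving-window mean and variance of eqs. (4) and (5) can be computed with standard separable filters. This is only an illustrative implementation; the function name `local_stats` and the use of `scipy.ndimage` are my own choices, not from the tutorial:

```python
import numpy as np
from scipy import ndimage

def local_stats(img, size=7):
    """Moving-window mean (eq. 4), variance (eq. 5) and range over a
    size x size neighborhood W centered at each pixel."""
    img = img.astype(np.float64)
    mean = ndimage.uniform_filter(img, size=size)            # mu_{i,j}
    mean_sq = ndimage.uniform_filter(img * img, size=size)   # <I^2> over W
    var = np.maximum(mean_sq - mean * mean, 0.0)             # sigma^2 = <I^2> - <I>^2
    rng = (ndimage.maximum_filter(img, size=size)
           - ndimage.minimum_filter(img, size=size))         # rank-operator range
    return mean, var, rng
```

Thresholding the `rng` image then yields the texture-separating mask discussed above.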

The local coefficient of variation (over a window W), CV_{i,j} = \sigma_{i,j} / \mu_{i,j}, is dimensionless and scale-invariant. A CV_W image can also be built. Note that, in signal processing, the measure \mu/\sigma is the signal-to-noise ratio SNR. The images \sigma^2_W and CV_W may already provide a “map” of textural regions, if those regions have different local variance. The local variance is in fact the simplest textural feature, and it depends on the size of the window W. A whole family of local moments can be calculated and used:

\mu^{(n)}_{i,j} = \frac{1}{N_W} \sum_{(k,l)\in W} \left( I_{i-k,\,j-l} - \mu_{i,j} \right)^n   (6)

The second central moment (n = 2) is the variance; its square root is the standard deviation \sigma. Two other popular statistics are the skewness,

s_{i,j} = \frac{1}{(N_W - 1)\,\sigma^3_{i,j}} \sum_{(k,l)\in W} \left( I_{i-k,\,j-l} - \mu_{i,j} \right)^3   (7)

and the kurtosis,


\kappa_{i,j} = \frac{\mu^{(4)}_{i,j}}{\sigma^4_{i,j}}   (8)

Skewness (“biais” in French, “Schiefe” in German, “sesgo” in Spanish) is a measure of the shape bias (asymmetry) of the distribution of values {I_{i-k,j-l}}. The fourth moment \mu^{(4)}_{i,j} is also called flatness or relative monotony. The derived feature, the kurtosis (eq. 8), measures the concentration (sharpness) of the shape of the distribution: a large value corresponds to endo-peak shapes (sharp, thinner than the Gaussian, leptokurtic), a medium value to meso-peak shapes (similar to the Gaussian in width, mesokurtic), and a small value to ecto-peak shapes (flatter and wider than the Gaussian, platykurtic). Higher-order moments carry information without a simple interpretation in terms of the shape of the distribution. Note that once we define a local-average image \mu_W, which constitutes a “background”, we may define a second image by background removal (pixel by pixel): I_details(i,j) = I_{i,j} - \alpha\,\mu_{i,j}, with \alpha \le 1 a parameter with typical values 1.0 or 0.5 (the degree of background removal). I_details enhances textural features under half the size of the averaging window. Several local coefficients of variation of order k (over a window W) can be obtained from the moments and central moments of order k (equation (5)); they are called standardized moments:
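Local skewness and kurtosis maps (eqs. (7) and (8)) can be sketched by re-filtering the deviations from the local mean; taking the deviation from each pixel's own window mean is an approximation of the per-window central moments, and all names here are illustrative:

```python
import numpy as np
from scipy import ndimage

def local_skew_kurtosis(img, size=7, eps=1e-12):
    """Approximate local skewness (eq. 7) and kurtosis (eq. 8) maps.
    Deviations are taken from the local mean at each pixel, a common
    approximation of the exact per-window central moments."""
    img = img.astype(np.float64)
    mu = ndimage.uniform_filter(img, size=size)
    d = img - mu
    m2 = ndimage.uniform_filter(d * d, size=size)    # local 2nd central moment
    m3 = ndimage.uniform_filter(d ** 3, size=size)   # local 3rd central moment
    m4 = ndimage.uniform_filter(d ** 4, size=size)   # local 4th central moment
    sigma = np.sqrt(np.maximum(m2, 0.0))
    skew = m3 / (sigma ** 3 + eps)                   # mu_3 / sigma^3
    kurt = m4 / (m2 * m2 + eps)                      # mu_4 / sigma^4
    return skew, kurt
```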


CV^{(k)}_{i,j} = \frac{\mu^{(k)}_{i,j}}{\sigma^k_{i,j}}

A related measure, calculated from the image derivatives, is the total variation, TV. It consists of the sum of gradient magnitudes throughout the window W. Any of the different gradient operators, G, can be used, with Rudin and Osher expressing a preference for:

TV_{i,j} = \sum_{(k,l)\in W} G(u)_{i-k,\,j-l}   (9)

where G = \|\nabla I\| = \sqrt{D_x D_x + D_y D_y} is the image gradient magnitude, with D_x = \partial I(x,y)/\partial x and D_y = \partial I(x,y)/\partial y the horizontal and vertical derivatives.

Signed Variances

There are a number of variance-like descriptors in which the sign of the difference between the pixel value and the local mean is taken into account. This allows building a parametric image whose pixel values indicate the signed difference, using a color scale for negative and positive values. Such descriptors are the signed variance and the signed standard deviation (compare equation (5)):


\mathrm{varsgn}_{i,j} = \frac{1}{\mathrm{card}\,W} \sum_{(k,l)\in W} \mathrm{sgn}\!\left( I_{i-k,\,j-l} - \mu_{i,j} \right) \left( I_{i-k,\,j-l} - \mu_{i,j} \right)^2   (10a)

\sigma^{\mathrm{sgn}}_{i,j} = \mathrm{sgn}\!\left( \mathrm{varsgn}_{i,j} \right) \sqrt{\left| \mathrm{varsgn}_{i,j} \right|}   (10b)

where sgn(x) = +1 if x \ge 0 and -1 if x < 0.
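A minimal sketch of the signed variance of eq. (10a), under the same local-mean approximation used earlier; the name `signed_variance` is mine:

```python
import numpy as np
from scipy import ndimage

def signed_variance(img, size=7):
    """Signed variance (eq. 10a): squared deviations from the local mean,
    each keeping the sign of (I - mu), averaged over the window.
    np.sign(0) = 0 rather than +1, which is harmless since that term is 0."""
    img = img.astype(np.float64)
    mu = ndimage.uniform_filter(img, size=size)
    d = img - mu
    return ndimage.uniform_filter(np.sign(d) * d * d, size=size)
```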

Sobel and Kirsch operators for enhancement of oriented edges

In its continuous version, G constitutes the gradient magnitude underlying the Sobel operator, and is also written as:

G = \|\nabla I\| = \sqrt{ D_x^2 + D_y^2 } = \sqrt{ \left( \frac{\partial I}{\partial x} \right)^2 + \left( \frac{\partial I}{\partial y} \right)^2 }   (10c)

In their discrete version, these correspond to directional convolution kernels: [-1, 0], [-1, 0]^T, [1, 0], [1, 0]^T, or [0, +1], [0, +1]^T, [0, -1], [0, -1]^T, or [-1, 0, 1], etc. These come either from the finite backward-difference approximation:


D_x = \partial I(x,y)/\partial x \approx \Delta I(x,y)/\Delta x = I(x,y) - I(x-1, y), since at the pixel level we set \Delta x = \Delta y = 1. We may also define the forward finite difference \Delta I(x,y)/\Delta x \approx I(x+1, y) - I(x, y), and the average of both, as well as diagonal mixtures (such as the one involving (x+1, y-1)), all giving rise to the well-known Prewitt and Roberts operators for edge detection. A direction value can also be calculated for each pixel, in addition to the magnitude of the Sobel operator:

\text{Direction angle } \theta = \arctan\left( \frac{\partial I / \partial y}{\partial I / \partial x} \right)   (11)

A combined image of magnitude and direction information from the Sobel gradient operator can be obtained by coding direction as hue and magnitude as gray-level intensity. A common approximation of the discrete Sobel gradient operator is exemplified by the left-to-right and top-down 3×3 edge-detection kernels, where the kernel origin is the central element:

\begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix}   (12)
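The kernels of eq. (12), together with eq. (11), yield a magnitude-and-direction pair per pixel. A sketch using scipy's built-in Sobel filter (names are illustrative):

```python
import numpy as np
from scipy import ndimage

def sobel_mag_dir(img):
    """Gradient magnitude and direction angle (eqs. 11-12) via Sobel kernels."""
    img = img.astype(np.float64)
    dx = ndimage.sobel(img, axis=1)   # response of the left-to-right kernel
    dy = ndimage.sobel(img, axis=0)   # response of the top-down kernel
    mag = np.hypot(dx, dy)            # sqrt(Dx^2 + Dy^2)
    theta = np.arctan2(dy, dx)        # direction angle, well defined when Dx = 0
    return mag, theta
```

Coding `theta` as hue and `mag` as intensity gives the combined image described above.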


The Kirsch operator is another practical way to avoid the square root in G, by applying each of the eight orientations of the derivative kernel and keeping the maximum response. Both the Sobel and the Kirsch operators allow extracting orientation information and forming the orientation components of a texture. Other edge detectors with optimal properties exist, such as the Canny-Deriche edge filter.

The Spatial Autocorrelation Function (ACF)

In the hybrid approach (half structural, half statistical), the spatial size of the texels can be approximated by the width of the ACF in the space domain, defined as:

r(k, l) = \frac{m_2(k, l)}{m_2(0, 0)}, \quad\text{where}\quad m_2(k, l) = \frac{1}{N_W} \sum_{(m,n)\in W} I(m, n)\, I(m+k,\, n+l)   (13)

and N_W is the number of pixels in the moving window W. When the texels are more or less contrasted against the background, they behave like grains of the texture, and the coarseness of the texture may be expected to be proportional to the width of the spatial ACF. The latter can be represented by distances x_0, y_0 such that the ACF satisfies


r(x_0, 0) = r(0, y_0) = ½. Different measures of the spread of the ACF can be obtained from the moment-generating function:

M(k, l) = \sum_{m} \sum_{n} (m - \eta_1)^k \, (n - \eta_2)^l \, r(m, n)   (14)

where (\eta_1, \eta_2) is the centroid of r.

9. Statistical Moments from the Grey-Level Histograms.

A statistical approach very similar to first-order statistics of grey levels taken directly from an image is the use of statistical moments of the grey-level histogram of an image or region. The histogram refers to (1) the graphic plot of grey-level intensities against their incidence (frequency of occurrence of each particular intensity) and (2) an approximation of the probability density function (probability distribution) or, in practice, the discrete probabilities p(u_i) of the grey levels u_i, with i = 0, …, L-1 (usually L = 256), in an image or region. We rewrite the above definitions, referring to the histogram p(u_i):

\mu_n(u) = \sum_{i=0}^{L-1} \left( u_i - \langle u \rangle \right)^n p(u_i)   (15)

where < u > is the statistical mean value of u (the statistical average gray level or expected value):


\langle u \rangle = \sum_{i=0}^{L-1} u_i \, p(u_i)   (16)

Note the weighting probability (histogram): the mean is not taken over pixel observations, as in a sample average, but over all possible intensities u_0, u_1, …, u_{L-1}, according to their frequencies p(u_i). These intensities may be the mid-values of bin intervals (the classes of the histogram), quantized into a small number of bins. Since histograms lose all domain information (the domain here being the location of the attribute, i.e. the intensity values), a global histogram may be exactly the same regardless of shape and texture, provided the distributions are identical. In binary images with the same proportions of black and white pixels (say, half and half), the histogram consists of two equal peaks. The figure shows examples of images with the same gray-level distributions.

(Panels: block pattern, checkerboard, diagonal striped pattern, a shape)


Figure 3. Four different textures with the same global distribution (histogram) of black and white. The first and the last tend to be seen as non-textural, and the term “shape” is then applied.

As defined, we note that the moments of order 0 and 1 are \mu_0 = 1 and \mu_1 = 0. The second-order moment is the variance \sigma^2(u) = \mu_2(u):

\sigma^2(u) = \mu_2(u) = \sum_{i=0}^{L-1} \left( u_i - \langle u \rangle \right)^2 p(u_i)   (17)

and it is especially important as a textural descriptor, since it is a measure of gray-level contrast that can be used to establish descriptors of relative smoothness. Histogram features are extracted from the histogram of an image or of a region of an image. In the following it is required that \sum_k p(u_k) = 1; otherwise a normalization constant must be introduced (e.g., the total number of pixels in the ROI).
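Eqs. (16)-(17), together with the common energy and entropy features listed next, can be computed directly from a normalized histogram. A sketch with illustrative names:

```python
import numpy as np

def histogram_features(img, levels=256):
    """First-order features from the grey-level histogram p(u_k)."""
    p, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = p / p.sum()                          # enforce sum_k p(u_k) = 1
    u = np.arange(levels, dtype=np.float64)  # intensities u_0 ... u_{L-1}
    mean = np.sum(u * p)                     # eq. (16)
    var = np.sum((u - mean) ** 2 * p)        # eq. (17)
    energy = np.sum(p * p)                   # uniformity / energy
    nz = p > 0
    entropy = -np.sum(p[nz] * np.log2(p[nz]))  # entropy in bits
    return {"mean": mean, "variance": var, "energy": energy, "entropy": entropy}
```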

Histogram features (histogram {p(u_k)}), with p(u_k) = \mathrm{Prob}[u = u_k] and u_k \in \{u_0, u_1, \ldots, u_{L-1}\}:

(1) Maximum probability: p_{\max} = \max_k \{ p(u_k) \}; Mode (most frequent intensity): u_{\max} = \arg\max_k \{ p(u_k) \}; Median (sort p, choose the middle): u_{\mathrm{med}} = \mathrm{median}\{ p(u_k) \}


(2) First moment (centroid), the statistical average gray level: m_1 = E[u] = \sum_{k=0}^{L-1} u_k \, p(u_k)

(3) Moments of order n (order 2: contrast): m_n = E[u^n] = \sum_{k=0}^{L-1} u_k^n \, p(u_k)

(4) Absolute moments of order n: \hat m_n = E[\,|u|^n\,] = \sum_{k=0}^{L-1} |u_k|^n \, p(u_k)

(5) Central moments of order n, with \mu_0 = m_0 = 1 and \mu_1 = 0: \mu_n = E[(u - E[u])^n] = \sum_{k=0}^{L-1} (u_k - m_1)^n \, p(u_k)

(6) Variance = central moment of order 2: \sigma^2 = \mu_2 = \sum_{k=0}^{L-1} (u_k - m_1)^2 \, p(u_k)

(7) Absolute central moments: \hat\mu_n = E[\,|u - E[u]|^n\,] = \sum_{k=0}^{L-1} |u_k - m_1|^n \, p(u_k)

(8) Inverse moment of order n: m_{-n} = E[u^{-n}] = \sum_{k=0}^{L-1} \frac{p(u_k)}{u_k^n}, \quad u_k \neq 0

(9) Uniformity or Energy: E = E[p] = \sum_{k=0}^{L-1} p(u_k)^2

(10) Entropy (in bits): H = E[-\log_2 p] = -\sum_{k=0}^{L-1} p(u_k) \log_2 p(u_k)

(11) Rényi entropy of order \alpha: H_\alpha = \frac{1}{1-\alpha} \log_2 \sum_{k=0}^{L-1} p(u_k)^\alpha


Note that in the above the domain does not appear; the equations refer to histograms of a set of samples, a signal, an image, or any nD domain with discrete attribute u_k. The most common histogram features have names:

Summary of the most common histogram features:

Mean (statistical average gray level): m_1
Median: u_{\mathrm{med}}
Mode: u_{\max}
Dispersion (standard deviation): \sigma
Variance: \sigma^2 = \mu_2
Contrast (mean square value, average energy): m_2
Skewness: \mu_3 / \sigma^3
Kurtosis: \hat\mu_4 / \sigma^4

There may be histograms of the value distribution of any feature, other than gray levels, such as the local variance, coherence, shapes, etc. Vector features give rise to multi-dimensional histograms (see the tutorial on Histograms and Section 11 on Co-occurrence Matrices).


10. Texture Measures Based on Human Perceived Features

We now return to the six basic features of textures, based on psychophysical studies of the HVS, and mathematically define the main five texture classifiers from Tamura that were mentioned in Section 1.

1. Roughness

This is one of the simplest and most employed measures of local variation. It was already introduced as a statistical moment, with a different notation. The equivalent of the RMS roughness of surface physics models image intensities as heights in the surface-intensity representation S = { (i, j, h_{i,j} = I_{i,j}) }. We first obtain a local measure of average height (the first moment of the image intensity distribution) and then the second moment (the standard deviation):

\bar h_{i,j} = \langle h \rangle_W = \frac{1}{\mathrm{card}\,W} \sum_{(k,l)\in W} h_{i-k,\,j-l}   (18)

where the cardinality of the window W is its number of pixels. Note the correspondence with equations (4) and (5) of Section 8, and how interpretation and notation change. Typical sizes of W should be at least twice the size of the details to be considered as texture.


\sigma_{W,\mathrm{RMS}}(i,j) = \sqrt{ \frac{1}{\mathrm{card}\,W} \sum_{(k,l)\in W} \left( h_{i-k,\,j-l} - \bar h_{i,j} \right)^2 }   (19)

Local measures can be made “robust” by discarding from the window those pixels/values that do not pertain to the texture (noise, artifacts, edges), and by weighting heights (intensities) by other image properties (see below). Note that \sigma_{\mathrm{RMS}}(i,j) is an image in which black indicates un-textured areas and white indicates very busy, noisy or rough areas. An elementary “texture segmentation” can thus be performed on textures where roughness is the predominant feature. Several windows may be tested, to obtain a pyramidal or scale-space set of features. The Ratio of Extrema Density (RED) is a normalized local measure of roughness in 1D (profiles), introduced by Haralick et al. In RED, shape properties (the width of the peaks) are combined with intensity: the local extrema in a very small window W are accumulated over a larger ROI:

\mathrm{RED}_{\mathrm{ROI},W} = \frac{1}{\mathrm{card\,ROI}} \sum_{(k,l)\in \mathrm{ROI}} \frac{ \displaystyle \max_{(k,l)\in W} \left( h_{i-k,\,j-l} \right) - \min_{(k,l)\in W} \left( h_{i-k,\,j-l} \right) }{ \displaystyle \left\| \arg\max_{(k,l)\in W} h_{i-k,\,j-l} \; - \; \arg\min_{(k,l)\in W} h_{i-k,\,j-l} \right\| }   (20)

Since it only measures one-dimensional profiles of 2D images, it is an example of a stereological parameter (see Tutorial 2), as well as an example of the surface approach, in which profiles of reliefs are studied. Our group [Corkidi et al., 1998] introduced a slight variation of RED, by weighting/penalizing the width terms (numerators) and by sampling


extrema along 1D profiles, under a stereological-approach justification. The new textural parameter is called MRWH, the mean ratio of widths (\xi, \zeta) to heights (\|\xi\|, \|\zeta\|, or depths):

\mathrm{MRWH}_{\mathrm{ROI},W} = \frac{1}{\mathrm{card\,ROI}} \sum_{i \in \mathrm{rows(ROI)}} w_i \, \frac{\xi_i + \zeta_i}{2 \left( \|\xi_i\| + \|\zeta_i\| \right)}   (21)

with

w_i = \begin{cases} 1 & \text{if } \dfrac{\min(\xi_i, \zeta_i)}{\max(\xi_i, \zeta_i)} \ge \alpha \text{ and } \xi_i + \zeta_i \le 2\beta \\[4pt] 0 & \text{otherwise} \end{cases}

where \alpha and \beta are parameters to be adjusted according to the kind of peak to be accepted; we used \alpha = 0.6, independent of scale, while \beta depends on the average gap between the peaks of interest.



Figure 4. Plot (gray level versus distance in pixels) showing the relationships among the profile features used to define the mean ratio of width and height.

We may extract several peak-related features. Figure 5 shows that three peaks with the same width-to-height ratio may have different values of another peak width, the Full Width at Half Maximum (FWHM). The mean and variance of such characteristic widths complement the RMS roughness (equation (19)), RED and MRWH parameters for roughness in a local window W. It is easy to verify that the FWHM of a Gaussian function of standard deviation \sigma is 2\sigma\sqrt{2\ln 2}.
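The FWHM identity can be checked directly by setting the Gaussian to half its peak value:

```latex
e^{-x^2 / (2\sigma^2)} = \tfrac{1}{2}
\;\Longrightarrow\;
\frac{x^2}{2\sigma^2} = \ln 2
\;\Longrightarrow\;
x = \sigma\sqrt{2\ln 2},
\qquad
\mathrm{FWHM} = 2x = 2\sigma\sqrt{2\ln 2} \approx 2.355\,\sigma .
```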


\langle \mathrm{FWHM} \rangle_W = \frac{1}{\mathrm{card}\,W} \sum_{k\in W} \mathrm{FWHM}_{i-k}, \qquad \sigma^2_{\mathrm{FWHM},W} = \frac{1}{\mathrm{card}\,W} \sum_{k\in W} \left( \mathrm{FWHM}_{i-k} - \langle \mathrm{FWHM} \rangle_W \right)^2   (22)

Figure 5. Three peaks (A, B, C) with the same width-to-height ratios, but different values of FWHM (FWHM_A, FWHM_B, FWHM_C). The \rho axis depends on the profile orientation across the image domain, and represents any line \rho: ax + by + c = 0, with x, y the image-domain coordinates. If \theta is the line angle with respect to the x axis, then the peak parameters derived from the FWHMs and from the width-to-height ratios depend on \theta.


Figure 5 shows that the FWHM width is not necessarily correlated with the peak width-to-height ratio, and both parameters may be combined to characterize certain profiles. We may still have two peaks with the same values of FWHM and width-to-height ratio, but with a different absolute height over the baseline. We may also consider, as peak descriptors, the ratios A_peak/A_tri, where A_peak is the area under each peak (see Figure 6) and A_tri is the area of its fitting triangle, defined by the widths \xi, \zeta and the corresponding heights.

Figure 6. The shaded area A_peak and the area A_tri of the triangle with vertices min_1, max_1 and min_2 give rise to new peak descriptors, using for example the ratio A_peak/A_tri.



Another approach to peak-and-trough representation includes profile components of constant height, forming trapezoids, as shown in Figure 7. Profile descriptors become more complex, and the underlying structure constitutes a syntax-based approach. Particular configurations and sequences of profile components may be thought of as “signatures” of specific profile features.

Figure 7. An intensity profile may be modeled by a mixture of triangular and trapezoidal components, constituting a syntax-based approach.

Multi-Resolution and Multi-Scale Analysis

The choice of local minima and maxima in the above analysis depends on tolerance specifications, that is, on the level of detail of interest. A way to generalize and to deal with several levels of detail is multi-resolution. The roughness and peak parameters (and in


general, any textural parameter) are obtained from the same image at several reduced resolutions, either by sub-sampling pixels or by averaging a local window and creating a new image whose pixel values are the average (or, in general, a weighted average) of the pixel neighbors in that window. A common weighting function is the isotropic 2D Gaussian, whose sigma parameter defines a scale dimension. An image sequence of convolutions I_\sigma(x,y) of the original I(x,y), with increasing values of \sigma, constitutes a Gaussian scale-space:

I_\sigma(x, y) = I(x, y) * \frac{1}{2\pi\sigma^2} \, e^{-(x^2 + y^2)/(2\sigma^2)}   (23)
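Equation (23) amounts to repeated Gaussian smoothing with growing sigma; a minimal sketch (`gaussian_scale_space` is an illustrative name):

```python
import numpy as np
from scipy import ndimage

def gaussian_scale_space(img, sigmas=(1.0, 2.0, 4.0, 8.0)):
    """Sequence of convolutions I_sigma(x, y) with increasing sigma (eq. 23)."""
    img = img.astype(np.float64)
    return [ndimage.gaussian_filter(img, sigma=s) for s in sigmas]
```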

Re-sampling the above smoothed image at lower resolutions (other, non-Gaussian kernels may also be used) gives new images in which each pixel is representative of the gray values in its neighborhood, characterized by \sigma or by the FWHM of the convolution kernel. A parameter such as MRWH would then depend on the local window W, the profile orientation \theta (for anisotropic textures), the weighting criteria, and the scale of analysis. In the following paragraphs, the choice of feature parameters corresponds to physiological perception experiments.

2. Coarseness, coarse versus fine

Coarseness is a measure of the scale of micro-texture within the image. A simple procedure, presented by Rosenfeld, detects the largest size at which repetitive patterns are present.


Take averages at every point in the image at different scales, k, which are a power of 2.

This defines the arrays

A_k(x, y) = \sum_{i = x - 2^{k-1}}^{x + 2^{k-1} - 1} \; \sum_{j = y - 2^{k-1}}^{y + 2^{k-1} - 1} g(i, j) \, / \, 2^{2k}   (22)

Calculate the differences between neighboring averages which are non-overlapping. This is calculated in both horizontal and vertical directions.

E_{h,k}(x, y) = \left| A_k(x + 2^{k-1},\, y) - A_k(x - 2^{k-1},\, y) \right|   (23)

E_{v,k}(x, y) = \left| A_k(x,\, y + 2^{k-1}) - A_k(x,\, y - 2^{k-1}) \right|   (24)

Then pick the largest size which gives the maximum value,

S(x, y) = 2^{\hat k}, \quad \text{where } E_{\hat k} = \max\left\{ E_{h,1}, E_{v,1}, E_{h,2}, E_{v,2}, \ldots, E_{h,n}, E_{v,n} \right\}   (25)


Finally, we take the average as the coarseness measure, F_crs = \langle S(x, y) \rangle. Note that S also gives the scale at which the variation attains its maximum.
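The steps above can be sketched as follows; `np.roll` stands in for the shifted non-overlapping averages, with wrap-around at the borders (an assumption of this sketch, not of the original procedure):

```python
import numpy as np
from scipy import ndimage

def coarseness(img, kmax=4):
    """Rosenfeld-style coarseness F_crs (eqs. 22-25): at each pixel pick the
    scale 2^k whose horizontal/vertical average differences are largest,
    then average the best scale over the image."""
    img = img.astype(np.float64)
    E = []
    for k in range(1, kmax + 1):
        A = ndimage.uniform_filter(img, size=2 ** k)      # A_k(x, y)
        half = 2 ** (k - 1)
        Eh = np.abs(np.roll(A, -half, axis=1) - np.roll(A, half, axis=1))
        Ev = np.abs(np.roll(A, -half, axis=0) - np.roll(A, half, axis=0))
        E.append(np.maximum(Eh, Ev))                      # best of E_h,k, E_v,k
    E = np.stack(E)                                       # shape (kmax, H, W)
    S = 2.0 ** (np.argmax(E, axis=0) + 1)                 # S(x, y) = 2^k_best
    return S.mean()                                       # F_crs = <S(x, y)>
```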

Coarseness, like other textural features, can be quantified (mathematically characterized) by other parameter definitions.

3. Contrast, high versus low

Two of the four factors presented previously have been proposed as influencing the contrast within an image:

1. The dynamic range of grey levels, specified most commonly by the (local/regional/global) standard deviation \sigma.
2. The polarization of the distribution of black and white on the grey-level histogram. The kurtosis \alpha_4 = \mu_4 / \mu_2^2 is used, where \mu_4 is the fourth moment about the mean and \mu_2 is the variance. (Skewness is defined analogously in terms of \mu_3.)

Combining these two factors, we have a measure of contrast, F_con = \sigma / (\alpha_4)^n. A value of n = 1/4 has been recommended after comparison with psychophysical experiments.

4. Directionality, directional versus non-directional


The values defined above for Θ(∇G) can be quantized into a histogram H_d. The position and number of peaks can be used to extract the type of directionality. To quantify this feature, the sharpness (or variance) of the n_p peaks, positioned at angles φ_p, can be calculated. Defining w_p as the range of angles between the two valleys around peak p, we have, without normalization:

F_dir = Σ_{p=1}^{n_p} Σ_{φ ∈ w_p} (φ − φ_p)^2 H_d(φ)    (26)

Another measure of local directionality, orientation and coherence is the local tensor of inertia I(x,y) (a feature not corresponding to physiological perception models), defined on a window W as an image whose attribute is a tensor:

I = [ I_xx  I_xy ;  I_yx  I_yy ]    (27)

where the 2nd normalized moments are defined as:

I_xx = Σ_{(x,y) ∈ W} (y − y_C)^2 I(x, y) / Σ_{(x,y) ∈ W} I(x, y)    (28a)


I_yy = Σ_{(x,y) ∈ W} (x − x_C)^2 I(x, y) / Σ_{(x,y) ∈ W} I(x, y)    (28b)

I_xy = I_yx = Σ_{(x,y) ∈ W} (x − x_C)(y − y_C) I(x, y) / Σ_{(x,y) ∈ W} I(x, y)    (28c)

where the local centroids are obtained from:

x_C = Σ_{(x,y) ∈ W} x I(x, y) / Σ_{(x,y) ∈ W} I(x, y),   y_C = Σ_{(x,y) ∈ W} y I(x, y) / Σ_{(x,y) ∈ W} I(x, y)

Note: some authors do not normalize the moments, but we cannot assume that Σ_{(x,y) ∈ W} I(x, y) = 1 for any window W. A second normalization, from dimensional analysis, may be needed for a correct physical interpretation.

Since the inertia matrix is diagonal when computed with respect to the principal axes (in the eigenvalue system of coordinates), the centroid and the principal axes completely describe the local orientation of an arbitrary neighborhood W of point (x, y). If we obtain the eigenvalues λ_a, λ_b, ordered by magnitude (λ_a ≥ λ_b), and the corresponding eigenvectors v_a, v_b, we may define the:


Local directionality (eccentricity): Y = λ_a / λ_b, where λ_b > 0,    (29)

and the

Local anisotropy (local orientation θ):

θ = arctan( proj_y v_a / proj_x v_a )    (30)

Instead of calculating θ(x, y) on a window of an image I(x, y), several authors obtain it from the gradient ∇I(x, y).

5. Line-likeness, line-like versus blob-like

An adaptive tensor of inertia may provide a line-likeness measure: the eigenvalues for a circular neighborhood or window are used to build a new, elliptic, oriented window, on which a second tensor of inertia is calculated; its eccentricity may increase, making a line-like structure more evident. Other methods allow for direct line-like or blob-like assessment. Having calculated ∇G and Θ(∇G) for all locations, a definition of line-likeness considers how probable it is that the direction at a specific point is similar to the direction at a certain distance away. A matrix P_d(i, j) is defined as the relative frequency with which two points on an edge, separated by a distance d, have direction codes i and j. A measure of line-likeness follows from:


F_lin = Σ_i^n Σ_j^n P_d(i, j) cos( (i − j) 2π/n ) / Σ_i^n Σ_j^n P_d(i, j)    (31)

6. Regularity: regular versus irregular

Four different features have been described so far, and regularity can be thought of in terms of how much these features change over an image. The standard deviation of each of these four features (excluding roughness, which was not considered in physiological models of texture perception) can be used, so

F_reg = 1 − r ( σ_crs + σ_con + σ_dir + σ_lin )    (32)

In general, outside the context of physiological perception studies, similar measures have to be normalized in some way, and the normalization also depends on excluding or including other variation terms:

F_reg = 1 − (1/N) Σ_{n=1}^{N} σ_{feature n}    (33)

There may be other regularity measures based on the coefficient of variation of a given parameter Θ, global or local, such as

CV_Θ = σ_Θ / μ_Θ    (34)


Entropy

A commonly used indicator of the information content of any set of data is Shannon's formula for entropy. It is worth pointing out that Shannon's entropy was originally defined for arbitrary information streams, and that it also serves as a measure of disorder, since pure random noise, being completely disordered, should not contain any information. We consider it here in terms of image data.

H_1 = − Σ_{n=0}^{N−1} p_n log_2 p_n    (35)

This gives the order-1 entropy¹, where p_n is the probability of pixel value n, in the range 0…N−1, occurring. This formula results in a single number that gives the minimum average code length if every pixel is encoded independently of the other pixels. Entropy is, in effect, the information content of the image.

Order-m entropy can also be calculated by extending the formula. Defining P_m(x_1, x_2, …, x_m) as the probability of seeing the sequence of m pixels x_1, x_2, …, x_m, we have

H_m = (1/m) Σ_{x_1, x_2, …, x_m} P_m(x_1, …, x_m) log_2 ( 1 / P_m(x_1, …, x_m) )    (36)

And defining ⟨·⟩ as the expected value, the full entropy of the data is given by

H = lim_{m→∞} (1/m) ⟨ log_2 ( 1 / P_m(x_1, x_2, …, x_m) ) ⟩    (37)

This limit always exists if the data stream is stationary and ergodic. These numbers can be calculated on a local window of values and used as a measure.
_______________________________________________________________________________
¹ There is some disagreement in terminology: although it is called first-order entropy, it is often also written as order-0 entropy.


11. Textural Descriptors from Gray Level Co-occurrence Matrices (GLCM)

A very important and widely used set of textural descriptors, in the Statistical Approach, is obtained from information regarding the relative position of pixels with respect to each other. This information is not carried by individual histograms and is considered second-order statistics, since two joint random variables are involved: the intensities of a reference pixel and of a second one, at a different position (an offset Δx, Δy). Such information is present in bi-dimensional histograms of the simultaneous occurrence (hence, co-occurrence, or concurrence) of two pixel intensity bins i, j. Note that the indices i, j do not refer here to pixel coordinates. The size of the bi-dimensional histogram bins (classes) may be very large, but a 0-255 range of gray levels is typically quantized to only four gray-level classes (say, 0-63, 64-127, 128-191, 192-255). Such 2D histograms are thus reduced to L×L arrays of probabilities p_ij (one array per offset (Δx, Δy)); hence the term concurrence or co-occurrence matrices, introduced first by Robert Haralick. Concurrence of pixels is also known as spatial dependence. The co-occurrence of two pixels (the joint event "first pixel with intensity i and second pixel with intensity j") requires the specification of a simple spatial relationship (the aforementioned offset) or placement rule (hence its relationship with the structural approach). These spatial relationships each define a particular co-occurrence matrix


and mainly include adjacency or a short separation, at a given orientation (or, in general, a displacement vector (Δx, Δy)). As with gray levels, the possible ranges of such separations and angular orientations are reduced to a few discrete values (1- to 3-pixel separations and angles of 0°, 45°, 90°, 135°, 180°, 225°, 270° and 315°, often reduced by symmetry to the first 4 directions).

Figure 5. Simple pair relationships can be coded by signed offsets (Δx, Δy) from the central pixel at (x, y); for example, the upper left corner has offsets (4, 4).

The (un-normalized) co-occurrence matrix entries n_ij, for each offset combination (Δx, Δy), are obtained by accumulating a count of pixels (see also eq. 40) satisfying:

n_ij = Σ_{x=0}^{N−1} Σ_{y=0}^{M−1} { 1 if I(x, y) = u_i AND I(x+Δx, y+Δy) = u_j ; 0 otherwise }    (38)


where image intensities u_i, u_j are still in the range [0, 255], but the entries i, j correspond to bins (classes) according to the quantization (usually into four bins; thus i, j ∈ [0, 3] and, for example, u_i ∈ [64i, 64i + 63]). Note that the "AND" condition in equation (38) expresses the joint occurrence of intensities in bins i, j.

Figure. A co-occurrence 4×4 matrix is a histogram of 4×4 bins.

Once normalized, the set of matrix entries {p_ij} becomes the co-occurrence or bi-dimensional histogram of the event ( I(x, y) = u_i AND I(x+Δx, y+Δy) = u_j ), and approximates the joint probability density function, or joint distribution, of the pair of attributes (u_i, u_j).

[Figure: a 4×4 count matrix {n_ij} and its normalization p_ij = n_ij / Σ_{i,j} n_ij, so that Σ_{i,j} p_ij = 1.]


Remember: For each pair of examined pixel attributes (thus, for each offset (Δx, Δy)), a 4×4 co-occurrence matrix is a bi-dimensional histogram. It has 4×4 classes (L = 4 quantized intensities) per pixel pair: one pixel is the reference or independent site (position and angle 0), and the second one is placed at the dependent site (position D and angle θ, also quantized into very few orientations, usually four). There is ONE GLCM per pixel configuration, and each entry p_ij is the frequency (empirical probability, the count n_ij normalized to the total) of pairs in which the first pixel has the intensity (or attribute) bin i while the second pixel has the intensity (or attribute) bin j. Bins or classes partition all gray levels into an L×L array (4×4 if 4 bins are chosen). An image with 256 gray levels in [0, 255] is then quantized into four classes described as intervals (interval number i, or j) such as [0, 63], [64, 127], [128, 191], and [192, 255]. Given one configuration (a pixel and its diagonal neighbor, for example), if the first pixel has a gray level of 68 and the second a gray level of 180, the pair corresponds to bin (i, j) = (2, 3), that is, to entry n_ij = n_23 of the matrix for that configuration, and the count of that entry is incremented. We use labels to identify configurations as offsets "H" (horizontal): (Δx, Δy) = (1,0), "V" (vertical): (0,1), "D" (diagonal): (1,1) and "D′" (anti-diagonal): (-1,1):

[Diagram: the four configurations, showing the reference pixel u_i and its neighbor u_j in the horizontal (H), vertical (V), diagonal (D) and anti-diagonal (D′) positions.]

Most authors present the matrices with a gray-level normalization of I(x, y) in equation (38), replacing intensities u_i and u_j by their respective (row and column) indices i, j, which are directly the entry indices in the GLCMs (and which, by the way, seems clearer):


[Diagram: the same four configurations H, V, D, D′, now labelled with bin indices i, j instead of intensities u_i, u_j.]

The difference is, of course, a look-up table from indices i, j to (quantized) intensities u_i, u_j. We combine both notations in the following example, for a 5×4 pattern (or image).

101   83    2   16  114
 96  119   10    5    0
  4    1  212  239   88
 11    8  196  192   12

Figure 6. (Left) A gray-level pattern (or image), before quantization into three level bins "0", "1" and "2" for the intervals [0,79] (or, equivalently, [0, 80)), [80,159] and [160,255], respectively. The GLCMs are four 3×3 matrices whose entries are the counts n00, n01, n02, n10, n11, n12, n20, n21, n22 that, once normalized by all possibilities for that configuration, become probabilities p00, p01, p02, p10, p11, p12, p20, p21, p22 with values in [0,1]. The quantized image contents are shown at right in Figure 7.


Quantized image (levels 0, 1, 2):

1 1 0 0 1
1 1 0 0 0
0 0 2 2 1
0 0 2 2 0

[Four 3×3 GLCMs M_H, M_V, M_D, M_D′, one per configuration, with rows and columns indexed by bins 0, 1, 2.]

Figure 7. Four different co-occurrence matrices for the quantized gray-level image (levels: 0, 1, 2). In bold red (image gray-level bins, at left), all configurations "D" for entry n10 = 2 (in bold green) in GLCM M_D (at right).

To have probabilities p in [0, 1], the normalization constants are, for matrix M_H (summing all entries): 1/(5+1+2 +2+2+0 +1+1+2) = 1/16; for matrix M_V: 1/15; for matrix M_D: 1/12; and for matrix M_D′: 1/11. For example, the energy of matrix M_D is (see equation (i) in the table below):

E_D = Σ_{i=0}^{2} Σ_{j=0}^{2} p_ij^2 = ( (9+1+4) + (4+1+1) + (1+0+1) ) / 144 = 22/144 ≈ 0.15278.


GLCMs of other point features. In general, GLCM matrices can be obtained not only from gray-level intensity values of pixels or voxels, but also for any attribute (scalar, vector, etc.), local feature or code assigned to one point (corresponding perhaps to local information), or from unstructured data. An example is the run-length codes obtained from the length of runs of pixels or voxels along a row, which are connected if their gray level lies within an interval. A new run-length is registered when the chain is "broken" (a pixel lies outside the tolerance interval). Another interesting example is the spatial dependence of 3D-relief features in the Topographic Primal Sketch, where coordinates in the support domain have local attributes for ridge and ravine lines, saddle points, flats, hills, hillsides, peaks and pits. These relief features can also be thought of as parametric images obtained from the original, consisting of the output of a number of differential operators. We summarize facts, dimensions and ranges that may seem confusing, especially when they coincide:

Image dimensions are N×M; that is, the image domain or support is the rectangle [0, N−1]×[0, M−1], and the intensity attribute ranges over [0, u_max] (for example [0, 255]). A small sample ROI of any shape and size can be analyzed instead of the whole image.

Co-occurrence histograms = co-occurrence matrices (GLCMs) are L×L (always square), where L is the reduced number of gray levels into which the co-domain [0, u_max] is re-quantized, usually 4 gray levels (histogram bins); thus GLCMs are often 4×4. In the example above (Figure 7), we used three levels. Note that the intensities themselves do not enter any calculation. Note also that, to visualize "bin intensities" of the re-quantized image, a display LUT is needed, to map the four levels 0, 1, 2, 3 to display gray-levels 0, 85, 170, 255 (see Figure 8).

Pixel-pair combinations: It is common to use 4 combinations per offset size, with offsets (Δx, Δy) taking values in {0, ±1, ±2, …, ±Offmax = ±D} and chosen so that pairs lie along the horizontal, vertical, diagonal (45°) and anti-diagonal (135°) directions. So, for a fixed separation at four orientations, there are 4 GLCMs, one for each offset (Δx, Δy). For square pixels, orthogonal distances are equal to |Δx|, but diagonal distances become √2·|Δx|. The maximum gap or size offset Offmax is determined by the size of the largest textural features, that is, the scale at which the texture disappears if a Gaussian blur of sigma Offmax is applied (thus, below a spatial Nyquist criterion). In the example above, Offmax = 1.

Multi-resolution analysis: If one prefers to always use the latter choice of Offmax = 1, a multi-resolution or scale-space approach should be used, by reducing resolution (decimation or sub-sampling), or by obtaining a set of convolved images I*G_sigma, with sigma in a range of scale values. This set may in turn be sub-sampled into a set of smaller images where texture is analyzed, for each sigma and/or reduction factor scale. A power-of-two sequence is commonly used, where scale ∈ {1/2, 1/4, …, 1/2^N}.

Other datasets: Since histograms do not depend on the information domain, GLCMs may be obtained not only from a single 2D image (different pixel positions), but also from pixels of two images, two subjects, two image modalities, two adjacent slices, two video frames, two series of data (1D signals), two 3D contours, two attributed graphs, two volumes, two boundary sets, two meshes, two hyper-volumes, etc. The attribute (u) may also be a scalar, a vector (e.g., color images), a tensor, a field, etc.

The GLCM entry (i, j) ∈ [0, L−1]×[0, L−1] holds the count n_ij of pairs of pixels in which the central (reference) pixel has intensity in bin i (note the "normalization of intensities": with 4 bins (classes), there are 4 intensities numbered 0, 1, 2, 3) AND the second pixel, at the selected offset/direction (for that matrix), has intensity in bin j.

Figure 8. (Left) A gray-level quantization from 256 to 4 levels, corresponding to the value intervals (bins 1 to 4): [0, 63], [64, 127], [128, 191], [192, 255]. The gray shades chosen to represent each bin are the values 0, 85, 170 and 255. Compare the latter with (Middle) a quantization from 256 to 16 gray levels, in intervals [16i, 16(i+1)−1], with i = 0, …, 15. A GLCM using this quantization (middle band) would have 16×16 bins (entries). The rightmost bands show a quantization with 32 gray-level intervals.

We also stress that the original gray levels do not enter into any calculation of the GLCMs, but define the bi-dimensional bins or classes (i, j) (i.e., the entry indices of the GLCMs). However, the bin indices themselves may appear in co-occurrence features (or statistic descriptors) obtained from the GLCMs (see the table below).


Some co-occurrence matrix features (for any bidimensional histogram { p_ij }), with p_ij = p(u_i, u_j) = Prob[ u_a = u_i, u_b = u_j ] and i, j ∈ {0, …, L−1}:

(a) Maximum probability:
  max_{i,j} p_ij

(b) 2D vector mode (or set of vector modes) = most frequent bin (i, j):
  argmax_{i,j} p_ij

Probability median vector:
  median_{i,j} ( (u_i, u_j), p_ij )

Marginal probabilities of i and j, and diagonal of the joint histogram:
  p_i = Σ_{j=0}^{L−1} p_ij ,   p_j = Σ_{i=0}^{L−1} p_ij ,   p_ii

(c) First moments (marginal means) of i and j, and magnitude of the marginal means:
  μ_i = Σ_{i,j} i p_ij ,   μ_j = Σ_{i,j} j p_ij ,   (μ_i^2 + μ_j^2)^{1/2}

(d) Second (central) moments (marginal variances of i and j), and magnitude of the marginal variances:
  σ_i^2 = Σ_{i,j} (i − μ_i)^2 p_ij ,   σ_j^2 = Σ_{i,j} (j − μ_j)^2 p_ij ,   (σ_i^2 + σ_j^2)^{1/2}

(e) Cross-moment or covariance of i and j:
  σ_ij = Σ_{i,j} (i − μ_i)(j − μ_j) p_ij

(f) Correlation (1st order) of i and j:
  ρ = σ_ij / (σ_i σ_j)

(g) Element difference moment of order k, h (order 2,1: Contrast):
  Σ_{i,j} (i − j)^k p_ij^h

(h) Inverse element difference, or inverse moment of order k, h:
  Σ_{i ≠ j} p_ij^h / (i − j)^k

(i) Uniformity or Energy (element difference moment of order 0, 2):
  Σ_{i,j} p_ij^2

(j) Joint Entropy:
  H(u_a, u_b) = − Σ_{i,j} p_ij log_2 p_ij

(k) Dissimilarity (equivalent to the Sum of Absolute Differences):
  Σ_{i,j} |i − j| p_ij

(l) Homogeneity:
  Σ_{i,j} p_ij / (1 + |i − j|)

(m) Cluster Tendency (order k):
  Σ_{i,j} (i + j − μ_i − μ_j)^k p_ij

Probability of a run length n for gray level i (assuming the image is Markov):
  p_i(n) = (p_ii / p_i)^{n−1} (1 − p_ii / p_i)

(p) Mutual information:
  MI(u_a, u_b) = H(u_a) + H(u_b) − H(u_a, u_b) = Σ_{i,j} p_ij log_2 ( p_ij / (p_i p_j) )

(q) Joint Rényi entropy of order α:
  H_α = (1/(1−α)) log_2 Σ_{i,j} p_ij^α

Table 2. Some co-occurrence matrix features or statistic descriptors, for any bidimensional histogram { p_ij }. Haralick's GLCM features are shown in light gray boxes.

The marginal means and marginal variances (that is, for each dimension at a time) of a local region centered at each pixel are visualized in a similar fashion to the gradient components ∂I(x,y)/∂x, ∂I(x,y)/∂y: either as a pair of images, or combined as color channels R, G, or as a single magnitude image (μ_i^2 + μ_j^2)^{1/2} or (σ_i^2 + σ_j^2)^{1/2} and (σ_i^2 + σ_j^2 + σ_ij^2)^{1/2}.


Note that such images are not the same as the local variance image (occurrence, not co-occurrence). The contrast definition (element difference moment of order 2,1) of equation (g) is sometimes found formulated as:

F_con = Σ_{k=0}^{L−1} k^2 Σ_{i,j : |i−j| = k} p_ij

Statistic — Description

Contrast: Measures the largest local variations in the gray-level co-occurrence matrix.

Correlation: Measures the joint probability of occurrence of the specified pixel pairs; a low correlation means a low dependence between pixel attributes.

Energy: The sum of squared elements in the GLCM. Also known as uniformity or the angular second moment.

Joint Entropy: Measures randomness, as the amount of information contributed by the pair of pixels; it is related to the presence of order (low entropy) or disorder (high entropy) introduced by the pixel-pair configuration.


Dissimilarity: For a large number of pixels, it is equal to the sum of absolute differences of all the observations (see the 2nd moments), and is a measure of dissimilarity.

Homogeneity: Measures uniformity, expressed as the closeness of the distribution of elements in the GLCM to the GLCM diagonal. If only diagonal values are different from zero, there is no texture (a homogeneous region); if no diagonal dominance exists at all, the texture is rather noise (or very irregular).

Mutual information: When pixel pairs share some property, this feature measures the strength of that common property. It is better suited for comparing two textures (or configurations, or patterns), rather as a measure of similarity.

Table 3. Interpretation of important statistic descriptors (most are self-explanatory).

Equation (38) is impractical for the actual computation of a GLCM; an indirection method is commonly used, scanning the image pixel by pixel at (x, y) and incrementing the count at the matrix entry indicated by the image attributes I(x, y) and I(x+Δx, y+Δy), after quantization to L levels (normalization of frequencies to obtain probabilities must be performed later):

n_ij^(t+1) = n_ij^(t) + 1, with i = I(x, y) and j = I(x+Δx, y+Δy), n_ij^(0) = 0, for t = 0, …, NM−1    (40)


Note that the 2D data is used as a 2D index into the histogram array. A common practice is to increment only Δx, from 1 to Offmax, perform the histogram and feature extraction, and then rotate the image by a desired angular increment Δθ, from 0° to 180°. Texture descriptors are then plotted as functions of θ. Image scaling is also used, instead of changing the separation D between pixels (multi-resolution), and texture descriptors are then plotted in 3D as functions of (θ, scale). Such geometric transformations imply the use of interpolation techniques, in order to deal with discrete pixel positions. GLCMs are then obtained only for the configuration [0, 8], for example, at scales 1/2, 1/4 and 1/8, and for orientations from 0° to 180°, every 10°. This produces 3×18 matrices and many 3D graphics for a single ROI. Two methods are used to speed up the entry-filling of GLCMs:

Sampling the textural region of interest, instead of scanning all pixels. A look-up table of offsets is effective when many offsets/directions are to be explored.

Using sliding windows (also known as roller buffers), in which a local GLCM (one per offset/direction) is updated incrementally. This is more effective when simultaneously considering several large offsets. Figure 9 shows a discrete sliding window for the calculation of a local histogram around each pixel.

In satellite image analysis, co-occurrence matrices are typically large (many bins, for example 64×64 or 16×16, with 4×4 a frequent minimum), there are at least four GLCMs per offset (that is, four orientations at 45° steps), and they tend to be sparse. Besides the co-occurrence matrix features already mentioned (the Haralick features highlighted in Table 2), some sparse-matrix techniques may be applied on large GLCMs to reduce computation. The GLCM statistics may be calculated on the whole image domain, but in practice they are calculated on small, sampled or carefully chosen ROIs (Regions Of Interest). These may correspond to representative or meaningful regions of the image, after segmentation, partition and grouping into labelled, non-overlapping regions, not necessarily connected (for example, the ROI may comprise all round components, or all particles of a certain size or shape). In general, a parametric image of a single GLCM texture descriptor, say the Joint Entropy, can be obtained by considering the local isotropic neighborhood N_R (a disc or ball of radius R) of each pixel and assigning the parameter value to that pixel as intensity. Texture-based segmentation of an image can then be done by characterizing each "different" texture with a feature vector of GLCM parameters. Figure 10 shows an example of such parametric images. Histograms, intensity measurements, morphometry on isodensity regions, and even GLCM-based texture analysis itself, can be done on the resulting set of images A-D.

Finally, to determine the best Offmax values (or, equivalently, the scale for reducing resolution while using a standard Offmax = 2), a useful tool is the empirical variogram, a function describing the degree of spatial dependence of a spatial random field or stochastic process; its sill, range and nugget features provide such correlation information, given a pixel lag. For example, values of Offmax beyond the variogram range of a textured region won't give significant results for any GLCM descriptor of that region.


Figure 9. Sliding buffers (or rolling windows) N_r of radius r = 4, for the calculation of a local, pixel-by-pixel histogram (occurrence). All pixels of the first buffer N_r are read at the pixel p0 = (x0, y); for the next pixel p′0 = (x0+1, y), only the new boundary pixels at the right of N_r are read and added to the histogram, and the old boundary pixels at the left of N_r are subtracted from it. A row update is also done for each (x, y+k). For co-occurrence histograms, two concurrent sliding buffers are used: one for (x, y) and another for the offset pixel (x+Δx, y+Δy), where Δx, Δy ∈ {0, −D, D}.


Figure 10. (Left) The Café Terrace at Night, a painting by Vincent van Gogh. (Right) Parametric images for the Red channel, quantized to 16 gray levels, at an offset of D = 3 pixels, orientation 0°, and local disc ROIs N_R of radius R = 5 pixels, for (A) Contrast, (B) Dissimilarity, (C) Homogeneity and (D) Entropy.


Figure 11. (Left) A color parametric image, where channels R, G, B correspond to Dissimilarity, Homogeneity and Contrast, respectively, for the painting of Figure 10. (Right) A three-dimensional RGB histogram of the color parametric image, shown as a scatter plot, exhibiting correlations among the three GLCM parameters at a time.

12. Binary Texture Analysis, Codewords and Local Binary Patterns

Gray-level or color textures may be studied by separating intensity bins as binary images, each one representing a particular bin (which in turn quantizes the full dynamic range into a small number of classes, typically 4 to 20, acting as "channels"). Thus, binary texture analysis is a special case for which many approaches exist, most of them based on information codification of local bit-by-bit patterns. In this approach, for example, a 3×3 neighborhood in 8-connectivity is coded into an 8-bit word, a texture dictionary is then built, and statistics are obtained from the possible configurations. Moreover, GLCM matrices can be obtained from the image of 8-bit codes extracted from a texture. The following figure illustrates (right) how a binary codeword (with decimal value 180) is obtained for a textural pattern (left), reading bits clockwise:


0 0 1
1 x 0      codeword for pixel x (bit 7 … bit 0): 10110100 = 180
0 1 1

The central bit (in red at left) is ignored, making the codeword robust to pixel noise. To make an isotropic codeword, gray-level intensities (before separation into bit-planes or channels) are interpolated, as illustrated by the following pixel configurations, which also extend the radius R of the neighborhood to larger values:

Square neighborhood — Circular neighborhood (g3, g1, g7, g5 are interpolated)


A sequence of neighborhoods of different radii R is equivalent to sampled multi-resolution. Gaussian filters may also be applied to obtain a pyramid of decreasing-resolution images. In a gray-level analysis similar to co-occurrence (which uses only two pixels), the joint distribution of all eight pixels is obtained, as well as its 8th-order statistics. Texture descriptors in this approach are rotation-invariant. In the Local Binary Pattern (LBP) approach, instead of gray levels, the differences with respect to the central value are kept, making the new textural descriptors contrast-invariant. If only the sign is taken into account, it can be coded as a bit (0 for negative, 1 for positive), and again an 8-bit codeword is obtained. A number of LBP features can then be derived. Codewords, on the other hand, lead to vector-quantization and support-vector machine approaches, as well as Self-Organizing Map (SOM)-based classification.
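A basic 3×3 LBP sketch of the sign-only codeword just described (the starting corner and bit order are conventions that vary between authors; here ties count as 1, and the function name is ours):

```python
import numpy as np

def lbp_codes(img):
    """3x3 Local Binary Pattern: each of the 8 neighbors contributes one bit
    (1 if neighbor >= center), read clockwise from the upper-left corner;
    the central pixel itself is ignored."""
    img = np.asarray(img, dtype=int)
    h, w = img.shape
    # clockwise neighbor offsets, starting at the upper-left corner
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros((h - 2, w - 2), dtype=int)
    c = img[1:-1, 1:-1]                      # central pixels
    for bit, (dy, dx) in enumerate(offs):
        nb = img[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        codes |= (nb >= c).astype(int) << (7 - bit)
    return codes
```

A flat patch maps to 255 (all signs non-negative), a local maximum to 0; histograms of these codes over a region form the texture dictionary mentioned above.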

13. The Variogram

Given a separation h and an orientation θ, the variogram γ(h, θ) is the distribution of the average quadratic difference (variance) in a property (image intensity) between pixels separated by h (the lag) at orientation θ. In Cartesian coordinates we define the vector (hx, hy) = (h cos θ, h sin θ) and define mathematically:

γ(hx, hy) = ½ Var[ I(x, y) − I(x + hx, y + hy) ]
          = ½ E[ ( I(x, y) − I(x + hx, y + hy) )² ]
γ(h)      = ½ E[ ( I(x) − I(x + h) )² ]                                (41)

A variogram is independent of location (x, y) and allows the detection of features that change with orientation θ and separation h. For a fixed orientation θ, a variogram is also the plot of γ(h, θ) against the lag h (similar to a histogram definition). In practice it is estimated by the variance in intensity over N samples of pixel pairs:

γ*(h) = (1 / 2N(h)) Σ_{i,j} ( I(x_i) − I(x_j) )²,   with x_j = x_i + h        (42)

where N(h) is the number of vector separations (that is, the number of samples of pixel pairs separated by h at orientation θ) to be considered, analogous to the number of classes in a histogram. Samples are usually taken at the multiples h, 2h, 3h, …, Nh.

[Figure 10. Variogram for two displacement lags h1, h2: the intensity I_x at x is compared with I_{x+h1} and I_{x+h2}.]
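A minimal sketch of estimator (42) for a single Cartesian lag (hx, hy); the function name and the dense pixel-pair sampling are illustrative choices, not from the notes. Array slicing pairs every pixel with its neighbor at the given lag, and N(h) is the number of such pairs:

```python
import numpy as np

def variogram(img, hx, hy):
    """Empirical variogram estimate for lag (hx, hy):
    gamma*(h) = 1/(2 N(h)) * sum over all N(h) pixel pairs separated
    by (hx, hy) of the squared intensity difference."""
    H, W = img.shape
    # two aligned views of the image, offset by the lag
    a = img[max(0, -hy):H - max(0, hy), max(0, -hx):W - max(0, hx)]
    b = img[max(0, hy):H + min(0, hy), max(0, hx):W + min(0, hx)]
    d = a.astype(np.float64) - b
    return (d * d).sum() / (2.0 * d.size)

# Vertical stripes of period 2: the variogram oscillates with lag
img = np.tile(np.array([0, 1]), (4, 2))   # rows [0, 1, 0, 1]
g1 = variogram(img, 1, 0)                 # odd horizontal lag -> 0.5
g2 = variogram(img, 2, 0)                 # even horizontal lag -> 0.0
```

Evaluating the function at the multiples h, 2h, …, Nh for a fixed direction yields the experimental variogram curve γ*(kh).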


Variogram map: an image integrating variograms over several angular bins (in polar coordinates), or built directly with "pixels" indexed by (hx, hy).

Figure 13. Isocontour plot (from a more detailed version of the previous color plot)

Figure 14. Variogram features Sill and Range in terms of correlated pixels.

14. Principal Component Analysis (PCA) of Discrete Sample Texels

A difficult aspect of any texture is quantifying the answer to this question:

"to what extent does a texture have structural-approach properties?"

A related problem is, given a (structural) texture, to identify (or obtain the "best fit" of) texture primitives (texels), which may be extended to small building regions that repeat under some positioning rules (to be found, too). A qualitative answer to the first question is the complexity of the texel or texels and the complexity (length of description) of the positioning rules… if they are known!

If texels exist, they eventually repeat, even if only with statistical variation. If repetition includes repeated spatial separations, then Fourier analysis and other unitary-transform-based analyses are suitable tools for establishing positioning rules and perhaps identifying texels. If repetition is rather random, and positioning rules include texel rotation, scaling and gradients in spacing, it is more difficult to "identify" such rules, as well as the texels.

A few observations may lead to a new approach to structure-based analysis of a texture. As first defined, a texture image generated by a texel plus positioning rules complies with shift-invariance properties. If a correspondence can be established, then ACF-based analysis can be used to characterize texels. However, under complex positioning rules, several degrees of freedom increase the complexity of correspondence finding. Such degrees of freedom in positioning include scaling, rotations, quasi-periodic translations, gradients, and non-linear deformations such as shearing and perspective distortion, to mention the simplest. Under simple, known variations (rotation and translation, even if random, for example), a texel can be isolated by identifying a group of pixels exhibiting certain invariant properties that somehow repeat in a given image. In a first approach the texel may be defined by its

gray level, shape, or homogeneity of some local property, such as size, orientation, or the second-order histogram as defined by the co-occurrence matrices.

Texels may be re-defined in a discrete image as any sub-image region(s) that either forms a texture by tiling the plane somehow, or spatially correlates with other regions, regardless of what lies between. Such regions will also be called "sample texels", and may be K×K rectangles, where K is smaller than the image dimensions N, M, say, by at least an order of magnitude. The simplest case, when the regions are single pixels, has already been formulated as "pixel co-occurrence analysis", with all its parameters extracted from the co-occurrence histograms (matrices). Co-occurrence of larger regions (sample texels) becomes more complicated, with a combinatorial explosion. A way to extract correlation information between small regions (say, sub-images A ⊂ I of K×K = 4×4 pixels) is to abstract each sample texel as a sample vector (in fact a random-vector variable) of gray-level intensities in a K²-dimensional space. The ordering of the vector components (the coding) is irrelevant, as long as the same ordering rule is used throughout. We may use the following lexicographic order:

A =  a11 a12 a13 a14
     a21 a22 a23 a24
     a31 a32 a33 a34
     a41 a42 a43 a44

coded as the 16-D vector

(a11, a12, a13, a14, a21, a22, a23, a24, a31, a32, a33, a34, a41, a42, a43, a44),

and re-indexed as the column vector

X = [x1 x2 … x16]^T

Any other ordering choice is equally useful, but it must be the same for all samples. If a vector is returned to image space, the inverse ordering rule has to be used. Suppose now that we obtain L sample texels A_n, n = 1, 2, …, L, and code them into the random-variable vectors X_n, n = 1, 2, …, L; then we define the K²×K² covariance matrix:

Σ = < ( X_n − <X> ) ( X_n − <X> )^T >   over all vectors n = 1, 2, …, L.        (45)

where <X> is the expected value (or vector average) of the vectors { X_n }. Note that <X> is a vector, not a scalar (it is not an "average gray level"). Note that in one dimension (X_n^T = (x1)_n), Σ becomes the scalar variance σ² = (1/L) Σ_n ( (x1)_n − <x> )². In the nD case, the (n, m) element of Σ is the usual covariance between the components x_n and x_m (there are
n² such pairs). In the statistics literature there are two conflicting notations for Σ: var(X) and cov(X) (or auto-covariance), plus the two-random-vector extension, the "cross-covariance" cov(X, Y) (for which cov(X, X) coincides with var(X)).

We now perform a Principal Component Analysis (PCA), where the diagonalized Σ shows on its diagonal the de-correlated principal modes of variation of the texel samples. A texel pattern will then correspond to the few strongest principal eigenvectors. If no dominant eigenvectors exist (the eigenvalue "spectrum" is spread over the vector space), then no evident texel pattern exists.

The latter analysis is very powerful and we have only sketched the starting steps. The structure of the covariance matrix allows us to quantify, by sampling many K×K regions of the image, the degree of "structural-ness" (the existence of texel primitives represented as K×K arrays of pixels) of a textured image. It allows such texels to be counted and identified, and several properties of the eigenvector spectrum modulate their importance and repetitiveness.

PCA may also be applied to vectors of textural features, to determine and quantify the dominant features, by defining a feature space in which similar textures are separated by short Euclidean distances (or other metrics) and different textures by large distances. Neural networks may also work on feature spaces, to extract vector descriptors for classification tasks in which a texture class has a complex shape in that feature space.
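The pipeline can be sketched end to end under stated assumptions: flatten sample texels in lexicographic order, form the covariance matrix of Eq. (45), and examine the eigenvalue spectrum. Here the "texels" are synthetic (one fixed 16-D pattern at random amplitudes plus pixel noise, a deliberately easy case), so a single dominant eigenvector is expected; the variable names and thresholds are illustrative:

```python
import numpy as np

# Synthetic experiment: L noisy repetitions of one flattened 4x4 texel,
# at random amplitudes, so PCA should recover a dominant mode.
rng = np.random.default_rng(1)
texel = rng.normal(size=16)                        # hypothetical texel, 16-D
X = np.outer(rng.normal(size=200), texel)          # L = 200 sample vectors
X += 0.1 * rng.normal(size=X.shape)                # additive pixel noise

Xc = X - X.mean(axis=0)                            # subtract <X>
Sigma = Xc.T @ Xc / X.shape[0]                     # Eq. (45): K^2 x K^2

eigvals, eigvecs = np.linalg.eigh(Sigma)           # eigenvalues ascending
ratio = eigvals[-1] / eigvals.sum()                # energy in strongest mode
v = eigvecs[:, -1]                                 # candidate texel pattern
align = abs(v @ texel) / np.linalg.norm(texel)     # |cos| with true texel
# A ratio near 1 with high alignment indicates an evident texel pattern;
# a flat spectrum (small ratio) would indicate no dominant texel.
```

On real images the same diagnostics apply, with X built from K×K patches of the texture instead of synthetic vectors.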

Other methods and algorithms (see Tutorial 2):

- Porous-material properties: percolation, mean length of branches
- Mathematical Morphology filtering (granulometries)
- Stereological approach (also considering stochastic geometry)
- Spectrograms and spectral power features
- Markovian Random Field descriptions and methods
- Run lengths: spatial information + co-occurrence; GLCM of run-length-code features
- Mosaic models (such as Voronoi Zones Of Influence)
- Laws filter banks, Gabor and Hermite-transform descriptions
- Local Binary Patterns II
- Dynamic systems (attractors) and fractal analysis
- Surface intensity, 3D approach or Topographic Primal Sketch (facets)
- Inter-texture and multi-texture analysis
- Color-science image processing and analysis of textures
- 3D textures