
Color Space


In computer vision, segmentation refers to the process of partitioning a digital image into multiple segments (sets of pixels, also known as superpixels). The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze.[1] Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain visual characteristics.

The result of image segmentation is a set of segments that collectively cover the entire image, or a set of contours extracted from the image (see edge detection). Each of the pixels in a region are similar with respect to some characteristic or computed property, such as color, intensity, or texture. Adjacent regions are significantly different with respect to the same characteristic(s).[1] When applied to a stack of images, typical in Medical imaging, the resulting contours after image segmentation can be used to create 3D reconstructions with the help of interpolation algorithms like Marching cubes.

Contents

1 Applications
2 Thresholding
3 Clustering methods
4 Compression-based methods
5 Histogram-based methods
6 Edge detection
7 Region growing methods
8 Split-and-merge methods
9 Partial differential equation-based methods
  9.1 Level set methods
10 Graph partitioning methods
11 Watershed transformation
12 Model based segmentation
13 Multi-scale segmentation
  13.1 One-dimensional hierarchical signal segmentation
  13.2 Image segmentation and primal sketch
14 Semi-automatic segmentation
15 Neural networks segmentation
16 Proprietary software
17 Open source software
18 Benchmarking
19 See also
20 External links
21 References


Applications

Some of the practical applications of image segmentation are:

Medical imaging [2]
o Locate tumors and other pathologies
o Measure tissue volumes
o Computer-guided surgery
o Diagnosis
o Treatment planning
o Study of anatomical structure

Locate objects in satellite images (roads, forests, etc.)
Face recognition
Fingerprint recognition
Traffic control systems
Brake light detection
Machine vision
Agricultural imaging – crop disease detection

Several general-purpose algorithms and techniques have been developed for image segmentation. Since there is no general solution to the image segmentation problem, these techniques often have to be combined with domain knowledge in order to be effective for a particular problem domain.

Thresholding

The simplest method of image segmentation is called the thresholding method. This method applies a clip level (or threshold value) to turn a gray-scale image into a binary image.

The key to this method is selecting the threshold value (or values when multiple levels are used). Several popular methods are used in industry, including the maximum entropy method and Otsu's method (maximum variance); k-means clustering can also be used.
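A minimal MATLAB sketch of thresholding is shown below; the grayscale image MyGray (assumed to be a double matrix with values in [0,1]) and the hand-picked clip level are illustrative assumptions, and graythresh requires the Image Toolbox.

% Threshold a grayscale image MyGray at a fixed clip level
T = 0.5;                      % clip level chosen by hand for this sketch
MyBinary = (MyGray > T);      % logical image: 1 = object, 0 = background

% Otsu's method picks T automatically by maximizing the between-class variance
T = graythresh(MyGray);       % [Image Toolbox]
MyBinary = (MyGray > T);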

Clustering methods

The K-means algorithm is an iterative technique that is used to partition an image into K clusters. The basic algorithm is:

1. Pick K cluster centers, either randomly or based on some heuristic.
2. Assign each pixel in the image to the cluster that minimizes the distance between the pixel and the cluster center.
3. Re-compute the cluster centers by averaging all of the pixels in the cluster.
4. Repeat steps 2 and 3 until convergence is attained (e.g., no pixels change clusters).


In this case, distance is the squared or absolute difference between a pixel and a cluster center. The difference is typically based on pixel color, intensity, texture, and location, or a weighted combination of these factors. K can be selected manually, randomly, or by a heuristic.

This algorithm is guaranteed to converge, but it may not return the optimal solution. The quality of the solution depends on the initial set of clusters and the value of K.

In statistics and machine learning, the k-means algorithm is a clustering algorithm that partitions n objects into k clusters, where k < n. It is similar to the expectation-maximization algorithm for mixtures of Gaussians in that both attempt to find the centers of natural clusters in the data. The model requires that the object attributes correspond to elements of a vector space. The objective it tries to achieve is to minimize total intra-cluster variance, or, the squared error function. k-means clustering was invented in 1956.

The most common form of the algorithm uses an iterative refinement heuristic known as Lloyd's algorithm. Lloyd's algorithm starts by partitioning the input points into k initial sets, either at random or using some heuristic. It then calculates the mean point, or centroid, of each set. It constructs a new partition by associating each point with the closest centroid. The centroids are then recalculated for the new clusters, and the algorithm is repeated by alternating these two steps until convergence, which is obtained when the points no longer switch clusters (or, alternatively, when the centroids no longer change). Lloyd's algorithm and k-means are often used synonymously, but in reality Lloyd's algorithm is a heuristic for solving the k-means problem; with certain combinations of starting points and centroids, Lloyd's algorithm can in fact converge to the wrong answer. Other variations exist, but Lloyd's algorithm has remained popular because it converges extremely quickly in practice.

In terms of performance, the algorithm is not guaranteed to return a global optimum. The quality of the final solution depends largely on the initial set of clusters and may, in practice, be much poorer than the global optimum. Since the algorithm is extremely fast, a common method is to run it several times and return the best clustering found. A drawback of the k-means algorithm is that the number of clusters k is an input parameter: an inappropriate choice of k may yield poor results. The algorithm also assumes that the variance is an appropriate measure of cluster scatter.
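The iteration can be sketched in a few lines of MATLAB without any toolbox; here the clustering is done on gray levels only, and the image u (double, values in [0,1]) and the choice K = 3 are illustrative assumptions.

% Minimal Lloyd's iteration on the gray levels of an image u
K = 3;                                     % number of clusters, chosen by hand
x = u(:);                                  % treat each pixel intensity as a data point
centers = linspace(min(x),max(x),K)';      % heuristic initialization

for iter = 1:100
    % Assign each pixel to the nearest cluster center
    [~,labels] = min(abs(bsxfun(@minus,x,centers')),[],2);
    % Recompute each center as the mean of its assigned pixels
    newCenters = accumarray(labels,x,[K,1],@mean,NaN);
    newCenters(isnan(newCenters)) = centers(isnan(newCenters));  % keep empty clusters in place
    if max(abs(newCenters - centers)) < 1e-6, break; end          % convergence test
    centers = newCenters;
end

SegmentLabels = reshape(labels,size(u));   % label image with values 1..K

Clustering on intensity alone ignores texture and location; in practice each pixel is usually represented by a feature vector combining these factors, as described above.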

Compression-based methods

Compression based methods postulate that the optimal segmentation is the one that minimizes, over all possible segmentations, the coding length of the data.[3][4] The connection between these two concepts is that segmentation tries to find patterns in an image and any regularity in the image can be used to compress it. The method describes each segment by its texture and boundary shape. Each of these components is modeled by a probability distribution function and its coding length is computed as follows:

1. The boundary encoding leverages the fact that regions in natural images tend to have a smooth contour. This prior is used by Huffman coding to encode the difference chain code of the contours in an image. Thus, the smoother a boundary is, the shorter the coding length it attains.

2. Texture is encoded by lossy compression in a way similar to the minimum description length (MDL) principle, but here the length of the data given the model is approximated by the number of samples times the entropy of the model. The texture in each region is modeled by a multivariate normal distribution whose entropy has a closed form expression. An interesting property of this model is that the estimated entropy bounds the true entropy of the data from above. This is because among all distributions with a given mean and covariance, the normal distribution has the largest entropy. Thus, the true coding length cannot be more than what the algorithm tries to minimize.

For any given segmentation of an image, this scheme yields the number of bits required to encode that image based on the given segmentation. Thus, among all possible segmentations of an image, the goal is to find the segmentation which produces the shortest coding length. This can be achieved by a simple agglomerative clustering method. The distortion in the lossy compression determines the coarseness of the segmentation and its optimal value may differ for each image. This parameter can be estimated heuristically from the contrast of textures in an image. For example, when the textures in an image are similar, such as in camouflage images, stronger sensitivity and thus lower quantization is required.

Histogram-based methods

Histogram-based methods are very efficient when compared to other image segmentation methods because they typically require only one pass through the pixels. In this technique, a histogram is computed from all of the pixels in the image, and the peaks and valleys in the histogram are used to locate the clusters in the image.[1] Color or intensity can be used as the measure.
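A heuristic sketch of this idea for a roughly bimodal grayscale image u (double, values in [0,1]; the bin count, smoothing width, and peak-exclusion window are arbitrary choices) is:

% Histogram-based threshold: place the threshold in the valley between the two main peaks
bins = linspace(0,1,256);
counts = hist(u(:),bins);                  % gray-level histogram
counts = conv(counts,ones(1,5)/5,'same');  % light smoothing so spurious dips are ignored

[~,p1] = max(counts);                      % strongest peak
masked = counts;
masked(max(1,p1-20):min(256,p1+20)) = 0;
[~,p2] = max(masked);                      % second peak, away from the first

lo = min(p1,p2);  hi = max(p1,p2);
[~,rel] = min(counts(lo:hi));              % deepest valley between the two peaks
T = bins(lo + rel - 1);
Segmented = (u > T);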

A refinement of this technique is to recursively apply the histogram-seeking method to clusters in the image in order to divide them into smaller clusters. This is repeated with smaller and smaller clusters until no more clusters are formed.[1][5]

One disadvantage of the histogram-seeking method is that it may be difficult to identify significant peaks and valleys in the image. In this technique of image classification, distance metrics and integrated region matching are commonly used.

Histogram-based approaches can also be quickly adapted to operate over multiple frames while maintaining their single-pass efficiency. The histogram can be computed in multiple ways when multiple frames are considered. The same approach that is taken with one frame can be applied to multiple frames, and after the results are merged, peaks and valleys that were previously difficult to identify are more likely to be distinguishable. The histogram can also be applied on a per-pixel basis, where the resulting information is used to determine the most frequent color for the pixel location. This approach segments based on active objects and a static environment, resulting in a different type of segmentation useful in video tracking.

Edge detection

Edge detection is a well-developed field on its own within image processing. Region boundaries and edges are closely related, since there is often a sharp adjustment in intensity at the region boundaries. Edge detection techniques have therefore been used as the base of another segmentation technique.

The edges identified by edge detection are often disconnected. To segment an object from an image however, one needs closed region boundaries.

Region growing methods

The first region growing method was the seeded region growing method. This method takes a set of seeds as input along with the image. The seeds mark each of the objects to be segmented. The regions are iteratively grown by comparing all unallocated neighbouring pixels to the regions. The difference between a pixel's intensity value and the region's mean, δ, is used as a measure of similarity. The pixel with the smallest difference measured this way is allocated to the respective region. This process continues until all pixels are allocated to a region.
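The sketch below grows a single region from one hand-picked seed with a fixed similarity threshold; the image u, the seed coordinates, and T are illustrative assumptions, and the full algorithm instead keeps a priority queue over the unallocated neighbours of all regions.

% Simplified region growing from one seed in a grayscale image u (double, [0,1])
seed = [120, 85];                             % [row, col] of the seed pixel (hand-picked)
T = 0.08;                                     % maximum allowed |intensity - region mean|

[M,N] = size(u);
R = false(M,N);  R(seed(1),seed(2)) = true;   % region mask
front = seed;                                 % pixels whose neighbours still need testing
while ~isempty(front)
    p = front(1,:);  front(1,:) = [];
    for d = [0 1; 0 -1; 1 0; -1 0]'           % 4-connected neighbours
        q = p + d';
        if q(1) >= 1 && q(1) <= M && q(2) >= 1 && q(2) <= N && ~R(q(1),q(2))
            if abs(u(q(1),q(2)) - mean(u(R))) <= T
                R(q(1),q(2)) = true;          % allocate the pixel to the region
                front(end+1,:) = q;           %#ok<AGROW>
            end
        end
    end
end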

Seeded region growing requires seeds as additional input. The segmentation results are dependent on the choice of seeds. Noise in the image can cause the seeds to be poorly placed. Unseeded region growing is a modified algorithm that doesn't require explicit seeds. It starts off with a single region A1 – the pixel chosen here does not significantly influence the final segmentation. At each iteration it considers the neighbouring pixels in the same way as seeded region growing. It differs from seeded region growing in that if the minimum δ is less than a predefined threshold T, the pixel is added to the respective region Aj. If not, the pixel is considered significantly different from all current regions Ai and a new region An+1 is created with this pixel.

One variant of this technique, proposed by Haralick and Shapiro (1985),[1] is based on pixel intensities. The mean and scatter of the region and the intensity of the candidate pixel is used to compute a test statistic. If the test statistic is sufficiently small, the pixel is added to the region, and the region’s mean and scatter are recomputed. Otherwise, the pixel is rejected, and is used to form a new region.

A special region growing method is called λ-connected segmentation (see also lambda-connectedness). It is based on pixel intensities and neighborhood linking paths. A degree of connectivity (connectedness) will be calculated based on a path that is formed by pixels. For a certain value of λ, two pixels are called λ-connected if there is a path linking those two pixels and the connectedness of this path is at least λ. λ-connectedness is an equivalence relation.[6]

Split-and-merge methods

Split-and-merge segmentation is based on a quadtree partition of an image. It is sometimes called quadtree segmentation.

This method starts at the root of the tree, which represents the whole image. If it is found non-uniform (not homogeneous), then it is split into four child squares (the splitting process), and so forth. Conversely, if four child squares are homogeneous, they can be merged as several connected components (the merging process). The node in the tree is a segmented node. This process continues recursively until no further splits or merges are possible.[7][8] When a special data structure is involved in the implementation, the time complexity can reach O(n log n), an optimal algorithm of the method.[9]
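If the Image Toolbox is available, the splitting half of this scheme can be explored with qtdecomp; the grayscale image u and the threshold are illustrative assumptions, and the merging step is not shown.

% Quadtree splitting: a block is split whenever the range of its gray values
% exceeds the threshold. qtdecomp expects square images with power-of-two size.
S = qtdecomp(u, 0.15);            % sparse matrix of block sizes [Image Toolbox]
[r,c] = find(S);                  % top-left corner of every leaf block
BlockSizes = full(S(S > 0));      % side length of every leaf block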

Partial differential equation-based methods

Using a partial differential equation (PDE)-based method and solving the PDE by a numerical scheme, one can segment the image.

Level set methods

Curve propagation is a popular technique in image analysis for object extraction, object tracking, stereo reconstruction, etc. The central idea behind such an approach is to evolve a curve towards the lowest potential of a cost function, where its definition reflects the task to be addressed and imposes certain smoothness constraints. Lagrangian techniques are based on parameterizing the contour according to some sampling strategy and then evolve each element according to image and internal terms. While such a technique can be very efficient, it suffers from various limitations like deciding on the sampling strategy, estimating the internal geometric properties of the curve, changing its topology, addressing problems in higher dimensions, etc. In each case, a partial differential equation (PDE) called the level set equation is solved by finite differences.

The level set method was initially proposed to track moving interfaces by Osher and Sethian in 1988 and has spread across various imaging domains in the late nineties. It can be used to efficiently address the problem of curve/surface/etc. propagation in an implicit manner. The central idea is to represent the evolving contour using a signed function, where its zero level corresponds to the actual contour. Then, according to the motion equation of the contour, one can easily derive a similar flow for the implicit surface that when applied to the zero-level will reflect the propagation of the contour. The level set method encodes numerous advantages: it is implicit, parameter free, provides a direct way to estimate the geometric properties of the evolving structure, can change the topology and is intrinsic. Furthermore, they can be used to define an optimization framework as proposed by Zhao, Merriman and Osher in 1996. Therefore, one can conclude that it is a very convenient framework to address numerous applications of computer vision and medical image analysis.[10] Furthermore, research into various level set data structures has led to very efficient implementations of this method.

Graph partitioning methods

Graph partitioning methods can effectively be used for image segmentation. In these methods, the image is modeled as a weighted, undirected graph. Usually a pixel or a group of pixels is associated with a node, and edge weights define the (dis)similarity between neighborhood pixels. The graph (image) is then partitioned according to a criterion designed to model "good" clusters. Each partition of the nodes (pixels) output from these algorithms is considered an object segment in the image. Some popular algorithms of this category are normalized cuts,[11] random walker,[12] minimum cut,[13] isoperimetric partitioning,[14] and minimum spanning tree-based segmentation.[15]

Watershed transformation

The watershed transformation considers the gradient magnitude of an image as a topographic surface. Pixels having the highest gradient magnitude intensities (GMIs) correspond to watershed lines, which represent the region boundaries. Water placed on any pixel enclosed by a common watershed line flows downhill to a common local intensity minimum (LIM). Pixels draining to a common minimum form a catch basin, which represents a segment.
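A sketch with the Image Toolbox watershed function, using a Sobel gradient magnitude of a grayscale image u (an assumption), looks like this; raw gradients typically give severe over-segmentation, so smoothing or marker imposition (for example with imimposemin) is usually added.

% Watershed segmentation of a grayscale image u (double, [0,1]) [Image Toolbox]
hx = [1,0,-1; 2,0,-2; 1,0,-1];                 % Sobel derivative filters
gradmag = sqrt(conv2(u,hx,'same').^2 + conv2(u,hx','same').^2);
L = watershed(gradmag);                        % label matrix; 0 marks the watershed lines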

Model based segmentation

The central assumption of such an approach is that structures of interest/organs have a repetitive form of geometry. Therefore, one can seek a probabilistic model that explains the variation of the shape of the organ and then, when segmenting an image, impose constraints using this model as a prior. Such a task involves (i) registration of the training examples to a common pose, (ii) probabilistic representation of the variation of the registered samples, and (iii) statistical inference between the model and the image. State-of-the-art methods in the literature for knowledge-based segmentation involve active shape and appearance models, active contours, deformable templates and level-set based methods.

Multi-scale segmentation

Image segmentations are computed at multiple scales in scale-space and sometimes propagated from coarse to fine scales; see scale-space segmentation.

Segmentation criteria can be arbitrarily complex and may take into account global as well as local criteria. A common requirement is that each region must be connected in some sense.

One-dimensional hierarchical signal segmentation

Witkin's seminal work[16][17] in scale space included the notion that a one-dimensional signal could be unambiguously segmented into regions, with one scale parameter controlling the scale of segmentation.

A key observation is that the zero-crossings of the second derivatives (minima and maxima of the first derivative or slope) of multi-scale-smoothed versions of a signal form a nesting tree, which defines hierarchical relations between segments at different scales. Specifically, slope extrema at coarse scales can be traced back to corresponding features at fine scales. When a slope maximum and slope minimum annihilate each other at a larger scale, the three segments that they separated merge into one segment, thus defining the hierarchy of segments.
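This behaviour can be observed with a short MATLAB experiment; the signal x is a stand-in and the scales are arbitrary choices.

% Zero-crossings of the second derivative of Gaussian-smoothed copies of a 1D signal
x = randn(1,512);                              % stand-in for a real signal
sigmas = [2 4 8 16 32];
boundaries = cell(size(sigmas));
for i = 1:length(sigmas)
    s = sigmas(i);
    t = -ceil(4*s):ceil(4*s);
    g = exp(-t.^2/(2*s^2));  g = g/sum(g);     % Gaussian kernel at scale s
    xs = conv(x, g, 'same');                   % smoothed signal
    d2 = diff(xs, 2);                          % second derivative by finite differences
    boundaries{i} = find(d2(1:end-1).*d2(2:end) < 0);   % sign changes = segment boundaries
end
% length(boundaries{i}) shrinks as the scale grows: boundaries annihilate in pairs.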

Image segmentation and primal sketch


There have been numerous research works in this area, out of which a few have now reached a state where they can be applied either with interactive manual intervention (usually with application to medical imaging) or fully automatically. The following is a brief overview of some of the main research ideas that current approaches are based upon.

The nesting structure that Witkin described is, however, specific for one-dimensional signals and does not trivially transfer to higher-dimensional images. Nevertheless, this general idea has inspired several other authors to investigate coarse-to-fine schemes for image segmentation. Koenderink[18] proposed to study how iso-intensity contours evolve over scales and this approach was investigated in more detail by Lifshitz and Pizer.[19] Unfortunately, however, the intensity of image features changes over scales, which implies that it is hard to trace coarse-scale image features to finer scales using iso-intensity information.

Lindeberg[20][21] studied the problem of linking local extrema and saddle points over scales, and proposed an image representation called the scale-space primal sketch which makes explicit the relations between structures at different scales, and also makes explicit which image features are stable over large ranges of scale including locally appropriate scales for those. Bergholm proposed to detect edges at coarse scales in scale-space and then trace them back to finer scales with manual choice of both the coarse detection scale and the fine localization scale.

Gauch and Pizer[22] studied the complementary problem of ridges and valleys at multiple scales and developed a tool for interactive image segmentation based on multi-scale watersheds. The use of multi-scale watersheds with application to the gradient map has also been investigated by Olsen and Nielsen[23] and been carried over to clinical use by Dam.[24] Vincken et al.[25] proposed a hyperstack for defining probabilistic relations between image structures at different scales. The use of stable image structures over scales has been furthered by Ahuja[26][27] and his co-workers into a fully automated system.

More recently, these ideas for multi-scale image segmentation by linking image structures over scales have been picked up by Florack and Kuijper.[28] Bijaoui and Rué[29] associate structures detected in scale-space above a minimum noise threshold into an object tree which spans multiple scales and corresponds to a kind of feature in the original signal. Extracted features are accurately reconstructed using an iterative conjugate gradient matrix method.

Semi-automatic segmentation

In this kind of segmentation, the user outlines the region of interest with mouse clicks, and algorithms are applied so that the path that best fits the edge of the image is shown.

Techniques like Siox, Livewire, Intelligent Scissors or IT-SNAPS are used in this kind of segmentation.

Neural networks segmentation


Neural network segmentation relies on processing small areas of an image using an artificial neural network[30] or a set of neural networks. After such processing, the decision-making mechanism marks the areas of the image according to the category recognized by the neural network. A type of network designed especially for this is the Kohonen map.

Pulse-coupled neural networks (PCNNs) are neural models proposed by modeling a cat's visual cortex and developed for high-performance biomimetic image processing. In 1989, Eckhorn introduced a neural model to emulate the mechanism of the cat's visual cortex. The Eckhorn model provided a simple and effective tool for studying the visual cortex of small mammals, and was soon recognized as having significant application potential in image processing. In 1994, the Eckhorn model was adapted to be an image processing algorithm by Johnson, who termed this algorithm the Pulse-Coupled Neural Network. Over the past decade, PCNNs have been utilized for a variety of image processing applications, including image segmentation, feature generation, face extraction, motion detection, region growing, and noise reduction.

A PCNN is a two-dimensional neural network. Each neuron in the network corresponds to one pixel in an input image, receiving its corresponding pixel's color information (e.g. intensity) as an external stimulus. Each neuron also connects with its neighboring neurons, receiving local stimuli from them. The external and local stimuli are combined in an internal activation system, which accumulates the stimuli until the accumulation exceeds a dynamic threshold, resulting in a pulse output. Through iterative computation, PCNN neurons produce temporal series of pulse outputs. These temporal series of pulse outputs contain information about the input images and can be utilized for various image processing applications, such as image segmentation and feature generation. Compared with conventional image processing means, PCNNs have several significant merits, including robustness against noise, independence of geometric variations in input patterns, and the capability of bridging minor intensity variations in input patterns.
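A minimal sketch of this iteration for a grayscale stimulus image S in [0,1] is given below; all parameter values and the number of iterations are illustrative assumptions.

% Minimal pulse-coupled neural network iteration
alphaF = 0.1;  alphaL = 1.0;  alphaT = 0.3;    % decay constants
VF = 0.5;  VL = 0.2;  VT = 20;  beta = 0.1;    % magnitude scalings and linking strength
W = [0.5 1 0.5; 1 0 1; 0.5 1 0.5];             % local linking kernel

F = zeros(size(S));  L = F;  Y = F;  Theta = ones(size(S));
FiringTime = zeros(size(S));
for n = 1:30
    K = conv2(Y, W, 'same');                   % pulses received from neighbouring neurons
    F = exp(-alphaF)*F + VF*K + S;             % feeding input (external stimulus)
    L = exp(-alphaL)*L + VL*K;                 % linking input (local stimulus)
    U = F .* (1 + beta*L);                     % internal activation
    Y = double(U > Theta);                     % pulse output
    Theta = exp(-alphaT)*Theta + VT*Y;         % dynamic threshold
    FiringTime(Y == 1 & FiringTime == 0) = n;  % first firing time of each pixel
end
% Pixels that first fire at the same iteration tend to belong to the same region,
% so FiringTime can serve as a coarse segmentation map.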

Proprietary software

Several proprietary software packages are available for performing image segmentation:

Pac-n-Zoom Color offers proprietary software that color-segments over 16 million colors at photographic quality. A manual is available at http://www.accelerated-io.com/usr_man.htm [1]

TurtleSeg is a free interactive 3D image segmentation tool. It supports many common medical image file formats and allows the user to export their segmentation as a binary mask or a 3D surface.


Image Processing with MATLAB

1. Image representation
2. Reading and writing image files
3. Basic operations
4. Linear filters
5. Nonlinear filters
6. Special topics

This tutorial discusses how to use MATLAB for image processing. Some familiarity with MATLAB is assumed (you should know how to use matrices and write an M-file).

It is helpful to have the MATLAB Image Processing Toolbox, but fortunately, no toolboxes are needed for most operations. Commands requiring the Image Toolbox are indicated with [Image Toolbox].

Image representation

There are five types of images in MATLAB.

1. Grayscale. A grayscale image M pixels tall and N pixels wide is represented as a matrix of double datatype of size M×N. Element values (e.g., MyImage(m,n)) denote the pixel grayscale intensities in [0,1] with 0=black and 1=white.

2. Truecolor RGB. A truecolor red-green-blue (RGB) image is represented as a three-dimensional M×N×3 double matrix. Each pixel has red, green, blue components along the third dimension with values in [0,1], for example, the color components of pixel (m,n) are MyImage(m,n,1) = red, MyImage(m,n,2) = green, MyImage(m,n,3) = blue.

3. Indexed. Indexed (paletted) images are represented with an index matrix of size M×N and a colormap matrix of size K×3. The colormap holds all colors used in the image and the index matrix represents the pixels by referring to colors in the colormap. For example, if the 22nd color is magenta MyColormap(22,:) = [1,0,1], then MyImage(m,n) = 22 is a magenta-colored pixel.

4. Binary. A binary image is represented by an M×N logical matrix where pixel values are 1 (true) or 0 (false).

5. uint8. This type uses less memory and some operations compute faster than with double types. For simplicity, this tutorial does not discuss uint8 further.

Grayscale is usually the preferred format for image processing. In cases requiring color, an RGB color image can be decomposed and handled as three separate grayscale images. Indexed images must be converted to grayscale or RGB for most operations.

Below are some common manipulations and conversions. A few commands require the Image Toolbox and are indicated with [Image Toolbox].


% Display a grayscale or binary image
image(MyGray*255);
axis image
colormap(gray(256));

% Display an RGB image (error if any element outside of [0,1])
image(MyRGB);
axis image
% Display an RGB image (clips elements to [0,1])
image(min(max(MyRGB,0),1));
axis image

% Display an indexed image
image(MyIndexed);
axis image
colormap(MyColormap);

% Separate the channels of an RGB image
MyRed = MyRGB(:,:,1);
MyGreen = MyRGB(:,:,2);
MyBlue = MyRGB(:,:,3);
% Put the channels back together
MyRGB = cat(3,MyRed,MyGreen,MyBlue);

% Convert grayscale to RGB
MyRGB = cat(3,MyGray,MyGray,MyGray);

% Convert RGB to grayscale using simple average
MyGray = mean(MyRGB,3);
% Convert RGB to grayscale using NTSC weighting [Image Toolbox]
MyGray = rgb2gray(MyRGB);
% Convert RGB to grayscale using NTSC weighting
MyGray = 0.299*MyRGB(:,:,1) + 0.587*MyRGB(:,:,2) + 0.114*MyRGB(:,:,3);

% Convert indexed image to RGB [Image Toolbox]
MyRGB = ind2rgb(MyIndexed,MyColormap);
% Convert indexed image to RGB
MyRGB = reshape(cat(3,MyColormap(MyIndexed,1),MyColormap(MyIndexed,2),...
    MyColormap(MyIndexed,3)),size(MyIndexed,1),size(MyIndexed,2),3);

% Convert an RGB image to indexed using K colors [Image Toolbox]
[MyIndexed,MyColormap] = rgb2ind(MyRGB,K);

% Convert binary to grayscale
MyGray = double(MyBinary);

% Convert grayscale to binary
MyBinary = (MyGray > 0.5);


Reading and writing image files

MATLAB can read and write images with the imread and imwrite commands. Although a fair number of file formats are supported, some are not. Use imformats to see what your installation supports:

>> imformats

EXT       ISA      INFO        READ      WRITE     ALPHA  DESCRIPTION
---------------------------------------------------------------------------------
bmp       isbmp    imbmpinfo   readbmp   writebmp  0      Windows Bitmap (BMP)
gif       isgif    imgifinfo   readgif   writegif  0      Graphics Interchange Format (GIF)
jpg jpeg  isjpg    imjpginfo   readjpg   writejpg  0      Joint Photographic Experts Group (JPEG)
pbm       ispbm    impnminfo   readpnm   writepnm  0      Portable Bitmap (PBM)
pcx       ispcx    impcxinfo   readpcx   writepcx  0      Windows Paintbrush (PCX)
pgm       ispgm    impnminfo   readpnm   writepnm  0      Portable Graymap (PGM)
png       ispng    impnginfo   readpng   writepng  1      Portable Network Graphics (PNG)
pnm       ispnm    impnminfo   readpnm   writepnm  0      Portable Any Map (PNM)
ppm       isppm    impnminfo   readpnm   writepnm  0      Portable Pixmap (PPM)
...

When reading images, an unfortunate problem is that imread returns the image data in uint8 datatype, which must be converted to double and rescaled before use. So instead of calling imread directly, I use the following M-file function to read and convert images:

function Img = getimage(Filename)
%GETIMAGE  Read an image given a filename
%   V = GETIMAGE(FILENAME) where FILENAME is an image file.  The image is
%   returned either as an MxN double matrix for a grayscale image or as an
%   MxNx3 double matrix for a color image, with elements in [0,1].

% Pascal Getreuer 2008-2009

% Read the file
[Img,Map,Alpha] = imread(Filename);
Img = double(Img);

if ~isempty(Map)
    % Convert indexed image to RGB
    Img = Img + 1;
    Img = reshape(cat(3,Map(Img,1),Map(Img,2),Map(Img,3)),size(Img,1),size(Img,2),3);
else
    Img = Img/255;   % Rescale to [0,1]
end


Right-click and save getimage.m to use this M-function. If image baboon.png is in the current directory (or somewhere in the MATLAB search path), you can read it with MyImage = getimage('baboon.png'). You can also use partial paths, for example if the image is in <current directory>/images/ with getimage('images/baboon.png').

To write a grayscale or RGB image, use

imwrite(MyImage,'myimage.png');

Take care that MyImage is a double matrix with elements in [0,1]—if improperly scaled, the saved file will probably be blank.

When writing image files, I highly recommend using the PNG file format. This format is a reliable choice since it is lossless, supports truecolor RGB, and compresses pretty well. Use other formats with caution.

Basic operations

Below are some basic operations on a grayscale image u. Commands requiring the Image Toolbox are indicated with [Image Toolbox].

% Statistics
uMax = max(u(:));              % Compute the maximum value
uMin = min(u(:));              % Minimum
uPower = sum(u(:).^2);         % Power
uAvg = mean(u(:));             % Average
uVar = var(u(:));              % Variance
uMed = median(u(:));           % Median
hist(u(:),linspace(0,1,256));  % Plot histogram

% Basic manipulations
uClip = min(max(u,0),1);                       % Clip elements to [0,1]
uPad = u([1,1:end,end],[1,1:end,end]);         % Pad image with one-pixel margin
uPad = padarray(u,[k,k],'replicate');          % Pad image with k-pixel margin [Image Toolbox]
uCrop = u(RowStart:RowEnd,ColStart:ColEnd);    % Crop image
uFlip = flipud(u);                             % Flip in the up/down direction
uFlip = fliplr(u);                             % Flip left/right
uResize = imresize(u,ScaleFactor);             % Interpolate image [Image Toolbox]
uRot = rot90(u,k);                             % Rotate by k*90 degrees with integer k
uRot = imrotate(u,Angle);                      % Rotate by Angle degrees [Image Toolbox]
uc = (u - min(u(:)))/(max(u(:)) - min(u(:)));  % Stretch contrast to [0,1]
uq = round(u*(K-1))/(K-1);                     % Quantize to K graylevels {0,1/(K-1),2/(K-1),...,1}

% Simulating noise
uNoisy = u + randn(size(u))*sigma;   % Add white Gaussian noise of standard deviation sigma

Page 14: Color Space

% Salt and pepper noise
uNoisy = u;
m = (rand(size(u)) < p);             % Select a fraction p of the pixels
uNoisy(m) = round(rand(nnz(m),1));   % Set them randomly to 0 or 1

% Debugging
any(~isfinite(u(:)))   % Check if any elements are infinite or NaN
nnz(u > 0.5)           % Count how many elements satisfy some condition

(Note: For any array, the syntax u(:) means “unroll u into a column vector.” For example, if u = [1,5;0,2], then u(:) is [1;0;5;2].)

For example, image signal power is used in computing signal-to-noise ratio (SNR) and peak signal-to-noise ratio (PSNR). Given clean image uclean and noise-contaminated image u,

% Compute SNR
snr = -10*log10( sum((uclean(:) - u(:)).^2) / sum(uclean(:).^2) );

% Compute PSNR (where the maximum possible value of uclean is 1)
psnr = -10*log10( mean((uclean(:) - u(:)).^2) );

Be careful with norm: on a vector v, norm(v) computes sqrt(sum(v.^2)), but on a matrix A, norm(A) computes the induced L2 matrix norm,

norm(A) = sqrt(max(eig(A'*A))) gaah!

So norm(A) is certainly not sqrt(sum(A(:).^2)). It is nevertheless an easy mistake to use norm(A) where it should have been norm(A(:)).

Linear filters

Linear filtering is the cornerstone technique of signal processing. To briefly introduce, a linear filter is an operation where at every pixel xm,n of an image, a linear function is evaluated on the pixel and its neighbors to compute a new pixel value ym,n.

A linear filter in two dimensions has the general form


ym,n = ∑j∑k hj,k xm−j,n−k

where x is the input, y is the output, and h is the filter impulse response. Different choices of h lead to filters that smooth, sharpen, and detect edges, to name a few applications. The right-hand side of the above equation is denoted concisely as h∗x and is called the “convolution of h and x.”

Spatial-domain filtering

Two-dimensional linear filtering is implemented in MATLAB with conv2. Unfortunately, conv2 can only handle filtering near the image boundaries by zero-padding, which means that filtering results are usually inappropriate for pixels close to the boundary. To work around this, we can pad the input image and use the 'valid' option when calling conv2. The following M-function does this.

function x = conv2padded(varargin)
%CONV2PADDED  Two-dimensional convolution with padding.
%   Y = CONV2PADDED(X,H) applies 2D filter H to X with constant extension
%   padding.
%
%   Y = CONV2PADDED(H1,H2,X) first applies 1D filter H1 along the rows and
%   then applies 1D filter H2 along the columns.
%
%   If X is a 3D array, filtering is done separately on each channel.

% Pascal Getreuer 2009

if nargin == 2      % Function was called as "conv2padded(x,h)"
    x = varargin{1};
    h = varargin{2};
    top = ceil(size(h,1)/2)-1;
    bottom = floor(size(h,1)/2);
    left = ceil(size(h,2)/2)-1;
    right = floor(size(h,2)/2);
elseif nargin == 3  % Function was called as "conv2padded(h1,h2,x)"
    h1 = varargin{1};
    h2 = varargin{2};
    x = varargin{3};
    top = ceil(length(h1)/2)-1;
    bottom = floor(length(h1)/2);
    left = ceil(length(h2)/2)-1;
    right = floor(length(h2)/2);
else
    error('Wrong number of arguments.');
end

% Pad the input image
xPadded = x([ones(1,top),1:size(x,1),size(x,1)+zeros(1,bottom)],...
    [ones(1,left),1:size(x,2),size(x,2)+zeros(1,right)],:);

% Since conv2 cannot handle 3D inputs, we do filtering channel by channel
for p = 1:size(x,3)
    if nargin == 2
        x(:,:,p) = conv2(xPadded(:,:,p),h,'valid');      % Call conv2
    else
        x(:,:,p) = conv2(h1,h2,xPadded(:,:,p),'valid');  % Call conv2
    end
end

Right-click and save conv2padded.m to use this M-function. Here are some examples:

% A light smoothing filter
h = [0,1,0; 1,4,1; 0,1,0];
h = h/sum(h(:));              % Normalize the filter
uSmooth = conv2padded(u,h);

% A sharpening filter
h = [0,-1,0; -1,8,-1; 0,-1,0];
h = h/sum(h(:));              % Normalize the filter
uSharp = conv2padded(u,h);

% Sobel edge detection
hx = [1,0,-1; 2,0,-2; 1,0,-1];
hy = rot90(hx,-1);
u_x = conv2padded(u,hx);
u_y = conv2padded(u,hy);
EdgeStrength = sqrt(u_x.^2 + u_y.^2);

% Moving average
WindowSize = 5;
h1 = ones(WindowSize,1)/WindowSize;
uSmooth = conv2padded(h1,h1,u);

% Gaussian filtering
sigma = 3.5;
FilterRadius = ceil(4*sigma);   % Truncate the Gaussian at 4*sigma
h1 = exp(-(-FilterRadius:FilterRadius).^2/(2*sigma^2));
h1 = h1/sum(h1);                % Normalize the filter
uSmooth = conv2padded(h1,h1,u);

A 2D filter h is said to be separable if it can be expressed as the outer product of two 1D filters h1 and h2, that is, h = h1(:)*h2(:)'. It is faster to pass h1 and h2 than h, as is done above for the moving average window and the Gaussian filter. In fact, the Sobel filters hx and hy are also separable—what are h1 and h2?
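A quick check of this equivalence, reusing the Gaussian filter h1 defined above, is:

% Separable filtering matches filtering with the full 2D kernel (up to rounding error)
h = h1(:)*h1(:)';                              % outer product: full 2D Gaussian kernel
Difference = max(max(abs(conv2padded(u,h) - conv2padded(h1,h1,u))))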


Fourier-domain filtering

Spatial-domain filtering with conv2 can easily become a computationally expensive operation. For a K×K filter on an M×N image, conv2 costs O(MNK²) additions and multiplications, or O(N⁴) supposing M∼N∼K.

For large filters, filtering in the Fourier domain is faster since the computational cost is reduced to O(N² log N). Using the convolution-multiplication property of the Fourier transform, the convolution is equivalently computed by

% Compute y = h*x with periodic boundary extension
[k1,k2] = size(h);
hpad = zeros(size(x));
hpad([end+1-floor(k1/2):end,1:ceil(k1/2)], ...
    [end+1-floor(k2/2):end,1:ceil(k2/2)]) = h;
y = real(ifft2(fft2(hpad).*fft2(x)));

The result is equivalent to conv2padded(x,h) except near the boundary, where the above computation uses periodic boundary extension.

Fourier-based filtering can also be done with symmetric boundary extension by reflecting the input in each direction:

% Compute y = h*x with symmetric boundary extension
xSym = [x,fliplr(x)];          % Symmetrize horizontally
xSym = [xSym;flipud(xSym)];    % Symmetrize vertically
[k1,k2] = size(h);
hpad = zeros(size(xSym));
hpad([end+1-floor(k1/2):end,1:ceil(k1/2)], ...
    [end+1-floor(k2/2):end,1:ceil(k2/2)]) = h;
y = real(ifft2(fft2(hpad).*fft2(xSym)));
y = y(1:size(y,1)/2,1:size(y,2)/2);

(Note: An even more efficient method is FFT overlap-add filtering. The Signal Processing Toolbox implements FFT overlap-add in one-dimension in fftfilt.)

Nonlinear filters

A nonlinear filter is an operation where each filtered pixel ym,n is a nonlinear function of xm,n and its neighbors. Here we briefly discuss a few types of nonlinear filters.

Order statistic filters

If you have the Image Toolbox, order statistic filters can be performed with ordfilt2 and medfilt2. An order statistic filter sorts the pixel values over a neighborhood and selects the kth value from the sorted list. The min, max, and median filters are special cases.


Morphological filters

If you have the Image Toolbox, bwmorph implements various morphological operations on binary images, like erosion, dilation, open, close, and skeleton. There are also commands available for morphology on grayscale images: imerode, imdilate and imtophat, among others.

Build your own filter

Occasionally we want to use a new filter that MATLAB does not have. The code below is a template for implementing filters.

[M,N] = size(x);
y = zeros(size(x));

r = 1; % Adjust for desired window size

for n = 1+r:N-r
    for m = 1+r:M-r
        % Extract a window of size (2r+1)x(2r+1) around (m,n)
        w = x(m+(-r:r),n+(-r:r));

        % ... write the filter here ...

        y(m,n) = result;
    end
end

(Note: A frequent misguided claim is that loops in MATLAB are slow and should be avoided. This was once true, back in MATLAB 5 and earlier, but loops in modern versions are reasonably fast.)

For example, the alpha-trimmed mean filter ignores the d/2 lowest and d/2 highest values in the window, and averages the remaining (2r+1)² − d values. The filter is a balance between a median filter and a mean filter. The alpha-trimmed mean filter can be implemented in the template as

% The alpha-trimmed mean filter
w = sort(w(:));
y(m,n) = mean(w(1+d/2:end-d/2));   % Compute the result y(m,n)

As another example, the bilateral filter is

ym,n = ( ∑j,k hj,k,m,n xm−j,n−k ) / ( ∑j,k hj,k,m,n )

where

hj,k,m,n = exp( −(j² + k²)/(2σs²) ) · exp( −(xm−j,n−k − xm,n)²/(2σd²) ).

The bilateral filter can be implemented as

% The bilateral filter
[k,j] = meshgrid(-r:r,-r:r);
h = exp( -(j.^2 + k.^2)/(2*sigma_s^2) ) .* ...
    exp( -(w - w(r+1,r+1)).^2/(2*sigma_d^2) );
y(m,n) = h(:)'*w(:) / sum(h(:));

If you don't have the Image Toolbox, the template can be used to write substitutes for missing filters, though they will not be as fast as the Image Toolbox implementations.

Filter      Code

medfilt2    y(m,n) = median(w(:));

ordfilt2    w = sort(w(:));
            y(m,n) = w(k);   % Select the kth smallest element

imdilate    % Define a structure element as a (2r+1)x(2r+1) logical array
            SE = logical([0,1,0; 1,1,1; 0,1,0]);
            y(m,n) = max(w(SE));

Special topics

Up to this point, we have covered those operations that are generally useful in image processing. You must look beyond this tutorial for the details on the topic of your interest. A large amount of MATLAB code and material is freely available online for filter design, wavelet and multiresolution techniques, PDE-based imaging, morphology, and wherever else researchers have made their code publicly available.

Here I will briefly advertise my code (no toolboxes are necessary to use any of the following, code available under BSD license):

Converting between RGB, YUV, HSV, CIE Lab, ... Convert images between different color representations

Wavelet CDF 9/7 Implementation Implementation of the Cohen-Daubechies-Feauveau 9/7 wavelet. Supports color and arbitrary-size transforms.


Wavelet Transforms without the Wavelet Toolbox Self-contained function for Daubechies, symlets, splines, S+P wavelets, JPEG2000 wavelets, and more.

tvreg: Variational Imaging Methods Functions for total variation based denoising, deconvolution, and inpainting, and Chan-Vese segmentation.

You may also find these useful:

Writing Fast MATLAB Code: Profiling, JIT, vectorization, and more.

Writing MATLAB C/MEX Code: Combine the power of MATLAB and C.

Lab color space



The CIE 1976 (L*, a*, b*) color space (CIELAB), showing only colors that fit within the sRGB gamut (and can therefore be displayed on a typical computer display). Each axis of each square ranges from -128 to 128.

A Lab color space is a color-opponent space with dimension L for lightness and a and b for the color-opponent dimensions, based on nonlinearly compressed CIE XYZ color space coordinates.


The coordinates of the Hunter 1948 L, a, b color space are L, a, and b.[1][2] However, Lab is now more often used as an informal abbreviation for the CIE 1976 (L*, a*, b*) color space (also called CIELAB, whose coordinates are actually L*, a*, and b*). Thus the initials Lab by themselves are somewhat ambiguous. The color spaces are related in purpose, but differ in implementation.

Both spaces are derived from the "master" space CIE 1931 XYZ color space, which can predict which spectral power distributions will be perceived as the same color (see metamerism), but which is not particularly perceptually uniform.[3] Strongly influenced by the Munsell color system, the intention of both "Lab" color spaces is to create a space which can be computed via simple formulas from the XYZ space, but is more perceptually uniform than XYZ.[4] Perceptually uniform means that a change of the same amount in a color value should produce a change of about the same visual importance. When storing colors in limited precision values, this can improve the reproduction of tones. Both Lab spaces are relative to the white point of the XYZ data they were converted from. Lab values do not define absolute colors unless the white point is also specified. Often, in practice, the white point is assumed to follow a standard and is not explicitly stated (e.g., for "absolute colorimetric" rendering intent ICC L*a*b* values are relative to CIE standard illuminant D50, while they are relative to the unprinted substrate for other rendering intents).[5]

The lightness correlate in CIELAB is calculated using the cube root of the relative luminance.

Contents

1 Advantages of Lab
2 Which "Lab"?
3 CIE 1976 (L*, a*, b*) color space (CIELAB)
  3.1 Measuring differences
  3.2 RGB and CMYK conversions
  3.3 Range of L*a*b* coordinates
  3.4 CIE XYZ to CIE L*a*b* (CIELAB) and CIELAB to CIE XYZ conversions
    3.4.1 The forward transformation
    3.4.2 The reverse transformation
4 Hunter Lab Color Space
  4.1 Approximate formulas for Ka and Kb
  4.2 The Hunter Lab Color Space as an Adams chromatic valence space
5 References
6 External links

Advantages of Lab


Unlike the RGB and CMYK color models, Lab color is designed to approximate human vision. It aspires to perceptual uniformity, and its L component closely matches human perception of lightness. It can thus be used to make accurate color balance corrections by modifying output curves in the a and b components, or to adjust the lightness contrast using the L component. In RGB or CMYK spaces, which model the output of physical devices rather than human visual perception, these transformations can only be done with the help of appropriate blend modes in the editing application.

Because Lab space is much larger than the gamut of computer displays, printers, or even human vision, a bitmap image represented as Lab requires more data per pixel to obtain the same precision as an RGB or CMYK bitmap. In the 1990s, when computer hardware and software was mostly limited to storing and manipulating 8 bit/channel bitmaps, converting an RGB image to Lab and back was a lossy operation. With 16 bit/channel support now common, this is no longer such a problem.

Additionally, many of the "colors" within Lab space fall outside the gamut of human vision, and are therefore purely imaginary; these "colors" cannot be reproduced in the physical world. Though color management software, such as that built in to image editing applications, will pick the closest in-gamut approximation, changing lightness, colorfulness, and sometimes hue in the process, author Dan Margulis claims that this access to imaginary colors is useful when moving between several steps in the manipulation of a picture.[6]

Which "Lab"?

Some specific uses of the abbreviation in software, literature, etc.:

In Adobe Photoshop, image editing using "Lab mode" is CIELAB D50.[6][7]

In ICC profiles, the "Lab color space" used as a profile connection space is CIELAB D50.[5]

In TIFF files, the CIELAB color space may be used.[8]

In PDF documents, the "Lab color space" is CIELAB.[9][10]

CIE 1976 (L*, a*, b*) color space (CIELAB)

CIE L*a*b* (CIELAB) is the most complete color space specified by the International Commission on Illumination (French Commission internationale de l'éclairage, hence its CIE initialism). It describes all the colors visible to the human eye and was created to serve as a device independent model to be used as a reference.

The three coordinates of CIELAB represent the lightness of the color (L* = 0 yields black and L* = 100 indicates diffuse white; specular white may be higher), its position between red/magenta and green (a*, negative values indicate green while positive values indicate magenta) and its position between yellow and blue (b*, negative values indicate blue and positive values indicate yellow). The asterisk (*) after L, a and b are part of the full name, since they represent L*, a* and b*, to distinguish them from Hunter's L, a, and b, described below.


Since the L*a*b* model is a three-dimensional model, it can only be represented properly in a three-dimensional space.[11] Two-dimensional depictions are chromaticity diagrams: sections of the color solid with a fixed lightness. It is crucial to realize that the visual representations of the full gamut of colors in this model are never accurate; they are there just to help in understanding the concept.

Because the red/green and yellow/blue opponent channels are computed as differences of lightness transformations of (putative) cone responses, CIELAB is a chromatic value color space.

A related color space, the CIE 1976 (L*, u*, v*) color space (a.k.a. CIELUV), preserves the same L* as L*a*b* but has a different representation of the chromaticity components. CIELUV can also be expressed in cylindrical form (CIELCH), with the chromaticity components replaced by correlates of chroma and hue.

Since CIELAB and CIELUV, the CIE has been incorporating an increasing number of color appearance phenomena into their models, to better model color vision. These color appearance models, of which CIELAB, although not designed as such,[12] can be seen as a simple example,[13] culminated with CIECAM02.

Measuring differences

Main article: Color difference

The nonlinear relations for L*, a*, and b* are intended to mimic the nonlinear response of the eye. Furthermore, uniform changes of components in the L*a*b* color space aim to correspond to uniform changes in perceived color, so the relative perceptual differences between any two colors in L*a*b* can be approximated by treating each color as a point in a three dimensional space (with three components: L*, a*, b*) and taking the Euclidean distance between them.[14]
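For example, in MATLAB the difference between two L*a*b* triples (illustrative values) is simply

% Euclidean color difference (Delta E*ab) between two L*a*b* colors
lab1 = [52.0, 10.5, -12.3];
lab2 = [50.5, 12.0, -10.0];
dE = sqrt(sum((lab1 - lab2).^2));   % differences around 2.3 are often quoted as just noticeable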

RGB and CMYK conversions

There are no simple formulas for conversion between RGB or CMYK values and L*a*b*, because the RGB and CMYK color models are device dependent. The RGB or CMYK values first need to be transformed to a specific absolute color space, such as sRGB or Adobe RGB. This adjustment will be device dependent, but the resulting data from the transform will be device independent, allowing data to be transformed to the CIE 1931 color space and then transformed into L*a*b*.
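With the MATLAB Image Toolbox, for example, the sRGB to L*a*b* chain is available through makecform and applycform (the same functions used in the conversion example later in this document); the test image is an assumption, and the default white point is taken to follow the ICC D50 convention mentioned above.

% sRGB to CIELAB with the Image Toolbox color transformation functions
I_rgb = imread('peppers.png');      % any sRGB image
C = makecform('srgb2lab');          % sRGB -> L*a*b*
I_lab = applycform(I_rgb, C);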

Range of L*a*b* coordinates

As mentioned previously, the L* coordinate ranges from 0 to 100. The possible range of a* and b* coordinates depends on the color space that one is converting from.

CIE XYZ to CIE L*a*b* (CIELAB) and CIELAB to CIE XYZ conversions


The forward transformation

L* = 116 f(Y/Yn) − 16
a* = 500 [ f(X/Xn) − f(Y/Yn) ]
b* = 200 [ f(Y/Yn) − f(Z/Zn) ]

where

f(t) = t^(1/3) for t > (6/29)^3, and f(t) = (1/3)(29/6)^2 t + 4/29 otherwise.

Here Xn, Yn and Zn are the CIE XYZ tristimulus values of the reference white point (the subscript n suggests "normalized").

The division of the f(t) function into two domains was done to prevent an infinite slope at t = 0. f(t) was assumed to be linear below some t = t0, and was assumed to match the t^(1/3) part of the function at t0 in both value and slope. In other words:

t0^(1/3) = a t0 + b   (match in value)
(1/3) t0^(−2/3) = a   (match in slope)

The value of b was chosen to be 16/116 = 4/29. The above two equations can be solved for a and t0:

a = 1/(3δ^2) = 7.787037...
t0 = δ^3 = 0.008856...

where δ = 6/29.[15] Note that b = 16/116 = 4/29 = 2δ/3.

The reverse transformation

The reverse transformation is most easily expressed using the inverse of the function f above:

Y = Yn f^(−1)( (L* + 16)/116 )
X = Xn f^(−1)( (L* + 16)/116 + a*/500 )
Z = Zn f^(−1)( (L* + 16)/116 − b*/200 )

where

f^(−1)(t) = t^3 for t > δ, and f^(−1)(t) = 3δ^2 (t − 4/29) otherwise, with δ = 6/29.


Hunter Lab Color Space

L is a correlate of lightness, and is computed from the Y tristimulus value using Priest's approximation to Munsell value:

L = 100 √(Y/Yn)

where Yn is the Y tristimulus value of a specified white object. For surface-color applications, the specified white object is usually (though not always) a hypothetical material with unit reflectance that follows Lambert's law. The resulting L will be scaled between 0 (black) and 100 (white), roughly ten times the Munsell value. Note that a medium lightness of 50 is produced by a luminance of 25, since 100 √(25/100) = 50.

a and b are termed opponent color axes. a represents, roughly, redness (positive) versus greenness (negative). It is computed as:

a = Ka (X/Xn − Y/Yn) / √(Y/Yn)

where Ka is a coefficient which depends upon the illuminant (for D65, Ka is 172.30; see approximate formula below) and Xn is the X tristimulus value of the specified white object.

The other opponent color axis, b, is positive for yellow colors and negative for blue colors. It is computed as:

b = Kb (Y/Yn − Z/Zn) / √(Y/Yn)

where Kb is a coefficient which depends upon the illuminant (for D65, Kb is 67.20; see approximate formula below) and Zn is the Z tristimulus value of the specified white object.[16]

Both a and b will be zero for objects which have the same chromaticity coordinates as the specified white objects (i.e., achromatic, grey, objects).

Approximate formulas for Ka and Kb


In the previous version of the Hunter Lab color space, Ka was 175 and Kb was 70. Apparently, Hunter Associates Lab discovered that better agreement could be obtained with other color difference metrics, such as CIELAB (see above) by allowing these coefficients to depend upon the illuminants. Approximate formulæ are:

which result in the original values for Illuminant C, the original illuminant with which the Lab color space was used.

The Hunter Lab Color Space as an Adams chromatic valence space

Adams chromatic valence color spaces are based on two elements: a (relatively) uniform lightness scale and a (relatively) uniform chromaticity scale.[17] If we take as the uniform lightness scale Priest's approximation to the Munsell Value scale, which would be written in modern notation as:

L = 100 √(Y/Yn)

and, as the uniform chromaticity coordinates:

where ke is a tuning coefficient, we obtain the two chromatic axes:

and


Understanding Color Spaces and Color Space Conversion

The Image Processing Toolbox software represents colors as RGB values, either directly (in an RGB image) or indirectly (in an indexed image, where the colormap is stored in RGB format). However, there are other models besides RGB for representing colors numerically. The various models are referred to as color spaces because most of them can be mapped into a 2-D, 3-D, or 4-D coordinate system; thus, a color specification is made up of coordinates in a 2-D, 3-D, or 4-D space.

The various color spaces exist because they present color information in ways that make certain calculations more convenient or because they provide a way to identify colors that is more intuitive. For example, the RGB color space defines a color as the percentages of red, green, and blue hues mixed together. Other color models describe colors by their hue (green), saturation (dark green), and luminance, or intensity.

The toolbox supports these color spaces by providing a means for converting color data from one color space to another through a mathematical transformation.


Converting Between Device-Independent Color Spaces

The standard terms used to describe colors, such as hue, brightness, and intensity, are subjective and make comparisons difficult.

In 1931, the International Commission on Illumination, known by the acronym CIE, for Commission Internationale de l'Éclairage, studied human color perception and developed a standard, called the CIE XYZ. This standard defined a three-dimensional space where three values, called tristimulus values, define a color. This standard is still widely used today.

In the decades since that initial specification, the CIE has developed several additional color space specifications that attempt to provide alternative color representations that are better suited to some purposes than XYZ. For example, in 1976, in an effort to get a perceptually uniform color space that could be correlated with the visual appearance of colors, the CIE created the L*a*b* color space.

The toolbox supports conversions between members of the CIE family of device-independent color spaces. It also supports conversions between these CIE color spaces and the sRGB color space, which was defined by an industry group to describe the characteristics of a typical PC monitor.

This section:

Lists the supported device-independent color spaces
Provides an example of how to perform a conversion
Provides guidelines about data type support of the various conversions


Supported Conversions

This table lists all the device-independent color spaces that the toolbox supports.

Color Space | Description | Supported Conversions
XYZ | The original, 1931 CIE color space specification. | xyY, uvL, u′v′L, and L*a*b*
xyY | CIE specification that provides normalized chromaticity values. The capital Y value represents luminance and is the same as in XYZ. | XYZ
uvL | CIE specification that attempts to make the chromaticity plane more visually uniform. L is luminance and is the same as Y in XYZ. | XYZ
u′v′L | CIE specification in which u and v are rescaled to improve uniformity. | XYZ
L*a*b* | CIE specification that attempts to make the luminance scale more perceptually uniform. L* is a nonlinear scaling of L, normalized to a reference white point. | XYZ
L*ch | CIE specification where c is chroma and h is hue. These values are a polar coordinate conversion of a* and b* in L*a*b*. | L*a*b*
sRGB | Standard adopted by major manufacturers that characterizes the average PC monitor. | XYZ and L*a*b*

Example: Performing a Color Space Conversion

To illustrate a conversion between two device-independent color spaces, this example reads an RGB color image into the MATLAB workspace and converts the color data to the XYZ color space:

1. Import color space data. This example reads an RGB color image into the MATLAB workspace.

I_rgb = imread('peppers.png');

2. Create a color transformation structure. A color transformation structure defines the conversion between two color spaces. You use the makecform function to create the structure, specifying a transformation type string as an argument.

This example creates a color transformation structure that defines a conversion from RGB color data to XYZ color data.


C = makecform('srgb2xyz');

3. Perform the conversion. You use the applycform function to perform the conversion, specifying as arguments the color data you want to convert and the color transformation structure that defines the conversion. The applycform function returns the converted data.

I_xyz = applycform(I_rgb,C);

4. View a list of the workspace variables for the original and converted images.

whos

  Name       Size                 Bytes  Class     Attributes

  C          1x1                   7744  struct
  I_rgb      384x512x3           589824  uint8
  I_xyz      384x512x3          1179648  uint16

Color Space Data Encodings

When you convert between two device-independent color spaces, the data type used to encode the color data can sometimes change, depending on what encodings the color spaces support. In the preceding example, the original image is uint8 data, but the converted XYZ data is uint16, because the XYZ color space does not define a uint8 encoding. The following table lists the data types that can be used to represent values in each of the device-independent color spaces.

Color Space | Encodings
XYZ | uint16 or double
xyY | double
uvL | double
u′v′L | double
L*a*b* | uint8, uint16, or double
L*ch | double
sRGB | double

As the table indicates, certain color spaces have data type limitations. For example, the XYZ color space does not define a uint8 encoding. If you convert 8-bit CIE LAB data into the XYZ color space, the data is returned in uint16 format. If you want the returned XYZ data to be in the same format as the input LAB data, you can use one of the following toolbox color space format conversion functions.

lab2double
lab2uint8
lab2uint16
xyz2double
xyz2uint16
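
For instance, a minimal sketch (assuming the same sample image used elsewhere in this section) that produces 8-bit L*a*b* data, converts it to XYZ, and rescales the uint16 result back to double:

rgb   = imread('peppers.png');
lab   = applycform(rgb, makecform('srgb2lab'));    % sRGB to L*a*b*
lab8  = lab2uint8(lab);                            % force the uint8 L*a*b* encoding
xyz16 = applycform(lab8, makecform('lab2xyz'));    % returned as uint16; XYZ has no uint8 encoding
xyz_d = xyz2double(xyz16);                         % rescale the encoded values to double XYZ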


Performing Profile-Based Color Space Conversions

The Image Processing Toolbox software can perform color space conversions based on device profiles. This section includes the following topics:

Understanding Device Profiles
Reading ICC Profiles
Writing Profile Information to a File
Example: Performing a Profile-Based Conversion
Specifying the Rendering Intent

Understanding Device Profiles

If two colors have the same CIE colorimetry, they will match if viewed under the same conditions. However, because color images are typically produced for a wide variety of viewing environments, it is necessary to go beyond simple application of the CIE system.

For this reason, the International Color Consortium (ICC) has defined a Color Management System (CMS) that provides a means for communicating color information among input, output, and display devices. The CMS uses device profiles that contain color information specific to a particular device. Vendors that support CMS provide profiles that characterize the color reproduction of their devices, and methods, called Color Management Modules (CMM), that interpret the contents of each profile and perform the necessary image processing.

Device profiles contain the information that color management systems need to translate color data between devices. Any conversion between color spaces is a mathematical transformation from some domain space to a range space. With profile-based conversions, the domain space is often called the source space and the range space is called the destination space. In the ICC color management model, profiles are used to represent the source and destination spaces.

For more information about color management systems, go to the International Color Consortium Web site, www.color.org.

Reading ICC Profiles

To read an ICC profile into the MATLAB workspace, use the iccread function. In this example, the function reads in the profile for the color space that describes color monitors.

P = iccread('sRGB.icm');


You can use the iccfind function to find ICC color profiles on your system, or to find a particular ICC color profile whose description contains a certain text string. To get the name of the directory that is the default system repository for ICC profiles, use iccroot.

iccread returns the contents of the profile in the structure P. All profiles contain a header, a tag table, and a series of tagged elements. The header contains general information about the profile, such as the device class, the device color space, and the file size. The tagged elements, or tags, are the data constructs that contain the information used by the CMM. For more information about the contents of this structure, see the iccread function reference page.

Using iccread, you can read both Version 2 (ICC.1:2001-04) and Version 4 (ICC.1:2001-12) ICC profile formats. For detailed information about these specifications and their differences, visit the ICC web site, www.color.org.
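
As a quick illustration, iccroot and iccfind can be combined to locate profiles by description (a sketch; it assumes the default repository contains at least one profile whose description mentions sRGB):

% Search the default ICC profile repository for profiles whose description contains 'srgb'
[profiles, descriptions] = iccfind(iccroot, 'srgb');
descriptions               % cell array of matching profile descriptions
P = profiles{1};           % first matching profile structure, as returned by iccread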

Writing Profile Information to a File

To export ICC profile information from the MATLAB workspace to a file, use the iccwrite function. This example reads a profile into the MATLAB workspace and then writes the profile information out to a new file.

P = iccread('sRGB.icm');
P_new = iccwrite(P,'my_profile.icm');

iccwrite returns the profile it writes to the file in P_new, which can differ from the input profile P. For example, iccwrite updates the Filename field in P to match the name of the file specified as the second argument.

iccwrite checks the validity of the input profile structure. If any required fields are missing, iccwrite returns an error message. For more information about writing ICC profile data to a file, see the iccwrite function reference page. To determine whether a structure is a valid ICC profile, use the isicc function.

Using iccwrite, you can export profile information in both Version 2 (ICC.1:2001-04) and Version 4 (ICC.1:2001-12) ICC profile formats. The value of the Version field in the file profile header determines the format version. For detailed information about these specifications and their differences, visit the ICC web site, www.color.org.
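
For example, a short sketch that validates the profile structure with isicc before exporting it (the file names follow the earlier example):

P = iccread('sRGB.icm');
if isicc(P)                                  % true when P is a valid ICC profile structure
    P_new = iccwrite(P, 'my_profile.icm');
else
    error('Not a valid ICC profile structure.');
end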

Example: Performing a Profile-Based Conversion

To illustrate a profile-based color space conversion, this section presents an example that converts color data from the RGB space of a monitor to the CMYK space of a printer. This conversion requires two profiles: a monitor profile and a printer profile. The source color space in this example is monitor RGB and the destination color space is printer CMYK:

1. Import RGB color space data. This example imports an RGB color image into the MATLAB workspace.


I_rgb = imread('peppers.png');

2. Read ICC profiles. Read the source and destination profiles into the MATLAB workspace. This example uses the sRGB profile as the source profile. The sRGB profile is an industry-standard color space that describes a color monitor.

inprof = iccread('sRGB.icm');

For the destination profile, the example uses a profile that describes a particular color printer. The printer vendor supplies this profile. (The following profile and several other useful profiles can be obtained as downloads from www.adobe.com.)

outprof = iccread('USSheetfedCoated.icc');

3. Create a color transformation structure. You must create a color transformation structure to define the conversion between the color spaces in the profiles. You use the makecform function to create the structure, specifying a transformation type string as an argument.

Note   The color space conversion might involve an intermediate conversion into a device-independent color space, called the Profile Connection Space (PCS), but this is transparent to the user.

This example creates a color transformation structure that defines a conversion from RGB color data to CMYK color data.

C = makecform('icc',inprof,outprof);

4. Perform the conversion. You use the applycform function to perform the conversion, specifying as arguments the color data you want to convert and the color transformation structure that defines the conversion. The function returns the converted data.

I_cmyk = applycform(I_rgb,C);

5. Write the converted data to a file. To export the CMYK data, use the imwrite function, specifying the format as TIFF. If the format is TIFF and the data is an m-by-n-by-4 array, imwrite writes CMYK data to the file.

imwrite(I_cmyk,'pep_cmyk.tif','tif')

To verify that the CMYK data was written to the file, use imfinfo to get information about the file and look at the PhotometricInterpretation field.

info = imfinfo('pep_cmyk.tif');
info.PhotometricInterpretation

ans =

    'CMYK'


Specifying the Rendering Intent

For most devices, the range of reproducible colors is much smaller than the range of colors represented by the PCS. It is for this reason that four rendering intents (or gamut mapping techniques) are defined in the profile format. Each one has distinct aesthetic and color-accuracy tradeoffs.

When you create a profile-based color transformation structure, you can specify the rendering intent for the source as well as the destination profiles. For more information, see the makecform reference information.
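
For instance, a minimal sketch (reusing inprof, outprof, and I_rgb from the example above) that explicitly requests perceptual rendering on both the source and destination sides:

% Explicitly request the 'Perceptual' rendering intent for both profiles
C = makecform('icc', inprof, outprof, ...
              'SourceRenderingIntent', 'Perceptual', ...
              'DestRenderingIntent',   'Perceptual');
I_cmyk = applycform(I_rgb, C);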


Converting Between Device-Dependent Color Spaces

The toolbox includes functions that you can use to convert RGB data to several common device-dependent color spaces, and vice versa:

YIQ
YCbCr
Hue, saturation, value (HSV)

YIQ Color Space

The National Television Systems Committee (NTSC) defines a color space known as YIQ. This color space is used in televisions in the United States. One of the main advantages of this format is that grayscale information is separated from color data, so the same signal can be used for both color and black and white sets.

In the NTSC color space, image data consists of three components: luminance (Y), hue (I), and saturation (Q). The first component, luminance, represents grayscale information, while the last two components make up chrominance (color information).

The function rgb2ntsc converts colormaps or RGB images to the NTSC color space. ntsc2rgb performs the reverse operation.

For example, these commands convert an RGB image to NTSC format.

RGB = imread('peppers.png');
YIQ = rgb2ntsc(RGB);

Because luminance is one of the components of the NTSC format, the RGB to NTSC conversion is also useful for isolating the gray level information in an image. In fact, the toolbox functions rgb2gray and ind2gray use the rgb2ntsc function to extract the grayscale information from a color image.


For example, these commands are equivalent to calling rgb2gray.

YIQ = rgb2ntsc(RGB);
I = YIQ(:,:,1);

Note   In the YIQ color space, I is one of the two color components, not the grayscale component.

YCbCr Color Space

The YCbCr color space is widely used for digital video. In this format, luminance information is stored as a single component (Y), and chrominance information is stored as two color-difference components (Cb and Cr). Cb represents the difference between the blue component and a reference value. Cr represents the difference between the red component and a reference value. (YUV, another color space widely used for digital video, is very similar to YCbCr but not identical.)

YCbCr data can be double precision, but the color space is particularly well suited to uint8 data. For uint8 images, the data range for Y is [16, 235], and the range for Cb and Cr is [16, 240]. YCbCr leaves room at the top and bottom of the full uint8 range so that additional (nonimage) information can be included in a video stream.

The function rgb2ycbcr converts colormaps or RGB images to the YCbCr color space. ycbcr2rgb performs the reverse operation.

For example, these commands convert an RGB image to YCbCr format.

RGB = imread('peppers.png');
YCBCR = rgb2ycbcr(RGB);
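
To see the headroom and footroom described above, a quick check of the luma plane's range (a sketch continuing the commands above; the exact extremes depend on the image content):

Y = YCBCR(:,:,1);                % luma plane of the uint8 YCbCr image
[min(Y(:)) max(Y(:))]            % values fall within the nominal [16, 235] range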

HSV Color Space

The HSV color space (Hue, Saturation, Value) is often used by people who are selecting colors (e.g., of paints or inks) from a color wheel or palette, because it corresponds better to how people experience color than the RGB color space does. The functions rgb2hsv and hsv2rgb convert images between the RGB and HSV color spaces.
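
A short sketch of the round trip between RGB and HSV (using the same sample image as the earlier examples):

RGB  = imread('peppers.png');
HSV  = rgb2hsv(RGB);             % returns double values in the range [0, 1]
H    = HSV(:,:,1);               % hue plane
S    = HSV(:,:,2);               % saturation plane
RGB2 = hsv2rgb(HSV);             % convert back to RGB (class double)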

Reducing Colors Using Color Approximation

On systems with 24-bit color displays, truecolor images can display up to 16,777,216 (i.e., 2^24) colors. On systems with lower screen bit depths, truecolor images are still displayed reasonably well, because MATLAB automatically uses color approximation and dithering if needed. Color approximation is the process by which the software chooses replacement colors in the event that direct matches cannot be found.

Indexed images, however, might cause problems if they have a large number of colors. In general, you should limit indexed images to 256 colors for the following reasons:

On systems with 8-bit display, indexed images with more than 256 colors will need to be dithered or mapped and, therefore, might not display well.

On some platforms, colormaps cannot exceed 256 entries. If an indexed image has more than 256 colors, MATLAB cannot store the image data in a uint8 array, but generally uses an array of class double instead, making the storage size of the image much larger (each pixel uses 64 bits).

Most image file formats limit indexed images to 256 colors. If you write an indexed image with more than 256 colors (using imwrite) to a format that does not support more than 256 colors, you will receive an error.

To reduce the number of colors in an image, use the rgb2ind function. This function converts a truecolor image to an indexed image, reducing the number of colors in the process. rgb2ind provides the following methods for approximating the colors in the original image:

Quantization (described in Quantization)
o Uniform quantization
o Minimum variance quantization

Colormap mapping (described in Colormap Mapping)

The quality of the resulting image depends on the approximation method you use, the range of colors in the input image, and whether or not you use dithering. Note that different methods work better for different images. See Dithering for a description of dithering and how to enable or disable it.

Quantization

Reducing the number of colors in an image involves quantization. The function rgb2ind uses quantization as part of its color reduction algorithm. rgb2ind supports two quantization methods: uniform quantization and minimum variance quantization.

An important term in discussions of image quantization is RGB color cube, which is used frequently throughout this section. The RGB color cube is a three-dimensional array of all of the colors that are defined for a particular data type. Since RGB images in MATLAB can be of type uint8, uint16, or double, three possible color cube definitions exist. For example, if an RGB image is of class uint8, 256 values are defined for each color plane (red, blue, and green), and, in total, there will be 2^24 (or 16,777,216) colors defined by the color cube. This color cube is the same for all uint8 RGB images, regardless of which colors they actually use.

The uint8, uint16, and double color cubes all have the same range of colors. In other words, the brightest red in a uint8 RGB image appears the same as the brightest red in a double RGB image. The difference is that the double RGB color cube has many more shades of red (and many more shades of all colors). The following figure shows an RGB color cube for a uint8 image.

RGB Color Cube for uint8 Images

Quantization involves dividing the RGB color cube into a number of smaller boxes, and then mapping all colors that fall within each box to the color value at the center of that box.

Uniform quantization and minimum variance quantization differ in the approach used to divide up the RGB color cube. With uniform quantization, the color cube is cut up into equal-sized boxes (smaller cubes). With minimum variance quantization, the color cube is cut up into boxes (not necessarily cubes) of different sizes; the sizes of the boxes depend on how the colors are distributed in the image.

Uniform Quantization.  To perform uniform quantization, call rgb2ind and specify a tolerance. The tolerance determines the size of the cube-shaped boxes into which the RGB color cube is divided. The allowable range for a tolerance setting is [0,1]. For example, if you specify a tolerance of 0.1, the edges of the boxes are one-tenth the length of the RGB color cube and the maximum total number of boxes is

n = (floor(1/tol)+1)^3

The commands below perform uniform quantization with a tolerance of 0.1.

RGB = imread('peppers.png');
[x,map] = rgb2ind(RGB, 0.1);
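
To relate the tolerance to the predicted bound on the colormap length (a sketch; the variable names follow the commands above):

tol   = 0.1;
n_max = (floor(1/tol)+1)^3       % at most 1331 boxes, hence at most 1331 colormap entries
size(map,1)                      % the actual colormap is usually much smaller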


The following figure illustrates uniform quantization of a uint8 image. For clarity, the figure shows a two-dimensional slice (or color plane) from the color cube where red=0 and green and blue range from 0 to 255. The actual pixel values are denoted by the centers of the x's.

Uniform Quantization on a Slice of the RGB Color Cube

After the color cube has been divided, all empty boxes are thrown out; only the boxes that contain pixels from the image produce colors for the colormap. As shown earlier, the maximum length of a colormap created by uniform quantization can be predicted, but the colormap can be smaller than the prediction because rgb2ind removes any colors that do not appear in the input image.

Minimum Variance Quantization.  To perform minimum variance quantization, call rgb2ind and specify the maximum number of colors in the output image's colormap. The number you specify determines the number of boxes into which the RGB color cube is divided. These commands use minimum variance quantization to create an indexed image with 185 colors.

RGB = imread('peppers.png');
[X,map] = rgb2ind(RGB,185);

Minimum variance quantization works by associating pixels into groups based on the variance between their pixel values. For example, a set of blue pixels might be grouped together because they have a small variance from the center pixel of the group.

In minimum variance quantization, the boxes that divide the color cube vary in size, and do not necessarily fill the color cube. If some areas of the color cube do not have pixels, there are no boxes in these areas.

While you set the number of boxes, n, to be used by rgb2ind, the placement is determined by the algorithm as it analyzes the color data in your image. Once the image is divided into n optimally located boxes, the pixels within each box are mapped to the pixel value at the center of the box, as in uniform quantization.


The resulting colormap usually has the number of entries you specify. This is because the color cube is divided so that each region contains at least one color that appears in the input image. If the input image uses fewer colors than the number you specify, the output colormap will have fewer than n colors, and the output image will contain all of the colors of the input image.
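
A quick way to confirm this behavior (a self-contained sketch using the same sample image):

RGB = imread('peppers.png');
[X,map] = rgb2ind(RGB,185);      % request up to 185 colors
size(map,1)                      % usually 185; fewer only if the image has fewer distinct colors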

The following figure shows the same two-dimensional slice of the color cube as shown in the preceding figure (demonstrating uniform quantization). Eleven boxes have been created using minimum variance quantization.

Minimum Variance Quantization on a Slice of the RGB Color Cube

For a given number of colors, minimum variance quantization produces better results than uniform quantization, because it takes into account the actual data. Minimum variance quantization allocates more of the colormap entries to colors that appear frequently in the input image. It allocates fewer entries to colors that appear infrequently. As a result, the accuracy of the colors is higher than with uniform quantization. For example, if the input image has many shades of green and few shades of red, there will be more greens than reds in the output colormap. Note that the computation for minimum variance quantization takes longer than that for uniform quantization.

Colormap Mapping

If you specify an actual colormap to use, rgb2ind uses colormap mapping (instead of quantization) to find the colors in the specified colormap that best match the colors in the RGB image. This method is useful if you need to create images that use a fixed colormap. For example, if you want to display multiple indexed images on an 8-bit display, you can avoid color problems by mapping them all to the same colormap. Colormap mapping produces a good approximation if the specified colormap has similar colors to those in the RGB image. If the colormap does not have similar colors to those in the RGB image, this method produces poor results.


This example illustrates mapping two images to the same colormap. The colormap used for the two images is created on the fly using the MATLAB function colorcube, which creates an RGB colormap containing the number of colors that you specify. (colorcube always creates the same colormap for a given number of colors.) Because the colormap includes colors all throughout the RGB color cube, the output images can reasonably approximate the input images.

RGB1 = imread('autumn.tif');
RGB2 = imread('peppers.png');
X1 = rgb2ind(RGB1,colorcube(128));
X2 = rgb2ind(RGB2,colorcube(128));

Note   The function subimage is also helpful for displaying multiple indexed images. For more information, see Displaying Multiple Images in the Same Figure or the reference page for subimage.
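
For instance, a sketch (continuing the commands above) that displays both indexed images side by side with the shared colorcube colormap:

map = colorcube(128);            % the shared colormap used for both conversions
figure
subplot(1,2,1), subimage(X1, map)
subplot(1,2,2), subimage(X2, map)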


Reducing Colors Using imapprox

Use imapprox when you need to reduce the number of colors in an indexed image. imapprox is based on rgb2ind and uses the same approximation methods. Essentially, imapprox first calls ind2rgb to convert the image to RGB format, and then calls rgb2ind to return a new indexed image with fewer colors.

For example, these commands create a version of the trees image with 64 colors, rather than the original 128.

load trees
[Y,newmap] = imapprox(X,map,64);
imshow(Y, newmap);

The quality of the resulting image depends on the approximation method you use, the range of colors in the input image, and whether or not you use dithering. Note that different methods work better for different images. See Dithering for a description of dithering and how to enable or disable it.


Dithering

When you use rgb2ind or imapprox to reduce the number of colors in an image, the resulting image might look inferior to the original, because some of the colors are lost. rgb2ind and imapprox both perform dithering to increase the apparent number of colors in the output image. Dithering changes the colors of pixels in a neighborhood so that the average color in each neighborhood approximates the original RGB color.


For an example of how dithering works, consider an image that contains a number of dark orange pixels for which there is no exact match in the colormap. To create the appearance of this shade of orange, dithering selects a combination of colors from the colormap that, taken together as a six-pixel group, approximate the desired shade of orange. From a distance, the pixels appear to be the correct shade, but if you look closely at the image, you can see a blend of other shades. To illustrate dithering, the following example loads a 24-bit truecolor image and then uses rgb2ind to create an indexed image with just eight colors, first without dithering and then with dithering.

1. Read the image and display it.

rgb = imread('onion.png');
imshow(rgb);

2. Create an indexed image with eight colors and without dithering.

[X_no_dither,map] = rgb2ind(rgb,8,'nodither');
figure, imshow(X_no_dither,map);

3. Create an indexed image using eight colors with dithering. Notice that the dithered image has a larger number of apparent colors but is somewhat fuzzy-looking. The image produced without dithering has fewer apparent colors but improved spatial resolution compared to the dithered image. One risk of doing color reduction without dithering is that the new image can contain false contours.

[X_dither,map] = rgb2ind(rgb,8,'dither');
figure, imshow(X_dither,map);



The number of bits per screen pixel determines the display's screen bit depth. The screen bit depth determines the screen color resolution, which is how many distinct colors the display can produce.

Most computer displays use 8, 16, or 24 bits per screen pixel. Depending on your system, you might be able to choose the screen bit depth you want to use. In general, 24-bit display mode produces the best results. If you need to use a lower screen bit depth, 16-bit is generally preferable to 8-bit. However, keep in mind that a 16-bit display has certain limitations, such as

An image might have finer gradations of color than a 16-bit display can represent. If a color is unavailable, MATLAB uses the closest approximation.

There are only 32 shades of gray available. If you are working primarily with grayscale images, you might get better display results using 8-bit display mode, which provides up to 256 shades of gray.

To determine the bit depth of your system's screen, enter this command at the MATLAB prompt.

get(0,'ScreenDepth')

ans =

    32

Discuss the advantages of a true color image over a pseudo-color image.


A true color image looks the same as the object of which the image was taken, because it stores the actual color of each pixel rather than an index into a limited palette.