
Valliappa Lakshmanan

Automating the Analysis of Spatial Grids

A Practitioner's Guide to Data Mining Geospatial Images

January 5, 2012

Springer


Contents

1 Automated Analysis of Spatial Grids: Motivation and Challenges    1
   1.1 Geographic Information Systems    1
   1.2 GIS Operations    3
   1.3 Need for Automation    3
   1.4 Spatial Grids    5
   1.5 Challenges in Automated Analysis    7
   1.6 Spatial Data Mining Algorithms    10
      1.6.1 Automatic Land-type classification    11
      1.6.2 Disaster Assessment    11
      1.6.3 Discovering new climate indices    12
      1.6.4 Change in forest cover    13
      1.6.5 Malaria eradication    14
      1.6.6 Monitoring ecosystem condition    15

2 Geospatial grids    17
   2.1 Representation    17
      2.1.1 Georeference    21
      2.1.2 Map projections    23
      2.1.3 Going from one projection to another    26
   2.2 Linearity of data values    31
      2.2.1 Perceptual color-maps    33
      2.2.2 Verifying linearity    35
   2.3 Instrument geometry    41
   2.4 Gridding point observations    45
      2.4.1 Objective Analysis    46
      2.4.2 Cressman    47
      2.4.3 Optimization    49
      2.4.4 Successive iteration    51
      2.4.5 Kriging    52
   2.5 Rasterization    53
      2.5.1 Points    53


      2.5.2 Lines    54
      2.5.3 Splines    56
      2.5.4 Polygons    57
      2.5.5 Geocoding polygons    59
   2.6 Example Applications    62

3 Data Structures for Spatial Grids    65
   3.1 Array    65
   3.2 Pixels    66
   3.3 Level set    68
   3.4 Topographical surface    69
   3.5 Markov chain    72
   3.6 Matrix    73
   3.7 Parametric approximation    79
   3.8 Relational structure    84
   3.9 Applications    86

4 Global and Local Image Statistics    89
   4.1 Types of statistics    89
   4.2 Distances    91
      4.2.1 Pixel to cluster    92
      4.2.2 Cluster to cluster    94

   4.3 Distance transform    95
      4.3.1 Ordered propagation    96
      4.3.2 Saito and Toriwaki algorithm    98
      4.3.3 Geodesic distance    101

   4.4 Probability Functions    105
      4.4.1 Shannon entropy    107
      4.4.2 Kolmogorov-Smirnov test    107
      4.4.3 Threshold selection    109
   4.5 Local measures    111
      4.5.1 Quantization    117
   4.6 Example Applications    124

5 Neighborhood and Window Operations    129
   5.1 Pre-processing    129
   5.2 Window operations    131
      5.2.1 Smoothing    131
      5.2.2 Matched filter    135
      5.2.3 Directional smoothing    137
      5.2.4 Filter bank    139
      5.2.5 Separability    142
      5.2.6 Edge detection    143
   5.3 Median filter    149
      5.3.1 Speckle filtering    152


   5.4 Morphological operations    153
   5.5 Skeletonization    157
      5.5.1 Thinning    160
   5.6 Frequency Domain Convolution    164
   5.7 Example Applications    170

6 Identifying Objects    173
   6.1 Object identification    173
   6.2 Region growing    176
   6.3 Region properties    179
      6.3.1 Size    179
      6.3.2 Geocoding objects    183
      6.3.3 Orientation and Aspect ratio    184
      6.3.4 Fitting lines    186
   6.4 Hysteresis    189
   6.5 Active contours    192
   6.6 Watershed Transform    200
   6.7 Enhanced watershed    207
   6.8 Contiguity-enhanced Clustering    215
   6.9 Choosing an object-identification technique    220
   6.10 Example Applications    220

7 Change and Motion Estimation    223
   7.1 Estimating change    223
   7.2 Optical Flow    225
      7.2.1 Partial Derivatives    226
      7.2.2 Cross-correlation    232
      7.2.3 Pyramidal Cross-Correlation    238
      7.2.4 Phase Correlation    241
   7.3 Object-tracking    245
      7.3.1 Hungarian method    250
      7.3.2 Kalman Filter    256
      7.3.3 Hybrid techniques    261
      7.3.4 Interpolating a motion field    264
   7.4 Choosing a change or motion estimation technique    266
      7.4.1 Temporal attributes    267
   7.5 Example Applications    268

8 Data Mining Attributes from Spatial Grids    271
   8.1 Data Mining    271
   8.2 A Fuzzy Logic Application    272
      8.2.1 Data set preparation    272
      8.2.2 Fuzzy Logic    275
      8.2.3 Spatial Features    278
      8.2.4 Why Fuzzy Logic?    280


   8.3 Supervised learning models    282
      8.3.1 Data preparation    283
      8.3.2 Identifying primary cities    287
      8.3.3 Training Data    290
      8.3.4 Linear Models    292
      8.3.5 Diagnosing a data mining model    294
      8.3.6 Decision Trees    297
      8.3.7 Neural networks    299
   8.4 Clustering    303
   8.5 Example Applications    308
   References    311

The Accompanying Code    315

Index    317


1.5 Challenges in Automated Analysis

… ensure that the types of filtering and processing being carried out are applicable to the nature of the data values.

The second challenge with automated analysis is that simple-minded image processing operations are rarely enough. One needs to incorporate domain knowledge. Consider, for example, the imagery from Landsat 5 shown in Figure 1.5. The scene depicted is of Morro Bay, California. The brightest areas of the Landsat image correspond to Highway 1, a scenic highway along the Pacific coast of the United States. Examining the image, it is clear that high values in the image correspond to roads. Can just thresholding the image be used to isolate the roads?

Simple low-level processing such as thresholding alone rarely works on real-world images – in this case, some soil areas are identified, and the roads are disjointed. Higher-level processing is required to prune off spurious detections and to combine road segments into longer stretches based on domain knowledge, as shown in Figure 1.6.

Fig. 1.5 Low-level processing without the incorporation of domain knowledge is rarely sufficient in real-world images. (a) Landsat imagery of Morro Bay, California (b) Map of the general area depicted by Landsat (c) Thresholded satellite imagery. Images a, b from [Sho10].
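For concreteness, the thresholding referred to above is nothing more than a per-pixel comparison. The sketch below is written against the LatLonGrid class used by the book's accompanying code (the method names follow those in later listings; the threshold value itself would be chosen for the particular data set):

    // A minimal sketch of global thresholding on a LatLonGrid: pixels at or above
    // the threshold are kept, everything else is zeroed. By itself, this picks up
    // bright soil just as readily as roads.
    public static LatLonGrid threshold(LatLonGrid input, int thresh) {
      LatLonGrid output = LatLonGrid.copyOf(input);
      for (int i = 0; i < input.getNumLat(); ++i) {
        for (int j = 0; j < input.getNumLon(); ++j) {
          int value = input.getValue(i, j);
          boolean keep = (value != input.getMissing()) && (value >= thresh);
          output.setValue(i, j, keep ? value : 0);
        }
      }
      return output;
    }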

This is an issue that comes up over and over again. What's obvious to us visually is not so obvious to an automated algorithm. The human vision system is extremely powerful, and we bring a lot of contextual knowledge into carrying out our analysis. One challenge in automated analysis is to make that instinctive knowledge explicit, and to encode our knowledge into the techniques we employ.

The combination of these three things – that (1) processing needs to be carried out on data, not on pictures, (2) domain knowledge needs to inform the choice of low-level operations and post-processing of the results and (3) seemingly obvious relationships have to be carefully encoded as rules – means that it is highly unlikely that any off-the-shelf, general-purpose toolkit will be able to extract just the right information from your data. To carry out automated analysis of spatial grids, you will inevitably have to code up your processing technique, understand the operations that are being performed and be able to adapt them to the needs of your domain and the quirks of your data set. The goal of this book is to give you the capability to do just that.


Fig. 1.6 The use of domain knowledge to identify roads in satellite images. (a) Landsat imagery of Whitewater, Michigan (b) Thresholded satellite imagery (c) Roads identified by a high-level process that involves sharpening road pixels, determining their direction, finding road segments and completing them using knowledge-based rules. Images from [TJEH89], used with permission.

Implementations of the techniques discussed in this book are provided so that you have a convenient starting point – you will have to adapt the techniques to your needs.

1.6 Spatial Data Mining Algorithms

A typical data mining algorithm to automatically analyze spatial grids starts from some raw data. These raw data have to be "gridded", i.e. placed on a uniform (regularly spaced) spatial grid. This is achieved through processes such as projection, objective analysis and rasterization (See Figure 1.7). The spatial grid is then turned into a representation – a matrix, a level set or a parametric approximation – depending on the types of operations that need to be performed on the data. A variety of local and global features such as entropy, texture, etc. can be computed on the spatial grid. The spatial grid often needs to be linearized or its histogram may need to be equalized. In some instances, the resolution of data values needs to be reduced through vector quantization. Objects can be identified from the spatial grid and properties of these objects computed. It is also possible to extract information such as motion and change from a time sequence of spatial grids. Together, all these features are used as input to a data mining model such as a decision tree, support vector machine or neural network so that a machine can learn to recognize certain situations automatically whenever a new spatial grid is received.


2 Geospatial grids

Rasterizing a polygon boils down to a simple question: how do you determine whether a grid point is inside or outside a given polygon?

Fig. 2.23 Testing to see if a pixel is inside a polygon.

The idea is to start outside the polygon and march along a row. Every time the polygon's edge is crossed, update a counter. At a pixel, if this counter is even, then the point is outside the polygon. If it is odd, then it is inside the polygon (See Figure 2.23).

    public boolean contains(double x, double y) {
      int numxcrossing = 0;
      int numycrossing = 0;
      for (int i = 0; i < edges.size(); ++i) {
        Double xintercept = edges.get(i).getXIntercept(y);
        Double yintercept = edges.get(i).getYIntercept(x);
        if (yintercept != null) {
          if (yintercept >= y) {
            ++numycrossing;
          }
        }
        if (xintercept != null) {
          if (xintercept >= x) {
            ++numxcrossing;
          }
        }
      }
      // odd number of crossings means inside
      return ((numxcrossing % 2 == 1) && (numycrossing % 2 == 1));
    }

with the intercepts determined as follows:

    public Double getYIntercept(double x) {
      if (!isBetween(lat0, x, lat1)) {
        return null;
      }
      double y;
      if (lat0 != lat1) {
        y = lon0 + (x - lat0) * (lon1 - lon0) / (lat1 - lat0);
      } else {
        y = (lon1 + lon0) / 2;
      }
      return y;
    }

The technique described above is straightforward, but comes with a caveat. It works only if you can start at a point that is definitely outside the polygon. This may be a problem if your domain contains partial polygons. The code above avoids this problem by using latitude and longitude as coordinates (instead of using a pixel's x,y coordinates within the spatial grid). Thus, by starting at a latitude or longitude that is outside the realm of possibility, we avoid having to truncate polygons to the grid and having to deal with incomplete polygons. However, this convenience comes with a trade-off: we have to deal with longitude wrapping around (i.e., -180 is the same as 180). Since the wrap-around happens at the international date line, in the middle of the Pacific, most real-world applications are unaffected and do not need to explicitly deal with this. In the unlikely event that yours is affected, replace the lat and lon in the above code with x and y and work within the coordinate system of your spatial grid.

2.5.5 Geocoding polygons

Given a set of polygons, then, it is possible to find out if a pixel is inside any of these polygons. Since countries, states, postal codes, etc. are usually available as polygons, this technique forms the basic approach for geocoding gridded spatial data, i.e., determining the address of a particular pixel given its latitude and longitude.
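As a sketch, geocoding a pixel can then be as simple as walking a list of polygons and returning the name attached to the first one that contains the pixel's coordinates. (NamedPolygon here is a hypothetical wrapper, not a class from the book's accompanying code; it simply pairs the Polygon of the earlier listing with a place name.)

    // Sketch: return the name of the first polygon that contains (lat, lon),
    // or null if the pixel is not inside any of the supplied polygons.
    public static String geocode(double lat, double lon, java.util.List<NamedPolygon> places) {
      for (NamedPolygon place : places) {
        if (place.getPolygon().contains(lat, lon)) {
          return place.getName();
        }
      }
      return null;
    }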

If we are going to be constantly checking whether a pixel is inside a particular polygon or not, it is useful to optimize away the line intersection code for cases where it is obvious that the point cannot lie within the polygon. One way is to compute the bounding box for a polygon and check the bounding box before moving on to the more precise code:

    public class Polygon {
      // etc.
      private BoundingBox boundingBox;
      public Polygon(Point[] vertices) {
        // etc.
        boundingBox = new BoundingBox(vertices);
      }
      public boolean contains(double x, double y) {
        // as an optimization, check the bounding box first
        if (!boundingBox.contains(x, y)) {
          return false;
        }
        // normal check here ...
      }
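The BoundingBox class itself is not shown in this excerpt. A minimal version might simply record the extremes of the vertex coordinates and answer containment with four comparisons (a sketch; the class of the same name in the book's accompanying code may differ in detail, and the Point accessors getX()/getY() are assumptions):

    public class BoundingBox {
      private double minX, maxX, minY, maxY;
      public BoundingBox(Point[] vertices) {
        minX = minY = Double.MAX_VALUE;
        maxX = maxY = -Double.MAX_VALUE;
        for (Point p : vertices) {
          minX = Math.min(minX, p.getX());
          maxX = Math.max(maxX, p.getX());
          minY = Math.min(minY, p.getY());
          maxY = Math.max(maxY, p.getY());
        }
      }
      public boolean contains(double x, double y) {
        return x >= minX && x <= maxX && y >= minY && y <= maxY;
      }
    }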


3 Data Structures for Spatial Grids

Fig. 3.1 (a) Top 10 pixels in terms of population density (b) Cluster of pixels around New York City

3.3 Level set

Although it is logical to represent a spatial grid as an array of pixels that are ordered by location (i.e. row-by-row), a level set organization can prove useful. We will see a particularly novel use of the level set later in the section on the watershed transform. But here, let's just look at it as a computationally more efficient way to address the problem of finding the top N pixels.

A level set organizes the pixels in a grid by pixel value:

    public class LevelSet {
      private Map<Integer, List<Pixel>> data = new TreeMap<Integer, List<Pixel>>();
      public void add(Pixel p) {
        List<Pixel> level = data.get(p.getValue());
        if (level == null) {
          level = new ArrayList<Pixel>();
          data.put(p.getValue(), level);
        }
        level.add(p);
      }
      public Map.Entry<Integer, List<Pixel>>[] getLevels() {
        return data.entrySet().toArray(new Map.Entry[0]);
      }
    }

The "data", i.e. the internal storage of the LevelSet above, is an associative array and is represented by the Map interface in Java. The TreeMap allows for the array to be internally sorted. Here, the key of the map is the pixel value and the value is the list of pixels in the grid that share that value. The "levels" that are returned by the LevelSet consist of an array of these key-value pairs.


To create the level set, one can march through a spatial grid and add pixels one at a time. Normally, missing data are not considered part of any of the levels:

    LatLonGrid input = ...;
    LevelSet levelset = new LevelSet();
    for (int i = 0; i < input.getNumLat(); ++i) {
      for (int j = 0; j < input.getNumLon(); ++j) {
        if (input.getValue(i, j) != input.getMissing()) {
          levelset.add(new Pixel(i, j, input.getValue(i, j)));
        }
      }
    }

It should be noted that creating the LevelSet involves traversing the spatial grid exactly once.

From the level set representation of a spatial grid, finding the top 10 pixels is particularly efficient, since it does not involve any more traversals through the spatial grid. One needs only to step down the level set one level at a time (starting at the highest level) until the size of the result reaches 10:

    // find the top n pixels
    Map.Entry<Integer, List<Pixel>>[] levels = levelset.getLevels();
    List<Pixel> result = new ArrayList<Pixel>();
    int curr = levels.length;
    while (result.size() < nth && curr > 0) {
      curr = curr - 1; // next
      result.addAll(levels[curr].getValue()); // all pixels at this level
    }

Compare this to the implementation in the previous section that selection-sorted the Pixel array, requiring 10 traversals through the array of pixels (which is the size of the spatial grid) to find the top 10 pixels.

3.4 Topographical surface

Another conceptual model of a spatial grid, related to the idea of a level set, is as a topographical surface. In other words, one can treat the spatial grid as a surface where the value of a pixel determines the height at that pixel. Such a topographical representation can help make sense of thresholding operations, whether with a simple, global threshold (See Figure 3.2) or more complex operations such as the watershed transform.

Treating a spatial grid as a topographical surface also gives rise to the concept of "morphological" operations, i.e. operations that operate in the neighborhood of a pixel but that consider the shape of the surface. For example, one can think of computing distances between points where a path will pass only through pixels above a certain threshold. This could be quite different from the "Euclidean" distance between two points (See Figure 3.3).


4.2 Distances

E[x] = \frac{\sum_k I_{x_k y_k} x_k}{\sum_k I_{x_k y_k}}    (4.7)

where I_{x_k y_k} is the value of the spatial grid at (x_k, y_k).
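In code, this value-weighted centroid coordinate can be computed in one pass over the cluster's pixels. The sketch below works directly on parallel arrays of pixel coordinates and the grid values at those pixels (illustrative only; it is not a listing from the book's accompanying code):

    // Sketch: value-weighted mean x coordinate of a cluster (Equation 4.7).
    // xs[k] is the x coordinate of the k-th pixel, values[k] the grid value there.
    public static double weightedMeanX(int[] xs, double[] values) {
      double num = 0, denom = 0;
      for (int k = 0; k < xs.length; ++k) {
        num += values[k] * xs[k];
        denom += values[k];
      }
      return num / denom;
    }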

Fig. 4.3 The distance from a pixel to a cluster can be computed by finding the minimum of the distances to each of the points in the cluster or by finding the average of those distances.

Fig. 4.4 The Mahalanobis distance normalizes the centroid-based distance by the width of the ellipse in the direction of the point.

The problem with simply finding the distance to the centroid is that it is not really scaled in any way to the size of the cluster. Looking at Figure 4.4, it seems intuitive that the pixel in the second panel is closer to the cluster than the pixel in the first panel. However, the distance to the centroid from both pixels is the same. The Mahalanobis distance accounts for this by fitting the points in the cluster to an ellipse and then normalizing the distance by the width of the ellipse in that direction. Because the ellipse is wider in the direction of the pixel in panel 2, the Mahalanobis distance will be less. Mathematically, the Mahalanobis distance is defined as:

d_{cluster} = \sqrt{(X - E[X])^T S^{-1} (X - E[X])}    (4.8)

where X is the vector of coordinates, i.e. (x, y), and the expectation of some quantity z, E[z], is the mean value of z computed over the cluster of points. As before, this


could be a weighted average, with the weights corresponding to the pixel value. Finally, the matrix S is the covariance matrix and is given by:

S = \begin{pmatrix} E[(x-E[x])(x-E[x])] & E[(x-E[x])(y-E[y])] \\ E[(x-E[x])(y-E[y])] & E[(y-E[y])(y-E[y])] \end{pmatrix}    (4.9)
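Because S is only a 2x2 matrix here, it can be inverted analytically, so the Mahalanobis distance of a point to a cluster is short to write out. The following sketch uses unweighted means and covariances (weighting by pixel value would simply replace the plain averages); it is illustrative, not a listing from the book's code:

    // Sketch: Mahalanobis distance (Equations 4.8 and 4.9) from a point (px, py)
    // to a cluster whose pixel coordinates are given in parallel arrays.
    public static double mahalanobis(double px, double py, double[] xs, double[] ys) {
      int n = xs.length;
      double mx = 0, my = 0;
      for (int k = 0; k < n; ++k) { mx += xs[k]; my += ys[k]; }
      mx /= n; my /= n;
      double sxx = 0, syy = 0, sxy = 0; // entries of the covariance matrix S
      for (int k = 0; k < n; ++k) {
        sxx += (xs[k] - mx) * (xs[k] - mx);
        syy += (ys[k] - my) * (ys[k] - my);
        sxy += (xs[k] - mx) * (ys[k] - my);
      }
      sxx /= n; syy /= n; sxy /= n;
      double det = sxx * syy - sxy * sxy; // invert the 2x2 matrix analytically
      double dx = px - mx, dy = py - my;
      double d2 = (dx * (syy * dx - sxy * dy) + dy * (sxx * dy - sxy * dx)) / det;
      return Math.sqrt(d2);
    }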

4.2.2 Cluster to cluster

Every one of the distance measures listed in the previous section is a metric. A function m(A,B) is a metric if it is symmetric (m(A,B) = m(B,A)), positive (m(A,B) ≥ 0, with m(A,B) = 0 if and only if A = B) and satisfies the triangle inequality (m(A,B) + m(B,C) ≥ m(A,C)).

Why is it important that a measure be a metric? The triangle inequality property is essential to carry out a fair measurement. Think about this scenario: if we measured that the distance between A and B is 100 units, and the distance between A and C is 10 units, we would say that C is better, especially if A is some measure of "truth". However, if the verification measurement does not satisfy the triangle inequality property, we may find that the distance between B and C is, say, 1 unit or even less. Considering the expected variance in computed distances, we may not be convinced that C is really better, since it is almost the same as B (the distance between them is almost zero). The symmetric property guarantees that every set has an equal right to be fairly measured: the distance from set A to set B is always the same as the distance from set B to set A. For these reasons, it is important that distance measures be metrics.

Are not all intuitive measures metrics? No. Consider the problem of finding the distance from one cluster of points to another cluster of points. What is a good measure? Intuitively, you might want to pick the closest pair of points in the two sets, i.e., find the distance from every point in the first cluster to the second cluster (using the pixel to cluster distances discussed in the previous section) and take the minimum of these:

d_{AX} = \min_a(d_{aX})    (4.10)

where d_{aX} is the distance between the point a (in the cluster A) and the cluster X.

Fig. 4.5 It can be difficult to devise a metric for the distance between two clusters.


However, taking the minimum of the distances (See Figure 4.5) violates the second property of a metric. When two clusters overlap, there will be points for which the distance d_{aX} will be zero and thus, the distance between the two clusters will be zero. But that doesn't make the clusters identical.

What if, instead of taking the minimum of the distances, we take the maximum? The problem is that this will not be symmetric. For every point in A, the distance to X will be based on the closest point in X. We will then take the maximum of those (See the second panel of Figure 4.5). However, if we go from every point in X and compute the distance to the points in A, we end up with a different value. So, the maximum is not a metric either. The average distance from the points in one cluster to the other cluster is similarly problematic. We won't go through the full litany of possibilities.

What does work is the Hausdorff metric, where we compute the distance from A to B and the distance from B to A and then take the maximum [Ruc96]:

m_H(A,B) = \max\{d(A,B), d(B,A)\}    (4.11)

Of course, this is symmetric now because we explicitly compute both distances. The Hausdorff metric is, however, highly susceptible to noise. Even a single point can affect the metric since we are taking minima and maxima. One possible way to address this, called the Partial Hausdorff Distance (PHD), is to use, say, the 75th percentile rather than the maximum. However, this is not a metric [Ruc96].

So, when we need to find the distance between two objects ("clusters"), it is the Hausdorff metric that we will use. However, we will keep in mind that it should be used only for objects that have been subjected to some sort of quality check, to avoid computing distances based on noisy pixels.
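For reference, here is a short sketch of the Hausdorff metric between two clusters of points, each given as parallel coordinate arrays. Here d(A,B) is taken to be the maximum, over the points of A, of the distance to the nearest point of B (an illustrative implementation, not a listing from the book's code):

    // Sketch: Hausdorff metric (Equation 4.11) between clusters A and B.
    public static double hausdorff(double[] ax, double[] ay, double[] bx, double[] by) {
      return Math.max(directed(ax, ay, bx, by), directed(bx, by, ax, ay));
    }
    // directed distance d(A,B): worst-case nearest-neighbor distance from A into B
    private static double directed(double[] ax, double[] ay, double[] bx, double[] by) {
      double worst = 0;
      for (int i = 0; i < ax.length; ++i) {
        double nearest = Double.MAX_VALUE;
        for (int j = 0; j < bx.length; ++j) {
          double dx = ax[i] - bx[j], dy = ay[i] - by[j];
          nearest = Math.min(nearest, Math.sqrt(dx * dx + dy * dy));
        }
        worst = Math.max(worst, nearest);
      }
      return worst;
    }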

4.3 Distance transform

All the metrics that we discussed – for the distance between two points, the distance between a point and a cluster and the distance between two clusters – are built on computing the distance between two points. On large images, this computation can be quite time-consuming. Therefore, we will explore ways to speed up distance computations.

A variety of fast methods to compute Euclidean distances have been proposed (See [FCTB08] for a survey). We will explore two of these methods: independent scanning [ST94] and ordered propagation [CM99], since both are quite simple to implement and among the best performing. While we discuss these methods in the context of Euclidean distances, you can adapt them readily to other distance measures such as the Manhattan distance. However, as discussed earlier, the reason to use the Manhattan distance is as an approximation to the Euclidean, with the understanding that we will fall back on the Euclidean for "close enough" points. Once you have a fast way to compute Euclidean distances, there may not be a need for you to compute the Manhattan distance.


4.3.1 Ordered propagation

A closely related concept to distances is the idea of a Voronoi region. Assume that you have a set of points. Then, the Voronoi region of a point is all the pixels that are closer to that point than to any other point of the set (See Figure 4.6). Note that, due to pixelation, the Voronoi region is not exact – if the boundary between points splits a pixel, we have to choose which region to assign the pixel to.

Fig. 4.6 Left: The Voronoi region of a point is the group of pixels that are closer to that point than to any other point. Right: Ordered propagation starts out from a point, updating distances, stopping the propagation when distances are no longer improved.

Often, the purpose of finding distances is to determine "areas of influence", which is of course what the Voronoi regions are. Ordered propagation provides a fast and efficient method of determining Voronoi regions, i.e., it is a fast solution to the problem of finding distances between a pixel and a set of points that do not form a cluster.

The idea behind ordered propagation is to start from the list of points and assign the distances at those pixels to zero. Then, we consider the neighbors of those points, updating the distance at those neighbors only if the distance at those neighbors would be "improved". If there are no neighbors to be improved, the process stops for this point and the next point is considered. The process for the light-colored point in Figure 4.6 is illustrative. Note that the immediate neighbors of this point are changed and their neighbors checked. However, once we reach the boundary, we stop modifying distances.

For example, to find the Voronoi region from the set of points above a certain threshold:

    public LatLonGrid getDistanceTransform(LatLonGrid data, int thresh) {
      int nrows = data.getNumLat();
      int ncols = data.getNumLon();
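The excerpt cuts off here. As a sketch of the overall idea, ordered propagation can be implemented with a queue of pixels whose distances have just improved; each such pixel passes its nearest source point on to its neighbors. The version below works on a plain integer array rather than the book's LatLonGrid and is illustrative only, not the book's implementation:

    import java.util.ArrayDeque;
    import java.util.Deque;

    public class OrderedPropagation {
      // Sketch: Euclidean distance from every pixel to the nearest pixel whose value
      // is at or above thresh, computed by propagating each pixel's nearest source
      // point to its neighbors and revisiting a pixel only if its distance improves.
      public static double[][] distanceFrom(int[][] data, int thresh) {
        int nrows = data.length, ncols = data[0].length;
        double[][] dist = new double[nrows][ncols];
        int[][] srcRow = new int[nrows][ncols];
        int[][] srcCol = new int[nrows][ncols];
        Deque<int[]> queue = new ArrayDeque<int[]>();
        for (int i = 0; i < nrows; ++i) for (int j = 0; j < ncols; ++j) {
          boolean isSource = data[i][j] >= thresh;
          dist[i][j] = isSource ? 0 : Double.MAX_VALUE;
          if (isSource) { srcRow[i][j] = i; srcCol[i][j] = j; queue.add(new int[]{i, j}); }
        }
        while (!queue.isEmpty()) {
          int[] p = queue.remove();
          int pi = p[0], pj = p[1];
          for (int di = -1; di <= 1; ++di) for (int dj = -1; dj <= 1; ++dj) {
            int ni = pi + di, nj = pj + dj;
            if (ni < 0 || nj < 0 || ni >= nrows || nj >= ncols) continue;
            // candidate: squared distance from the neighbor to this pixel's source point
            double dr = ni - srcRow[pi][pj], dc = nj - srcCol[pi][pj];
            double cand = dr * dr + dc * dc;
            if (cand < dist[ni][nj]) { // propagate only if the distance improves
              dist[ni][nj] = cand;
              srcRow[ni][nj] = srcRow[pi][pj];
              srcCol[ni][nj] = srcCol[pi][pj];
              queue.add(new int[]{ni, nj});
            }
          }
        }
        // convert the stored squared distances to Euclidean distances
        for (int i = 0; i < nrows; ++i) for (int j = 0; j < ncols; ++j)
          if (dist[i][j] != Double.MAX_VALUE) dist[i][j] = Math.sqrt(dist[i][j]);
        return dist;
      }
    }

The srcRow/srcCol arrays double as the Voronoi labeling: each pixel ends up pointing at (approximately) its nearest above-threshold pixel.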


5.4 Morphological operations

If we take an intermediate value such as the 75th percentile, we'll get a combination of dilation and smoothing. Similarly, if we use the 25th percentile, we'll get a combination of erosion and smoothing.

Fig. 5.15 Eroding the population density grid. Panels: the original population density; erosion with a 3x3 window; with a 7x7 window; with an 11x11 window.

Dilation makes high-valued pixels expand by the half-size, while erosion shrinks a high-valued area that is half-size pixels across to a point. Thus, if we erode a grid and then dilate it, we will get back most of the original high-valued pixels, but we will have gotten rid of any high-valued, isolated pixels, i.e. we would have gotten rid of noise. This combination, of erosion followed by dilation, is often used for denoising a grid and making it more amenable for further processing (See Figure 5.16). The sizes of the dilation and erosion windows and the number of times the filter is applied can be used to control the degree of noise removal.

    public DilateErodeFilter(int halfSize, int numTimes) {
      this.dilater = new DilationFilter(halfSize);
      this.eroder = new ErosionFilter(halfSize);
      this.numTimes = numTimes;
    }


Fig. 5.16 Denoising the population density grid using an erode+dilate filter. Panels: the original population density; a 3x3 window applied once; a 3x3 window applied 3 times; a 5x5 window applied 3 times.

    public LatLonGrid filter(LatLonGrid input) {
      LatLonGrid output = LatLonGrid.copyOf(input);
      for (int i = 0; i < numTimes; ++i) {
        output = eroder.erode(output);
        output = dilater.dilate(output);
      }
      return output;
    }

Reversing the order of operations, by doing a dilation first followed by an erosion, will result in holes being filled but the overall area not being increased (See Figure 5.17).
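In code, this hole-filling variant is just the filter() method above with the two operations swapped (a sketch; the book's accompanying code may package it differently):

    public LatLonGrid fillHoles(LatLonGrid input) {
      LatLonGrid output = LatLonGrid.copyOf(input);
      for (int i = 0; i < numTimes; ++i) {
        output = dilater.dilate(output); // dilate first to fill holes ...
        output = eroder.erode(output);   // ... then erode to restore the original extent
      }
      return output;
    }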

Other rank-based operations involve first sorting the values in the window and then computing a variety of order statistics on the sorted pixel values. These yield a class of non-linear filters for edge detection and noise mitigation [PV86].


Fig. 5.17 Filling holes in the population density grid using a dilate+erode filter. Panels: the original population density; a 3x3 window applied once; a 3x3 window applied 3 times; a 5x5 window applied 3 times.

5.5 Skeletonization

Often, we would like to reduce areas to lines that represent the general shape of the object (See Figure 5.18), i.e. to reduce areas to "skeletons". One way to think about the skeleton is to imagine the pixels that form the object as forming a surface that is deeper the farther away from the boundary you go. The skeleton is then the trench in the middle of the object. It should be clear from the examples in the figure that skeletons are not unique – two different shapes can have the same skeleton.

The Medial Axis Transform (MAT) explicitly encodes this analogy. It determines the closest boundary points for each pixel in the object, placing a pixel on the skeleton if it is equally close to two boundary points. It consists of three steps:

1. Finding the pixels that are in the object. As an example, we can consider all points greater than some threshold to be part of an "object".

    public static LatLonGrid findSkeleton(LatLonGrid input, int thresh, File out) throws Exception {
      // threshold



Fig. 6.7 Applying the Hough Transform to the skeleton of the population density image. (a) Population density (b) Preprocessed to fill in holes (c) Hilditch Skeleton (d) Three best lines

6.4 Hysteresis

Noise makes using a single, global threshold problematic: there may be pixels that barely miss meeting the threshold criterion right next to pixels that do meet it, leading to holes in the data. One way to mitigate this problem is to relax the threshold, but this causes many false detections. One solution to this problem is to employ hysteresis (See Figure 6.8).

The term hysteresis comes from the lag observed between the application of an electromagnetic field and its subsequent effect on a material. In the image processing context, the lagging effect is provided by the use of two thresholds – one to start the thresholding and the other (a lagging one) to stop it. The first threshold is used to start the region growing process and the second threshold is used to add pixels to an already started region. In effect, we are defining a valid object as a set of pixels with values above threshold2 that have at least one pixel with a value above threshold1. Allowing just one pixel to dictate whether a region is noise or not might seem overly lax, but we can easily make this much stricter by smoothing the grid before carrying out object identification.


Fig. 6.8 (a) Using a high, global threshold breaks up regions. (b) Lowering the threshold causes many false detections. (c) Hysteresis, by employing two thresholds, solves the problem.

Smoothing, such as by using a median filter, has the effect of denoising the image. So, we can mitigate noise and then carry out the hysteresis operation. The impact of such preprocessing can be explicitly defined. For example, if we apply an M×N median filter to the spatial grid before carrying out hysteresis, then for a pixel in the median-filtered image to be above some threshold t, at least M×N/2 of its neighbors must have been above t. Another way to state this would be in terms of the area occupied by these M×N/2 pixels – we could state, for example, that at least 15 km² of the pixel's neighborhood has to exceed the threshold.

The two thresholds have to be chosen so that the first threshold is high enough that only valid objects will meet that criterion, whereas the second threshold has to be low enough that regions do not get broken up. These thresholds are usually obtained by trial-and-error on a large dataset and are rarely perfect. Broken regions and false detections will typically remain in the final analysis and have to be accounted for. Hysteresis simply reduces the incidence of these relative to what we would have gotten if we had used only one threshold.


Incorporation of the two thresholds into region growing is quite straightforward. The thresholds are user-specified. We should be careful to ensure that the first threshold is larger than the second threshold. Here, we swap the two numbers if the user inadvertently mixed them up:

    private int t1, t2;
    public HysteresisSegmenter(int thresh1, int thresh2) {
      this.t1 = thresh1;
      this.t2 = thresh2;
      if (t1 < t2) {
        // swap
        int t = t1;
        t1 = t2;
        t2 = t;
      }
    }

We march through the image, starting the region growing process at any hitherto unmarked pixel that exceeds the first threshold:

    public LabelResult label(LatLonGrid data) {
      final int UNSET = 0;
      int nrows = data.getNumLat();
      int ncols = data.getNumLon();
      LatLonGrid label = new LatLonGrid(nrows, ncols, 0, data.getNwCorner(),
          data.getLatRes(), data.getLonRes());
      // label.fill(UNSET); java default is to zero-out arrays
      int regno = 0;
      for (int i = 0; i < nrows; ++i) for (int j = 0; j < ncols; ++j) {
        if (data.getValue(i, j) > t1 && label.getValue(i, j) == UNSET) {
          ++regno;
          RegionGrowing.growRegion(i, j, data, t2, label, regno);
        }
      }
      return new LabelResult(label, regno);
    }

Note in the above listing that region growing (the call to RegionGrowing.growRegion) is carried out using the second threshold. The result of hysteresis on the population density dataset is shown in Figure 6.9. Compare with the results of using a single threshold in Figure 6.3. Although we do not show the impact of applying a size threshold to these cities, it should be obvious that the results ought to be better, because the cities obtained using hysteresis and two thresholds t1 and t2 should be larger than those obtained using just t1 and fewer in number than those obtained using just t2.


Fig. 6.9 Clusters obtained by region-growing using hysteresis: (a) population density; (b) threshold1 at 10,000; (c) threshold1 at 20,000; (d) threshold1 at 30,000. The second threshold in all cases is 5000 less than the first threshold. The colorscale associated with the clusters is arbitrary.

6.5 Active contours

Hysteresis used two thresholds, the first a threshold high enough that noise is unlikely and the second a threshold low enough that all the points in an object are included. Hysteresis is not ideal, however. For one thing, objects turn out to be pixelated (a term that is now synonymous with something being jagged and poor quality). For another, it is difficult to devise thresholds that work perfectly. There is inevitably some overlap between noisy values and values that correspond to valid objects, and where there is such an overlap, object identification is problematic.

Another technique to account for noise is to use active contours, also called "snakes". The idea is to draw a contour around a group of pixels that are considered to be part of an object and to then move the boundary (hence the use of "active" and "snake") subject to criteria such as that the contour should not cross a high-gradient region and that the contour should be smooth.

An active contour consists of a set of control points through which a smooth curve can be drawn (See Figure 6.10). The control points are moved iteratively and retained if the movement does not violate smoothness or gradient checks. Starting with a contour is quite heuristic – the initial development of active contours was for interactively drawn lines – but the movement of a drawn contour to enclose the desired pixels can be quite objective.

Thus, a snake is represented as a list of control points, or snake nodes. The nodes themselves are represented by their coordinates and by a curvature. Since the curvature makes sense only when the node is part of a curve, it is initialized to an arbitrary value and will be computed once the node is part of a snake.


Fig. 6.10 Moving a snake involves moving its control points. The movement (a) might be acceptable but movement (b) might not be, because the contour now cuts across a high-gradient region.

    public class SnakeNode {
      int x;
      int y;
      double curv;
      SnakeNode(int x, int y) {
        this.x = x;
        this.y = y;
        this.curv = 0;
      }
    }

The snake itself is a list of ordered nodes. Because we will need the mean distance between nodes to compute the snake's energy functional, we precompute this:

    public class Snake {
      private SnakeNode[] pts;
      private double meanDistBetweenPts;

      public Snake(SnakeNode[] pts) {
        this.pts = pts;
        meanDistBetweenPts = 0;
        if (this.pts.length == 0) return;

        // compute mean dist
        for (int i = 0; i < pts.length; ++i) {
          SnakeNode curr = pts[i];
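          // The excerpt cuts off at this point. A sketch of how the computation
          // presumably continues: accumulate the distance from each node to the next
          // (wrapping around, on the assumption that the snake is a closed contour)
          // and then divide by the number of nodes.
          SnakeNode next = pts[(i + 1) % pts.length];
          double dx = next.x - curr.x;
          double dy = next.y - curr.y;
          meanDistBetweenPts += Math.sqrt(dx * dx + dy * dy);
        }
        meanDistBetweenPts /= pts.length;
      }
    }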


7.2 Optical Flow

      ConvolutionFilter smFilter = new ConvolutionFilter(ConvolutionFilter.boxcar(smsize, smsize));
      in = smFilter.smooth(in);
      LatLonGrid out = new LatLonGrid(in.getNumLat() / factor, in.getNumLon() / factor,
          in.getMissing(), in.getNwCorner(), in.getLatRes() * factor, in.getLonRes() * factor);
      for (int i = 0; i < out.getNumLat(); ++i) {
        for (int j = 0; j < out.getNumLon(); ++j) {
          int x = i * factor;
          int y = j * factor;
          out.setValue(i, j, in.getValue(x, y));
        }
      }
      return out;
    }

The size of the window increases with the resolution, from one pixel at the coarsest resolution to (2R+1) pixels at the most detailed resolution. The resulting motion vector is updated taking into account the resolution of the grids on which the lower-scale one was computed.¹

These steps are illustrated in Figures 7.10 and 7.11. The result of cross-correlation at the coarsest resolution (top row of both figures) is a motion vector consisting of -1, 0 or 1 at that resolution. This corresponds to a motion vector of -u_max, 0 or u_max at the most detailed resolution. At each resolution after that, the motion vectors are successively refined and the first image of the sequence aligned using this motion vector (third column of Figure 7.11), until we reach the final result shown in Figure 7.12.

There are a couple of issues with pyramidal cross-correlation. One is that errors at coarser scales are difficult to recover from, since we will be searching in the wrong neighborhood at more detailed scales. Secondly, cross-correlation suffers from the aperture problem. This is the idea that when we use small windows, we will not be able to detect motion in a direction that is parallel to an edge that is larger than the window (See Figure 7.13). One solution is to use larger windows, but cross-correlation is computationally very expensive. We therefore seek a faster way to perform cross-correlation.

7.2.4 Phase Correlation

If our interest is not in determining a pixel-by-pixel motion vector but instead in determining the overall degree to which the two spatial grids are shifted, then the cross-correlation approach is computationally extremely expensive. In essence, we are determining the motion vector at the center pixel in the grid, using a window that is the size of the entire grid. We are moving this window around, computing

¹ For details of utility methods not presented here, please see PyramidalCrossCorrelation.java in the package edu.ou.imganalysis.motion.


Fig. 7.11 u, v and the first grid aligned using this motion vector at each of the resolutions in the pyramid.

Fig. 7.12 Motion field computed using cross-correlation: (a) at time t0; (b) at time t1; (c) east-west motion; (d) north-south motion.


Fig. 7.13 When the window is small compared to the size of the object, motion that is parallel to an edge cannot be detected using cross-correlation.

the cross-correlation (which involves traversing the entire grid, since the window is the size of the grid) and trying to determine the displacement at which the cross-correlation is maximum.

Recall from Section 5.6 that convolution in the spatial domain is equivalent to multiplication of the Digital Fourier Transforms (DFTs), and that convolution and cross-correlation differ only in that convolution involves a flip of the window coefficients. Can we use this to optimize the cross-correlation method? Such an optimization is useful whenever the window is large, not just when we need to determine a global motion vector.

Given two spatial grids I_0 and I_1, the cross-power spectrum of the two grids can be obtained by multiplying the DFT of the first grid with the complex conjugate of the DFT of the second grid [DCM87]:

r = \mathcal{F}^{-1}\left[\frac{\mathcal{F}(I_1)\,\mathcal{F}^*(I_0)}{|\mathcal{F}(I_1)\,\mathcal{F}^*(I_0)|}\right]    (7.5)

where \mathcal{F}(x) is the DFT of x, \mathcal{F}^{-1} the inverse DFT and \mathcal{F}^* the complex conjugate of the DFT. The location of the maximum in the cross-power spectrum gives the optimal displacement.

The first step is to compute the DFT of the first frame of the sequence. We do this using the Fast Fourier Transform (FFT):

    public Pair<Integer, Integer> compute(LatLonGrid data0, LatLonGrid data1) {
      int motNS = 0, motEW = 0;
      Complex[][] in1 = FFT2D.fft(FFT2D.zeropad(data0));

Then, we pick the center part of the second frame of the sequence. This is in case there are strong structures towards the edge of the grid – we do not have to match it in the current frame. Such windowing could be done with a slowly decaying weighting


function (a "Hamming window") or could be done simply by zeroing the boundary areas:

// zero-out an area of thickness MAXU/MAXV around the boundary to avoid boundary issues
LatLonGrid centerb = LatLonGrid.copyOf(data1);
int minx = MAXU;
int miny = MAXV;
int maxx = centerb.getNumLat() - minx;
int maxy = centerb.getNumLon() - miny;
for (int i = 0; i < data1.getNumLat(); ++i) {
  for (int j = 0; j < data1.getNumLon(); ++j) {
    if (i < minx || j < miny || i > maxx || j > maxy) {
      centerb.setValue(i, j, 0);
    }
  }
}
Complex[][] in2 = FFT2D.fft(FFT2D.zeropad(centerb));
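As an aside, the “Hamming window” alternative mentioned above tapers the grid rather than truncating it: in one dimension

w(n) = 0.54 - 0.46\,\cos\!\left(\frac{2\pi n}{N-1}\right), \qquad n = 0,\ldots,N-1,

and a two-dimensional taper can be formed as the separable product w(i)\,w(j), applied to the grid values before the FFT in place of the hard zeroing in the listing above.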

We can then compute the cross-power spectrum using Equation 7.5:

// find phase shift at this point
for (int i = 0; i < in1.length; ++i) for (int j = 0; j < in1[0].length; ++j) {
  in1[i][j] = in1[i][j].multiply(in2[i][j].conjugate());
  in1[i][j] = in1[i][j].multiply(1.0 / in1[i][j].norm());
}
// take ifft
Complex[][] result = FFT2D.ifft(in1);

Finally, we find the location at which the cross-power spectrum is maximum. This gives the displacement in pixels:

double bestValue = Integer.MIN_VALUE;
int startx = 0; // result.length/2 - MAXU;
int starty = 0; // result[0].length/2 - MAXV;
int endx = result.length; // /2 + MAXU;
int endy = result[0].length; // /2 + MAXV;
for (int i = startx; i < endx; ++i) for (int j = starty; j < endy; ++j) {
  if (result[i][j].normsq() > bestValue) {
    bestValue = result[i][j].normsq(); // track the peak magnitude
    motNS = -i;
    motEW = -j;
  }
}

The DFT of a finite sequence assumes that the sequence is periodic, i.e., that it keeps repeating. Therefore, we cannot distinguish between a motion of 3 pixels and a motion of (N-3) pixels. For example, on a 128-pixel grid, a raw peak at index 125 could mean a shift of 125 pixels or, equivalently, one of 125 − 128 = −3 pixels. On the grounds that it is much more likely that the image is shifted by 3 than that it is completely wrapped around, we correct this:

// we don't want a 345-degree phase shift; we want it to be 15 degrees
if (Math.abs(motNS) > result.length / 2) {
  if (motNS < 0) motNS += result.length;
}
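To make the whole pipeline concrete, the following is a minimal, self-contained one-dimensional sketch (it is not part of the book’s code distribution, and it uses a naive O(N²) DFT purely for illustration; the class name PhaseCorrelation1DDemo is invented). It recovers a circular shift of −3 samples, including the wrap-around correction just described:

public class PhaseCorrelation1DDemo {
  public static void main(String[] args) {
    final int N = 32;
    final int trueShift = -3; // g is f circularly shifted left by 3 samples
    double[] f = new double[N];
    double[] g = new double[N];
    for (int i = 0; i < N; ++i) {
      f[i] = Math.exp(-0.1 * (i - 10) * (i - 10)); // a smooth bump centered at sample 10
    }
    for (int i = 0; i < N; ++i) {
      g[i] = f[((i - trueShift) % N + N) % N]; // circularly shifted copy
    }

    double[][] F = dft(f); // [0] = real part, [1] = imaginary part
    double[][] G = dft(g);

    // normalized cross-power spectrum, following Eq. 7.5: F(I1) * conj(F(I0)), with I0 = f, I1 = g
    double[] re = new double[N];
    double[] im = new double[N];
    for (int k = 0; k < N; ++k) {
      double r = G[0][k] * F[0][k] + G[1][k] * F[1][k];
      double m = G[1][k] * F[0][k] - G[0][k] * F[1][k];
      double mag = Math.hypot(r, m);
      re[k] = (mag > 1e-12) ? r / mag : 0;
      im[k] = (mag > 1e-12) ? m / mag : 0;
    }

    // inverse DFT (real part only); the location of the peak is the shift
    int best = 0;
    double bestVal = Double.NEGATIVE_INFINITY;
    for (int n = 0; n < N; ++n) {
      double sum = 0;
      for (int k = 0; k < N; ++k) {
        double ang = 2 * Math.PI * k * n / N;
        sum += re[k] * Math.cos(ang) - im[k] * Math.sin(ang);
      }
      if (sum > bestVal) {
        bestVal = sum;
        best = n;
      }
    }
    // a peak at N-3 really means a shift of -3, not +29; had the conjugate been taken of the
    // second signal instead (as in the listing above), the peak index would have to be negated
    if (best > N / 2) {
      best -= N;
    }
    System.out.println("estimated shift = " + best); // prints -3
  }

  // naive O(N^2) forward DFT of a real sequence; returns {real part, imaginary part}
  private static double[][] dft(double[] x) {
    int n = x.length;
    double[][] result = new double[2][n];
    for (int k = 0; k < n; ++k) {
      for (int t = 0; t < n; ++t) {
        double ang = -2 * Math.PI * k * t / n;
        result[0][k] += x[t] * Math.cos(ang);
        result[1][k] += x[t] * Math.sin(ang);
      }
    }
    return result;
  }
}

On real two-dimensional grids, the FFT-based listings above are of course the appropriate implementation.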



Fig. 8.14 Decision tree used to classify cities in China.

Fig. 8.15 Left: neural network with two hidden nodes to categorize cities into rich vs. poor; Right: neural network with three hidden nodes to predict the value of GDI.

For example, to create a classification neural network that is capable of predicting whether a city is rich or not, we can train the network on a variable that is 1 for cities that belong to developed economies and 0 for poorer cities as follows:⁹

library(neuralnet);
data$rich = (round(data$gdi) > 2.5);
nn1 <- neuralnet(rich ~ pop + light, data=data, hidden=2,
        threshold=0.1, err.fct="ce", act.fct="logistic",
        linear.output=FALSE);
plot(nn1);

⁹ Unlike the R packages we have used so far, the neuralnet package is not a base package in R. The base neural network package in R is called nnet, but it does not have some of the capabilities that we use in our listings. Install the neuralnet package from within R using install.packages("neuralnet") and follow the prompts.





The resulting neural network is shown in the first panel of Figure 8.15. The inputs pop and light are transformed using the neural network so that the output is the probability that the city in question is rich. In order to attain a true probability, the error function needs to be cross-entropy (“ce” in the above listing) and the output needs to also be a logistic function (“linear.output” in the above listing is false). The “hidden” variable specifies the number of hidden nodes (the number of nodes in the middle layer in Figure 8.15) and the “threshold” specifies the error threshold at which we wish to stop training.
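For reference, the logistic activation and the cross-entropy error selected by these options are the standard definitions:

\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad E_{ce} = -\sum_{i}\left[\,y_i \log p_i + (1 - y_i)\log(1 - p_i)\,\right],

where z is the weighted sum of a node’s inputs, y_i is the 0/1 “rich” label of the i-th training city and p_i is the probability output by the network for that city.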

If we wish to predict the actual value of gdi, we cannot use the logistic function because its value is bounded between 0 and 1. Also, for non-classification problems, it is better to minimize the sum of squared errors (“sse”) rather than cross-entropy. Therefore, we can obtain weights for a neural network capable of producing a development index using:

nn2 <- neuralnet(gdi ~ pop + light, data=data, hidden=3,
        err.fct="sse", threshold=0.5, act.fct="logistic",
        linear.output=TRUE);
plot(nn2);

This “regression” (as opposed to “classification”) neural network, with 3 hidden nodes, is shown in the second panel of Figure 8.15. The only difference is that the last node is not a logistic function but is simply a weighted sum of its inputs.

The classification neural network that we have created provides a probability, but as a consequence it is capable of classifying only into one of two classes (assuming that you threshold the resulting probability at, say, 0.5). If you have multiple categories, then you will essentially need to train multiple neural networks, one for each category. You can do this by creating a “class indicator matrix” that is 1 in the appropriate column of each row and then creating a neural network with that many outputs. For example, a city whose rounded gdi is 2 would have the indicator row (0, 0, 1, 0, 0), so with five categories the network needs five output nodes. This capability is not available in the neuralnet package, but we can use the nnet package to do this:

library(nnet);
nn2 <- nnet(gdi ~ pop + light, data=data, size=2, entropy=FALSE,
        linout=TRUE);
summary(nn2)
data$classes = class.ind(round(data$gdi));
nn3 <- nnet(classes ~ pop + light, data=data, size=2,
        entropy=TRUE);
summary(nn3)

The resulting neural network summary provides the weights for each of the nodes (b is the bias node):

a 2-2-5 network with 21 weights
options were - entropy fitting
 b->h1 i1->h1 i2->h1
 16.62   1.75  -2.37
 b->h2 i1->h2 i2->h2
  3.40   0.01  -0.07
 b->o1 h1->o1 h2->o1
 -6.21   1.98   5.80
 b->o2 h1->o2 h2->o2
 -1.66   0.43   0.95
 b->o3 h1->o3 h2->o3
  0.01  -1.17  -1.09
 b->o4 h1->o4 h2->o4
 -1.88  -0.79  -2.08
 b->o5 h1->o5 h2->o5
  1.10  -0.89 -23.42





Once we have obtained the neural network weights, we can assign the development index to any population and lighting input. For example, using the rich vs. poor neural network shown in the first panel of Figure 8.15, we can compute the probability that a city is rich as follows:

private static double logistic(double... val) {
  double sum = 0;
  for (int i = 0; i < val.length; ++i) {
    sum += val[i];
  }
  return 1.0 / (1 + Math.exp(-sum));
}

private static double probOfRichNN(double pop, double light) {
  // numbers from NN diagram
  double h1 = logistic(3.06728 * pop, 3.26584, 1.77153 * light);
  double h2 = logistic(0.00625 * pop, 2.82917, -0.03631 * light);
  double rich = logistic(2.4291 * h1, 3.12817, -11.29847 * h2);
  return rich;
}

and use this function to assign the probability (scaled to 0-100) to a list of patterns:

public static int[] applyNeuralNetwork(double[][] pattern) {
  int[] result = new int[pattern.length];
  for (int i = 0; i < pattern.length; ++i) {
    result[i] = (int) Math.round(100 * probOfRichNN(pattern[i][0], pattern[i][1]));
  }
  return result;
}

This is shown in Figure 8.16. Again, the results are quite similar to those of the linear fit or the tree model. However, you should realize that what we have here is the likelihood that the city in question is highly developed (on a scale of 0 to 100), not the development category (one of 0, 1, 2, 3 or 4) of the city.

These are not the only data mining models available. In particular, support vector machines (SVMs) and random forests are very popular. However, these models suffer from being much more complex to implement. The training is quite simple – there are R packages, for example, that will be able to create trained SVMs given a set of patterns.




Fig. 8.16 Applying a neural network to all cities, shown over China.

The complexity arises when it comes to taking the trained model and implementing it in your own software so as to process unseen cases. It is usually not as simple as computing a weighted sum (linear model), applying if-then rules (decision trees) or plugging weights into a logistic function (neural networks). In the case of SVMs, you will have to store a library of support vectors (essentially selected patterns) and use these to derive the output for any new input. In the case of a random forest, you will have to store a library of decision trees, pass the new input through these trees and derive the output based on the output of all these trees. Also, SVMs and random forests scale poorly to large data sets: training gets much slower as the size of your training data set increases. The increased complexity is usually not worth the trouble: the accuracy of the resulting data mining models on unseen patterns is often no better than that of a neural network or a simple decision tree. Consequently, the simple data mining models presented here usually suffice.

8.4 Clustering

What if we had no country-by-country dataset to train against? Is it possible to take the city-by-city data and see if there is a natural division of the cities into, say, 3 categories? This is called unsupervised learning (the known categories for each of the training patterns form the supervision). If we carry out unsupervised learning, we will always get the three categories that we wanted, but the categories may or may not correspond to economic activity (that is the danger of carrying out undirected data mining). However, there are situations where such a clustering of the data can


Index

active contour, 192
alignment, 231
aperture, 241
array, 65
array, 2D vs. 1D, 18
aspect ratio, 184
assignment problem, 247, 250
attributes
  geometric, 179
  physical, 179
automation
  challenges, 7
  need for, 3, 4
background field, 225
barnes, 51
bin size, 106
boundary, 61, 132
bounding box, 59
boxcar, 133
Brewer, Cindy, 33
Catmull-Rom, 56
centroid tracking, 246
change detection, 224
CIESIN, v
class indicator, 301
classification, 297
climatology, 225
clustering, 215, 303
coasting, 248
colormap
  grayscale, 32
  perceptual, 32, 33
  rainbow, 32
comparing images, 107
convolution, 133
  frequency domain, 164
cooccurence, 113
Cook's distance, 294
coordinate system, 19
correlation, 292
country, 61
country boundary, 61
cressman, 47
cross validation, 296
cross-correlation, 232
cumulative distribution function, 107
cylindrical equidistant, 19, 23
data mining algorithm, 10, 271
data preparation, 273
data rate, 4
dataset selection, 107
decision tree, 297
denoising, 155
DFT, 164
differencing, 223
Digital Fourier Transform, 164, 243
dilate+erode, 156
dilation filter, 153
directional filter, 137
distance
  between pixels, 22
  cluster to cluster, 94
  Euclidean, 91
  fast computation, 95, 100
  geodesic, 101
  great circle, 23
  Hamming, 91
  Hausdorff, 95
  Mahalanobis, 93
  Manhattan, 91
  metric, 94




  noise tolerant, 92
  partial Hausdorff, 95
  pixel to cluster, 92
  scaling, 93
  taxicab, 91
  transform, 95
  types, 91
Downs, Robert, v
econometric data
  problems, 272
edge
  direction, 148
  thickness, 145
edge detection, 143
  LoG, 145
  Sobel, 143
edge filter, 156
ellipse filter, 137
ellipse fit, 184
enhanced watershed, 207
entropy, 5, 116
  Shannon, 107
erode+dilate, 155
erosion filter, 154
expectation, 53
expected value, 53
Fast Fourier Transform, 164, 243
Federal Highway Administration, 4
FFT, 164
filter
  anisotropic, 141
  dilate+erode, 156
  ellipse, 137
  erode+dilate, 155
  half-size, 150
  isotropic, 141
  oriented, 137
  separable, 142
filter bank, 84, 139
frequency domain, 133, 243
fuzzy, 275
  aggregation, 277
  defuzzification, 277
  membership function, 276
  operation, 275
  rules, 276
  variable, 275
fuzzy algorithm
  steps, 272
gap filling, 156
Gaussian, 134
geocoding, 59, 183
Geographic Information Systems, see GIS
georeference, 21
GIS, 1, 2
GLCM, 113
GPS, 22
gradient, 225
gray level, 7
greedy optimization, 247
gridding, 45
growth, 225
Hausdorff, 95
hidden nodes, 301
Hilditch, 160
histogram, 105
histogram equalization, 118
Horn-Schunck, 226
Hough transform, 186
human vision, 9
Hungarian method, 250
hysteresis, 189
in-place, 131
independent scanning, 98
instrument geometry, 41
integers, why, 18
intercept, 58
interpolation
  bilinear, 29
  nearest neighbor, 29
  spline, 31
Java, v, 315
K-means, 304
Kalman filter, 256
Kendall, 293
kmeans, 215
Kolmogorov-Smirnov test, 107
kriging, 52
KS test, 107
Lambert, 26
landsat, 9
Laplacian of Gaussian, 145
latitude
  geodetic, 22
LatLonGrid, 17
latres, 19
level set, 68, 201
line, 54
  fit, 186
  rasterization, 54



line quality, 188
linear assumption, 31
linear model, 292
linear regression, 292
linearity, 145
  verifying, 35
local measures, 111, 112
logistic function, 299
longitude wrap, 112
lonres, 19
lookup table, 62
Markov, 72
MAT
  problems with, 159
matched filter, 135
  limitation, 136
maximum, 65
Medial Axis Transform (MAT), 157
median filter, 149
  speed, 150
metric, 94
missing data, 20, 144
model diagnosis, 294
morphological operation, 153
motion
  cross-correlation, 232
  edge effects, 225
  hybrid method, 261
  interpolation, 264
  object tracking, 245
  optical flow, 226
  partial derivative, 226
  phase, 241
multi-resolution, 85
multi-scale, 85
multiple channels, 225
multiresolution, 238
Munkres algorithm, 250
nearest neighbor, 29
neighborhood, 129
neural network, 299
  classification, 299
  multiple outputs, 301
  probability, 301
  regression, 301
  running, 302
noise, 152
normalized, 304
northwest corner, 19
NSSL, v
object size, 179
object tracking, 245
objective analysis, 46
optical flow, 226
ordered propagation, 96
orientation, 184
oriented filter, 137
Otsu, 110
outlier detection, 294
overfitting, 294
padding, 112
parametric approximation, 79
Paraview, 33
Pearson, 293
Peoria, 174
performance optimization, 49
phase correlation, 241
pictures vs. data, 7
pixel, 66
  center location, 20
pixel-wise processing, 278
point, 54
  rasterization, 53
points
  gridding, 45
polygon, 57, 60
population density, 31
population density data set, v
postal code, 183
pre-processing, 129
  need for, 159
primary city, 287
principal components, 225
probability, 301
projection
  changing, 26
  choosing a, 25
  converting, 26
  inverse formula, 27
  map, 23
  remapping grids, 27
Projection pursuit, 75
pyramidal cross-correlation, 238
QQ plot, 294
quantization, 117
R package, 292, 315
  installation, 300
radar geometry, 42
Radial Basis Function, 73
random forest, 302
random process, 105
rank filter, 153, 156



raster, 6
rasterization, 53
  line, 54
  point, 53
ratio, 224
recursion
  problems with, 97, 176
refinement, 282
region growing, 176
remap, 290
remapping grids, 27
representation, 17
reprojection, 26
residuals, 294
road identification, 9
rotation, 141, 184
Saito, 98
satellite coordinate system, 41
Shannon, 107
shapefile, 3
skeleton, 157, 188
skeletonization
  Hilditch, 160
  MAT, 157
smoothing, 131, 224
  time, 131
snake, 192
soft threshold, 275, 281
sorting, 66
spatial grid, 5
spatial operation, 2
speckle filter, 152
spline, 56
split-and-merge, 249
stack overflow, 97
statistics
  types, 89
supervised learning, 282
support vector machine, 302
template matching, 135
texture, 113, 116
thinning
  Hilditch, 160
threshold, 69
  global vs. local, 204
  selection, 109
thresholding, 173
topographical, 69
tracking, 245
tree, 84
  classification, 297
  pruning, 298
  regression, 297
uniformity, 116
unsupervised learning, 282, 303
UTM, 25
vector quantization, 120, 121
Voronoi region, 96
watershed, 200
  enhanced, 207
wavelet, 84, 240
wgs84, 21
window, 129
window coefficients, 132
window size, 132
wrap around, 20, 59
Yuan, May, v
zero pad, 168
zipcode, 183