Appendix IX. Spatial Statistics: Exploratory Spatial Data Analysis

8/7/2019 Appendix IX. Spatial Statistics: Exploratory Spatial Data Analysis

TABLE OF CONTENTS

I. Introduction
II. Methods
III. Results
    A. Figure 1. EDA of CSRobSeries dataset
    B. Figure 2. EDA of Residential Burglaries dataset
    C. Figure 3. EDA of Robberies dataset
    D. Figure 4. EDA of Theft from Autos dataset
    E. Figure 5. EDA for all crime datasets combined
    F. Figure 6. EDA of Maple dataset
    G. Figure 7. EDA of Hickory dataset
    H. Figure 8. EDA of all trees combined
IV. Conclusions
    A. Convenience Store Robbery Series (CSRobSeries)
    B. Residential Burglaries
    C. Robberies
    D. Theft from Autos
    E. All crime data
    F. Maple
    G. Hickory
    H. All Trees


I. Introduction

Not only does ArcGIS provide functions for creating and manipulating graphic representations of spatial data, but the software also computes and displays spatial statistics. A map of data points alone sometimes reveals patterns among them, but the products of these statistical functions are tools for analyzing the properties of spatial data quantitatively. Exploratory data analysis (EDA) here yields measures of central tendency (mean center and central feature) and of dispersion (standard distance and directional distribution). These statistics are useful for evaluating the average location of occurrences as well as the distribution of points around it.
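
As a rough illustration of the two central-tendency measures named above, the following Python sketch computes an unweighted mean center and central feature. The function names and sample coordinates are assumptions made for illustration; this is not the ArcGIS implementation, and the points are not taken from the report's datasets.

```python
import math

def mean_center(points):
    """Average x and average y of the points (unweighted)."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def central_feature(points):
    """The input point with the smallest summed Euclidean distance to all points."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))

# Hypothetical coordinates, not taken from the report's datasets.
sample = [(0, 0), (4, 0), (2, 1), (2, 3), (10, 2)]
print(mean_center(sample))      # need not coincide with any input point
print(central_feature(sample))  # always one of the input points
```

The distinction matters when reading the maps: the mean center can fall where no event occurred, while the central feature is by construction an actual data point.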

II. Methods

ArcGIS was used to load data layers, create maps of the study areas, and perform statistical analyses. Roads were identified using the Identify tool. The tools in the Measuring Geographic Distributions toolset of the Spatial Statistics toolbox produced shapes for the mean center, central feature, standard distance, and directional distribution. No features were weighted. This process was used to analyze five sample datasets provided by ESRI: CSRobSeries, Residential Burglaries, Robberies, Theft from Autos, and AllTrees. EDA was also performed for all crime data combined and all trees combined.
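
The two dispersion measures can be sketched in the same spirit. The code below is a minimal, unweighted illustration: standard distance as the root-mean-square distance from the mean center, and an ellipse summarized by the principal axes of the coordinate covariance. This is one common formulation and an assumption here; the ArcGIS tool may scale or report its ellipse differently, and all names and sample points are hypothetical.

```python
import math

def _centered_moments(points):
    """Population variances and covariance of x and y about the mean center."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return sxx, syy, sxy

def standard_distance(points):
    """Root-mean-square distance of the points from their mean center."""
    sxx, syy, _ = _centered_moments(points)
    return math.sqrt(sxx + syy)

def deviational_ellipse(points):
    """Return (major_sd, minor_sd, angle_deg).

    angle_deg is the orientation of the major axis, measured
    counter-clockwise from the +x (east) axis.
    """
    sxx, syy, sxy = _centered_moments(points)
    half_sum = (sxx + syy) / 2
    half_diff = math.hypot((sxx - syy) / 2, sxy)
    major = math.sqrt(half_sum + half_diff)
    minor = math.sqrt(max(half_sum - half_diff, 0.0))  # guard against tiny negatives
    angle = math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy))
    return major, minor, angle

# Hypothetical points lying on a NW-SE line (+y is north, +x is east).
line = [(-2, 2), (-1, 1), (0, 0), (1, -1), (2, -2)]
print(standard_distance(line))
print(deviational_ellipse(line))
```

A long major axis with a short minor axis corresponds to the stretched ellipses discussed in the conclusions, while nearly equal axes correspond to data spread evenly in all directions.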

III. Results

See following figures.


Figure 1. EDA of CSRobSeries dataset


Figure 2. EDA of Residential Burglaries dataset


Figure 3. EDA of Robberies dataset


Figure 4. EDA of Theft from Autos dataset


Figure 5. EDA for all crime datasets combined


Figure 6. EDA of Maple dataset

Figure 7. EDA of Hickory dataset


Figure 8. EDA of all trees combined


IV. Conclusions

Convenience Store Robbery Series data

Before any spatial analysis is conducted, viewing the data points overlaid on a street map makes some patterns evident. First, all of the robberies occurred along major streets with easy access to highways. Convenience stores are typically located in high-traffic areas to facilitate accessibility, so this pattern can be expected. A string of points lies along Cornhusker Highway, a major roadway through a commercial district in Lincoln, NE. While these are the only points that are obviously clustered, most of the others are also located on main roads or close to major highways. Proximity to highways may allow for more efficient getaways, which could explain this trend.

The mean center, the point defined by the average x and y coordinates, is located about a block away from any data point, including the central feature. The mean seems slightly skewed by an extreme point in the south along Route 2. In further examination, this point could be removed so that its effect could be seen. The standard distance is relatively large, which indicates that the convenience store robberies were fairly widespread. The directional distribution, or standard deviational ellipse, indicates that there is little variation along the NE-SW axis and variation greater than one standard deviation along the NW-SE axis. The points in the NW and SE corners caused this stretch of the ellipse.
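
The suggested check on the extreme point can be sketched as a leave-one-out comparison: recompute the mean center with each point omitted and see how far it moves. The helper names and coordinates below are hypothetical and only illustrate the idea; they are not the CSRobSeries data.

```python
import math

def mean_center(points):
    """Average x and average y of the points (unweighted)."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def shift_without(points, index):
    """Distance the mean center moves when points[index] is left out."""
    kept = points[:index] + points[index + 1:]
    return math.dist(mean_center(points), mean_center(kept))

# Hypothetical coordinates; the last point sits far to the south of the rest.
robberies = [(0, 0), (2, 0), (0, 2), (2, 2), (1, -9)]
shifts = [shift_without(robberies, i) for i in range(len(robberies))]
most_influential = max(range(len(robberies)), key=shifts.__getitem__)
print(most_influential, shifts[most_influential])
```

In a small dataset a single distant point can move the mean center noticeably, which is why the effect is worth checking before interpreting its location.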

Residential Burglaries

A qualitative look at a street map of the residential burglaries reveals some general patterns. First, occurrences appear denser in the center of town, the area where the streets are arranged in square blocks, than farther outward. The points outside the center of town are mostly within densely populated residential developments, recognizable by their distinct road patterns. Few burglaries occur on small roads or in isolated neighborhoods.

Spatial statistical analysis shows that the mean center and central feature are very close to each other and located in a downtown area. The standard distance is large, showing that the points are not tightly localized but spread across the area. The directional distribution shows that the data are relatively more variable along the NW-SE axis than along the NE-SW axis. This is due to the clusters of burglaries in the developments off US-180 and -80 in the NW corner and the many points in the large developments in the SE corner.

Robberies

The map of data points representing robberies shows that they are slightly denser in the center than along the edges. Most of the robberies also occurred along, or within easy access of, main roads.

The mean center and central feature are located in the downtown area of the city, just above the west end of Capitol Highway. The standard distance is intermediate and reflects the localization of occurrences in the center. The SE and NW points slightly stretch the directional distribution along that axis, thinning it along the perpendicular axis. This slight difference between the standard distance and the directional distribution indicates that the data are relatively evenly distributed in all directions.

Theft from Autos

When the data points are evaluated strictly from the map, the number of points can become overwhelming. However, if examined closely, clusters can be discerned in the center of town as well as in surrounding neighborhoods. Areas void of any data points stand out amid the heavily dotted map. An aerial photograph provided by Google Maps shows that these areas correspond to large plots or yards where there are no automobiles to burglarize, one of which looks like a horse racetrack.

The mean center and central feature are more or less in the center of town. The wide standard distance shows that the points are scattered throughout the area and are not very localized. Since the directional distribution is only slightly stretched along the NW-SE axis, the data are relatively evenly distributed in all directions.

All crime data

When evaluated together, EDA for these four datasets can be used to describe overall crime in this area of Lincoln, NE. Like the statistics for many of the individual datasets, the mean center and central feature are located in what seems to be the downtown area of Lincoln. The standard distance is relatively large and shows how widespread crime is throughout the town. The directional distribution indicates the denser crime areas in the NW and SE corners of the map.

Spatial statistics for the convenience store robbery series dataset clearly differ from those describing all crime. The mean center of convenience store robberies is located north and slightly west of that of AllCrime. This may be due to the string of robberies along Cornhusker Highway, which significantly affected the mean of a small dataset. Once again, reanalysis with these points removed could confirm or disprove this notion. When comparing distributions, the standard distance of AllCrime is only very slightly smaller, so it conveys the general distribution of convenience store robberies. However, it does not express the greater variability along the NW-SE axis and the lesser variability along the NE-SW axis.

When the AllCrime dataset is compared to that of residential burglaries, it can be seen that the mean centers are very close. The standard distances are very similar, and the directional distribution of residential burglaries is only slightly more variable along the NW-SE axis than that of AllCrime. Therefore, analysis of the AllCrime dataset is relatively adequate for capturing the distribution and variation in residential burglaries.

The mean center of robberies is located west and slightly south of that of the AllCrime dataset. The standard distance of the robbery data is much smaller than that of AllCrime, indicating much more localization of occurrences than for crime overall. Robberies are also much more evenly distributed in all directions than AllCrime. Because of these differences between robberies and all crimes combined, AllCrime statistics are not useful for describing robberies.

Thefts from automobiles and AllCrime have very similar mean centers and standard distances, although the standard distance of AllCrime is very slightly smaller. A small difference lies in the directional variability: the distribution of thefts from automobiles is slightly greater along the NE-SW axis than is that of AllCrime. Since most of these differences are slight, spatial statistics of AllCrime data seem suitable to describe this set of thefts from automobiles.

Overall, AllCrime sufficiently described the central tendency and dispersion of only half of the crime types presented (residential burglaries and thefts from automobiles). It is through EDA of individual datasets that a clearer understanding of the spatial characteristics and patterns of such data is gained. Because of the abundance of data points, qualitative evaluation of the combined data for patterns is much more difficult than it is for each crime type individually. While the AllCrime data usually gave a good general idea of where the mean center is located, it did not always communicate the dispersion and its directional variability accurately. Therefore, in order to interpret patterns and to use the generated maps and statistics most efficiently for applications such as prediction of future attempts and hotspots, it is necessary to separate crimes by type. However, the combined data can be used for applications in which overall crime is of concern.
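
The comparisons made in this section (where a subset's mean center lies relative to the combined one, and whether its standard distance is larger or smaller) can be expressed numerically. The sketch below is an illustration under assumed names and made-up coordinates, with +y taken as north; it is not how the report's comparisons were produced, which were read from the maps.

```python
import math

def mean_center(points):
    """Average x and average y of the points (unweighted)."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def standard_distance(points):
    """Root-mean-square distance of the points from their mean center."""
    cx, cy = mean_center(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / len(points))

def compass_offset(reference, other):
    """Eight-point compass direction of `other` as seen from `reference`.

    If the two locations coincide the bearing is 0 and "N" is returned.
    """
    bearing = math.degrees(math.atan2(other[0] - reference[0],
                                      other[1] - reference[1])) % 360
    labels = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
    return labels[int((bearing + 22.5) // 45) % 8]

def compare(subset, combined):
    """Summarize how a subset differs from the combined dataset."""
    return {
        "offset": compass_offset(mean_center(combined), mean_center(subset)),
        "sd_ratio": standard_distance(subset) / standard_distance(combined),
    }

# Hypothetical data: a small, tight subset north of the bulk of the points.
subset = [(-1, 5), (0, 6), (-2, 7)]
combined = subset + [(-3, -3), (3, -3), (3, 3), (-3, 3)]
print(compare(subset, combined))
```

An `sd_ratio` well below 1 would correspond to a crime type that is more localized than crime overall, as described for robberies above.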


Maple

On the map of maple tree locations alone, some clustering can be noticed in the south and the mid-northern parts of the study area. While there are high densities of maple trees in these areas, there are few trees between them. The maples are not as densely distributed in the north end of the study area.

After carrying out EDA, it can be seen that the mean center and central feature are both located in the sparsely occupied area in the center of the study area. These statistics may have been slightly skewed by the scattered maples in the north end. The standard distance is moderate, indicating that the points are somewhat localized. The directional distribution shows that the data are slightly more variable moving right to left. This variability is emphasized a bit to the east because of the occurrence of more maples in the top right corner.

Hickory

Some patterns can be seen when looking at a map of hickory locations alone. There seems to be a higher density of hickory trees along the left and right edges in the top of the study area. While there are trees located elsewhere, they are less densely spaced.

Exploratory data analysis shows that the mean center and central feature are horizontally centered but lie slightly toward the top of the study area. This reflects the clustering of points seen on the map. The standard distance is relatively large, which shows the widespread distribution of hickory trees across the plot. The directional distribution shows greater variation left to right and less top to bottom because of the clustering at the top.


All Trees

When the spatial distributions of maple and hickory trees are compared, it can be seen that the hickories are more prevalent in the top end of the plot while maples seem to occur more along the bottom. This is reflected in the locations of the mean centers, with that of the hickories above and slightly left of that of the maples. The distribution of maples is more variable vertically while the opposite is true for the hickory trees, which can be seen more clearly in the shape of each type's directional distribution. It can also be seen that while the hickories are spread across the whole study area, the distribution of the maples is more centrally located.

Evaluating these data together as all trees combined yields a mean center that is roughly in the middle of the study area, above and slightly left of that of the maples and below and slightly right of that of the hickories. Maple trees are much less widely distributed than all trees combined, as shown by the standard distance of each. While the directional distribution shows that the combined data are slightly more variable along the diagonal from the top left to the bottom right corner, the opposite is true for the maple data. Comparison of the standard distances of all trees and hickory trees shows that both datasets are similarly variable. However, the directional distribution shows slightly more variability along the aforementioned diagonal in hickories than in all trees combined. Since the study plot is relatively evenly occupied by trees, EDA of the combined data does not take into consideration the differences in centrality and distribution of each species. It is for this reason that separate analysis needs to be done for maples and hickories in order to detect species-specific patterns. Such pattern detection can be applied in cases such as recognizing habitat preferences and resource distribution.
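
The case for separate analysis can be made concrete by computing the same statistic per group and for the combined set. The sketch below groups labeled points and returns a mean center for each label plus one for all points; the record layout, the reserved "all" key, and the coordinates are assumptions for illustration, not the AllTrees data.

```python
from collections import defaultdict

def mean_center(points):
    """Average x and average y of the points (unweighted)."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def mean_centers_by_group(records):
    """records: iterable of (label, x, y).

    Returns {label: mean center} with an extra "all" entry for the
    combined data (so "all" must not be used as a label).
    """
    groups = defaultdict(list)
    for label, x, y in records:
        groups[label].append((x, y))
        groups["all"].append((x, y))
    return {label: mean_center(pts) for label, pts in groups.items()}

# Hypothetical tree locations.
trees = [("maple", 1, 1), ("maple", 3, 1), ("hickory", 1, 5), ("hickory", 3, 7)]
print(mean_centers_by_group(trees))
```

In this toy example the combined mean center falls between the two species' centers and describes neither of them well, which mirrors the report's conclusion.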
