Appendix IX. Spatial Statistics: Exploratory Spatial Data Analysis

  • 8/7/2019 Appendix IX. Spatial Statistics: Exploratory Spatial Data Analysis

    TABLE OF CONTENTS

    I. Introduction
    II. Methods
    III. Results
        A. Figure 1. EDA of CSRobSeries dataset
        B. Figure 2. EDA of Residential Burglaries dataset
        C. Figure 3. EDA of Robberies dataset
        D. Figure 4. EDA of Thefts from Automobiles dataset
        E. Figure 5. EDA of AllCrime dataset
        F. Figure 6. EDA of Maples dataset
        G. Figure 7. EDA of Hickories dataset
        H. Figure 8. EDA of AllTrees dataset
    IV. Conclusions
        A. Convenience Store Robbery Series (CSRobSeries)
        B. Residential Burglaries
        C. Robberies
        D. Thefts from Automobiles
        E. AllCrime
        F. Maples
        G. Hickories
        H. AllTrees

    I. Introduction

    Not only does ArcGIS provide functions for creating and manipulating graphic representations of
    spatial data, but the software also computes and displays spatial statistics. Patterns among data
    points can sometimes be detected simply by viewing them on a map, but the products of these
    statistical functions make it possible to analyze the properties of spatial data quantitatively.
    Exploratory data analysis (EDA) yields measures of both central tendency (mean center and central
    feature) and dispersion (standard distance and directional distribution). These statistics are
    useful for evaluating the average point of occurrence as well as the distribution of points.
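
    The three simplest measures named above can be sketched in a few lines of plain Python. This is
    only an illustration under stated assumptions: the coordinates are hypothetical planar (x, y)
    pairs rather than values from the study datasets, no features are weighted, and the standard
    distance divides by n (some definitions use a different divisor).

```python
import math

def mean_center(points):
    """Return the point whose coordinates are the average x and average y."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def central_feature(points):
    """Return the input point with the smallest total Euclidean distance to all points."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))

def standard_distance(points):
    """Return the root-mean-square distance of the points from their mean center."""
    cx, cy = mean_center(points)
    n = len(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)

# Hypothetical coordinates for illustration only (not from the study data).
pts = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0), (2.0, 1.0)]

center = mean_center(pts)
feature = central_feature(pts)
spread = standard_distance(pts)
```

    For these sample points the mean center is (2.0, 1.8), which is not itself a data point, while
    the central feature is the actual point (2.0, 1.0). The standard distance is about 2.56 units.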

    II. Methods

    ArcGIS was used to load data layers, create maps of the study areas, and perform statistical
    analyses. Roads were identified using the Identify tool. The tools in the Measuring Geographic
    Distributions toolset of the Spatial Statistics toolbox generated the shapes for mean center,
    central feature, standard distance, and directional distribution. No features were weighted. This
    process was used to analyze five sample datasets provided by ESRI: CSRobSeries, Residential
    Burglaries, Robberies, Theft from Autos, and AllTrees. EDA was also performed for all crime data
    combined and all trees combined.

    III. Results

    See following figures.

    Figure 1. EDA of CSRobSeries dataset

    Figure 2. EDA of Residential Burglaries dataset

    Figure 3. EDA of Robberies dataset

    Figure 4. EDA of Theft from Autos dataset

    Figure 5. EDA for all crime datasets combined

    Figure 6. EDA of Maple dataset

    Figure 7. EDA of Hickory dataset

    Figure 8. EDA of all trees combined

    IV. Conclusions

    Convenience Store Robbery Series data

    Before any spatial analysis is conducted, viewing the data points overlaid on a street map makes
    some patterns evident. First, all of the robberies occurred along major streets within easy access
    of highways. Convenience stores are typically located in high-traffic areas for accessibility, so
    this pattern is to be expected. A string of points lies along Cornhusker Highway, a major roadway
    through a commercial district in Lincoln, NE. While these are the only points that are obviously
    clustered, most of the others are also located on main roads or close to major highways. Proximity
    to highways may allow for efficient getaways, which could explain this trend.

    The mean center, or the point defined by the average x and y coordinates, is located about a
    block away from any data point, including the central feature. The mean seems slightly skewed by
    an extreme point in the south along Route 2. In further examination, this point could be removed
    so that its effect could be seen. The standard distance is relatively large, which indicates that
    the convenience store robberies were fairly widespread. The directional distribution, or standard
    deviational ellipse, indicates that there is little variation along the NE-SW axis and variation
    greater than one standard deviation along the NW-SE axis. The points in the NW and SE corners
    caused this stretch of the ellipse.
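
    To make the idea of a standard deviational ellipse concrete, the following sketch derives the
    ellipse's axes and orientation from the covariance of the point coordinates. It is an
    illustration only: the function name and sample coordinates are hypothetical, and GIS packages
    may apply additional scaling or degrees-of-freedom corrections, so the axis lengths here need
    not match the ArcGIS output exactly.

```python
import math

def deviational_ellipse(points):
    """Return (center, major, minor, angle) of a covariance-based deviational ellipse.

    major and minor are the standard deviations along the principal axes;
    angle is the direction of the major axis in degrees, measured
    counterclockwise from east (the +x axis).
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Population variances and covariance of the coordinates.
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    # Eigenvalues of the 2x2 covariance matrix give the variance along each principal axis.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    major = math.sqrt(half_trace + root)
    minor = math.sqrt(max(half_trace - root, 0.0))
    # Orientation of the major axis.
    angle = math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy)) % 180.0
    return (cx, cy), major, minor, angle

# Hypothetical points stretched from the NW corner to the SE corner (x = east, y = north).
sample = [(-3.0, 3.0), (-1.0, 1.0), (1.0, -1.0), (3.0, -3.0), (1.0, 1.0), (-1.0, -1.0)]
center, major, minor, angle = deviational_ellipse(sample)
```

    For these sample points the major axis is oriented at 135 degrees counterclockwise from east,
    that is, along the NW-SE diagonal, and it is roughly three times as long as the minor axis. This
    is the kind of stretch described above for points in the NW and SE corners.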

    Residential Burglaries

    By qualitatively evaluating a street map of the residential burglaries, some general patterns can
    be detected by eye. First, occurrences seem denser in the center of town, the area with streets
    arranged in square blocks, than farther outward. It can also be seen that the points outside the
    center of town are mostly within densely populated residential developments, recognizable by
    their distinct road pattern. Few burglaries occur on small roads or in isolated neighborhoods.

    Spatial statistical analysis shows that the mean center and central feature are very close to
    each other and located in a downtown area. The standard distance is large, showing that the
    points are not tightly localized but spread across the area. The directional distribution shows
    that the data are more variable along the NW-SE axis than along the NE-SW axis. This is due to
    the clusters of burglaries in the developments off of US-180 and -80 in the NW corner and the many
    points in the large developments in the SE corner.

    Robberies

    The map of data points representing robberies shows that they are slightly denser in the center
    than along the edges. It can also be seen that most of the robberies occurred along, or within
    easy access of, main roads.

    The mean center and central feature are located in the downtown area of the city, just above the
    west end of Capitol Highway. The standard distance is intermediate and reflects the localization
    of occurrences in the center. The SE and NW points slightly stretch the directional distribution
    along that axis, thinning it along the opposite one. Because the directional distribution differs
    only slightly from the standard distance, the data are relatively evenly distributed in all
    directions.

    Theft from Autos

    When the data points are evaluated strictly using the map, the number of points can become
    overwhelming. However, if examined closely, clusters can be discerned in the center of town as
    well as in surrounding neighborhoods. Areas void of any data points stand out against the heavily
    dotted map. An aerial photograph provided by Google Maps shows that these areas correspond to
    large plots or yards where there are no automobiles to burglarize, one of which looks like a
    horse racetrack.

    The mean center and central feature are more or less in the center of town. The wide standard
    distance shows that the points are scattered throughout the area and are not very localized.
    Since the directional distribution is only slightly stretched along the NW-SE axis, the data are
    relatively evenly distributed in all directions.

    All crime data

    When evaluated together, EDA for these four datasets can be used to describe overall crime in
    this area of Lincoln, NE. Like the statistics for many of the individual datasets, the mean
    center and central feature are located in what seems to be the downtown area of Lincoln. The
    standard distance is relatively large and shows how widespread crime is throughout the town. The
    directional distribution reflects the denser crime areas in the NW and SE corners of the map.

    Spatial statistics for the convenience store robbery series dataset clearly differ from those
    describing all crime. The mean center of convenience store robberies is located north and
    slightly west of that of AllCrime. This may be due to the string of robberies along Cornhusker
    Highway, which significantly affected the mean of a small dataset. Once again, reanalysis with
    these points removed could confirm or disprove this notion. When comparing distributions, the
    standard distance of AllCrime is only very slightly smaller, so it conveys the general
    distribution of convenience store robberies. However, it does not express their greater
    variability along the NW-SE axis and lesser variability along the NE-SW axis.
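
    The reanalysis suggested above, in which suspected influential points are removed and the mean
    center is recomputed, could be sketched as follows. The helper name shift_without and the
    coordinates are hypothetical and serve only to illustrate the check; they are not taken from the
    CSRobSeries data.

```python
import math

def mean_center(points):
    """Return the point whose coordinates are the average x and average y."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def shift_without(points, excluded):
    """Return (dx, dy, distance) by which the mean center moves when the excluded points are dropped."""
    kept = [p for p in points if p not in excluded]
    (x0, y0), (x1, y1) = mean_center(points), mean_center(kept)
    return x1 - x0, y1 - y0, math.hypot(x1 - x0, y1 - y0)

# Hypothetical small series: four clustered points and one far to the north.
series = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0), (1.5, 9.0)]
dx, dy, moved = shift_without(series, {(1.5, 9.0)})
```

    In this small sample a single distant point pulls the mean center 1.5 units north, so removing
    it shifts the mean center by the same amount back toward the cluster. A shift of that size in a
    small dataset would support the notion that a few points dominate the mean.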

    When comparing the AllCrime dataset to that of residential burglaries, it can be seen that the
    mean centers are very close. The standard distances are very similar, and the directional
    distribution of residential burglaries is only slightly more variable along the NW-SE axis than
    that of AllCrime. Therefore, analysis of the AllCrime dataset is reasonably adequate for
    capturing the distribution and variation of residential burglaries.

    The mean center of robberies is located west and slightly south of that of the AllCrime dataset.
    The standard distance of the robbery data is much smaller than that of AllCrime, indicating much
    greater localization of occurrences than for crime overall. Robberies are also much more evenly
    distributed in all directions than AllCrime. Because of these differences between robberies and
    all crimes combined, AllCrime statistics are not useful for describing robberies.

    Thefts from automobiles and AllCrime have very similar mean centers and standard distances,
    although the standard distance of AllCrime is very slightly smaller. A small difference lies in
    the directional variability: the distribution of thefts from automobiles is slightly more
    variable along the NE-SW axis than is that of AllCrime. Since these differences are slight,
    spatial statistics of the AllCrime data seem suitable for describing this set of thefts from
    automobiles.

    Overall, AllCrime sufficiently described the central tendency and dispersion of only half of the
    crime types presented. It is through EDA of individual datasets that a clearer understanding of
    the spatial characteristics and patterns of such data is gained. Because of the abundance of data
    points, qualitative evaluation of the combined data for patterns is much more difficult than it
    is for each crime type individually. While the AllCrime data usually gave a good general idea of
    where the mean center is located, they did not always communicate the dispersion and its
    directional variability accurately. Therefore, in order to interpret patterns and to use the
    generated maps and statistics most effectively for applications such as prediction of future
    attempts and hotspots, it is necessary to separate crimes by type. However, the combined data can
    be used for applications in which overall crime is of concern.
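
    The comparisons in this section were made by eye from the maps. One hedged way to quantify them
    is to report, for each subset, how far its mean center lies from the combined mean center and how
    its standard distance compares with the combined one. The function compare_to_combined, the group
    names, and the coordinates below are all hypothetical.

```python
import math

def mean_center(points):
    """Return the point whose coordinates are the average x and average y."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def standard_distance(points):
    """Return the root-mean-square distance of the points from their mean center."""
    cx, cy = mean_center(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / len(points))

def compare_to_combined(groups):
    """For each group, return its mean-center offset from, and standard-distance ratio to, all groups pooled."""
    combined = [p for pts in groups.values() for p in pts]
    center_all = mean_center(combined)
    sd_all = standard_distance(combined)
    return {
        name: {
            "center_offset": math.dist(mean_center(pts), center_all),
            "sd_ratio": standard_distance(pts) / sd_all,
        }
        for name, pts in groups.items()
    }

# Hypothetical groups: one tightly clustered, one widely dispersed.
groups = {
    "clustered": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "dispersed": [(-5.0, -5.0), (6.0, -5.0), (-5.0, 6.0), (6.0, 6.0)],
}
report = compare_to_combined(groups)
```

    A standard-distance ratio well below 1 marks a subset that is much more localized than the
    pooled data, as was observed for robberies, and in that case the combined statistics describe
    the subset poorly even when the mean centers nearly coincide.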

    Maple

    When evaluating the map of maple tree locations alone, some clustering can be noticed in the
    southern and mid-northern parts of the study area. While there are high densities of maple trees
    in these areas, there are few trees between them. The maples are not as densely distributed in
    the north end of the study area.

    After carrying out EDA, it can be seen that the mean center and central feature are both located
    in the sparsely occupied area in the center of the study area. These statistics may have been
    slightly skewed by the scattered maples in the north end. The standard distance is moderate,
    indicating that the points are somewhat localized. The directional distribution shows that the
    data are slightly more variable in the left-right direction. This variability is emphasized a
    bit to the east because of the occurrence of more maples in the top right corner.

    Hickory

    Some patterns can be seen when looking at a map of hickory locations alone. There seems to be a
    higher density of hickory trees along the left and right edges at the top of the study area.
    While there are trees located elsewhere, they are less densely spaced.

    Exploratory data analysis shows that the mean center and central feature are horizontally
    centered but lie slightly toward the top of the study area. This reflects the clustering of
    points seen on the map. The standard distance is relatively large, which shows the widespread
    distribution of hickory trees across the plot. The directional distribution shows greater
    variation left to right and less top to bottom because of the clustering at the top.

    All Trees

    When the spatial distributions of maple and hickory trees are compared, it can be seen that the
    hickories are more prevalent in the top end of the plot while maples seem to occur more toward
    the bottom. This is reflected in the locations of their mean centers, with that of the hickories
    lying above and slightly left of that of the maples. The distribution of maples is more variable
    vertically while the opposite is true for the hickory trees, which can be seen more clearly in
    the shape of each type's directional distribution. It can also be detected that while the
    hickories are spread across the whole study area, the distribution of the maples is more
    centrally located.

    Evaluating these data together as all trees combined yields a mean center that is roughly in the
    middle of the study area, above and slightly left of that of the maples and below and slightly
    right of that of the hickories. Maple trees are much less widely distributed than all trees
    combined, as shown by the standard distance of each. While the directional distribution shows
    that the combined data are slightly more variable along the diagonal from the top left to the
    bottom right corner, the opposite is true for the maple data. Comparison of the standard
    distances of all trees and hickory trees shows that both datasets are similarly variable.
    However, the directional distribution shows slightly more variability along the aforementioned
    diagonal in hickories than in all trees combined. Since the study plot is relatively evenly
    occupied by trees, EDA of the combined data does not take into consideration the differences in
    centrality and distribution of each species. It is for this reason that separate analyses need to
    be done for maples and hickories in order to detect species-specific patterns. Such pattern
    detection can be applied in cases such as recognizing habitat preferences and resource
    distribution.