Exploring Spatial Patterns Iap2013

  • 8/12/2019 Exploring Spatial Patterns Iap2013

    EXPLORING SPATIAL PATTERNS IN YOUR DATA

    OBJECTIVES

    Learn how to examine your data using the Geostatistical Analysis tools in ArcMap.

    Learn how to use descriptive statistics in ArcMap and GeoDa to analyze data.

    Be able to identify Geostatistical Analysis tools that can be used for further analysis.

    WHY EXPLORE YOUR DATA?

    It allows you to better select an appropriate tool to analyze your data.

    If you skip exploring your data, you may miss key information about it, which may lead to incorrect conclusions and decisions.

    GEODA VS. ARCMAP

    GeoDa: free, open-source, simple software specifically for statistical analysis

    ArcMap: proprietary GIS software that can perform statistical analysis along with hundreds of other analyses

    GEODA VS. ARCMAP

    With ArcMap you can view several data layers at once. In GeoDa, you view only one data layer.

    Some tools are found in both programs, while some are found in only one.

    EXPLORE THE LOCATION OF YOUR DATA

    EXPLORE THE LOCATION OF YOUR DATA

    Explore:

    size of the study area

    mean

    median

    direction in which the data are oriented

    You will see where data are clustered relative to the rest of the data.

    MEAN CENTER

    The geographic center for a set of features.

    Constructed from the average x and y values for the input feature centroids (middle points, if input features are polygons).
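
    As a rough illustration of the averaging described above, here is a minimal Python sketch. The `mean_center` helper and the sample centroids are hypothetical and are not ArcMap's implementation.

```python
from statistics import fmean

def mean_center(points):
    """Return the (x, y) mean center of a list of (x, y) centroids."""
    xs, ys = zip(*points)
    return fmean(xs), fmean(ys)

# Hypothetical centroids: four clustered features and one distant outlier.
centroids = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (21.0, 21.0)]
center = mean_center(centroids)
print(center)  # (5.0, 5.0)
```

    Note how the single distant feature pulls the mean center well outside the cluster of the other four.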

    MEDIAN CENTER

    Median Center is robust to outliers.

    Uses an algorithm to find the point that minimizes travel from it to all other features in the dataset.

    At each step (t) in the algorithm, a candidate Median Center (Xt, Yt) is found and refined until it represents the location that minimizes the Euclidean distance d to all features (i) in the dataset.
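
    One common way to carry out this kind of iterative refinement is a Weiszfeld-style update, sketched below in plain Python. This is an assumption for illustration: the slide does not name the algorithm ArcMap uses, and the `median_center` helper, its tolerance, and the sample points are hypothetical.

```python
import math

def median_center(points, tol=1e-9, max_iter=1000):
    """Approximate the point minimizing total Euclidean distance to all points."""
    # Use the mean center as the first candidate (X0, Y0).
    x = sum(px for px, _ in points) / len(points)
    y = sum(py for _, py in points) / len(points)
    for _ in range(max_iter):
        num_x = num_y = denom = 0.0
        for px, py in points:
            d = math.hypot(px - x, py - y)
            if d < 1e-12:
                continue  # simplification: skip a feature the candidate sits on
            num_x += px / d
            num_y += py / d
            denom += 1.0 / d
        if denom == 0.0:
            break  # every feature coincides with the candidate
        new_x, new_y = num_x / denom, num_y / denom
        moved = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if moved < tol:
            break  # the candidate has stopped moving
    return x, y

# Hypothetical centroids: four clustered features and one distant outlier.
centroids = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (21.0, 21.0)]
mx, my = median_center(centroids)
print(round(mx, 3), round(my, 3))
```

    For these points the result stays inside the 2 x 2 cluster, whereas the mean center of the same points is (5, 5), which shows the robustness to outliers mentioned above.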

    DIRECTION DISTRIBUTION (STANDARD DEVIATIONAL ELLIPSE)

    Standard deviational ellipses summarize the spatial characteristics of geographic features: central tendency, dispersion, and directional trends.

    The ellipse allows you to see if the distribution of features is elongated and hence has a particular orientation.

    When the underlying spatial pattern of features is concentrated in the center with fewer features toward the periphery (a spatial normal distribution):

    a one standard deviation ellipse polygon will cover approximately 68 percent of the features

    two standard deviations will contain approximately 95 percent of the features

    three standard deviations will cover approximately 99 percent of the features
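
    The sketch below shows one way to derive an ellipse's center, orientation, and axis lengths from point coordinates, using the principal axes of the coordinate covariance. It is an assumption-laden illustration: the `deviational_ellipse` helper is hypothetical, the angle is measured counterclockwise from the x-axis, and ArcMap's tool may use different angle conventions or scaling factors.

```python
import math

def deviational_ellipse(points):
    """Return (center, angle_degrees, major_sd, minor_sd) for (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Orientation of the major axis, counterclockwise from the x-axis.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    # Variances along the major and minor axes
    # (eigenvalues of the 2 x 2 covariance matrix).
    half_sum = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    major_sd = math.sqrt(half_sum + spread)
    minor_sd = math.sqrt(max(half_sum - spread, 0.0))
    return (mean_x, mean_y), math.degrees(angle), major_sd, minor_sd

# Hypothetical features stretched along the line y = x.
features = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (1.0, 2.0), (2.0, 1.0)]
center, angle, major_sd, minor_sd = deviational_ellipse(features)
print(center, round(angle, 1), round(major_sd, 3), round(minor_sd, 3))
```

    Multiplying both axis lengths by 2 or 3 gives the two and three standard deviation ellipses in this simplified version.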

    EXPLORE THE VALUES OF YOUR DATA

    NORMAL DISTRIBUTION

    Some analysis tools assume a normal distribution:

    Mean and median are similar

    Data are symmetrical

    DATA FREQUENCY USING HISTOGRAMS

    DATA DISTRIBUTION USING A QQ PLOT

    [Figure panels: a normally distributed dataset; many characteristics of a normal dataset; not normal]

    A normal QQ plot shows the relationship of your data to a normal distribution line.
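
    To make the idea concrete, the following sketch computes the points of a normal QQ plot with Python's standard library. The `normal_qq_points` helper, the `(rank - 0.5) / n` plotting position, and the sample values are assumptions for illustration; ArcMap and GeoDa may use other conventions.

```python
from statistics import NormalDist, fmean, pstdev

def normal_qq_points(values):
    """Return (expected normal quantile, standardized observed value) pairs."""
    data = sorted(values)
    n = len(data)
    mean, sd = fmean(data), pstdev(data)
    standard_normal = NormalDist()
    points = []
    for rank, value in enumerate(data, start=1):
        p = (rank - 0.5) / n  # plotting position (one common convention)
        expected = standard_normal.inv_cdf(p)
        observed = (value - mean) / sd
        points.append((expected, observed))
    return points

# Hypothetical skewed sample with one very large value.
skewed = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 5.0, 30.0]
for expected, observed in normal_qq_points(skewed):
    print(f"{expected:6.2f} {observed:6.2f}")
```

    If the data were normal, the two columns would be roughly equal (the points would fall on the line y = x). Here the largest value sits far above its expected quantile.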

    BOX PLOT

    Displays the median and interquartile range (IQ) (25%-75%)

    Hinge = multiple of interquartile range

    MAPS

    For examining data values and frequencies:

    Quantile Map

    Natural breaks

    Equal intervals

    For finding outliers:

    Percentile Map

    Box Map

    Standard Deviation Map

    QUANTILE MAP

    Displays the distribution of values in categories with an equal number of observations in each category.

    EQUAL INTERVAL MAP

    Sets the value ranges in each category equal in size. The entire range of data values is divided equally into however many categories have been chosen.
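
    The difference between quantile and equal interval classification is easiest to see on skewed data. The sketch below is a simplified illustration: the helper names and sample values are hypothetical, ties in the quantile version are broken arbitrarily, and the equal interval version assumes the values are not all identical.

```python
def quantile_classes(values, k):
    """Assign classes 0..k-1 so each class holds (nearly) the same count."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, i in enumerate(order):
        classes[i] = rank * k // len(values)
    return classes

def equal_interval_classes(values, k):
    """Assign classes 0..k-1 by splitting the value range into k equal widths."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    # The maximum value lands exactly on the top edge, so clamp it to k - 1.
    return [min(int((v - lo) / width), k - 1) for v in values]

def class_counts(classes, k):
    return [classes.count(c) for c in range(k)]

# Hypothetical attribute values with one very large observation.
values = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(class_counts(quantile_classes(values, 3), 3))        # [3, 3, 3]
print(class_counts(equal_interval_classes(values, 3), 3))  # [8, 0, 1]
```

    The quantile scheme spreads observations evenly across classes, while the equal interval scheme leaves the middle class empty and isolates the extreme value.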

    NATURAL BREAKS MAP

    Seeks to reduce the variance within classes and maximize the variance between classes.
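
    The objective above can be illustrated with a brute-force search that tries every way of cutting the sorted values into k classes and keeps the split with the smallest within-class sum of squares. This is a sketch for small datasets only: production tools typically use a more efficient optimization, and the function names and sample values here are hypothetical.

```python
from itertools import combinations

def within_class_ss(group):
    """Sum of squared deviations from the group mean."""
    mean = sum(group) / len(group)
    return sum((v - mean) ** 2 for v in group)

def natural_breaks(values, k):
    """Split sorted values into k classes minimizing within-class variation."""
    data = sorted(values)
    n = len(data)
    best_cost, best_bounds = None, None
    # Each combination picks k - 1 cut positions between sorted values.
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        cost = sum(within_class_ss(data[a:b]) for a, b in zip(bounds, bounds[1:]))
        if best_cost is None or cost < best_cost:
            best_cost, best_bounds = cost, bounds
    return [data[a:b] for a, b in zip(best_bounds, best_bounds[1:])]

# Hypothetical values with three obvious groups.
values = [1, 2, 3, 10, 11, 12, 50, 51, 52]
print(natural_breaks(values, 3))  # [[1, 2, 3], [10, 11, 12], [50, 51, 52]]
```

    Because the total variation is fixed, minimizing the within-class part is equivalent to maximizing the between-class part.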

    OTHER EXPLORATORY METHODS

    Scatter Plot (2 variables)

    Parallel coordinate plot (a pattern of lines is drawn that connects the coordinates of each observation across the variables on parallel x-axes)

    DETECT OUTLIERS

    OUTLIERS

    Outliers can reveal mistakes, unusual occurrences, and shift points in data patterns (for example, a valley in a mountain range).

    You should use more than one method to find outliers because some techniques will only highlight data values near the two ends of your range.

    PERCENTILE MAP

    Groups ranked data into 6 categories

    Lowest and highest 1% are potential outliers

    BOX MAP

    Groups data into 4 categories, plus 2 outlier categories, one at each end.

    Data are outliers if they lie more than 1.5 or 3 times the IQ beyond the quartiles.

    Detects outliers with more certainty than a percentile map.
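
    The box map rule can be sketched as follows. The `box_map_classes` helper, its labels, and the sample values are hypothetical, and the quartiles come from `statistics.quantiles` with its default method, which may differ slightly from the quartile convention GeoDa uses.

```python
from statistics import quantiles

def box_map_classes(values, hinge=1.5):
    """Label each value with one of four quartile classes or an outlier class."""
    q1, q2, q3 = quantiles(values, n=4)
    iq = q3 - q1                   # interquartile range
    lower_fence = q1 - hinge * iq  # below this: lower outlier
    upper_fence = q3 + hinge * iq  # above this: upper outlier
    labels = []
    for v in values:
        if v < lower_fence:
            labels.append("lower outlier")
        elif v < q1:
            labels.append("< 25%")
        elif v < q2:
            labels.append("25% - 50%")
        elif v < q3:
            labels.append("50% - 75%")
        elif v <= upper_fence:
            labels.append("> 75%")
        else:
            labels.append("upper outlier")
    return labels

# Hypothetical attribute values; 40 is far from the rest.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 40]
for v, label in zip(values, box_map_classes(values, hinge=1.5)):
    print(v, label)
```

    Passing `hinge=3.0` applies the stricter rule, so only more extreme values are flagged.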

    SEMIVARIOGRAM CLOUD

    When points closer together have greater differences in their values, this may indicate an outlier in the data.

    The selected points may be outliers.
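
    A semivariogram cloud plots, for every pair of sample points, the distance between them against half the squared difference of their values. The sketch below builds that cloud and counts how often each point appears in close pairs with very different values. The helper name, the sample data, and the distance and semivariance thresholds are all hypothetical choices for illustration.

```python
import math
from collections import Counter
from itertools import combinations

def semivariogram_cloud(samples):
    """samples: list of (x, y, value). Returns (distance, semivariance, i, j) per pair."""
    cloud = []
    for (i, a), (j, b) in combinations(enumerate(samples), 2):
        dist = math.hypot(a[0] - b[0], a[1] - b[1])
        gamma = 0.5 * (a[2] - b[2]) ** 2
        cloud.append((dist, gamma, i, j))
    return cloud

# Hypothetical samples; the point at index 3 has an unusually high value.
samples = [(0, 0, 10.0), (1, 0, 11.0), (0, 1, 10.5), (1, 1, 30.0), (2, 1, 11.5)]
cloud = semivariogram_cloud(samples)

# Count how often each point occurs in close pairs with very different values.
suspects = Counter()
for dist, gamma, i, j in cloud:
    if dist <= 1.5 and gamma > 50.0:
        suspects[i] += 1
        suspects[j] += 1
print(suspects.most_common(1))  # [(3, 4)]
```

    A point that shows up in many such pairs, like index 3 here, is a candidate outlier worth inspecting.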

    VORONOI MAP

    Cluster Voronoi maps show spatial outliers in your data; simple Voronoi maps can pinpoint data values that are many class breaks removed from surrounding polygons.

    The gray polygons may be outliers.

    HISTOGRAM

    Values in the last bars to the left or right, if far removed from the adjacent values, may indicate outliers.

    NORMAL QQ PLOT

    Values at the tails of a normal QQ plot can also be outliers. This can happen when the tail values do not fall along the reference line.

    BOX PLOT

    Points outside the hinges (represented by the black, horizontal lines) may be outliers.

    EXPLORE SPATIAL RELATIONSHIPS IN YOUR DATA

    SPATIAL AUTOCORRELATION

    Everything is related, but objects closer together are more related than objects farther apart.

    Explore using a semivariogram graph or cloud

    Can also be explored using Moran's I and Getis-Ord G statistics

    Height (sill) = variation between data values.

    Range = distance between points at which the semivariogram flattens out.

    As the distance between points increases (up to the range), the height should increase, since points farther away from each other are not as related, so there should be more variation.

    If a semivariogram is a horizontal line, there is no spatial autocorrelation.
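
    The semivariogram graph averages the pairwise semivariances within distance bins (lags). Below is a minimal sketch, assuming a hypothetical `empirical_semivariogram` helper, a fixed lag width, and a made-up transect whose values rise steadily with position.

```python
import math
from itertools import combinations

def empirical_semivariogram(samples, lag_width, n_lags):
    """Average semivariance per distance bin; samples are (x, y, value) tuples."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for a, b in combinations(samples, 2):
        dist = math.hypot(a[0] - b[0], a[1] - b[1])
        k = int(dist // lag_width)  # index of the bin this pair falls in
        if k < n_lags:
            sums[k] += 0.5 * (a[2] - b[2]) ** 2
            counts[k] += 1
    # Report (bin midpoint, mean semivariance) for bins that contain pairs.
    return [((k + 0.5) * lag_width, sums[k] / counts[k])
            for k in range(n_lags) if counts[k]]

# Hypothetical transect: six points one unit apart, value equal to position.
transect = [(float(i), 0.0, float(i)) for i in range(6)]
result = empirical_semivariogram(transect, lag_width=1.0, n_lags=4)
print(result)  # [(1.5, 0.5), (2.5, 2.0), (3.5, 4.5)]
```

    Here the semivariance keeps climbing with distance because the made-up values follow a steady trend. With real autocorrelated data you would expect the curve to level off at the sill once the distance reaches the range.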

    VARIATION IN YOUR DATA

    Many spatial statistics analysis techniques assume your data are stationary, meaning the relationship between two points and their values depends on the distance between them, not their exact location.

    Explore variation using a Voronoi map.

    A Voronoi map is created by defining Thiessen polygons around each point in your dataset.

    Any location inside a polygon is closer to that polygon's data point than to any other data point.

    This allows you to explore the variation of each sample point based on its relationship to surrounding sample points.

    A SIMPLE VORONOI MAP

    A simple Voronoi map shows the data value at each location. The map is symbolized using a geometrical interval classification. This will show the variation in data values across your entire dataset.

    Green = little local variation

    Orange and Red = greater local variation

    TYPES OF VORONOI MAPS

    Simple: The value assigned to a polygon is the value recorded at the sample point within that polygon.

    Mean: The value assigned to a polygon is the mean value that is calculated from the polygon and its neighbors.

    Mode: All polygons are categorized using five class intervals. The value assigned to a polygon is the mode (most frequently occurring class) of the polygon and its neighbors.

    Cluster: All polygons are categorized using five class intervals. If the class interval of a polygon is different from each of its neighbors, the polygon is colored gray and put into a sixth class to distinguish it from its neighbors.

    Entropy: All polygons are categorized using five classes based on a natural grouping of data values (smart quantiles). The value assigned to a polygon is the entropy that is calculated from the polygon and its neighbors.

    Entropy = $-\sum_i p_i \log p_i$
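
    A small sketch of the entropy calculation follows. It assumes that p_i is the proportion of polygons (the polygon plus its neighbors) falling in class i and that the logarithm is base 2. The slide states neither, so treat both as assumptions; the `class_entropy` helper is hypothetical.

```python
import math
from collections import Counter

def class_entropy(class_labels, base=2):
    """Entropy of the class labels of a polygon and its neighbors."""
    n = len(class_labels)
    counts = Counter(class_labels)
    # p = share of polygons in each class (assumed meaning of p_i).
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())

# A polygon and four neighbors, all in the same class: no local variation.
uniform = class_entropy([3, 3, 3, 3, 3])
# A polygon and four neighbors, each in a different class: maximum variation.
mixed = class_entropy([1, 2, 3, 4, 5])
print(uniform == 0, round(mixed, 3))
```

    Low entropy means the neighborhood is homogeneous, and high entropy means the neighboring values fall in many different classes.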

    EXPLORE TRENDS IN YOUR DATA

    TREND ANALYSIS

    You can use the trend analysis tool in ArcMap to visually compare the trend lines with any patterns in your data.

    When exploring trends, your data locations are mapped along the x- and y-axes. The values of each data location are mapped as height (z-axis).

    Trends are analyzed based on direction and on the order of the line that fits the trend. The trend line is a mathematical function, or polynomial, that describes the variation in the data.

    These polynomials show a clear curve, indicating a second-order trend in the data.

    You can determine whether the order of the polynomial fits your data based on the shape created by the line. A second-order polynomial will appear as an upward or a downward curve (known as a parabola).
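
    Outside ArcMap, one way to check which polynomial order fits is to fit each order by least squares along one direction and compare the leftover error. The sketch below does this in plain Python; the `fit_polynomial` and `sse` helpers and the sample values are hypothetical, and the tool's own fitting procedure may differ.

```python
def fit_polynomial(xs, zs, order):
    """Least-squares coefficients [c0, c1, ..., c_order] of z = sum(c_p * x**p)."""
    n = order + 1
    # Normal equations A c = b: A[r][c] = sum x^(r+c), b[r] = sum z * x^r.
    A = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    b = [sum(z * x ** r for x, z in zip(xs, zs)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in reversed(range(n)):
        known = sum(A[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - known) / A[r][r]
    return coeffs

def sse(xs, zs, coeffs):
    """Sum of squared differences between the data and the fitted curve."""
    return sum((z - sum(c * x ** p for p, c in enumerate(coeffs))) ** 2
               for x, z in zip(xs, zs))

# Hypothetical values along the x direction that follow a parabola.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
zs = [(x - 3.0) ** 2 + 1.0 for x in xs]
first = fit_polynomial(xs, zs, 1)
second = fit_polynomial(xs, zs, 2)
print(round(sse(xs, zs, first), 3), round(sse(xs, zs, second), 3))
```

    The first-order line leaves a large error while the second-order curve fits these values almost exactly, which matches the visual cue of a parabola described above.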

    SELECTING AN ANALYSIS TECHNIQUE

    Each of the following techniques is a type of interpolation. Interpolation creates surfaces based on spatially continuous data.

    Each surface uses the values and locations of your points to create (or interpolate) the values for the remaining points in the surface.

    GEOSTATISTICAL INTERPOLATION

    Creates surfaces using the relationships between your data locations and their values.

    Predicts values based on your existing data.

    Assumptions:

    Data is not clustered. (Simple kriging technique has a declustering option.)

    Data is normally distributed. (Transformation options are available.)

    Data is stationary (no local variation).

    Data is autocorrelated.

    Data has no local trends. (You can remove trends from data as part of the interpolation process.)

    GLOBAL DETERMINISTIC INTERPOLATION

    Creates surfaces using the existing values at each location.

    Uses your entire dataset to create your surface.

    Assumptions:

    Outliers have been removed from the data.

    Global trends exist in the data.

    LOCAL DETERMINISTIC INTERPOLATION

    Uses several subsets, or neighborhoods, within an entire dataset to create the different components of the surface.

    Assumption:

    Data is normally distributed.

    INVERSE DISTANCE WEIGHTED INTERPOLATION (IDW)

    A type of local deterministic interpolation.

    Assumptions:

    Data is not clustered.

    Data is autocorrelated.
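
    The core idea of IDW is that each prediction is a weighted average of the sample values, with weights that shrink as distance grows. The sketch below is a minimal illustration that uses all samples rather than a local neighborhood; the `idw` helper, the default power of 2, and the sample points are assumptions, not ArcMap's settings.

```python
import math

def idw(samples, x, y, power=2.0):
    """Predict a value at (x, y) from (sx, sy, value) samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        d = math.hypot(sx - x, sy - y)
        if d == 0.0:
            return value  # exactly on a sample point: return its value
        w = 1.0 / d ** power  # nearer samples get larger weights
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total

# Two hypothetical samples two units apart.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
print(idw(samples, 1.0, 0.0))  # 15.0 (midpoint: equal weights)
print(round(idw(samples, 0.5, 0.0), 3))
```

    The prediction at the midpoint is the plain average, while a location nearer the first sample is pulled toward its value. A larger `power` makes the surface follow the nearest samples more closely.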

    OTHER SPATIAL STATISTICAL TESTS

    Tests for spatial autocorrelation

    Getis-Ord General G and Global Moran's I (to determine overall clustering and dispersion of values)

    Hot Spot Analysis (Getis-Ord Gi*) and Anselin's Local Moran's I (to determine specific clusters of high and low values)

    Regression

    Used to evaluate relationships between two or more feature attributes. Are location, crime rates, racial make-up, and income related to housing values in a census tract?