Fundamentals of crime mapping chapter 8
Fundamentals of Crime Mapping
A Brief Review of Statistics
Understand the difference between qualitative and quantitative data.
Define and explain levels of measurement including nominal, ordinal, interval, and ratio.
Understand the difference between discrete and continuous variables.
Understand descriptive statistics, including typical measures of central tendency and dispersion.
Understand inferential statistics, including typical tests of significance and measures of association.
Understand what a regression model is and how it works.
Understand the limitations of statistics and how their improper application can yield misleading results.
Define and explain classification in crime mapping and be able to identify strengths and weaknesses of each method.
Objectives
Qualitative
◦ Yields narrative-oriented information
◦ Park, Blue, Yes, Tall, Short, etc.
Quantitative
◦ Produces number-oriented information
◦ Key factors or “variables”
Types of Research/Data
Ratio
◦ Highest level
◦ Can be reclassified to any of the other levels
◦ -∞ to +∞
Interval
◦ Precise value of a measure is known and thus can also be ranked
◦ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Ordinal
◦ Rank-ordered nominal data; the order can be important
◦ Officer, Sergeant, Lt, Commander, Major, Chief
Nominal
◦ Male, Female
Types of Data
Nominal
◦ Order not important
◦ Categories must be mutually exclusive and exhaustive
◦ African American, Caucasian, Hispanic, Native American, Asian, Other
◦ Dichotomous: Caucasian, Non-Caucasian
Types of Data
Ordinal
◦ Categorical or numerical data that can be ranked, but the precise value is not known
◦ Likert scale example
Types of Data
Traits, concepts, and ideas in criminal justice can be difficult to operationalize, or measure.
What is your annual household income?
1. Less than $20,000
2. Between $20,000 and $40,000
3. Between $40,001 and $60,000
4. Between $60,001 and $80,000
5. More than $80,000
I feel safe walking in my neighborhood alone at night
1 - Strongly agree
2 - Agree
3 - Neutral
4 - Disagree
5 - Strongly disagree
6 - Don’t know
Validity
◦ A variable accurately reflects the trait or concept it is measuring
Reliability
◦ The measure is consistently representative across people, places, and time
Types of Data
Interval
◦ What is your annual household income? __________________
◦ Ranking possible and precise value known
◦ 112 burglaries occurred in beat 32
Types of Data
Ratio
◦ Treated the same as interval data
◦ 112.23 burglaries occurred on average in beat 32
◦ Can we have .23 of a burglary?
Types of Data
$16095.32, $17262.67, $24262.78, $26095.32, $27262.67, $32262.78, $33095.32, $35262.67, $36262.78, $36095.32, $40262.67, $41262.78, $52095.32, $55262.67, $68262.78
Types of Data
Ratio
$16095.00, $17262.00, $24262.00, $26095.00, $27262.00, $32262.00, $33095.00, $35262.00, $36262.00, $36095.00, $40262.00, $41262.00, $52095.00, $55262.00, $68262.00
Interval
$0 - $25,000; $25,001 - $35,000; $35,001 - $45,000; $45,001 - $55,000; $55,001 - $65,000; Over $65,000
Ordinal
Below $35,000; Over $35,000
Nominal
Frequency Distributions
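As a rough illustration of how exact dollar amounts can be collapsed into a frequency distribution, the sketch below counts the slide's income values within the slide's dollar ranges. The names `income_range`, `upper_bounds`, and `labels` are hypothetical helpers introduced only for this example.

```python
from bisect import bisect_left
from collections import Counter

# Household incomes listed on the slide, in dollars.
incomes = [16095.00, 17262.00, 24262.00, 26095.00, 27262.00,
           32262.00, 33095.00, 35262.00, 36262.00, 36095.00,
           40262.00, 41262.00, 52095.00, 55262.00, 68262.00]

# Upper bound of every range except the open-ended last one.
upper_bounds = [25_000, 35_000, 45_000, 55_000, 65_000]
labels = ["$0 - $25,000", "$25,001 - $35,000", "$35,001 - $45,000",
          "$45,001 - $55,000", "$55,001 - $65,000", "Over $65,000"]

def income_range(income):
    """Return the label of the range that contains the income."""
    return labels[bisect_left(upper_bounds, income)]

# Frequency distribution: how many incomes fall in each range.
frequency = Counter(income_range(x) for x in incomes)
for label in labels:
    print(f"{label:<20}{frequency[label]}")

# Collapsing further into two categories discards even more detail.
below_35k = sum(1 for x in incomes if x < 35_000)
over_35k = len(incomes) - below_35k
print("Below $35,000:", below_35k, "Over $35,000:", over_35k)
```

Each step down (exact value, range, two categories) is easy to compute from the level above, but it cannot be reversed.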
Discrete
◦ Variables that cannot be subdivided
◦ The number of persons living in a household is a discrete variable. For example, there cannot be 2.3 persons living in a household. There can be 2, or there can be 3, but not 2.3.
Types of Data
Continuous
◦ Can be subdivided; theoretically they can be subdivided an infinite number of times.
◦ Time, for example: Days, Hrs, Mins, Secs, Nanosecs, etc.
Rates
◦ Violent crimes per 100,000 population
◦ Rate = Violent Crimes / (Population / 100,000)
Types of Data
Ratios
◦ One count “per” another, e.g., property crimes per violent crime
◦ Violent crimes = 10
◦ Property crimes = 300
◦ PC/VC = 300/10 = 30
◦ For every one violent crime, there are 30 property crimes
Percent Change
◦ For comparing time periods
◦ ((New - Old) / Old) * 100
◦ 2009 property crimes = 2567
◦ 2008 property crimes = 2655
◦ Percent change = (2567 - 2655) / 2655 = -0.033, and -0.033 * 100 = -3.3%
Types of Data
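The rate, ratio, and percent-change formulas can be sketched as three small functions. The function names are hypothetical, and the population and crime count passed to `rate_per_100k` are made-up values for illustration; the other inputs come from the slides.

```python
def rate_per_100k(crimes, population):
    """Crimes per 100,000 residents."""
    return crimes / (population / 100_000)

def ratio(numerator, denominator):
    """How many of the numerator occur for each one of the denominator."""
    return numerator / denominator

def percent_change(new, old):
    """Percent change from the old period to the new period."""
    return (new - old) / old * 100

# Slide example: 300 property crimes for 10 violent crimes.
print(ratio(300, 10))                         # 30.0

# Slide example: 2008 -> 2009 property crimes.
print(round(percent_change(2567, 2655), 1))   # -3.3

# Hypothetical: 450 violent crimes in a city of 150,000 people.
print(rate_per_100k(450, 150_000))            # 300.0
```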
Measures of Central Tendency
◦ Mean or Average: the average of a distribution of values
◦ Mode: the most often found value in a distribution
◦ Median: the middle value in a distribution
Descriptive Statistics
25, 55, 56, 65, 72, 82, 82, 84, 90, 97
Exam Scores
Count = 10
Average = 70.8
Mode = 82
Median = 77
Median = 72 + (82 - 72)/2 = 72 + 5 = 77
Bi-Modal
Descriptive Statistics
25, 55, 55, 65, 72, 82, 82, 84, 90, 97
Exam Scores
Count = 10
Average = 70.7
Mode = 55 & 82
Median = 77
Median = 72 + (82 - 72)/2 = 72 + 5 = 77
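The figures on the two exam-score slides can be checked with Python's standard `statistics` module. This is only a sketch; the variable names are chosen for this example.

```python
import statistics

# First slide: a single mode (82).
scores_a = [25, 55, 56, 65, 72, 82, 82, 84, 90, 97]
# Second slide: bi-modal (55 and 82 both appear twice).
scores_b = [25, 55, 55, 65, 72, 82, 82, 84, 90, 97]

mean_a = statistics.mean(scores_a)        # 70.8
median_a = statistics.median(scores_a)    # 77.0, halfway between 72 and 82
modes_a = statistics.multimode(scores_a)  # [82]

mean_b = statistics.mean(scores_b)        # 70.7
median_b = statistics.median(scores_b)    # 77.0
modes_b = statistics.multimode(scores_b)  # [55, 82]

print(mean_a, median_a, modes_a)
print(mean_b, median_b, modes_b)
```

With an even count, the median is the midpoint of the two middle values, which is what the slide's "72 + 5" shorthand works out.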
Mean
◦ Should not be used when the distribution is greatly “skewed,” as with most crime data
◦ Use the median instead where it makes sense
Descriptive Statistics
Positive or Right Skewed
Almost normal
Negative or Left Skewed
Measures of Variance or Dispersion
◦ Range: the distance between the lowest and highest score
◦ Interquartile range: the distance between the 25th and 75th percentile
◦ Variance: the average squared distance of each score in a distribution from the mean of the distribution
◦ Standard deviation: the square root of the variance; roughly the typical distance of each score from the mean
Descriptive Statistics
25, 55, 55, 65, 72, 82, 82, 84, 90, 97
Exam Scores
Range = 72
Interquartile Range = 26
Variance = 456.9
Standard deviation = 21.4
1st Quartile = 57.5
3rd Quartile = 83.5
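The dispersion figures for the exam scores can be reproduced with the standard `statistics` module. Two choices here are assumptions made to match the slide's numbers: the sample variance (dividing by n - 1) and the "inclusive" quartile method.

```python
import statistics

scores = [25, 55, 55, 65, 72, 82, 82, 84, 90, 97]

# Range: distance between the lowest and highest score.
value_range = max(scores) - min(scores)

# Quartiles with the "inclusive" method (assumed; it matches the slide).
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1

# Sample variance and standard deviation (divide by n - 1; assumed).
variance = statistics.variance(scores)
std_dev = statistics.stdev(scores)

print(value_range)                             # 72
print(q1, q3, iqr)                             # 57.5 83.5 26.0
print(round(variance, 1), round(std_dev, 1))   # 456.9 21.4
```

Other quartile conventions, or the population variance, would give slightly different values for the same scores.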
Mean Center Mapping
Standard Deviation Mapping
Sample
◦ A sample is analyzed to “infer” information about the population
◦ Probability theory: the number of times any given outcome will occur if the event is repeated many times
Inferential Statistics
Bell-Shaped or Normal Curve
Inferential Statistics
Mode & Median same as Mean
Histogram
◦ Normal
◦ Skewed
Inferential Statistics
Average 26.20, Median 30, Mode 40
Average 20, Median 20, Mode 20
Average 13.6, Median 10, Mode 1
What variables are available?
What is the overall n?
What is the unit of analysis?
What do I want to know about the variable(s)?
What is the level of measurement of the variable(s)?
Are the variables discrete or continuous?
How many groups will be compared in the analysis?
Am I interested in just describing the data or finding inferences within it?
Questions to Ask Yourself…
Dependent variable
◦ The variable that analysts are trying to explain (in crime mapping, the dependent variable is often some crime measure).
Independent variable
◦ Variables that produce a change in our dependent variable
Variables for our Stats..
Causal relationship
◦ Intervening variable
◦ Antecedent variable
◦ Contingent variable
◦ Multicollinearity: when X, Y, and Z have overlapping measures of the same concept
◦ Spurious relationships: when X and Y have no direct relationship but are both affected by Z
Variables for our Stats..
[Diagram: X, Z, and Y, labeled Multicollinearity]
Chi-square, T-tests, Z-tests, ANOVA
◦ Essentially, they work by determining whether variable distributions or differences between groups or areas would be expected based on random chance
Test of significance
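As one hedged illustration of how a test of significance compares observed counts with what random chance would predict, the sketch below computes a chi-square statistic for a 2 x 2 table. The counts are hypothetical, and `chi_square` is a helper written for this example rather than a library call. The critical value 3.841 is the standard cutoff for one degree of freedom at the 0.05 level.

```python
def chi_square(observed):
    """Chi-square statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected
    return statistic

# Hypothetical counts: rows are two beats, columns are
# burglary calls and all other calls.
observed = [[30, 10],
            [20, 40]]

statistic = chi_square(observed)
CRITICAL_05_DF1 = 3.841  # chi-square critical value, df = 1, alpha = 0.05
significant = statistic > CRITICAL_05_DF1

print(round(statistic, 2), significant)
```

A statistic above the critical value suggests the difference between the two beats is unlikely to be due to chance alone.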
Lambda, Gamma, Kendall’s tau statistics, Spearman’s rho, Pearson’s correlation coefficient
◦ To determine the strength and direction of a relationship between two variables
◦ Values between -1 and +1
◦ Inverse/negative or positive relationships possible
Measures of Association
Variable 1 Variable 2
Positive
Variable 1 Variable 2
Inverse
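A positive and an inverse relationship can be illustrated with Pearson's correlation coefficient, written out by hand below. The three variables and their values are hypothetical and chosen so the relationships are perfectly linear.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)

# Hypothetical values for five areas.
vacant_homes = [1, 2, 3, 4, 5]
burglaries = [2, 4, 6, 8, 10]    # rises as vacant homes rise
patrol_hours = [10, 8, 6, 4, 2]  # falls as vacant homes rise

positive = pearson_r(vacant_homes, burglaries)   # close to +1
inverse = pearson_r(vacant_homes, patrol_hours)  # close to -1
print(positive, inverse)
```

Real data rarely falls exactly on a line, so values strictly between -1 and +1 are the norm.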
Spatial Autocorrelation
◦ Moran’s I
A value between 0 and +1 indicates positive spatial autocorrelation (clustering).
A value between -1 and 0 indicates negative spatial autocorrelation (dispersion); a value near 0 indicates a random distribution.
◦ Geary’s C
Values under 1 signify positive spatial autocorrelation
Values over 1 designate negative spatial autocorrelation
Spatial Measures
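A minimal sketch of global Moran's I may help show where the sign comes from. It assumes a simple binary weights matrix (1 for neighbors, 0 otherwise) over four areas arranged in a row; the crime counts and the `morans_i` helper are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    weight_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations for every pair of neighboring areas.
    cross = sum(weights[i][j] * deviations[i] * deviations[j]
                for i in range(n) for j in range(n))
    spread = sum(d * d for d in deviations)
    return (n / weight_sum) * (cross / spread)

# Four areas in a row; each area neighbors the one next to it.
weights = [[0, 1, 0, 0],
           [1, 0, 1, 0],
           [0, 1, 0, 1],
           [0, 0, 1, 0]]

clustered = morans_i([10, 10, 0, 0], weights)    # high values sit together
alternating = morans_i([10, 0, 10, 0], weights)  # high next to low
print(round(clustered, 3), round(alternating, 3))
```

Similar neighbors push the statistic above zero, while a checkerboard-like pattern pushes it below zero.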
Linear relationship
◦ (OLS) Ordinary least-squares: Y = a + b1 X1 + b2 X2 + b3 X3 …
◦ Units of analysis must be the same
Regression models
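For the simplest case of the OLS equation, with a single independent variable (Y = a + b1 X1), the intercept and slope can be computed directly. The sketch below uses hypothetical data, and `fit_line` is a helper defined only for this example; models with several independent variables would normally be fitted with statistical software.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data for five areas measured at the same unit of analysis.
vacant_homes = [1, 2, 3, 4, 5]  # independent variable (X1)
burglaries = [2, 4, 5, 4, 5]    # dependent variable (Y)

a, b1 = fit_line(vacant_homes, burglaries)
predicted = a + b1 * 6          # expected burglaries with 6 vacant homes
print(round(a, 2), round(b1, 2), round(predicted, 2))
```

The slope b1 is the expected change in Y for each one-unit increase in X1.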
ArcGIS Data Classification
Capabilities
Polygons
◦ Nominal (categories), Ordinal, Interval and Ratio (quantities) can be used with different methods
◦ Fills and outlines
Nominal data example
Ratio data example
Symbology
Category data symbology comes next
It displays data by unique values of a field, or multiple fields
Nominal, ordinal, ratio or interval data
Symbology
Next comes the quantities symbology method
It uses a number field in the table to display data by classified values
Ratio and interval data
Quantities Classifications
Six different ways to classify data, with an added manual method for infinite freedom
Classification Methods
Equal Interval, Defined Interval, Quantile, Natural Breaks, Geometrical Interval, Standard Deviation
Types of Data
Categorical (Qualitative)
◦ Grouping based on some quality
◦ Labels or categories
◦ E.g., Sex = Male or Female
◦ Nominal or Ordinal
Nominal: the order is not important. E.g., Sex = male or female
Ordinal: the order is important. E.g., Rank = Officer, Sergeant, Lieutenant, etc.
◦ Can be binary or non-binary
Binary = only two values (male or female)
Non-binary = more than two (red, blonde, brunette, etc.)
Types of Data
Measurement (Quantitative)
◦ Grouping based on some quantity or value
◦ Always numbers
◦ Discrete or continuous
Discrete = only certain values are possible and data could have gaps (1, 2, 3, or 4)
Continuous = any value along some interval (any value between 1 and 4, e.g., 3.24211)
◦ Interval or ratio
In interval data the interval between values is important (e.g., a temperature of 30 compared to 110 means something)
Ratio data is the highest level, and the “0” value can be informative (e.g., a grid can have 0 crimes, or any value up to infinity)
Great Website to Explain Research and Data Types
http://www.socialresearchmethods.net/kb/index.php
Classification Methods
Equal Interval (ratio, interval)
◦ The range between the classifications is the same
◦ Take the high value minus the low value and divide by the number of classes; here, for each of the 5 classes, the interval is 199.61
◦ The number of classes desired determines the interval
Classification Methods
Defined Interval (ratio, interval)
◦ Similar to equal interval, but here we define what the interval will be and thus establish the classes
◦ In this case the interval was set to 150, and so the number of classes is determined by the interval
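The difference between equal interval and defined interval can be sketched in a few lines: one fixes the number of classes and derives the interval, the other fixes the interval and derives the number of classes. This is a simplified illustration with hypothetical values, not a reproduction of how ArcGIS computes its breaks; both function names are made up for the example.

```python
import math

def equal_interval_breaks(values, num_classes):
    """Upper break of each class when the range is split evenly."""
    low, high = min(values), max(values)
    width = (high - low) / num_classes
    return [low + width * i for i in range(1, num_classes + 1)]

def defined_interval_breaks(values, interval):
    """Upper break of each class when the class width is fixed in advance."""
    low, high = min(values), max(values)
    num_classes = math.ceil((high - low) / interval)
    return [low + interval * i for i in range(1, num_classes + 1)]

# Hypothetical crime counts per area.
counts = [0, 120, 310, 450, 600]

print(equal_interval_breaks(counts, 4))      # 4 classes, width 150
print(defined_interval_breaks(counts, 150))  # width 150, so 4 classes
print(defined_interval_breaks(counts, 250))  # width 250, so 3 classes
```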
Classification Methods
Quantile (ratio, interval)
◦ Each class contains an equal number of features, so each class holds the same percentage of the values.
◦ Each of the 10 classes has the same number of features, or makes up 10% of the total records
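Quantile classification can be sketched by sorting the values and cutting the sorted list into groups of equal size. The values below are hypothetical, `quantile_classes` is a helper written for this example, and real GIS software may handle ties and uneven counts differently.

```python
def quantile_classes(values, num_classes):
    """Split values into classes holding (nearly) equal numbers of features."""
    ordered = sorted(values)
    n = len(ordered)
    classes = []
    for k in range(num_classes):
        start = k * n // num_classes
        end = (k + 1) * n // num_classes
        classes.append(ordered[start:end])
    return classes

# Hypothetical crime counts for ten areas, split into five classes.
counts = [3, 8, 1, 15, 22, 4, 40, 7, 12, 95]
for members in quantile_classes(counts, 5):
    print(members)
```

Each class holds two of the ten areas (20%), even though the last class spans a much wider range of values than the first.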
Classification Methods
Natural Breaks (ratio, interval)
◦ Breaks the data where there are natural gaps between values
Use test exam score example
Classification Methods
Geometrical Interval (ratio, interval)
◦ This is a classification scheme where the class breaks are based on class intervals that have a geometrical series. This ensures that each class range has approximately the same number of values within each class and that the change between intervals is fairly consistent.
◦ The interval is determined by a geometric equation (large and small changes depending on breaks in data)
Classification Methods
Standard Deviation (ratio, interval)
◦ Classes are determined by mean and standard deviation of values. Can display by 1, ½, ¼ standard deviations as needed
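A standard deviation classification can be sketched by counting how many standard-deviation bands each value lies from the mean. The example reuses the exam scores from earlier in the deck. The `std_dev_class` helper and the use of the sample standard deviation are assumptions made for illustration.

```python
import math
import statistics

def std_dev_class(value, mean, std_dev, step=1.0):
    """Band index: 0 is mean to +1 step, -1 is -1 step to mean, and so on."""
    return math.floor((value - mean) / (std_dev * step))

scores = [25, 55, 55, 65, 72, 82, 82, 84, 90, 97]
mean = statistics.mean(scores)
std_dev = statistics.stdev(scores)  # sample standard deviation (assumed)

# Classes one standard deviation wide.
bands = [std_dev_class(s, mean, std_dev) for s in scores]
print(bands)

# Classes half a standard deviation wide.
half_bands = [std_dev_class(s, mean, std_dev, step=0.5) for s in scores]
print(half_bands)
```

The lowest score (25) lands more than two standard deviations below the mean, which is how this method makes outliers stand out on a map.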
Getting to know your data and the factors that influence crime can help analysts create more useful maps and analysis products and solve problems
Handling data properly will keep you from making incorrect assumptions and coming to unrealistic conclusions
Remember the wheel of science
Conclusions