
Welcome to the Quantitative Analysis (Statistics/EXCEL) Module

John Gates
Oxford Centre for Water Research
School of Geography and the Environment

What is statistics?

“…the collection and analysis of numerical data in large quantities.” – Oxford English Dictionary

“The mathematics of the collection, organization, and interpretation of numerical data, especially the analysis of population characteristics by inference from sampling.” – American Heritage Dictionary

“Statistics: the mathematical theory of ignorance.” – Morris Kline

“It has long been recognized by public men of all kinds ... that statistics come under the head of lying, and that no lie is so false or inconclusive as that which is based on statistics.” – H. Belloc

“There are three kinds of lies - lies, damned lies and statistics.” – Benjamin Disraeli

“Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.” – H.G. Wells

Why statistics?

• make quantified statements about a phenomenon we are interested in

• frequently this phenomenon is too large to go out and measure exhaustively…

• …so we collect samples as proxies of the greater population of individuals or items that make up the phenomenon we are interested in

Aims of the course

• Introduction to basic statistics
• Demonstrate geographical context
• Learn to use analysis tools in EXCEL
• Make you an intelligent user of data
• Make you an intelligent user of statistics

We will bypass much of the underlying maths and instead emphasize understanding of the underlying principles.

How the course works

1. Cover the statistical principles in lecture
• course lecture notes

2. Go through lecture notes in own time before practical
• use textbooks to supplement lecture notes

3. Attend practical
• work through practical handouts
• ask demonstrators for help

4. Take online assessments
• theory – any time after lecture
• practical – any time after finishing prac

Course Structure

• Lectures on Mondays
– in OUCE Lecture Theatre

• Practicals on Tuesday afternoons (except next week)
– in Medical Sciences Teaching Centre’s computing laboratory

Practicals

2.00-4.00pm: Harris Manchester, Brasenose, Christ Church, Hertford, Jesus, St. Edmunds Hall, St. Hilda’s, Worcester

4.00-6.00pm: Keble, Mansfield, Merton, Regent’s Park, St. Anne’s, St. Catz, St. Peter’s, Wadham, St. John’s

Course Information

ALL INFORMATION IS ON THE WEB
http://techniques.geog.ox.ac.uk

– Lecture notes and glossary
– Practical notes
– Excel files
– Internet resources
– Recommended textbooks
– Tests

Week 1 - Central Tendency

1. Types of statistics
2. Types of data
3. Samples
4. Frequency distribution
5. Measures of central tendency
a) mode
b) median
c) arithmetic mean
6. Precision and accuracy

1a. Descriptive Statistics

• Definition: Quantitative methods of organizing, summarizing, and presenting numerical data in an informative way.

• Describe the overall characteristics of a sample (and hence the population?)

• Transform raw data into more easily understood forms

• Central tendency – “average” character of the data.

1b. Inferential (analytical) Statistics

• Definition: The branch of statistics used to make inferences or judgments about a larger population based on the data collected from a smaller sample drawn from the population

2. Types of Data

• Interval
• Ordinal
• Nominal

2. Types of Data: Interval

-- Can tell exactly how far any measurement is from any other

-- Examples: height, age, size

2. Types of Data: Ordinal

-- A set of observations ordered according to some criterion, i.e. a ranking

-- Cannot tell how far one measurement is from the next

-- Examples: horses’ positions in race, the ten highest mountains in the world

-- Note that interval data can be converted into ordinal form

2. Types of Data: Nominal

-- Also referred to as categorical data

-- Data are grouped into categories

-- Examples: land use type, ethnicity, rock type

-- Note that interval data can be converted into nominal form
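The conversions noted above (interval to ordinal, interval to nominal) can be sketched in a few lines of Python. This is an illustration only: the heights, the category names and the 165 cm and 180 cm cut-offs are all invented for the example and do not come from the course data.

```python
# Hypothetical interval data: heights of five people in centimetres.
heights_cm = [172.0, 158.5, 181.2, 165.0, 190.4]

# Interval -> ordinal: replace each value by its rank (1 = tallest).
# The ranks keep the order but discard how far apart the values are.
order = sorted(heights_cm, reverse=True)
ranks = [order.index(h) + 1 for h in heights_cm]

# Interval -> nominal: group values into named categories.
# The 165 cm and 180 cm cut-offs are arbitrary, chosen for illustration.
def height_class(h):
    if h < 165:
        return "short"
    if h < 180:
        return "medium"
    return "tall"

classes = [height_class(h) for h in heights_cm]

print(ranks)    # [3, 5, 2, 4, 1]
print(classes)  # ['medium', 'short', 'tall', 'medium', 'tall']
```

Going the other way is not possible: the original heights cannot be recovered from the ranks or the categories alone.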

3. Samples

• Definition: A subset of the target population

Random:
– the individuals in the sample are randomly selected
– each member of the population has a known, but possibly non-equal, chance of being included in the sample

Independent:
– a sample should have no effect on, and not be affected by, other samples selected from the same population or from different populations
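As a rough illustration of random selection, the sketch below draws a simple random sample using Python's standard `random` module. The population of 100 numbered survey sites and the sample size of 10 are assumptions made for the example. It shows the equal-chance case only; the definition above also allows known but unequal chances.

```python
import random

# Hypothetical population: 100 numbered survey sites.
population = list(range(1, 101))

# A fixed seed makes the draw repeatable; the value 42 is arbitrary.
rng = random.Random(42)

# Simple random sample without replacement: every site has the same
# known chance (10 in 100) of being included.
sample = rng.sample(population, k=10)

print(sorted(sample))
```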

4. Frequency Distribution

• The spread of data along its range
– either mathematical description
– or (and) visual description…

• …a frequency histogram
– define categories or intervals or classes
– count the number of measurements that fall into each class
– plot classes along x-axis
– plot counts (frequencies) on y-axis
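The counting steps behind a frequency histogram can be sketched as follows. The grades and the class width of 5 are invented for illustration. The text plot lists the classes down the page rather than along an x-axis, which keeps the example free of plotting libraries.

```python
from collections import Counter

# Hypothetical grades in percent, invented for illustration.
grades = [42, 55, 58, 61, 63, 64, 67, 68, 68, 71, 72, 74, 79, 83]

class_width = 5  # assumed class width

# Assign each grade to the lower bound of its class,
# e.g. 68 -> 65 (the class 65-69), then count per class.
counts = Counter((g // class_width) * class_width for g in grades)

# Text histogram: one row per class, one '#' per measurement.
# Empty classes are kept so gaps in the distribution stay visible.
for lower in range(min(counts), max(counts) + class_width, class_width):
    upper = lower + class_width - 1
    print(f"{lower:3d}-{upper:<3d} {'#' * counts[lower]}")
```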

4. Frequency Distribution

[Figure: frequency histogram, “Grades for 1st Stats Practical (1991-2002)”. x-axis: Variable X, here Grade (in percent), ticks from 25 to 85 in steps of 5; y-axis: Frequency, ticks from 0 to 200 in steps of 20.]

5a. Mode

• Definition: The most commonly occurring value
• for nominal data we refer to the modal class
• not appropriate for ordinal or (usually) interval data

[Figure: frequency histogram with the modal class labelled. x-axis: Variable X, ticks from 25 to 85 in steps of 5; y-axis: Frequency, ticks from 0 to 200 in steps of 20.]
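Finding a modal class for nominal data can be sketched with Python's standard library. The land-use records are invented for illustration.

```python
from collections import Counter
from statistics import mode

# Hypothetical nominal data: land use recorded at nine survey sites.
land_use = ["forest", "arable", "urban", "arable", "pasture",
            "arable", "forest", "urban", "arable"]

# The modal class is the category with the highest count.
counts = Counter(land_use)
modal_class, frequency = counts.most_common(1)[0]

print(modal_class, frequency)  # arable 4
print(mode(land_use))          # arable
```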

5b. Median

• Definition: The central value in an ordered set of data

Raw data: 4, 2, 5, 1, 7, 10, 6

Sorted data: 1, 2, 4, 5, 6, 7, 10

Median = 5

5b. Median

• even number of values

Raw data: 4, 2, 5, 1, 7, 10

Sorted data: 1, 2, 4, 5, 7, 10

Median = (4 + 5) / 2 = 4.5
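The two cases (odd and even number of values) can be checked with a short sketch. The helper name `median_by_hand` is made up for this example; Python's built-in `statistics.median` gives the same results.

```python
from statistics import median

def median_by_hand(values):
    """Return the central value of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of values: take the single middle value.
        return ordered[middle]
    # Even number of values: average the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2

odd = [4, 2, 5, 1, 7, 10, 6]
even = [4, 2, 5, 1, 7, 10]

print(median_by_hand(odd))   # 5
print(median_by_hand(even))  # 4.5
print(median(odd), median(even))
```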

5c. Arithmetic Mean

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

5c. Arithmetic Mean

data: 4, 2, 5, 1, 7, 10, 6

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{4 + 2 + 5 + 1 + 7 + 10 + 6}{7} = \frac{35}{7} = 5$$
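The same calculation can be sketched in Python: add up the values and divide by how many there are. The variable names are chosen for this example.

```python
from statistics import mean

data = [4, 2, 5, 1, 7, 10, 6]

total = sum(data)  # sum of all values: 35
n = len(data)      # number of values: 7
x_bar = total / n  # arithmetic mean

print(x_bar)       # 5.0
print(mean(data))  # same value from the standard library
```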

The “average”

• average = central tendency
• the mean, mode and median are all measures of “average”
• average ≠ mean
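A small sketch shows why it matters which “average” is meant. The values are invented so that one large value pulls the mean away from the median and the mode.

```python
from statistics import mean, median, mode

# Hypothetical data with one unusually large value.
values = [1, 2, 2, 3, 12]

print(mean(values))    # 4
print(median(values))  # 2
print(mode(values))    # 2
```

Here the mean is twice the median, so reporting only “the average” would hide which measure was used.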

6. Precision and accuracy

• Precision:
– The degree of refinement with which an operation is performed or a measurement stated

• Accuracy:
– Freedom from mistake or error
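One way to see the difference is to compare how finely a reading is stated with how far it lies from the true value. In this sketch the true temperature and both thermometer readings are invented, and precision is taken, as an assumption for the example, to be the number of decimal places stated.

```python
from decimal import Decimal

# Hypothetical set-up: the true air temperature is assumed to be 20.0 degrees C.
true_value = Decimal("20.0")

# Two invented readings: A is stated finely but is wrong,
# B is stated coarsely but is right.
readings = {
    "thermometer A": Decimal("21.37"),
    "thermometer B": Decimal("20"),
}

summary = {}
for name, reading in readings.items():
    decimals = -reading.as_tuple().exponent  # decimal places stated (precision)
    error = abs(reading - true_value)        # distance from the truth (accuracy)
    summary[name] = (decimals, error)
    print(f"{name}: {decimals} decimal places, error {error}")
```

Thermometer A is the more precise of the two but the less accurate, which shows that a finely stated figure is not necessarily a correct one.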


Week 1 - Central Tendency

1. Types of statistics
2. Types of data
3. Samples
4. Frequency distribution
5. Measures of central tendency
a) mode
b) median
c) arithmetic mean
6. Precision and accuracy

Excel skills in Practical 1

• Entering and sorting data
• Calculating mean, median and mode
• Creating frequency histograms
• Introduction to formulas and functions
