What is this class about? • The Statistical Analysis of Data This includes 2 key terms which need some explanation 1. “Statistics” 2. “Data”



What is this class about?

• The Statistical Analysis of Data

This includes 2 key terms which need some explanation:

1. “Statistics”

2. “Data”


Statistics

A. What are “statistics”?

“The use of numbers to quantitatively describe or index the states of some phenomena.”

a) They are Quantitative (numerical)

b) They are Aggregate references (to groups of data)

c) They are Objective information (calculated)

They are intellectual constructions or computations (that may be useful)

a) They exist because we compute them

b) There are alternative ways to compute them

c) But they should refer to real patterns/events

d) They are valid only as long as they “work”

e) Avoid reifications and “weatherman’s fallacy”


Statistics

B. What are they good for?

a) Describe things quantitatively

Descriptive statistics: take the observed data as the whole population of interest

• Summarize a given set of data points

• Index or categorize the data points in the set

b) Make “educated guesses” from limited info

Inferential statistics: take the observed data as a limited sample from a larger population of interest (of which we have limited info)

• Draw conclusions or inferences

• Make decisions
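The descriptive/inferential distinction can be sketched with the same set of numbers treated two ways. This is only an illustration: the scores are invented, and the function names `describe` and `infer_mean` are hypothetical helpers, not part of the course materials.

```python
import math
import statistics

# Hypothetical quiz scores, invented for illustration only.
scores = [72, 85, 90, 66, 78, 94, 81, 70]

def describe(data):
    """Descriptive use: treat the data as the whole population of interest."""
    return {
        "mean": statistics.fmean(data),
        "sd": statistics.pstdev(data),  # population SD (divides by n)
    }

def infer_mean(data):
    """Inferential use: treat the data as a sample from a larger population."""
    n = len(data)
    sd = statistics.stdev(data)  # sample SD (divides by n - 1)
    return {
        "estimate": statistics.fmean(data),
        "sd": sd,
        "standard_error": sd / math.sqrt(n),  # uncertainty of the estimate
    }

print(describe(scores))
print(infer_mean(scores))
```

The mean is the same in both cases. What changes is the interpretation: the inferential version uses the n - 1 divisor and reports a standard error, because the sample only gives limited information about the larger population.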


Statistics – Limitations?

Useful only for answering quantitative questions (i.e., about amounts, degrees, or extents)

Only apply to things that are countable or measurable in objectified terms.

Are statistics inherently misleading?

The famous problem of “lying with statistics”

Do we really need statistics? What are the alternatives?

We can’t escape them; they’re everywhere

The issue is to know WHEN and HOW to use them meaningfully


“Types of Analysis”

A. Qualitative vs. Quantitative

Statistical analysis is necessarily quantitative

i.e., we are using numbers to describe the numerical properties or patterns of things

We require data coded into countable & measurable variables.

B. Descriptive vs. Inferential

2 basic analytic tasks in statistics:

(1) Summarize things (a set of data points) in numerical terms

(2) Make inferences and decisions from limited observations


What are “Data” (?)

A. Data = information collected and recorded (a plural noun?)

Data may be Quantitative or Qualitative

Data Set contains many data points

B. The magic word for Quantitative Data = “Variables”

Variable = any attribute or property of some thing that can take on different values/states

• Must have more than one possible state

• Don’t have to be numerical values


“Data” (continued)

C. Different Types of Variables?

1. By their analytical function

a) Dependent variables

b) Independent variables

c) Extraneous variables

Why do functional types of variables matter in statistics?


3. “Data” (continued)

C. Different Types of Variables (cont.)

2. By Level of Measurement

a) Nominal level – numbers as labels

b) Ordinal level – numbers as relative position

c) Interval level – numbers as comparative size

d) Ratio level – numbers as absolute size

The level of measurement represents the uses, inferences, or meanings we make of the data.

Why do levels of variables matter in statistics?

Treating ordinal data as interval data – Why not?
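One possible way to see the risk in treating ordinal data as interval data: the numeric codes attached to ordinal categories are arbitrary as long as they preserve order, and a mean can change its verdict when the codes change while a median does not. The sketch below uses invented ratings and two hypothetical order-preserving codings; `summarize` and `compare` are illustrative helper names.

```python
import statistics

# Hypothetical ratings on an ordinal scale: low < medium < high.
group_a = ["low", "high", "high"]
group_b = ["medium", "medium", "medium"]

# Two codings that both preserve the order of the categories.
coding_1 = {"low": 1, "medium": 2, "high": 3}
coding_2 = {"low": 1, "medium": 5, "high": 6}

def summarize(group, coding):
    """Return (mean, median) of a group under a given numeric coding."""
    values = [coding[rating] for rating in group]
    return statistics.mean(values), statistics.median(values)

def compare(coding):
    """Report whether group A ranks above group B by mean and by median."""
    mean_a, median_a = summarize(group_a, coding)
    mean_b, median_b = summarize(group_b, coding)
    return {
        "a_higher_by_mean": mean_a > mean_b,
        "a_higher_by_median": median_a > median_b,
    }

print(compare(coding_1))
print(compare(coding_2))
```

Under the first coding group A has the higher mean; under the second, group B does. The median comparison is the same under both codings, since it depends only on order. The mean assumes that the distances between codes are meaningful, which is an interval-level claim.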


3. “Data” (continued)

C. Different Types of Variables (cont.)

3. Other important distinctions?

a) Numerical vs. Nonnumeric

b) Discrete vs. Continuous

4. A very special type of variable = “Binary”

a) Dichotomy: only 2 possible values or outcomes (0 & 1 as only values)

b) Examples? Yes-No; Present-Absent; Drug Use-Abstinence; Guilty-Not Guilty; Alive-Dead; Pass-Fail; Pregnant-Not Pregnant

c) Binary variables = Both Numeric and Nominal (?!)
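The "both numeric and nominal" point can be illustrated with a small sketch: counting category labels (the nominal view) and averaging 0/1 codes (the numeric view) give the same proportion. The pass/fail outcomes below are invented for illustration.

```python
from collections import Counter
import statistics

# Hypothetical pass/fail outcomes, invented for illustration.
outcomes = ["pass", "fail", "pass", "pass", "fail"]

# Nominal view: count how often each category label occurs.
counts = Counter(outcomes)
proportion_pass = counts["pass"] / len(outcomes)

# Numeric view: code the same outcomes as 0/1 and take the mean.
coded = [1 if outcome == "pass" else 0 for outcome in outcomes]
mean_coded = statistics.fmean(coded)

print(proportion_pass, mean_coded)
```

With 0/1 coding, the mean of a binary variable is the proportion of cases coded 1, so an arithmetic summary has a sensible meaning even though the categories are only labels.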


Why are Binary variables important?

1) The world consists of LOTS of dichotomous events – (a) outcomes; (b) decisions

2) Binary numbers are very well-defined and handy (in both mathematical & practical terms)

a) They are the basis for modern digital computers

b) They represent logical events

3) Many complex events = combinations of binary events

4) Binary = very useful but can present special statistical issues


Some Introductory Math Issues:

1) Reading equations and formulas

• Useful to have a basic working knowledge

• Memorization = unnecessary

2) Doing complex tedious arithmetic

• By hand (calculator) – some will be required

• By computer – most of our statistical calculations

3) “Rounding numbers”? (How many places?)

• Interim calculations – carry more decimal places for computation precision

• Final results – round to most meaningful units for ease of interpretation (usually one place more than original numbers)
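The rounding guideline can be sketched as follows, assuming a small set of invented whole-number measurements. The example compares a sum of squared deviations computed with a full-precision mean against one computed with a mean rounded too early, then rounds only the final result to one decimal place (one more than the original data).

```python
import statistics

# Hypothetical whole-number measurements (zero decimal places).
data = [3, 4, 4, 5, 7, 8]

# Interim calculation: carry the mean at full precision.
mean_full = statistics.fmean(data)
ss_full = sum((x - mean_full) ** 2 for x in data)

# Rounding the mean too early distorts the sum of squares.
mean_early = round(mean_full)
ss_early = sum((x - mean_early) ** 2 for x in data)

# Final result: sample SD rounded to one place more than the data.
sd_final = round((ss_full / (len(data) - 1)) ** 0.5, 1)

print(ss_full, ss_early, sd_final)
```

The two sums of squares differ only because of the early rounding, which is why interim values are carried at higher precision and rounding is left to the reported result.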