AP STATS CHAPTER 1 The Role of statistics and the data analysis process

AP STATS CHAPTER 1

The Role of Statistics and the Data Analysis Process

The Data Analysis Process

Step 1
• Acknowledging Variability
• Collecting Data Sensibly

Step 2
• Describing Variability in the Data
• Descriptive Statistics

Step 3
• Drawing Conclusions in a Way That Recognizes Variability in the Data
• Probability Supports the Conclusion

Statistics (1)

The scientific discipline that provides methods to help make sense of data.

Suspicion: Extreme skeptics, usually speaking out of ignorance, characterize this discipline as a subcategory of lying.

Used properly, statistical methods offer a set of POWERFUL tools for gaining insight into the world around us.

Used in business, medicine, agriculture, social sciences, natural sciences, and applied sciences.

Statistics (2)

“…teaches us how to make intelligent judgments and informed decisions in the presence of uncertainty and variation.”

1.1 Three Reasons to Study Statistics

Be Informed

• Understand news reports making data-based claims.
• Extract information from tables and graphs.
• Follow numerical arguments.
• Understand the basics of valid research designs.

Understand Issues and Sound Decision Making Based on Data

• Is existing information adequate, or do we need more?
• How to collect information in a reasonable and thoughtful manner.
• Summarize data in a useful and informative way.
• Analyze available data.
• Draw conclusions, make decisions, and assess the risk of an incorrect decision.

Evaluate Decisions That Affect Your Life

• Other people use statistical methods to make decisions that affect your life.
• Examples: drug screening by companies, medical researchers, university financial aid, insurance companies, etc.
• Are the decisions made by these groups made in a reasonable way?

1.2 The Nature and Role of Variability

Statistics focuses on collecting, analyzing, and drawing conclusions from data. If all measurements were identical for every individual, this task would be easy. But populations without variability are virtually non-existent. In fact, variability is universal.

We need to understand variability to be able to collect, analyze, and draw conclusions from data in a sensible way.

The branch called descriptive statistics helps to increase our understanding of the nature of variability in a population.

Figure 1.1 Histograms of heights (in inches) of female athletes: (a) basketball players; (b) gymnasts. Sample size is 100 for each group (N = 100).

Example 1.1 If the Shoe Fits

Is the variation in the heights between the two groups similar?

Suppose a 5’11” woman were looking for her sister, who is practicing with her team in the gym. Where would you direct her? Why?

What if you found a pair of size 6 shoes left in the locker room? Where would you try to return them?

You informally used statistical reasoning: you combined your knowledge of how siblings' heights are related, and of how height and shoe size are related, with the information about the height distributions in Fig. 1.1.

Figure 1.2 Frequency of contaminant concentration (in ppm) in well water. Based on the average of five measurements per day for 200 days (N = 200).

Example 1.2 Monitoring Water Quality

Suppose a chemical spill occurred at a manufacturing plant 1 mile from the well.

One month after the spill, the average contamination is 15.5 ppm. Would this be convincing evidence that the well was affected by the spill?

What if the average was 17.4 ppm? 22.0 ppm?

In both Examples 1.1 and 1.2, reaching a conclusion required an understanding of variability. Variability allows us to distinguish between usual and unusual values.

1.3 Statistics and the Data Analysis Process

Conclusions based on data are seen regularly in popular media and professional and academic publications.

Decisions are data driven in business, industry, and government.

Descriptive statistics – methods for organizing and summarizing data. This is the next step in the data analysis process once a data set has been collected or an appropriate source identified.

Inferential statistics – involves generalizing from a sample to the population and requires an understanding of the variation in the population (i.e., descriptive statistics). Also includes assessing the reliability of such generalizations. Because these conclusions are based on incomplete information, there is a need to quantify the chance of an incorrect conclusion.

Population – the entire collection of individuals or objects about which information is desired.

Sample – a subset of the population, selected for study in some prescribed manner.

The Data Analysis Process (1)

Raw data without analysis are of little value. Likewise, even a sophisticated analysis cannot extract meaningful information from data that were not collected in a sensible way.

Data collection and analysis allow researchers to answer questions about the way systems work.

Steps in the data analysis process:
1. Understand the nature of the problem.
2. Decide what to measure and how to measure it.
3. Data collection.
4. Data summarization and preliminary analysis.
5. Formal data analysis.
6. Interpretation of results.

Example 1.3 A Proposed New Treatment for Alzheimer’s Disease

In 2002, eleven patients had shunts implanted into the brain.

A comparison group received the standard care for Alzheimer’s.

Quarterly tests of memory function showed a steady decline in the control group, while the surgically treated (experimental) group did not decline.

The study was too small to produce conclusive statistical evidence, but the preliminary results justified a larger study to include 256 patients at 25 medical centers across the country.

The Data Analysis Process (2)

Evaluating a Research Study (in the popular and technical press): The six data analysis steps can be used as a guide.
1. What were the researchers trying to learn? What question motivated their research?
2. Was relevant information collected? Were the right things measured?
3. Were the data collected in a sensible way?
4. Were the data summarized in an appropriate way?
5. Was an appropriate method of analysis used, given the type of data and how the data were collected?
6. Are the conclusions drawn by the researchers supported by the data analysis?

Example 1.4 Spray Away the Flu

A newspaper article reported the results of a study in which a vaccine was administered by nasal spray.

The results generally look promising, but newspaper articles typically do not report the details of the study.
• How were the data collected?
• How were the children selected for the study?
• How was it determined which children received the vaccine and which received the placebo?
• How was the subsequent diagnosis of flu made?
• Not mentioned by the authors of the text: How does this compare to vaccination by injection?

1.4 Types of Data and Some Simple Graphical Displays

Describing Data
• Variable – any characteristic whose value may change from one individual or object to another.
• Data – results from making observations either on a single variable or simultaneously on two or more variables.
• Univariate data set – data set consisting of observations on a single attribute.
    • Categorical (or qualitative) – individual responses are categorical responses.
    • Numerical (or quantitative) – observations are numerical.

Example 1.5 Airline Safety Violations

• FAA monitors airlines.
• USA Today (March 13, 2000) reported on violations that could lead to fines from the FAA.
• Violation categories: Security (S), Maintenance (M), Flight Operations (F), Hazardous Materials (H), or Other (O).
• Subset of the data for 20 administrative actions: S S M H M O S M S S F S O M S M S M S M

What kind of data set is this?

Describing Data (continued)
• Bivariate data set – when a data set consists of two attributes recorded simultaneously for each individual.

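• Multivariate data set – when a data set consists of two or more attributes recorded for each individual (bivariate data are a special case).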

Example 1.6 Revisiting Airline Safety Violations

Airline         Number of Violations    Average Fine per Violation (in US dollars)
Alaska          258                     5038.760
America West    257                     3112.840
American        1745                    2693.410
Continental     973                     5755.396
Delta           1280                    3828.125
Northwest       1097                    2643.573
Southwest       535                     3925.234
TWA             642                     2803.738
United          1110                    2612.613
US Airways      891                     3479.237

What type of data set is this?

Two Types of Numerical Data
• Discrete – a numerical variable in which the possible values of the variable correspond to isolated points on the number line.
• Continuous – a numerical variable in which the possible values of the variable form an entire interval on the number line.

Example 1.7 Calls to Drug Abuse Hotline

The number of telephone calls per day to a drug abuse hotline is recorded for 12 days: 3 0 4 3 1 0 6 2 0 0 1 2

The possible values are isolated points on the number line, so this is a discrete numerical data set.

Example 1.6 had both types of numerical data: number of violations (discrete) and average fine per violation (continuous).

In general, data are continuous when observations involve making measurements.

Frequency Distributions and Bar Charts for Categorical Data
• A tabular or graphical display can effectively communicate information.
• A common way to present categorical data is in a table called a frequency distribution.
• Frequency distribution for categorical data – a table that displays the possible categories along with the associated frequencies and/or relative frequencies.
    • Frequency – for a particular category, the number of times the category appears in the data set.
    • Relative frequency – for a particular category, the fraction or proportion of the observations resulting in the category.

$$\text{relative frequency} = \frac{\text{frequency}}{\text{number of observations in the data set}}$$

If a table includes relative frequencies, it is sometimes referred to as a relative frequency distribution.
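
As a rough illustration, a relative frequency distribution can be tallied directly from raw categorical observations. The sketch below uses the 20 violation codes from Example 1.5; the function name `frequency_distribution` is an illustrative choice, not something from the text.

```python
from collections import Counter

# Violation codes from Example 1.5 (20 administrative actions).
codes = "S S M H M O S M S S F S O M S M S M S M".split()

def frequency_distribution(observations):
    """Return {category: (frequency, relative frequency)}, most frequent first."""
    n = len(observations)          # number of observations in the data set
    counts = Counter(observations)
    return {cat: (freq, freq / n) for cat, freq in counts.most_common()}

table = frequency_distribution(codes)
for category, (freq, rel) in table.items():
    print(f"{category}  {freq:2d}  {rel:.2f}")
```

The relative frequencies add up to 1 (up to rounding), which is a quick check that no observation was dropped.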

Example 1.8 Motorcycle Helmets: Can You See Those Ears?

Helmet Use Category     Frequency    Relative Frequency
No Helmet               731          0.430  (= 731/1700)
Noncompliant Helmet     153          0.090  (= 153/1700)
Compliant Helmet        816          0.480
TOTAL                   1700         1.000

Table 1.1 Frequency distribution of helmet use.

• The frequency total (1700) is the total number of observations.
• The relative frequency total should be equal to 1, but may be slightly off due to rounding.

Bar Charts – a graph of the frequency distribution of categorical data.

When to use: Categorical data.

How to construct:
1. Draw a horizontal line and write the category names below the line at regularly spaced intervals.
2. Draw a vertical line and label the scale using either frequency or relative frequency.
3. Place a rectangular bar above each category label. All bars should have the same width, with the height determined by the category’s frequency or relative frequency.

What to look for: Frequently and infrequently occurring categories.
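
A minimal sketch of the construction, assuming a plain-text rendering is acceptable: the chart is drawn sideways (one row per category rather than one vertical bar), and the scale of one `#` per 50 observations is an arbitrary choice for illustration. The frequencies are those of Table 1.1; `text_bar_chart` is a hypothetical helper name.

```python
# Frequencies from Table 1.1 (helmet use).
helmet_use = {
    "No Helmet": 731,
    "Noncompliant Helmet": 153,
    "Compliant Helmet": 816,
}

def text_bar_chart(frequencies, unit=50):
    """Return one row per category; bar length is the frequency in units of `unit`, rounded."""
    width = max(len(name) for name in frequencies)  # align the category labels
    lines = []
    for name, freq in frequencies.items():
        bar = "#" * round(freq / unit)  # same scale for every bar
        lines.append(f"{name:<{width}} | {bar} {freq}")
    return "\n".join(lines)

chart = text_bar_chart(helmet_use)
print(chart)
```

Because every bar uses the same scale, the frequently occurring categories (no helmet, compliant helmet) and the infrequent one (noncompliant helmet) stand out at a glance.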

Example 1.9 Revisiting Motorcycle Helmets

Figure 1.5 Bar chart of helmet use. (Horizontal axis: Helmet Use Category, with bars for No Helmet, Noncompliant Helmet, and Compliant Helmet; vertical axis: Frequency, scaled from 0 to 900.)

Dotplots for Numerical Data

A dotplot is a simple way to display numerical data when the data set is reasonably small.

When to use: Small numerical data sets.

How to construct:
1. Draw a horizontal line and mark it with an appropriate measurement scale.
2. Locate each value in the data set along the measurement scale and represent it by a dot. If there are two or more observations with the same value, stack the dots vertically.

What to look for: Dotplots convey information about:
• A representative or typical value in the data set.
• The extent of the spread of the data.
• The nature of the distribution of values along the number line.
• The presence of unusual values in the data set.
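
The two construction steps can be sketched in a few lines of Python. This is only an illustration: it assumes small integer values with single-character labels (so the columns line up), uses the hotline call counts from Example 1.7 as data, and `text_dotplot` is a hypothetical helper name.

```python
from collections import Counter

# Calls per day to the drug abuse hotline (Example 1.7, 12 days).
calls = [3, 0, 4, 3, 1, 0, 6, 2, 0, 0, 1, 2]

def text_dotplot(values):
    """Stack one 'o' per observation above an integer scale from min to max."""
    counts = Counter(values)
    scale = range(min(values), max(values) + 1)  # the measurement scale
    rows = []
    # Build from the tallest stack down to the row just above the axis.
    for level in range(max(counts.values()), 0, -1):
        row = " ".join("o" if counts[v] >= level else " " for v in scale)
        rows.append(row.rstrip())
    rows.append(" ".join(str(v) for v in scale))  # axis labels
    return "\n".join(rows)

plot = text_dotplot(calls)
print(plot)
```

In the printed plot, the tall stack at 0 and the isolated dot at 6 show where values concentrate and which value is unusual.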

Example 1.10 Graduation Rates for NCAA Division I Schools in CA and TX

From The Chronicle of Higher Education, Aug. 31, 2001

Reported graduation rates as percentages of full-time freshmen in fall 1993 who earned bachelor’s degrees by Aug. 1999.

California        Texas
64 41 44 31 37    67 21 32 88 35
73 72 68 35 37    71 39 35 71 63
81 90 82 74 79    12 46 35 39 28
67 66 66 70 63    65 25 24 22

Figure 1.6 Minitab dotplot of graduation rates.

Figure 1.7 Minitab dotplot of graduation rates for California and Texas.