Mercer Island High School...Lastly, if you are really interested in Statistics and have some extra time, I recommend Outliers by Malcolm Gladwell, The Lady Tasting Tea: How Statistics

AP Statistics – Summer Packet - 2015 Mercer Island High School

Congratulations on your decision to join the thousands of other students across the country who will be enrolled in AP Statistics in the upcoming school year. You have joined the growing ranks of students who recognize the need to take an introductory statistics course. A course such as this is typically required of many college majors including the social sciences, health sciences, and business.

What you need to know about this class: This is probably unlike any course that you have taken. I would say that it is a combination of Math, English, and Science. Communication skills are essential, and there is much more reading and writing than you are used to in a math class. AP Statistics is not an easy class. No Advanced Placement class is easy. It is a very rewarding course and a very important one, in my opinion, but it can be quite difficult at times. Since it is an AP course, it is considered to be college-level. The mathematics required for this course may not be as difficult as in other advanced math courses, but some of the concepts can be very confusing. You can expect to spend time studying outside of class, as well as in class. However, AP Statistics is special. It is a course that combines both mathematical and verbal skills. On the AP exam, you will be asked to write descriptive paragraphs and concluding sentences. You will have to explain the reasoning behind the method you use and your conclusions. In addition, there is a great deal of material that we are expected to cover by the end of April, so you need to be committed to giving it your absolute best effort day in and day out. Lastly, a TI-Nspire is an essential tool for this course, as those calculators have many statistical features we take advantage of. It would be a good idea to obtain one if you don’t have one already.

Purpose of this packet: Not surprisingly, it can be difficult to cover all the required material for this course and still have time for the desired two-week review period before the AP exam. I believe that completing this packet might free up a few extra days to cover the required curriculum, which is significant. It might allow us to spend extra time on the more difficult topics. In addition, this packet will provide you with a good introduction to what Statistics (the field and this course) is about so that you can decide whether or not you want to remain signed up for the class.

The assignments in this packet will be due on the first day of school and will count as a major homework grade. You should give yourself at least a week to complete them. You may respond to the questions by typing your answers or by writing them by hand. If you find something confusing, please check out online resources to see if you can get help via a video, Khan Academy, etc. I expect you to give this packet your best shot, but you will not be penalized if you get an answer “wrong.” We will go over the critical components of this packet in class. As with any assignment, copying answers from another individual or another source is considered academically dishonest and will result in a grade of zero.

Visit my website http://www.mercerislandschools.org/Page/4820 and in the AP Statistics Menu, you will find a link to the Summer Work. There, you will enter answers to selected questions from each part of this review. You will also find a link to a survey that I would like you to complete prior to the first day of class. Make sure to set aside some time to complete the survey!!! Lastly, if you are really interested in Statistics and have some extra time, I recommend Outliers by Malcolm Gladwell, The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century by David Salsburg, and Freakonomics by Steven Levitt and Stephen Dubner.

HAVE A GREAT SUMMER AND I LOOK FORWARD TO SEEING YOU IN SEPTEMBER! -Mrs. Adsit

If you need to email me: [email protected]. During the summer, I may not be checking my email every day; be patient and I will get back to you.

Part 1: Introduction to Statistics

Enter your answers online at: https://docs.google.com/forms/d/1KPvpVXobKxj7Q37mb1SSX4nNfFGcEgaTEwhaIHakHCM/viewform

Sta·tis·tics

Etymology: German Statistik: study of political facts and figures, from New Latin statisticus: of politics, from Latin status: state. Date: 1770

1 : a branch of mathematics dealing with the collection, analysis, interpretation, and presentation of masses of numerical data [note: this is for Statistics with an uppercase S]

2 : a collection of quantitative data [note: this is for statistics with a lowercase s]

Source: http://www.merriam-webster.com/dictionary/statistics

Answer the following in complete, well-written sentences, to the best of your ability:

1) Before you saw this definition, how would you have defined Statistics? Has your definition changed after reading this?

2) How one collects the data is extremely important. Explain how you would conduct a survey to determine the percentage of Mercer Island High School students who are satisfied with the quality of education that they are receiving. Due to resource constraints, however, you will only be able to ask 100 students.

3) You have worked with data before in your science classes, if nowhere else. Provide one example from your life in which you have worked with data. How did you collect it? How did you analyze it? How did you present your findings? What conclusions did you come to?

4) Tell me what you have heard from other people about this class.

5) This class is an elective. So, why did you sign up for it?

6) Find a newspaper or magazine article involving statistics. Bring a hard copy (not electronic) of your article to class on the first day, attached to this packet. Write a summary of your article here:

Part 2: Censuses Throughout History

Enter your answers online at: https://docs.google.com/forms/d/1RUyc3GYmhD6MP2RD7JsKbG_QHWQlNMYd1zuvkfaegCw/viewform

Read the following to learn more about early uses of statistics…

Censuses throughout history

Early population counts generally were not concerned with determining the total size of the population or including detailed information about people. Their main goal was to discover who was available for military duty and who held taxable property. These counts usually did not give an accurate number or picture of the population. They often left out large segments of society, such as women and children, men attempting to avoid military service or taxation, and native inhabitants of an area.

The earliest known population counts were made thousands of years ago by the ancient Babylonians, Chinese, and Egyptians. Around 2500 B.C., the Babylonians recorded on clay tablets information about the taxpaying part of the population. These tablets included such data as the number of farm animals, farm products, and households for districts within the kingdom. Tax returns from around 2300 B.C. for parts of ancient China indicate some kind of population count. About 1300 B.C., Egypt was divided into administrative districts. The government registered and counted heads of households and members of the households within these districts.

The fourth book of the Bible, the Book of Numbers, describes the census, or numbering, of the tribes in ancient Israel to determine the number of men of fighting age (Numbers 1: 1-46; Numbers 26: 1-51). In 594 B.C., the Greek lawmaker Solon introduced a form of enumeration and registration to reform tax laws in Greece.

The Romans employed census takers known as censors to determine the number of people who were eligible for taxation and military duty. The Roman censor was responsible for officially registering all citizens in a particular area, evaluating their property, collecting revenue, and guarding public morals. Perhaps the best-known Roman census is described in the New Testament story of the birth of Jesus Christ (Luke 2:1-7). This census took place about 5 B.C., when Joseph and Mary traveled to Bethlehem to record their names in a census ordered by the Roman emperor Augustus.

The practice of taking censuses declined in Europe after the fall of the West Roman Empire in A.D. 476. One of the few attempts to count people during the Middle Ages occurred in England in 1086. That year, commissioners sent by William the Conqueror traveled the kingdom and recorded, for tax purposes, the names of all English landowners and the value of their lands and houses, tenants, and servants. The resulting document, known as the Domesday Book, provides historians with a censuslike description of England at that time.

Through the years, with the rise in trade, the growth of towns, and the development of nations, rulers and government officials increasingly recognized the importance of counting people and goods. In 1665, King Louis XIV of France ordered a census in New France, in what is now Quebec, Canada. This census recorded the name of each person, along with such information as age, marital status, occupation, and relationship to the head of the household. The main purpose of this census was to collect information about the colony's progress, rather than to assess how much military service or tax revenue the colonists might provide. Because of this purpose, census historians generally consider the New France enumeration to be the model for modern censuses.

Likewise, in 1703, there was a house-to-house census in Iceland for reasons other than taxation and military service. This census inquired into the effects of economic conditions and natural disasters. The government then used the information to develop programs for economic and social improvement.

A number of European countries undertook censuses of individual cities and provinces in the early 1700’s. However, none of these enumerations counted the total population of a nation until 1749. That year, the Swedish government conducted the first national census.

The first modern census — one that was complete, direct, and scheduled to be repeated at regular intervals — was the United States census of 1790. In the 1800’s, a number of other countries began taking regular censuses. In 1853, an International Statistical Congress was held in Brussels, Belgium. This conference represented the first attempt to adopt international recommendations and requirements to help in comparing population census data among various countries.

After World War II ended in 1945, censuses became especially important as an aid in planning for the economic reconstruction of countries that had been heavily damaged in the war. In 1946, the United Nations established a separate Population and Statistical Commission, which recognized the need for census statistics. Since then, the United Nations has published a number of principles and recommendations for population and housing censuses to assist countries in the planning of censuses. Following these recommended standards allows for international comparison of collected data. In addition, the United Nations Fund for Population Activities provides many countries with financial and expert assistance for the planning of censuses.

Today, most censuses are proclaimed by a government decree or law and planned and executed by a statistical agency, a permanent or semipermanent census bureau, or both. These census acts or laws require every person to answer the questions to the best of his or her knowledge. Refusal to cooperate can result in a fine or even imprisonment.

Draaijer, Gera. "Census." World Book Advanced. World Book, 2011. Web. 5 June 2011.

7) After reading this, provide a conjecture for why the word “Statistics” is rooted in the Latin for “state”.

Part 3: For Today’s Graduate, Just One Word: Statistics

Enter your answers online at: https://docs.google.com/forms/d/12J7dgSXNKrP6u15AzKzBWRGBk-W6_0XMqGuOyog7hl8/viewform

Read the following:

The New York Times - August 6, 2009

For Today’s Graduate, Just One Word: Statistics
By STEVE LOHR

MOUNTAIN VIEW, Calif. — At Harvard, Carrie Grimes majored in anthropology and archaeology and ventured to places like Honduras, where she studied Mayan settlement patterns by mapping where artifacts were found. But she was drawn to what she calls “all the computer and math stuff” that was part of the job.

“People think of field archaeology as Indiana Jones, but much of what you really do is data analysis,” she said.

Now Ms. Grimes does a different kind of digging. She works at Google, where she uses statistical analysis of mounds of data to come up with ways to improve its search engine.

Ms. Grimes is an Internet-age statistician, one of many who are changing the image of the profession as a place for dronish number nerds. They are finding themselves increasingly in demand — and even cool.

“I keep saying that the sexy job in the next 10 years will be statisticians,” said Hal Varian, chief economist at Google. “And I’m not kidding.”

The rising stature of statisticians, who can earn $125,000 at top companies in their first year after getting a doctorate, is a byproduct of the recent explosion of digital data. In field after field, computing and the Web are creating new realms of data to explore — sensor signals, surveillance tapes, social network chatter, public records and more. And the digital data surge only promises to accelerate, rising fivefold by 2012, according to a projection by IDC, a research firm.

Yet data is merely the raw material of knowledge. “We’re rapidly entering a world where everything can be monitored and measured,” said Erik Brynjolfsson, an economist and director of the Massachusetts Institute of Technology’s Center for Digital Business. “But the big problem is going to be the ability of humans to use, analyze and make sense of the data.”

The new breed of statisticians tackle that problem. They use powerful computers and sophisticated mathematical models to hunt for meaningful patterns and insights in vast troves of data. The applications are as diverse as improving Internet search and online advertising, culling gene sequencing information for cancer research and analyzing sensor and location data to optimize the handling of food shipments.

Even the recently ended Netflix contest, which offered $1 million to anyone who could significantly improve the company’s movie recommendation system, was a battle waged with the weapons of modern statistics.

Though at the fore, statisticians are only a small part of an army of experts using modern statistical techniques for data analysis. Computing and numerical skills, experts say, matter far more than degrees. So the new data sleuths come from backgrounds like economics, computer science and mathematics.

They are certainly welcomed in the White House these days. “Robust, unbiased data are the first step toward addressing our long-term economic needs and key policy priorities,” Peter R. Orszag, director of the Office of Management and Budget, declared in a speech in May. Later that day, Mr. Orszag confessed in a blog entry that his talk on the importance of statistics was a subject “near to my (admittedly wonkish) heart.”

I.B.M., seeing an opportunity in data-hunting services, created a Business Analytics and Optimization Services group in April. The unit will tap the expertise of the more than 200 mathematicians, statisticians and other data analysts in its research labs — but that number is not enough. I.B.M. plans to retrain or hire 4,000 more analysts across the company.

In another sign of the growing interest in the field, an estimated 6,400 people are attending the statistics profession’s annual conference in Washington this week, up from around 5,400 in recent years, according to the American Statistical Association. The attendees, men and women, young and graying, looked much like any other crowd of tourists in the nation’s capital. But their rapt exchanges were filled with talk of randomization, parameters, regressions and data clusters. The data surge is elevating a profession that traditionally tackled less visible and less lucrative work, like figuring out life expectancy rates for insurance companies.

Ms. Grimes, 32, got her doctorate in statistics from Stanford in 2003 and joined Google later that year. She is now one of many statisticians in a group of 250 data analysts. She uses statistical modeling to help improve the company’s search technology.

For example, Ms. Grimes worked on an algorithm to fine-tune Google’s crawler software, which roams the Web to constantly update its search index. The model increased the chances that the crawler would scan frequently updated Web pages and make fewer trips to more static ones.

The goal, Ms. Grimes explained, is to make tiny gains in the efficiency of computer and network use. “Even an improvement of a percent or two can be huge, when you do things over the millions and billions of times we do things at Google,” she said.

It is the size of the data sets on the Web that opens new worlds of discovery. Traditionally, social sciences tracked people’s behavior by interviewing or surveying them. “But the Web provides this amazing resource for observing how millions of people interact,” said Jon Kleinberg, a computer scientist and social networking researcher at Cornell.

For example, in research just published, Mr. Kleinberg and two colleagues followed the flow of ideas across cyberspace. They tracked 1.6 million news sites and blogs during the 2008 presidential campaign, using algorithms that scanned for phrases associated with news topics like “lipstick on a pig.”

The Cornell researchers found that, generally, the traditional media leads and the blogs follow, typically by 2.5 hours. But a handful of blogs were quickest to quotes that later gained wide attention.

The rich lode of Web data, experts warn, has its perils. Its sheer volume can easily overwhelm statistical models. Statisticians also caution that strong correlations of data do not necessarily prove a cause-and-effect link.

For example, in the late 1940s, before there was a polio vaccine, public health experts in America noted that polio cases increased in step with the consumption of ice cream and soft drinks, according to David Alan Grier, a historian and statistician at George Washington University. Eliminating such treats was even recommended as part of an anti-polio diet. It turned out that polio outbreaks were most common in the hot months of summer, when people naturally ate more ice cream, showing only an association, Mr. Grier said.

If the data explosion magnifies longstanding issues in statistics, it also opens up new frontiers.

“The key is to let computers do what they are good at, which is trawling these massive data sets for something that is mathematically odd,” said Daniel Gruhl, an I.B.M. researcher whose recent work includes mining medical data to improve treatment. “And that makes it easier for humans to do what they are good at — explain those anomalies.”
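The polio and ice cream story in the article above can be imitated with a small simulation. The sketch below is only an illustration with made-up numbers: the temperature range, the slopes, and the noise levels are all assumptions, not historical data. It shows how a third variable (hot weather) can make two otherwise unrelated quantities move together, and how the association mostly disappears once temperature is accounted for.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing its straight-line trend in xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [y - mean_y - slope * (x - mean_x) for x, y in zip(xs, ys)]


def simulate_weeks(n_weeks=200, seed=0):
    """Hypothetical weekly data: temperature drives BOTH other variables."""
    rng = random.Random(seed)
    temps, ice_cream, polio = [], [], []
    for _ in range(n_weeks):
        t = rng.uniform(0, 35)                       # made-up temperature (deg C)
        temps.append(t)
        ice_cream.append(2.0 * t + rng.gauss(0, 10))  # sales rise with heat
        polio.append(0.5 * t + rng.gauss(0, 3))       # cases rise with heat
    return temps, ice_cream, polio


temps, ice_cream, polio = simulate_weeks()

# Raw association: strong, even though neither variable affects the other.
r = pearson(ice_cream, polio)

# Association after removing the temperature trend from both: close to zero.
r_adj = pearson(residuals(ice_cream, temps), residuals(polio, temps))

print(f"correlation, raw:                  {r:.2f}")
print(f"correlation, temperature removed:  {r_adj:.2f}")
```

In this toy setup ice cream never influences polio, yet the raw correlation is large. That is the sense in which a strong correlation does not, on its own, establish cause and effect.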

What is Statistics? by Jordan Neus (from http://www.fiu.edu/~neusj/whatisstatistics.html)

Statistics is becoming increasingly important in modern society. We are constantly being bombarded with charts, graphs, and statistics of various types in an attempt to provide us with succinct information to make decisions. Sometimes this information is presented in a manner so as to sway us toward a particular view. As consumers and decision makers we must be aware of this. Which drug should we take? Which car should we buy? Where will the economy go? Who is infected with a particular deadly disease? These are all examples of questions which are usually relegated to the statistician for analysis and dissemination. This lecture will attempt to introduce the beginning student to some of the reasoning behind the necessity of statistical inference.

In order to realistically understand the subject of Statistics it is important to appreciate the rationale behind why and how Statistics is used by the world at large. That is, why do we need Statistics anyway? This, perhaps, is a bit philosophical, yet I cannot overemphasize the need for thinking along these lines. Without proper perspective, Statistics becomes a mere mathematical exercise, diverging from the true nature of the subject.

In order to begin our analysis as to why Statistics is a necessary type of reasoning we must begin by addressing the nature of science and experimentation. A characteristic method used by scientists is to study a relatively small collection of objects, say 2500 people, and a characteristic, say longevity, and through experimentation or observation, draw a conclusion appropriate for the entire class of objects (i.e. people, in general). For example, suppose a study published results suggesting people who own pets live longer. Would this mean that all people who own pets are likely to live long lives? Does owning a pet cause longevity? Suppose the people in the study, by chance, were on the whole, very healthy people, and therefore lived long lives: Would this invalidate the researcher’s assertion that people who own pets live longer? The obvious problem with this type of reasoning is that these issues can never be proved absolutely. This type of scientific reasoning is called inductive reasoning and is inherently flawed. One can never study a sample and expect conclusions to hold true for the entire population with absolute certainty. This is exactly why Statistics is needed.

In contrast to the lack of certainty associated with inductive reasoning, the type of logic used in Mathematics is absolutely certain. The mathematician begins with general principles and logically concludes more specific relationships. This type of reasoning from the general to the particular is called deductive reasoning. A rather simplistic (but nevertheless correct) example is based on the principle that two numbers can be added in any order, thereby giving the same sum. This is called the axiom of commutativity. An example of deductive reasoning would be to assert that since this holds for any two numbers, surely this must hold for the numbers two and three, in particular. We are, therefore, absolutely certain that 2 + 3 = 3 + 2, given the axiom of commutativity.

In its applied form, Statistics then becomes a bridge between the inductive uncertainty of science and the deductive certainty of Mathematics. In his classic book, The Design of Experiments, Sir Ronald A. Fisher expresses this idea beautifully:

We may at once admit that any inference from the particular to the general must be attended with some degree of uncertainty, but this is not the same as to admit that such inference cannot be absolutely rigorous, for the nature and degree of the uncertainty may itself be capable of rigorous expression.

Statistics, therefore, is the mathematical method by which the uncertainty inherent in the scientific method is rigorously quantified.
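To make "rigorously quantified" a little more concrete, here is a minimal sketch in Python. It assumes a made-up population in which 30% of people have some trait (the 30% figure, the trait, and the function names are all hypothetical), draws a sample of 2500 as in the example above, and attaches an approximate 95% interval to the sample proportion using the usual large-sample standard error formula. It then repeats the whole study many times to see how often the interval captures the true value.

```python
import math
import random


def sample_proportion(true_p, n, rng):
    """Fraction of n randomly chosen people who have the trait."""
    hits = sum(1 for _ in range(n) if rng.random() < true_p)
    return hits / n


def interval_95(p_hat, n):
    """Approximate 95% interval: p_hat plus or minus 1.96 standard errors."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - 1.96 * se, p_hat + 1.96 * se


def coverage(true_p=0.30, n=2500, repeats=200, seed=1):
    """Share of repeated studies whose interval contains the true proportion."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(repeats):
        low, high = interval_95(sample_proportion(true_p, n, rng), n)
        if low <= true_p <= high:
            covered += 1
    return covered / repeats


low, high = interval_95(0.30, 2500)
print(f"interval if the sample shows 30%: {low:.3f} to {high:.3f}")
print(f"share of repeated studies that capture the truth: {coverage():.2f}")
```

No single sample gives certainty about the population, but the procedure as a whole has a known, calculable reliability. That is one way to read the Fisher quotation: the conclusion is uncertain, while the size of the uncertainty is not.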

View the following TED Talk:

Arthur Benjamin: Teach statistics before calculus! http://www.ted.com/talks/arthur_benjamin_s_formula_for_changing_math_education?language=en

8) React to the above pieces and TED Talk in at least three paragraphs:

Part 4: Data and Its Context + Reading Comprehension Involving Statistics

Enter your answers online at: https://docs.google.com/forms/d/1Rw25yXlC3CaaRCuNq7AcY69jriTST4Lag65SzU1GcNE/viewform

Read the following…

“Teen Automobile Crash Rates Are Higher When School Starts Earlier”

ScienceDaily (June 10, 2010) — Earlier school start times are associated with increased teenage car crash rates, according to a research abstract presented June 9, 2010, in San Antonio, Texas, at SLEEP 2010, the 24th annual meeting of the Associated Professional Sleep Societies LLC.

Results indicate that in 2008 the teen crash rate was about 41 percent higher in Virginia Beach, Va., where high school classes began at 7:20 a.m., than in adjacent Chesapeake, Va., where classes started more than an hour later at 8:40 a.m. There were 65.4 automobile crashes for every 1,000 teen drivers in Virginia Beach, and 46.2 crashes for every 1,000 teen drivers in Chesapeake.

"We were concerned that Virginia Beach teens might be sleep restricted due to their early rise times and that this could eventuate in an increased crash rate," said lead author Robert Vorona, MD, associate professor of internal medicine at Eastern Virginia Medical School in Norfolk, Va. "The study supported our hypothesis, but it is important to note that this is an association study and does not prove cause and effect."

The study involved data provided by the Virginia Department of Motor Vehicles. In Virginia Beach there were 12,916 drivers between 16 and 18 years of age in 2008, and these teen drivers were involved in 850 crashes. In Chesapeake there were 8,459 teen drivers and 394 automobile accidents. The researchers report that the two adjoining cities have similar demographics, including racial composition and per-capita income.
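The numbers in the excerpt above can be checked with a few lines of arithmetic. The sketch below is only an illustration (the function names are my own): it recomputes crashes per 1,000 teen drivers from the raw counts, and the percent difference between the two cities both from those recomputed rates and from the rates as reported.

```python
def rate_per_1000(crashes, drivers):
    """Crashes for every 1,000 drivers."""
    return 1000 * crashes / drivers


def percent_higher(a, b):
    """How much higher a is than b, as a percentage of b."""
    return 100 * (a - b) / b


# Rates recomputed from the counts given in the excerpt.
vb_rate = rate_per_1000(850, 12916)   # Virginia Beach
ch_rate = rate_per_1000(394, 8459)    # Chesapeake
pct_from_counts = percent_higher(vb_rate, ch_rate)

# Percent difference using the rates exactly as reported (65.4 and 46.2).
pct_from_reported = percent_higher(65.4, 46.2)

print(f"Virginia Beach: {vb_rate:.1f} per 1,000")
print(f"Chesapeake:     {ch_rate:.1f} per 1,000")
print(f"Percent higher (from counts):         {pct_from_counts:.1f}%")
print(f"Percent higher (from reported rates): {pct_from_reported:.1f}%")
```

The recomputed rates come out near 65.8 and 46.6 per 1,000, slightly different from the reported 65.4 and 46.2. The excerpt does not say why; rounding or a slightly different count of drivers are possibilities, but that is a guess. Either way, the difference is a little over 41 percent, which matches the "about 41 percent higher" claim.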

1) Answer the following questions regarding the above excerpt:

a) Who is being studied?

b) What about those individuals is being recorded / analyzed (i.e. what are the variables?)? Do you think the variables are categorical or quantitative in nature?

c) When was the data collected?

d) Where was the data collected (more accurately: what geographical area is associated with the data)?

e) Why do you think this data was collected and analyzed?

f) How was the data collected and analyzed? In other words, what methods were used?

g) Why do you think the authors of the study mentioned that “it is important to note that this is an association

study and does not prove cause and effect?”

2) Answer the same questions in (a) – (f) above, except now do it for the article that you found regarding

statistics:

Part 5: Course Description

Enter your answers online at:

https://docs.google.com/forms/d/1PULEBqzFeXbeEJwONaHkqjFYU6h-LtKksB0rZ-t9xcc/viewform

Read the following and answer the questions at the end:

Highlights from the AP Statistics Course Description (from http://apcentral.collegeboard.com/apc/public/repository/ap-statistics-course-description.pdf)

Introduction

The Advanced Placement Program offers a course description and exam in statistics to secondary school

students who wish to complete studies equivalent to a one semester, introductory, non-calculus-based, college

course in statistics.

Statistics and mathematics educators who serve as members of the AP Statistics Development

Committee have prepared the Course Description and exam to reflect the content of a typical introductory

college course in statistics. The exam is representative of such a course and therefore is considered appropriate

for the measurement of skills and knowledge in the field of introductory statistics.

In colleges and universities, the number of students who take a statistics course is almost as large as the

number of students who take a calculus course. A July 2002 article in the Chronicle of Higher Education

reports that the enrollment in statistics courses from 1990 to 2000 increased by 45 percent — one testament to

the growth of statistics in those institutions. An introductory statistics course, similar to the AP Statistics course,

is typically required for majors such as social sciences, health sciences and business. Every semester about

236,000 college and university students enroll in an introductory statistics course offered by a mathematics or

statistics department. In addition, a large number of students enroll in an introductory statistics course offered

by other departments. Science, engineering and mathematics majors usually take an upper-level calculus-based

course in statistics, for which the AP Statistics course is effective preparation.

The Course

The purpose of the AP course in statistics is to introduce students to the major concepts and tools for collecting,

analyzing and drawing conclusions from data. Students are exposed to four broad conceptual themes:

1. Exploring Data: Describing patterns and departures from patterns

2. Sampling and Experimentation: Planning and conducting a study

3. Anticipating Patterns: Exploring random phenomena using probability and simulation

4. Statistical Inference: Estimating population parameters and testing hypotheses

AP Statistics Course Content Overview

The topics for AP Statistics are divided into four major themes: exploratory analysis (20–30 percent of the

exam), planning and conducting a study (10–15 percent of the exam), probability (20–30 percent of the exam),

and statistical inference (30–40 percent of the exam).

I. Exploratory analysis of data makes use of graphical and numerical techniques to study patterns and

departures from patterns. In examining distributions of data, students should be able to detect important

characteristics, such as shape, location, variability and unusual values. From careful observations of patterns in

data, students can generate conjectures about relationships among variables. The notion of how one variable

may be associated with another permeates almost all of statistics, from simple comparisons of proportions

through linear regression. The difference between association and causation must accompany this conceptual

development throughout.

II. Data must be collected according to a well-developed plan if valid information is to be obtained. If data

are to be collected to provide an answer to a question of interest, a careful plan must be developed. Both the

type of analysis that is appropriate and the nature of conclusions that can be drawn from that analysis depend in

a critical way on how the data was collected. Collecting data in a reasonable way, through either sampling or

experimentation, is an essential step in the data analysis process.

III. Probability is the tool used for anticipating what the distribution of data should look like under a given

model. Random phenomena are not haphazard: they display an order that emerges only in the long run and is

described by a distribution. The mathematical description of variation is central to statistics. The probability

required for statistical inference is not primarily axiomatic or combinatorial but is oriented toward using

probability distributions to describe data.

IV. Statistical inference guides the selection of appropriate models. Models and data interact in statistical

work: models are used to draw conclusions from data, while the data are allowed to criticize and even falsify the

model through inferential and diagnostic methods. Inference from data can be thought of as the process of

selecting a reasonable model, including a statement in probability language, of how confident one can be about

the selection.

Topic Outline

The percentages in parentheses for each content area indicate the coverage for that content area in the exam.

I. Exploring Data: Describing patterns and departures from patterns (20%–30%)

Exploratory analysis of data makes use of graphical and numerical techniques to study patterns and departures

from patterns. Emphasis should be placed on interpreting information from graphical and numerical displays

and summaries.

A. Constructing and interpreting graphical displays of distributions of univariate data (dotplot, stemplot,

histogram, cumulative frequency plot)

1. Center and spread

2. Clusters and gaps

3. Outliers and other unusual features

4. Shape

B. Summarizing distributions of univariate data

1. Measuring center: median, mean

2. Measuring spread: range, interquartile range, standard deviation

3. Measuring position: quartiles, percentiles, standardized scores (z-scores)

4. Using boxplots

5. The effect of changing units on summary measures

C. Comparing distributions of univariate data (dotplots, back-to-back stemplots, parallel boxplots)

1. Comparing center and spread: within group, between group variation

2. Comparing clusters and gaps

3. Comparing outliers and other unusual features

4. Comparing shapes

D. Exploring bivariate data

1. Analyzing patterns in scatterplots

2. Correlation and linearity

3. Least-squares regression line

4. Residual plots, outliers, and influential points

5. Transformations to achieve linearity: logarithmic and power transformations

E. Exploring categorical data

1. Frequency tables and bar charts

2. Marginal and joint frequencies for two-way tables

3. Conditional relative frequencies and association

4. Comparing distributions using bar charts

II. Sampling and Experimentation: Planning and conducting a study (10%–15%)

Data must be collected according to a well-developed plan if valid information on a conjecture is to be

obtained. This plan includes clarifying the question and deciding upon a method of data collection and analysis.

A. Overview of methods of data collection

1. Census

2. Sample survey

3. Experiment

4. Observational study

B. Planning and conducting surveys

1. Characteristics of a well-designed and well-conducted survey

2. Populations, samples, and random selection

3. Sources of bias in sampling and surveys

4. Sampling methods, including simple random sampling, stratified random sampling, and cluster

sampling

C. Planning and conducting experiments

1. Characteristics of a well-designed and well-conducted experiment

2. Treatments, control groups, experimental units, random assignments, and replication

3. Sources of bias and confounding, including placebo effect and blinding

4. Completely randomized design

5. Randomized block design, including matched pairs design

D. Generalizability of results and types of conclusions that can be drawn from observational studies,

experiments, and surveys

III. Anticipating Patterns: Exploring random phenomena using probability and simulation (20%–30%)

Probability is the tool used for anticipating what the distribution of data should look like under a given model.

A. Probability

1. Interpreting probability, including long-run relative frequency interpretation

2. “Law of Large Numbers” concept

3. Addition rule, multiplication rule, conditional probability, and independence

4. Discrete random variables and their probability distributions, including binomial and geometric

5. Simulation of random behavior and probability distributions

6. Mean (expected value) and standard deviation of a random variable, and linear transformation of

a random variable

B. Combining independent random variables

1. Notion of independence versus dependence

2. Mean and standard deviation for sums and differences of independent random variables

C. The normal distribution

1. Properties of the normal distribution

2. Using tables of the normal distribution

3. The normal distribution as a model for measurements

D. Sampling distributions

1. Sampling distribution of a sample proportion

2. Sampling distribution of a sample mean

3. Central Limit Theorem

4. Sampling distribution of a difference between two independent sample proportions

5. Sampling distribution of a difference between two independent sample means

6. Simulation of sampling distributions

7. t-distribution

8. Chi-square distribution

IV. Statistical Inference: Estimating population parameters and testing hypotheses (30%–40%)

Statistical inference guides the selection of appropriate models.

A. Estimation (point estimators and confidence intervals)

1. Estimating population parameters and margins of error

2. Properties of point estimators, including unbiasedness and variability

3. Logic of confidence intervals, meaning of confidence level and confidence intervals, and

properties of confidence intervals

4. Large sample confidence interval for a proportion

5. Large sample confidence interval for a difference between two proportions

6. Confidence interval for a mean

7. Confidence interval for a difference between two means (unpaired and paired)

8. Confidence interval for the slope of a least-squares regression line

B. Tests of significance

1. Logic of significance testing, null and alternative hypotheses; p-values; one- and two-sided tests;

concepts of Type I and Type II errors; concept of power

2. Large sample test for a proportion

3. Large sample test for a difference between two proportions

4. Test for a mean

5. Test for a difference between two means (unpaired and paired)

6. Chi-square test for goodness of fit, homogeneity of proportions, and independence (one- and

two-way tables)

7. Test for the slope of a least-squares regression line

The Use of Technology

The AP Statistics course adheres to the philosophy and methods of modern data analysis. Although the

distinction between graphing calculators and computers is becoming blurred as technology advances, at present

the fundamental tool of data analysis is the computer. The computer does more than eliminate the drudgery of

hand computation and graphing — it is an essential tool for structured inquiry.

Data analysis is a journey of discovery. It is an iterative process that involves a dialogue between the

data and a mathematical model. As more is learned about the data, the model is refined and new questions are

formed. The computer aids in this journey in some essential ways. First, it produces graphs that are specifically

designed for data analysis. These graphical displays make it easier to observe patterns in data, to identify

important subgroups of the data and to locate any unusual data points. Second, the computer allows the student

to fit complex mathematical models to the data and to assess how well the model fits the data by examining the

residuals. Finally, the computer is helpful in identifying an observation that has an undue influence on the

analysis and in isolating its effects.

In addition to its use in data analysis, the computer facilitates the simulation approach to probability that

is emphasized in the AP Statistics course. Probabilities of random events, probability distributions of random

variables and sampling distributions of statistics can be studied conceptually, using simulation. This frees the

student and teacher from a narrow approach that depends on a few simple probabilistic models.

Because the computer is central to what statisticians do, it is considered essential for teaching the AP

Statistics course. However, it is not yet possible for students to have access to a computer during the AP

Statistics Exam. Without a computer and under the conditions of a timed exam, students cannot be asked to

perform the amount of computation that is needed for many statistical investigations. Consequently, standard

computer output will be provided as necessary and students will be expected to interpret it.

Currently, the graphing calculator is the only computational aid that is available to students for use as a

tool for data analysis on the AP Exam.

Formulas and Tables

Students enrolled in the AP Statistics course should concentrate their time and effort on developing a

thorough understanding of the fundamental concepts of statistics. They do not need to memorize formulas. [A]

list of formulas and tables will be furnished to students taking the AP Statistics Exam.

The Exam

The AP Statistics Exam is 3 hours long and seeks to determine how well a student has mastered the

concepts and techniques of the subject matter of the course. This paper-and-pencil exam consists of (1) a 90-

minute multiple-choice section testing proficiency in a wide variety of topics, and (2) a 90-minute free-response

section requiring the student to answer open-ended questions and to complete an investigative task involving

more extended reasoning. In the determination of the score for the exam, the two sections will be given equal

weight.

Each student will be expected to bring a graphing calculator with statistical capabilities to the exam. The

expected computational and graphic features for these calculators are described in an earlier section.

Minicomputers, pocket organizers, electronic writing pads and calculators with qwerty (i.e., typewriter)

keyboards will not be allowed. Calculator memories will not be cleared. However, calculator memories may be

used only for storing programs, not for storing notes. A student may bring up to two calculators to the exam.

Multiple-Choice Questions

On the AP exam, there will be 40 multiple choice questions with five answer choices each.

Multiple-choice scores are based on the number of questions answered correctly. Points are not

deducted for incorrect answers, and no points are awarded for unanswered questions. Because no points are

deducted for incorrect answers, students are encouraged to answer all multiple-choice questions. On difficult

questions, students should eliminate as many incorrect answer choices as they can, and then make an educated

guess among the remaining choices.

Free-Response Questions

In the free-response section of the AP Statistics Exam, students are asked to answer five questions and

complete an investigative task. Each question is designed to be answered in approximately 12 minutes. The

longer investigative task is designed to be answered in approximately 30 minutes.

Statistics is a discipline in which clear and complete communication is an essential skill. The free-

response questions on the AP Statistics Exam require students to use their analytical, organizational and

communication skills to formulate cogent answers and provide students with an opportunity to:

Relate two or more different content areas (i.e., exploratory data analysis, experimental design

and sampling, probability, and statistical inference) as they formulate a complete response or

solution to a statistics or probability problem.

Demonstrate their mastery of statistics in a response format that permits the students to

determine how they will organize and present each response.

The purpose of the investigative task is not only to evaluate the student’s understanding in several

content areas but also to assess his or her ability to integrate statistical ideas and apply them in a new context or

in a nonroutine way.

Scoring of Free-Response Questions

The evaluation of student responses on the free-response section of the AP Statistics Exam reflects the

dual importance of statistical knowledge and good communication. The free-response questions and the

investigative task are scored “holistically”; that is, each question’s response is evaluated as “a complete

package.” With holistic scoring, after reading through the details of a student’s response, the scorer makes a

judgment about the overall quality of the response. This is different from “analytic” scoring, where the

individual components to be evaluated in a student’s response are specified in advance, and each component is

given a value counting toward the overall score.

The AP Statistics scoring guideline (rubric) for each free-response question has five categories,

numerically scored on a 0 to 4 scale. Each of these categories represents a level of quality in the student

response. These levels of quality are defined on two dimensions: statistical knowledge and communication. The

specific rubrics for each question are tied to a general template, which represents the descriptions of the quality

levels as envisioned by the Development Committee. This general template is given in the following table, “A

Guide to Scoring Free-Response Statistics Questions.”

A GUIDE TO SCORING FREE-RESPONSE STATISTICS QUESTIONS: THE CATEGORY

DESCRIPTORS

At Mercer Island High School, if you take an AP course we (and most universities) would like you

to take the AP exam to complete the full Advanced Placement experience.

The AP Statistics Exam will occur at noon on Thursday, May 12th.

The fee for each AP Exam is $91.

If you are a student with special needs, you should talk with your guidance counselor about

applying for accommodations with the College Board.

QUESTIONS

1) Name the four main content areas in AP Statistics and next to each, place the percentage of the exam that

corresponds to that area

a)

b)

c)

d)

2) The AP Statistics exam is ___ hours long, in total.

3) 50% of your score on the exam comes from the __________________ section and the remaining 50% comes

from the _____________________ section.

4) There are ____ multiple choice questions and you have ___ minutes to complete them.

5) There are ___ standard free response questions followed by a larger free response question, called the

_________________________.

6) True or False: Formulas are provided for you on the AP exam. TRUE FALSE

7) Free-response questions are scored out of __ points.

8) True or False: It is possible to receive full credit on a free response question if a minor mathematical error is

present in your response. TRUE FALSE

9) True or False: There is a ¼ point penalty for each incorrect multiple choice answer given. TRUE FALSE

10) True or False: AP Statistics is a calculus-based course. TRUE FALSE

11) True or False: Students are expected to know how to read and interpret computer output on the AP

Statistics exam. TRUE FALSE

12) The fee to take an AP exam is $ ___.

13) True or False: Calculator programs are allowed on the AP Statistics exam. TRUE FALSE

Part 6: Summer Survey

Enter your answers online at:

https://docs.google.com/forms/d/1QE1JmhOW1UzzKizj2QSGe2VmJETBNInGcp5UuDLpAtU/viewform

Part 7: Summer Work Packet

Complete these on your own paper.

Before you start, you might find the “Quick Reference” of Statistics Basics helpful.

Part 1: Displaying and Describing Categorical Data

CATEGORICAL OR QUANTITATIVE

Determine if the variables listed below are quantitative or categorical.

1. Time it takes to get to school

2. Number of people under 18 living in a household

3. Hair color

4. Temperature of a cup of coffee

5. Teacher salaries

6. Gender

7. Smoking

8. Height

9. Amount of oil spilled

10. Age of Oscar winners

11. Type of Depression medication

12. Jellybean flavors

13. Country of origin

14. Type of meat

15. Number of shoes owned

ACCIDENTAL DEATHS

In 1997 there were 92,353 deaths from accidents in the United States. Among these were 42,340 deaths from motor

vehicle accidents, 11,858 from falls, 10,163 from poisoning, 4,051 from drowning, and 3,601 from fires. The rest

were listed as “other” causes.

a. Find the percent of accidental deaths from each of these causes, rounded to the nearest percent.

b. What percent of accidental deaths were from “other” causes? Show how you determined your answer.

c. NEATLY create a well-labeled bar graph of the distribution of causes of accidental deaths. Be sure to include an

“other causes” bar.

d. NEATLY create a well-labeled pie chart of the distribution of causes of accidental deaths. Be sure to include an

“other causes” wedge. Be sure the pie “wedges” are proportionally sized to each category.

Pick a simple question with simple responses that you would like to ask (e.g. Do you prefer iPhone, Blackberry,

or Android?)

Ask 30 random people the question, and record their response as well as their gender (try to get a roughly

equivalent number of boys and girls):

Summarize your results in a table:

Summarize your findings in one or more graphs:

Does one’s gender appear to be independent of how one responds to this question? Explain, and use your

results to support your argument.

# Response to Question Gender

1 M F

2 M F

3 M F

4 M F

5 M F

6 M F

7 M F

8 M F

9 M F

10 M F

11 M F

12 M F

13 M F

14 M F

15 M F

16 M F

17 M F

18 M F

19 M F

20 M F

21 M F

22 M F

23 M F

24 M F

25 M F

26 M F

27 M F

28 M F

29 M F

30 M F
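One way to look at the independence question is to compare the share of each answer within each gender (conditional relative frequencies). The sketch below uses made-up responses, not real survey data, and the names `responses` and `conditional_distribution` are only illustrative. Swap in your own 30 responses.

```python
from collections import Counter

# Hypothetical responses (not real survey data): (gender, answer) pairs.
responses = (
    [("M", "iPhone")] * 6 + [("M", "Android")] * 9 +
    [("F", "iPhone")] * 10 + [("F", "Android")] * 5
)


def conditional_distribution(pairs, group):
    """Share of each answer among the respondents in the given group."""
    answers = Counter(answer for g, answer in pairs if g == group)
    total = sum(answers.values())
    return {answer: count / total for answer, count in answers.items()}


male = conditional_distribution(responses, "M")
female = conditional_distribution(responses, "F")

# Print the two conditional distributions side by side.
for answer in sorted(male):
    print(f"{answer:<8} M: {male[answer]:.0%}  F: {female[answer]:.0%}")
```

If the percentages are about the same for both genders, the response appears to be independent of gender. If they differ a lot, the two variables appear to be associated. With only 30 people, some difference can show up just by chance.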

AP Statistics Summer Assignment - 2014

1. Measures of Center:

Mean – the arithmetic average. If the sample contains the data $x_1, x_2, \ldots, x_n$,

then the mean is $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$.

Median – the number in the middle when the data is arranged in order. If there is an even

number of observations in the list, then the median is the average of the two middle

numbers.

Mode – the value that occurs the most frequently.
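The three measures of center can be tried out with Python's built-in `statistics` module. This is a minimal sketch on a made-up sample (the quiz scores are hypothetical, not from this packet).

```python
import statistics

# Hypothetical sample (not from the packet): quiz scores for eight students.
scores = [7, 9, 6, 9, 10, 5, 9, 8]

mean = statistics.mean(scores)      # sum of the values divided by how many there are
median = statistics.median(scores)  # even count, so the average of the two middle values
mode = statistics.mode(scores)      # the most frequent value

print(mean, median, mode)
```

Sorted, the scores are 5, 6, 7, 8, 9, 9, 9, 10, so the two middle values are 8 and 9.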

2. Measures of Spread:

Range – the difference between the largest and smallest values of a data distribution

Inter-quartile Range (IQR) – the difference between the third and first quartiles. The third

quartile, Q3, is often referred to as the upper quartile. Likewise, the first quartile, Q1, is

often referred to as the lower quartile. Therefore, the IQR = Q3 – Q1. Quartiles are those

percentiles that divide the data into fourths. The first quartile, Q1, is the 25th percentile.

The second quartile, Q2, is the median and also the 50th percentile. The third quartile, Q3, is

the 75th percentile. See the diagram below:

Caution: The diagram does NOT imply that the interval lengths are equal between the

quartiles!

Sample Standard Deviation – is denoted $s$ and is calculated as $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$, where $x$

is any entry in the distribution and $n$ is the number of entries. $\sum$ indicates the sum is taken

over all data values.

Sample Variance – this is the square of the standard deviation, denoted $s^2$.

[Diagram: a number line running from Least to Greatest, marked at Q1, Q2 (Median), and Q3, with 25% of the data in each of the four intervals.]
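To see the range, sample variance, and sample standard deviation worked out step by step, here is a short sketch on a made-up sample (the commute times are hypothetical). It computes the variance by hand using the n - 1 divisor and then checks the result against the `statistics` module.

```python
import math
import statistics

# Hypothetical sample (not from the packet): nine commute times in minutes.
times = [12, 15, 15, 18, 20, 22, 25, 30, 41]

n = len(times)
x_bar = sum(times) / n
data_range = max(times) - min(times)

# Sum of squared deviations from the mean, divided by n - 1, then square-rooted.
squared_deviations = sum((x - x_bar) ** 2 for x in times)
s_squared = squared_deviations / (n - 1)   # sample variance
s = math.sqrt(s_squared)                   # sample standard deviation

# The standard library should agree with the hand computation.
assert math.isclose(s, statistics.stdev(times))

print(data_range, x_bar, s_squared, round(s, 2))
```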

3. The Five-Number-Summary consists of the Least value (minimum), Q1, the Median, Q3, and the

Greatest value (maximum). We draw a box plot using these five numbers as shown.

Step 1. Draw a scale to include the minimum and maximum values. Make sure your scale is

consistently spaced.

Step 2. Above the scale draw a box from Q1 to Q3.

Step 3. Include a solid line through the box at the Median.

Step 4. Extend solid lines (whiskers) from the box to reach the minimum on the left and the

maximum on the right.

Note: This can also be done vertically.

4. Outliers – Some data sets include values so low or so high that they seem to stand apart from the

rest of the data. These values are called outliers. Outliers may come from data collection errors or data

entry errors, or they may simply be valid though unusual observations. Regardless of the reason, it is

important to identify outliers in the data set and examine them carefully to determine if they are in

error. The general rule of thumb for a single data set is to flag as an outlier any value that falls

below the Lower Limit of Q1 – 1.5 (IQR) or above the Upper Limit of Q3 + 1.5 (IQR).
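The five-number summary and the outlier rule of thumb can be sketched together. The example below uses made-up test scores, and the helper names are illustrative. It assumes one common convention for quartiles: Q1 and Q3 are the medians of the lower and upper halves of the sorted data. Calculators and software sometimes use other conventions, so quartiles can differ slightly.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumed convention: Q1 and Q3 are the medians of the lower and upper
    halves, leaving the overall median out of both halves when the count is odd.
    """
    ordered = sorted(data)
    half = len(ordered) // 2
    q1 = statistics.median(ordered[:half])
    q3 = statistics.median(ordered[len(ordered) - half:])
    return ordered[0], q1, statistics.median(ordered), q3, ordered[-1]


def outlier_limits(q1, q3):
    """Lower and upper limits from the 1.5 * IQR rule of thumb."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Hypothetical sample (not from the packet): ten test scores.
scores = [52, 71, 74, 75, 78, 80, 81, 84, 86, 90]

minimum, q1, median, q3, maximum = five_number_summary(scores)
lower_limit, upper_limit = outlier_limits(q1, q3)
outliers = [x for x in scores if x < lower_limit or x > upper_limit]

print(minimum, q1, median, q3, maximum)
print(lower_limit, upper_limit, outliers)
```

Here the IQR is 84 - 74 = 10, so the limits are 59 and 99, and only the score of 52 falls outside them.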

5. Histograms are graphs in which:

The bars have the same width (called class or bin width)

The bars always touch unless there is a zero frequency for that bin

The width of each bar represents a quantitative interval (not a category)

The height of each bar represents the frequency of those values occurring in the data set
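The counting behind a histogram can be sketched in a few lines. The example below uses made-up waiting times, and the function name `histogram_counts` is only illustrative. Each value is placed in an equal-width bin that includes its left edge but not its right edge, which is an assumption about how boundary values are handled.

```python
from collections import Counter


def histogram_counts(data, start, bin_width):
    """Count how many values fall in each bin [left, left + bin_width)."""
    counts = Counter()
    for x in data:
        index = int((x - start) // bin_width)
        counts[start + index * bin_width] += 1
    return counts


# Hypothetical sample (not from the packet): waiting times in minutes.
waits = [1, 2, 2, 3, 4, 6, 7, 11, 12, 18]
counts = histogram_counts(waits, start=0, bin_width=5)

# Print every bin from the first to the last, including empty ones,
# so gaps in the data stay visible.
for left in range(0, 20, 5):
    print(f"{left:>2}-{left + 5:<2} | {'*' * counts[left]}")
```

Most of the values pile up in the first bin and the counts trail off to the right, so this made-up distribution is skewed right.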

Several types of histograms are shown below and their shapes are described.

[Figure: histogram titled “Arsenic in Ground Water”; x-axis: parts per million (0 to 70), y-axis: frequency (0 to 35). Shape: Skewed Right.]

[Figure: box-and-whisker plot (box plot) of a sample, drawn above a scale from 0 to 18.]

[Figure: Histogram that is Skewed Left (y-axis: frequency).]

[Figure: Histogram that is Uniform (y-axis: frequency).]

[Figure: Histogram that is Symmetrical (y-axis: frequency). Roughly symmetrical; also single-peaked.]

[Figure: Histogram that is Bimodal (y-axis: frequency).]

Stem plot of the winning scores (08 | 3 represents a score of 83):

08 | 3
09 | 2 7 8 9
10 | 1 2 3 4 5 5 6 6 9
11 | 0 1 2 2 2 2 6 7 7 8 9
12 | 0 4 5 5 6 8
13 | 1 2 5
14 | 3

6. A stem and leaf plot is another graphical display for quantitative data.

To make a stem-and-leaf plot:

1) Divide the digits of each data value into two parts. The leftmost part is all the digits but one, and

is called the stem. The rightmost part is a single digit and is called the leaf.

2) Align the stems in a vertical column from least to greatest. Draw a vertical line to the right of all

the stems.

3) Place all the leaves with the same stem on the same row as the stem, and arrange the leaves in

increasing order horizontally.

4) Use a label to indicate the magnitude of the numbers in the display. For example, “3|1

represents 31 ounces” or “26|3 represents 2.63 grams.”

What does it take to win at sports? If you are talking about basketball, one sports writer gave the

answer. He listed the winning scores of the conference championship games over the last 35 years.

The scores for those games are shown on the left and the stem plot is shown on the right.

132 118 124 109 104 101 125 83 99

131 98 125 97 106 112 92 120 103

111 117 135 143 112 112 116 106 117

119 110 105 128 112 126 105 102
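The stem-and-leaf steps can also be sketched in code. The Python below (the function name `stem_plot` is illustrative) uses the 35 winning scores listed above, treats the last digit as the leaf and the remaining digits as the stem, and assumes non-negative whole numbers.

```python
from collections import defaultdict


def stem_plot(values):
    """Return the rows of a stem-and-leaf plot as strings.

    The stem is every digit but the last; the leaf is the last digit.
    Assumes non-negative whole numbers.
    """
    leaves = defaultdict(list)
    for value in sorted(values):
        leaves[value // 10].append(value % 10)
    width = len(str(max(leaves)))
    rows = []
    # Include every stem from least to greatest, even one with no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(f"{stem:0{width}d} | {row}".rstrip())
    return rows


scores = [
    132, 118, 124, 109, 104, 101, 125, 83, 99,
    131, 98, 125, 97, 106, 112, 92, 120, 103,
    111, 117, 135, 143, 112, 112, 116, 106, 117,
    119, 110, 105, 128, 112, 126, 105, 102,
]

for row in stem_plot(scores):
    print(row)
```

Sorting the values first puts the leaves in increasing order on each row, as step 3 requires.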

Part 2: Displaying and Describing Quantitative Data

Consider the following data set: {-2, 0, 4, 2, 2}

Find the mean (average) (show work)

Find the median (middle value) (show work):

Identify the mode:

If the number 20 was added to each number in the data set, what would the new mean be? (show work)

If the number 20 was added to each number in the data set, what would the new median be? (show work)

Which one changed more?

If you had 50 numbers arranged in numerical order, the median would be the average of the ___ and ___

numbers.

If you had 49 numbers arranged in numerical order, the median would be located at the ___ number.

Keep a record of the number of hours of sleep you get each night during a two-week period. Round each time

to the nearest half-hour (no quarter-hours, please).

Two-week period: _________________________________________________________

Mon Tue Wed Thu Fri Sat Sun

Week 1

Week 2

Determine the measures of center (mean, median and mode). Be ready to enter these on the Online Stats

Survey.

What do these values tell you about your “typical” sleep time?

Determine the measures of variability (range, standard deviation, interquartile range). Use your calculator; also

reference the

What do these values tell you about how your sleep times vary on average?

Use your data to create the following data displays:

Dotplot:

Stemplot:

Histogram (explain your choice of bin width):

Boxplot (are there any outliers?):

Exercises – Round off final answers to two decimal places.

1) The Grand Canyon and the Colorado River are beautiful, rugged, and sometimes dangerous.

Thomas Myers is a physician at the park clinic in Grand Canyon Village. Dr. Myers has recorded (for

a 5-year period) the number of visitor injuries at different landing points for commercial boat trips

down the Colorado River in the lower Grand Canyon.

Lower Canyon: Number of injuries per Landing Point Between Bright Angel and Lava Falls:

8, 1, 1, 10, 6, 7, 2, 14, 3, 0, 1, 13, 2, 1

a) Find the: Mean _______________ Median _____________ Mode _______________

b) The lower quartile is 1 and the upper quartile is 7. Find the lower limit for outliers: _______

Find the upper limit for outliers: ___________ Are there any outliers? ______________

If yes, which numbers are they? ______________

c) Name and list the 5 numbers used in the five-number summary.

____________________ ____________________ ____________________ ____________________ ____________________

Draw a boxplot of these data.

d) Find the: Range _______________ Inter-Quartile Range (IQR) ____________________
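Here is a minimal sketch of how the outlier limits in part (b) could be checked in Python. It assumes the common 1.5 × IQR convention for outlier limits (an assumption; use whatever rule we define in class) and takes the quartiles exactly as stated in part (b).

```python
import statistics

injuries = [8, 1, 1, 10, 6, 7, 2, 14, 3, 0, 1, 13, 2, 1]
q1, q3 = 1, 7  # lower and upper quartiles as given in part (b)

iqr = q3 - q1
# Assumed convention: a value is an outlier if it lies more than
# 1.5 * IQR below Q1 or above Q3.
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = [x for x in injuries if x < lower_limit or x > upper_limit]

# Five-number summary: minimum, Q1, median, Q3, maximum.
five_number_summary = (min(injuries), q1, statistics.median(injuries), q3, max(injuries))

print(lower_limit, upper_limit, outliers)
print(five_number_summary)
```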

2) Big Blossom Greenhouse was commissioned to develop an extra-large rose for the Rose Bowl

Parade. A random sample of blossoms from Hybrid A bushes yielded these diameters (in inches) for

mature peak blossoms: 2 3 4 5 6 8 10 10

a) Find the range. _______________

b) Find the mean, x̄. ____________ Find the sample standard deviation, s. _________________
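Your calculator will do this for you, but it helps to see the steps once. The sketch below computes the sample standard deviation by hand (dividing by n − 1) and cross-checks it against Python's `statistics.stdev`. Variable names are chosen only for this illustration.

```python
import math
import statistics

diameters = [2, 3, 4, 5, 6, 8, 10, 10]  # Hybrid A blossom diameters, in inches

n = len(diameters)
data_range = max(diameters) - min(diameters)
mean = sum(diameters) / n

# Sample variance: sum of squared deviations divided by (n - 1).
squared_deviations = [(x - mean) ** 2 for x in diameters]
sample_variance = sum(squared_deviations) / (n - 1)
sample_sd = math.sqrt(sample_variance)

# Cross-check against the standard library's sample standard deviation.
assert math.isclose(sample_sd, statistics.stdev(diameters))

print(data_range, round(mean, 2), round(sample_sd, 2))
```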


3) The frequency distribution shows the scores on an easy multiple choice algebra test.

Grade    Frequency
100      16
90       4
85       2
80       1
75       2

a) Explain why the mean is NOT x̄

b) Find the mean, x̄ _________________________. Find the median. ________________
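With a frequency table, each grade has to be counted as many times as it occurs. The sketch below is one possible way to do that in Python; it also prints the plain average of the five listed grades so you can see how much ignoring the frequencies would change the answer.

```python
import statistics

# Grade -> number of students who earned it (from the table above).
frequency = {100: 16, 90: 4, 85: 2, 80: 1, 75: 2}

total_count = sum(frequency.values())
weighted_mean = sum(grade * count for grade, count in frequency.items()) / total_count

# Expand the table into the full list of scores to find the median.
all_scores = sorted(grade for grade, count in frequency.items() for _ in range(count))
median = statistics.median(all_scores)

# For contrast only: the average of the five distinct grades, ignoring frequencies.
unweighted_average = sum(frequency) / len(frequency)

print(total_count, weighted_mean, median, unweighted_average)
```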

4) Ozone levels (in parts per million, ppm) were recorded at sites in New Jersey monthly between 1926

and 1971. Here are box plots of the data for each month (over 46 years) lined up in order (January =

1). A small circle indicates an outlier.

a) In what month was the highest ozone level recorded? _________________________

b) Which month has the largest IQR? _________________________

c) Which month has the smallest range? ________________________

d) Write a brief comparison of the ozone levels in January and June. _____________________

__________________________________________________________________________

______________________________________________________________________________

e) What annual pattern do you see in the ozone levels? ___________________________________

______________________________________________________________________________

______________________________________________________________________________


5) Many products come with owner registration or warranty cards. Usually, the consumer is asked a

few questions about his or her family and household income. Random samples of warranty or

registration cards for the indicated product revealed the household income distribution shown.

Categorize the shape of each histogram as uniform, symmetric, bimodal, skewed left, or skewed

right. If you were in charge of advertising, how would you use income-distribution information of

present customers to target ads for the indicated product?

6) How fast do horses run? Kentucky Derby winners top 30 mph. The cumulative frequency graph

below shows the percentage of derby winners that have run slower than a given speed. Note that

about 95% of the winning horses have run slower than 37 mph.

a) Estimate the median speed. ___________

b) Estimate the quartiles. ______ , ________

c) Estimate the range. _______ IQR _______

d) Create a box plot of these speeds.


7) American League baseball teams play their games with the designated hitter rule, meaning that

pitchers do not bat. The League believes that replacing the pitcher, traditionally a weak hitter, with

another player in the batting order produces more runs and generates more interest among fans.

Below are the average number of runs scored in American League and National League stadiums for

the first half of the 2001 season.

American 11.1 10.8 10.8 10.3 10.3 10.1 10.0 9.5 9.4 9.3 9.2 9.2 9.0 8.3

National 14.0 11.6 10.4 10.3 10.2 9.5 9.5 9.5 9.5 9.1 8.8 8.4 8.3 8.2 8.1 7.9

a) Complete the back-to-back stem-and-leaf plot of these data.

b) Write a few sentences comparing the average number of runs scored per game in the two

leagues. Include the shape, center, spread and unusual features.

c) Coors Field, in Denver, stands a mile above sea level, an altitude far greater than that of any

other major league ballpark. Some believe that the thinner air makes it harder for pitchers to

throw curve balls and easier for batters to hit the ball a long way. Do you see any evidence that

the 14 runs scored per game there is unusually high? ______________ Explain.


Part 3: Combinatorics and Probability

To complete this assignment, you need to view two videos on YouTube. The videos are about 10 minutes and 4 minutes long. After viewing, answer the problems below on a separate sheet of paper.

I. Title: DuPage Statistics: Basic Rules of Probability: http://www.youtube.com/watch?v=3HCu_7O1oEY

II. Title: Multiplication Rule (Probability “and”): http://www.youtube.com/watch?v=Q_7PR9kRXWs&feature=related

Probability Terms:

Probability

Empirical Probability Random Phenomenon

Theoretical Probability Law of large numbers

Event Outcome Equally Likely Outcomes

Sample Space Fair Dice/Coins/Cards

With Replacement Without Replacement

Arrangement Group

Venn diagram

Disjoint Events Mutually Exclusive Events

Intersection of Events Union of Events Complement of an event

Tree Diagram

Independent Events Dependent Events

Conditional Probability

Probability Rules

General Addition Rule Addition Rule for Disjoint Events

General Multiplication Rule Multiplication Rule for Independent Events

Show how you arrived at each answer. If you are having difficulty with these, check out the tutorials on this site: http://www.intmath.com/counting-probability/counting-probability-intro.php (there are many other good tutorial sites as well).

1) If there are 3 appetizers, 3 entrees, and 2 desserts available, how many different three-course meals are possible?

2) Suppose three coins are tossed, and each time, they turn up heads. What is the probability that the next toss will be heads? Explain.

3) How many ways are there to arrange the first five letters of the alphabet (no repetition of characters)?


4) How many 4 digit PINs (personal identification numbers) are possible if repetition of digits is allowed?

5) There are three slots available per day for oral presentations in a hypothetical class. If there are 25

students in the class, how many ways can the presentations be arranged on the first day?

6) For two standard 6 sided dice,

a. Go online and find a picture of a tree diagram representing tossing two dice. Attach or paste to

this paper.

b. What is the probability of rolling two sixes?

c. Of not rolling two sixes?

d. Of rolling a sum of three?

7) Two cards are drawn from a standard 52-card deck.

a. What important question must be answered about the draw of 2 cards?

b. What is the probability that they’re both aces?

8) 7 people (4 boys and 3 girls) are available to play basketball. How many 5-person teams are possible if each team must have 3 boys and 2 girls on it?

9) Let’s say a person makes 3 out of every 4 free-throws, on average. If they shoot four shots, what is the

probability that they will make exactly three?
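Once you have worked the counting problems by hand, you can check several of them with Python's `math` module. This is only a sketch under stated assumptions: for problem 5 it assumes the order of the three presentation slots matters, and for problem 9 it assumes the four shots are independent with the same 3/4 chance each time.

```python
import math

# Problem 1: multiplication (counting) principle.
meals = 3 * 3 * 2

# Problem 3: arrangements of 5 distinct letters.
letter_orders = math.factorial(5)

# Problem 4: 4 positions, 10 digits each, repetition allowed.
pins = 10 ** 4

# Problem 5: ordered choice of 3 students out of 25 (assumes slot order matters).
presentation_orders = math.perm(25, 3)

# Problem 8: choose 3 of 4 boys AND 2 of 3 girls (order does not matter).
teams = math.comb(4, 3) * math.comb(3, 2)

# Problem 9: binomial probability of exactly 3 makes in 4 independent shots.
p_make = 0.75
p_exactly_three = math.comb(4, 3) * p_make ** 3 * (1 - p_make) ** 1

print(meals, letter_orders, pins, presentation_orders, teams, p_exactly_three)
```

`math.perm` counts arrangements (order matters) and `math.comb` counts groups (order does not matter), which matches the "Arrangement" and "Group" terms in the vocabulary list.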


A good video to watch before the next problems: https://www.youtube.com/watch?v=bLNfsh8Ax38

10) Police report that 78% of drivers stopped on suspicion of drunk driving are given a breath test, 36% a blood test, and 22% both tests.

Draw a Venn diagram of this information in the top box at the right. Label each area clearly with the variable and the probability.

Draw a tree diagram of this information in the bottom box at the right. Label each branch clearly with the variable and the probability.

What is the probability that a randomly selected DUI suspect is given…

a. a test?

b. a blood test or a breath test, but not both?

c. neither test?

d. a blood test?

e. a blood test, if a breath test was already given at the scene?

f. Based on your answers to d and e, do you think that receiving a breath test and a blood test are independent? That is, does having a breath test appear to be related to the likelihood that you receive a blood test? Explain your reasoning.
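The arithmetic behind a Venn diagram like this one can be sketched in Python using the General Addition Rule and the definition of conditional probability. The variable names are made up for this illustration; the three percentages come from the problem statement.

```python
import math

p_breath = 0.78  # P(breath test)
p_blood = 0.36   # P(blood test)
p_both = 0.22    # P(breath test and blood test)

# General Addition Rule: P(A or B) = P(A) + P(B) - P(A and B).
p_either = p_breath + p_blood - p_both

# Exactly one test: in the union but not in the intersection.
p_exactly_one = p_either - p_both

# Complement of the union.
p_neither = 1 - p_either

# Conditional probability: P(blood | breath) = P(both) / P(breath).
p_blood_given_breath = p_both / p_breath

# The events are independent only if P(blood | breath) equals P(blood).
looks_independent = math.isclose(p_blood_given_breath, p_blood)

print(p_either, p_exactly_one, p_neither, p_blood_given_breath, looks_independent)
```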


A good video to watch before the next problem: https://www.youtube.com/watch?v=OPGG5nhH198

1. An insurance company is conducting a study in Connecticut of “at risk” drivers, those drivers most likely to be in an accident. The insurance company decides to define each of the following groups as “at risk” drivers:

1. Drivers under 21 years of age

2. Drivers over 75 years of age

3. Drivers of any age with a traffic ticket in the last year

The insurance company took a random sample of 1,000 Connecticut drivers to determine their age and whether they had received a ticket in the last year. These data are shown below:

BE SURE TO SHOW ALL WORK FOR EACH PROBLEM BELOW TO RECEIVE FULL CREDIT*

Example: The probability that a person is under 21 = P(under 21) = 53/1000 = 0.053

What is the probability that a randomly selected driver was over 75?

What is the probability that a randomly selected driver is considered “at risk”?

What is the probability that a given driver who is under 21 received a traffic ticket in the past year?

What is the probability that a randomly selected driver received a traffic ticket in the past year?

Based on your answers to the last two questions, do you think that receiving a traffic ticket and age are independent? That is, does age appear to be related to the likelihood that you receive a ticket? Explain your reasoning.

                     Under 21   Over 75   Other Ages (21-75)   Total
Traffic Ticket          24         11            218             253
No Traffic Tickets      29         84            634             747
Total                   53         95            852            1000
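Probabilities from a two-way table are just counts divided by the right total. The sketch below shows one way to organize that in Python; the dictionary layout and variable names are choices made for this illustration, and the counts are copied from the table.

```python
# Counts from the two-way table (rows: ticket status, columns: age group).
counts = {
    "ticket": {"under_21": 24, "over_75": 11, "other": 218},
    "no_ticket": {"under_21": 29, "over_75": 84, "other": 634},
}

total = sum(sum(row.values()) for row in counts.values())
under_21_total = counts["ticket"]["under_21"] + counts["no_ticket"]["under_21"]
over_75_total = counts["ticket"]["over_75"] + counts["no_ticket"]["over_75"]
ticket_total = sum(counts["ticket"].values())

p_under_21 = under_21_total / total
p_over_75 = over_75_total / total
p_ticket = ticket_total / total

# Conditional probability: restrict attention to the under-21 column.
p_ticket_given_under_21 = counts["ticket"]["under_21"] / under_21_total

# "At risk": under 21, over 75, or any other driver with a ticket.
at_risk_count = under_21_total + over_75_total + counts["ticket"]["other"]
p_at_risk = at_risk_count / total

print(p_under_21, p_over_75, p_ticket, p_ticket_given_under_21, p_at_risk)
```

Notice that the conditional probability divides by the under-21 total, not by the grand total of 1,000.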


1. If the individual outcomes of a phenomenon are uncertain, but there is nonetheless a regular distribution of outcomes in

a large number of repetitions, we say the phenomenon is

A) random.

B) predictable.

C) deterministic.

D) probable.

2. The probability of any outcome of a random phenomenon is

A) the precise degree of randomness present in the phenomenon.

B) the proportion of a very long series of repetitions in which the outcome occurs.

C) either 0 or 1, depending on whether or not the phenomenon can actually occur.

D) any number, as long as it is a value between 0 and 1.

E) impossible to determine if the phenomenon is truly random.

3. When two coins are tossed, the probability of getting two heads is 0.25. This means that

A) of every 100 tosses, exactly 25 will have two heads.

B) the odds against two heads are 4 to 1.

C) in the long run, the average number of heads is 0.25.

D) in the long run two heads will occur on 25% of all tosses.

4. If the knowledge that an event A has occurred implies that a second event B cannot occur, the events A and B are said

to be

A) independent.

B) disjoint.

C) mutually exhaustive.

D) the sample space.

E) complementary.

5. An assignment of probability must obey which of the following?

A) The probability of any event must be a number between 0 and 1, inclusive.

B) The sum of all the probabilities of all outcomes in the sample space must be exactly 1.

C) The probability of an event is the sum of the outcomes in the sample space which make up the event.

D) All of the above.

E) A and B only.

Use the following to answer questions 6-7:

Event A occurs with probability 0.2. Event B occurs with probability 0.8.

6. If A and B are disjoint (mutually exclusive), then

A) P(A and B) = 0.16.

B) P(A or B) = 1.0.

C) P(A and B) = 1.0.

D) P(A or B) = 0.16.

E) Both A and B are true.

7. If A and B are independent, then

A) P(A and B) = 0.16.

B) P(A or B) = 1.0.

C) P(A and B) = 1.0.

D) P(A or B) = 0.16.

E) both A and B are true.


Use the following to answer questions 8-9:

In a particular game, a fair die is tossed. If the number of spots showing is either four or five, you win $1. If the number of

spots showing is six, you win $4. And if the number of spots showing is one, two, or three, you win nothing. You are

going to play the game twice.

8. The probability that you win $4 both times is

A) 1/6.

B) 1/3.

C) 1/36.

D) 1/4.

E) 1/12.

9. The probability that you win at least $1 both times is

A) 1/2.

B) 4/36.

C) 1/36.

D) 1/4.

E) 3/4.

Use the following to answer questions 10-11:

Ignoring twins and other multiple births, assume that babies born at a hospital are independent events with the probability

that a baby is a boy and the probability that a baby is a girl both equal to 0.5.

10. The probability that the next five babies are girls is

A) 1.0.

B) 0.5.

C) 0.1.

D) 0.0625.

E) 0.03125.

11. The probability that at least one of the next three babies is a boy is

A) 0.125.

B) 0.333.

C) 0.667.

D) 0.750.

E) 0.875.

12. In a certain town, 50% of the households own a cellular phone, 40% own a pager, and 20% own both a cellular phone

and a pager. The proportion of households that own neither a cellular phone nor a pager is

A) 0%.

B) 10%.

C) 30%.

D) 70%.

E) 90%.

13. Suppose that A and B are two independent events with P(A) = 0.2 and P(B) = 0.4.

P(A ∪ B) is

A) 0.08.

B) 0.12.

C) 0.44.

D) 0.52.

E) 0.60.

14. Suppose that A and B are two independent events with P(A) = .2 and P(B) = .4. P(A ∩ B) is

A) 0.08.

B) 0.12.

C) 0.40.

D) 0.52.

E) 0.60.
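For independent events, the Multiplication Rule for Independent Events and the General Addition Rule fit together as in the short Python sketch below. The variable names are made up for this illustration; the two probabilities are the ones given in questions 13 and 14.

```python
p_a = 0.2
p_b = 0.4

# Multiplication Rule for Independent Events: P(A and B) = P(A) * P(B).
p_a_and_b = p_a * p_b

# General Addition Rule: P(A or B) = P(A) + P(B) - P(A and B).
p_a_or_b = p_a + p_b - p_a_and_b

print(round(p_a_and_b, 2), round(p_a_or_b, 2))
```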


15. The probability that a randomly chosen woman aged 20 to 24 is married is .35; the probability that she is widowed or

divorced is .03. What is the probability that such a woman has never been married?

A) .72

B) .38

C) .62

D) .65

Use the following to answer question 16:

An event A will occur with probability 0.5. An event B will occur with probability 0.6. The probability that both A and B

will occur is 0.1.

16. The conditional probability of A given B

A) is 0.5.

B) is 0.3.

C) is 0.2.

D) is 1/6.

E) cannot be determined from the information given.

17. Event A occurs with probability 0.8. The conditional probability that event B occurs, given that A occurs, is 0.5. The

probability that both A and B occur

A) is 0.3.

B) is 0.4.

C) is 0.625.

D) is 0.8.


Part 4: Linear Regression

1. Use the graph shown below to answer the following questions. The graph shows the predicted ice cream sales ($) based on the temperature (°Celsius) for a given day.

What is the (approximate) y-coordinate of

the point with the largest x-coordinate?

What is the (approximate) x-coordinate of

the point with the lowest y-coordinate?

Does the graph shown have a positive or negative association? Circle the appropriate word for each of

the bolded options:

There is a positive/negative association between temperature and predicted ice cream sales.

Generally, as temperature increases/decreases, predicted ice cream sales increase/decrease.

Write the coordinates of the circled point on the graph.

Interpret the meaning of the coordinate pair for this circled point (in context!)—what does it tell us

about temperature/ice cream sales? Be detailed!

2. Use your graphing calculator to find the linear regression equation (line of best fit), correlation, and

coefficient of determination for the data set below. Round coefficients to three decimal places. See the

next page for calculator help.

Linear Regression Equation (Line of Best Fit): _______________________________________________

Correlation (r) = ___________________ Coefficient of Determination (r²) = _____________

Interpret the slope of the Linear Regression Equation in context. Include the appropriate units.

Interpret the y-intercept of the Linear Regression Equation in context. Include the appropriate units.

Use the equation calculated above to predict the population of bacteria after 15 hours. Show all of your

work in the space below.

x (# of hours)                              2      5      6      8       9       10
y (Population of bacteria, in thousands)    4.19   8.46   9.35   15.06   21.98   29.72


Suppose the actual population of bacteria after 15 hours was 45 thousand. How far off was your

prediction? In statistical terms, calculate the residual. (Residual = actual value – predicted value)
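Your calculator computes the regression line for you, but the least-squares formulas behind it are short enough to write out. The sketch below is a plain-Python illustration (no calculator or extra libraries) using the bacteria data from problem 2; the function name `predict` and the variable names are choices made for this example.

```python
import math

hours = [2, 5, 6, 8, 9, 10]
population = [4.19, 8.46, 9.35, 15.06, 21.98, 29.72]  # in thousands

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(population) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in hours)
syy = sum((y - mean_y) ** 2 for y in population)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, population))

slope = sxy / sxx                      # b in y = a + bx
intercept = mean_y - slope * mean_x    # a in y = a + bx
r = sxy / math.sqrt(sxx * syy)         # correlation
r_squared = r ** 2                     # coefficient of determination


def predict(x):
    """Predicted population (in thousands) after x hours."""
    return intercept + slope * x


predicted_15 = predict(15)
residual = 45 - predicted_15           # residual = actual value - predicted value

print(round(intercept, 3), round(slope, 3), round(r, 3), round(r_squared, 3))
print(round(predicted_15, 2), round(residual, 2))
```

Keep in mind that 15 hours is outside the range of the data (2 to 10 hours), so this prediction is an extrapolation.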

Part 5: Algebra Review

1) Evaluate z if z = (x − µ)/σ and x = 20, µ = 10, and σ = 2. (If you don’t know already, µ is the Greek lowercase “m” (we say “mu” (like myoo)) and σ is the Greek lowercase “s” (we say “sigma”).)

2) Solve z = (x − µ)/σ for σ, then for µ.
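After you have done the algebra by hand, you can check your rearrangements numerically. The sketch below assumes the formula in problems 1 and 2 is the usual standardization z = (x − µ)/σ (an assumption, since the printed formula is hard to read). The function names are made up for this illustration.

```python
def z_score(x, mu, sigma):
    """Standardized value: z = (x - mu) / sigma."""
    return (x - mu) / sigma


def solve_for_sigma(x, mu, z):
    """Rearranged for sigma: sigma = (x - mu) / z."""
    return (x - mu) / z


def solve_for_mu(x, sigma, z):
    """Rearranged for mu: mu = x - z * sigma."""
    return x - z * sigma


z = z_score(20, 10, 2)
print(z, solve_for_sigma(20, 10, z), solve_for_mu(20, 2, z))
```

If the rearranged formulas are correct, plugging z back in should return the original σ and µ.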

3) Solve 20.5

0.05 1.96n

for n.

4) If 60

1.64

and

951.96

, solve for µ and σ.

5) Find the equation of the line in slope intercept (y = a + bx) form that goes through the points (-2, 4) and

(5, 7).
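One way to check a line through two points is to compute the slope and intercept with exact fractions. The sketch below uses Python's `fractions.Fraction` so nothing gets rounded; the names `a` and `b` follow the y = a + bx form used in this packet.

```python
from fractions import Fraction

x1, y1 = -2, 4
x2, y2 = 5, 7

# Slope: change in y divided by change in x.
b = Fraction(y2 - y1, x2 - x1)

# Intercept: solve y1 = a + b * x1 for a.
a = y1 - b * x1

print(f"y = {a} + {b}x")

# Both points should satisfy the equation.
assert a + b * x1 == y1
assert a + b * x2 == y2
```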


For each of the equations below (a thru e) solve for x. Show and explain each step. If rounding is necessary, round to three decimal places.

__________ a) 624

3.7

x

__________ b) x

714.0258.0

__________ c) ln x = 3.7861

__________ d) log 57.211 = .3x – 3.08

__________ e) log x = –27.5 + 0.0982w (I know this will be an expression but simplify it as much as possible.)
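Parts (c) through (e) all come down to undoing a logarithm or isolating x. Here is a minimal sketch of how you might check your hand work in Python, assuming "log" means the base-10 logarithm (an assumption; "ln" is the natural logarithm). The names `x_c`, `x_d`, and `x_e` are made up for this illustration.

```python
import math

# (c) ln x = 3.7861  ->  x = e^3.7861
x_c = math.exp(3.7861)

# (d) log 57.211 = 0.3x - 3.08  ->  x = (log 57.211 + 3.08) / 0.3
x_d = (math.log10(57.211) + 3.08) / 0.3


# (e) log x = -27.5 + 0.0982w  ->  x = 10^(-27.5 + 0.0982w), an expression in w
def x_e(w):
    return 10 ** (-27.5 + 0.0982 * w)


print(round(x_c, 3), round(x_d, 3))
```

A good habit is to substitute each answer back into the original equation and confirm that both sides agree.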

6. Write an equivalent equation in exponential form: log_2 8 = x (Do not solve it.)

7. Write an equivalent equation in logarithmic form: 4^3 = 64