
LESSON 11.1 Experimental Design - Prek 12math.kendallhunt.com/documents/daa2/CL/DAA2CL010_11.pdf


LESSON 11.1 CONDENSED

Experimental Design

Discovering Advanced Algebra Condensed Lessons, Chapter 11. ©2010 Key Curriculum Press

In this lesson you will

● learn to identify and design an experiment, an observational study, and a survey

● learn to distinguish between causation and association

Carrie has a theory that students are late to school because they eat breakfast. So she waits by the door of her first period class and asks each late student whether he or she ate breakfast before coming to school. She finds the percentage of students late to her first period class that ate breakfast to be 66.7%, and she concludes that two-thirds of the students who come late to school are late because they ate breakfast.

This study is anecdotal, a study in which data are collected from a sample convenient to the investigator. It is not an accurate or scientific kind of study, but it is used frequently. On page 618 of your textbook there is a chart describing three better ways to collect sample data for study. Read this chart, and then study the diagram showing the stages of an experimental study.

Investigation: Designing a Study

Complete the investigation in your book yourself. Then compare your study designs to the sample answers below.

Experiment

For an experiment, you might ask for volunteers from several different math classes. In each class, assign half of the volunteers to Group 1 and half to Group 2. Ask all the volunteers to write down how long it takes them to complete their math assignment each night for two weeks. For the first week, ask Group 1 to listen to music while they do math homework. Ask Group 2 not to listen to music. During the second week, have the groups switch roles.

Observational

It is challenging to run an observational study that does not have a biased sample. For example, you might observe students working in the school library or homework center, and note whether they are listening to music and how long it takes them to complete their assignments. However, this may be a biased sample, because students who work at home may be more likely to listen to music. And in many schools, music players are not allowed, which would make an observational study at school impossible. Instead, you might distribute a questionnaire that doesn’t reveal the purpose of the study. You could put times in 10-minute intervals on the side, and a list of activities across the top, such as studying, listening to music, talking on the phone, chatting online, and so forth. Then ask students to check off any activities they perform during each time period.

Survey

Distribute surveys to students in several math classes. Ask them to record the length of time it takes them to complete their math assignments for one week. Ask them whether they listen to music while they do their math homework.


Read carefully the text following the investigation that explains the difference between causation and association.

Anyone conducting a study must be careful of bias. You can reduce bias by choosing subjects randomly, by correctly representing subgroups of the population, and by having the plan for an experiment, observational study, or survey reviewed independently. Before accepting a conclusion from a study, you should know how the data were collected and what bias or errors may exist.

Work through Example A and Example B in your book. Then read the following example.

EXAMPLE Angela has a theory that youngest siblings learn to read earlier than their older siblings. To test her theory, she does a survey of her classmates. In the survey, she asks the following questions:

1. How old were you when you learned to read?

2. How old were your siblings when they learned to read?

3. Are you the youngest, oldest, or middle child in your family?

Describe any problems with Angela’s methodology. What type of study would work well to validate or disprove Angela’s theory?

Solution

The main problem with Angela’s plan is that students may not accurately report how old they were when they learned to read, or how old their siblings were. The students would have to rely on their own (possibly faulty) memories or the memories of their parents and siblings, which also might be inaccurate.

A large-scale observational study would work well to test Angela’s theory. Many nursery school and elementary school teachers could be asked to report at what age their students learn to read. Then the researchers could ask the parents of the students involved in the study to report their child’s birth order.


LESSON 11.2 CONDENSED

Probability Distributions

Discovering Advanced Algebra Condensed Lessons, Chapter 11. ©2010 Kendall Hunt Publishing

In this lesson you will

● sketch the graph of the probability distribution for a continuous random variable

● find probabilities by finding or approximating areas under a probability distribution curve

● extend the definitions of mode, median, and mean to probability distributions

At election time, television stations, newspapers, and magazines often conduct polls. By surveying a small sample of voters, they hope to get information about how the entire population of voters feels about a candidate or an issue. In earlier chapters you learned some statistics (for example, the mean, median, and standard deviation) that can be used to describe a sample. The corresponding numbers describing the entire population are called parameters. In general, the larger the sample, the closer its statistics tend to be to the parameters.

In the previous chapter, you worked with discrete random variables. The data had integer values, for example, 10 heads or 3 tails. Sometimes data can take on any real value within an interval. This is represented by a continuous random variable. For example, the height of a randomly selected person is a continuous random variable. A person might be 165 cm tall or 166 cm tall, but any measurement between these integer measurements is also possible (for example, 165.25 cm or 165.67897 cm).

Investigation: Pencil Lengths

The investigation in your book requires you to collect pencil-length data from all students in your class. The results below use these sample data. (Lengths are in centimeters.)

{16.9, 18.7, 11.3, 13.8, 15.2, 17.0, 16.5, 16.6, 11.8, 17.2, 15.5, 15.7, 17.0, 11.4, 16.5, 16.0, 13.4, 15.7, 15.5, 14.1, 12.3, 13.8, 15.5, 15.7, 10.7, 15.6, 12.1, 14.4, 16.5, 17.9, 8.2, 17.8, 17.6, 14.1, 16.7, 14.6, 12.3, 10.0, 13.2, 14.3}

Create a histogram of these data with bins representing 1 cm increments. Your histogram should look like the one shown in the figure.

[Figure: histogram of the pencil-length data. Horizontal axis: Pencil length (cm), marked from 8 to 20. Vertical axis: Number of pencils, marked from 0 to 10.]

Divide the number of pencils in each bin by the total number of pencils. You should get these results:

8–9: 0.025 9–10: 0 10–11: 0.05 11–12: 0.075

12–13: 0.075 13–14: 0.1 14–15: 0.125 15–16: 0.2

16–17: 0.175 17–18: 0.15 18–19: 0.025
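The division step can also be checked with a short script. The following is a minimal Python sketch, not part of the lesson. The function name `relative_frequencies` is made up for illustration, and the sketch assumes that a length falling exactly on a bin boundary (such as 17.0) is counted in the bin that starts there, which agrees with the results listed above.

```python
import math
from collections import Counter

# Sample pencil lengths (cm) from the investigation.
lengths = [16.9, 18.7, 11.3, 13.8, 15.2, 17.0, 16.5, 16.6, 11.8, 17.2,
           15.5, 15.7, 17.0, 11.4, 16.5, 16.0, 13.4, 15.7, 15.5, 14.1,
           12.3, 13.8, 15.5, 15.7, 10.7, 15.6, 12.1, 14.4, 16.5, 17.9,
           8.2, 17.8, 17.6, 14.1, 16.7, 14.6, 12.3, 10.0, 13.2, 14.3]


def relative_frequencies(data, low=8, high=19):
    """Map each 1 cm bin [start, start + 1) to the fraction of data in it."""
    counts = Counter(math.floor(x) for x in data)
    return {start: counts.get(start, 0) / len(data)
            for start in range(low, high)}


rel_freq = relative_frequencies(lengths)
for start, fraction in rel_freq.items():
    print(f"{start}-{start + 1}: {fraction}")

# Every pencil lands in exactly one bin, so the fractions add up to 1.
print("total:", sum(rel_freq.values()))
```

Because each bin is 1 cm wide, each fraction is also the area of the corresponding bar in a relative frequency histogram.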

Make a second histogram using these new values as y-values. Your histogram should look like the one on the next page. This graph has the same shape as the one above, but the vertical scale is different.


The height of each bar represents the fraction of pencils with lengths in the given interval. Because the width of each bar is 1, the area of each bar also represents the fraction of pencils in the interval. Because all the pencils have been accounted for, the total area of all the bars must be 1. You can check this by adding the areas. Note that the area of each grid square is 0.025.

[Figure: relative frequency histogram of the pencil-length data. Horizontal axis: Pencil length (cm), marked from 8 to 20. Vertical axis: Fraction of pencils, marked from 0 to 0.25.]

Imagine that you collect and measure more and more pencils and draw a histogram using the fraction of pencils as the bin height. Sketch what a histogram of infinitely many pencil lengths would look like. Make sure you can give reasons for the shape of your histogram.

Imagine you do a complete and precise survey of all the pencils in the world. Assume the distribution of lengths is about the same as that in the sample above. Also, assume that you use infinitely many very narrow bins. (Each bin width represents an infinitely small fraction of a centimeter.) To approximate this plot, sketch over the top of your histogram with a smooth curve. Make the area between the curve and the horizontal axis about the same as the area of the histogram. Try to make sure the extra area enclosed by the curve above the histogram is the same as the area cut off the corners of the bins, like the curve in the figure.

[Figure: the relative frequency histogram with a smooth curve sketched over the tops of the bars. Horizontal axis: Pencil length (cm), marked from 8 to 20. Vertical axis: Fraction of pencils, marked from 0 to 0.25.]

Use your curve to estimate the areas described in Step 7 in your book. These estimates are based on the curve at right.

a. About 0.025
b. About 3(0.025), or 0.075
c. About 32.5(0.025), or 0.8125
d. 0

The second histogram you made in the investigation, showing the fraction of pencils in each bin, is called a relative frequency histogram. The smooth curve you drew approximates the probability distribution for a continuous random variable for the infinite set of measurements.

The areas you found are the probabilities that a randomly chosen pencil length will satisfy the given condition. If x represents the continuous random variable giving the pencil lengths in centimeters, then you can write these areas as

$P(x < 10)$, $P(11 < x < 12)$, $P(x > 12.5)$, $P(x = 11)$

In a continuous probability distribution, the probability of any single outcome, such as the probability that a pencil length is 11 cm, is the area of a line segment, which is 0. It is theoretically possible for a pencil to be 11 cm, but the probability of choosing one outcome out of infinitely many outcomes is 0. Read Example A in your book, which illustrates how areas represent probabilities for a continuous random variable.
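To see how areas like these can be approximated numerically, here is a rough Python sketch that uses the relative frequency histogram from the investigation as a stand-in for the smooth curve. The helper name `area_between` and the choice to take a proportional share of a partly covered bin are assumptions made for illustration; estimates read from your own curve may differ slightly.

```python
# Relative frequencies from the investigation, keyed by the left edge
# of each 1 cm bin.
rel_freq = {8: 0.025, 9: 0.0, 10: 0.05, 11: 0.075, 12: 0.075, 13: 0.1,
            14: 0.125, 15: 0.2, 16: 0.175, 17: 0.15, 18: 0.025}


def area_between(lower, upper, freqs, width=1.0):
    """Approximate the probability that lower < x < upper as histogram area.

    A bin that is only partly inside the interval contributes in
    proportion to the overlap. With 1 cm bins, bar height times overlap
    is the overlapping area.
    """
    total = 0.0
    for left, height in freqs.items():
        overlap = min(upper, left + width) - max(lower, left)
        if overlap > 0:
            total += height * overlap
    return total


p_a = area_between(8, 10, rel_freq)     # length less than 10 cm
p_b = area_between(11, 12, rel_freq)    # length between 11 and 12 cm
p_c = area_between(12.5, 19, rel_freq)  # length greater than 12.5 cm
p_d = area_between(11, 11, rel_freq)    # length exactly 11 cm: zero width

print(p_a, p_b, p_c, p_d)
```

The last call shows the point made above in code form: an interval of zero width has zero area, so the probability of any single exact length is 0.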

After Example A, your book defines the mode, median, and mean of a probability distribution. These definitions are related to, but slightly different from, the definitions you learned earlier. Read them carefully, and make sure they make sense to you. Then, apply the new definitions by working through Example B.


LESSON 11.3 CONDENSED

Normal Distributions

In this lesson you will

● learn that the graph of a binomial distribution is a bell-shaped curve, called a normal curve

● learn the equation for a normal distribution with mean $\mu$ and standard deviation $\sigma$

● use calculator functions to graph a normal curve, and find areas under a portion of the curve

● discover the 68-95-99.7 rule for determining the probability that a data value is within one, two, or three standard deviations from the mean

In Chapter 10, you studied the binomial distribution for discrete random variables. In general, if an experiment has two outcomes, success and failure, with probability of success $p$ and probability of failure $q$, then the probability of $x$ successes in $n$ trials is $P(x) = {}_nC_x\, p^x q^{n-x}$. Note that $q = 1 - p$, so this is equivalent to $P(x) = {}_nC_x\, p^x (1 - p)^{n-x}$. In this lesson, you will discover some properties of this probability distribution.
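As a quick illustration of the binomial formula, here is a small Python sketch. The function name `binomial_probability` and the example values (10 trials with success probability 0.5) are chosen only for illustration and do not come from the book.

```python
from math import comb


def binomial_probability(x, n, p):
    """Probability of exactly x successes in n trials: nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


# Hypothetical example: 10 flips of a fair coin.
n, p = 10, 0.5
distribution = [binomial_probability(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(distribution):
    print(x, round(prob, 4))

# The probabilities of all possible outcomes add up to 1.
print("total:", sum(distribution))
```

Plotting these values for larger and larger `n` is one way to watch the bell shape emerge.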

As n gets larger and larger, the binomial distribution looks more and more continuous until, eventually, it looks like the bell-shaped curve at right. Distributions for large populations often have this shape. The bell-shaped curve is called a normal curve, and a bell-shaped distribution is called a normal distribution.

The formulas you have learned for the sample mean, $\bar{x}$, and the sample standard deviation, $s$, are estimates for values in the population. When you find the mean and standard deviation for an entire population, they are called the population mean, $\mu$, and the population standard deviation, $\sigma$. These symbols are the Greek letters mu (pronounced “mew”) and sigma.

Beginning on page 634, your book discusses the equation for the graph of the normal distribution. Read that text, and work through the example. In the example, you graph the general equation for the normal curve, write the equation for a standard normal distribution with mean $\mu$ and standard deviation $\sigma$, and then determine how well a normal curve fits a given binomial distribution. The general equation for a normal distribution is also given in the “The Normal Distribution” box on page 637.

The equation for the normal distribution can be tedious to enter into a calculator. Fortunately, most calculators provide the equation as a built-in function. You have to provide only the mean and standard deviation. (See Calculator Note 11B to learn how to graph a normal distribution on your calculator.)

In this chapter, the notation n(x, mean, standard deviation) is used to represent a normal distribution. For example, n(x, 2.6, 1.5) represents a normal distribution with mean 2.6 and standard deviation 1.5. The standard normal distribution function, that is, the function for the normal distribution with mean 0 and standard deviation 1, is simply denoted n(x). The notation N(lower, upper, mean, standard deviation) is used to represent the area under a portion of the normal curve. For example, N(2, 3, 2.6, 1.5) represents the area between $x = 2$ and $x = 3$ under the curve for the normal distribution with mean 2.6 and standard deviation 1.5. (See Calculator Note 11C to learn how to find these areas on your calculator.)
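If you do not have the calculator functions at hand, the same two quantities can be sketched in Python. The function names `n` and `N` below simply mirror the chapter's notation; the implementation with the standard normal density formula and `math.erf` is an assumption made for illustration, not a description of how any particular calculator works.

```python
from math import erf, exp, pi, sqrt


def n(x, mean=0.0, sd=1.0):
    """Height of the normal curve with the given mean and standard deviation at x."""
    z = (x - mean) / sd
    return exp(-z * z / 2) / (sd * sqrt(2 * pi))


def normal_cdf(x, mean=0.0, sd=1.0):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))


def N(lower, upper, mean=0.0, sd=1.0):
    """Area under the normal curve between lower and upper."""
    return normal_cdf(upper, mean, sd) - normal_cdf(lower, mean, sd)


# The example from the text: area between x = 2 and x = 3
# for mean 2.6 and standard deviation 1.5.
area = N(2, 3, 2.6, 1.5)
print(round(area, 4))
```

Omitting the last two arguments gives the standard normal distribution, matching the convention that n(x) means mean 0 and standard deviation 1.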

Investigation: The Normal Curve

Work through the investigation in your book yourself. Then check your results against the results below.

Step 1 Your data will vary from those below. However, your results for Steps 1 through 3 for your own data should be very similar to those shown below.

Step 2 For the sample data graphed below, $\frac{337}{500} = 67.4\%$ of the data values fall within one standard deviation of the mean.

Step 3 $\frac{479}{500} = 95.8\%$ of the data fall within two standard deviations of the mean, and $\frac{499}{500} = 99.8\%$ fall within three standard deviations of the mean.

Step 4 The “68-95-99.7” rule states that in a random sample about 68% of the data values fall within one standard deviation of the mean for the curve, about 95% fall within two standard deviations, and about 99.7% fall within three standard deviations.

Step 5 Create a set of data and a table for yourself in a manner similar to what you used for Steps 1, 2 and 3 of this investigation. Analyze the percentages of data that are one, two, and three standard deviations from the mean.

Step 6 The probabilities should very closely follow the 68-95-99.7 rule. Nearly all data should lie within three standard deviations of the mean.
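The percentages in the 68-95-99.7 rule can be compared with the exact areas under a normal curve. The following Python sketch is an illustration only; the function name `within_k_sd` is hypothetical, and it relies on the standard identity that the area within k standard deviations of the mean equals erf(k / sqrt(2)).

```python
from math import erf, sqrt


def within_k_sd(k):
    """Area under a normal curve within k standard deviations of the mean."""
    return erf(k / sqrt(2))


# Sample percentages reported in Steps 2 and 3 of the investigation.
sample = {1: 337 / 500, 2: 479 / 500, 3: 499 / 500}

for k in (1, 2, 3):
    print(f"within {k} sd: theory {within_k_sd(k):.4f}, sample {sample[k]:.3f}")
```

The sample percentages are close to the theoretical areas but not identical, which is what you should expect from a random sample of 500 values.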

At points that are one standard deviation from the mean, the normal curve changes from curving downward to curving upward. These points are called inflection points. You can estimate the standard deviation of a normal distribution by locating the inflection points of its graph.

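[Figure: normal curve with its two inflection points marked, one standard deviation on either side of the mean. The part of the curve between the inflection points is labeled "Curving downward"; the parts outside them are labeled "Curving upward."]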


LESSON 11.4 CONDENSED

z-Values and Confidence Intervals

In this lesson you will

● apply the 68-95-99.7 rule

● learn how to transform x-values of a normal distribution to z-values

● calculate confidence intervals

Knowing how a value in a sample relates to the mean value does not tell you how typical the value is. For example, to say that the weight of a Scottish terrier is 4 lb above the mean does not tell you if this measurement is a rare event or a common event. If you knew how many standard deviations the dog’s weight was from the mean, you would have a much better idea of how unusual the weight is.

Investigation: Keeping Score

Read the investigation in your book and come up with your own conclusion before reading the possible solution below.

One possible measure is the distance of each student’s score from the mean. Andres’s score was $86 - 74 = 12$ points above the mean, whereas Imani’s score was $84 - 75 = 9$ points above the mean. However, calculating the distance from the mean in numbers of standard deviations will give a better measure of how each student did compared with other students who took the exam.

Andres’s score of 86 was $\frac{86 - 74}{9} = \frac{4}{3}$, or about 1.33 standard deviations above the mean. Imani’s score of 84 was $\frac{84 - 75}{6} = \frac{3}{2}$, or 1.5 standard deviations above the mean. Andres’s score may have been higher, but Imani performed better in relation to the rest of her class.
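The comparison above amounts to dividing each distance from the mean by the standard deviation for that class. A minimal Python sketch of the calculation follows; the function name `z_value` is chosen for illustration, and the means and standard deviations are the ones used in the solution above.

```python
def z_value(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


andres = z_value(86, mean=74, sd=9)
imani = z_value(84, mean=75, sd=6)

print(f"Andres: {andres:.2f} standard deviations above the mean")
print(f"Imani:  {imani:.2f} standard deviations above the mean")
```

Although 86 is the higher raw score, the larger value for Imani shows that her score is further above her class's mean once the spread of scores is taken into account.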

In a normal distribution, the z-value, or z-score, of x is the number of standard deviations that x is from the mean. As you learned in Lesson 11.3, the probability that a new measurement will have a z-value between $-1$ and 1 is about 68%, the probability that it will have a z-value between $-2$ and 2 is about 95%, and the probability that it will have a z-value between $-3$ and 3 is about 99.7%.

You can think of the z-value of x as the image of x under a transformation that translates and dilates the normal distribution to the standard normal distribution n(x) with mean $\mu = 0$ and standard deviation $\sigma = 1$. Transforming x-values to z-values is called standardizing the variable and can be calculated with the equation $z = \frac{x - \mu}{\sigma}$, where $\mu$ and $\sigma$ are the mean and standard deviation of the normal distribution of x. Work through Example A in your book, which illustrates standardizing the variable. Note that when using trial and error in part c you must test only intervals that are symmetric about the mean.

There is no way to know for certain how close the mean of the normally distributed population is to the mean of a sample. However, you can describe how confident you are that the population mean lies in a given interval centered at the sample mean. A p% confidence interval is an interval about the sample mean, $\bar{x}$, in which you can be p% confident the population mean, $\mu$, lies.


Specifically, if z is the number of standard deviations from the mean within which p% of normally distributed data lie, then the p% confidence interval for a sample of size n is

$$\bar{x} - \frac{z\sigma}{\sqrt{n}} < \mu < \bar{x} + \frac{z\sigma}{\sqrt{n}}$$

The value $\frac{z\sigma}{\sqrt{n}}$ is called the margin of error. In many real-world situations, you will not know the population standard deviation. However, if the sample size is large enough, generally $n \ge 30$, you may use the sample standard deviation, $s$, in place of $\sigma$.

The 68-95-99.7 rule tells you what z-values to use if you want to be 68%, 95%, or 99.7% confident. In Example B in your book, confidence intervals are calculated using several methods. Work through that example, and then try to solve the problem in the example below.

EXAMPLE The quality-control manager at a cereal company pulled a random sample of 30 boxes of Morning Crunch cereal from the production lines and weighed the contents of each box. The mean weight for the sample was 9.8 oz and the standard deviation was 0.42 oz.

a. Find the 68% and 90% confidence intervals.

b. The label on the Morning Crunch box states that the weight is 10 oz. Do you think the quality-control manager should report a problem? Explain.

Solution

a. Using the sample standard deviation in place of $\sigma$, the 68% confidence interval is

$$\left(9.8 - \frac{1(0.42)}{\sqrt{30}},\ 9.8 + \frac{1(0.42)}{\sqrt{30}}\right)$$

or about (9.72, 9.88).

The quality-control manager is 68% confident that the mean weight is between 9.72 and 9.88 oz.

As you found in Example B, part c, the z-value corresponding to a 90% confidence interval is 1.645, so the 90% confidence interval is

$$\left(9.8 - \frac{1.645(0.42)}{\sqrt{30}},\ 9.8 + \frac{1.645(0.42)}{\sqrt{30}}\right)$$

or about (9.67, 9.93).

The quality-control manager is 90% confident that the population mean weight is between 9.67 and 9.93 oz.

b. According to the answer to part a, the quality-control manager can be 90% certain that the population mean (that is, the mean weight for all the boxes produced) is in the interval (9.67, 9.93). Because this interval does not include 10 oz, the weight claimed on the box, the manager should report a problem.
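The arithmetic in this example can be reproduced with a few lines of Python. This is a sketch for checking the numbers, not a general statistics routine; the function name `confidence_interval` is hypothetical, and, as in the solution, the sample standard deviation is used in place of the population standard deviation.

```python
from math import sqrt


def confidence_interval(sample_mean, sd, n, z):
    """Return (low, high) = sample_mean -/+ z * sd / sqrt(n)."""
    margin_of_error = z * sd / sqrt(n)
    return sample_mean - margin_of_error, sample_mean + margin_of_error


# Morning Crunch sample: 30 boxes, mean 9.8 oz, standard deviation 0.42 oz.
ci_68 = confidence_interval(9.8, 0.42, 30, z=1)
ci_90 = confidence_interval(9.8, 0.42, 30, z=1.645)

print("68%:", tuple(round(v, 2) for v in ci_68))
print("90%:", tuple(round(v, 2) for v in ci_90))

# Part b: is the labeled weight of 10 oz inside the 90% interval?
label_weight = 10
print("label inside 90% interval:", ci_90[0] < label_weight < ci_90[1])
```

Changing `z` to another value shows how the interval widens as the confidence level increases.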


LESSON 11.5 CONDENSED

Bivariate Data and Correlation

In this lesson you will

● determine whether two variables are correlated

● use the correlation coefficient to determine how strong a correlation is

Many real-world statistical problems involve predicting associations between two variables. For example, researchers may want to determine if there is an association between the number of milligrams of vitamin C a person consumes and the number of colds the person gets. The process of collecting data on two possibly related variables is called bivariate sampling.

An association between variables is called correlation. In this lesson you will focus on determining whether there is a linear association between variables and how strong the linear association might be. The most commonly used statistical measure of linear association is the correlation coefficient.

Investigation: Looking for Connections

Step 1 Look at the sample survey in Step 1 of the investigation in your book.

Step 2 List at least two pairs of variables you think will have a positive correlation (as one increases, the other tends to increase). One possibility might be the number of minutes of homework (question 1) and the number of academic classes (question 5).

Now, list at least two pairs of variables you think will have a negative correlation (as one increases, the other tends to decrease). One possibility is the number of academic classes (question 5) and the time spent talking, calling, e-mailing, or writing to friends (question 2).

Finally, list at least two pairs of variables you think will have a weak correlation. One possibility might be the time spent communicating with friends (question 2) and the time of going to bed (question 4).

Step 3 The table on the next page shows the results gathered in one class. Enter the data into five calculator lists or spreadsheet columns. Plot points for each pair of lists, and find the correlation coefficients. (See Calculator Note 11E.) Here are the correlation coefficients and graphs for the relationships mentioned in Step 2. However, you should make graphs and find correlation coefficients for all the possible pairings.

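[Graphs: scatter plots for three pairs of questions, with the correlation coefficients listed here.]

Question 5 vs. Question 1: $r = 0.771$
Question 5 vs. Question 2: $r = -0.632$
Question 4 vs. Question 2: $r = -0.059$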


You should make the following observations:

● The graphs that increase have positive correlation coefficients, and the graphs that decrease have negative correlation coefficients.

● The stronger the correlation is, the closer the correlation coefficient is to $\pm 1$. Weak correlations have correlation coefficients close to 0.

Student #   Question 1   Question 2   Question 3   Question 4   Question 5
1           0            120          60           100          3
2           20           120          90           200          4
3           80           65           80           140          6
4           55           20           220          260          5
5           10           20           155          200          4
6           15           0            145          200          4
7           90           10           80           150          6
8           215          10           0            60           6
9           100          0            140          150          6
10          60           30           120          105          5
11          65           0            120          150          6
12          10           60           300          360          4
13          120          0            0            45           6
14          30           45           285          90           4
15          40           60           150          190          4
16          0            85           150          75           3
17          0            180          30           30           4
18          80           0            0            0            6
19          90           20           0            0            6
20          45           10           180          285          5
21          10           120          0            90           4
22          40           30           100          115          5
23          0            0            360          60           5
24          30           50           45           90           5
25          60           20           30           90           6
26          45           20           30           45           5
27          20           105          20           60           4
28          0            90           0            120          4
29          50           40           0            90           5
30          40           10           0            45           4
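If you are working without a graphing calculator, the correlation coefficients for the table above can be computed with a short Python script. This is a sketch under stated assumptions: the names `survey`, `column`, and `correlation` are made up for illustration, and the formula is the standard Pearson correlation coefficient written with sums of deviations, which may be arranged differently from the formula in your book.

```python
from math import sqrt

# Each row holds one student's answers to Questions 1 through 5.
survey = [
    (0, 120, 60, 100, 3), (20, 120, 90, 200, 4), (80, 65, 80, 140, 6),
    (55, 20, 220, 260, 5), (10, 20, 155, 200, 4), (15, 0, 145, 200, 4),
    (90, 10, 80, 150, 6), (215, 10, 0, 60, 6), (100, 0, 140, 150, 6),
    (60, 30, 120, 105, 5), (65, 0, 120, 150, 6), (10, 60, 300, 360, 4),
    (120, 0, 0, 45, 6), (30, 45, 285, 90, 4), (40, 60, 150, 190, 4),
    (0, 85, 150, 75, 3), (0, 180, 30, 30, 4), (80, 0, 0, 0, 6),
    (90, 20, 0, 0, 6), (45, 10, 180, 285, 5), (10, 120, 0, 90, 4),
    (40, 30, 100, 115, 5), (0, 0, 360, 60, 5), (30, 50, 45, 90, 5),
    (60, 20, 30, 90, 6), (45, 20, 30, 45, 5), (20, 105, 20, 60, 4),
    (0, 90, 0, 120, 4), (50, 40, 0, 90, 5), (40, 10, 0, 45, 4),
]


def column(question):
    """Return all answers to the given question (1 through 5)."""
    return [row[question - 1] for row in survey]


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# The three pairings discussed in Step 2 of the investigation.
r_5_1 = correlation(column(5), column(1))
r_5_2 = correlation(column(5), column(2))
r_4_2 = correlation(column(4), column(2))

print(round(r_5_1, 3), round(r_5_2, 3), round(r_4_2, 3))
```

Looping over every pair of question numbers with the same two helpers gives the coefficients for all ten possible pairings.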


Step 4 Write a paragraph describing the correlations you discovered. Mention any pairs that are not correlated that you thought would be. Here are some things you might mention: The correlations between time spent on homework and number of academic classes and between time spent watching TV and bedtime are relatively strong and positive. The correlations between time spent talking to friends and number of academic classes and between time spent on homework and time spent talking to friends are relatively strong and negative. The other pairs of variables do not appear to be correlated. These data were collected from a small sample that was not very random, so they may not be good predictors of results for the entire school.

The text between the investigation and Example A in your book gives information about the correlation coefficient and how its formula was derived. This text also points out that in statistics, the x- and y-variables are often called the explanatory and response variables. Read this text, and then work through Example A, which demonstrates how to calculate the correlation coefficient using the formula.

It is very important not to confuse correlation with causation. The fact that two variables are strongly correlated does not mean a change in one variable causes a change in the other. Example B illustrates this point. Read that example carefully.


LESSON 11.6 CONDENSED

The Least Squares Line

In this lesson you will

● learn how to fit the least squares line to a set of data

● discover how the least squares line gets its name

● use the root mean square error to compare a least squares line to a median-median line

In Chapter 3, you learned how to fit a line to data and make predictions. In this lesson, you’ll learn about the most commonly used line of fit, called the least squares line. The equation of the least squares line is $z_y = r z_x$, where $r$ is the correlation coefficient and $z_x$ and $z_y$ are the z-values for x and y, respectively. In practice, you want the equation to represent the relationship between x and y, not between their z-values. Using the definition of z-value, you can rewrite the equation as

$$\frac{y - \bar{y}}{s_y} = r\left(\frac{x - \bar{x}}{s_x}\right), \quad\text{or}\quad y = \bar{y} + r\left(\frac{s_y}{s_x}\right)(x - \bar{x})$$
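The steps of this rewriting can be sketched as follows, assuming the usual definition of z-value, in which the z-value of x is $(x - \bar{x})/s_x$ and the z-value of y is $(y - \bar{y})/s_y$. Substitute the definitions, multiply both sides by $s_y$, and then add $\bar{y}$ to both sides.

```latex
\begin{align*}
z_y &= r\,z_x \\
\frac{y - \bar{y}}{s_y} &= r\left(\frac{x - \bar{x}}{s_x}\right) \\
y - \bar{y} &= r\left(\frac{s_y}{s_x}\right)(x - \bar{x}) \\
y &= \bar{y} + r\left(\frac{s_y}{s_x}\right)(x - \bar{x})
\end{align*}
```

The last line shows that the least squares line has slope $r\left(\frac{s_y}{s_x}\right)$ and passes through the point $(\bar{x}, \bar{y})$.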

To find out more about the least squares line, read the text before Example A in your book. Then, read Example A, which illustrates how to find the least squares line for a given set of data.

Investigation: Spin Time

If you can, use data from your classmates. Otherwise, use the sample data below to answer Steps 2–5. Then check your answers with the results below.

Step 1 Sample data are shown in the table below.

Coin      Spin times (s)
Quarter   10.69, 12.26, 13.56, 9.71
Dime      8.38, 7.47, 8.87, 7.37
Nickel    15.48, 13.88, 12.19, 12.54
Cent      9.6, 11.39, 8.25, 6.78

Step 2 The thickness and weight both seem to be factors that might affect the length of time a coin will spin.

Step 3 Using the table in your text, compare the spin time to the variables of weight, diameter, thickness, area, and volume. Use your calculator or other technology to find the correlation coefficient and the equation of the least squares line. Remember that you can find the least squares line by using the equation $y = \bar{y} + r\left(\frac{s_y}{s_x}\right)(x - \bar{x})$.

Comparing spin time to thickness using the sample data gives $r \approx 0.840$ and the least squares line $y = -5.19 + 9.525x$. The least squares line and the data comparing spin time with thickness are shown at right.
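If you are using technology other than a calculator, the comparison of spin time with thickness might be sketched in Python as shown here. The spin times come from the sample data table. The thickness values (in millimeters) are an assumption, because the coin table is in your text and not in this lesson summary; replace them with the values from your book. Each of the 16 spins is treated as its own data point.

```python
from statistics import mean, stdev

# Spin times (s) from the sample data table.
spin_times = {
    "quarter": [10.69, 12.26, 13.56, 9.71],
    "dime": [8.38, 7.47, 8.87, 7.37],
    "nickel": [15.48, 13.88, 12.19, 12.54],
    "cent": [9.6, 11.39, 8.25, 6.78],
}

# ASSUMPTION: thickness of each coin in mm; use the table in your text instead.
thickness = {"quarter": 1.75, "dime": 1.35, "nickel": 1.95, "cent": 1.55}

# One (thickness, spin time) pair per spin.
xs = [thickness[coin] for coin, times in spin_times.items() for _ in times]
ys = [t for times in spin_times.values() for t in times]

n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)

# Correlation coefficient, then y = y_bar + r * (s_y / s_x) * (x - x_bar).
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (s_x * s_y * (n - 1))
slope = r * s_y / s_x
intercept = y_bar - slope * x_bar

print(f"r = {r:.3f}")
print(f"y = {intercept:.2f} + {slope:.3f}x")
```

With the assumed thickness values, the results should come out close to the correlation coefficient and least squares line reported for the sample data.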

Lesson 11.6 • The Least Squares Line (continued)


Step 4 Find the predicted spin time for the dollar coin by substituting its thickness, 2.00, for x in the equation of the least squares line.

$y = -5.19 + 9.525(2.00)$

$y = 13.86$ seconds

The dollar should spin for about 14 seconds.

Step 5 Answer the questions yourself before reading the sample answers below.

a. The average is more stable than a single value. One really good spin or really bad spin could change everything.

b. Any systematic ordering makes the data subject to other factors. For example, the spinner might improve his technique with each spin. Any factors that can’t be controlled should be randomized. Because you can’t do all 16 spins at once, you randomize the order.

c. An experiment should attempt to control every factor except the factor under consideration (the characteristics of the coin). Different people will have different techniques for spinning and timing—using only one spinner and timer gives more control over this factor.

d. To choose your model, you should look at the coefficient of correlation and any pattern in the residuals.

e. In the observational study you would not know if the longer spin time was due to the coin, the spinner, the timer, or some other condition, so no conclusion about causation could be reached. In a well-designed experiment, you might conclude that different coins have characteristics that “cause” different spin times.
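The randomizing described in answer b can be done with a random number generator. Here is a minimal sketch, assuming four coins with four spins each as in the sample data; the variable names are hypothetical.

```python
import random

coins = ["quarter", "dime", "nickel", "cent"]
spins_per_coin = 4

# List every planned spin, then shuffle so that no coin is favored by
# the spinner's improving technique or any other order-related factor.
schedule = [coin for coin in coins for _ in range(spins_per_coin)]
random.shuffle(schedule)

for trial, coin in enumerate(schedule, start=1):
    print(f"Spin {trial}: {coin}")
```

Every run produces a different order, but each coin is still spun exactly four times.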

You can measure the accuracy of a least squares line by calculating the root mean square error, just as you did for median-median lines in Chapter 3. Example B in your book fits both a least squares line and a median-median line to a set of data and determines which is the better fit by computing the root mean square error. Work through that example, and then read the rest of the lesson.
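As a reminder of how this comparison works, here is a minimal sketch of a root mean square error calculation. It assumes the definition from Chapter 3 divides the sum of the squared residuals by n − 2 before taking the square root; check this against your book. The data and the second candidate line are hypothetical and only stand in for a competing line such as a median-median line.

```python
from math import sqrt


def root_mean_square_error(xs, ys, intercept, slope):
    """Root mean square error of the line y = intercept + slope * x.

    ASSUMPTION: the sum of squared residuals is divided by n - 2.
    """
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return sqrt(sum(e ** 2 for e in residuals) / (len(xs) - 2))


# Hypothetical data set.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

# y = 1.9x is the least squares line for these data;
# y = -0.5 + 2x is a hypothetical competing line.
rmse_least_squares = root_mean_square_error(xs, ys, 0.0, 1.9)
rmse_other = root_mean_square_error(xs, ys, -0.5, 2.0)

print(f"least squares line: {rmse_least_squares:.3f}")
print(f"other line:         {rmse_other:.3f}")
```

The line with the smaller root mean square error is the better fit by this measure.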

The least squares line is often called the “best-fit line” because it has the smallest sum of squares of errors between data points and predictions from the line. However, because it places equal emphasis on each point, the least squares line can be affected by outliers. In contrast, the median-median model is relatively unaffected by one or two outliers. When you fit a line to data, it is always a good idea to check the line visually. Sometimes the median-median line or another line is a better fit than the least squares line.
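The effect of an outlier can be seen in a small sketch. The data are hypothetical: eight points lie exactly on y = 1 + 2x, and the ninth point is an outlier. The median-median procedure shown (three equal groups, a median point for each group, slope from the outer median points, and the average of the three intercepts) is an assumption based on the usual description of the method, and this simplified version only handles data sets whose size is a multiple of 3.

```python
from statistics import mean, median


def least_squares(points):
    """Return (intercept, slope) of the least squares line."""
    xs, ys = zip(*points)
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in points)
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope


def median_median(points):
    """Return (intercept, slope); assumes len(points) is a multiple of 3."""
    pts = sorted(points)
    k = len(pts) // 3
    groups = [pts[:k], pts[k:2 * k], pts[2 * k:]]
    (x1, y1), (x2, y2), (x3, y3) = [
        (median(x for x, _ in g), median(y for _, y in g)) for g in groups
    ]
    slope = (y3 - y1) / (x3 - x1)
    # Average the intercepts of the parallel lines through the three median points.
    intercept = (y1 + y2 + y3 - slope * (x1 + x2 + x3)) / 3
    return intercept, slope


# Hypothetical data: points on y = 1 + 2x, plus one outlier at x = 9.
points = [(x, 1 + 2 * x) for x in range(1, 9)] + [(9, 40)]

ls_intercept, ls_slope = least_squares(points)
mm_intercept, mm_slope = median_median(points)

print(f"least squares: y = {ls_intercept:.2f} + {ls_slope:.2f}x")
print(f"median-median: y = {mm_intercept:.2f} + {mm_slope:.2f}x")
```

In this sketch the single outlier pulls the least squares line away from the other eight points, while the median-median line still passes through them.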

Lesson 11.6 • The Least Squares Line (continued)
