
Statistics Study Guide Bivariate Data


  • Page 1 of 5

    When you're examining the relationship between two variables, you're looking at bivariate data: a data set with two variables. A scatterplot is a two-dimensional graph that displays a set of bivariate data on a single graph. It's the same type of graph used in algebra to show the graphs of equations. In a scatterplot, the data are in the form of pairs (x, y), and values are plotted along the horizontal x-axis and vertical y-axis. Scatterplots are used to examine whether observable linear or non-linear patterns exist in the data.

    The variable on the x-axis is often called the explanatory, or independent, variable; the variable along the y-axis is often called the response, or dependent, variable. An explanatory variable is one that may explain or cause changes in a second variable. The response variable is one that might be caused or explained by another variable.

    However, not every relationship between two variables represents an instance of causation. A relationship may indicate that the two variables are correlated, but a correlation only shows that the variables are related. It does not establish that changes in one variable cause changes in the other.

    Typically, scatterplots are used to examine the relationship between two quantitative variables. For example, you could collect age and weight information from a set of elementary and middle school students and graph the observations on a scatterplot to see if there is a linear relationship between age and weight. The same scatterplot can also be used to assess whether there is a positive or a negative relationship between the two variables. A positive relationship exists when the values of both variables change in the same direction (for example, as age increases, weight also increases). A negative relationship exists when the values of the two variables change in opposite directions (for example, as age increases, weight decreases).

    Examples of scatterplots showing positive and negative relationships are shown below:

    Positive Relationship
    Notice how the pattern of points moves from the bottom left corner to the top right corner of the graph.

    [Figure: scatterplot of y versus x showing a positive relationship]

    ______________________________ Copyright © 2011 Apex Learning Inc. (See Terms of Use at www.apexvs.com/TermsOfUse) TI-83 screens are used with the permission of the publisher. Copyright © 1996, Texas Instruments, Incorporated.

    AP Statistics Practice: Scatterplots and Bivariate Data

  • Page 2 of 5

    Negative Relationship
    Notice how the pattern of points moves from the top left corner to the bottom right corner.

    [Figure: scatterplot of y1 versus x1 showing a negative relationship]
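The two directions described above can also be checked numerically. The following is a minimal sketch, assuming the direction is judged by the sign of the sum of products of deviations from the means (the numerator of the sample covariance). The function name `relationship_direction` and the age and weight values are illustrative assumptions, not part of this guide, and the sign only summarizes the overall linear trend, so it does not replace looking at the plot.

```python
def relationship_direction(xs, ys):
    """Return 'positive', 'negative', or 'none' from the sign of the
    summed products of deviations (hypothetical helper, not from the guide)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Each product is positive when a point lies on the same side of both means,
    # so the sum is positive when x and y tend to increase together.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if cross > 0:
        return "positive"
    if cross < 0:
        return "negative"
    return "none"


# Illustrative (made-up) values, NOT data from this guide.
ages = [6, 8, 10, 12, 14]
weights_kg = [21, 26, 32, 41, 50]

print(relationship_direction(ages, weights_kg))        # positive
print(relationship_direction(ages, weights_kg[::-1]))  # negative
```

Reversing the weight list makes weight fall as age rises, which is why the second call reports a negative relationship.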

    In this Independent Study you'll create and interpret scatterplots. Although it's easy to create such plots by hand, the process can take a long time. Here are some instructions for creating a scatterplot using the TI-83/TI-84, in case you need a refresher. Let's assume you have the following data for a set of 10 individuals. The x-value represents the amount of change they have in their pocket (in cents), and the y-value represents their age (in years).

    Person #    x (cents)    y (years)
    1           23           34
    2           23           25
    3           21           24
    4           34           26
    5           45           43
    6           42           39
    7           34           47
    8           32           36
    9           22           38
    10          28           32

    The steps to create a scatterplot on the TI-83/TI-84 are:

    1. Clear two lists: Press STAT 4 2nd [L1] , 2nd [L2] ENTER.

    2. Enter data from x into one of the lists (in this case, L1).

    3. Enter data from y into a second list (in this case, L2).

    4. Press 2nd [STAT PLOT].

    5. Select 1 and turn ON Plot1.

  • Page 3 of 5

    6. Set up the plot as shown:

    7. Press ZOOM 9, and a scatterplot will appear.

    8. Press TRACE to see the x- and y-values of the points. By pressing the forward and back arrows, you'll be able to see the x- and y-values up and down the list you entered. (The up and down arrow keys won't work in this mode.) If you want to see the value for any location on the graph, press GRAPH and use any of the arrow keys. In the original screen shot, the highlighted point has an x-value of 23 and a y-value of 34 (person number 1, who has 23 cents in her pocket and is 34 years old).
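For readers working without a calculator, the steps above can be mirrored loosely in Python. This is a rough sketch under stated assumptions, not a reproduction of the TI-83/TI-84 behavior: the helper names `fit_window`, `text_scatter`, and `trace` are hypothetical, the 10% window padding is an assumption standing in for ZOOM 9, and the plot is drawn with text characters so that only the standard library is needed.

```python
# Data from the table in this guide: x = change in pocket (cents), y = age (years).
L1 = [23, 23, 21, 34, 45, 42, 34, 32, 22, 28]
L2 = [34, 25, 24, 26, 43, 39, 47, 36, 38, 32]


def fit_window(xs, ys, margin=0.1):
    """Pick axis limits that contain every point, padded on each side.
    The 10% padding is an assumption, not the calculator's documented rule.
    Assumes xs and ys each contain at least two distinct values."""
    x_pad = (max(xs) - min(xs)) * margin
    y_pad = (max(ys) - min(ys)) * margin
    return (min(xs) - x_pad, max(xs) + x_pad, min(ys) - y_pad, max(ys) + y_pad)


def text_scatter(xs, ys, width=40, height=12):
    """Render the points as '*' on a character grid (first row is the top)."""
    xmin, xmax, ymin, ymax = fit_window(xs, ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - xmin) / (xmax - xmin) * (width - 1))
        row = round((ymax - y) / (ymax - ymin) * (height - 1))
        grid[row][col] = "*"
    return "\n".join("".join(row) for row in grid)


def trace(xs, ys):
    """Yield (person, x, y) in list order, like stepping through the points."""
    for person, (x, y) in enumerate(zip(xs, ys), start=1):
        yield person, x, y


print(text_scatter(L1, L2))
print(next(trace(L1, L2)))  # (1, 23, 34)
```

The first traced point is person number 1 with 23 cents and an age of 34, matching the first row of the data table.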


  • Page 4 of 5

    Questions Part 1

    1. A social skills training program was implemented for seven students with mild handicaps in a study to determine whether the program caused improvement in pre/post measures and behavior ratings. For one such test, these are the pre- and posttest scores for the seven students:

    Student    Pretest    Posttest
    Earl       101        113
    Ned        89         89
    Jasper     112        121
    Charlie    105        99
    Tom        90         104
    Susie      91         94
    Lori       89         99

    Data Source: Gregory K. Torrey, S.F. Vasa, J.W. Maag, and J.J. Kramer, Social Skills Interventions Across School Settings: Case Study Reviews of Students with Mild Disabilities, Psychology in the Schools 29 (July 1992):248.

    A. Draw a scatterplot relating posttest score to pretest score. (You can create the scatterplot either by hand or by using the TI-83/TI-84.)

    B. Describe the relationship between pre- and posttest scores using the graph in part A. Do you see any trend?

    2. Health-conscious Americans often consult the nutritional information on food packages in an attempt to avoid food with large amounts of fat, sodium, or cholesterol. The following information was taken from eight different brands of American cheese slices:

    Brand                     Fat (g)   Saturated Fat (g)   Cholesterol (mg)   Sodium (mg)   Calories
    Kraft Deluxe American     7         4.5                 20                 340           80
    Kraft Velveeta Slices     5         3.5                 15                 300           70
    Private Selection         8         5.0                 25                 520           100
    Ralphs Singles            4         2.5                 15                 340           60
    Kraft 2% Milk Singles     3         2.0                 10                 320           50
    Kraft Singles American    5         3.5                 15                 290           70
    Borden Singles            5         3.0                 15                 260           60
    Lake to Lake American     5         3.5                 15                 330           70

    A. Draw a scatterplot for fat and saturated fat. Describe the relationship.

    B. Draw a scatterplot for fat and calories. Compare the pattern to that found in part A.

    C. Draw a scatterplot for fat versus sodium and another for cholesterol versus sodium. Compare the patterns. Are there any clusters or outliers? (You might want to enter each of the five columns into the TI-83/TI-84.)
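The suggestion to enter all five columns once and then pair them as needed can be sketched in Python as well. This is an illustrative sketch only: the dictionary keys and the helpers `pairs` and `repeated_points` are hypothetical names, and the values are copied from the cheese table in the same brand order. The second helper is included because several brands share identical values, so some points land on top of each other and a plot can show fewer dots than there are brands.

```python
from collections import Counter

# One list per column of the cheese table, in the table's brand order.
cheese = {
    "fat_g":          [7, 5, 8, 4, 3, 5, 5, 5],
    "sat_fat_g":      [4.5, 3.5, 5.0, 2.5, 2.0, 3.5, 3.0, 3.5],
    "cholesterol_mg": [20, 15, 25, 15, 10, 15, 15, 15],
    "sodium_mg":      [340, 300, 520, 340, 320, 290, 260, 330],
    "calories":       [80, 70, 100, 60, 50, 70, 60, 70],
}


def pairs(table, x_name, y_name):
    """Return the (x, y) points for any two columns of the table."""
    return list(zip(table[x_name], table[y_name]))


def repeated_points(points):
    """Map each point that occurs more than once to its count.
    Repeated points overlap on a scatterplot."""
    counts = Counter(points)
    return {point: count for point, count in counts.items() if count > 1}


fat_vs_sat = pairs(cheese, "fat_g", "sat_fat_g")
print(fat_vs_sat)
print(repeated_points(fat_vs_sat))  # {(5, 3.5): 3}
```

Any other pairing asked for in the question, such as fat versus sodium, uses the same `pairs` call with different column names.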


  • Page 5 of 5

    Questions Part 2

    1. You have the following data from the 20 largest counties in the United States:

    State   County            Median HH Income (1989)   % Below Poverty Level (1989)   Serious Crime Rate (per 100,000 residents) in 1989
    CA      Los Angeles       $39,035                   11.6                           7,614
    IL      Cook              $39,296                   11.1                           8,475
    TX      Harris            $36,404                   12.5                           8,807
    CA      San Diego         $39,798                   8.1                            6,816
    CA      Orange            $51,167                   5.2                            5,873
    NY      Kings             $30,033                   19.5                           9,265
    AZ      Maricopa          $36,078                   8.8                            8,179
    MI      Wayne             $34,099                   16.9                           9,248
    FL      Dade              $31,113                   14.2                           12,311
    TX      Dallas            $36,982                   10.4                           11,322
    WA      King              $44,555                   5.0                            8,040
    PA      Philadelphia      $30,140                   16.1                           6,836
    CA      San Bernardino    $36,977                   10.3                           6,491
    CA      Santa Clara       $53,670                   5.0                            5,037
    OH      Cuyahoga          $35,749                   11.0                           5,425
    MA      Middlesex         $52,112                   4.2                            3,599
    NY      Suffolk           $53,247                   3.3                            5,029
    PA      Allegheny         $35,338                   8.7                            3,805
    CA      Alameda           $45,037                   8.1                            8,220
    NY      Nassau            $60,619                   2.5                            3,343

    Source for data: Bureau of Justice Statistics Crime and Justice, 1997 www.ojp.usdoj.gov/bjs

    You're interested in whether a relationship exists between the measures of income and affluence and the serious crime rate in US counties. Create two different scatterplots: Median Income vs. Serious Crime Rate and Percent Below Poverty Level vs. Serious Crime Rate. Discuss what the scatterplots tell you about the relationships present in these sets of variables. Describe the pattern of the relationship in each plot (linear vs. non-linear, positive vs. negative, presence of clusters or gaps). In either case, can you be certain by looking only at these data that a change in one variable causes a change in another?

    Acknowledgements

    Question 1: This is question 3.13 (a, b) from page 107 of Introduction to Probability and Statistics, Tenth Edition, by W. Mendenhall, R. Beaver, and B. Beaver. Copyright 1999 by Brooks Cole, division of Thompson Learning Incorporated. Further reproduction is prohibited without permission of the publisher.

    Question 2: This is question 3.15 (b, c, d) from page 112 of Introduction to Probability and Statistics, Tenth Edition, by W. Mendenhall, R. Beaver, and B. Beaver. Copyright 1999 by Brooks Cole, division of Thompson Learning Incorporated. Further reproduction is prohibited without permission of the publisher.
