Example Problems Unit I Notes Stats


  • 8/8/2019 Example Problems Unit I Notes Stats


    Unit I Mayfield's Notes

    Chapter 1

    This chapter shows you how to use the textbook. What does Mrs. Mayfield find important about the textbook?

    o TI Tips: These little boxes show you how to do the calculations in your calculator.
    o Step-by-Step: These are example problems that are similar to homework problems.
    o What can go wrong: Read through these, but do not focus too much on them or you are likely to commit the same mistakes.
    o What we have learned: Gives you a basic outline of what you have learned in that chapter. This is a good thing to read over to make sure you are prepared for a chapter quiz.

    Chapter 2

    Summaries of data almost always should be verbal (a sentence), visual (a graph), and numerical (a number).

    No analysis is complete without telling what it means, i.e., a connection back to the real world.

    No analysis (nor AP answer) is complete without a connection back to the real world. The 4 Cs of any answer: clear, concise, complete, and in context. Answers are not numbers but sentences.

    Context is critical. It matters who and what we are studying.
    o Who - who are we studying? (could be an object)
    o What - what do we want to know about this who?
    o When - when did this study occur?
    o Where - where did the study occur?
    o Why - why did the researchers conduct the study?
    o How - how was the data collected?

    There are univariate and bivariate data analyses. Univariate (one variable) - most students prefer the XBOX system. Bivariate (two variables) - females tend to prefer the Wii and males tend to prefer the XBOX.

    Important Terms
    Statistic - a numerical summary of data. Example: 23% of the class is male
    Data - the who and what you are studying. A datum is one point, whereas a statistic is the summary, i.e., a who cannot be a statistic, only a datum

    Categorical data - non-numerical data, or numerical data whose mean is irrelevant. Examples: male/female, zip code, phone number, political party

    Quantitative data - numerical data whose mean is relevant; a quantity is measured. Examples: age, grade

    Variable - an attribute or characteristic of an individual or object whose value varies from case to case; what you want to know

    Read the TI Tips on page 14 and make sure you do these things in your calculator

    For each description of data, identify the Ws, name the variables, specify for each variable whether its use indicates it should be treated as categorical or quantitative, and for any quantitative variable, identify the units in which it was measured (or note that they were not provided).

    o Example 1) A Consumer Reports article on energy bars gave the brand name, flavor, price, number of calories, and grams of protein and fat


    o Example 2) A report on the Boston Marathon listed each runner's gender, country, age, and time

    Chapter 3

    Frequency table - records the category names and the count in each category. Example: grades and count of students, or type of car and count

    Relative frequency table - displays the percentages rather than the counts. Example: the percent of students with As, Bs, and Cs in each class

    Graphs you should already be familiar with: bar charts and pie charts
    Area Principle - the area occupied by a part of the graph should correspond to the magnitude of the value it represents (see graph and caption on page 22)
    Rule for graphing: the y axis is the dependent value. Example: y (trade-in value) depends on x (the odometer)

    Contingency table: individuals are distributed along each variable contingent on the value of the other variable. Example: Class vs. Survival on the Titanic

             First Class   Second Class   Third Class   Crew   Total
    Alive            203            118           178    212     711
    Dead             122            167           528    673    1490
    Total            325            285           706    885    2201

    Marginal distribution - the distribution of either variable alone. The counts or percentages are the totals found in the margins (last row or column). Example: looking only at the Total row to see the class breakdown of everyone on board, OR looking only at the Total column to see the alive and dead breakdown.

    Conditional distribution - restricting the Who to consider only a smaller group of individuals. Example: given that the person is in first class, find the probability that they survived.

    Independent - one variable DOES NOT rely on the other. Examples: sex and eye color, shoe size and IQ

    Segmented bar chart - each bar represents the whole and divides it proportionally into segments corresponding to the percentage in each group (see page 29)
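The marginal, conditional, and independence ideas above can be checked by hand on the Titanic table. The following is a minimal Python sketch, not a method from the textbook or the class; the variable names are made up for illustration and the counts are copied from the contingency table in these notes. If class and survival were independent, the conditional survival rates would all be about the same as the overall (marginal) rate.

```python
# Counts copied from the Titanic contingency table in these notes.
classes = ["First", "Second", "Third", "Crew"]
alive = [203, 118, 178, 212]
dead = [122, 167, 528, 673]

# Column totals (bottom margin) and the grand total.
total_by_class = [a + d for a, d in zip(alive, dead)]
grand_total = sum(total_by_class)

# Marginal distribution of survival: uses only the row totals.
marginal_alive = sum(alive) / grand_total

# Conditional distribution of survival GIVEN class:
# restrict the Who to one column of the table at a time.
survival_given_class = {
    name: a / n for name, a, n in zip(classes, alive, total_by_class)
}

print(f"Overall survival rate: {marginal_alive:.1%}")
for name, rate in survival_given_class.items():
    print(f"Survival rate given {name}: {rate:.1%}")
```

Comparing each printed conditional rate with the overall rate is one informal way to judge whether the two variables look independent.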

    Simpson's Paradox - when averages are taken across certain groups, they can appear to contradict overall averages. Occurs rarely in real life

            Day                   Night                 Overall
    Moe     90 out of 100 (90%)   10 out of 20 (50%)    100 out of 120 (83%)
    Jill    19 out of 20 (95%)    75 out of 100 (75%)   94 out of 120 (78%)

    Who had the better day on-time record?
    Who had the better night on-time record?
    Who had the better overall on-time record?


    Famous example: Admission rates at Berkeley
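The Moe and Jill table can be recomputed directly from the counts. This is a small Python sketch with hypothetical names (`records`, `rate`, `overall`), assuming each table entry is an (on-time count, total count) pair; it only redoes the arithmetic already shown in the table.

```python
# (on-time count, total count) pairs copied from the table in these notes.
records = {
    "Moe": {"day": (90, 100), "night": (10, 20)},
    "Jill": {"day": (19, 20), "night": (75, 100)},
}

def rate(on_time, total):
    """On-time rate for a single group."""
    return on_time / total

def overall(person):
    """Pool day and night counts first, then compute one rate."""
    on_time = sum(hits for hits, _ in records[person].values())
    total = sum(n for _, n in records[person].values())
    return on_time / total

for person, groups in records.items():
    day = rate(*groups["day"])
    night = rate(*groups["night"])
    print(f"{person}: day {day:.0%}, night {night:.0%}, overall {overall(person):.0%}")
```

The overall rate is a weighted average of the day and night rates, so it depends on how many cases each person had in each group, not just on the two percentages.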

    Example 1: Use the information below to answer the following.
    o Which class has the highest percentage of As?
    o Which class has the lowest percentage failing?
    o Which grade (A-F) has the highest percentage?
    o What kind of conclusions can you make from this data? Are grade and class independent?

             A    B    C    D    F
    9th     50   75   25   25   75
    10th    20   60   50   10   60
    11th    35   15   25   50   50
    12th    40   10   10   30   10

    Example 2: Use the class data:

    o What percent of girls prefer the Wii?
    o What percent of students who prefer the Wii are girls?
    o What percent of the class are girls who prefer the Wii?
    o What is the marginal frequency distribution for system preference?
    o What is the conditional relative frequency distribution of gender among those who prefer the PS3?

    Example 3: Classwork worksheet
    Example 4: Is the color distribution of M&Ms independent of the type of candy?

    Chapter 4

    Histogram - each bar represents the frequency or relative frequency of values that fall in an interval of values

    Stem and leaf plot - shows quantitative data in a way that sketches the distribution. Example: amount of money spent on school supplies

    Dot plot - a dot for each case against a single axis. Example: number of windows in your house

    With all distributions you should be able to describe the shape, center, and spread
    o SHAPE - what kind of mode does it have? What about symmetry? Any outliers?
        Mode - the number that occurs most often; the tallest bar on a frequency distribution
        Unimodal, bimodal, multimodal, uniform
        Symmetric vs. left (negatively) skewed vs. right (positively) skewed


        Skewed to the right - has a right tail. Example: income. Most people have low income and some people have high income, making a right (positive) tail
        Outliers - numbers that are far away from the rest of the data
        Are there clusters or gaps?
    o CENTER - median (middle) of the data set
    o SPREAD - variation: standard deviation, range, or interquartile range; tells us how far apart the numbers are

    Timeplots - show what happens over a period of time
    o Which one depends on the other? (time is always x)
    o Example: a walk about town or a workout

    To compare two distributions, use a histogram or back-to-back stem and leaf plot. Example: grades (1st and 3rd hour)

    What can go wrong on page 59
    o Choose your graph wisely
    o Do not look for shape, center, and spread on a bar chart; use a stem and leaf plot or dot plot
    o Choose a bin width appropriate for the data
    o Avoid inconsistent scales
    o Label clearly

    Read the TI Tips on pages 54 and 55 and make sure you do these things in your calculator

    Example 1: In what ways are stem-and-leaf displays, dot plots, and histograms all similar? What information is apparent in some but not in others? What are the advantages and disadvantages of each type of graph?

    Example 2: Make a dot plot of the number of siblings. (Is it skewed to the left or right? Where is the tail pointing?)

    Example 3: What should the stems be for each of these different magnitudes and ranges?
    o Quiz scores out of 100
    o Back-to-back stem plots for males vs. females for number of states visited
    o Student weights
    o Weights of cattle

    Example 4: Describe the shape, center, and spread for the weight of pennies.
    2.57, 2.56, 3.14, 3.03, 3.13, 2.47, 2.43, 3.11, 3.06, 2.48,
    2.51, 2.50, 3.07, 3.08, 3.01, 2.45, 2.51, 3.13, 2.51, 3.12,
    3.10, 3.08, 2.46, 2.44, 2.47, 2.54, 3.09, 3.13, 2.56, 2.49

    Put the data into L1 and look at the graph
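For anyone without the calculator at hand, the penny weights can also be sketched as a rough text histogram. This is a Python sketch rather than the TI procedure the notes refer to; the bin width of 0.1 is an assumption chosen for illustration, and the name `bins` is made up.

```python
from collections import Counter

# Penny weights copied from Example 4 in these notes.
weights = [2.57, 2.56, 3.14, 3.03, 3.13, 2.47, 2.43, 3.11, 3.06, 2.48,
           2.51, 2.50, 3.07, 3.08, 3.01, 2.45, 2.51, 3.13, 2.51, 3.12,
           3.10, 3.08, 2.46, 2.44, 2.47, 2.54, 3.09, 3.13, 2.56, 2.49]

# Bin by tenths (assumed bin width 0.1): 2.57 goes in bin 25, meaning 2.50-2.59.
# Working in whole hundredths avoids floating-point rounding surprises.
bins = Counter(round(w * 100) // 10 for w in weights)

# Print every bin from smallest to largest, including empty ones,
# so that clusters and gaps are visible.
for b in range(min(bins), max(bins) + 1):
    print(f"{b / 10:.1f} | {'*' * bins.get(b, 0)}")
```

Changing the bin width changes how the picture looks, which is why the notes say to choose a bin width appropriate for the data.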


    Chapter 5

    Median - the middle value, which divides the histogram into two equal areas; a measure of center

    Range = max - min; a measure of spread
    5-number summary - max, Q3, median, Q1, min
    Q1 and Q3 are the quarter marks, which divide the data at 25% and 75%
    Percentile - the ith percentile is the number that falls above i% of the data
    Interquartile range (IQR) = upper quartile - lower quartile; a measure of spread
    Box plot - made up of the 5-number summary; good for comparing groups of data
    Mean - the sum divided by the total number of entries n; the point at which the histogram would balance
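The definitions above can be tried out on a small list. The following Python sketch uses made-up scores and a hypothetical helper, `five_number_summary`. It assumes quartiles are found as the medians of the lower and upper halves of the sorted data, leaving out the middle value when n is odd; other quartile conventions exist and can give slightly different answers.

```python
from statistics import mean, median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the middle position
    upper = data[half + n % 2:]    # values above it (middle value skipped if n is odd)
    return min(data), median(lower), median(data), median(upper), max(data)

scores = [2, 4, 5, 7, 8, 10, 12, 15, 20]  # made-up data, not class data

low, q1, med, q3, high = five_number_summary(scores)
print("5-number summary:", (low, q1, med, q3, high))
print("Range:", high - low)
print("IQR:", q3 - q1)
print("Mean:", round(mean(scores), 2))
```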

    Standard deviation - how far each value falls from the mean
    o The standard deviation is only appropriate for symmetric data

    Variance - add up all the squared deviations from the mean and find their average (dividing by n - 1 rather than n)

    $$s^2 = \frac{\sum (y - \text{mean})^2}{n - 1}$$
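A short worked example may make the variance calculation concrete. This Python sketch follows the steps by hand: find the mean, square each deviation, add them up, and divide by n - 1. The data are made up so that the arithmetic comes out evenly.

```python
import math

y = [2, 4, 4, 6, 9]   # made-up data, chosen so the arithmetic is easy to follow
n = len(y)

mean = sum(y) / n                                           # 25 / 5 = 5.0
squared_deviations = [(value - mean) ** 2 for value in y]   # 9, 1, 1, 1, 16
variance = sum(squared_deviations) / (n - 1)                # 28 / 4 = 7.0
sd = math.sqrt(variance)                                    # square root of the variance

print(f"mean = {mean}, variance = {variance}, sd = {sd:.2f}")
```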

    Standard deviation - the square root of the variance
    When there is an outlier in the data, the mean, range, and standard deviation change a lot, but the median and IQR are more stable. Example:
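One possible illustration, sketched in Python with made-up numbers rather than the class example: the same small data set is summarized with and without a single extreme value. The helper `iqr` is hypothetical and assumes the median-of-halves convention for quartiles.

```python
from statistics import mean, median, stdev

def iqr(values):
    """IQR using medians of the lower and upper halves (an assumed convention)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    return median(data[half + n % 2:]) - median(data[:half])

typical = [8, 9, 10, 10, 11, 12]   # made-up values, not class data
with_outlier = typical + [60]      # same values plus one extreme point

for label, data in [("without outlier", typical), ("with outlier", with_outlier)]:
    print(f"{label}: mean={mean(data):.2f}, sd={stdev(data):.2f}, "
          f"range={max(data) - min(data)}, median={median(data)}, IQR={iqr(data)}")
```

Comparing the two printed lines shows which summaries move a lot and which barely move when the extreme value is added.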

    Outlier Rule
    o Lower Fence = Q1 - 1.5(IQR)
    o Upper Fence = Q3 + 1.5(IQR)
    o The Outlier Rule is just a rule of thumb, not a Law of the Universe. Always look at your data and check that the result is sensible.
    Make sure to check assumptions before choosing mean vs. median
    o Quantitative Data Condition
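The two fence formulas in the Outlier Rule can be applied mechanically once Q1 and Q3 are known. This is a minimal Python sketch with hypothetical function names; the data and quartiles are made up and assumed to have been computed already (for example, on the calculator).

```python
def outlier_fences(q1, q3):
    """Return (lower fence, upper fence) from the 1.5 * IQR rule of thumb."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def flag_outliers(values, q1, q3):
    """List the values that fall outside the fences."""
    low, high = outlier_fences(q1, q3)
    return [v for v in values if v < low or v > high]

data = [8, 9, 10, 10, 11, 12, 60]   # made-up data
q1, q3 = 9, 12                      # assumed to be computed beforehand

print("Fences:", outlier_fences(q1, q3))
print("Flagged:", flag_outliers(data, q1, q3))
```

A flagged value is only a candidate; as the notes say, the rule is a rule of thumb, so the data still need a sensibility check.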

    Read the TI Tips on pages 80 and 86 and make sure you do these things in your calculator

    The mean and standard deviation are only appropriate when the data are symmetric and there are no outliers. WHY?

    For skewed data it is better to report the median and IQR rather than the mean and standard deviation. WHY? (However, the median and IQR can be used for symmetric data; they are just not as powerful.)


    Example 1 - Use the class data for number of text messages a day to make a box plot.

    Example 2 - Classwork worksheet

    Example 3 - Look at the data below. What is each group's mean and standard deviation? What does this tell you?

                    Group
       1     2     3     4     5     6
      10     8     0     0     0     4
      10    10    10     8     2     6
      10    10    10    10    10     8
      10    10    10    12    18    14
      10    12    20    20    20    18
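The means and standard deviations for Example 3 can be checked with a few lines of Python. This sketch assumes each column of the table above is one group (Group 1 through Group 6) and uses the sample standard deviation, which divides by n - 1; the dictionary name `groups` is made up.

```python
from statistics import mean, stdev

# Each list is one column of the Example 3 table, read top to bottom
# (an assumption about how the table is laid out).
groups = {
    1: [10, 10, 10, 10, 10],
    2: [8, 10, 10, 10, 12],
    3: [0, 10, 10, 10, 20],
    4: [0, 8, 10, 12, 20],
    5: [0, 2, 10, 18, 20],
    6: [4, 6, 8, 14, 18],
}

for g, values in groups.items():
    print(f"Group {g}: mean = {mean(values):.2f}, s = {stdev(values):.2f}")
```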

    Chapter 6