Statistical Reasoning “He told me I was average. I told him he was mean.”


  • Statistical Reasoning: “He told me I was average. I told him he was mean.”

  • Descriptive Statistics: used to organize and summarize data in a meaningful way.

    Frequency distributions: Where are the majority of the scores? Used to organize raw scores, or data, so that the information makes sense at a glance. They arrange scores in order of magnitude and show the number of times each score occurs.
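A frequency distribution can be sketched in a few lines of Python. The quiz scores here are hypothetical, made up only to illustrate the idea of ordering scores by magnitude and counting how often each occurs.

```python
from collections import Counter

# Hypothetical raw quiz scores, invented for illustration only.
raw_scores = [7, 9, 8, 7, 10, 8, 7, 6, 9, 7]


def frequency_distribution(scores):
    """Return (score, count) pairs ordered from the highest score to the lowest."""
    counts = Counter(scores)
    return sorted(counts.items(), reverse=True)


distribution = frequency_distribution(raw_scores)

# Print a rough text chart: one '#' per occurrence of each score.
for score, count in distribution:
    print(f"{score:>3}  {'#' * count}  ({count})")
```

The longest row of marks shows where the majority of the scores fall (here, at 7).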

  • Histograms & Frequency Polygons (showing your data at a glance): ways of showing your frequency distribution data. A histogram graphically represents a frequency distribution as a bar chart whose vertical bars touch. When you have a continuous scale (for example, scores on a test go from 0-100, continuously getting larger), the bars touch, because you have to have a class for each score to fall into, and you can't have any gaps. This is different from a simple bar graph, which is used when you have non-continuous classes (for example, which candidate do you support, Obama or McCain? You'd have a bar for each, with gaps in between, because you can't fall between two candidates; you have to pick one).

  • Histogram: uses a bar chart with touching bars to show data

  • Frequency Polygon: uses a line graph to show data. A frequency polygon graphically represents a frequency distribution by marking each score category along a graph's horizontal axis and connecting the points with straight lines (a line graph).

  • Measures of Central Tendency: a single number that gives us information about the center of a frequency distribution. There are 3 types, shown here for the scores 4, 4, 3, 4, 5.

    1. Mode = most common score = 4 (reports what there is more of; used for data whose categories have no numerical connection: you can't average men & women).
    2. Mean = arithmetic average = 20/5 = 4 (has the most statistical value but is susceptible to the effects of extreme scores).
    3. Median = middle score = 4 (half the scores are higher, half are lower; used when there are extreme scores).
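As a quick check of the three measures, here is a minimal sketch using Python's standard `statistics` module on the five scores from the slide.

```python
import statistics

scores = [4, 4, 3, 4, 5]

mode = statistics.mode(scores)      # most common score
mean = statistics.mean(scores)      # arithmetic average: 20 / 5
median = statistics.median(scores)  # middle score once the list is sorted

print(f"mode={mode}, mean={mean}, median={median}")
```

For this small, symmetric set of scores all three measures agree at 4.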

  • Central Tendency: an extremely high or low price/score can skew the mean. Sometimes the median is better at showing you the central tendency.

    1968 TOPPS Baseball Cards
    Nolan Ryan         $1500
    Billy Williams     $8
    Luis Aparicio      $5
    Harmon Killebrew   $5
    Orlando Cepeda     $3.50
    Maury Wills        $3.50
    Jim Bunning        $3
    Tony Conigliaro    $3
    Tony Oliva         $3
    Lou Pinella        $3
    Mickey Lolich      $2.50
    Elston Howard      $2.25
    Jim Bouton         $2
    Rocky Colavito     $2
    Boog Powell        $2
    Luis Tiant         $2
    Tim McCarver       $1.75
    Tug McGraw         $1.75
    Joe Torre          $1.50
    Rusty Staub        $1.25
    Curt Flood         $1

    With Ryan:     Median = $2.50   Mean = $74.14
    Without Ryan:  Median = $2.38   Mean = $2.85
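The effect of the one extreme price can be reproduced with a short sketch. It uses the card prices listed on the slide; the variable names are only illustrative.

```python
import statistics

# Card prices from the slide, in dollars; the first entry is Nolan Ryan.
prices = [1500, 8, 5, 5, 3.50, 3.50, 3, 3, 3, 3, 2.50, 2.25,
          2, 2, 2, 2, 1.75, 1.75, 1.50, 1.25, 1]

with_ryan = prices
without_ryan = prices[1:]  # drop the single extreme score

mean_with = statistics.mean(with_ryan)
median_with = statistics.median(with_ryan)
mean_without = statistics.mean(without_ryan)
median_without = statistics.median(without_ryan)

print(f"With Ryan:    median=${median_with:.2f}  mean=${mean_with:.2f}")
print(f"Without Ryan: median=${median_without:.2f}  mean=${mean_without:.2f}")
```

Removing one card moves the mean from about $74 to under $3, while the median barely changes. That is why the median is often the better summary when there is an extreme score.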

  • Does the mean accurately portray the central tendency of incomes? NO! What measure of central tendency would more accurately show income distribution? The median: the majority of the incomes surround that number.

  • Measures of Variability: give us a single number that tells us how spread out the scores are in a frequency distribution. (See the example of why this is important.)
    Range: the difference between the highest & lowest scores. Take the highest score and subtract the lowest score from it (can be skewed by an extreme score).
    Standard Deviation: how spread out is your data? The larger this number is, the more spread out the scores are from the mean. The smaller this number is, the more closely the scores cluster around the mean.

  • Calculating Standard Deviation: how spread out (consistent) is your data?
    1. Calculate the mean.
    2. Take each score and subtract the mean from it.
    3. Square the new scores to make them positive.
    4. Mean (average) the squared scores.
    5. Take the square root of that mean to get back to your original unit of measurement.
    6. The smaller the number, the more closely packed the data. The larger the number, the more spread out it is.

  • Standard Deviation (punt distances)

    Punt Distance   Deviation from Mean   Deviation Squared
    36              36 - 40 = -4          16
    38              38 - 40 = -2          4
    41              41 - 40 = +1          1
    45              45 - 40 = +5          25

    Mean: 160/4 = 40 yds
    Deviations squared (each number multiplied by itself) & added together: 16 + 4 + 1 + 25 = 46
    Variance: 46/4 = 11.5

    Standard Deviation: √11.5 ≈ 3.4 yds
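The steps for calculating standard deviation can be sketched in Python using the four punt distances from the slide. Note the assumption carried over from the slide: the squared deviations are averaged over all 4 scores (a population standard deviation), not over n - 1 as is common for samples.

```python
import math

punts = [36, 38, 41, 45]  # punt distances in yards, from the slide

# Step 1: calculate the mean.
mean = sum(punts) / len(punts)

# Step 2: subtract the mean from each score.
deviations = [x - mean for x in punts]

# Step 3: square each deviation so every value is positive.
squared = [d ** 2 for d in deviations]

# Step 4: average the squared deviations (dividing by N, as the slide does).
variance = sum(squared) / len(squared)

# Step 5: take the square root to return to the original unit (yards).
standard_deviation = math.sqrt(variance)

print(f"mean={mean}, variance={variance}, sd={standard_deviation:.2f} yds")
```

Python's `statistics.pstdev(punts)` performs the same population calculation in one call.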

  • Multiple Choice: Mean = 34.3, SD = 4.2. Essay: Mean = 10.2, SD = 2.0. Composite: Mean = 9.3, SD = 2.3. Are these scores consistent? Is there a skew?

    [Charts: distributions of letter grades (F through A+) for the multiple-choice, essay, and composite scores]

  • Z-Scores: a number expressed in standard deviation units that shows an individual score's deviation from the mean. Basically, it shows how you did compared to everyone else. A + Z-score means you are above the mean; a - Z-score means you are below the mean. Z-score = (your score minus the average score) divided by the standard deviation. Which class did you perform better in compared to your classmates?

    Test      Total   Your Score   Average Score   S.D.
    Biology   200     168          160             4
    Psych.    100     44           38              2

    Z-score in Biology: 168 - 160 = 8, 8/4 = +2
    Z-score in Psych: 44 - 38 = 6, 6/2 = +3

    You performed better in Psych compared to your classmates.
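The Z-score comparison can be written as a small function. This is a minimal sketch; the function name `z_score` is just an illustrative choice, and the numbers come from the Biology and Psych example.

```python
def z_score(score, mean, sd):
    """How many standard deviations a score sits above (+) or below (-) the mean."""
    return (score - mean) / sd


biology = z_score(score=168, mean=160, sd=4)
psych = z_score(score=44, mean=38, sd=2)

print(f"Biology z = {biology:+.1f}")
print(f"Psych   z = {psych:+.1f}")

# The larger Z-score marks the class where you did better relative to classmates.
better = "Psych" if psych > biology else "Biology"
print(f"Better relative performance: {better}")
```

Even though the raw Biology score is higher, the Psych score is further above its class average when measured in standard deviation units.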

  • 9/14/2010. Photo courtesy of Judy Davidson, DNP, RN

  • Standard Normal Distribution Curve: characteristics of the normal curve. A bell-shaped curve where the mean, median and mode are all the same and fall exactly in the middle.
    [Figure: normal curve of Wechsler Intelligence Scores, horizontal axis marked in + or - numbers from -3 to +3 around 0]

  • Skewed Curves

  • Inferential Statistics: help us determine if our results are legit and can be generalized to the public. They help determine whether a study's outcome is more than just chance events, and are used to predict things about a population based on a sample. 3 Principles of Inferential Statistics:
    1. Non-biased sample: representative samples are better than biased samples for generalizing data.
    2. Less variability is better: an average is more reliable when it comes from scores with low variability.
    3. More cases are better than fewer: averages based on many cases are more reliable.

  • Statistically Significant: the possibility that the differences in results between the experimental and control groups could have occurred by chance is no more than 5 percent. You must be at least 95% certain the differences between the groups are due to the independent variable.

  • Statistical Significance: the p value is the likelihood that a result at least this large would occur by chance alone. In other words, are the results statistically significant? If the answer is yes, there is support for generalizing them to a larger population. Researchers want this number to be as small as possible to show that any change in their experiment was caused by the independent variable and not some outside force. Results are considered statistically significant if the probability of obtaining them by chance alone is less than .05, or 5% (p < .05): the researcher must be 95% certain the results are not caused by chance. Replication of the experiment helps confirm whether the result holds up. Effect Size: a measure of the strength of a relationship between variables (reported along with statistical significance to show the quality of results).

  • p Value: describes the percent of the population/area under the curve (in the tail) that is beyond our statistic. This is the percentage chance that chance alone, rather than the independent variable, may be responsible for our results. Check out “P Values made simple” for more help.
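The "area in the tail" idea can be sketched for one simple case. This example assumes the statistic is a Z-score on a standard normal curve and that only the upper tail is of interest; both are assumptions for illustration, not something the slides specify. The function name `upper_tail_area` is hypothetical.

```python
import math


def upper_tail_area(z):
    """Area under the standard normal curve to the right of z (one-tailed)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


for z in (0.0, 1.0, 1.645, 2.0, 3.0):
    p = upper_tail_area(z)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"z = {z:<5}  tail area = {p:.4f}  ({verdict} at p < .05)")
```

Under these assumptions, a Z-score of about 1.645 leaves roughly 5% of the area in the tail, so statistics further out than that fall below the p < .05 cutoff.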

    *I always remember that descriptive statistics describe a population.
    *Histograms are used when you have a continuous scale (for example, scores on a test go from 0-100, continuously getting larger). That's why the bars touch: you have to have a class for each score to fall into, and you can't have any gaps. Bar graphs are used when you have non-continuous classes (for example, which candidate do you support: Obama, Clinton, Edwards? You'd have a bar for each, with gaps in between, because you can't fall between two candidates; you have to pick one). We usually call frequency polygons line graphs.
    *This is a bar graph. If you wanted it to be a histogram, you would make the bars equal width, touching. You would also give some numerical scale instead of letter grades, indicating what the cut-off is for A, B, C, and D.
    *This is ok
    *ok
    *Another good example that kids get easily is home values. We have lots of big $$$ homes in GH (out by the lake) that, when averaged in, skew the mean home price. That's why you always hear realtors talk about the median home value.
    *ok
    *ok
    *ok
    *I don't understand this slide; probably your oral presentation clarifies it!
    *Ok we do examples exactly like this!
    *ok
    *We usually say skewed right (positive) or left (negative).
    *Usually you would use inferential statistics to try to predict things about a population based on a sample. For example, we surveyed 50 staff members in the district about their level of education and are trying to use that to predict the average level of education for all staff in the district.
    *ok