Descriptive Statistics Review – Chapter 14




Data

Data – collection of numerical information

Frequency distribution – set of data with frequencies listed

Stem-and-Leaf Plot

Bar graph – draw rectangles (bars) above each class (listed on the horizontal axis) with height equal to the frequency of the class (listed on the vertical axis). Bars have equal width and do not touch.

Histogram – used to graph a frequency distribution when the classes are continuous variable quantities*. Bars touch.

* If a data value falls on the boundary between two data classes, make it clear whether you are counting that value in the class to the right or to the left of the data value.
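The boundary note above can be made concrete with a small sketch. The data, the class width, and the function name `frequency_distribution` are hypothetical, and the choice to count a boundary value in the class to its right is only one of the two conventions the note allows.

```python
from collections import Counter

def frequency_distribution(values, start, width, n_classes):
    """Count values in classes [start + k*width, start + (k+1)*width).

    Assumed convention for this sketch: a value on a boundary is counted
    in the class to its right (left endpoint included, right excluded).
    """
    counts = Counter()
    for v in values:
        k = int((v - start) // width)  # index of the class containing v
        if 0 <= k < n_classes:
            counts[k] += 1
    return [(start + k * width, start + (k + 1) * width, counts[k])
            for k in range(n_classes)]

# Hypothetical scores: 70 and 80 both sit on class boundaries.
scores = [62, 65, 70, 70, 74, 79, 80, 88]
table = frequency_distribution(scores, start=60, width=10, n_classes=3)
for lo, hi, freq in table:
    print(f"[{lo}, {hi}): {freq}")
```

Switching to the other convention (boundary values counted to the left) would move both 70s into the first class and 80 into the second, so the same data can produce a different histogram.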


Measures of Central Tendency

Mean, Median, Mode

Outliers – affect the mean much more than the median or mode

5 Number Summary: Min, Q1, Median, Q3, Max

Box and Whisker Graph
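As a rough illustration of these measures, the sketch below uses Python's standard `statistics` module on a hypothetical data set with one outlier. The helper `five_number_summary` is an assumption of this example: it takes Q1 and Q3 as the medians of the lower and upper halves (excluding the middle value when the count is odd), which is one common textbook convention. Other quartile conventions give slightly different values.

```python
import statistics

data = [2, 3, 3, 4, 5, 6, 7, 8, 40]  # hypothetical data; 40 is an outlier

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    s = sorted(values)
    n = len(s)
    half = n // 2
    lower = s[:half]
    # For an odd count, leave the middle value out of both halves.
    upper = s[half + 1:] if n % 2 else s[half:]
    return (s[0], statistics.median(lower), statistics.median(s),
            statistics.median(upper), s[-1])

print(f"mean={mean:.2f} median={median} mode={mode}")
print("five-number summary:", five_number_summary(data))

# Dropping the outlier moves the mean a lot and the median only a little.
trimmed = data[:-1]
print(f"without outlier: mean={statistics.mean(trimmed):.2f} "
      f"median={statistics.median(trimmed)}")
```

The five values returned are exactly the ones a box and whisker graph draws: the whiskers end at Min and Max, the box spans Q1 to Q3, and the line inside the box marks the median.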


Standard Deviation

A measure based on the distance of each data value from the mean.

SAMPLE vs POPULATION

FORMULAS
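The formulas themselves did not survive on this slide. The sketch below uses the standard definitions, which may differ in notation from the textbook's: the population standard deviation divides the sum of squared distances from the mean by n, and the sample standard deviation divides by n - 1. The data and function names are hypothetical.

```python
import math
import statistics

def population_sd(values):
    """sqrt( sum((x - mean)^2) / n ): the data are the whole population."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)

def sample_sd(values):
    """sqrt( sum((x - mean)^2) / (n - 1) ): the data are a sample."""
    n = len(values)
    xbar = sum(values) / n
    return math.sqrt(sum((x - xbar) ** 2 for x in values) / (n - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5

print(f"population SD: {population_sd(data):.4f}")
print(f"sample SD:     {sample_sd(data):.4f}")

# The standard library provides the same two measures.
print(f"statistics.pstdev: {statistics.pstdev(data):.4f}")
print(f"statistics.stdev:  {statistics.stdev(data):.4f}")
```

For the same data the sample value is always slightly larger than the population value, because the divisor is smaller.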


Coefficient of Variation

Compares the standard deviation to the mean

Expressed as a percentage

CV = [(standard deviation) / mean] * 100%

A larger CV means that there is more variation, relatively speaking, in the data.
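A short sketch of the CV formula, applied to two hypothetical data sets measured in different units. Using the sample standard deviation here is an assumption; the slide does not say which version to use.

```python
import statistics

def coefficient_of_variation(values):
    """CV = (standard deviation / mean) * 100, as a percentage.

    Assumption: the sample standard deviation is used.
    """
    return statistics.stdev(values) / statistics.mean(values) * 100

heights_cm = [160, 165, 170, 175, 180]  # hypothetical
weights_kg = [50, 60, 70, 80, 90]       # hypothetical

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print(f"CV of heights: {cv_heights:.2f}%")
print(f"CV of weights: {cv_weights:.2f}%")
```

Because the CV is a ratio, it has no units, so the two results can be compared directly even though one data set is in centimeters and the other in kilograms. In this example the weights vary more, relatively speaking, than the heights.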


Normal Distribution

The most common distribution in statistics.

Many real-life data sets follow the normal distribution pattern.

The normal curve is bell-shaped.

The highest point on the curve is at the mean of the distribution; mean = median = mode.

The curve is symmetric with respect to its mean.


More – Normal Distribution

The area under the curve equals ONE.

50% of the area lies on each side of the mean.

The curve has two POINTS OF INFLECTION – a POI is a point on the curve where the curve changes from being curved upward to being curved downward (or vice versa).

POI are located one standard deviation on each side of the mean.
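These properties can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library with a hypothetical mean of 100 and standard deviation of 15. The helper `second_difference` is an assumption of this example: its sign approximates the sign of the curve's second derivative, which is negative where the curve bends downward and positive where it bends upward.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)  # hypothetical mean and standard deviation

def second_difference(x, h=0.01):
    """Sign matches the curvature of the density at x:
    negative = curved downward, positive = curved upward."""
    return dist.pdf(x + h) - 2 * dist.pdf(x) + dist.pdf(x - h)

# Half of the area lies on each side of the mean.
print(f"area to the left of the mean: {dist.cdf(100):.2f}")

# Curvature changes sign at mean - sd = 85 and mean + sd = 115.
for x in (80, 90, 100, 110, 120):
    shape = "downward" if second_difference(x) < 0 else "upward"
    print(f"x = {x}: curved {shape}")
```

Points between 85 and 115 report downward curvature and points outside that interval report upward curvature, which matches the statement that the points of inflection lie one standard deviation on each side of the mean.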


Percentages – Roughly,

68% of the data values are within one standard deviation from the mean.

95% of the data values are within two standard deviations from the mean.

99.7% of the data values are within three standard deviations from the mean.

Note: We can use the 68-95-99.7 rule to estimate how many values are expected to fall within one, two or three standard deviations of the mean of a normal distribution. Generally, we assume that a normal distribution describes an entire population rather than a sample.
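The rounded percentages in the rule can be compared with the values computed from the standard normal curve. This is a minimal sketch using `statistics.NormalDist`; the population size of 1000 and the helper name `share_within` are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def share_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

population_size = 1000  # hypothetical
for k in (1, 2, 3):
    share = share_within(k)
    expected = share * population_size
    print(f"within {k} SD: {share:.4f} (about {expected:.0f} of {population_size})")
```

The computed shares are approximately 0.6827, 0.9545 and 0.9973, so the rule's 68%, 95% and 99.7% are rounded values, which is why the slide says "roughly".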


Standard Normal Distribution

CHART/TABLE on page 823

This gives the area under the curve between the mean and a number called a z-score.

A z-score represents the number of standard deviations a data value is from the mean.

Formula for converting Raw Scores to z-scores – page 826

The table only gives positive z-scores, but by symmetry we can find areas corresponding to negative z-scores.
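The slide points to the textbook for the conversion formula without reproducing it. The standard definition is z = (raw score - mean) / standard deviation, and the sketch below assumes that is the formula meant. The function names and the example mean and standard deviation are hypothetical, and `statistics.NormalDist` stands in for the printed table.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def area_mean_to_z(z):
    """Area under the standard normal curve between the mean and z.

    By symmetry, the area for a negative z equals the area for |z|.
    """
    return NormalDist().cdf(abs(z)) - 0.5

mean, sd = 100, 15  # hypothetical population values

z_high = z_score(130, mean, sd)  # 2.0
z_low = z_score(85, mean, sd)    # -1.0

print(f"z = {z_high}: area between mean and z = {area_mean_to_z(z_high):.4f}")
print(f"z = {z_low}: area between mean and z = {area_mean_to_z(z_low):.4f}")
```

A table that lists only positive z-scores is enough for the second case: the area for z = -1.0 is read from the row for z = 1.0.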