Shavelson – Descriptive Statistics

Variability

Range

Variance

SD

Shavelson Chapter 5

S5-1. Define, and be able to create and recognize, graphic representations of a normal distribution (115-121).

Normal distribution: Provides a good model of relative frequency distribution found in behavioral research.

S5-2. Know the four properties of the normal distribution (120-121).

1. Unimodal: the greater the distance a score lies from the mean, the lower the frequency of that score.
2. Symmetrical.
3. Mean, mode, and median are all the same.
4. Asymptotic: the curve never touches the abscissa.

Note that the mean and variance can differ, thus “a family of normal distributions”.

S5-3. You should know what is meant by the phrase “a family of normal distributions” (121,3). I will also cover in class the general issues of “distributions” which are frequently used in statistical analyses.

From:http://www.gifted.uconn.edu/siegle/research/Normal/instructornotes.html

1. These distributions have the same mode, different median and SD

2. These distributions have different modes, same median, different SD

3. These distributions have different means, modes and variances

4. These distributions have the same mode, mean and median, but different SDs

1. These distributions have the same mean, but different SDs

2. These distributions have different means and medians, but the same modes and SDs

3. These distributions have different means, modes, and

4. Nothing is the same with these two!

S5-4. Know the areas under the curve of a normal distribution (roughly, e.g. 34.13%, 13.59%, 2.14% and .13% on either side of the mean).

From:http://www.gifted.uconn.edu/siegle/research/Normal/instructornotes.html

S5-5a. What is a standard score (z-score) (123,3)? Be able to calculate the z-score, given a raw score, mean, and standard deviation.

$$ z = \frac{X - \text{Mean}}{S} $$

X = raw score

Mean = mean of distribution

S = standard deviation

Notice that to calculate the Z score you need the mean and S of a distribution of scores.
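The formula above can be sketched in a few lines of Python. This is only an illustration: the function name `z_score` and the numbers used (raw score 65, mean 50, standard deviation 10) are made up for the example and do not come from the textbook.

```python
def z_score(raw_score: float, mean: float, sd: float) -> float:
    """Return how many standard deviations raw_score lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw_score - mean) / sd


# Hypothetical values, chosen only to illustrate the calculation.
above = z_score(65, 50, 10)   # raw score above the mean -> positive z
below = z_score(40, 50, 10)   # raw score below the mean -> negative z
print(above, below)
```

The sign of the result shows the direction from the mean, and the size shows the distance in standard deviation units.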

S5-5b. What two bits of information does the z-score provide us (125, 1-2)?

A z score provides the following information:
1. The size of the z score indicates the number of standard deviations the raw score lies from the mean.
2. The sign (+ or -) indicates whether the raw score is above the mean (+) or below the mean (-).

A z score of -1.8 means…

1. The mean of the distribution is 1.8

2. The distribution is skewed

3. The raw score lies 1.8 means above the mean

4. The raw score lies 1.8 standard deviations below the mean

5. The raw score lies 1.8 standard deviations above the mean

Mean = 10, X= 18, S = 4, what is the Z score?

1. -2

2. 4

3. -4

4. 2

5. Not listed

S5-6. Know what a Standard(ized) distribution is.

Convert all raw scores of a distribution into Z scores, and put them into a frequency distribution.

• Mean = 0
• Std. Dev. and Variance = 1

[Figure: standardized distribution, horizontal axis marked in z units: -2, -1, 0, +1, +2]
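As a rough check of the two properties of a standardized distribution, the sketch below converts a small set of raw scores to z scores. The scores are invented for illustration, and the standard deviation is computed with `statistics.pstdev` (dividing by N); that choice is an assumption, and the same result holds if the N - 1 version is used consistently in both places.

```python
from statistics import mean, pstdev

raw_scores = [4, 8, 6, 5, 3, 10]   # made-up raw scores

m = mean(raw_scores)               # mean of the raw scores
s = pstdev(raw_scores)             # standard deviation of the raw scores

# Convert every raw score into a z score.
z_scores = [(x - m) / s for x in raw_scores]

# The standardized scores have mean 0 and standard deviation 1
# (up to floating-point rounding).
print(round(mean(z_scores), 10), round(pstdev(z_scores), 10))
```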

S5-8. Know how to calculate the proportion of scores that lie above or below a given raw score.

1. Convert the raw score to a Z score.
2. Do a rough estimate on a standard normal distribution.
3. Look it up in Table B (swap the labels on 3 & 4 if it is a negative Z value).

Mean = 80

S = 5

X = 69
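For checking a hand calculation, the example above can be worked with Python's standard library instead of Table B. This is a sketch under the assumption that the scores are normally distributed; `statistics.NormalDist` stands in for the table lookup and is not part of the textbook procedure.

```python
from statistics import NormalDist

mean, sd, x = 80, 5, 69           # values from the example above

z = (x - mean) / sd               # step 1: convert the raw score to a z score
below = NormalDist().cdf(z)       # proportion of scores below X
above = 1 - below                 # proportion of scores above X

print(f"z = {z:.2f}, below = {below:.4f}, above = {above:.4f}")
```

Because z is negative here, the small area lies below the raw score and the large area lies above it, which is the situation the "swap labels" note refers to.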

Let's do a few more!

Mean = 18, X = 5, S = 7.1. What percentile was the person who scored X in?

1. 1.83

2. 96.64

3. 3.44

4. 46.64

Shavelson Chapter 8

S8-1. Know the definition of a statistic, parameter, and estimator.

Statistic: describes a characteristic of a sample, e.g. the sample mean x-bar, as opposed to the population mean mu (μ)

Parameter: describes a characteristic of a population

Estimator: a statistic that estimates a population parameter

The mean is an example of…..

1. Parameter

2. Statistic

3. Estimator

4. All of the above

S8-2. Know the role of statistics, as well as the difference between inferential and descriptive statistics.

Role of Stats:
• Guidelines for summarizing/describing data
• Method for drawing inferences from sample to population
• Help set effective methodology

Descriptive Stats
• Organize/summarize/depict/describe collections of data

Inferential Stats
• Draw inferences about population from sample

S8-3. Know and be able to recognize and provide examples of the two types of questions asked about a population (Case I and Case II research). (217)

Case I Research: Was a particular sample of observations drawn from a particular (known) population?

Example: all students in the US took the GRE on the same day, and we have the mean of all scores. Look at one state in particular: if its mean is higher, the question is whether those students come from a different population with a higher mean.

Take one sample from the population (get its mean) and compare it to the overall population mean.

What we actually answer: what is the probability that a sample was drawn from a particular (known) population?

S8-5. Know the general approach for conducting Case I and Case II hypothesis testing. That is, you should be able to list and briefly describe the steps your author lists at the end of each section (Case I, 220-221, 4 steps; Case II, 223-224, 5 steps). Be able to describe the various alternative hypotheses (step two of each).

Case II research:

Are the observations from two different samples drawn from the same population?

• (Do observations on two groups of subjects differ from one another?)

• What we actually answer: given that a difference exists between two samples (e.g. between the means), what is the probability that this difference is caused by chance alone? If it is not from chance alone, the samples must be from different populations, e.g. our treatment changed them!

S8-5. Case I research steps:

1. Set your hypotheses
• Ho: µ = specific value
• H1: µ ≠ some specific value, usually the pop mean (two tailed)
• H1: µ > some specific value (one tailed)
• H1: µ < some specific value (one tailed)
2. Randomly select participants for your study
3. Decide to reject the null or not based on the comparison of the sample mean to the population mean

– Reject null means that the difference between the population mean and the sample mean is not likely to have occurred by chance (it was probably due to whatever you were studying!)

– Failure to reject the null means there is a fairly good chance that the difference between the sample mean and the population mean could have occurred simply by chance (not due to whatever you were studying)

Case II research steps:

1. Set your hypotheses
• Ho: µe = µc
• µe = mean of the experimental group
• µc = mean of the control group
• H1: µe ≠ µc (two tailed)
• H1: µe > µc (one tailed)
• H1: µe < µc (one tailed)
2. Randomly select, then randomly assign, participants to experimental and control groups.
3. Perform the experiment: apply the IV and measure the DV.
4. Decide to reject the null hypothesis or not

– Reject the null means that the difference between the experimental and control group is not likely to have occurred by chance (thus it was probably your IV!)

– Failure to reject the null means there is a fairly good chance that the difference between the control group and the experimental group was due to chance and not your IV.

S8-6. Know the two types of statistical errors: Type 1 and Type 2. Be able to provide and recognize examples of each.

Types of errors in statistical inference

Type I: Reject the null when it is true (say there is a treatment effect when there is not)

Type II: Fail to reject the null when it is false (say there is no treatment effect when there was)

                                     The way it really is
Your decision                        Vit A has no effect    Vit A had an effect
Reject null (Vit A had an effect)    Type 1 error           Correct decision
Accept null (Vit A had no effect)    Correct decision       Type 2 error

Shavelson Chapter 9

Probability

Event: any specified outcome
Outcome space: all possible outcomes
• P(e) = the probability of some event
• P(e) = # events / # outcomes in outcome space
• Ex. Dice: Outcome space = {1,2,3,4,5,6} (= six items); E = {2} (= 1 item). Probability of getting a 2 = 1/6 = .17

Probability: what is the probability of getting a two by chance alone?

First Question   Second Question   Third Question   Score on test
Correct          Correct           Correct          3
Correct          Correct           Incorrect        2
Correct          Incorrect         Correct          2
Correct          Incorrect         Incorrect        1
Incorrect        Correct           Correct          2
Incorrect        Correct           Incorrect        1
Incorrect        Incorrect         Correct          1
Incorrect        Incorrect         Incorrect        0
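The outcome space for the three-question test can also be enumerated in code, using P(e) = # events / # outcomes. This sketch assumes that every one of the eight outcomes is equally likely (for example, pure guessing on true/false items); that assumption and the helper name `p_score` are illustrative and not taken from the textbook.

```python
from fractions import Fraction
from itertools import product

# All possible outcomes for three questions: 2 * 2 * 2 = 8 outcomes.
outcomes = list(product(["Correct", "Incorrect"], repeat=3))


def p_score(k: int) -> Fraction:
    """Probability of exactly k correct answers, assuming equally likely outcomes."""
    events = [o for o in outcomes if o.count("Correct") == k]
    return Fraction(len(events), len(outcomes))


for k in range(4):
    print(k, p_score(k))
```

Under that assumption, a test score of 2 can happen in three of the eight outcomes.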

Shavelson Chapter 10

10-1. Two fundamental ideas of conducting case I research:

1. The null hypothesis is assumed to be true.
• (That is, the difference between the sample mean and the population mean is assumed to be due to chance alone.)

2. A sampling distribution is used to determine the probability of obtaining a particular sample mean.
• In this case the sampling distribution is composed of group means.

10-2. What is the central limit theorem?

The Central Limit Theorem is a statement about the characteristics of the sampling distribution of means of random samples from a given population. That is, it describes the characteristics of the distribution of values we would obtain if we were able to draw an infinite number of random samples of a given size from a given population and we calculated the mean of each sample.

The Central Limit Theorem consists of three statements:

[1] The mean of the sampling distribution of means is equal to the mean of the population from which the samples were drawn.

[2] The variance of the sampling distribution of means is equal to the variance of the population from which the samples were drawn divided by the size of the samples (so its standard deviation is the population standard deviation divided by the square root of the sample size).

[3] If the original population is distributed normally (i.e. it is bell shaped), the sampling distribution of means will also be normal. If the original population is not normally distributed, the sampling distribution of means will increasingly approximate a normal distribution as sample size increases (i.e. when increasingly large samples are drawn).

10-3. Know the characteristics of a sampling distribution of means.

Characteristics of the sampling distribution of means

1. Normally distributed (even if the pop. is skewed, if N = 30 or more)

2. Mean of the sampling distribution = population mean

3. Standard dev (standard error of the mean) = Pop S.D. / √N


10-4. Know what happens to the SEM as sample size increases.

SEM decreases as N increases

SEM = Pop S.D. / √N

$$ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}} $$
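To see the SEM shrink as N grows, the formula can be evaluated for a few sample sizes. The population standard deviation of 20 and the sample sizes below are hypothetical values picked so the arithmetic is easy to follow.

```python
from math import sqrt


def sem(pop_sd: float, n: int) -> float:
    """Standard error of the mean: population SD divided by the square root of N."""
    return pop_sd / sqrt(n)


pop_sd = 20  # hypothetical population standard deviation
for n in (4, 16, 64, 100):
    print(n, sem(pop_sd, n))
```

Note that quadrupling the sample size only halves the SEM, because N enters through its square root.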

10-5. Know how one could create a sampling distribution of means.

Sampling distribution of means: a distribution composed of sample means

How to construct one:

1. Draw a sample of size N from the population

2. Find the mean of the sample

3. Repeat this many times (all samples of size N)

4. Create a frequency distribution of the means (actually, convert it to relative frequencies = proportions!)
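The four steps above can be simulated. The sketch below is only an illustration: the skewed population (exponential, mean about 50), the sample size of 30, the 2,000 repetitions, and the random seed are all arbitrary assumptions, not values from the textbook.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the run is repeatable

# A made-up, positively skewed population of 20,000 scores.
population = [random.expovariate(1 / 50) for _ in range(20_000)]

n = 30  # size of every sample
# Steps 1-3: draw many samples of size n and record each sample mean.
sample_means = [mean(random.sample(population, n)) for _ in range(2_000)]

# Step 4 would tabulate sample_means as relative frequencies; here we
# only compare its mean and SD with what the central limit theorem says.
print("population mean      :", round(mean(population), 2))
print("mean of sample means :", round(mean(sample_means), 2))
print("pop SD / sqrt(n)     :", round(pstdev(population) / n ** 0.5, 2))
print("SD of sample means   :", round(pstdev(sample_means), 2))
```

The mean of the sample means should land close to the population mean, and their standard deviation close to the population SD divided by √N.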

10-5. What is the function of a sampling distribution of means?

Used as a probability distribution to determine the likelihood of obtaining a particular sample mean, given that the null hypothesis is true.

null hypothesis is true = same thing as “by chance alone”

S10-6. As your author does, be able to calculate the probability of obtaining a particular sample mean, given the appropriate data (e.g. the mean of the sampling distribution and the standard error). If I ask for this on the test I will either supply Table B or will have the Zx fall on a whole value (e.g. 1, 2, or 3). You should thus review the probabilities under the normal curve, as you will be expected to be able to apply this information (260-262).

μ = 100 (mean of the population and of the sampling distribution)

σx = 25 (standard error of the mean)

X = 175 (mean of the sample we used in our study)

What is the probability of obtaining a sample mean of 175 by chance alone (i.e. when the null is true: Ho: μ = 100)?

$$ Z_{\bar{X}} = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{175 - 100}{25} $$

Use Table B if needed!
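The same example can be checked in Python. This is a sketch that assumes the sampling distribution of means is normal and uses `statistics.NormalDist` in place of Table B; the variable names are illustrative.

```python
from statistics import NormalDist

mu = 100        # mean of the population and of the sampling distribution
sem = 25        # standard error of the mean
sample_mean = 175

# Z for the sample mean: distance from mu in standard-error units.
z = (sample_mean - mu) / sem

# Probability of a sample mean this high or higher by chance alone.
p_upper = 1 - NormalDist().cdf(z)

print(f"Z = {z:.1f}, p = {p_upper:.4f}")
```

The resulting probability agrees with the roughly .13% area in the extreme tail of the normal curve listed under S5-4.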

S10-7. What is meant by the terms "unlikely" and "likely"? You should be able to answer this in terms of accepting or rejecting the null hypothesis, or in terms of what is meant by "significance level" (263-264).

Level of significance = what we consider to be “unlikely”

Generally set at a 5% or 1% chance of obtaining a sample mean by chance alone

Alpha = .05 or alpha = .01

Thus: decisions to reject the null are based on your alpha level. Reject the null if the probability of obtaining your sample mean is equal to, or less than, your alpha level.

You get all the scores of the folks in CA who took the GRE and find that their average score is 675 (for verbal). The overall (entire population) mean is 500 and the SEM is 100. Is the California mean statistically significant (i.e. is the difference from the pop mean significant)? Alpha = .05

1. Yes

2. No

3. Huh?

S10-7

Decisions to reject the null are based on your alpha level

“Reject the null hypothesis if the probability of obtaining a sample mean is less than or equal to .05 (.01); otherwise, don’t reject the null hypothesis”

10-8. Calculating Zx (critical)

(The Zx score at which we say it is “unlikely” to obtain this value by chance alone)

At the α = .05 level of significance, Zx (critical) = 1.65 (from Table B)

At the α = .01 level of significance, Zx (critical) = 2.33 (from Table B)

Example:
μ = 42
σx = 8
X = 30

Reject the Ho or not at the .05 level of significance? Translate the alpha level into a z-score.

10-8. Calculating Zx (critical)

Two ways to reject the null: find the probability of obtaining the Z score (obtained), or find the Z score that lies at the alpha level (critical). Then:

• Either compare the probability of getting the Zobtained (e.g. .03) to the alpha level (e.g. .05). In this case the probability is less than alpha, so you would reject the null: we show statistical significance.

• Or compare the Zobtained to the Zcritical, in this case 1.88 (obtained) and 1.65 (critical). Since the Zobtained is greater than the Zcritical, we reject the null: we show statistical significance.
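Both routes can be sketched side by side to show that they lead to the same decision. The obtained Z of 1.88 and alpha of .05 come from the slide; treating the test as one-tailed (upper tail) and using `statistics.NormalDist` instead of Table B are assumptions of this sketch.

```python
from statistics import NormalDist

alpha = 0.05        # significance level from the slide
z_obtained = 1.88   # obtained Z from the slide

std_normal = NormalDist()  # standard normal: mean 0, SD 1

# Route 1: probability of a Z this large or larger, compared with alpha.
p_obtained = 1 - std_normal.cdf(z_obtained)
reject_by_probability = p_obtained <= alpha

# Route 2: critical Z that cuts off the upper alpha of the curve,
# compared with the obtained Z.
z_critical = std_normal.inv_cdf(1 - alpha)
reject_by_critical_value = z_obtained >= z_critical

print(round(p_obtained, 4), round(z_critical, 3))
print(reject_by_probability, reject_by_critical_value)
```

The exact critical value is about 1.645, which tables round to 1.65.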


10-9. Know the difference between directional and non-directional tests, and when to use each!

1. A one-tailed test may be supported by previous research or theory
2. When in doubt, choose two tailed!

Tails are specified by the alternative hypotheses.
Ho: xbar = μ
H1: xbar ≠ μ (2 tailed: both)
Or H1: xbar < μ (1 tailed: left)
Or H1: xbar > μ (1 tailed: right)
It is easier to show statistical significance with a 1-tailed test.

Directional vs. non-directional tests
Directional uses only one tail of the sampling distribution
Non-directional uses both tails

Thus:
If alpha = .05 and one tail, all .05 is in one tail (Z critical = 1.65, or -1.65)
If alpha = .05 and two tail, .025 is in one tail (1.96) and .025 is in the other tail (-1.96)
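The one-tailed and two-tailed critical values can be reproduced with the inverse of the standard normal curve. This is a sketch: `statistics.NormalDist` stands in for Table B, and the obtained Z of 1.80 is a hypothetical value chosen to show why a one-tailed test reaches significance more easily.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()

# Directional test: all of alpha sits in one tail.
z_crit_one_tailed = std_normal.inv_cdf(1 - alpha)

# Non-directional test: alpha is split, alpha / 2 in each tail.
z_crit_two_tailed = std_normal.inv_cdf(1 - alpha / 2)

z_obtained = 1.80  # hypothetical obtained Z, in the predicted direction

print(round(z_crit_one_tailed, 3), round(z_crit_two_tailed, 3))
print("one-tailed reject:", z_obtained >= z_crit_one_tailed)
print("two-tailed reject:", abs(z_obtained) >= z_crit_two_tailed)
```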

• If conducting Case II research, how could you determine the probability of getting a particular difference between 2 means (which is what we are looking at for Case II)?

The sampling distribution of differences between means gives the probability of obtaining a particular difference between means. (Case II)

Theoretically you could make a sampling distribution of differences between means, then find a z-score, compare it to the alpha level, and accept or reject the null hypothesis.

Case II
xbar1 - xbar2 = 2
xbar1 - xbar2 = 3
xbar1 - xbar2 = -1
• Graph the frequency of each difference
• Make a freq distribution of the differences between sample means
• Can also calculate the SD and determine the likelihood of obtaining a difference between means by chance alone


• Characteristics of the sampling distribution of differences between means

1. Normally distributed
2. Mean = 0
3. Standard deviation (called the standard error of the difference between means) is equal to:

$$ \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\sigma_{\bar{X}_1}^2 + \sigma_{\bar{X}_2}^2} $$

• Note: variance = sigma squared


Calculate a Z score for the difference between means

$$ Z_{\bar{X}_1 - \bar{X}_2} = \frac{\bar{X}_e - \bar{X}_c}{\sigma_{\bar{X}_1 - \bar{X}_2}} $$

Example:

Xe = 24

Xc = 30

σx1-x2 = 2.8

H1: Xe ≠ Xc

Z crit?
Z obs?
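One way to check your own answer to this example is the sketch below. The means and the standard error come from the example; alpha = .05 is an assumption, since the example does not state it, and the two-tailed critical value follows from the non-directional H1.

```python
from statistics import NormalDist

mean_e = 24      # experimental group mean
mean_c = 30      # control group mean
se_diff = 2.8    # standard error of the difference between means
alpha = 0.05     # assumed; not given in the example

# Observed Z for the difference between the two means.
z_observed = (mean_e - mean_c) / se_diff

# Two-tailed critical Z, because H1 is non-directional.
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

reject_null = abs(z_observed) >= z_critical
print(round(z_observed, 2), round(z_critical, 2), reject_null)
```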


Shavelson Chapter 11

S11-1. Know the definition and recognize/generate examples of the two types of errors (Type I and Type II) (also see table 11-1). This is similar to what we did last unit. How does one adjust the probability of making a type I error? (313).

                                     The way it really is
Your decision                        Vit A has no effect    Vit A had an effect
Reject null (Vit A had an effect)    Type 1 error           Correct decision
Accept null (Vit A had no effect)    Correct decision       Type 2 error

S11-2. Know the definition of "power" and how it is calculated. (314)

Power = 1 - Beta

The probability of correctly rejecting a false null hypothesis. OR: Power is the probability of detecting a true treatment effect.

(What researchers are really interested in! Detecting a true difference if it exists.)

Power = .27 (27%)…very low. Want higher power, i.e. a higher number.

Shavelson Chapter 12

S12-1. What is the purpose of a t test in general (334,3)? Also, how is a t test used for Case I research (that is, what question does it answer)? (334,3). As in previous chapters, the function of the t test is to determine the probability of observing a particular sample mean, given that the null hypothesis is true. You should know this point. You should also know how the standard deviation is estimated for the population when using the t distribution (334).

The t test is…

A. Used to determine the probability that a sample was drawn from a hypothesized population (given a true Ho)

B. Used when the population standard deviation is not known

C. Based on an estimated standard error of the mean (SEM), calculated from the sample:

How would one go about doing this?

$$ s_{\bar{X}} = \frac{s}{\sqrt{N}} $$

where s = standard deviation of the sample and N = sample size.

S12-2. You should be able to describe the t distribution and what it is used for (determining the probability of obtaining a particular sample mean) (335-336). Know the important differences between the t distribution and the normal distribution (335,5-335,7) (there are three points made).

A. t(observed):

$$ t_{\text{observed}} = \frac{\bar{X} - \mu}{s_{\bar{X}}} $$

(= the number of standard errors that the sample mean lies from the hypothesized population mean)

The t distribution is created from numerous same-sized samples from the population, just like a sampling distribution!

The t(observed) can be compared to the t distribution to determine the probability of obtaining that particular sample mean (given the Ho is true).
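A minimal sketch of the t(observed) calculation follows. The ten sample scores and the hypothesized population mean of 50 are invented for illustration, and the sample standard deviation is computed with `statistics.stdev` (N - 1 in the denominator), which is an assumption about how s is defined.

```python
from math import sqrt
from statistics import mean, stdev

sample = [52, 48, 55, 60, 45, 50, 58, 56, 47, 49]  # made-up scores
mu = 50                                            # hypothesized population mean

n = len(sample)
s = stdev(sample)              # sample standard deviation (N - 1 denominator)
sem_estimate = s / sqrt(n)     # estimated standard error of the mean

# t(observed): distance of the sample mean from mu in standard-error units.
t_observed = (mean(sample) - mu) / sem_estimate

print(round(sem_estimate, 4), round(t_observed, 4))
```

The t(observed) would then be compared with the critical t from a t table for N - 1 degrees of freedom; that lookup is left out because Python's standard library has no t distribution.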

T-distribution vs. Normal Distribution:

1. t has a different distribution for every sample size (N)

2. More values lie in the tails of t; thus critical values for t are higher than for Z

3. As sample size increases, t becomes closer and closer to the normal distribution.

