nszakir
Chapter 3: Producing Data
Introduction
3.1 Design of Experiments
3.2 Sampling Design
3.3 Toward Statistical Inference
3.4 Ethics
3.3 Toward Statistical Inference
Parameters and Statistics
Sampling Variability
Sampling Distribution
Bias and Variability
Sampling from Large Populations
Parameters and Statistics: Using samples to talk about populations
A parameter is a number that describes some characteristic of the population. In statistical practice, the value of a parameter is not known because we cannot examine the entire population.
Name | Symbol | Example
Mean | µ | In a nationwide test, what is the average score?
Proportion | p | What proportion of people choose chocolate as their favorite ice cream flavor?
We answer such questions by studying a sample. A statistic is a number that describes some characteristic of a sample. The value of a statistic can be computed directly from the sample data.
Name | Symbol | Example
Sample mean | x̄ | Sample mean of 100 test scores
Sample proportion | p̂ | Sample proportion of 100 people who choose chocolate as their favorite ice cream flavor
Parameters and Statistics: Examples
Proportion of all students who attended the last home football game: parameter, p
Proportion of registered voters who voted in November: parameter, p
Mean height of a sample of NBA basketball players: statistic, x̄
Mean SAT of entering freshmen: parameter, µ
Proportion of people who prefer Coke over Pepsi in a sample of mall shoppers: statistic, p̂
Mean number of pepperoni slices on a 12-inch pizza from a sample of a certain brand of pepperoni pizzas: statistic, x̄
Statistical Estimation
The process of statistical inference involves using information from a sample to draw conclusions about a wider population.
Your estimate of the population is only as good as your sampling design.
Work hard to eliminate biases.
Your sample is only an estimate—and if you randomly sampled again you would probably get a somewhat different result.
A bigger sample is better: statistics from larger random samples are less variable.
Sampling Variability
Each time we take a random sample from a population, we are likely to get a different set of individuals and calculate a different statistic. This is called sampling variability.
We ask, “What would happen if we took many samples?”
Take a large number of samples from the same population.
Calculate the sample mean/proportion for each sample.
Make a histogram of these values.
Examine the distribution displayed in the histogram for shape, center, and spread, as well as outliers or other deviations.
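The four steps above can be sketched as a small simulation. This is only an illustration under stated assumptions: the population proportion p = 0.60, the sample size 100, and the count of 1000 samples are chosen for the example, each individual is modeled as an independent draw (a stand-in for an SRS from a very large population), and the helper names `sample_proportions` and `print_histogram` are hypothetical.

```python
import math
import random
import statistics
from collections import Counter


def sample_proportions(p, n, num_samples, seed=0):
    """Return the sample proportion from each of num_samples random samples of size n."""
    rng = random.Random(seed)
    results = []
    for _ in range(num_samples):
        # Assumption: each individual independently has the characteristic
        # with probability p (approximates an SRS from a very large population).
        successes = sum(rng.random() < p for _ in range(n))
        results.append(successes / n)
    return results


def print_histogram(values, bin_width=0.02, scale=5):
    """Print a rough text histogram: one '*' per `scale` samples in each bin."""
    # The small epsilon guards against floating-point error at bin edges.
    counts = Counter(math.floor(v / bin_width + 1e-9) for v in values)
    for b in sorted(counts):
        print(f"{b * bin_width:.2f} {'*' * (counts[b] // scale)}")


# Steps 1 and 2: take many samples and compute the statistic for each.
props = sample_proportions(p=0.60, n=100, num_samples=1000)

# Step 3: make a histogram of the values.
print_histogram(props)

# Step 4: examine center and spread.
print("center (mean):", round(statistics.mean(props), 3))
print("spread (std dev):", round(statistics.stdev(props), 3))
```

The printed bars give a rough picture of the shape. The mean and standard deviation summarize the center and spread of the simulated values.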
Sampling Variability (continued)
The sampling distribution of a statistic is the distribution of that statistic for samples of a given size n taken from the same population. The variability of a statistic is described by the spread of its sampling distribution. This spread depends on the sampling design and the sample size n, with larger sample sizes leading to lower variability.
The results of many SRSs have a regular pattern. Here, we draw 1000 SRSs of size 100 from the same population. The population proportion is p = 0.60. The histogram shows the distribution of the 1000 sample proportions.
The distribution of sample proportions for 1000 SRSs of size 2500 drawn from the same population as in the first figure. The two histograms have the same scale. The statistic from the larger sample is less variable.
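The comparison described in the two figures can be approximated with a short simulation. The sketch below assumes independent draws with p = 0.60 in place of a true SRS from a large population, and the function name `spread_of_sample_proportions` is hypothetical. It also prints sqrt(p(1 - p)/n), the standard formula for the standard deviation of a sample proportion, as a reference value.

```python
import math
import random
import statistics


def spread_of_sample_proportions(p, n, num_samples=1000, seed=1):
    """Standard deviation of the sample proportions from num_samples samples of size n."""
    rng = random.Random(seed)
    # Assumption: independent draws stand in for an SRS from a very large population.
    props = [sum(rng.random() < p for _ in range(n)) / n for _ in range(num_samples)]
    return statistics.stdev(props)


p = 0.60
spreads = {n: spread_of_sample_proportions(p, n) for n in (100, 2500)}

for n, simulated in spreads.items():
    theory = math.sqrt(p * (1 - p) / n)
    print(f"n={n}: simulated spread {simulated:.4f}, sqrt(p(1-p)/n) = {theory:.4f}")
```

Because the sample size grows by a factor of 25, the formula predicts the spread shrinks by a factor of 5, and the simulated values should be close to that.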
Bias and Variability
Both bias and variability describe what happens when we take many shots at the target.
Bias concerns the center of the sampling distribution. A statistic used to estimate a parameter is unbiased if the mean of its sampling distribution is equal to the true value of the parameter being estimated.
The variability of a statistic is described by the spread of its sampling distribution. This spread is determined by the sampling design and the sample size n. Statistics from larger probability samples have smaller spreads.
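To make the distinction between bias and variability concrete, the sketch below compares two sampling designs on an invented population. Everything here is a hypothetical setup for illustration: the population of 10,000 with p = 0.60, the biased design in which individuals with the characteristic are twice as likely to be selected, and the function names. The biased design draws with replacement, which is a simplification.

```python
import itertools
import random
import statistics

rng = random.Random(2)

# Hypothetical population: 10,000 individuals, 60% of whom have the characteristic (1).
population = [1] * 6000 + [0] * 4000
true_p = sum(population) / len(population)

# Hypothetical biased design: individuals with the characteristic are twice as
# likely to end up in the sample (for example, because they volunteer more often).
cum_weights = list(itertools.accumulate(2 if x == 1 else 1 for x in population))


def srs_proportion(n):
    """Sample proportion from a simple random sample (without replacement)."""
    return sum(rng.sample(population, n)) / n


def biased_proportion(n):
    """Sample proportion from the weighted design (drawn with replacement)."""
    return sum(rng.choices(population, cum_weights=cum_weights, k=n)) / n


srs = [srs_proportion(100) for _ in range(1000)]
biased = [biased_proportion(100) for _ in range(1000)]

for name, props in (("SRS", srs), ("biased design", biased)):
    print(f"{name}: center {statistics.mean(props):.3f}, "
          f"spread {statistics.stdev(props):.3f}, true p = {true_p:.2f}")
```

Under these assumptions the two designs have similar spreads, but only the SRS is centered near the true proportion. The weighted design is centered well above it, which is what bias means here.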
Managing Bias and Variability
A good sampling scheme must have both small bias and small variability.
To reduce bias, use random sampling. To reduce variability of a statistic from an SRS, use a larger sample.
Population size doesn't matter: The variability of a statistic from a random sample does not depend on the size of the population, as long as the population is at least 100 times larger than the sample.
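The claim about population size can be checked with a rough simulation. The sketch below draws SRSs of size 100 without replacement from two hypothetical populations, one of 10,000 (exactly 100 times the sample size) and one of 1,000,000, each with p = 0.60. The function name and all numeric choices are assumptions made for illustration.

```python
import random
import statistics


def spread_from_finite_population(pop_size, p, n, num_samples=1000, seed=3):
    """Spread of the sample proportion for SRSs of size n from a population of pop_size."""
    rng = random.Random(seed)
    # Individuals are labeled 0 .. pop_size-1; the first num_with_trait have the trait.
    num_with_trait = round(p * pop_size)
    props = []
    for _ in range(num_samples):
        chosen = rng.sample(range(pop_size), n)  # SRS without replacement
        props.append(sum(i < num_with_trait for i in chosen) / n)
    return statistics.stdev(props)


spreads = {
    pop_size: spread_from_finite_population(pop_size, p=0.60, n=100)
    for pop_size in (10_000, 1_000_000)
}

for pop_size, spread in spreads.items():
    print(f"population {pop_size:>9,}: spread of sample proportion = {spread:.4f}")
```

The two spreads should be nearly the same even though one population is 100 times larger than the other. The sample size, not the population size, drives the variability.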
3.4 Ethics
Institutional Review Boards
Informed Consent
Confidentiality
Clinical Trials
Behavioral and Social Science Experiments
Institutional Review Boards
The organization that carries out the study must have an institutional review board that reviews all planned studies in advance in order to protect the subjects from possible harm.
The institutional review board:
reviews the plan of study
can require changes
reviews the consent form
monitors progress at least once a year
Informed Consent
All subjects must give their informed consent before data are collected.
Subjects must be informed in advance about the nature of a study and any risk of harm it might bring.
Subjects must then consent in writing.
Who can’t give informed consent?
prison inmates
very young children
people with mental disorders
Confidentiality
All individual data must be kept confidential. Only statistical summaries may be made public.
Confidentiality is not the same as anonymity. Anonymity means that subjects are anonymous—their names are not known even to the director of the study. Anonymity prevents follow-ups to improve non-response or inform subjects of results.
Any breach of confidentiality is a serious violation of data ethics.
The best practice is to separate the identity of the subjects from the rest of the data immediately!
Clinical Trials
Clinical trials study the effectiveness of medical treatments on actual patients. These treatments can harm as well as heal.
Points for a discussion:
Randomized comparative experiments are the only way to see the true effects of new treatments.
Most benefits of clinical trials go to future patients. We must balance future benefits against present risks.
Behavioral and Social Science Experiments
Many behavioral experiments rely on hiding the true purpose of the study.
Subjects would change their behavior if told in advance what investigators were looking for.
The “Ethical Principles” of the American Psychological Association require consent unless a study only observes behavior in a public space.