Statistics Lecture 3 Notes

  • 7/29/2019 Statistics Lecture 3 Notes

    Recap of Last Class

    MATH 1050Y-A (FW 2012-13)

    1. There is a lower infant mortality rate for at-home births assisted by midwives than for hospital births attended by physicians. Does this mean that concerned women would be wise to opt for home deliveries by midwives?

    A. Yes

    B. No

    Why or why not? Discuss with your neighbour.

    2. A student who sends out a campus-wide mailing about forming a stamp-collecting club receives 24 responses, 8 of which are favourable. "Based on my sample of size n = 24," he states, "I estimate that 1/3 of the student body favours forming such a club and hereby apply for funding." What is wrong with his logic?

    Discuss with your neighbour

    Chapter 1: Introduction to Statistics

    1-1 Overview

    1-2 The Nature of Data

    1-3 Uses and Abuses of Statistics

    1-4 Design of Experiments

    Collecting Data

    We typically obtain our data from two common sources: observational studies and experiments.

    In an observational study, we observe and measure specific characteristics but we don't attempt to manipulate or modify the subjects being studied.

    Example: Surveys.

    In an experiment, we apply some treatment and then proceed to observe its effects on the subjects.

    Example: Clinical drug trials.

    Design of Experiments

    When designing an experiment, one should follow the steps below.

    1. Identify your objective. What question are you trying to answer? What is the population of interest?

    2. Collect sample data. The sample must be large enough to detect the treatment effect, representative of the population under consideration, and unaffected by extraneous factors.

    3. Use a random procedure to avoid bias.

    4. Analyze the data and form conclusions.

    Example: Hudson Auto Repair

    The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups and records the costs of parts, rounded to the nearest dollar.

    Process of Statistical Inference

    1. The population consists of all tune-ups. The average cost of parts is unknown.

    2. A sample of 50 engine tune-ups is examined.

    3. The sample data provide a sample average parts cost of $79 per tune-up.

    4. The sample average is used to estimate the population average.
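
The four steps of the inference process can be sketched in Python. The actual invoice amounts are not given in these notes, so the population here is synthetic and hypothetical (whole-dollar costs generated around $79 purely for illustration). Only the sample size of 50 comes from the example.

```python
import random
import statistics

rng = random.Random(1050)  # fixed seed so the sketch is reproducible

# Step 1: hypothetical population of parts costs for all tune-ups
# (synthetic data, not the real Hudson Auto invoices).
population = [round(rng.gauss(79, 15)) for _ in range(5000)]

# Step 2: examine a sample of 50 tune-ups.
sample = rng.sample(population, 50)

# Step 3: compute the sample average parts cost.
sample_mean = statistics.mean(sample)

# Step 4: use the sample average as an estimate of the population average.
population_mean = statistics.mean(population)
print(f"sample average:     ${sample_mean:.2f}")
print(f"population average: ${population_mean:.2f}  (normally unknown)")
```

In a real study only the sample average would be available; the population average is printed here only because the population was simulated.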

    Treatment and Control Groups

    Interference from variable factors that are not relevant to the issue being studied can be controlled through good experimental design.

    Careful experiments often involve a treatment group that is given a particular treatment and a control group that is not given the treatment (or is given a fake treatment).

    Example: The 1954 Salk vaccine trials involved a treatment group of children who were injected with the vaccine and a control group of children who were injected with a placebo.

    The Placebo Effect

    A placebo effect occurs when a subject who has received a placebo incorrectly believes that he/she is receiving a treatment and reports an improvement in symptoms.

    The placebo effect can be countered by using blinding, a technique in which the subject doesn't know whether he/she is receiving a treatment or a placebo.

    To avoid biased interpretations of the results, experimenters sometimes use a double-blind experiment wherein neither the subjects nor the experimenters are aware of which subjects receive the treatment or placebo.

    Design of Experiment

    When designing an experiment to test the effectiveness of one or more treatments, subjects should be assigned to the different treatments in such a way that those groups are very similar.

    One simple yet effective approach is known as a completely randomized design. In this approach, subjects are assigned to the treatments at random.

    Another approach is a rigorously controlled design wherein subjects are chosen carefully so that the different groups are arranged to be similar.
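
A completely randomized design can be sketched as a shuffle followed by dealing subjects out to the treatments. This is a minimal illustration, not a prescribed procedure: the function name, subject IDs and treatment labels are all hypothetical, and equal group sizes are an assumption of this sketch.

```python
import random

def completely_randomized_design(subjects, treatments, seed=None):
    """Assign subjects to treatments at random, in groups as equal in size as possible."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # chance alone decides the order
    groups = {t: [] for t in treatments}
    # Deal the shuffled subjects out to the treatments in turn.
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

# Hypothetical subject IDs and treatment labels.
subjects = [f"S{i:02d}" for i in range(1, 21)]
groups = completely_randomized_design(subjects, ["treatment", "placebo"], seed=42)
for name, members in groups.items():
    print(name, sorted(members))
```

Because the assignment depends only on the shuffle, no characteristic of a subject (age, health, motivation) can influence which group he/she lands in.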

    Confounding

    The results of experiments are sometimes ruined by confounding.

    Confounding occurs when the effects from two or more variables cannot be distinguished from each other.

    Example: In a study of the effects of education on salary, a group of older subjects without university degrees might be compared to a group of younger subjects with university degrees. If a difference in salary is observed, it would be difficult to tell if the difference is due to education or age.

    Replication

    Replication is the repetition of an experiment on more than one subject.

    Samples should be large enough so that the erratic behaviour that is characteristic of very small samples will not disguise the true effects of different treatments.
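
The erratic behaviour of very small samples can be seen in a short simulation. This is a sketch under stated assumptions: the population is synthetic, and the sample sizes (5 and 500) and the number of repeats are arbitrary choices for illustration.

```python
import random
import statistics

rng = random.Random(7)
# Hypothetical population of responses (synthetic values for illustration only).
population = [rng.gauss(100, 20) for _ in range(10_000)]

def spread_of_sample_means(n, repeats=200):
    """Standard deviation of the sample mean across repeated samples of size n."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

small = spread_of_sample_means(5)
large = spread_of_sample_means(500)
print(f"spread of sample means, n=5:   {small:.2f}")
print(f"spread of sample means, n=500: {large:.2f}")
```

The sample means from the small samples vary far more from one sample to the next, which is the behaviour that can hide a real treatment effect.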

    Randomization

    Randomization is used when subjects are assigned to different groups through a process of random selection.

    The idea is to use chance as a way to create two groups that are similar.

    Randomization plays a crucial role in how the samples are chosen. We will now explore several such sampling methods.

    REMEMBER: Data carelessly collected may be so completely useless that no amount of statistical torturing can salvage them.

    Random Sample

    In a random sample, members of the population are selected in such a way that each individual has an equal chance of being selected.

    Simple Random Sample

    A simple random sample of n subjects is selected in such a way that every possible sample of size n has the same chance of being chosen.

    Example: When pollsters and telemarketers generate a phone number at random so that all phone numbers are possible, they are using random sampling. If they generate 100 such phone numbers, they have obtained a simple random sample of size 100.

    Computers can be used to generate random numbers for selecting random samples.

    Example: The Excel function

    RANDBETWEEN(1,X)

    can be used to generate random integers between 1 and X (inclusive).
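
The same idea can be sketched in Python with the standard `random` module. The population size `X = 500` and sample size `n = 10` are hypothetical values chosen for illustration.

```python
import random

rng = random.Random(2012)  # fixed seed so the sketch is reproducible

X = 500  # hypothetical population size; members are numbered 1..X

# Counterpart of RANDBETWEEN(1, X): one random integer from 1 to X, both ends included.
one_number = rng.randint(1, X)

# A simple random sample of n distinct members: every possible set of n
# numbers from 1..X has the same chance of being chosen.
n = 10
srs = rng.sample(range(1, X + 1), n)
print(one_number, sorted(srs))
```

Repeated calls to `RANDBETWEEN` (or `randint`) can return the same number twice, whereas `sample` draws without replacement, so no member is selected more than once.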

    Stratified Sampling

    With stratified sampling, we subdivide the population into at least two different subpopulations (or strata) whose members share the same characteristic (such as gender), then we draw a simple random sample from each stratum.

    Stratified sampling can help reduce the variation in results that a simple random sample might produce.

    Example: In surveying views on pay equity, we might use gender as a basis for creating two strata and use proportionate sampling. A simple random sample might, by chance, over-represent males or females.
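
Proportionate stratified sampling can be sketched as follows. The workforce of 600 women and 400 men, the 5% sampling fraction and the function name are hypothetical and serve only to illustrate the idea.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, fraction, seed=None):
    """Draw a simple random sample of the same fraction from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)  # proportionate allocation
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical workforce: 600 women and 400 men, stored as (id, gender) pairs.
population = [(i, "F") for i in range(600)] + [(i, "M") for i in range(600, 1000)]
sample = stratified_sample(population, stratum_of=lambda p: p[1], fraction=0.05, seed=3)
counts = {g: sum(1 for _, s in sample if s == g) for g in ("F", "M")}
print(counts)  # 30 women and 20 men, matching the 60/40 split of the population
```

Whatever the random draws turn out to be, the sample always reproduces the gender split of the population, which a simple random sample of 50 would not guarantee.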

    Systematic Sampling

    In systematic sampling, we select some starting point and then select every kth element in the population.

    Example: When selecting a sample of objects for inspection on an assembly line, you might select every 100th object.
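
A minimal sketch of systematic sampling, assuming the starting point is chosen at random among the first k objects (the notes only say "some starting point"). The assembly line of 1,000 numbered objects is hypothetical.

```python
import random

def systematic_sample(items, k, seed=None):
    """Pick a random starting point among the first k items, then take every k-th item."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # index 0..k-1 (assumed random start)
    return items[start::k]

# Hypothetical assembly line of 1,000 numbered objects; inspect every 100th.
line = list(range(1, 1001))
inspected = systematic_sample(line, k=100, seed=5)
print(inspected)
```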

    Cluster Sampling

    In cluster sampling, we first divide the population area into sections (or clusters), then randomly select a few of those clusters, and then choose all the members from the selected clusters.
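
Cluster sampling can be sketched in the same way. The population of 8 city blocks with 5 households each is hypothetical, as are the names used in the code.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose n_clusters clusters and keep every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [member for name in chosen for member in clusters[name]]
    return chosen, sample

# Hypothetical population: 8 city blocks (clusters) with 5 households each.
clusters = {f"block{b}": [f"block{b}-house{h}" for h in range(1, 6)] for b in range(1, 9)}
chosen, sample = cluster_sample(clusters, n_clusters=3, seed=11)
print(chosen)
print(len(sample))  # 15: all 5 households from each of the 3 chosen blocks
```

Randomness enters only in the choice of clusters; inside a chosen cluster nobody is left out.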

    Stratified Sampling versus Cluster Sampling

    In stratified sampling, we select some of the objects from every stratum.

    In cluster sampling, we select all of the objects from some of the clusters.

    Convenience Sampling

    In convenience sampling, we simply use results that are readily available.

    Convenience sampling does not employ randomization and can be heavily biased.

    Summary

    Three very important considerations in the design of experiments are as follows:

    1. Use randomization to assign subjects to different groups.

    2. Use replication by repeating the experiment on enough subjects so that effects of treatment or other factors can be clearly seen.

    3. Control the effects of variables by using such techniques as blinding and a completely randomized experimental design.

    Errors

    No matter how well you plan and execute the sample collection process, there is likely to be some error in the results.

    Sampling error is the difference between a sample result and the true population result; such an error results from chance sample fluctuations.

    Nonsampling error occurs when the sample data are incorrectly collected, recorded, or analyzed (such as by selecting a nonrandom and biased sample, using a defective instrument, or copying the data incorrectly).
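
Sampling error can be illustrated with a short sketch. The population is synthetic and hypothetical; it is simulated so that the true population mean is known, which is not the case in a real study.

```python
import random
import statistics

rng = random.Random(85)
# Hypothetical population (synthetic values); its true mean is visible here
# only because the population was simulated.
population = [rng.randint(40, 120) for _ in range(2000)]
true_mean = statistics.mean(population)

errors = []
for trial in range(1, 4):
    sample = rng.sample(population, 50)
    # Sampling error: sample result minus true population result.
    sampling_error = statistics.mean(sample) - true_mean
    errors.append(sampling_error)
    print(f"sample {trial}: sampling error = {sampling_error:+.2f}")
```

Each sample was collected correctly, yet each gives a slightly different error. That is the chance fluctuation the definition of sampling error refers to; a nonsampling error (for example, a miscopied value) would come on top of it.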

    Pause & Practice

    1. Temperatures and amounts of precipitation and snowfall are recorded in different locations across the country. This corresponds to:

    A. An observational study

    B. An experiment

    2. When the population of professors is divided into groups according to their ranks (instructor, assistant professor, senior, etc.) and then several are selected from each group to make up a sample, the sample is called a cluster sample.

    A. True

    B. False

    3. A medical researcher from Acadia University interviews all leukemia patients in each of 20 randomly selected hospitals. This is an example of the following type of sampling:

    A. Random

    B. Stratified

    C. Systematic

    D. Cluster

    E. Convenience

    4. In conducting research for the Ottawa evening news, a reporter for CBC interviews 15 people as they leave Revenue Canada audits. This is an example of the following type of sampling:

    A. Random

    B. Stratified

    C. Systematic

    D. Cluster

    E. Convenience

    Coming up

    Our next step will be organizing and summarizing data.

    Assignment #1 is posted. Due Sept. 18th in seminar.

    For next class: Read sections 2-1 to 2-2.