educational statistics new lecture1_1stsem2010-2011


  • 8/7/2019 educational statistics new lecture1_1stsem2010-2011


    EDUCATIONAL STATISTICS

    Dr. Joseph Mercado

    Special Lecturer

    PUP GS


    Research Process

    1. Identification of the Research Problem
    2. Formulation of Hypotheses
    3. Identification of Necessary Data
    4. Data Collection
    5. Analysis
    6. Summarizing Results
    7. Drawing Conclusions and Implications
    8. Expanded, Revised, and New Theory
    9. New Knowledge


    Research Problems

    Classroom Leadership, Attitude Towards Teaching Science and other Correlates of Teaching Competence of Science High School Teachers in the NCR

    Principals' Attributes: Their Effects on Teachers' Empowerment and Learners' Achievement in Public Schools in the NCR

    Evaluation of the Quality of Research and Extension Programs of Selected Higher Education Institutions in the CALABARZON Region


    Ethical Responsibility in the Work Behavior of School Managers in Four Selected Universities in the Luzon Area

    An Assessment of the Expanded ROTC Programs and Civic Welfare Service and Its Implications to the National Peace and Development Plan (NPDP)

    Total Quality Management Practices in Selected Higher Education Institutions in Metro Manila

    Principal Empowerment in Public Secondary Schools: Basis for the Development of a Primer

    Prediction Models of School Based Management, Teaching Behavior and Professionalism


    Factors Affecting the Results of the National Achievement Test (NAT) in Selected City Schools in the NCR

    Teachers' Beliefs Regarding the Implementation of Constructivism in the Classroom

    Analysis of Leadership Behavior and Self-Efficacy of Principals of Catholic Secondary Schools

    Study Habits and Academic Achievements of Intermediate Pupils in Guadalupe Nuevo Elementary School in Makati: An Assessment

    Teachers' Behavior, Teachers' Collective Efficacy, Students' Behavior and Their Relationship to Students' Academic Achievement in Selected Public Secondary Schools in Bulacan


    Personal Needs, Job Satisfaction and Work-Related Characteristics as Correlates of Teacher Commitment Among Mathematics Faculty Members of State Universities and Colleges in the NCR

    Teacher's Behavioral Pattern and Teaching Styles: Their Influence on Pupils' Academic Achievement in the District of Tanza

    Leadership Styles of Administrators in Three Selected State Universities in Manila and Their Effect on Faculty Performance


    An Analysis of the Effectiveness of Field Study Courses of the Revised Teacher Education Curriculum Among Selected Higher Education Institutions (HEIs) in the National Capital Region

    An Evaluation of the International Training Program in the Hospitality Education of Selected Higher Education Institutions (HEIs) in Metro Manila

    Personological Attributes Affecting the Management Styles of Educational Managers in Selected Schools in the Province of Laguna

    Relationship Between Some Selected Variables and Conflict Management Styles of the Administrators of the Polytechnic University of the Philippines System

    Quality Assessment of Student Development and Services Program in Local Colleges and Universities in Metro Manila


    Nature of Statistics

    Statistics may be defined as the science and art of collecting, presenting, analyzing and interpreting data in a certain field, such as education, science, psychology, business, economics, engineering, medicine, or any other area.


    Meaning of Statistics

    Statistics can also be used in making correct decisions in times of uncertainty. One may apply the different statistical methods, together with appropriate critical judgment, to arrive at the correct result.


    Branches of Statistics

    Descriptive Statistics: the branch that summarizes and organizes raw data into meaningful information.

    Example: calculating the average score of your students, or summarizing the scores as percentages.


    Inferential Statistics: statistical inference is the process of obtaining information about a larger group from the study of a smaller group.

    The total group of people, things, or characteristics you are interested in studying, understanding, or predicting is called the population.

    A sample is a group of representative items chosen from the population and used to predict the behavior or characteristics of the total population with the help of inferential statistics.


    The Statistical Treatment

    The design of the study determines what statistical techniques should be employed, not vice versa.


    The kind of statistical treatment that can be done in a study is mainly constrained by the level (or scale) of data measurement employed in the study.


    Scales of Data Measurement

    1. Nominal Scale
    2. Ordinal Scale
    3. Interval and Ratio Scale


    Nominal data (the least sophisticated):

    data assumes no natural ordering;

    largely allied to measuring qualitative characteristics such as eye color, hair color, gender, nationality or even lifestyle groups, i.e., singles, young married, retired.


    Nominal data: Example

    Eye Color    Number of Men      %
    Blue                60         30
    Brown               80         40
    Green               30         15
    Grey                20         10
    Hazel               10          5
    TOTAL              200        100
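    The percentage column of a nominal frequency table is each count divided by the total. A minimal sketch, assuming the counts are keyed by category in a dictionary (the names `counts` and `percentages` are illustrative only):

```python
# Eye-color counts copied from the table above.
counts = {"Blue": 60, "Brown": 80, "Green": 30, "Grey": 20, "Hazel": 10}

def percentages(freq):
    """Return each category's share of the total, in percent."""
    total = sum(freq.values())
    return {category: 100 * count / total for category, count in freq.items()}

shares = percentages(counts)
for category, share in shares.items():
    print(f"{category:<6}{counts[category]:>5}{share:>6.0f}%")
```

    Because nominal categories have no natural order, counts and percentages (and the mode) are the only summaries that make sense for them.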


    Ordinal data: Example

    Rail travelers might be asked to give their views on the quality of the MRT service according to a scale of 1 to 5, where:

    1 = very poor
    2 = poor
    3 = adequate
    4 = good
    5 = very good


    Interval and Ratio Data (the most sophisticated):

    data where values progress both in order and according to a series of equal steps.


    Interval and Ratio Data: Examples

    number of babies born to different families

    number of people pre-purchasing theater tickets at a particular venue over different periods

    age, weight and height of a sample of human beings


    A statistical procedure is selected on the basis of its appropriateness for answering the question involved in the study.

    Nothing is gained by using a complicated procedure when a simple one will do just as well. Statistics are to serve research, not to dominate it.


    Sources of Data

    1. Secondary Data: Data which are already available. An example: the Statistical Abstract of the USA.
       Advantage: less expensive.
       Disadvantage: may not satisfy your needs.

    2. Primary Data: Data which must be collected.


    Preliminary Steps in Statistical Study

    Define the problem

    Determine the population/subject of the study

    Devise the set of questions

    Determine the sampling design


    Guidelines in the Selection of a Research Problem or Topic

    The research problem must be chosen by the researcher himself so that he will not make excuses for all the obstacles he will encounter.

    The problem must be within the interest of the researcher so that he will give all the time and effort in the research work.


    The problem must be within the specialization of the researcher. It will make the work easier for the researcher because he is familiar with the area, and it will help him improve his specialization, skill and competence in his own area.

    The research problem must be within the competence of the researcher. The researcher must know the procedures in making research and how to apply them. He must have a workable understanding of his study.


    The researcher must have the ability and capacity to finance the research problem to make sure that the study will be completed on the target time.

    The research problem must be manageable. The data must be available or within the capacity of the researcher to gather. The data must be accurate, objective and not biased. The data should help the researcher answer the question being investigated.


    The research problem must be completed within the period set by the researcher.

    The research problem must be significant, important and relevant to the present time as well as to the future. This means that the research problem must have an impact on the situation and people it is intended for.


    The results of the study must be practical and implementable.

    The study must contribute to human knowledge. The facts and knowledge must be a product of research.


    Different Ways to Draw a Sample

    (2 Types of Sampling)

    1. Random Sampling or Probability Sampling

    In probability sampling, the sample is a proportion of the population, selected in a systematic way in which every element of the population has a chance of being included in the sample.

    2. Non-Random Sampling or Non-Probability Sampling

    In non-probability sampling, the sample is not a proportion of the population and there is no system in selecting the sample. The selection depends on the situation.


    Random Sampling or Probability Sampling

    b. Systematic Random Sampling - select some starting point and then select every kth element in the population


    c. Stratified Sampling - subdivide the population into subgroups (strata) that share the same characteristic, then draw a sample from each stratum


    d. Cluster Sampling - divide the population into sections (or clusters); randomly select some of those clusters; choose all members from the selected clusters
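    The three probability designs just described can be sketched with Python's standard `random` module. This is only an illustration under stated assumptions: the roster of 100 student IDs, the two strata and the five sections are hypothetical, and the function names are not from the lecture.

```python
import random

def systematic_sample(population, k, rng):
    """Pick a random start among the first k items, then take every k-th item."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, fraction, rng):
    """Draw the same fraction at random from every stratum (subgroup)."""
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every member of each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(2010)           # fixed seed so the sketch is repeatable
students = list(range(1, 101))      # hypothetical roster of 100 student IDs
strata = {"group_a": students[:40], "group_b": students[40:]}
sections = {f"section_{i}": students[i * 20:(i + 1) * 20] for i in range(5)}

print(len(systematic_sample(students, 10, rng)))   # 10 students
print(len(stratified_sample(strata, 0.10, rng)))   # 4 + 6 = 10 students
print(len(cluster_sample(sections, 2, rng)))       # 2 sections x 20 = 40 students
```

    Note the design difference: stratified sampling takes some members from every subgroup, while cluster sampling takes every member from some subgroups.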


    Non-Random Sampling or Non-Probability Sampling

    1. Accidental Sampling

    In this type of sampling, there is no system of selection; only those whom the researcher or interviewer meets by chance are included in the sample. This type of sampling lacks representativeness, so the sample may be biased. If the interviewer goes to a business section, most of the people interviewed are likely to be business people and probably rich, hence the respondents will be well-to-do. But if the interviewer stays in a slum area, then it is possible that the respondents are poor people. In research, every section of the population must be equally represented in the sample. This method is used only when there is no alternative.


    2. Quota Sampling

    In this type of sampling, a specified number of persons of certain types is included in the sample. Suppose the reactions of the people to a particular issue, such as the effects of drug addiction in a certain locality, are to be decided from a sample that consists of 10 doctors, 9 lawmakers, 15 parents and 20 drug addicts.

    In quota sampling, many sectors of the population are represented. However, the representation is doubtful because it is not proportional and there are no guidelines in the selection of the respondents. Anyone who is selected to participate will do. Quota sampling may be used only when none of the more desirable types of sampling will do.


    3. Convenience Sampling

    Convenience sampling is a process of picking out people in the most convenient and fastest way to get reactions immediately. This method can be done by telephone interview to get the immediate reactions of a certain group to a certain issue.

    This kind of method is biased and not representative. It is quite different from gathering data by the interview method, even when the interview is done through the telephone: in the interview method, the people who are interviewed by telephone are properly selected for inclusion in the sample.


    Sample Size Determination

    The extent of the population will depend on the nature of the problem. A census survey requires all individuals in the population under consideration, while a sample survey considers a few representatives of the population.

    In determining the sample size, the formula which can be applied is as follows (Slovin's Formula):

    $$ n = \frac{N}{1 + N e^2} $$

    n = sample size
    N = population size
    e = desired margin of error (percent allowance for non-precision because a sample is used)


    Example 1

    A researcher wants to make use of a student population of 3,000 for his study of a mathematical achievement test. If he allows a 5% margin of error, how many students must he take for his sample?

    Solution: The formula given can be used:

    $$ n = \frac{N}{1 + N e^2} = \frac{3000}{1 + 3000(0.05)^2} = \frac{3000}{1 + 3000(0.0025)} = \frac{3000}{1 + 7.5} = \frac{3000}{8.5} = 352.94 \text{ or } 353 $$
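    The same computation can be checked in a few lines of Python. This is a minimal sketch; the function name `slovin_sample_size` is illustrative, and rounding up to the next whole student is an assumption that matches the slide's final answer.

```python
import math

def slovin_sample_size(population_size, margin_of_error):
    """Slovin's formula: n = N / (1 + N * e**2)."""
    return population_size / (1 + population_size * margin_of_error ** 2)

n = slovin_sample_size(3000, 0.05)
print(round(n, 2))   # 352.94
print(math.ceil(n))  # 353
```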


    DESCRIPTIVE STATISTICS

    Summarizes or describes the important characteristics of a known set of data.

    Measures of Central Tendency

    A measure of central tendency for a collection of data values is a number that is meant to convey the idea of centralness for the data set.

    Measures of central tendency are numerical values that indicate the central point, or the point of greatest frequency, of a set of data. The most common measures of central tendency are the mean, median, and mode.


    Measures of Central Tendency and Scales of Measurement

    The mode requires only nominal data, and you can also compute it for ordinal, interval, and ratio data.

    The median requires ordinal data, and you can also compute it for interval and ratio data. You cannot compute the median for nominal data.

    The mean requires interval or ratio data; you cannot compute it for either nominal or ordinal data.


    Example

    1. What is the mean for the following sample values? 3, 8, 6, 14, 0, -4, 0, 12, -7, 0, -10.

    Solution:

    $$ \bar{X} = \frac{3 + 8 + 6 + 14 + 0 + (-4) + 0 + 12 + (-7) + 0 + (-10)}{11} = \frac{22}{11} = 2 $$


    2. Find the median for the ages of the following eight college students:

    23 19 32 25 26 22 24 20

    Solution: First order the values. The ordered array is

    19 20 22 23 24 25 26 32

    median = (23 + 24)/2 = 23.5


    What is the mode for the following sample values?

    3 5 1 4 2 9 6 10

    The data set has no mode.

    What is the mode for the following sample values?

    3 5 1 4 2 9 6 10 5 3 4 3 9 3 6 1

    The mode is 3 (unimodal).


    What is the mode for the following sample values?

    6 10 5 3 4 3 9 3 6 1 6

    The modes are 3 and 6 (bimodal).
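    The worked examples for the mean, median and mode can be reproduced with Python's standard library. This is a sketch, not part of the lecture; the helper `modes` is a hypothetical name, written so that a data set in which no value repeats reports no mode, matching the convention used above.

```python
from collections import Counter
from statistics import mean, median

def modes(values):
    """Return every value tied for the highest count, or [] if nothing repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(mean([3, 8, 6, 14, 0, -4, 0, 12, -7, 0, -10]))   # 2
print(median([23, 19, 32, 25, 26, 22, 24, 20]))        # 23.5
print(modes([3, 5, 1, 4, 2, 9, 6, 10]))                # []
print(modes([6, 10, 5, 3, 4, 3, 9, 3, 6, 1, 6]))       # [3, 6]
```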


    Mean: Grouped Data

    The calculation of the mean from a frequency distribution is almost the same as that from ungrouped (raw) data, except that in a distribution the individual values are not known. When the number of items is too large, it is best to compute the measures of central tendency using a frequency distribution.

    $$ \bar{X} = \frac{\sum f X_i}{n} $$


    Example:

    Calculation of Mean from Frequency Distribution of Sample Test Scores of 40 Students in Educational Statistics

    Scores      f     Xi     fXi
    70-74       2     72     144
    65-69       2     67     134
    60-64       3     62     186
    55-59       2     57     114
    50-54       8     52     416
    45-49       9     47     423
    40-44       2     42      84
    35-39       4     37     148
    30-34       5     32     160
    25-29       3     27      81
            n = 40       Σ fXi = 1,890

    $$ \bar{X} = \frac{\sum f X_i}{n} = \frac{1890}{40} = 47.25 $$
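    The grouped mean can be verified with a short script. This is a minimal sketch, assuming each class is stored as a (lower limit, upper limit, frequency) tuple; that layout and the name `grouped_mean` are illustrative choices.

```python
# (lower limit, upper limit, frequency) for each class, from the table above.
classes = [
    (70, 74, 2), (65, 69, 2), (60, 64, 3), (55, 59, 2), (50, 54, 8),
    (45, 49, 9), (40, 44, 2), (35, 39, 4), (30, 34, 5), (25, 29, 3),
]

def grouped_mean(table):
    """Weight each class midpoint by its frequency and divide by n."""
    n = sum(f for _, _, f in table)
    total = sum(f * (lower + upper) / 2 for lower, upper, f in table)
    return total / n

print(grouped_mean(classes))  # 47.25
```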


    Median: Grouped Data

    Formula:

    $$ Mdn = L_m + \left( \frac{\frac{n}{2} - F_{m-1}}{f_m} \right) i $$

    Where:
    Mdn = median
    L_m = lower class boundary of the median class
    F_{m-1} = less than cumulative frequency of the class immediately preceding the median class
    f_m = frequency of the median class
    i = the size of the interval
    n = the total number of scores
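    As a hedged worked example, the formula can be applied to the frequency distribution of 40 test scores used in the grouped-mean example. Choosing that table is an assumption, since the example that accompanied this slide is not preserved. Here n/2 = 20; the less than cumulative frequency is 14 through the 40-44 class and 23 through the 45-49 class, so 45-49 is the median class, with L_m = 44.5, F_{m-1} = 14, f_m = 9 and i = 5:

```latex
Mdn = L_m + \left( \frac{\frac{n}{2} - F_{m-1}}{f_m} \right) i
    = 44.5 + \left( \frac{20 - 14}{9} \right)(5)
    \approx 47.83
```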


    Mode: Grouped Data

    Formula:

    $$ M_o = L_{mo} + \left( \frac{d_1}{d_1 + d_2} \right) i $$

    Where:
    L_{mo} = lower class boundary of the modal class
    d_1 = numerical difference between the frequency of the modal class and the frequency of the adjacent lower class
    d_2 = numerical difference between the frequency of the modal class and the frequency of the adjacent higher class
    i = size of the class interval


    Example: Calculation of Mode from Frequency Distribution of Sample Test Scores of 40 Students in Educational Statistics

    Scores      f
    70-74       2
    65-69       2
    60-64       3
    55-59       2
    50-54       8
    45-49       9
    40-44       2
    35-39       4
    30-34       5
    25-29       3
            n = 40

    $$ M_o = L_{mo} + \left( \frac{d_1}{d_1 + d_2} \right) i = 44.5 + \left( \frac{7}{7 + 1} \right)(5) = 48.88 $$
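    The grouped-mode computation can be sketched in code. The sketch assumes integer class limits (so each lower class boundary is the lower limit minus 0.5) and classes listed from lowest to highest; the function name `grouped_mode` is illustrative.

```python
def grouped_mode(table, width):
    """Mo = L_mo + d1 / (d1 + d2) * i, for classes listed from lowest to highest."""
    freqs = [f for _, f in table]
    m = freqs.index(max(freqs))                         # position of the modal class
    below = freqs[m - 1] if m > 0 else 0                # adjacent lower class
    above = freqs[m + 1] if m < len(freqs) - 1 else 0   # adjacent higher class
    d1 = freqs[m] - below
    d2 = freqs[m] - above
    lower_boundary = table[m][0] - 0.5                  # assumes integer class limits
    return lower_boundary + d1 / (d1 + d2) * width

# (lower class limit, frequency), lowest class first, from the table above.
scores = [(25, 3), (30, 5), (35, 4), (40, 2), (45, 9),
          (50, 8), (55, 2), (60, 3), (65, 2), (70, 2)]

print(grouped_mode(scores, 5))  # 48.875
```

    The modal class is 45-49 (f = 9), so d_1 = 9 - 2 = 7 and d_2 = 9 - 8 = 1, which pulls the estimate toward the upper end of the class.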


    Properties of the different central tendency measures:

    1. The mean is the standard measure of central tendency in statistics. It is the most frequently used.

    2. The mean is not necessarily equal to any score in the data set.

    3. The mean is the most stable measure from sample to sample.

    4. The mean is strongly influenced by outliers, that is, by the presence of extreme scores.

    5. The median is not sensitive to outliers.

    6. The mean is based on all scores from the sample, but the mode and the median are not.

    7. The mode is the least stable measure from sample to sample.

    8. The median is the best measure of central tendency if the distribution is skewed.


    Measures of Position

    The Quartiles

    Quartiles are the points which divide the total number of scores into four equal parts. Each set of scores has three quartiles. Twenty-five percent of the scores fall below the first quartile (Q1), fifty percent fall below the second quartile (Q2), and seventy-five percent fall below the third quartile (Q3).

    The steps in finding the quartiles of raw scores can be summarized as follows:

    1. Arrange the scores from lowest to highest.

    2. Determine Q_k, where Q_k is the kth quartile and k = 1, 2, 3.

    2.1 If nk/4 is an integer,

    $$ Q_k = \frac{\left(\frac{nk}{4}\right)^{th} + \left(\frac{nk}{4} + 1\right)^{th} \text{ scores}}{2} $$

    2.2 If nk/4 is not an integer, Q_k = the ith score, where i is the closest integer greater than nk/4.


    Example

    Calculation of Quartiles from Sample Raw Scores of Eight Students in Educational Statistics and Nine Students in Applied Statistics

    Educational Statistics    Applied Statistics
            17                       15
            17                       19
            26                       20
            28                       24
            30                       28
            30                       30
            31                       32
            37                       32
                                     40

    For Educational Statistics (n = 8):

    $$ Q_1: \frac{8(1)}{4} = 2 \rightarrow Q_1 = \frac{2^{nd} + 3^{rd} \text{ scores}}{2} = \frac{17 + 26}{2} = 21.5 $$

    $$ Q_3: \frac{8(3)}{4} = 6 \rightarrow Q_3 = \frac{6^{th} + 7^{th} \text{ scores}}{2} = \frac{30 + 31}{2} = 30.5 $$

    For Applied Statistics (n = 9):

    $$ Q_1: \frac{9(1)}{4} = 2.25 \rightarrow Q_1 = 3^{rd} \text{ score} = 20 $$

    $$ Q_3: \frac{9(3)}{4} = 6.75 \rightarrow Q_3 = 7^{th} \text{ score} = 32 $$
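    The nk/4 rule above translates directly into code. The following is a sketch of that specific rule (other textbooks and software use different quartile conventions, so results may differ from library defaults); the name `quartile` is illustrative.

```python
import math

def quartile(scores, k):
    """k-th quartile (k = 1, 2, 3) using the nk/4 rule described above."""
    ordered = sorted(scores)
    n = len(ordered)
    if (n * k) % 4 == 0:
        pos = n * k // 4
        # nk/4 is an integer: average that score and the next one (1-based positions)
        return (ordered[pos - 1] + ordered[pos]) / 2
    # otherwise take the score at the next integer position above nk/4
    pos = math.ceil(n * k / 4)
    return ordered[pos - 1]

educational = [17, 17, 26, 28, 30, 30, 31, 37]
applied = [15, 19, 20, 24, 28, 30, 32, 32, 40]

print(quartile(educational, 1), quartile(educational, 3))  # 21.5 30.5
print(quartile(applied, 1), quartile(applied, 3))          # 20 32
```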


    For Grouped Data:

    Formula: First Quartile

    $$ Q_1 = L_{Q_1} + \left( \frac{\frac{n}{4} - F_{m-1}}{f_{Q_1}} \right) i $$

    Where:
    Q_1 = the First Quartile
    L_{Q_1} = lower class boundary of the class where Q_1 lies
    F_{m-1} = less than cumulative frequency approaching or equal to but not exceeding n/4
    f_{Q_1} = the frequency of the class where Q_1 lies
    i = the size of the interval
    n = the total number of scores


    For Grouped Data:

    Formula: Third Quartile

    $$ Q_3 = L_{Q_3} + \left( \frac{\frac{3n}{4} - F_{m-1}}{f_{Q_3}} \right) i $$

    Where:
    Q_3 = the Third Quartile
    L_{Q_3} = lower class boundary of the class where Q_3 lies
    F_{m-1} = less than cumulative frequency approaching or equal to but not exceeding 3n/4
    f_{Q_3} = the frequency of the class where Q_3 lies
    i = the size of the interval
    n = the total number of scores


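    $$ Q_1 = L_{Q_1} + \left( \frac{\frac{n}{4} - F_{m-1}}{f_{Q_1}} \right) i = 34.5 + \left( \frac{10 - 8}{4} \right)(5) = 37 $$

    $$ Q_3 = L_{Q_3} + \left( \frac{\frac{3n}{4} - F_{m-1}}{f_{Q_3}} \right) i = 49.5 + \left( \frac{30 - 23}{8} \right)(5) = 53.88 $$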
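    The figures in these two computations match the frequency distribution of 40 test scores used in the grouped mean and mode examples, so the sketch below assumes that table. It also assumes integer class limits (lower boundary = lower limit minus 0.5); the name `grouped_quartile` is illustrative.

```python
def grouped_quartile(table, width, k):
    """Q_k = L + ((k*n/4 - F) / f) * i, for classes listed from lowest to highest."""
    n = sum(f for _, f in table)
    target = k * n / 4
    cumulative = 0                        # F: cumulative frequency below the class
    for lower_limit, f in table:
        if cumulative + f >= target:      # first class whose cumulative count reaches kn/4
            lower_boundary = lower_limit - 0.5   # assumes integer class limits
            return lower_boundary + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("frequency table is empty")

# (lower class limit, frequency), lowest class first.
scores = [(25, 3), (30, 5), (35, 4), (40, 2), (45, 9),
          (50, 8), (55, 2), (60, 3), (65, 2), (70, 2)]

print(grouped_quartile(scores, 5, 1))  # 37.0
print(grouped_quartile(scores, 5, 3))  # 53.875
```

    With k = 2 the same function gives the grouped median, since Q2 and the median share the same formula with n/2 in place of n/4.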


    Measures of Variability

    A measure of variability for a collection of data values is a number that is meant to convey the idea of spread for the data set. The most commonly used measures of variability for sample data are the range, the interquartile range, the mean absolute deviation, the variance or standard deviation, and the coefficient of variation.


    Mean Absolute Deviation is the average of the absolute deviation values from the mean.

    Mean Absolute Deviation utilizes deviations of the data values from the mean in its computation.

    $$ MAD = \frac{\sum |x - \bar{x}|}{n} $$


    What is the MAD for the following sample values?

    3 8 6 12 0 -4 10

    The mean of these values is 35/7 = 5.

    Data Values, x    Absolute Deviations |x - mean|
          3                     2
          8                     3
          6                     1
         12                     7
          0                     5
         -4                     9
         10                     5
                        Total = 32

    The MAD is the average (absolute) distance of the sample values from the mean:

    $$ MAD = \frac{\sum |x - \bar{x}|}{n} = \frac{32}{7} = 4.57 $$
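    A short script confirms the result. This is a minimal sketch; the function name `mean_absolute_deviation` is illustrative.

```python
def mean_absolute_deviation(values):
    """Average absolute distance of the values from their mean."""
    center = sum(values) / len(values)
    return sum(abs(x - center) for x in values) / len(values)

mad = mean_absolute_deviation([3, 8, 6, 12, 0, -4, 10])
print(round(mad, 2))  # 4.57
```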


    The Variance and Standard Deviation

    The variance and standard deviation are the most common and useful measures of variability. These two measures provide information about how the data vary about the mean.

    If the data are clustered around the mean, then the variance and standard deviation will be somewhat small.

    There is small variability when the data values are clustered about the mean.


    Sample Variance

    $$ S^2 = \frac{\sum (x - \bar{x})^2}{n - 1} $$

    The formula says that you subtract the mean from each data value and square the differences, then you add these values and divide by the sample size minus 1.

    Do not let the formula frighten you. We will build a table to help compute the variance.


    What is the variance for the following sample values?

    3 8 6 14 0 11

    Solution: First of all, we need to compute the sample mean:

    $$ \bar{X} = \frac{3 + 8 + 6 + 14 + 0 + 11}{6} = \frac{42}{6} = 7 $$


    Table Used in Helping to Compute the Sample Variance

    Data    Deviations (x - mean)    Squared Deviations (x - mean)^2
      3       3 - 7 = -4                   16
      8       8 - 7 = 1                     1
      6       6 - 7 = -1                    1
     14      14 - 7 = 7                    49
      0       0 - 7 = -7                   49
     11      11 - 7 = 4                    16
    Total          0                      132


    The sample variance is

    $$ S^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{132}{6 - 1} = \frac{132}{5} = 26.4 $$


    What is the standard deviation for the following sample values?

    3 8 6 14 0 11

    Solution:

    $$ S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{132}{5}} = \sqrt{26.4} = 5.14 $$
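    Both results can be checked with Python's `statistics` module, whose `variance` and `stdev` functions use the same n - 1 (sample) denominator as the formulas above. A minimal sketch:

```python
from statistics import stdev, variance

data = [3, 8, 6, 14, 0, 11]

print(variance(data))          # 26.4
print(round(stdev(data), 2))   # 5.14
```

    For a whole population rather than a sample, the divisor would be n instead of n - 1 (`pvariance` and `pstdev` in the same module).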


    Coefficient of Variation

    The Coefficient of Variation (CV) allows us to compare the variation of two (or more) different variables.

    The sample coefficient of variation is defined as the sample standard deviation divided by the sample mean of the data set. Usually, the result is expressed as a percentage.

    $$ CV = \frac{S}{\bar{X}} \times 100\% $$


    The sample coefficient of variation standardizes the variation by dividing it by the sample mean. The CV has no units, since the standard deviation and the mean have the same units, and thus the units cancel each other. Because of this property, we can use this measure to compare variations for different variables with different units.

    We said that the standard deviation measures the variation in a set of data. For distributions having the same mean, the distribution with the largest standard deviation has the greatest variation. But when considering distributions with different means, decision makers can't compare the uncertainty in the distributions only by comparing standard deviations. In this case, the coefficient of variation is used, i.e., the coefficients of variation for different distributions are compared, and the distribution with the largest coefficient of variation has the greatest relative variation.


    $$ CV_{(tickets)} = \frac{5}{90} \times 100\% = 5.56\% $$

    $$ CV_{(revenues)} = \frac{775}{5{,}400} \times 100\% = 14.35\% $$

    Since the CV is larger for the revenues, there is more variability in the recorded revenues than in the number of tickets issued.


    Example 2

    Mark teaches two sections of statistics. He gives each section a different test covering the same material. The mean score on the test for the day section is 27, with a standard deviation of 3.4. The mean score for the night section is 94, with a standard deviation of 8.0. Which section has the greatest variation or dispersion of scores?

                Day Section    Night Section
    Mean            27              94
    S.D.            3.4             8.0

    Direct comparison of the two standard deviations suggests that the night section has the greater variation. But comparing the coefficients of variation shows quite a different result:

    C.V.(day) = (3.4/27) x 100 = 12.6% and C.V.(night) = (8/94) x 100 = 8.5%

    Relative to its mean, the day section therefore has the greater dispersion of scores.
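    The coefficient-of-variation comparisons above can be reproduced with a one-line function. This is a minimal sketch; the function name is illustrative, and the inputs are the standard deviations and means quoted in the two examples.

```python
def coefficient_of_variation(std_dev, mean_value):
    """CV = (S / mean) * 100, expressed in percent."""
    return std_dev / mean_value * 100

print(round(coefficient_of_variation(3.4, 27), 1))     # 12.6  (day section)
print(round(coefficient_of_variation(8.0, 94), 1))     # 8.5   (night section)
print(round(coefficient_of_variation(5, 90), 2))       # 5.56  (tickets)
print(round(coefficient_of_variation(775, 5400), 2))   # 14.35 (revenues)
```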


    Empirical Rule:

    This rule generally applies to mound-shaped data, but specifically to data that are normally distributed, i.e., bell shaped. The rule is as follows:

    Approximately 68% of the measurements (data) will fall within one standard deviation of the mean (One-Sigma Rule), 95% fall within two standard deviations (Two-Sigma Rule), and 99.7% (or almost 100%) fall within three standard deviations (Three-Sigma Rule). See the following figure:


    For example, in the height problem, the mean height was 70 inches with a standard deviation of 3.4 inches. Thus, 68% of the heights fall between 66.6 and 73.4 inches, one standard deviation, i.e., (mean + 1 standard deviation) = (70 + 3.4) = 73.4, and (mean - 1 standard deviation) = 66.6. Ninety-five percent (95%) of the heights fall between 63.2 and 76.8 inches, two standard deviations. Ninety-nine and seven tenths percent (99.7%) fall between 59.8 and 80.2 inches, three standard deviations. See the following figure:
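    The three intervals can be generated from the mean and standard deviation alone. This is a sketch that assumes the data are roughly bell shaped, as the rule requires; the function name `sigma_intervals` is illustrative.

```python
def sigma_intervals(mean_value, std_dev):
    """Return the 1-, 2- and 3-sigma intervals around the mean."""
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return {
        share: (round(mean_value - k * std_dev, 1), round(mean_value + k * std_dev, 1))
        for k, share in coverage.items()
    }

for share, (low, high) in sigma_intervals(70, 3.4).items():
    print(f"about {share} of heights between {low} and {high} inches")
```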