Testing of Hypothesis1

  • 7/31/2019 Testing of Hypothesis1

    Sampling and Sampling Distributions

    A statistical population is the aggregate of all the units pertaining to a study, i.e. it is the set of all elements about which we wish to make inferences.

    A sample is a subset of a population.

    The process of drawing a sample from a large population is called sampling.

    STATISTIC: Characteristic or measure obtained from a sample.

    PARAMETER: Characteristic or measure obtained from a population.

    A sampling distribution is the probability distribution, under repeated sampling of the population, of a given statistic.

    Consider a very large population.

    Assume we repeatedly take samples of a given size from the population and calculate the sample mean for each sample.

    Different samples will lead to different sample means.

    The distribution of these means is the sampling distribution of the sample mean.

    When all of the possible sample means are computed, the following properties hold:

    The mean of the sample means will be the mean of the population (μ).

    The variance of the sample means will be the variance of the population divided by the sample size (σ²/n).

    The standard deviation of the distribution of a sample statistic is known as the standard error of the statistic. For the sample mean, it is therefore σ/√n.

    The nature of the sampling distribution depends on the distribution of the population, the statistic being considered, and the sample size used.
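
The two properties of the sample mean can be checked exactly on a toy population. The sketch below is only an illustration: the population values and the sample size are hypothetical, and it assumes sampling with replacement, under which the σ²/n result holds exactly.

```python
from itertools import product
from math import sqrt
from statistics import mean, pvariance

# Hypothetical toy population (not taken from the slides).
population = [2, 4, 6, 8]
n = 2  # sample size

# Enumerate every possible ordered sample of size n, drawn with replacement,
# and compute the mean of each one.
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mu = mean(population)             # population mean
sigma_sq = pvariance(population)  # population variance

mean_of_means = mean(sample_means)
variance_of_means = pvariance(sample_means)
standard_error = sqrt(variance_of_means)  # standard error of the sample mean

print("mean of sample means:", mean_of_means, "population mean:", mu)
print("variance of sample means:", variance_of_means, "sigma^2 / n:", sigma_sq / n)
print("standard error:", standard_error)
```

With a finite population sampled without replacement, the variance of the sample mean is smaller than σ²/n, which is why the slides speak of a very large population.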

    Testing of Hypothesis

    A hypothesis is an assumption about a population.

    A few examples are as follows:

    1. Mean purchases made by females (μ1) are more than or equal to the mean purchases made by males (μ2) in a textile store (μ1 ≥ μ2).

    2. Mean age of female shoppers (μ1) is less than or equal to that of male shoppers (μ2) in a book exhibition (μ1 ≤ μ2).

    3. Mean monthly income of buyers (μ) in a shop is more than or equal to Rs 10000/- (μ ≥ 10000).

    4. The mean stay-over time of customers (μ) in a shop is at most 45 minutes (μ ≤ 45).

    Definitions

    Parameter: It is a function of population values.

    Statistic: It is a function of sample values.

    Null Hypothesis: It is an assumption about the population parameter that is a statement of no change. It is denoted by H0.

    Alternate Hypothesis: It is the assumption that can be considered the alternative to the null hypothesis. It is denoted by H1.

    As long as there is no apparent contradiction to the null hypothesis, we retain this belief. But when we find observations contradicting it, there is reason to suspect the validity of the null hypothesis, and the problem of testing the null hypothesis arises.

    When we proceed to test H0, we must be aware of the assumption that is expected to be valid if the null hypothesis turns out to be invalid. This assumption is known as the alternative hypothesis.

    H0: The mean I.Q. of all persons in a city is 105

    H1: The mean I.Q. of all persons in the city is 100
    (if it is known that the mean I.Q. is 105 or 100 and nothing else)

    OR

    H1: The mean I.Q. of all the persons in the city is less than 105
    (if it is known that the mean I.Q. is not more than 105)

    OR

    H1: The mean I.Q. of all the persons in the city is more than 105
    (if it is known that the mean I.Q. is not less than 105)

    OR

    H1: The mean I.Q. of all the persons in the city is not equal to 105
    (if no such information is available)

    The first thing to do when given a claim is to write the claim mathematically (if possible), and decide whether the given claim is the null or the alternative hypothesis.

    If the given claim contains equality, or a statement of no change from the given or accepted condition, then it is the null hypothesis; otherwise, if it represents change, it is the alternative hypothesis.

    Example

    "He's dead," said Dr. X to Captain K.

    Mr. S, as the science officer, is put in charge of statistically determining the correctness of X's statement and deciding the fate of the crew member (to vaporize or try to revive).

    His first step is to arrive at the hypothesis to be tested.

    Does the statement represent a change in previous condition?

    Yes, there is change, thus it is the alternative hypothesis, H1.

    No, there is no change, therefore it is the null hypothesis, H0.

    The correct answer is that there is change.

    Dead represents a change from the accepted state of alive.

    The null hypothesis always represents no change.

    Therefore, the hypotheses are:

    H0: Patient is alive.
    H1: Patient is not alive (dead).

    PROCEDURE IN HYPOTHESIS TESTING

    1. Formulate the Hypothesis: Set up a null hypothesis based on the belief and an appropriate alternate hypothesis.

    2. Set up a Suitable Significance Level: The confidence with which a null hypothesis is rejected or accepted depends upon the significance level used for the purpose.

    A level of significance of, say, 5% means that the risk of wrongly rejecting a true null hypothesis is at most 5 in 100 cases. The levels of significance most widely used are 5% and 1%. Thus, a 1% level of significance provides greater confidence in a decision to reject than a 5% level, as the risk of wrongly rejecting is only 1 in 100 cases. It is denoted by the Greek letter alpha (α), and (1 − α) is the CONFIDENCE LEVEL.

    3. Select Test Criterion: The test criterion is selected on the basis of sample size. If the sample is large (n ≥ 30), the z-test, which relies on the normal distribution, is used; whereas if the sample size is small (n < 30), the t-test is more suitable. The most commonly used tests are z, t, F and χ².

    A corresponding TEST STATISTIC is calculated.

    4. Decision Criterion: The test statistic calculated in the previous step is now classified as falling within the acceptance region or the rejection region at the given level of significance. Accordingly, the null hypothesis is accepted or rejected.

    5. Conclusion: On the basis of the decision, the conclusion is stated.
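
The five steps can be sketched on the stay-over-time claim used earlier (mean stay-over time of at most 45 minutes). This is a minimal illustration, not a prescribed implementation: the sample size, sample mean and population standard deviation below are hypothetical numbers, and the population standard deviation is assumed to be known so that a z-test applies.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: formulate the hypotheses.
# H0: mu <= 45 minutes, H1: mu > 45 minutes (a right-tail test).
mu0 = 45.0

# Step 2: set the significance level.
alpha = 0.05

# Hypothetical sample summary (invented for illustration only).
n = 36              # large sample, so the z-test is selected
sample_mean = 47.0
sigma = 6.0         # population standard deviation, assumed known

# Step 3: compute the test statistic z = (x_bar - mu0) / (sigma / sqrt(n)).
z = (sample_mean - mu0) / (sigma / sqrt(n))

# Step 4: decision criterion. The rejection region is the right tail of area alpha.
z_critical = NormalDist().inv_cdf(1 - alpha)
reject_h0 = z > z_critical

# Step 5: state the conclusion.
if reject_h0:
    conclusion = "Reject H0: the data suggest the mean stay-over time exceeds 45 minutes."
else:
    conclusion = "Do not reject H0: insufficient evidence that the mean exceeds 45 minutes."

print(f"z = {z:.3f}, critical value = {z_critical:.3f}")
print(conclusion)
```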

    ERRORS IN DECISION MAKING

    The problem of testing a hypothesis is actually a problem of deciding whether to accept the null hypothesis H0 or to reject it in favor of the alternate hypothesis H1.

    The decision to reject or accept the null hypothesis is taken on the basis of observations made only on a sample of units selected from the population. This decision cannot always be correct. When the decision is not correct, an error is said to occur.

    States of nature are something that you, as a decision maker, have no control over.

    Either it is, or it isn't. This represents the true nature of things.

    Possible states of nature (based on H0):

    Patient is alive (H0 true, H1 false)

    Patient is dead (H0 false, H1 true)

    Decisions are something that you have control over.

    You may make a correct decision or an incorrect decision.

    Whether your decision is correct or in error depends on the state of nature.

    Possible decisions (based on H0) / conclusions (based on claim):

    Reject H0 / "Sufficient evidence to say patient is dead"

    Fail to reject H0 / "Insufficient evidence to say patient is dead"

    Statistically speaking

                         State of Nature
    Decision             H0 True                          H0 False
    Reject H0            Patient is alive,                Patient is dead,
                         sufficient evidence of death     sufficient evidence of death
    Fail to reject H0    Patient is alive,                Patient is dead,
                         insufficient evidence of death   insufficient evidence of death

    In English...

                         State of Nature
    Decision             H0 True                   H0 False
    Reject H0            Vaporize a live person    Vaporize a dead person
    Fail to reject H0    Try to revive a person    Try to revive a person

    The following table gives the possibilities that exist in reality.

                         Null Hypothesis H0 is
    Decision             True            Not True
    Reject H0            Type I Error    No Error
    Do not reject H0     No Error        Type II Error
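
The four cells of this table can be written as a small lookup. The function name `classify_outcome` is a hypothetical helper introduced here for illustration; in practice the true state of H0 is unknown, so this only serves to make the definitions concrete.

```python
def classify_outcome(h0_is_true: bool, reject_h0: bool) -> str:
    """Return the cell of the decision table for a given state of nature and decision."""
    if h0_is_true and reject_h0:
        return "Type I Error"    # rejected a true null hypothesis
    if not h0_is_true and not reject_h0:
        return "Type II Error"   # failed to reject a false null hypothesis
    return "No Error"

# Print all four combinations of state of nature and decision.
for h0_is_true in (True, False):
    for reject_h0 in (True, False):
        outcome = classify_outcome(h0_is_true, reject_h0)
        print(f"H0 true: {h0_is_true!s:5}  reject H0: {reject_h0!s:5}  ->  {outcome}")
```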

    Type I Error: Reject H0 when H0 is true.

    Type II Error: Do not reject H0 when H0 is not true.

    Which of the two errors is more serious, Type I or Type II?

    Critical Region: Setting up a decision criterion involves partitioning the set of possible values of the test statistic into two subsets, one of which is attributed to the decision "Reject H0" and the other to the decision "Do not reject (Accept) H0".

    For example, reject H0 if x < 50 and accept H0 if x ≥ 50.

    The critical region of a test is the subset of the sample space for which the hypothesis H0 is rejected when a sample point falls in it.
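
A cutoff rule of the form "reject H0 if the statistic is below some value" can be derived from the significance level. The sketch below assumes a left-tail z-test on the sample mean with a known population standard deviation; every number in it is hypothetical and chosen only so that the cutoff lands near 50.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical left-tail setup: H0: mu >= 52 against H1: mu < 52.
mu0 = 52.0
sigma = 6.0    # population standard deviation, assumed known
n = 36
alpha = 0.05

standard_error = sigma / sqrt(n)

# Cutoff c chosen so that P(sample mean < c) = alpha when mu = mu0.
cutoff = mu0 + NormalDist().inv_cdf(alpha) * standard_error

def in_critical_region(sample_mean: float) -> bool:
    """True if the observed sample mean falls in the critical (rejection) region."""
    return sample_mean < cutoff

def decide(sample_mean: float) -> str:
    return "Reject H0" if in_critical_region(sample_mean) else "Do not reject H0"

print(f"critical region: sample mean < {cutoff:.3f}")
print(49.0, "->", decide(49.0))
print(51.0, "->", decide(51.0))
```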

    Level of significance, Test of Significance and Power of a test

    To design a good test, we would like to arrive at a decision criterion such that neither of the two errors (Type I Error and Type II Error) occurs.

    But when P(Type I Error) → 0, P(Type II Error) → 1, and when P(Type II Error) → 0, P(Type I Error) → 1.

    Hence, no test can be perfect. We therefore design a test such that one of the two probabilities is restricted to a small value α (0 < α < 1, with α close to 0) and then minimize the probability of the other error.
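
The trade-off between the two error probabilities can be seen numerically. The sketch below assumes a right-tail z-test with known standard deviation and a single fixed true mean under H1; the values of `mu0`, `mu1`, `sigma` and `n` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(alpha, mu0=100.0, mu1=103.0, sigma=15.0, n=36):
    """P(Type II Error) for a right-tail z-test of H0: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_critical = std_normal.inv_cdf(1 - alpha)      # reject H0 if z > z_critical
    shift = (mu1 - mu0) / (sigma / sqrt(n))         # how far H1 moves the z statistic
    # Under H1 the statistic is N(shift, 1); a Type II error occurs when z <= z_critical.
    return std_normal.cdf(z_critical - shift)

alphas = (0.10, 0.05, 0.01, 0.001)
betas = [type_ii_error(a) for a in alphas]

for a, b in zip(alphas, betas):
    print(f"P(Type I Error) = {a:<6} P(Type II Error) = {b:.3f}")
```

As the permitted P(Type I Error) shrinks, the printed P(Type II Error) grows toward 1, which is the behaviour described above for a fixed sample size.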

    The error of rejecting H0 when it is true (Type I Error) is treated as a more serious error than Type II Error; therefore an upper limit is put on P(Type I Error), and P(Type II Error) is minimized subject to that limit. This upper limit is known as the level of significance.

    Thus, if a test is so designed that

    P(Type I Error) ≤ α

    then α is called the level of significance, and the test so designed is called a test of significance.

    Hence, α = Max. P(Type I Error).
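
What the level of significance controls can be illustrated by simulation: when H0 is actually true, a test at level α should wrongly reject it in roughly a fraction α of repeated samples. The sketch below assumes a two-tail z-test with known standard deviation and normally distributed data; the function name and all parameter values are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

def false_rejection_rate(alpha, trials=5000, n=30, mu0=100.0, sigma=15.0, seed=0):
    """Fraction of trials in which a TRUE H0: mu = mu0 is rejected by a two-tail z-test."""
    rng = random.Random(seed)
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population in which H0 really holds.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu0) / (sigma / sqrt(n))
        if abs(z) > z_critical:
            rejections += 1   # a Type I error
    return rejections / trials

rate_5 = false_rejection_rate(0.05)
rate_1 = false_rejection_rate(0.01)
print(f"observed Type I error rate at alpha = 0.05: {rate_5:.3f}")
print(f"observed Type I error rate at alpha = 0.01: {rate_1:.3f}")
```

The observed rates fluctuate around α from run to run (here the seed is fixed for repeatability); they are estimates, not exact values.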

    The probability that a Type II error is not committed, 1 − P(Type II Error), measures the strength of a test; it is known as the power function or the power of the test.
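
The power can be computed as a function of the true mean. The sketch below assumes a right-tail z-test of H0: μ ≤ μ0 with known standard deviation; the default values of `mu0`, `sigma`, `n` and `alpha` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power(true_mean, mu0=100.0, sigma=15.0, n=36, alpha=0.05):
    """Probability of rejecting H0: mu <= mu0 (right-tail z-test) when the mean is true_mean."""
    std_normal = NormalDist()
    z_critical = std_normal.inv_cdf(1 - alpha)
    shift = (true_mean - mu0) / (sigma / sqrt(n))
    type_ii_error = std_normal.cdf(z_critical - shift)   # P(do not reject H0)
    return 1 - type_ii_error

for true_mean in (98.0, 100.0, 103.0, 106.0, 109.0):
    print(f"true mean = {true_mean:5.1f}  P(reject H0) = {power(true_mean):.3f}")
```

At the boundary value `true_mean = mu0` the rejection probability equals α, and it is smaller for means below `mu0`, which is consistent with α being the maximum probability of a Type I error. For means above `mu0` the value is the power, and it rises toward 1 as the true mean moves away from `mu0`.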

    Tails of a test

    The rejection region in hypothesis testing can be on both sides of the curve, with the non-rejection region in between the two rejection regions.

    A hypothesis test with two rejection regions is called a two-tail test, and a test with one rejection region is called a one-tail test.

    The one rejection region can be either on the right or the left side of the curve.

    If the rejection region is on the right side of the curve, the test is known as the right-tail test.

    When the test has a rejection region on the left side, it is known as the left-tail test.

    To find out whether a particular test is one-tail or two-tail and, if it is one-tail, whether it is a left-tail or right-tail test, we use the sign in the alternative hypothesis. If the alternative hypothesis has a ≠ sign, it is a two-tail test. With a < sign it is a left-tail test, and with a > sign it is a right-tail test.

    Two-Tail Test: To test the hypothesis that the average monthly income of households in a certain town is Rs. 5000/-:

    H0: μ = 5000

    H1: μ ≠ 5000

    [Figure: sampling distribution centred at μ = 5000, with the non-rejection region in the middle and a rejection region of area α/2 in each tail.]

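    b) H0: μ = 10
       H1: μ > 10

    [Figure: sampling distribution centred at μ = 10, with the non-rejection region on the left and a rejection region of area α in the right tail.]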

    Signs in the tails of a test

                                             Two-tail Test     Left-tail Test       Right-tail Test
    1. Sign in the null hypothesis H0        =                 = or ≥               = or ≤
    2. Sign in the alternate hypothesis H1   ≠                 <                    >
    3. Rejection region                      In both tails     In the left tail     In the right tail
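
The mapping from the sign in H1 to the rejection region can be sketched as follows. The function names are hypothetical, the critical values are for a z statistic, and the strings "!=", "<" and ">" stand in for the signs ≠, < and >.

```python
from statistics import NormalDist

def rejection_region(h1_sign: str, alpha: float = 0.05):
    """Map the sign in H1 to (tail type, lower critical z, upper critical z)."""
    std_normal = NormalDist()
    if h1_sign == "!=":
        c = std_normal.inv_cdf(1 - alpha / 2)   # area alpha/2 in each tail
        return ("two-tail", -c, c)
    if h1_sign == "<":
        return ("left-tail", std_normal.inv_cdf(alpha), None)
    if h1_sign == ">":
        return ("right-tail", None, std_normal.inv_cdf(1 - alpha))
    raise ValueError("h1_sign must be '!=', '<' or '>'")

def rejects(z: float, h1_sign: str, alpha: float = 0.05) -> bool:
    """True if the z statistic falls in the rejection region implied by H1."""
    _, lower, upper = rejection_region(h1_sign, alpha)
    return (lower is not None and z < lower) or (upper is not None and z > upper)

for sign in ("!=", "<", ">"):
    print(sign, rejection_region(sign), "z = -1.8 rejected:", rejects(-1.8, sign))
```

The same observed statistic can lead to different decisions depending on the tail: a value of z = −1.8 lies in the left-tail rejection region at α = 0.05 but not in the two-tail one, because the two-tail test puts only α/2 in each tail.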

    DECISION CRITERION

    If the p-value of the test statistic is less than the level of significance α, reject H0; otherwise, do not reject H0.
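
For a z statistic, the p-value rule can be sketched as below. The function names and the example value z = 2.0 are hypothetical; the `tail` argument takes "left", "right" or "two".

```python
from statistics import NormalDist

def p_value(z: float, tail: str) -> float:
    """p-value of a z statistic for a left-tail, right-tail or two-tail test."""
    cdf = NormalDist().cdf
    if tail == "left":
        return cdf(z)
    if tail == "right":
        return 1 - cdf(z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")

def decide(z: float, tail: str, alpha: float) -> str:
    """Apply the decision criterion: reject H0 when the p-value is below alpha."""
    return "Reject H0" if p_value(z, tail) < alpha else "Do not reject H0"

z = 2.0  # hypothetical observed test statistic
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: right-tail p = {p_value(z, 'right'):.4f} -> {decide(z, 'right', alpha)}")
```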

    Distributions used in testing of hypothesis

    In order to test different parameters, for different sample sizes, and to compare such parameters across multiple populations, different statistical distributions are used.

    Testing of Hypotheses

      One Sample Tests
        Test for Mean
        Test for Proportion

      Two Sample Tests
        Test for Mean
        Test for Proportion

    Testing of Hypothesis

      Large Sample Tests (n ≥ 30): Use z-test

      Small Sample Tests (n < 30): Use t-test
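
The choice between the two tests can be sketched as a rule on the sample size. The helper names and the sample below are hypothetical, and the sketch assumes that n = 30 counts as a large sample, a boundary the slides leave ambiguous. The statistic has the same form in both cases when the sample standard deviation is used; what changes is the distribution it is compared against.

```python
from math import sqrt
from statistics import mean, stdev

def choose_test(n: int, threshold: int = 30) -> str:
    """Pick the test by sample size (assumes n == threshold counts as large)."""
    return "z-test" if n >= threshold else "t-test"

def compute_statistic(sample, mu0):
    """Return (test name, statistic) for a one-sample test of the mean against mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)   # uses the sample standard deviation
    statistic = (mean(sample) - mu0) / standard_error
    return choose_test(n), statistic

# Hypothetical stay-over times in minutes (invented for illustration).
sample = [44, 47, 50, 46, 48, 45, 49, 47]
test_name, statistic = compute_statistic(sample, mu0=45.0)
print(test_name, round(statistic, 3))
```

For the t-test, the statistic is compared with a critical value of the t distribution with n − 1 degrees of freedom, taken from tables or a statistics library; the Python standard library does not provide it.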