T=ANOVA=Regression

  • 7/27/2019 T=ANOVA=Regression

    T Tests, ANOVA, and Regression Analysis

    Here is a one-sample t test of the null hypothesis that mu = 0:

    DATA ONESAMPLE; INPUT Y @@;
    CARDS;
    1 2 3 4 5 6 7 8 9 10
    ;
    PROC MEANS T PRT; RUN;

    ------------------------------------------------------------------------------------------------

    The SAS System

    The MEANS Procedure

    Analysis Variable : Y

    t Value Pr > |t|

    5.74 0.0003

    ------------------------------------------------------------------------------------------------

    Now an ANOVA on the same data but with no grouping variable:

    PROC ANOVA; MODEL Y = ; RUN;

    ------------------------------------------------------------------------------------------------

    The SAS System

    The ANOVA Procedure

    Dependent Variable: Y

                                  Sum of
    Source              DF       Squares   Mean Square   F Value   Pr > F
    Model                1   302.5000000   302.5000000     33.00   0.0003
    Error                9    82.5000000     9.1666667
    Uncorrected Total   10   385.0000000

    R-Square   Coeff Var   Root MSE     Y Mean
    0.000000    55.04819   3.027650   5.500000

    Source              DF      Anova SS   Mean Square   F Value   Pr > F
    Intercept            1   302.5000000   302.5000000     33.00   0.0003

    ------------------------------------------------------------------------------------------------

    Notice that the ANOVA F is simply the square of the one-sample t, and the one-tailed p from the ANOVA is identical to the two-tailed p from the t.
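    The arithmetic behind that statement can be checked by hand. The sketch below is not part of the original SAS handout; it is a plain Python illustration (standard library only, variable names are my own) that recomputes the one-sample t and the intercept-only F from the same ten scores. It does not compute the p values, which would need a t or F distribution function.

```python
import math

# Data from the one-sample example above.
y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
n = len(y)
mean = sum(y) / n

# One-sample t for H0: mu = 0 is mean / (s / sqrt(n)).
ss_error = sum((v - mean) ** 2 for v in y)   # error sum of squares
ms_error = ss_error / (n - 1)                # sample variance (mean square error)
t_value = mean / math.sqrt(ms_error / n)

# Intercept-only ANOVA: the 1-df "Model" sum of squares is n * mean**2.
ss_model = n * mean ** 2
f_value = ss_model / ms_error

print(f"t = {t_value:.2f}, t^2 = {t_value ** 2:.2f}, F = {f_value:.2f}")
```

    The sums of squares it produces (302.5 and 82.5) are the Model and Error entries in the ANOVA table above.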

    Now a regression analysis with the model Y = intercept + error:

    PROC REG; MODEL Y = ; RUN;

    ------------------------------------------------------------------------------------------------

    The REG Procedure

    Model: MODEL1

    Dependent Variable: Y

                                  Sum of          Mean
    Source              DF       Squares        Square   F Value   Pr > F
    Model                0             0             .         .        .
    Error                9      82.50000       9.16667
    Corrected Total      9      82.50000

    Root MSE           3.02765    R-Square    0.0000
    Dependent Mean     5.50000    Adj R-Sq    0.0000
    Coeff Var         55.04819

    Parameter Estimates

                       Parameter    Standard
    Variable      DF    Estimate       Error   t Value   Pr > |t|
    Intercept      1     5.50000     0.95743      5.74     0.0003

    ------------------------------------------------------------------------------------------------

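    Notice that the earlier results are replicated: the t for the intercept (5.74, p = 0.0003) is the one-sample t, and its square is the ANOVA F of 33.00.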

    Now consider a two independent groups t test with pooled variances; the null is mu1 - mu2 = 0:

    DATA TWOSAMPLE; INPUT X Y @@;
    CARDS;
    1 1 1 2 1 3 1 4 1 5
    2 6 2 7 2 8 2 9 2 10
    ;
    PROC TTEST; CLASS X; VAR Y; RUN;

    ------------------------------------------------------------------------------------------------

    The SAS System

    T-Tests

    Variable   Method   Variances   DF   t Value   Pr > |t|
    Y          Pooled   Equal        8     -5.00     0.0011

    ------------------------------------------------------------------------------------------------

    Now an ANOVA on the same data:

    PROC ANOVA; CLASS X; MODEL Y = X; RUN;

    ------------------------------------------------------------------------------------------------

    The ANOVA Procedure

    Dependent Variable: Y

                                  Sum of
    Source              DF       Squares   Mean Square   F Value   Pr > F
    Model                1   62.50000000   62.50000000     25.00   0.0011
    Error                8   20.00000000    2.50000000
    Corrected Total      9   82.50000000

    R-Square   Coeff Var   Root MSE     Y Mean
    0.757576    28.74798   1.581139   5.500000

    Source              DF      Anova SS   Mean Square   F Value   Pr > F
    X                    1   62.50000000   62.50000000     25.00   0.0011

    ------------------------------------------------------------------------------------------------

    Notice that the ANOVA F is simply the square of the independent samples t, and the one-tailed ANOVA p is identical to the two-tailed p from the t.
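    As a cross-check outside SAS, the following Python sketch (standard library only; the helper names `mean` and `ss` are my own, not from the handout) recomputes the pooled-variance t and the one-way ANOVA F from the two groups of five scores. It covers only the test statistics, not the p values.

```python
import math

# Scores from the two-sample example above, split by the grouping variable X.
g1 = [1, 2, 3, 4, 5]      # X = 1
g2 = [6, 7, 8, 9, 10]     # X = 2


def mean(values):
    return sum(values) / len(values)


def ss(values):
    """Sum of squared deviations from the group's own mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


n1, n2 = len(g1), len(g2)
df_error = n1 + n2 - 2
ms_error = (ss(g1) + ss(g2)) / df_error            # pooled variance

# Pooled-variance t for H0: mu1 - mu2 = 0.
t_value = (mean(g1) - mean(g2)) / math.sqrt(ms_error * (1 / n1 + 1 / n2))

# One-way ANOVA: between-groups sum of squares on 1 df.
grand = mean(g1 + g2)
ss_between = n1 * (mean(g1) - grand) ** 2 + n2 * (mean(g2) - grand) ** 2
f_value = ss_between / ms_error

print(f"t = {t_value:.2f}, t^2 = {t_value ** 2:.2f}, F = {f_value:.2f}")
```

    Here `ss_between` and `ss(g1) + ss(g2)` correspond to the Model (62.5) and Error (20.0) sums of squares in the ANOVA table.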

    And finally, replication of the ANOVA with a regression analysis:

    PROC REG; MODEL Y = X; RUN;

    ------------------------------------------------------------------------------------------------

    The SAS System

    The REG Procedure

    Model: MODEL1

    Dependent Variable: Y

    Number of Observations Read 10

    Number of Observations Used 10

    Analysis of Variance

                                  Sum of          Mean
    Source              DF       Squares        Square   F Value   Pr > F
    Model                1      62.50000      62.50000     25.00   0.0011
    Error                8      20.00000       2.50000
    Corrected Total      9      82.50000

    Root MSE           1.58114    R-Square    0.7576
    Dependent Mean     5.50000    Adj R-Sq    0.7273
    Coeff Var         28.74798

    Parameter Estimates

                       Parameter    Standard
    Variable      DF    Estimate       Error   t Value   Pr > |t|
    Intercept      1    -2.00000     1.58114     -1.26     0.2415
    X              1     5.00000     1.00000      5.00     0.0011
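    The parameter estimates can be reproduced with the usual least-squares formulas. This is a minimal Python sketch (standard library only, names are my own), assuming the same coding of X as 1 and 2 used in the data step above.

```python
import math

# Same ten observations: X is the group code (1 or 2), Y is the score.
x = [1] * 5 + [2] * 5
y = list(range(1, 11))
n = len(y)

mx = sum(x) / n
my = sum(y) / n
sxx = sum((a - mx) ** 2 for a in x)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))

# Least-squares slope and intercept.
slope = sxy / sxx
intercept = my - slope * mx

# Residual mean square on n - 2 df, then the t for the slope.
ss_error = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
ms_error = ss_error / (n - 2)
se_slope = math.sqrt(ms_error / sxx)
t_slope = slope / se_slope

print(f"intercept = {intercept:.2f}, slope = {slope:.2f}, t = {t_slope:.2f}")
```

    The slope is the difference between the two group means, so its t has the same magnitude as the pooled t. The sign differs from the PROC TTEST output only because that procedure subtracts group 2 from group 1.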

    OK, but what if we have more than two groups? Show me that the ANOVA is a regression analysis in that case.

    Here is the SAS program, with data:

    data Lotus;

    input Dose N; Do I=1 to N; Input Illness @@; output; end;

    cards;

    0 20

    101 101 101 104 104 105 110 111 111 113 114 79 89 91 94 95 96 99 99 99

    10 20

    100 65 65 67 68 80 81 82 85 87 87 88 88 91 92 94 95 94 96 96

    20 20

    64 75 75 76 77 79 79 80 80 81 81 81 82 83 83 85 87 88 90 96

    30 20

    100 105 108 80 82 85 87 87 87 89 90 90 92 92 92 95 95 97 98 99

    40 20

    101 102 102 105 108 109 112 119 119 123 82 89 92 94 94 95 95 97 98 99

    *****************************************************************************;

    proc GLM data=Lotus; class Dose;

    model Illness = Dose / ss1;

    title 'Here we have a traditional one-way independent samples ANOVA'; run;

    *****************************************************************************;

    data Polynomial; set Lotus; Quadratic=Dose*Dose; Cubic=Dose**3;

    Quartic=Dose**4;

    proc GLM data=Polynomial; model Illness = Dose Quadratic Cubic Quartic / ss1;

    title 'Here we have a polynomial regression analysis.'; run;

    *****************************************************************************;

    Here is the output:

    Here we have a traditional one-way independent samples ANOVA 2

    The GLM Procedure

    Dependent Variable: Illness

                            Sum of
    Source        DF       Squares   Mean Square   F Value   Pr > F
    Model          4    6791.54000    1697.88500     20.78
    Dose           4   6791.540000   1697.885000     20.78

    Dependent Variable: Illness

                            Sum of
    Source        DF       Squares   Mean Square   F Value   Pr > F
    Model          4    6791.54000    1697.88500     20.78
    Dose           1    174.845000    174.845000      2.14   0.1468
    Quadratic      1   6100.889286   6100.889286     74.66
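    One way to see why the two analyses share the same Model sum of squares: with five dose levels, a quartic polynomial passes through all five group means, so the polynomial regression and the one-way ANOVA make the same predictions. The Python sketch below is my own illustration, not part of the handout. It assumes that, because the doses are equally spaced and the groups are of equal size, the sequential (ss1) sums of squares for Dose, Quadratic, Cubic and Quartic equal the sums of squares of the standard orthogonal polynomial contrasts among the group means.

```python
# Illness scores from the Lotus data step, keyed by Dose.
groups = {
    0: [101, 101, 101, 104, 104, 105, 110, 111, 111, 113,
        114, 79, 89, 91, 94, 95, 96, 99, 99, 99],
    10: [100, 65, 65, 67, 68, 80, 81, 82, 85, 87,
         87, 88, 88, 91, 92, 94, 95, 94, 96, 96],
    20: [64, 75, 75, 76, 77, 79, 79, 80, 80, 81,
         81, 81, 82, 83, 83, 85, 87, 88, 90, 96],
    30: [100, 105, 108, 80, 82, 85, 87, 87, 87, 89,
         90, 90, 92, 92, 92, 95, 95, 97, 98, 99],
    40: [101, 102, 102, 105, 108, 109, 112, 119, 119, 123,
         82, 89, 92, 94, 94, 95, 95, 97, 98, 99],
}

n = 20                                        # scores per dose group
means = [sum(v) / len(v) for v in groups.values()]
grand = sum(sum(v) for v in groups.values()) / (n * len(groups))

# One-way ANOVA between-groups (Model) sum of squares, 4 df.
ss_between = sum(n * (m - grand) ** 2 for m in means)

# Orthogonal polynomial coefficients for five equally spaced levels.
contrasts = {
    "Dose": [-2, -1, 0, 1, 2],
    "Quadratic": [2, -1, -2, -1, 2],
    "Cubic": [-1, 2, 0, -2, 1],
    "Quartic": [1, -4, 6, -4, 1],
}


def contrast_ss(coefs):
    """1-df sum of squares for a contrast among equal-n group means."""
    estimate = sum(c * m for c, m in zip(coefs, means))
    return n * estimate ** 2 / sum(c ** 2 for c in coefs)


trend_ss = {name: contrast_ss(c) for name, c in contrasts.items()}

for name, value in trend_ss.items():
    print(f"{name:<10} {value:12.6f}")
print(f"{'Sum':<10} {sum(trend_ss.values()):12.6f}")
print(f"{'Model SS':<10} {ss_between:12.6f}")
```

    The four single-df components add up to the 4-df Model sum of squares, which is the sense in which the ANOVA is a regression analysis here.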