Codes Samples of 524

  • 8/8/2019 Codes Samples of 524

    1/23

    Binorm.sas

    * Produce a 3-D plot of a bivariate-normal frequency function;
    * Generate data points for the 3-D plot;
    data fxyz;
      rho = 0.50;
      pi = arcos(-1);
      k = 1/(2*pi*sqrt(1-rho**2));
      do x = -3 to 3 by 0.1;
        do y = -3 to 3 by 0.1;
          * Standard bivariate normal density: cross term is -2*rho*x*y and the exponent is divided by 2*(1-rho**2);
          fxy = k*exp(-(x**2 - 2*rho*x*y + y**2)/(2*(1-rho**2)));
          output;
        end;
      end;
      label x='x'
            y='y'
            fxy='f(x,y)';
    run;

    * Examine the dataset generated;
    proc print data=fxyz;
    run;

    * Use proc G3D to do the 3-dimensional plot;
    proc g3d data=fxyz;
      title "Bivariate Normal Density Plot";
      plot y*x=fxy;
    run;
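
    For reference, the textbook bivariate normal density with zero means, unit variances and correlation \rho is written out below. This is a worked equation added for checking, not part of the original listing. Its value at the origin for \rho = 0.5 matches the largest tick (0.184) on the f(x,y) axis of the plot.

    ```latex
    f(x,y) = \frac{1}{2\pi\sqrt{1-\rho^{2}}}
             \exp\!\left\{-\frac{x^{2} - 2\rho x y + y^{2}}{2\,(1-\rho^{2})}\right\},
    \qquad
    f(0,0) = \frac{1}{2\pi\sqrt{1-0.5^{2}}} \approx 0.184 .
    ```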

    [Figure: G3D surface plot "Bivariate Normal Density Plot"; x and y axes from -3 to 3, vertical axis f(x,y) from 0.000 to 0.184.]

    qq.sas

    * Generate Normal, LogNormal, Uniform and Cauchy samples;
    * Look at their Q-Q plots;

    * Normal distribution with mean 5, std 10;
    data mynorm;
      do case=1 to 200;
        x = 5 + 10*rannor(0);
        output;
      end;
      drop case;
    run;

    proc univariate data=mynorm;
      qqplot x / normal(mu=5 sigma=10) square;
      histogram;
    run;

    The UNIVARIATE Procedure

    Variable: x

    Moments

    N                        200    Sum Weights                200
    Mean              4.83364778    Sum Observations    966.729556
    Std Deviation     10.0943517    Variance            101.895936
    Skewness          -0.1078659    Kurtosis             0.9382263
    Uncorrected SS    24950.1214    Corrected SS        20277.2913
    Coeff Variation   208.835069    Std Error Mean      0.71377845

    Basic Statistical Measures

        Location                    Variability

    Mean     4.833648     Std Deviation           10.09435
    Median   5.292006     Variance               101.89594
    Mode     .            Range                   72.51388
                          Interquartile Range     12.23499

    Tests for Location: Mu0=0

    Test -Statistic- -----p Value------

    Student's t t 6.771916 Pr > |t| = |M| = |S|
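
    The Student's t statistic can be checked by hand from the Moments table: the standard error of the mean is s/\sqrt{n}, and the statistic for Mu0=0 is the sample mean divided by that standard error. A worked check using the printed values:

    ```latex
    \mathrm{SE}(\bar{x}) = \frac{s}{\sqrt{n}} = \frac{10.0943517}{\sqrt{200}} \approx 0.71378,
    \qquad
    t = \frac{\bar{x} - \mu_0}{\mathrm{SE}(\bar{x})} = \frac{4.83364778 - 0}{0.71377845} \approx 6.7719 .
    ```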

    Extreme Observations

    ------Lowest----- -----Highest-----

       Value    Obs       Value    Obs

    -33.2277     46     23.8457     19
    -19.4380     99     26.8322    162
    -18.6544    188     29.6161    171
    -16.6733    137     31.1100     42
    -14.6448     51     39.2862    192

    * Skewed: Log-Normal;
    data myln;
      do case=1 to 200;
        x = exp(0.5+normal(0));
        output;
      end;
      drop case;
    run;

    proc univariate data=myln;
      histogram;
      qqplot x / square;
    run;

    The UNIVARIATE Procedure
    Variable: x

    Moments

    N                        200    Sum Weights                200
    Mean              2.64964615    Sum Observations    529.929229
    Std Deviation     2.83819505    Variance            8.05535114
    Skewness             2.95259    Kurtosis            11.9228537
    Uncorrected SS    3007.13982    Corrected SS        1603.01488
    Coeff Variation   107.116003    Std Error Mean       0.2006907

    Basic Statistical Measures

        Location                    Variability

    Mean     2.649646     Std Deviation            2.83820
    Median   1.668506     Variance                 8.05535
    Mode     .            Range                   20.26777
                          Interquartile Range      2.31833

    [Figures, x range matching the normal sample (mynorm): histogram of x, Percent (0 to 35) against x from -32 to 40; normal Q-Q plot of x (-40 to 40) against Normal Quantiles (-3 to 3). Both carry the leftover title "Bivariate Normal Density Plot".]

    Tests for Location: Mu0=0

    Test           -Statistic-    -----p Value------

    Student's t    t  13.20264    Pr > |t| = |M| = |S|
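
    Because the data step sets x = exp(0.5 + Z) with Z standard normal, the population median and mean follow from the lognormal formulas with \mu = 0.5 and \sigma = 1. The sample median (1.668506) and sample mean (2.64964615) in the output are close to these values. A short worked check:

    ```latex
    \operatorname{median}(X) = e^{\mu} = e^{0.5} \approx 1.6487,
    \qquad
    \mathrm{E}[X] = e^{\mu + \sigma^{2}/2} = e^{0.5 + 0.5} = e \approx 2.7183 .
    ```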

      end;
      drop case;
    run;

    proc univariate data=myunif;
      histogram;
      qqplot x / square;
    run;

    The UNIVARIATE Procedure
    Variable: x

    Moments

    N                        200    Sum Weights                200
    Mean              2.49579535    Sum Observations     499.15907
    Std Deviation     1.43465793    Variance            2.05824338
    Skewness          -0.0994752    Kurtosis             -1.189572
    Uncorrected SS    1655.38932    Corrected SS        409.590433
    Coeff Variation   57.4829957    Std Error Mean      0.10144564

    Basic Statistical Measures

        Location                    Variability

    Mean     2.495795     Std Deviation            1.43466
    Median   2.516562     Variance                 2.05824
    Mode     .            Range                    4.95069
                          Interquartile Range      2.33376

    Tests for Location: Mu0=0

    Test -Statistic- -----p Value------

    Student's t t 24.60229 Pr > |t| = |M| = |S|

    * Long tailed: Cauchy;
    data myc;
      do case=1 to 200;
        x = rancau(-1);
        output;
      end;
      drop case;
    run;

    proc univariate data=myc;
      histogram;
      qqplot x / square;
    run;

    The UNIVARIATE Procedure
    Variable: x

    Moments

    N                        200    Sum Weights                200
    Mean              34.5561688    Sum Observations    6911.23377
    Std Deviation     449.957706    Variance            202461.937
    Skewness          14.1160694    Kurtosis            199.499983
    Uncorrected SS    40528751.3    Corrected SS        40289925.5
    Coeff Variation    1302.1053    Std Error Mean      31.8168145

    Basic Statistical Measures

        Location                    Variability

    Mean     34.55617     Std Deviation          449.95771
    Median    0.39643     Variance                  202462
    Mode     .            Range                       6377
                          Interquartile Range      2.98650

    Tests for Location: Mu0=0

    Test -Statistic- -----p Value------

    Student's t    t  1.086098    Pr > |t|     0.2787
    Sign           M        19    Pr >= |M|    0.0087
    Signed Rank    S      2660    Pr >= |S|    0.0010

    Quantiles (Definition 5)

    Quantile         Estimate

    100% Max      6362.200287
    99%            137.762519
    95%             11.739279
    90%              5.198408
    75% Q3           2.219027
    50% Median       0.396427
    25% Q1          -0.767474
    10%             -2.325776
    5%              -5.558030
    1%             -11.176908
    0% Min         -15.275947

    [Figures, x range matching the uniform sample (myunif): histogram of x, Percent (0 to 15.0) against x from 0 to 4.8; normal Q-Q plot of x (0 to 5) against Normal Quantiles (-3 to 3). Both carry the leftover title "Bivariate Normal Density Plot".]

    The UNIVARIATE Procedure
    Variable: x

    Extreme Observations

    ------Lowest------    ------Highest------

        Value    Obs          Value    Obs

    -15.27595      6        49.3172     95
    -11.43931     84        75.5253     41
    -10.91451     37       113.4744    151
     -9.20129    114       162.0507     19
     -9.05907    112      6362.2003    131

    [Figures, x range matching the Cauchy sample (myc): histogram of x, Percent (0 to 100) against x from 0 to 6600; normal Q-Q plot of x (-1000 to 7000) against Normal Quantiles (-3 to 3). Both carry the leftover title "Bivariate Normal Density Plot".]

    Chi.sas

    * Chi-Square Plot;

    *%let dist=uniform;
    *%let dist=rancau;
    %let dist=normal;
    %let nobs=50;
    %let nvar=5;

    title "ChiSquare QQ plot";
    data cqtest;
      drop i;
      do i=1 to &nobs;
        x1 = &dist(123425535);
        x2 = &dist(123425535) + x1;
        x3 = &dist(123425535) + x1 - x2;
        x4 = &dist(123425535) - x1 + x2;
        x5 = &dist(123425535) - x1 + x2 + x3;
        output;
      end;
    run;

    %cqplot(data=cqtest, var=x1-x5, nvar=5);
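
    The %cqplot macro itself is not reproduced in this listing, so what follows is only a sketch of the usual construction behind a chi-square Q-Q plot, and an assumption about what the macro computes. The squared Mahalanobis distance of each observation from the sample mean is computed with the sample covariance matrix S, the distances are sorted, and they are plotted against chi-square quantiles with p degrees of freedom (here p = 5 variables and n = 50 observations). In the notation below, \chi^{2}_{p}(\alpha) denotes the \alpha-quantile of the chi-square distribution; the plotting position (i - 1/2)/n is one common choice and is also an assumption.

    ```latex
    d_i^{2} = (\mathbf{x}_i - \bar{\mathbf{x}})^{\top} \mathbf{S}^{-1} (\mathbf{x}_i - \bar{\mathbf{x}}),
    \qquad i = 1, \dots, n,
    \\[4pt]
    \text{plot } d_{(i)}^{2} \text{ against } q_i = \chi^{2}_{p}\!\left(\frac{i - 1/2}{n}\right),
    \qquad d_{(1)}^{2} \le \dots \le d_{(n)}^{2} .
    ```

    Under multivariate normality the points should fall close to a straight line through the origin with slope 1.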

    [Figures, both titled "ChiSquare QQ plot": Squared Distance against ChiSquare Quantile (0 to 16); Deviation From ChiSquare against ChiSquare Quantile (0 to 16).]

    weibull.sas

    data failures;
      input time @@;
      label time='Time in Months';
      datalines;
    29.42 32.14 30.58 27.50 26.08 29.06 25.10 31.34
    29.14 33.96 30.64 27.32 29.86 26.28 29.68 33.76
    29.32 30.82 27.26 27.92 30.92 24.64 32.90 35.46
    30.28 28.36 25.86 31.36 25.26 36.32 28.58 28.88
    26.72 27.42 29.02 27.54 31.60 33.46 26.78 27.82
    29.18 27.94 27.66 26.42 31.00 26.64 31.44 32.52
    ;
    run;

    symbol v=plus;
    title 'Three-Parameter Weibull Q-Q Plot for Failure Times';
    proc capability data=failures noprint;
      qqplot time / weibull(c=est theta=est sigma=est)
                    cframe = ligr
                    square
                    href   = 0.5 1 1.5 2
                    vref   = 25 27.5 30 32.5 35
                    chref  = ywh
                    cvref  = ywh;
    run;

    symbol v=plus;
    title 'Two-Parameter Weibull Q-Q Plot for Failure Times';
    proc capability data=failures noprint;
      qqplot time / weibull2(theta=24 c=est sigma=est)
                    square
                    cframe = ligr
                    href   = -4 to 1
                    vref   = 0 to 2.5 by 0.5
                    chref  = pay
                    cvref  = pay;
    run;

    [Figure: "Three-Parameter Weibull Q-Q Plot for Failure Times"; Time in Months (22.5 to 37.5) against Weibull Quantiles (C=1.98782), 0 to 2.5. Weibull Line: Threshold=24.188, Scale=5.8285.]

    [Figure: "Two-Parameter Weibull Q-Q Plot for Failure Times"; Log Time in Months (-0.5 to 3.0) against log-Weibull Quantiles (Theta=24), -5 to 2. log-Weibull Line: Shape=2.082, Scale=6.051499.]

    lognormal.sas

    data rods;
      input diameter @@;
      label diameter='Diameter in mm';
      datalines;
    5.501 5.251 5.404 5.366 5.445
    5.576 5.607 5.200 5.977 5.177
    5.332 5.399 5.661 5.512 5.252
    5.404 5.739 5.525 5.160 5.410
    5.823 5.376 5.202 5.470 5.410
    5.394 5.146 5.244 5.309 5.480
    5.388 5.399 5.360 5.368 5.394
    5.248 5.409 5.304 6.239 5.781
    5.247 5.907 5.208 5.143 5.304
    5.603 5.164 5.209 5.475 5.223
    ;
    run;

    ****** Normal Probability Plot;
    symbol v=plus;
    title 'Normal Probability Plot for Diameters';
    proc capability data=rods noprint;
      probplot diameter / cframe = ligr;
    run;

    ****** LogNormal Probability Plot;
    symbol v=plus height=3.5pct;
    title 'Lognormal Probability Plot for Diameters';
    proc capability data=rods noprint;
      probplot diameter / lognormal(theta=est zeta=est sigma=0.2 0.5 0.8)
                          href   = 95
                          lhref  = 1
                          chref  = red
                          cframe = ligr
                          square;
    run;

    symbol v=plus height=3.5pct;
    title 'Lognormal Probability Plot for Diameters';
    proc capability data=rods noprint;
      probplot diameter / lognormal(theta=est zeta=est sigma=est color=yellow)
                          href   = 95
                          lhref  = 1
                          chref  = red
                          cframe = ligr
                          square;
    run;

    symbol v=plus height=3.5pct;
    legend2 frame cframe=ligr cborder=black position=center;
    title 'Lognormal Probability Plot for Diameters';
    proc capability data=rods noprint;
      probplot diameter / lognormal(sigma=0.5 theta=est zeta=est color=yellow w=2)
                          square
                          pctlminor
                          href      = 95
                          lhref     = 2
                          hreflabel = '95%'
                          vref      = 5.8 to 6.0 by 0.1
                          lvref     = 3
                          cframe    = ligr
                          legend    = legend2
                          chref     = red
                          cvref     = blue;
    run;

    [Figure: "Normal Probability Plot for Diameters"; Diameter in mm (5.0 to 6.4) against Normal Percentiles (1 to 99).]

    [Figures: three panels of "Lognormal Probability Plot for Diameters"; Diameter in mm (5.0 to 6.4) against Lognormal Percentiles (1 to 99) for Sigma=0.2, Sigma=0.5 and Sigma=0.8. Each panel is annotated "Lognormal Line: Threshold=4.4312, Scale=-0.032".]

    summary.sas

    *Bowei Xi;
    *Examine the data set T1-2.DAT numerically and graphically;

    data ex1_5;
      infile 'U:\T1-2.DAT';
      input density smd scd;

    proc print data=ex1_5;
    run;

    *Examine the variables individually;
    proc univariate data=ex1_5 plot;
      title 'More Descriptive Statistics';
      var density smd scd;
      histogram density;
    run;

    *Correlation coefficient;
    proc corr data=ex1_5;
      var density smd scd;
    run;

    *Scatter Plot;
    proc gplot data=ex1_5;
      plot density*smd density*scd smd*scd;
    run;

    The UNIVARIATE Procedure
    Variable: density

    Moments

    N                         41    Sum Weights                 41
    Mean              0.81185366    Sum Observations        33.286
    Std Deviation     0.03556091    Variance            0.00126458
    Skewness          2.02080081    Kurtosis            9.15445997
    Uncorrected SS     27.073944    Corrected SS        0.05058312
    Coeff Variation   4.38021136    Std Error Mean      0.00555368

    Basic Statistical Measures

        Location                    Variability

    Mean     0.811854     Std Deviation            0.03556
    Median   0.815000     Variance                 0.00126
    Mode     0.802000     Range                    0.21300
                          Interquartile Range      0.03100

    NOTE: The mode displayed is the smallest of 2 modes with a count of 3.

    [Figures from lognormal.sas: two further panels of "Lognormal Probability Plot for Diameters"; Diameter in mm (5.0 to 6.4) against Lognormal Percentiles (1 to 99). The first, with Sigma=0.649838, is annotated "Lognormal Line: Threshold=5.0689, Scale=-1.236". The second, with Sigma=0.5 and a reference line labelled 95%, is annotated "Lognormal Line: Threshold=5.004, Scale=-1.003".]

    Tests for Location: Mu0=0

    Test -Statistic- -----p Value------

    Student's t t 146.183 Pr > |t| = |M| = |S|

    [Figure: line-printer "Normal Probability Plot" for density; vertical axis from 0.75 to 0.97, normal quantiles from -2 to +2.]

    The UNIVARIATE Procedure
    Variable: smd

    Moments

    N                         41    Sum Weights                 41
    Mean              120.953415    Sum Observations       4959.09
    Std Deviation     7.70202233    Variance             59.321148
    Skewness          -0.2676396    Kurtosis            -0.7087529
    Uncorrected SS    602191.715    Corrected SS        2372.84592
    Coeff Variation   6.36775932    Std Error Mean       1.2028538

    Basic Statistical Measures

        Location                    Variability

    Mean     120.9534     Std Deviation            7.70202
    Median   121.4100     Variance                59.32115
    Mode     115.1000     Range                   31.59000
                          Interquartile Range     11.60000

    NOTE: The mode displayed is the smallest of 2 modes with a count of 2.

    Tests for Location: Mu0=0

    Test -Statistic- -----p Value------

    Student's t t 100.5554 Pr > |t| = |M| = |S|

    Quantile       Estimate

    100% Max         135.10
    99%              135.10
    95%              131.50
    90%              130.80
    75% Q3           126.70
    50% Median       121.41
    25% Q1           115.10
    10%              110.60
    5%               109.10
    1%               103.51
    0% Min           103.51

    The UNIVARIATE Procedure
    Variable: smd

    Extreme Observations

    -----Lowest-----    ----Highest----

     Value    Obs       Value    Obs

    103.51     39       130.8      9
    107.40     34       131.0     23
    109.10     17       131.5      6
    109.81     16       131.8      4
    110.60     38       135.1      5

    Stem Leaf       #  Boxplot
     134 1          1     |
     132                  |
     130 58058      5     |
     128 2          1     |
     126 17789      5  +-----+
     124 16578      5  |     |
     122 39         2  |     |
     120 37849      5  *--+--*
     118 033        3  |     |
     116 25         2  |     |
     114 2117       4  +-----+
     112 68         2     |
     110 67         2     |
     108 18         2     |
     106 4          1     |
     104                  |
     102 5          1     |
         ----+----+----+----+

    The UNIVARIATE Procedure
    Variable: scd

    Moments

    N                         41    Sum Weights                 41
    Mean              67.7231707    Sum Observations       2776.65
    Std Deviation     9.79064182    Variance            95.8566672
    Skewness           -0.779458    Kurtosis            -0.8936491
    Uncorrected SS    191877.809    Corrected SS        3834.26669
    Coeff Variation   14.4568568    Std Error Mean      1.52904136

    Basic Statistical Measures

        Location                    Variability

    Mean     67.72317     Std Deviation            9.79064
    Median   70.70000     Variance                95.85667
    Mode     .            Range                   31.40000
                          Interquartile Range     18.36000

    Tests for Location: Mu0=0

    Test           -Statistic-    -----p Value------

    Student's t    t  44.29126    Pr > |t| = |M| = |S|

    3 Variables: density smd scd

    Simple Statistics

    Variable     N         Mean     Std Dev          Sum      Minimum      Maximum

    density     41      0.81185     0.03556     33.28600      0.75800      0.97100
    smd         41    120.95341     7.70202         4959    103.51000    135.10000
    scd         41     67.72317     9.79064         2777     48.93000     80.33000

    Pearson Correlation Coefficients, N = 41
    Prob > |r| under H0: Rho=0

    density smd scd

    density 1.00000 0.61501 0.64696

    [Figures from summary.sas, titled "More Descriptive Statistics": scatter plot of density (0.75 to 0.98) against scd (40 to 90); scatter plot of smd (100 to 140) against scd (40 to 90).]

    corr.sas

    * Sample Correlation Matrix;
    data a1;
      n=100;
      do case=1 to n;
        x1=5+normal(0);
        x2=10+3*normal(0);
        x3=15+5*normal(0);
        output;
      end;
      drop n case;

    *proc print data=a1;
    *run;
    proc corr data=a1 cov;
      var x1 x2 x3;
    run;

    The CORR Procedure

    3 Variables: x1 x2 x3

    Covariance Matrix, DF = 99

                    x1              x2              x3

    x1      1.21142254     -0.13980247     -0.07271637
    x2     -0.13980247      8.35415511      0.81155719
    x3     -0.07271637      0.81155719     21.45089516

    Simple Statistics

    Variable      N        Mean     Std Dev          Sum     Minimum     Maximum

    x1          100     5.04263     1.10065    504.26306     2.00073     7.79359
    x2          100    10.17241     2.89036         1017     2.54370    16.84191
    x3          100    14.38618     4.63151         1439     0.81502    24.69897

    Pearson Correlation Coefficients, N = 100
    Prob > |r| under H0: Rho=0

                  x1          x2          x3

    x1       1.00000    -0.04395    -0.01426
                          0.6642      0.8880

    x2      -0.04395     1.00000     0.06062
              0.6642                  0.5491

    x3      -0.01426     0.06062     1.00000
              0.8880      0.5491
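
    Each Pearson correlation in this output is the corresponding covariance divided by the product of the two standard deviations. A worked check for x1 and x2, using the printed covariance matrix:

    ```latex
    r_{12} = \frac{s_{12}}{\sqrt{s_{11}\, s_{22}}}
           = \frac{-0.13980247}{\sqrt{1.21142254 \times 8.35415511}}
           \approx \frac{-0.13980247}{3.1813}
           \approx -0.04395 .
    ```

    The same calculation recovers the Std Dev column, since \sqrt{1.21142254} \approx 1.10065 for x1.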

    Partial Correlation

    options ls=78;
    title "Partial Correlations - Wechsler Data";
    data wechsler;
      infile "./wechsler.txt";
      input id info sim arith pict;
    run;

    **(I);
    proc glm;
      model info sim = arith pict / nouni;
      manova / printe;
    run;

    **(II);
    proc corr data=wechsler;
      var info sim;
      partial arith pict;
    run;

    **proc corr data=wechsler;var info sim;run;

    The GLM Procedure
    Multivariate Analysis of Variance

    E = Error SSCP Matrix

                    info             sim

    info    269.38421752    208.61081124
    sim     208.61081124     318.7788636

    Partial Correlation Coefficients from the Error SSCP Matrix / Prob > |r|

    DF = 34 info sim

    info 1.000000 0.711879
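
    The partial correlation of info and sim, adjusted for arith and pict, is obtained from the error SSCP matrix E by dividing the off-diagonal entry by the square root of the product of the diagonal entries. A worked check with the printed values:

    ```latex
    r_{\text{info},\,\text{sim}\,\mid\,\text{arith},\,\text{pict}}
      = \frac{e_{12}}{\sqrt{e_{11}\, e_{22}}}
      = \frac{208.61081124}{\sqrt{269.38421752 \times 318.7788636}}
      \approx \frac{208.6108}{293.0426}
      \approx 0.71188 .
    ```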

    info 1.00000 0.71188