8/8/2019 Codes Samples of 524
1/23
Bi norm.sas

* Produce a 3-D plot of a bivariate-normal frequency function;
* Generate data points for the 3-D plot;
data fxyz;
  rho = 0.50;
  pi = arcos(-1);
  k = 1/(2*pi*sqrt(1-rho**2));
  do x = -3 to 3 by 0.1;
    do y = -3 to 3 by 0.1;
      * Standard bivariate normal density: the cross term is -2*rho*x*y and the quadratic form is divided by 2*(1-rho**2);
      fxy = k*exp(-(x**2 - 2*rho*x*y + y**2)/(2*(1-rho**2)));
      output;
    end;
  end;
  label x='x'
        y='y'
        fxy='f(x,y)';
run;

* Examine the dataset generated;
proc print data=fxyz;
run;

* Use proc G3D to do the 3-dimensional plot;
proc g3d data=fxyz;
  title "Bivariate Normal Density Plot";
  plot y*x=fxy;
run;
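
For reference, the density this program is meant to tabulate can be written out explicitly. The block below is the standard bivariate normal density with zero means, unit variances, and correlation rho; the symbol names follow the program (k, rho), and the numerical check of the peak value is a calculation added here, not part of the original handout.

```latex
\[
f(x,y) \;=\; k \,\exp\!\left\{-\,\frac{x^{2} - 2\rho x y + y^{2}}{2\,(1-\rho^{2})}\right\},
\qquad
k \;=\; \frac{1}{2\pi\sqrt{1-\rho^{2}}} .
\]
% Peak value at the origin for rho = 0.5:
\[
f(0,0) \;=\; k \;=\; \frac{1}{2\pi\sqrt{0.75}} \;\approx\; 0.1838 ,
\]
% which agrees with the top tick (0.184) on the f(x,y) axis of the plot.
```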
[Figure: Bivariate Normal Density Plot. 3-D surface of f(x,y) over x and y, each from -3 to 3; f(x,y) axis ticks from 0.000 to 0.184.]
qq.sas

* Generate Normal, LogNormal, Uniform and Cauchy samples;
* Look at their Q-Q plots;

* Normal distribution with mean 5, std 10;
data mynorm;
  do case=1 to 200;
    x = 5 + 10*rannor(0);
    output;
  end;
  drop case;
run;

proc univariate data=mynorm;
  qqplot x / normal(mu=5 sigma=10) square;
  histogram;
run;

The UNIVARIATE Procedure
Variable: x

Moments
N                        200    Sum Weights              200
Mean              4.83364778    Sum Observations  966.729556
Std Deviation     10.0943517    Variance          101.895936
Skewness          -0.1078659    Kurtosis           0.9382263
Uncorrected SS    24950.1214    Corrected SS      20277.2913
Coeff Variation   208.835069    Std Error Mean    0.71377845

Basic Statistical Measures
    Location                    Variability
Mean      4.833648    Std Deviation           10.09435
Median    5.292006    Variance               101.89594
Mode      .           Range                   72.51388
                      Interquartile Range     12.23499

Tests for Location: Mu0=0
Test           -Statistic-    -----p Value------
Student's t    t  6.771916    Pr > |t|   [rest of this table lost in extraction]
Extreme Observations
------Lowest-----    -----Highest-----
   Value    Obs         Value    Obs
-33.2277     46       23.8457     19
-19.4380     99       26.8322    162
-18.6544    188       29.6161    171
-16.6733    137       31.1100     42
-14.6448     51       39.2862    192

*Skewed: Log-Normal;
data myln;
  do case=1 to 200;
    x = exp(0.5+normal(0));
    output;
  end;
  drop case;
run;

proc univariate data=myln;
  histogram;
  qqplot x / square;
run;
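
As a quick sanity check on this sample, the population median and mean of x = exp(0.5 + Z), with Z standard normal, can be worked out by hand and compared with the sample values in the listing. The calculation below uses only the standard lognormal formulas with mu = 0.5 and sigma = 1; it is an added illustration, not part of the original handout.

```latex
\[
X = e^{\,0.5 + Z},\quad Z \sim N(0,1)
\;\;\Longrightarrow\;\;
\operatorname{median}(X) = e^{0.5} \approx 1.649,
\qquad
E[X] = e^{\,0.5 + 1/2} = e \approx 2.718 .
\]
% The listing reports a sample median of 1.668506 and a sample mean of 2.64964615
% for n = 200, both close to these population values. The mean exceeds the median,
% as expected for a right-skewed distribution.
```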

The UNIVARIATE Procedure
Variable: x

Moments
N                        200    Sum Weights              200
Mean              2.64964615    Sum Observations  529.929229
Std Deviation     2.83819505    Variance          8.05535114
Skewness             2.95259    Kurtosis          11.9228537
Uncorrected SS    3007.13982    Corrected SS      1603.01488
Coeff Variation   107.116003    Std Error Mean     0.2006907

Basic Statistical Measures
    Location                    Variability
Mean      2.649646    Std Deviation            2.83820
Median    1.668506    Variance                 8.05535
Mode      .           Range                   20.26777
                      Interquartile Range      2.31833

Tests for Location: Mu0=0
Test           -Statistic-    -----p Value------
Student's t    t  13.20264    Pr > |t|   [rest of this table lost in extraction]

[Figure: histogram for the normal sample (mynorm), printed under the title "Bivariate Normal Density Plot". Percent (0 to 35) against x (-32 to 40).]
[Figure: normal Q-Q plot for the normal sample (mynorm), printed under the title "Bivariate Normal Density Plot". x (-40 to 40) against Normal Quantiles (-3 to 3).]

[...]
  end;
  drop case;
run;

proc univariate data=myunif;
  histogram;
  qqplot x / square;
run;

The UNIVARIATE Procedure
Variable: x

Moments
N                        200    Sum Weights              200
Mean              2.49579535    Sum Observations   499.15907
Std Deviation     1.43465793    Variance          2.05824338
Skewness          -0.0994752    Kurtosis           -1.189572
Uncorrected SS    1655.38932    Corrected SS      409.590433
Coeff Variation   57.4829957    Std Error Mean    0.10144564

Basic Statistical Measures
    Location                    Variability
Mean      2.495795    Std Deviation            1.43466
Median    2.516562    Variance                 2.05824
Mode      .           Range                    4.95069
                      Interquartile Range      2.33376

Tests for Location: Mu0=0
Test           -Statistic-    -----p Value------
Student's t    t  24.60229    Pr > |t|   [rest of this table lost in extraction]

*Long tailed: Cauchy;
data myc;
  do case=1 to 200;
    x = rancau(-1);
    output;
  end;
  drop case;
run;

proc univariate data=myc;
  histogram;
  qqplot x / square;
run;

The UNIVARIATE Procedure
Variable: x

Moments
N                        200    Sum Weights              200
Mean              34.5561688    Sum Observations  6911.23377
Std Deviation     449.957706    Variance          202461.937
Skewness          14.1160694    Kurtosis          199.499983
Uncorrected SS    40528751.3    Corrected SS      40289925.5
Coeff Variation    1302.1053    Std Error Mean    31.8168145

Basic Statistical Measures
    Location                    Variability
Mean      34.55617    Std Deviation          449.95771
Median     0.39643    Variance                  202462
Mode      .           Range                       6377
                      Interquartile Range      2.98650

Tests for Location: Mu0=0
Test           -Statistic-    -----p Value------
Student's t    t  1.086098    Pr > |t|     0.2787
Sign           M        19    Pr >= |M|    0.0087
Signed Rank    S      2660    Pr >= |S|    0.0010

Quantiles (Definition 5)
Quantile         Estimate
100% Max      6362.200287
99%            137.762519
95%             11.739279
90%              5.198408
75% Q3           2.219027
50% Median       0.396427
25% Q1          -0.767474
10%             -2.325776
5%              -5.558030
1%             -11.176908
0% Min         -15.275947

[Figure: histogram for the uniform sample (myunif). Percent (0 to 15.0) against x (0 to 4.8).]
[Figure: normal Q-Q plot for the uniform sample (myunif). x (0 to 5) against Normal Quantiles (-3 to 3).]

The UNIVARIATE Procedure
Variable: x

Extreme Observations
------Lowest------    ------Highest------
    Value    Obs          Value    Obs
-15.27595      6        49.3172     95
-11.43931     84        75.5253     41
-10.91451     37       113.4744    151
 -9.20129    114       162.0507     19
 -9.05907    112      6362.2003    131

[Figure: histogram for the Cauchy sample (myc). Percent (0 to 100) against x (0 to 6600).]
[Figure: normal Q-Q plot for the Cauchy sample (myc). x (-1000 to 7000) against Normal Quantiles (-3 to 3).]

Chi.sas

*Chi-Square Plot;
*%let dist=uniform;
*%let dist=rancau;
%let dist=normal;
%let nobs=50;
%let nvar=5;

title "ChiSquare QQ plot";
data cqtest;
  drop i;
  do i=1 to &nobs;
    x1 = &dist(123425535);
    x2 = &dist(123425535) + x1;
    x3 = &dist(123425535) + x1 - x2;
    x4 = &dist(123425535) - x1 + x2;
    x5 = &dist(123425535) - x1 + x2 + x3;
    output;
  end;
run;

%cqplot(data=cqtest, var=x1-x5, nvar=5);
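
The source does not include the definition of the %cqplot macro, so what it computes is inferred here from the axis labels of its output ("Squared Distance" against "ChiSquare Quantile"). Under that assumption, a chi-square Q-Q plot for p = 5 variables compares ordered squared Mahalanobis distances with chi-square quantiles, as sketched below. The plotting position (i - 1/2)/n is one common convention and is an assumption; the macro may use a different one.

```latex
% Squared generalized (Mahalanobis) distance of observation i,
% with sample mean vector \bar{x} and sample covariance matrix S:
\[
d_i^{2} \;=\; (\mathbf{x}_i - \bar{\mathbf{x}})^{\top}\, S^{-1}\, (\mathbf{x}_i - \bar{\mathbf{x}}),
\qquad i = 1,\dots,n .
\]
% Order the distances, d_{(1)}^2 \le \dots \le d_{(n)}^2, and plot the pairs
\[
\left( q_i ,\; d_{(i)}^{2} \right),
\qquad
q_i \;=\; \chi^{2}_{p}\!\left(\frac{i - \tfrac12}{n}\right),
\quad p = 5,\; n = 50 ,
\]
% where \chi^2_p(u) denotes the u-th quantile of the chi-square distribution
% with p degrees of freedom. For multivariate normal data the points should
% lie close to the line through the origin with slope 1.
```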

[Figure: ChiSquare QQ plot. Squared Distance against ChiSquare Quantile (0 to 16).]
[Figure: ChiSquare QQ plot. Deviation From ChiSquare against ChiSquare Quantile (0 to 16).]

weibull.sas

data failures;
  input time @@;
  label time='Time in Months';
  datalines;
29.42 32.14 30.58 27.50 26.08 29.06 25.10 31.34
29.14 33.96 30.64 27.32 29.86 26.28 29.68 33.76
29.32 30.82 27.26 27.92 30.92 24.64 32.90 35.46
30.28 28.36 25.86 31.36 25.26 36.32 28.58 28.88
26.72 27.42 29.02 27.54 31.60 33.46 26.78 27.82
29.18 27.94 27.66 26.42 31.00 26.64 31.44 32.52
;
run;

symbol v=plus;
title 'Three-Parameter Weibull Q-Q Plot for Failure Times';
proc capability data=failures noprint;
  qqplot time / weibull(c=est theta=est sigma=est)
                cframe = ligr
                square
                href = 0.5 1 1.5 2
                vref = 25 27.5 30 32.5 35
                chref = ywh
                cvref = ywh;
run;

symbol v=plus;
title 'Two-Parameter Weibull Q-Q Plot for Failure Times';
proc capability data=failures noprint;
  qqplot time / weibull2(theta=24 c=est sigma=est)
                square
                cframe = ligr
                href = -4 to 1
                vref = 0 to 2.5 by 0.5
                chref = pay
                cvref = pay;
run;
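
The two QQPLOT statements above differ in which axis transformation makes the Weibull model plot as a straight line. The sketch below states the three-parameter Weibull distribution function and the linear relations behind each plot, using the parameter names from the program (theta for threshold, sigma for scale, c for shape). The plotting positions p_i are left generic, since the source does not state which convention was used.

```latex
% Three-parameter Weibull distribution function:
\[
F(x) \;=\; 1 - \exp\!\left\{-\left(\frac{x-\theta}{\sigma}\right)^{c}\right\},
\qquad x > \theta .
\]
% WEIBULL (shape c fixed at its estimate): plot the ordered data x_{(i)} against
\[
q_i \;=\; \bigl(-\log(1-p_i)\bigr)^{1/c},
\qquad\text{reference line: } x \;=\; \theta + \sigma\, q .
\]
% The line has intercept theta (threshold) and slope sigma (scale).
%
% WEIBULL2 (threshold theta fixed, here theta = 24): plot
% \log(x_{(i)} - \theta) against
\[
w_i \;=\; \log\bigl(-\log(1-p_i)\bigr),
\qquad\text{reference line: } \log(x-\theta) \;=\; \log\sigma + \frac{1}{c}\, w .
\]
% The line has intercept log(sigma) and slope 1/c.
```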

[Figure: Three-Parameter Weibull Q-Q Plot for Failure Times. Time in Months (22.5 to 37.5) against Weibull Quantiles (c=1.98782), 0 to 2.5. Weibull Line: Threshold=24.188, Scale=5.8285.]
[Figure: Two-Parameter Weibull Q-Q Plot for Failure Times. Log Time in Months (-0.5 to 3.0) against log-Weibull Quantiles (Theta=24), -5 to 2. log-Weibull Line: Shape=2.082, Scale=6.051499.]

lognormal.sas

data rods;
  input diameter @@;
  label diameter='Diameter in mm';
  datalines;
5.501 5.251 5.404 5.366 5.445
5.576 5.607 5.200 5.977 5.177
5.332 5.399 5.661 5.512 5.252
5.404 5.739 5.525 5.160 5.410
5.823 5.376 5.202 5.470 5.410
5.394 5.146 5.244 5.309 5.480
5.388 5.399 5.360 5.368 5.394
5.248 5.409 5.304 6.239 5.781
5.247 5.907 5.208 5.143 5.304
5.603 5.164 5.209 5.475 5.223
;
run;

****** Normal Probability Plot;
symbol v=plus;
title 'Normal Probability Plot for Diameters';
proc capability data=rods noprint;
  probplot diameter / cframe = ligr;
run;

****** LogNormal Probability Plot;
symbol v=plus height=3.5pct;
title 'Lognormal Probability Plot for Diameters';
proc capability data=rods noprint;
  probplot diameter / lognormal(theta=est zeta=est sigma=0.2 0.5 0.8)
                      href = 95
                      lhref = 1
                      chref = red
                      cframe = ligr
                      square;
run;

symbol v=plus height=3.5pct;
title 'Lognormal Probability Plot for Diameters';
proc capability data=rods noprint;
  probplot diameter / lognormal(theta=est zeta=est sigma=est color=yellow)
                      href = 95
                      lhref = 1
                      chref = red
                      cframe = ligr
                      square;
run;

symbol v=plus height=3.5pct;
legend2 frame cframe=ligr cborder=black position=center;
title 'Lognormal Probability Plot for Diameters';
proc capability data=rods noprint;
  probplot diameter / lognormal(sigma=0.5 theta=est zeta=est color=yellow w=2)
                      square
                      pctlminor
                      href = 95
                      lhref = 2
                      hreflabel = '95%'
                      vref = 5.8 to 6.0 by 0.1
                      lvref = 3
                      cframe = ligr
                      legend = legend2
                      chref = red
                      cvref = blue;
run;

[Figure: Normal Probability Plot for Diameters. Diameter in mm (5.0 to 6.4) against Normal Percentiles (1 to 99).]
[Figure: Lognormal Probability Plot for Diameters. Diameter in mm (5.0 to 6.4) against Lognormal Percentiles (Sigma=0.2), 1 to 99. Lognormal Line: Threshold=4.4312, Scale=-0.032.]
[Figure: Lognormal Probability Plot for Diameters. Diameter in mm (5.0 to 6.4) against Lognormal Percentiles (Sigma=0.5), 1 to 99. Lognormal Line: Threshold=4.4312, Scale=-0.032.]
[Figure: Lognormal Probability Plot for Diameters. Diameter in mm (5.0 to 6.4) against Lognormal Percentiles (Sigma=0.8), 1 to 99. Lognormal Line: Threshold=4.4312, Scale=-0.032.]

[Figure: Lognormal Probability Plot for Diameters, from lognormal.sas with sigma=est. Diameter in mm (5.0 to 6.4) against Lognormal Percentiles (Sigma=.649838), 1 to 99. Lognormal Line: Threshold=5.0689, Scale=-1.236.]
[Figure: Lognormal Probability Plot for Diameters, from lognormal.sas, with a reference line labeled 95%. Diameter in mm (5.0 to 6.4) against Lognormal Percentiles, 1 to 99. Lognormal Line: Threshold=5.004, Scale=-1.003.]

summary.sas

*Bowei Xi;
*Examine the data set T1-2.DAT numerically and graphically;
data ex1_5;
  infile 'U:\T1-2.DAT';
  input density smd scd;

proc print data=ex1_5;
run;

*Examine the variables individually;
proc univariate data=ex1_5 plot;
  title 'More Descriptive Statistics';
  var density smd scd;
  histogram density;
run;

*Correlation coefficient;
proc corr data=ex1_5;
  var density smd scd;
run;

*Scatter Plot;
proc gplot data=ex1_5;
  plot density*smd density*scd smd*scd;
run;

The UNIVARIATE Procedure
Variable: density

Moments
N                         41    Sum Weights               41
Mean              0.81185366    Sum Observations      33.286
Std Deviation     0.03556091    Variance          0.00126458
Skewness          2.02080081    Kurtosis          9.15445997
Uncorrected SS     27.073944    Corrected SS      0.05058312
Coeff Variation   4.38021136    Std Error Mean    0.00555368

Basic Statistical Measures
    Location                    Variability
Mean      0.811854    Std Deviation            0.03556
Median    0.815000    Variance                 0.00126
Mode      0.802000    Range                    0.21300
                      Interquartile Range      0.03100
NOTE: The mode displayed is the smallest of 2 modes with a count of 3.

Tests for Location: Mu0=0
Test           -Statistic-    -----p Value------
Student's t    t   146.183    Pr > |t|   [rest of this table lost in extraction]

[Figure: line-printer Normal Probability Plot for density. Vertical axis from 0.75 to 0.97, horizontal axis from -2 to +2; column positions of the plotted symbols were lost in extraction.]

The UNIVARIATE Procedure
Variable: smd

Moments
N                         41    Sum Weights               41
Mean              120.953415    Sum Observations     4959.09
Std Deviation     7.70202233    Variance           59.321148
Skewness          -0.2676396    Kurtosis          -0.7087529
Uncorrected SS    602191.715    Corrected SS      2372.84592
Coeff Variation   6.36775932    Std Error Mean     1.2028538

Basic Statistical Measures
    Location                    Variability
Mean      120.9534    Std Deviation            7.70202
Median    121.4100    Variance                59.32115
Mode      115.1000    Range                   31.59000
                      Interquartile Range     11.60000
NOTE: The mode displayed is the smallest of 2 modes with a count of 2.

Tests for Location: Mu0=0
Test           -Statistic-    -----p Value------
Student's t    t  100.5554    Pr > |t|   [rest of this table lost in extraction]

Quantile       Estimate
100% Max         135.10
99%              135.10
95%              131.50
90%              130.80
75% Q3           126.70
50% Median       121.41
25% Q1           115.10
10%              110.60
5%               109.10
1%               103.51
0% Min           103.51

The UNIVARIATE Procedure
Variable: smd

Extreme Observations
-----Lowest-----    ----Highest----
 Value    Obs        Value    Obs
103.51     39        130.8      9
107.40     34        131.0     23
109.10     17        131.5      6
109.81     16        131.8      4
110.60     38        135.1      5

Stem Leaf     #  Boxplot
 134 1        1     |
 132                |
 130 58058    5     |
 128 2        1     |
 126 17789    5  +-----+
 124 16578    5  |     |
 122 39       2  |     |
 120 37849    5  *--+--*
 118 033      3  |     |
 116 25       2  |     |
 114 2117     4  +-----+
 112 68       2     |
 110 67       2     |
 108 18       2     |
 106 4        1     |
 104                |
 102 5        1     |
     ----+----+----+----+

The UNIVARIATE Procedure
Variable: scd

Moments
N                         41    Sum Weights               41
Mean              67.7231707    Sum Observations     2776.65
Std Deviation     9.79064182    Variance          95.8566672
Skewness           -0.779458    Kurtosis          -0.8936491
Uncorrected SS    191877.809    Corrected SS      3834.26669
Coeff Variation   14.4568568    Std Error Mean    1.52904136

Basic Statistical Measures
    Location                    Variability
Mean      67.72317    Std Deviation            9.79064
Median    70.70000    Variance                95.85667
Mode      .           Range                   31.40000
                      Interquartile Range     18.36000

Tests for Location: Mu0=0
Test           -Statistic-    -----p Value------
Student's t    t  44.29126    Pr > |t|   [rest of this table lost in extraction]

3 Variables: density smd scd

Simple Statistics
Variable     N         Mean     Std Dev         Sum      Minimum      Maximum
density     41      0.81185     0.03556    33.28600      0.75800      0.97100
smd         41    120.95341     7.70202        4959    103.51000    135.10000
scd         41     67.72317     9.79064        2777     48.93000     80.33000

Pearson Correlation Coefficients, N = 41
Prob > |r| under H0: Rho=0
            density        smd        scd
density     1.00000    0.61501    0.64696
[rest of this table lost in extraction]

[Figure: scatter plot from summary.sas, titled "More Descriptive Statistics". density (0.75 to 0.98) against scd (40 to 90).]
[Figure: scatter plot from summary.sas, titled "More Descriptive Statistics". smd (100 to 140) against scd (40 to 90).]

corr.sas

* Sample Correlation Matrix;
data a1;
  n=100;
  do case=1 to n;
    x1=5+normal(0);
    x2=10+3*normal(0);
    x3=15+5*normal(0);
    output;
  end;
  drop n case;

*proc print data=a1;
*run;
proc corr data=a1 cov;
  var x1 x2 x3;
run;

The CORR Procedure

3 Variables: x1 x2 x3

Covariance Matrix, DF = 99
               x1             x2             x3
x1     1.21142254    -0.13980247    -0.07271637
x2    -0.13980247     8.35415511     0.81155719
x3    -0.07271637     0.81155719    21.45089516

Simple Statistics
Variable     N        Mean    Std Dev          Sum    Minimum     Maximum
x1         100     5.04263    1.10065    504.26306    2.00073     7.79359
x2         100    10.17241    2.89036         1017    2.54370    16.84191
x3         100    14.38618    4.63151         1439    0.81502    24.69897

Pearson Correlation Coefficients, N = 100
Prob > |r| under H0: Rho=0
            x1          x2          x3
x1     1.00000    -0.04395    -0.01426
                    0.6642      0.8880
x2    -0.04395     1.00000     0.06062
        0.6642                  0.5491
x3    -0.01426     0.06062     1.00000
        0.8880      0.5491
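
The covariance and correlation tables in this listing are linked by the usual standardization, which can be checked by hand. The worked example below uses only the x1, x2 entries printed in the listing; the arithmetic is an added illustration, not part of the original handout.

```latex
\[
r_{jk} \;=\; \frac{s_{jk}}{\sqrt{s_{jj}\, s_{kk}}},
\qquad
r_{12} \;=\; \frac{-0.13980247}{\sqrt{1.21142254 \times 8.35415511}}
\;\approx\; \frac{-0.13980}{3.18126}
\;\approx\; -0.0439 .
\]
% This matches the printed Pearson coefficient -0.04395 up to rounding of the
% intermediate values. Likewise \sqrt{1.21142254} \approx 1.10065 reproduces
% the Std Dev of x1 in the Simple Statistics table.
```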

Partial Correlation

options ls=78;
title "Partial Correlations - Wechsler Data";
data wechsler;
  infile "./wechsler.txt";
  input id info sim arith pict;
run;

**(I);
proc glm;
  model info sim = arith pict / nouni;
  manova / printe;
run;

**(II);
proc corr data=wechsler;
  var info sim;
  partial arith pict;
run;

**proc corr data=wechsler;var info sim;run;

The GLM Procedure
Multivariate Analysis of Variance

E = Error SSCP Matrix
                info             sim
info    269.38421752    208.61081124
sim     208.61081124     318.7788636

Partial Correlation Coefficients from the Error SSCP Matrix / Prob > |r|
DF = 34        info         sim
info       1.000000    0.711879
info 1.00000 0.71188
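
The partial correlation of info and sim given arith and pict can be recovered directly from the Error SSCP matrix printed by PROC GLM, which shows why approaches (I) and (II) agree. The calculation below uses only the matrix entries in the listing; the notation e_jk for the SSCP entries is introduced here for illustration.

```latex
\[
r_{\text{info},\,\text{sim}\,\cdot\,\text{arith},\,\text{pict}}
\;=\; \frac{e_{12}}{\sqrt{e_{11}\, e_{22}}}
\;=\; \frac{208.61081124}{\sqrt{269.38421752 \times 318.7788636}}
\;\approx\; \frac{208.6108}{293.0427}
\;\approx\; 0.7119 .
\]
% This agrees with the value 0.711879 reported by PROC GLM (MANOVA / PRINTE)
% and with 0.71188 from PROC CORR with the PARTIAL statement.
```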