(LAB SHEET)
By
Dr. WAEL KHEDR
May 2017
Chapter(1): Introduction to MATLAB
Chapter(2): Data
(1) Nominal data
B = nominal(X, labels) creates a nominal array B from the array X, labeling the sorted unique values in X according to labels.
EX:
>> X = {'b' 'b' 'g' ; 'g' 'r' 'b' ; 'b' 'r' 'g' }
X =
    'b'    'b'    'g'
    'g'    'r'    'b'
    'b'    'r'    'g'
>> B = nominal(X, {'blue','green','red'})
B =
     blue      blue      green
     green     red       blue
     blue      red       green
(2) Operations on nominal Data
Example : compute the Mode and Entropy
► Compute the Mode for the nominal data matrix B as follows:
Step(1) >> [K, I] = hist(B(:)) % histogram returns the bins with their frequencies
K =
     4     3     2      % frequencies
I = blue green red      % bins
Step(2) >> [v, indx] = max(K) % determine the index of the maximum frequency
v = 4
indx = 1
Step(3) >> modeB = I(indx) % the bin label with the highest frequency ('mod' is a built-in function, so a different name is used)
modeB = blue
The mode of matrix B is "blue".
► To compute the Entropy for the nominal data matrix B:
Step(1) We have the frequencies K = [4 3 2]
Step(2) >> s = sum(K)
s = 9
Step(3) >> p = K/s % compute the proportions of the frequencies ('pi' is a built-in constant, so a different name is used)
p = 0.4444    0.3333    0.2222
Step(4) >> E = -sum(p .* log2(p))
E = 1.5305
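The mode-and-entropy arithmetic above can be cross-checked with a small Python/NumPy sketch (the labels and frequencies are taken from the MATLAB example; variable names are illustrative):

```python
import numpy as np

# Frequencies of the nominal labels in B, as found by hist above.
labels = np.array(['blue', 'green', 'red'])
K = np.array([4, 3, 2])

# Mode: the label with the maximum frequency.
mode_label = labels[np.argmax(K)]       # 'blue'

# Entropy: E = -sum(p * log2(p)) over the label proportions.
p = K / K.sum()
entropy = -np.sum(p * np.log2(p))       # ≈ 1.5305
```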
(3) Ordinal data
B = ordinal(X, labels) creates an ordinal array object B from the array X.
ordinal creates the levels of B from the sorted unique values in X and labels
them according to labels (if labels is omitted, default labels are created).
EX: >> quality = ordinal([1 2 3 3 1 3],{'low' 'medium' 'high'})
quality =
low medium high high low high
► Compute the Median and percentiles for ordinal data as follows (using the numeric codes underlying B):
>> median([1 2 3 3 1 3])
ans = 2.5000
%% Y = prctile(X,P) returns percentiles of the values in X. P is a scalar or a vector of percent values.
Compute the 10th, 25th, 50th, 75th and 90th percentiles of the frequency vector K = [4 3 2] from above:
>> prctile(K, [10 25 50 75 90])
ans =
    2.0000    2.2500    3.0000    3.7500    4.0000
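For comparison, NumPy's `np.percentile` computes percentiles too, but its default interpolation differs from MATLAB's `prctile` (which treats the sorted values as the (0.5/n, 1.5/n, …) quantiles), so the numbers below do not match the MATLAB output above; this is an expected implementation difference, not an error:

```python
import numpy as np

K = np.array([4, 3, 2])
# NumPy's default 'linear' method interpolates on index (n-1)*p/100
# over the sorted values [2, 3, 4].
q = np.percentile(K, [10, 25, 50, 75, 90])
print(q)   # [2.2  2.5  3.   3.5  3.8]
```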
(4) Interval/Scale data
>> x = [1 2 3 4 5 6]   % or: >> x = 1:6
>> m = mean(x)
m = 3.5000
>> v = var(x)
v = 3.5000
>> st = std(x)
st = 1.8708
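A NumPy sketch of the same statistics; note that `ddof=1` is needed to match MATLAB's `var` and `std`, which divide by n-1 by default:

```python
import numpy as np

x = np.arange(1, 7)      # same values as MATLAB's 1:6
m  = x.mean()            # 3.5
v  = x.var(ddof=1)       # 3.5   (sample variance, divides by n-1)
st = x.std(ddof=1)       # ≈ 1.8708
```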
-----------------------------------------------------------------------------------------------------------------
Data matrix:
  Thickness    Load    Distance    Projection of y load    Projection of x load
     1.1       2.2      16.22             6.25                    12.65
     1.2       2.7      15.22             5.27                    10.23
>> D = [ 10.23 5.27 15.22 2.7 1.2 ; 12.65 6.25 16.22 2.2 1.1 ]
D = 10.2300 5.2700 15.2200 2.7000 1.2000
12.6500 6.2500 16.2200 2.2000 1.1000
► Aggregation : Combining two or more attributes (or objects) into a
single attribute (or object)
► By using a Histogram:
>> x = 1:100;
>> m = hist(x)
m = 10 10 10 10 10 10 10 10 10 10
>> n = hist(x,6)
n = 17 17 16 17 16 17
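`np.histogram` gives similar counts in Python, though it places bin edges differently than MATLAB's `hist` (which uses bin centers), so the 6-bin counts differ slightly:

```python
import numpy as np

x = np.arange(1, 101)
m, _ = np.histogram(x, bins=10)   # 10 equal-width bins over [1, 100]
n, _ = np.histogram(x, bins=6)

print(m)   # [10 10 10 10 10 10 10 10 10 10]
print(n)   # [17 16 17 16 17 17]  (MATLAB's center-based bins give 17 17 16 17 16 17)
```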
[Figures: histograms of x = 1:100 with 10 bins and with 6 bins]
► Aggregation
► Sampling
► Dimensionality Reduction
► Feature subset selection
► Feature creation
► Discretization and Binarization
► Attribute Transformation
► Data Sampling : is the main technique employed for data selection
>> x =1:100;
>> for i = 1:2:50      % take every second element of x
       y((i+1)/2) = x(i);
   end
>> y
y =
  Columns 1 through 16
     1   3   5   7   9  11  13  15  17  19  21  23  25  27  29  31
  Columns 17 through 25
    33  35  37  39  41  43  45  47  49
Sampling with replacement:
>> data = randn(1000,1000);
>> nRows = size(data,1);              % number of rows in data
>> nSample = 100;                     % number of rows to draw
>> rndIDX = randi(nRows, nSample, 1); % row indices drawn with replacement
>> newSample = data(rndIDX, :);
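The same row sampling with replacement can be sketched in NumPy (the matrix size mirrors the MATLAB example; the seed is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 1000))

n_rows = data.shape[0]                          # sample from all rows of data
n_sample = 100                                  # number of rows to draw
idx = rng.integers(0, n_rows, size=n_sample)    # indices drawn with replacement
new_sample = data[idx, :]
```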
----------------------------------------------------------------------------------------
► Dimensionality Reduction by PCA
► Step(1) Load the dataset "IRIS":
>> Iris = load('D:\Iris.txt');
>> size(Iris)
ans = 150 4
► Step(2) Calculate covariance matrix
>> C=cov(Iris)
C =
0.6857 -0.0393 1.2737 0.5169
-0.0393 0.1880 -0.3217 -0.1180
1.2737 -0.3217 3.1132 1.2964
0.5169 -0.1180 1.2964 0.5824
► Step(3) Calculate Eigen vectors and Eigen values of covariance
matrix
>> [vect, Lambda] = eigs(C)
vect =
0.3616 0.6565 0.5810 -0.3173
-0.0823 0.7297 -0.5964 0.3241
0.8566 -0.1758 -0.0725 0.4797
0.3588 -0.0747 -0.5491 -0.7511
Lambda =
4.2248 0 0 0
0 0.2422 0 0
0 0 0.0785 0
0 0 0 0.0237
► Step(4) Select the eigenvectors corresponding to the largest eigenvalues
(the principal components). eigs returns the eigenvectors as the columns of
vect, so take its first two columns:
>> V = vect(:, 1:2)
V =
    0.3616    0.6565
   -0.0823    0.7297
    0.8566   -0.1758
    0.3588   -0.0747
► Step(5) Reduce the dimension from 4 to 2:
>> Y = Iris*V;
>> size(Y)
ans = 150 2
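The same PCA pipeline can be sketched in NumPy on a small synthetic 150×4 matrix (the Iris file path above is machine-specific, so random data stands in for it). Note that PCA normally centers the data before projecting; `np.cov` subtracts the mean internally:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((150, 4)) @ rng.standard_normal((4, 4))  # 150x4 data

C = np.cov(X, rowvar=False)       # 4x4 covariance matrix
lam, vect = np.linalg.eigh(C)     # eigenvalues ascending, eigenvectors in columns

order = np.argsort(lam)[::-1]     # sort descending by eigenvalue
V = vect[:, order[:2]]            # the two leading eigenvectors (columns!)

Y = (X - X.mean(axis=0)) @ V      # project the centered data: 150x2
```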
-------------------------------------------------------------------------------------------------
Mapping Data to a New Space
Fourier transform (FFT)
Wavelet transform (DWT)
Step(1): compute DFT for one dimension
>> X=1:2:20
X = 1 3 5 7 9 11 13 15 17 19
>> Y=fft(X)
Y =
  1.0e+002 *
  Columns 1 through 5
   1.0000            -0.1000 + 0.3078i   -0.1000 + 0.1376i   -0.1000 + 0.0727i   -0.1000 + 0.0325i
  Columns 6 through 10
  -0.1000            -0.1000 - 0.0325i   -0.1000 - 0.0727i   -0.1000 - 0.1376i   -0.1000 - 0.3078i
Step(2): compute inverse DFT for one dimension
>> XX=abs(ifft(Y))
XX =
1.0 3.0 5.0 7.0 9.0 11.0 13.0 15.0 17.0 19.0
Step(3): compute DFT in two dimensions
>> A=[1 2 3 ; 4 5 6; 6 7 8]
A =
1 2 3
4 5 6
6 7 8
>> B=fft2(A)
B =
42.0000 -4.5000 + 2.5981i -4.5000 - 2.5981i
-12.0000 + 5.1962i 0 0
-12.0000 - 5.1962i 0 0
Step(4): compute inverse DFT in two dimensions
>> Bv= abs( ifft2(B))
Bv =
1 2 3
4 5 6
6 7 8
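The FFT round trips above can be reproduced with NumPy's `np.fft` module (the inputs are the same as in the MATLAB examples):

```python
import numpy as np

# 1-D: DFT and inverse DFT round-trip
X = np.arange(1, 20, 2)              # 1 3 5 ... 19
Y = np.fft.fft(X)
XX = np.abs(np.fft.ifft(Y))          # recovers the original values

# 2-D
A = np.array([[1, 2, 3], [4, 5, 6], [6, 7, 8]])
B = np.fft.fft2(A)                   # B[0,0] is the sum of all elements of A, i.e. 42
Bv = np.abs(np.fft.ifft2(B))         # recovers A
```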
Attribute Transformation
– Simple functions: x^k, log(x), e^x, |x|
>> C2=A.^2
C2 =
1 4 9
16 25 36
36 49 64
>> Lg=log(A)
Lg =
0 0.6931 1.0986
1.3863 1.6094 1.7918
1.7918 1.9459 2.0794
>> Ex=exp(A)
Ex = 1.0e+003 *
0.0027 0.0074 0.0201
0.0546 0.1484 0.4034
0.4034 1.0966 2.9810
>> ab=abs(A)
ab =
1 2 3
4 5 6
6 7 8
– Standardization and Normalization
>> x=1:2:20
x =
1 3 5 7 9 11 13 15 17 19
>> m=mean(x)
m = 10
>> s=std(x)
s = 6.0553
>> y=(x-m)/s
y =
   -1.4863   -1.1560   -0.8257   -0.4954   -0.1651    0.1651    0.4954    0.8257    1.1560    1.4863
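The z-score standardization can be sketched in NumPy (again `ddof=1` matches MATLAB's `std`):

```python
import numpy as np

x = np.arange(1, 20, 2)     # 1 3 5 ... 19
m = x.mean()                # 10
s = x.std(ddof=1)           # ≈ 6.0553, matching MATLAB's std
y = (x - m) / s             # z-scores: mean 0, sample standard deviation 1
```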
Distance measurement
>> p = [1 2 3]; q = [2 4 3]
q = 2     4     3
>> dist = norm(p - q, 2)
dist = 2.2361
>> L1 = norm(p - q, 1)
L1 = 3
>> L_inf = norm(p - q, inf)
L_inf = 2
Minkowski Distance:
    dist = ( sum_{k=1..n} |p_k - q_k|^r )^(1/r)
► Euclidean Distance: r = 2
► Manhattan (taxicab, L1 norm): r = 1
► Supremum (Lmax, or L∞ norm): r = inf
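The norms above map directly onto `np.linalg.norm`, and the general Minkowski formula is a one-liner (the helper function name is illustrative):

```python
import numpy as np

p = np.array([1, 2, 3])
q = np.array([2, 4, 3])

d2   = np.linalg.norm(p - q, 2)        # Euclidean (r=2): sqrt(1+4) ≈ 2.2361
d1   = np.linalg.norm(p - q, 1)        # Manhattan (r=1): 3
dinf = np.linalg.norm(p - q, np.inf)   # supremum (r=inf): 2

def minkowski(p, q, r):
    """General Minkowski distance: (sum |p_k - q_k|^r)^(1/r)."""
    return np.sum(np.abs(p - q) ** r) ** (1.0 / r)
```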
► Binary Data 'Similarity':
>> p=[1 1 1 0 1]
p = 1 1 1 0 1
>> q=[1 1 0 1 0]
q = 1 1 0 1 0
>> JaccardIndex_ac= sum(p(:) & q(:)) / sum(p(:) | q(:))
JaccardIndex_ac = 0.4000
>> Jaccard_distance= 1-JaccardIndex_ac
Jaccard_distance = 0.6000
► For a Matrix:
Alice = [0 1 0;
         0 1 0;
         0 1 0];
% Carol tries to draw a line.
Carol = [0 1 0;
         0 1 0;
         1 1 0];
jaccardIndex_ac = sum(Alice(:) & Carol(:)) / sum(Alice(:) | Carol(:))
jaccardIndex_ac = 0.7500
% As expected, Alice's and Carol's drawings of a line are much MORE
% "similar" than Alice's and Bob's (0.2).
% Let's check the Jaccard distance.
jaccardDistance_ac = 1 - jaccardIndex_ac
% jaccardDistance_ac = 0.2500
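Both Jaccard computations above (vectors and matrices) reduce to the same intersection-over-union formula, sketched here in NumPy (the function name is illustrative):

```python
import numpy as np

def jaccard_index(a, b):
    """Jaccard index for binary arrays: |intersection| / |union|."""
    a, b = np.asarray(a).astype(bool), np.asarray(b).astype(bool)
    return (a & b).sum() / (a | b).sum()

p = [1, 1, 1, 0, 1]
q = [1, 1, 0, 1, 0]
ji_vec = jaccard_index(p, q)               # 0.4, as in the vector example

alice = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]])
carol = np.array([[0, 1, 0], [0, 1, 0], [1, 1, 0]])
ji_mat = jaccard_index(alice, carol)       # 0.75, as in the matrix example
```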
► Cosine Similarity: If d1 and d2 are two document vectors, then
cos(d1, d2) = (d1 • d2) / (||d1|| ||d2||),
where • indicates the vector dot product and ||d|| is the length of vector d.
Example:
d1 = 3 2 0 5 0 0 0 2 0 0
d2 = 1 0 0 0 0 0 0 1 0 2
cos(d1, d2) = 0.3150
>> d1 = [3 2 0 5 0 0 0 2 0 0 ]
d1 =
3 2 0 5 0 0 0 2 0 0
>> d2 = [1 0 0 0 0 0 0 1 0 2 ]
d2 =
1 0 0 0 0 0 0 1 0 2
>> theta= dot(d1,d2)/(norm(d1,2)*norm(d2,2))
theta = 0.3150
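The same cosine similarity in NumPy, using the document vectors from the example:

```python
import numpy as np

d1 = np.array([3, 2, 0, 5, 0, 0, 0, 2, 0, 0])
d2 = np.array([1, 0, 0, 0, 0, 0, 0, 1, 0, 2])

# dot product divided by the product of the vector lengths
theta = d1 @ d2 / (np.linalg.norm(d1) * np.linalg.norm(d2))   # ≈ 0.3150
```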
► Correlation
>> ps = (d1-mean(d1))/std(d1)
ps =
    1.0279    0.4568   -0.6852    2.1700   -0.6852   -0.6852   -0.6852    0.4568   -0.6852   -0.6852
>> ds = (d2-mean(d2))/std(d2)
ds =
    0.8581   -0.5721   -0.5721   -0.5721   -0.5721   -0.5721   -0.5721    0.8581   -0.5721    2.2883
>> cor = dot(ps,ds)
cor = 0.1633
Note: the dot product of the standardized vectors equals (n-1) times the Pearson correlation; dividing by n-1 (or calling corrcoef(d1,d2)) gives r = 0.0181.
p'_k = ( p_k - mean(p) ) / std(p)
q'_k = ( q_k - mean(q) ) / std(q)
correlation(p,q) = p' • q'
Chapter(3) Data Exploration
In our discussion of data exploration, we focus on
– Summary statistics
– Visualization
– Online Analytical Processing (OLAP)
Example: Iris Plant data set.
Three flower types (classes):
Setosa
Virginica
Versicolour
Four (non-class) attributes
► Sepal width and length
► Petal width and length
-----------------------------------------------------------------------------------------------------------------
Summary Statistics. Examples:
► location: mean
► spread: standard deviation
► frequency and mode
► percentiles
>> Iris= load('d:\Iris.txt') ;
>>m=mean(Iris)
m =
5.8433 3.0540 3.7587 1.1987
>> mode(Iris)
ans =
5.0000 3.0000 1.5000 0.2000
>> median(Iris)
ans =
5.8000 3.0000 4.3500 1.3000
>> C=cov(Iris)
C =
0.6857 -0.0393 1.2737 0.5169
-0.0393 0.1880 -0.3217 -0.1180
1.2737 -0.3217 3.1132 1.2964
0.5169 -0.1180 1.2964 0.5824
Percentiles
Y = prctile(X,p) returns percentiles of the values in a data vector or
matrix X for the percentages p in the interval [0,100].
>> setosa = Iris(1:50,:);
>> Y = prctile(setosa, [10 25 50 75 90])
Y =
    4.5500    3.0000    1.3000    0.1000      % 10th
    4.8000    3.1000    1.4000    0.2000      % 25th
    5.0000    3.4000    1.5000    0.2000      % 50th
    5.2000    3.7000    1.6000    0.3000      % 75th
    5.4500    3.9000    1.7000    0.4000      % 90th
Visualization
Visualization Techniques: Histograms
>> figure ,hist(Iris(:,1))
[Figure: histogram of Iris(:,1) (sepal length)]
>> figure ,hist(Iris(:,2))
>> figure ,hist(Iris(:,3))
[Figures: histograms of Iris(:,2) (sepal width) and Iris(:,3) (petal length)]
► Two-Dimensional Histograms
>> figure ,hist3(Iris(:,1:2))
► Visualization Techniques: Box Plots
[Diagram: a box plot spans the 25th to the 75th percentile, with the median (50th percentile) marked inside the box; the whiskers extend to the 10th and 90th percentiles, and points beyond them are plotted as outliers]
>> boxplot(Iris)
[Figure: box plots of the four Iris attributes]
► Scatter Plot Array of Iris Attributes : >> plotmatrix(Iris)
[Figure: 4-by-4 scatter plot matrix of the Iris attributes]
► Parallel Coordinates Plots for Iris Data >> parallelcoords(Iris)
[Figure: parallel coordinates plot of the Iris data; x-axis: coordinate (attribute) 1-4, y-axis: attribute value]
Chapter (8) Cluster Analysis
Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from (or unrelated to) the objects in other groups
Types of clustering: Hierarchical Clustering and Partitional Clustering.
Clustering Algorithms: the K-means Method (a partitional method).
>> x = Iris(randperm(150), :);
>> [indx, M] = kmeans(x, 3)
indx =
   (a 150-by-1 vector of cluster labels in {1,2,3}; the exact assignment
    depends on the random permutation and on kmeans' random initialization)
M =
    5.2161    3.5387    1.6806    0.3581
    4.7091    3.1091    1.3955    0.1909
    6.3010    2.8866    4.9588    1.6959
>> clusters = [x indx];
>> Y = sortrows(clusters, 5)    % sort the rows by the cluster label in column 5
Y =
    5.2000    3.4000    1.4000    0.2000    1.0000
    5.5000    3.5000    1.3000    0.2000    1.0000
    5.7000    3.8000    1.7000    0.3000    1.0000
         :         :         :         :         :
    4.8000    3.1000    1.6000    0.2000    2.0000
    4.7000    3.2000    1.3000    0.2000    2.0000
         :         :         :         :         :
    6.0000    2.2000    5.0000    1.5000    3.0000
    5.7000    3.0000    4.2000    1.2000    3.0000
         :         :         :         :         :
    5.8000    2.8000    5.1000    2.4000    3.0000
(150 rows in total: each row is an observation followed by its cluster label)
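A minimal NumPy sketch of the iteration that MATLAB's kmeans performs (Lloyd's algorithm): assign each point to its nearest centroid, then move each centroid to the mean of its points. The data here is a small synthetic two-cluster set standing in for the Iris matrix, since the random permutation above depends on the session:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated synthetic clusters (illustrative stand-in for Iris).
x = np.vstack([rng.standard_normal((50, 2)) + [0, 0],
               rng.standard_normal((50, 2)) + [8, 8]])

k = 2
M = x[rng.choice(len(x), k, replace=False)]      # initial centroids = random points

for _ in range(20):
    # Assign each point to its nearest centroid.
    d = np.linalg.norm(x[:, None, :] - M[None, :, :], axis=2)   # (n, k) distances
    indx = d.argmin(axis=1)
    # Recompute each centroid as its cluster mean (keep it if the cluster is empty).
    M = np.array([x[indx == j].mean(axis=0) if np.any(indx == j) else M[j]
                  for j in range(k)])
```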
>> load hospital     % dataset array shipped with the Statistics Toolbox
>> hospital(:, {'Sex','Age','Weight','Smoker'})
ans =
                   Sex       Age    Weight    Smoker
    YPL-320        Male      38     176       true
    GLI-532        Male      43     163       false
    PNI-258        Female    38     131       false
    MIJ-579        Female    40     133       false
    XLK-030        Female    49     119       false
         :           :        :       :         :
    BEZ-311        Male      45     172       true
    ZZB-405        Male      48     177       false
(100 patients in total)
--------------------------------------------------------------------------------------------------------------------------------
>> dsa = hospital(:, {'Sex','Age','Weight','Smoker'});
>> statarray = grpstats(dsa, 'Sex')
statarray =
              Sex       GroupCount    mean_Age    mean_Weight    mean_Smoker
    Female    Female    53            37.717      130.47         0.24528
    Male      Male      47            38.915      180.53         0.44681
>> dsa = hospital(:, {'Age','Weight','Smoker'});
>> statarray = grpstats(dsa, [], {'mean','min','max'})
statarray =
           GroupCount    mean_Age    min_Age    max_Age    mean_Weight    min_Weight    max_Weight    mean_Smoker    min_Smoker    max_Smoker
    All    100           38.28       25         50         154            111           202           0.34           false         true
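What grpstats does (group rows by a key, then compute per-group statistics) can be sketched in plain Python; the records below are a small made-up sample in the shape of the hospital data, since that dataset ships with MATLAB:

```python
from collections import defaultdict

# Hypothetical patient records: (sex, age, weight)
records = [('Male', 38, 176), ('Male', 43, 163),
           ('Female', 38, 131), ('Female', 40, 133),
           ('Female', 49, 119)]

# Group the rows by the 'sex' key.
groups = defaultdict(list)
for sex, age, weight in records:
    groups[sex].append((age, weight))

# Compute GroupCount, mean_Age and mean_Weight per group, as grpstats does.
stats = {}
for sex, rows in groups.items():
    ages = [a for a, _ in rows]
    weights = [w for _, w in rows]
    stats[sex] = {'GroupCount': len(rows),
                  'mean_Age': sum(ages) / len(ages),
                  'mean_Weight': sum(weights) / len(weights)}
```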