

    ( LAB SHEET)

    By

    Dr. WAEL KHEDR

    May 2017


    Chapter(1): Introduction to MATLAB


    Chapter(2): Data

(1) Nominal data

B = nominal(X, labels) creates a nominal array B from the array X, labeling its levels according to labels.

EX:
>> X = {'b' 'b' 'g' ; 'g' 'r' 'b' ; 'b' 'r' 'g' }
>> B = nominal(X,{'blue','green','red'})

X =

    'b' 'b' 'g'

    'g' 'r' 'b'

    'b' 'r' 'g'

    B =

    blue blue green

    green red blue

    blue red green

    (2) Operations on nominal Data

    Example : compute the Mode and Entropy

► Compute the Mode for nominal data matrix B as follows:

Step(1) >> [K ,I] = hist(B(:)) % histogram returns the bins with their frequencies

K =

4 3 2 % frequencies

Reference: https://www.mathworks.com/help/stats/nominal.html#inputarg_X


I = blue green red % bins

    Step(2) >> [ v , indx ] = max( K) % determine the index of maximum frequency

    v = 4

    indx = 1

Step(3) >> md = I(indx) % the bin label with the highest frequency (note: index I, the bin names, not B; also avoid the name mod, which shadows a built-in function)

md = blue

The mode of matrix B is "blue".

► Compute the Entropy for nominal data matrix B as follows:

    Step(1) We have K = [ 4 3 2 ]

    Step(2) >> s=sum(K)

    s = 9

Step(3) >> p = K/s % relative frequencies (avoid the name pi, which shadows MATLAB's built-in constant)

p = 0.4444 0.3333 0.2222

Step(4) >> E = -sum(p .* log2(p))

    E = 1.5305
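For readers without MATLAB, both computations can be cross-checked in plain Python (standard library only), using the nine labels of B from the example above:

```python
from collections import Counter
import math

# The nine nominal observations of matrix B, flattened
B = ['blue', 'blue', 'green',
     'green', 'red', 'blue',
     'blue', 'red', 'green']

counts = Counter(B)                     # frequency of each category
mode, freq = counts.most_common(1)[0]   # category with the highest count

n = sum(counts.values())
probs = [c / n for c in counts.values()]          # relative frequencies
entropy = -sum(p * math.log2(p) for p in probs)   # Shannon entropy in bits

print(mode, freq)          # blue 4
print(round(entropy, 4))   # 1.5305
```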

    (3) Ordinal data

B = ordinal(X, labels) creates an ordinal array object B from the array X and labels the levels in B according to labels. If labels is omitted, ordinal creates the levels of B from the sorted unique values in X and generates default labels for them.

    EX: >> quality = ordinal([1 2 3 3 1 3],{'low' 'medium' 'high'})

    quality =

    low medium high high low high



► Compute the Median and percentiles for ordinal data as follows:

    >> median ([1 2 3 3 1 3])

    ans = 2.5000

%% Y = prctile(X,P) returns percentiles of the values in X. P is a scalar or a vector of percent values.

Compute the 10th, 25th, 50th, 75th and 90th percentiles of the frequency vector K = [4 3 2] as follows:

    >> prctile (K , [10 25 50 75 90] )

    ans =

    2.0000 2.2500 3.0000 3.7500 4.0000
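As a cross-check, MATLAB's documented prctile rule (linear interpolation between the midpoint positions 100*(i-0.5)/n of the sorted values) can be re-implemented in a few lines of plain Python; the function name `prctile` here simply mirrors the MATLAB one:

```python
def prctile(x, ps):
    """Percentiles via MATLAB-style midpoint positions 100*(i-0.5)/n."""
    xs = sorted(x)
    n = len(xs)
    pos = [100 * (i + 0.5) / n for i in range(n)]  # percentile of each sorted value
    out = []
    for p in ps:
        if p <= pos[0]:              # below the first midpoint: clamp to minimum
            out.append(xs[0])
        elif p >= pos[-1]:           # above the last midpoint: clamp to maximum
            out.append(xs[-1])
        else:
            # find the bracketing positions and interpolate linearly
            for i in range(n - 1):
                if pos[i] <= p <= pos[i + 1]:
                    t = (p - pos[i]) / (pos[i + 1] - pos[i])
                    out.append(xs[i] + t * (xs[i + 1] - xs[i]))
                    break
    return out

print(prctile([4, 3, 2], [10, 25, 50, 75, 90]))
```

This reproduces the 2.0000 2.2500 3.0000 3.7500 4.0000 shown above.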

    (4) Interval/Scale data

>> x = [1 2 3 4 5 6] % or >> x = 1:6
>> m = mean(x)
m = 3.5000
>> v = var(x)
v = 3.5000
>> st = std(x)
st = 1.8708
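A quick cross-check in Python's statistics module, which, like MATLAB's var and std, uses the sample (n-1) formulas:

```python
import statistics

x = [1, 2, 3, 4, 5, 6]
m = statistics.mean(x)        # 3.5
v = statistics.variance(x)    # sample variance (n-1 divisor), as MATLAB's var
s = statistics.stdev(x)       # sample standard deviation, as MATLAB's std

print(m, v, round(s, 4))      # 3.5 3.5 1.8708
```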

    -----------------------------------------------------------------------------------------------------------------

Data matrix:

Thickness   Load   Distance   Projection of y Load   Projection of x Load
1.1         2.2    16.22      6.25                   12.65
1.2         2.7    15.22      5.27                   10.23


    >> D = [ 10.23 5.27 15.22 2.7 1.2 ; 12.65 6.25 16.22 2.2 1.1 ]

    D = 10.2300 5.2700 15.2200 2.7000 1.2000

    12.6500 6.2500 16.2200 2.2000 1.1000

► Aggregation: combining two or more attributes (or objects) into a single attribute (or object).

► By using a histogram:
>> x = 1:100
>> m = hist(x) % default 10 bins
m = 10 10 10 10 10 10 10 10 10 10
>> n = hist(x,6) % aggregate into 6 bins
n = 17 17 16 17 16 17

(figures: histograms of x with the default 10 bins and with 6 bins)

► Aggregation
► Sampling
► Dimensionality Reduction
► Feature subset selection
► Feature creation
► Discretization and Binarization
► Attribute Transformation


► Data Sampling: the main technique employed for data selection.

>> x = 1:100;
>> for i = 1:2:50 % take every 2nd element of x
y((i+1)/2) = x(i);
end
>> y
y =
Columns 1 through 16
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31
Columns 17 through 25
33 35 37 39 41 43 45 47 49

Sampling with replacement:

>> data = randn(1000,1000);
>> nRows = size(data,1); % total number of rows (1000)
>> nSample = 100; % number of rows to draw
>> rndIDX = randi(nRows, nSample, 1); % row indices drawn with replacement
>> newSample = data(rndIDX, :);
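The same idea in plain Python; note the index vector must be drawn over all rows of the data (a narrower matrix is used here for brevity):

```python
import random

random.seed(0)   # reproducible draws

# 1000 rows of 5 Gaussian values (smaller than the MATLAB example, for brevity)
data = [[random.gauss(0, 1) for _ in range(5)] for _ in range(1000)]

n_rows = len(data)            # sample over ALL rows of the data
n_sample = 100                # number of rows to draw
idx = [random.randrange(n_rows) for _ in range(n_sample)]   # with replacement
new_sample = [data[i] for i in idx]

print(len(new_sample))        # 100
```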

    ----------------------------------------------------------------------------------------

    ► Dimensionality Reduction by PCA

► Step(1) load dataset "IRIS"
>> Iris = load('D:\Iris.txt');

    >> size(Iris)

    ans = 150 4

    ► Step(2) Calculate covariance matrix


    >> C=cov(Iris)

    C =

    0.6857 -0.0393 1.2737 0.5169

    -0.0393 0.1880 -0.3217 -0.1180

    1.2737 -0.3217 3.1132 1.2964

    0.5169 -0.1180 1.2964 0.5824

    ► Step(3) Calculate Eigen vectors and Eigen values of covariance

    matrix

>> [vect, Lambda] = eigs(C) % columns of vect are the eigenvectors

    vect =

    0.3616 0.6565 0.5810 -0.3173

    -0.0823 0.7297 -0.5964 0.3241

    0.8566 -0.1758 -0.0725 0.4797

    0.3588 -0.0747 -0.5491 -0.7511

Lambda =

    4.2248 0 0 0

    0 0.2422 0 0

    0 0 0.0785 0

    0 0 0 0.0237

► Step(4) Select the eigenvectors that correspond to the highest eigenvalues (the principal components):


>> V = vect(:,1:2) % eigenvectors are the COLUMNS of vect; keep the two with the largest eigenvalues

V =

0.3616 0.6565
-0.0823 0.7297
0.8566 -0.1758
0.3588 -0.0747

► Step(5) Dimension reduction from 4 to 2

>> Y = Iris*V;

>> size(Y)

ans = 150 2
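The steps above (covariance, eigendecomposition, projection) can be sketched end-to-end in plain Python on a small hypothetical 2-D dataset, where the 2x2 symmetric eigenproblem has a closed form; with real data you would use eig/eigs as above:

```python
import math

# Tiny hypothetical 2-D dataset, just to illustrate the PCA steps
X = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
     (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]
n = len(X)
mx = sum(p[0] for p in X) / n
my = sum(p[1] for p in X) / n

# Step 2: sample covariance matrix [[cxx, cxy], [cxy, cyy]]
cxx = sum((p[0] - mx) ** 2 for p in X) / (n - 1)
cyy = sum((p[1] - my) ** 2 for p in X) / (n - 1)
cxy = sum((p[0] - mx) * (p[1] - my) for p in X) / (n - 1)

# Step 3: eigenvalues of a symmetric 2x2 matrix, in closed form
tr = cxx + cyy
det = cxx * cyy - cxy * cxy
lam1 = tr / 2 + math.sqrt(tr * tr / 4 - det)   # largest eigenvalue

# Eigenvector for lam1 (assumes cxy != 0, true for this data)
vx, vy = cxy, lam1 - cxx
norm = math.hypot(vx, vy)
vx, vy = vx / norm, vy / norm

# Step 4: project the centered data onto the first principal component
Y = [(p[0] - mx) * vx + (p[1] - my) * vy for p in X]
print(len(Y))   # one projected value per point
```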

    -------------------------------------------------------------------------------------------------

    Mapping Data to a New Space

    Fourier transform (FFT)

    Wavelet transform (DWT)

    Step(1): compute DFT for one dimension

    >> X=1:2:20

    X = 1 3 5 7 9 11 13 15 17 19

    >> Y=fft(X)

Y =

1.0e+002 *

Columns 1 through 5

1.0000   -0.1000 + 0.3078i   -0.1000 + 0.1376i   -0.1000 + 0.0727i   -0.1000 + 0.0325i

Columns 6 through 10

-0.1000   -0.1000 - 0.0325i   -0.1000 - 0.0727i   -0.1000 - 0.1376i   -0.1000 - 0.3078i

    Step(2): compute inverse DFT for one dimension


    >> XX=abs(ifft(Y))

    XX =

    1.0 3.0 5.0 7.0 9.0 11.0 13.0 15.0 17.0 19.0

Step(3): compute DFT for two dimensions

    >> A=[1 2 3 ; 4 5 6; 6 7 8]

    A =

    1 2 3

    4 5 6

    6 7 8

    >> B=fft2(A)

    B =

    42.0000 -4.5000 + 2.5981i -4.5000 - 2.5981i

    -12.0000 + 5.1962i 0 0

    -12.0000 - 5.1962i 0 0

Step(4): compute inverse DFT for two dimensions

    >> Bv= abs( ifft2(B))

    Bv =

    1 2 3

    4 5 6

    6 7 8
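fft/ifft compute the discrete Fourier transform; a naive O(n^2) version in plain Python makes the definition explicit and confirms the round trip on the same vector 1:2:20:

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform (what fft computes)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

def idft(X):
    """Inverse DFT: conjugated kernel plus 1/n scaling."""
    n = len(X)
    return [sum(X[m] * cmath.exp(2j * cmath.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]

x = list(range(1, 20, 2))            # 1 3 5 ... 19
X = dft(x)
print(round(X[0].real, 4))           # 100.0  (DC term = sum of x, i.e. 1.0000e+002)

back = [abs(v) for v in idft(X)]     # round trip recovers the signal
print([round(v, 4) for v in back])   # [1.0, 3.0, ..., 19.0]
```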


    Attribute Transformation

– Simple functions: x^k, log(x), e^x, |x|

    >> C2=A.^2

    C2 =

    1 4 9

    16 25 36

    36 49 64

    >> Lg=log(A)

    Lg =

    0 0.6931 1.0986

    1.3863 1.6094 1.7918

    1.7918 1.9459 2.0794

    >> Ex=exp(A)

    Ex = 1.0e+003 *

    0.0027 0.0074 0.0201

    0.0546 0.1484 0.4034

    0.4034 1.0966 2.9810

    >> ab=abs(A)

    ab =

    1 2 3

    4 5 6

    6 7 8


    – Standardization and Normalization

    >> x=1:2:20

    x =

    1 3 5 7 9 11 13 15 17 19

    >> m=mean(x)

    m = 10

    >> s=std(x)

    s = 6.0553

    >> y=(x-m)/s

    y =

-1.4863 -1.1560 -0.8257 -0.4954 -0.1651 0.1651 0.4954 0.8257 1.1560 1.4863
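The same z-score standardization in plain Python (statistics.stdev matches MATLAB's sample std):

```python
import statistics

x = list(range(1, 20, 2))        # 1 3 5 ... 19
m = statistics.mean(x)           # 10
s = statistics.stdev(x)          # sample standard deviation: 6.0553
y = [(v - m) / s for v in x]     # z-scores: mean 0, std 1

print(round(s, 4), round(y[0], 4))   # 6.0553 -1.4863
```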


    Distance measurement

>> p = [1 2 3]; q = [2 4 3]

q = 2 4 3

>> dist = norm(p - q, 2)

    dist = 2.2361

    >> L1= norm(p-q,1)

    L1 = 3

    >> L_inf= norm(p-q,inf)

    L_inf = 2

Minkowski Distance:

dist = ( sum_{k=1..n} |p_k - q_k|^r )^(1/r)

r = 2 : Euclidean distance, L2 norm
r = 1 : Manhattan (taxicab) distance, L1 norm
r = inf : "supremum" distance, Lmax / Linf norm
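The three norms above are special cases of the Minkowski distance; a small plain-Python sketch using the same p and q:

```python
def minkowski(p, q, r):
    """Minkowski distance: (sum_k |p_k - q_k|^r)^(1/r)."""
    diffs = [abs(a - b) for a, b in zip(p, q)]
    if r == float('inf'):              # supremum / L_inf norm
        return max(diffs)
    return sum(d ** r for d in diffs) ** (1 / r)

p, q = [1, 2, 3], [2, 4, 3]
print(round(minkowski(p, q, 2), 4))    # 2.2361  (Euclidean)
print(minkowski(p, q, 1))              # 3.0     (Manhattan)
print(minkowski(p, q, float('inf')))   # 2       (supremum)
```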


► Binary Data 'Similarity':

    >> p=[1 1 1 0 1]

    p = 1 1 1 0 1

    >> q=[1 1 0 1 0]

    q = 1 1 0 1 0

>> JaccardIndex = sum(p(:) & q(:)) / sum(p(:) | q(:))

JaccardIndex = 0.4000

>> Jaccard_distance = 1 - JaccardIndex

Jaccard_distance = 0.6000

► For Matrix:

Alice = [0 1 0;
         0 1 0;
         0 1 0];

% Carol tries to draw a line.
Carol = [0 1 0;
         0 1 0;
         1 1 0];

jaccardIndex_ac = sum(Alice(:) & Carol(:)) / sum(Alice(:) | Carol(:))

jaccardIndex_ac =

0.7500

% As expected, we can see that Alice's and Carol's drawings of a line are
% much MORE "similar" than Alice's and Bob's (0.2).
% Let's check the Jaccard distance.

jaccardDistance_ac = 1 - jaccardIndex_ac

% jaccardDistance_ac =
% 0.2500
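The same Jaccard computation in plain Python, for both the vector example and the (flattened) matrix example:

```python
def jaccard_index(a, b):
    """Jaccard index for binary vectors: |a AND b| / |a OR b|."""
    both = sum(1 for x, y in zip(a, b) if x and y)     # M11
    either = sum(1 for x, y in zip(a, b) if x or y)    # M11 + M10 + M01
    return both / either

p = [1, 1, 1, 0, 1]
q = [1, 1, 0, 1, 0]
print(jaccard_index(p, q))            # 0.4
print(1 - jaccard_index(p, q))        # Jaccard distance: 0.6

# Matrices compare the same way once flattened, as with Alice(:) and Carol(:)
alice = [0, 1, 0, 0, 1, 0, 0, 1, 0]
carol = [0, 1, 0, 0, 1, 0, 1, 1, 0]
print(jaccard_index(alice, carol))    # 0.75
```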


► Cosine Similarity: If d1 and d2 are two document vectors, then

cos(d1, d2) = (d1 . d2) / (||d1|| ||d2||),

where . indicates the vector dot product and ||d|| is the length of vector d.

Example:

d1 = 3 2 0 5 0 0 0 2 0 0
d2 = 1 0 0 0 0 0 0 1 0 2

cos(d1, d2) = 0.3150

    >> d1 = [3 2 0 5 0 0 0 2 0 0 ]

    d1 =

    3 2 0 5 0 0 0 2 0 0

    >> d2 = [1 0 0 0 0 0 0 1 0 2 ]

    d2 =

    1 0 0 0 0 0 0 1 0 2

    >> theta= dot(d1,d2)/(norm(d1,2)*norm(d2,2))

    theta = 0.3150
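A plain-Python cross-check of the cosine similarity, using the same document vectors:

```python
import math

def cosine_similarity(d1, d2):
    """cos(d1, d2) = (d1 . d2) / (||d1|| * ||d2||)."""
    dot = sum(a * b for a, b in zip(d1, d2))
    n1 = math.sqrt(sum(a * a for a in d1))
    n2 = math.sqrt(sum(b * b for b in d2))
    return dot / (n1 * n2)

d1 = [3, 2, 0, 5, 0, 0, 0, 2, 0, 0]
d2 = [1, 0, 0, 0, 0, 0, 0, 1, 0, 2]
print(round(cosine_similarity(d1, d2), 4))   # 0.315
```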


► Correlation

>> ps = (d1-mean(d1))/std(d1)

ps =

1.0279 0.4568 -0.6852 2.1700 -0.6852 -0.6852 -0.6852 0.4568 -0.6852 -0.6852

>> ds = (d2-mean(d2))/std(d2)

ds =

0.8581 -0.5721 -0.5721 -0.5721 -0.5721 -0.5721 -0.5721 0.8581 -0.5721 2.2883

>> cor = dot(ps,ds) / (length(d1)-1) % dot(ps,ds) alone gives 0.1633; dividing by n-1 = 9 gives the Pearson correlation

cor = 0.0181

p'_k = (p_k - mean(p)) / std(p)
q'_k = (q_k - mean(q)) / std(q)
correlation(p,q) = (p' . q') / (n - 1)
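A plain-Python version of the same calculation, standardizing both vectors and dividing by n-1 as the Pearson correlation requires:

```python
import statistics

def correlation(p, q):
    """Pearson correlation via standardized vectors: (p' . q') / (n - 1)."""
    mp, sp = statistics.mean(p), statistics.stdev(p)
    mq, sq = statistics.mean(q), statistics.stdev(q)
    ps = [(v - mp) / sp for v in p]      # z-scores of p
    qs = [(v - mq) / sq for v in q]      # z-scores of q
    return sum(a * b for a, b in zip(ps, qs)) / (len(p) - 1)

d1 = [3, 2, 0, 5, 0, 0, 0, 2, 0, 0]
d2 = [1, 0, 0, 0, 0, 0, 0, 1, 0, 2]
print(round(correlation(d1, d2), 4))   # 0.0181
```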


    Chapter(3) Data Exploration

    In our discussion of data exploration, we focus on

    – Summary statistics

    – Visualization

    – Online Analytical Processing (OLAP)

    Example: Iris Plant data set.

    Three flower types (classes):

    Setosa

    Virginica

    Versicolour

    Four (non-class) attributes

    ► Sepal width and length

    ► Petal width and length


Summary Statistics. Examples:
– location: mean
– spread: standard deviation
– Frequency and Mode
– Percentiles


    >> Iris= load('d:\Iris.txt') ;

    >>m=mean(Iris)

    m =

    5.8433 3.0540 3.7587 1.1987

    >> mode(Iris)

    ans =

    5.0000 3.0000 1.5000 0.2000

    >> median(Iris)

    ans =

    5.8000 3.0000 4.3500 1.3000

    >> C=cov(Iris)

    C =

    0.6857 -0.0393 1.2737 0.5169

    -0.0393 0.1880 -0.3217 -0.1180

    1.2737 -0.3217 3.1132 1.2964

    0.5169 -0.1180 1.2964 0.5824

Percentiles

Y = prctile(X,p) returns percentiles of the values in a data vector or matrix X for the percentages p in the interval [0,100].

>> setosa = Iris(1:50,:);
>> Y = prctile(setosa, [10 25 50 75 90])

    Y =

10th: 4.5500 3.0000 1.3000 0.1000



25th: 4.8000 3.1000 1.4000 0.2000
50th: 5.0000 3.4000 1.5000 0.2000
75th: 5.2000 3.7000 1.6000 0.3000
90th: 5.4500 3.9000 1.7000 0.4000

    Visualization

    Visualization Techniques: Histograms

>> figure, hist(Iris(:,1))

(figure: histogram of Iris column 1)


>> figure, hist(Iris(:,2))

>> figure, hist(Iris(:,3))

(figures: histograms of Iris columns 2 and 3)


    ► Two-Dimensional Histograms

>> figure, hist3(Iris(:,1:2))

    ► Visualization Techniques: Box Plots

(diagram: box plot anatomy — outliers, 10th, 25th, 50th, 75th and 90th percentiles)


    >> boxplot(Iris)

(figure: box plots of the four Iris attributes)


► Scatter Plot Array of Iris Attributes:

>> plotmatrix(Iris)

(figure: scatter plot matrix of the four Iris attributes)


► Parallel Coordinates Plots for Iris Data:

>> parallelcoords(Iris)

(figure: parallel coordinates plot of the Iris data; x-axis: Coordinate, y-axis: Coordinate Value)


    Chapter (8) Cluster Analysis

Cluster analysis finds groups of objects such that the objects in a group are similar (or related) to one another and different from (or unrelated to) the objects in other groups.

Two main approaches: Hierarchical Clustering and Partitional Clustering.

Clustering Algorithms: the K-means Method


>> x = Iris( randperm(150) , : ); % shuffle the 150 rows
>> [indx M] = kmeans(x,3);

indx is a 150-by-1 vector that assigns each row of x to cluster 1, 2 or 3.


M =

5.2161 3.5387 1.6806 0.3581
4.7091 3.1091 1.3955 0.1909
6.3010 2.8866 4.9588 1.6959

>> clusters = [x indx]; % append the cluster label as a 5th column
>> Y = sortrows(clusters, 5)

Y =

5.2000 3.4000 1.4000 0.2000 1.0000
5.5000 3.5000 1.3000 0.2000 1.0000
5.7000 3.8000 1.7000 0.3000 1.0000
5.4000 3.4000 1.5000 0.4000 1.0000
5.0000 3.4000 1.5000 0.2000 1.0000
5.0000 3.4000 1.6000 0.4000 1.0000
4.8000 3.4000 1.9000 0.2000 1.0000
5.0000 3.5000 1.6000 0.6000 1.0000
5.4000 3.7000 1.5000 0.2000 1.0000
5.0000 3.6000 1.4000 0.2000 1.0000
5.0000 2.3000 3.3000 1.0000 1.0000
5.1000 3.8000 1.9000 0.4000 1.0000
5.1000 3.5000 1.4000 0.3000 1.0000
5.1000 3.7000 1.5000 0.4000 1.0000
5.3000 3.7000 1.5000 0.2000 1.0000
5.5000 4.2000 1.4000 0.2000 1.0000
5.2000 3.5000 1.5000 0.2000 1.0000
5.1000 3.3000 1.7000 0.5000 1.0000
4.9000 2.4000 3.3000 1.0000 1.0000
5.1000 3.8000 1.6000 0.2000 1.0000
5.1000 3.4000 1.5000 0.2000 1.0000
5.7000 4.4000 1.5000 0.4000 1.0000
5.1000 2.5000 3.0000 1.1000 1.0000
5.0000 3.5000 1.3000 0.3000 1.0000
5.2000 4.1000 1.5000 0.1000 1.0000


    5.1000 3.8000 1.5000 0.3000 1.0000 5.1000 3.5000 1.4000 0.2000 1.0000 5.8000 4.0000 1.2000 0.2000 1.0000 5.4000 3.4000 1.7000 0.2000 1.0000 5.4000 3.9000 1.7000 0.4000 1.0000 5.4000 3.9000 1.3000 0.4000 1.0000 4.8000 3.1000 1.6000 0.2000 2.0000 4.7000 3.2000 1.3000 0.2000 2.0000 4.8000 3.0000 1.4000 0.1000 2.0000 4.5000 2.3000 1.3000 0.3000 2.0000 4.6000 3.2000 1.4000 0.2000 2.0000 4.7000 3.2000 1.6000 0.2000 2.0000 4.6000 3.1000 1.5000 0.2000 2.0000 4.9000 3.1000 1.5000 0.1000 2.0000 4.4000 3.0000 1.3000 0.2000 2.0000 4.3000 3.0000 1.1000 0.1000 2.0000 4.9000 3.1000 1.5000 0.1000 2.0000 4.4000 3.2000 1.3000 0.2000 2.0000 4.9000 3.0000 1.4000 0.2000 2.0000 5.0000 3.3000 1.4000 0.2000 2.0000 5.0000 3.0000 1.6000 0.2000 2.0000 4.8000 3.0000 1.4000 0.3000 2.0000 4.6000 3.6000 1.0000 0.2000 2.0000 4.8000 3.4000 1.6000 0.2000 2.0000 5.0000 3.2000 1.2000 0.2000 2.0000 4.9000 3.1000 1.5000 0.1000 2.0000 4.6000 3.4000 1.4000 0.3000 2.0000 4.4000 2.9000 1.4000 0.2000 2.0000 6.0000 2.2000 5.0000 1.5000 3.0000 5.7000 3.0000 4.2000 1.2000 3.0000 6.2000 3.4000 5.4000 2.3000 3.0000 5.6000 3.0000 4.1000 1.3000 3.0000 6.3000 2.3000 4.4000 1.3000 3.0000 5.8000 2.7000 5.1000 1.9000 3.0000 6.4000 3.2000 5.3000 2.3000 3.0000 6.7000 2.5000 5.8000 1.8000 3.0000 6.3000 2.7000 4.9000 1.8000 3.0000 7.7000 3.8000 6.7000 2.2000 3.0000


    6.4000 3.2000 4.5000 1.5000 3.0000 6.7000 3.3000 5.7000 2.5000 3.0000 5.8000 2.6000 4.0000 1.2000 3.0000 6.4000 2.8000 5.6000 2.2000 3.0000 5.6000 2.9000 3.6000 1.3000 3.0000 6.3000 3.4000 5.6000 2.4000 3.0000 7.9000 3.8000 6.4000 2.0000 3.0000 6.7000 3.1000 4.4000 1.4000 3.0000 6.4000 2.7000 5.3000 1.9000 3.0000 6.2000 2.9000 4.3000 1.3000 3.0000 7.0000 3.2000 4.7000 1.4000 3.0000 5.8000 2.7000 5.1000 1.9000 3.0000 7.2000 3.0000 5.8000 1.6000 3.0000 6.7000 3.0000 5.2000 2.3000 3.0000 6.1000 2.8000 4.7000 1.2000 3.0000 6.3000 3.3000 4.7000 1.6000 3.0000 7.2000 3.6000 6.1000 2.5000 3.0000 6.8000 3.2000 5.9000 2.3000 3.0000 6.3000 2.5000 5.0000 1.9000 3.0000 5.9000 3.2000 4.8000 1.8000 3.0000 6.4000 2.9000 4.3000 1.3000 3.0000 6.2000 2.2000 4.5000 1.5000 3.0000 6.6000 2.9000 4.6000 1.3000 3.0000 5.6000 3.0000 4.5000 1.5000 3.0000 7.2000 3.2000 6.0000 1.8000 3.0000 6.3000 2.9000 5.6000 1.8000 3.0000 7.4000 2.8000 6.1000 1.9000 3.0000 5.8000 2.7000 3.9000 1.2000 3.0000 5.7000 2.8000 4.1000 1.3000 3.0000 5.7000 2.9000 4.2000 1.3000 3.0000 6.5000 2.8000 4.6000 1.5000 3.0000 6.3000 2.5000 4.9000 1.5000 3.0000 6.9000 3.1000 5.4000 2.1000 3.0000 6.7000 3.0000 5.0000 1.7000 3.0000 6.0000 3.4000 4.5000 1.6000 3.0000 6.7000 3.3000 5.7000 2.1000 3.0000 6.9000 3.1000 5.1000 2.3000 3.0000 6.5000 3.0000 5.2000 2.0000 3.0000 6.1000 3.0000 4.9000 1.8000 3.0000 6.5000 3.0000 5.5000 1.8000 3.0000


    7.3000 2.9000 6.3000 1.8000 3.0000 5.6000 2.7000 4.2000 1.3000 3.0000 6.8000 2.8000 4.8000 1.4000 3.0000 6.9000 3.1000 4.9000 1.5000 3.0000 5.2000 2.7000 3.9000 1.4000 3.0000 6.7000 3.1000 5.6000 2.4000 3.0000 6.3000 3.3000 6.0000 2.5000 3.0000 5.5000 2.5000 4.0000 1.3000 3.0000 5.7000 2.5000 5.0000 2.0000 3.0000 6.1000 2.9000 4.7000 1.4000 3.0000 5.5000 2.4000 3.8000 1.1000 3.0000 7.7000 3.0000 6.1000 2.3000 3.0000 6.7000 3.1000 4.7000 1.5000 3.0000 6.1000 2.6000 5.6000 1.4000 3.0000 5.7000 2.6000 3.5000 1.0000 3.0000 5.7000 2.8000 4.5000 1.3000 3.0000 7.1000 3.0000 5.9000 2.1000 3.0000 6.6000 3.0000 4.4000 1.4000 3.0000 6.0000 3.0000 4.8000 1.8000 3.0000 6.5000 3.0000 5.8000 2.2000 3.0000 7.6000 3.0000 6.6000 2.1000 3.0000 5.0000 2.0000 3.5000 1.0000 3.0000 4.9000 2.5000 4.5000 1.7000 3.0000 5.4000 3.0000 4.5000 1.5000 3.0000 6.0000 2.7000 5.1000 1.6000 3.0000 6.9000 3.2000 5.7000 2.3000 3.0000 5.5000 2.6000 4.4000 1.2000 3.0000 6.2000 2.8000 4.8000 1.8000 3.0000 5.6000 2.5000 3.9000 1.1000 3.0000 5.6000 2.8000 4.9000 2.0000 3.0000 6.8000 3.0000 5.5000 2.1000 3.0000 6.0000 2.9000 4.5000 1.5000 3.0000 5.8000 2.7000 4.1000 1.0000 3.0000 6.4000 3.1000 5.5000 1.8000 3.0000 7.7000 2.8000 6.7000 2.0000 3.0000 6.3000 2.8000 5.1000 1.5000 3.0000 5.9000 3.0000 4.2000 1.5000 3.0000 5.9000 3.0000 5.1000 1.8000 3.0000 6.4000 2.8000 5.6000 2.1000 3.0000 6.5000 3.2000 5.1000 2.0000 3.0000


    5.5000 2.4000 3.7000 1.0000 3.0000 6.1000 2.8000 4.0000 1.3000 3.0000 6.1000 3.0000 4.6000 1.4000 3.0000 6.0000 2.2000 4.0000 1.0000 3.0000 5.5000 2.3000 4.0000 1.3000 3.0000 7.7000 2.6000 6.9000 2.3000 3.0000 5.8000 2.8000 5.1000 2.4000 3.0000
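kmeans above relies on the Statistics Toolbox; the core Lloyd's algorithm it implements (assign each point to its nearest centroid, then recompute centroids as cluster means) can be sketched in plain Python on a small hypothetical 2-D dataset:

```python
import math
import random

def kmeans(points, k, iters=20, seed=1):
    """Plain Lloyd's algorithm: assign to nearest centroid, recompute means."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # initial centroids drawn from the data
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[j].append(p)
        # Update step: centroid = mean of its cluster (keep old if empty)
        centroids = [
            tuple(sum(v) / len(cl) for v in zip(*cl)) if cl else centroids[j]
            for j, cl in enumerate(clusters)
        ]
    labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
              for p in points]
    return labels, centroids

# Two well-separated hypothetical 2-D blobs
pts = [(0.1, 0.2), (0.0, 0.0), (0.2, 0.1),
       (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, M = kmeans(pts, 2)
print(labels)   # the first three points share one label, the last three the other
```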


>> load hospital % loads the sample dataset array "hospital"
>> hospital(:, {'Sex','Age','Weight','Smoker'})

Sex Age Weight Smoker
YPL-320 Male 38 176 true
GLI-532 Male 43 163 false
PNI-258 Female 38 131 false
MIJ-579 Female 40 133 false
XLK-030 Female 49 119 false
TFP-518 Female 46 142 false
LPD-746 Female 33 142 true
ATA-945 Male 40 180 false
VNL-702 Male 28 183 false
LQW-768 Female 31 132 false
QFY-472 Female 45 128 false
UJG-627 Female 42 137 false
XUE-826 Male 25 174 false
TRW-072 Male 39 202 true
ELG-976 Female 36 129 false
KOQ-996 Male 48 181 true
YUZ-646 Male 32 191 true
XBR-291 Female 27 131 true
KPW-846 Male 37 179 false
XBA-581 Male 50 172 false
BKD-785 Female 48 133 false
JHV-416 Female 39 117 false
VWL-936 Female 41 137 false
AAX-056 Female 44 146 true
DTT-578 Female 28 123 true
FZR-250 Male 25 189 false
FZI-843 Female 39 143 false
PUE-347 Female 25 114 false
HLE-603 Male 36 166 false
FME-049 Male 30 186 true
AFK-336 Female 45 126 true
TQW-430 Female 40 137 false
LIM-480 Female 25 138 false
YYV-570 Male 47 187 false
MSL-692 Male 44 193 false
KKL-155 Female 48 137 false
WTL-804 Male 44 192 true
NGK-757 Female 35 118 false
FLX-785 Male 33 180 true
RYA-895 Female 38 128 false
VRH-620 Male 39 164 true
AFB-271 Male 44 183 false
RVS-253 Male 44 169 true


JQQ-692 Male 37 194 true
VDZ-577 Male 45 172 false
NFO-023 Female 37 135 false
SPK-046 Male 30 182 false
LQF-219 Female 39 121 false
HJQ-495 Male 42 158 false
EOT-439 Male 42 179 true
FLJ-908 Male 49 170 true
RBA-579 Female 44 136 true
HAK-381 Female 43 135 true
OIT-428 Female 47 147 false
DAU-529 Male 50 186 true
SJX-191 Female 38 124 false
JRV-811 Female 41 134 false
WCJ-997 Male 45 170 true
WAQ-577 Male 36 180 false
PPT-086 Female 38 130 false
MPF-827 Female 29 130 false
XAX-646 Female 28 127 false
VAO-708 Female 30 141 false
QEQ-082 Female 28 111 false
VPG-454 Female 29 134 false
RBO-332 Male 36 189 false
IJY-130 Female 45 137 false
HQI-880 Female 32 136 false
ISR-838 Female 31 130 false
NSK-403 Female 48 137 true
SCQ-914 Male 25 186 false
ILS-109 Female 40 127 true
VLK-852 Male 39 176 false
OJK-718 Female 41 127 false
JDR-456 Female 33 115 true
SRV-618 Male 31 178 true
OSJ-974 Female 35 131 false
LSL-639 Male 32 183 false
SMP-283 Male 42 194 false
QOO-305 Female 48 126 false
UDS-151 Male 34 186 false
YLN-495 Male 39 188 false
NSU-424 Male 28 189 true
WXM-486 Female 29 120 false
EHE-616 Female 32 132 false
ZGS-009 Male 39 182 true
HWZ-321 Female 37 120 true
GGU-691 Female 49 123 true
WUS-105 Female 31 141 true
TXM-629 Female 37 129 false


DGC-290 Male 38 184 true
AGR-528 Male 45 181 false
XBJ-540 Female 30 124 false
FCD-425 Male 48 174 false
HQO-561 Female 48 134 false
REV-997 Male 25 171 true
HVR-372 Male 44 188 true
MEZ-469 Male 49 186 false
BEZ-311 Male 45 172 true
ZZB-405 Male 48 177 false

--------------------------------------------------------------------------------------------------------------------------------

>> dsa = hospital(:, {'Sex','Age','Weight','Smoker'});
>> statarray = grpstats(dsa, 'Sex')

statarray =

          Sex       GroupCount   mean_Age   mean_Weight   mean_Smoker
Female    Female    53           37.717     130.47        0.24528
Male      Male      47           38.915     180.53        0.44681

>> dsa = hospital(:, {'Age','Weight','Smoker'});
>> statarray = grpstats(dsa, [], {'mean','min','max'})

statarray =

       GroupCount   mean_Age   min_Age   max_Age   mean_Weight   min_Weight   max_Weight
All    100          38.28      25        50        154           111          202

       mean_Smoker   min_Smoker   max_Smoker
All    0.34          false        true
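grpstats-style grouped statistics can be reproduced in plain Python; here on just the first five rows of the hospital listing above (so the exact means differ from the full-table results):

```python
from statistics import mean

# First few rows of the hospital listing above: (Sex, Age, Weight, Smoker)
rows = [
    ('Male',   38, 176, True),
    ('Male',   43, 163, False),
    ('Female', 38, 131, False),
    ('Female', 40, 133, False),
    ('Female', 49, 119, False),
]

stats = {}
for sex in ['Male', 'Female']:
    grp = [r for r in rows if r[0] == sex]
    stats[sex] = {
        'GroupCount': len(grp),
        'mean_Age': mean(r[1] for r in grp),
        'mean_Weight': mean(r[2] for r in grp),
        'mean_Smoker': mean(int(r[3]) for r in grp),   # fraction of smokers
    }

print(stats['Female']['mean_Weight'])   # (131 + 133 + 119) / 3
```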