

    ( LAB SHEET)

    By

    Dr. WAEL KHEDR

    May 2017


    Chapter(1): Introduction to MATLAB


    Chapter(2): Data

(1) Nominal data

B = nominal(X, labels) creates a nominal array B from the array X, labeling its levels according to labels.

EX:
>> X = {'b' 'b' 'g' ; 'g' 'r' 'b' ; 'b' 'r' 'g' }
>> B = nominal(X,{'blue','green','red'})

X =

    'b' 'b' 'g'

    'g' 'r' 'b'

    'b' 'r' 'g'

    B =

    blue blue green

    green red blue

    blue red green

    (2) Operations on nominal Data

    Example : compute the Mode and Entropy

► Compute the Mode for nominal data matrix B as follows:

Step(1) >> [K ,I] = hist(B(:)) % histogram returns the bins with their frequencies

K =

4 3 2 % frequencies

Reference: https://www.mathworks.com/help/stats/nominal.html#inputarg_X


I = blue green red % bins

    Step(2) >> [ v , indx ] = max( K) % determine the index of maximum frequency

    v = 4

    indx = 1

Step(3) >> md = I(indx) % the bin label with the highest frequency (note: index I, the bin names, not B; also avoid the name mod, which shadows a built-in function)

md = blue

The mode of matrix B is "blue".

► Compute the Entropy for nominal data matrix B as follows:

    Step(1) We have K = [ 4 3 2 ]

    Step(2) >> s=sum(K)

    s = 9

Step(3) >> p = K/s % relative frequencies (avoid the name pi, which shadows MATLAB's built-in constant)

p = 0.4444 0.3333 0.2222

Step(4) >> E = -sum(p .* log2(p))

    E = 1.5305
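For readers without MATLAB, both computations can be cross-checked in plain Python (standard library only), using the nine labels of B from the example above:

```python
from collections import Counter
import math

# The nine nominal observations of matrix B, flattened
B = ['blue', 'blue', 'green',
     'green', 'red', 'blue',
     'blue', 'red', 'green']

counts = Counter(B)                     # frequency of each category
mode, freq = counts.most_common(1)[0]   # category with the highest count

n = sum(counts.values())
probs = [c / n for c in counts.values()]          # relative frequencies
entropy = -sum(p * math.log2(p) for p in probs)   # Shannon entropy in bits

print(mode, freq)          # blue 4
print(round(entropy, 4))   # 1.5305
```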

    (3) Ordinal data

B = ordinal(X, labels) creates an ordinal array object B from the array X and labels the levels in B according to labels. If labels is omitted, ordinal creates the levels of B from the sorted unique values in X and generates default labels for them.

    EX: >> quality = ordinal([1 2 3 3 1 3],{'low' 'medium' 'high'})

    quality =

    low medium high high low high



► Compute the Median and percentiles for ordinal data as follows:

    >> median ([1 2 3 3 1 3])

    ans = 2.5000

%% Y = prctile(X,P) returns percentiles of the values in X. P is a scalar or a vector of percent values.

Compute the 10th, 25th, 50th, 75th and 90th percentiles of the frequency vector K = [4 3 2] as follows:

    >> prctile (K , [10 25 50 75 90] )

    ans =

    2.0000 2.2500 3.0000 3.7500 4.0000
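As a cross-check, MATLAB's documented prctile rule (linear interpolation between the midpoint positions 100*(i-0.5)/n of the sorted values) can be re-implemented in a few lines of plain Python; the function name `prctile` here simply mirrors the MATLAB one:

```python
def prctile(x, ps):
    """Percentiles via MATLAB-style midpoint positions 100*(i-0.5)/n."""
    xs = sorted(x)
    n = len(xs)
    pos = [100 * (i + 0.5) / n for i in range(n)]  # percentile of each sorted value
    out = []
    for p in ps:
        if p <= pos[0]:              # below the first midpoint: clamp to minimum
            out.append(xs[0])
        elif p >= pos[-1]:           # above the last midpoint: clamp to maximum
            out.append(xs[-1])
        else:
            # find the bracketing positions and interpolate linearly
            for i in range(n - 1):
                if pos[i] <= p <= pos[i + 1]:
                    t = (p - pos[i]) / (pos[i + 1] - pos[i])
                    out.append(xs[i] + t * (xs[i + 1] - xs[i]))
                    break
    return out

print(prctile([4, 3, 2], [10, 25, 50, 75, 90]))
```

This reproduces the 2.0000 2.2500 3.0000 3.7500 4.0000 shown above.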

    (4) Interval/Scale data

>> x = [1 2 3 4 5 6] % or >> x = 1:6
>> m = mean(x)
m = 3.5000
>> v = var(x)
v = 3.5000
>> st = std(x)
st = 1.8708
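A quick cross-check in Python's statistics module, which, like MATLAB's var and std, uses the sample (n-1) formulas:

```python
import statistics

x = [1, 2, 3, 4, 5, 6]
m = statistics.mean(x)        # 3.5
v = statistics.variance(x)    # sample variance (n-1 divisor), as MATLAB's var
s = statistics.stdev(x)       # sample standard deviation, as MATLAB's std

print(m, v, round(s, 4))      # 3.5 3.5 1.8708
```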

    -----------------------------------------------------------------------------------------------------------------

Data matrix:

Thickness   Load   Distance   Projection of y Load   Projection of x Load
1.1         2.2    16.22      6.25                   12.65
1.2         2.7    15.22      5.27                   10.23


    >> D = [ 10.23 5.27 15.22 2.7 1.2 ; 12.65 6.25 16.22 2.2 1.1 ]

    D = 10.2300 5.2700 15.2200 2.7000 1.2000

    12.6500 6.2500 16.2200 2.2000 1.1000

► Aggregation: combining two or more attributes (or objects) into a single attribute (or object).

► By using a histogram:
>> x = 1:100
>> m = hist(x) % default 10 bins
m = 10 10 10 10 10 10 10 10 10 10
>> n = hist(x,6) % aggregate into 6 bins
n = 17 17 16 17 16 17

(figures: histograms of x with the default 10 bins and with 6 bins)

► Aggregation
► Sampling
► Dimensionality Reduction
► Feature subset selection
► Feature creation
► Discretization and Binarization
► Attribute Transformation


► Data Sampling: the main technique employed for data selection.

>> x = 1:100;
>> for i = 1:2:50 % take every 2nd element of x
y((i+1)/2) = x(i);
end
>> y
y =
Columns 1 through 16
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31
Columns 17 through 25
33 35 37 39 41 43 45 47 49

Sampling with replacement:

>> data = randn(1000,1000);
>> nRows = size(data,1); % total number of rows (1000)
>> nSample = 100; % number of rows to draw
>> rndIDX = randi(nRows, nSample, 1); % row indices drawn with replacement
>> newSample = data(rndIDX, :);
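The same idea in plain Python; note the index vector must be drawn over all rows of the data (a narrower matrix is used here for brevity):

```python
import random

random.seed(0)   # reproducible draws

# 1000 rows of 5 Gaussian values (smaller than the MATLAB example, for brevity)
data = [[random.gauss(0, 1) for _ in range(5)] for _ in range(1000)]

n_rows = len(data)            # sample over ALL rows of the data
n_sample = 100                # number of rows to draw
idx = [random.randrange(n_rows) for _ in range(n_sample)]   # with replacement
new_sample = [data[i] for i in idx]

print(len(new_sample))        # 100
```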

    ----------------------------------------------------------------------------------------

    ► Dimensionality Reduction by PCA

► Step(1) load dataset "IRIS"
>> Iris = load('D:\Iris.txt');

    >> size(Iris)

    ans = 150 4

    ► Step(2) Calculate covariance matrix


    >> C=cov(Iris)

    C =

    0.6857 -0.0393 1.2737 0.5169

    -0.0393 0.1880 -0.3217 -0.1180

    1.2737 -0.3217 3.1132 1.2964

    0.5169 -0.1180 1.2964 0.5824

    ► Step(3) Calculate Eigen vectors and Eigen values of covariance

    matrix

>> [vect, Lambda] = eigs(C) % columns of vect are the eigenvectors

    vect =

    0.3616 0.6565 0.5810 -0.3173

    -0.0823 0.7297 -0.5964 0.3241

    0.8566 -0.1758 -0.0725 0.4797

    0.3588 -0.0747 -0.5491 -0.7511

Lambda =

    4.2248 0 0 0

    0 0.2422 0 0

    0 0 0.0785 0

    0 0 0 0.0237

► Step(4) Select the eigenvectors that correspond to the highest eigenvalues (the principal components):


>> V = vect(:,1:2) % eigenvectors are the COLUMNS of vect; keep the two with the largest eigenvalues

V =

0.3616 0.6565
-0.0823 0.7297
0.8566 -0.1758
0.3588 -0.0747

► Step(5) Dimension reduction from 4 to 2

>> Y = Iris*V;

>> size(Y)

ans = 150 2
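The steps above (covariance, eigendecomposition, projection) can be sketched end-to-end in plain Python on a small hypothetical 2-D dataset, where the 2x2 symmetric eigenproblem has a closed form; with real data you would use eig/eigs as above:

```python
import math

# Tiny hypothetical 2-D dataset, just to illustrate the PCA steps
X = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
     (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]
n = len(X)
mx = sum(p[0] for p in X) / n
my = sum(p[1] for p in X) / n

# Step 2: sample covariance matrix [[cxx, cxy], [cxy, cyy]]
cxx = sum((p[0] - mx) ** 2 for p in X) / (n - 1)
cyy = sum((p[1] - my) ** 2 for p in X) / (n - 1)
cxy = sum((p[0] - mx) * (p[1] - my) for p in X) / (n - 1)

# Step 3: eigenvalues of a symmetric 2x2 matrix, in closed form
tr = cxx + cyy
det = cxx * cyy - cxy * cxy
lam1 = tr / 2 + math.sqrt(tr * tr / 4 - det)   # largest eigenvalue

# Eigenvector for lam1 (assumes cxy != 0, true for this data)
vx, vy = cxy, lam1 - cxx
norm = math.hypot(vx, vy)
vx, vy = vx / norm, vy / norm

# Step 4: project the centered data onto the first principal component
Y = [(p[0] - mx) * vx + (p[1] - my) * vy for p in X]
print(len(Y))   # one projected value per point
```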

    -------------------------------------------------------------------------------------------------

    Mapping Data to a New Space

    Fourier transform (FFT)

    Wavelet transform (DWT)

    Step(1): compute DFT for one dimension

    >> X=1:2:20

    X = 1 3 5 7 9 11 13 15 17 19

    >> Y=fft(X)

Y =

1.0e+002 *

Columns 1 through 5

1.0000   -0.1000 + 0.3078i   -0.1000 + 0.1376i   -0.1000 + 0.0727i   -0.1000 + 0.0325i

Columns 6 through 10

-0.1000   -0.1000 - 0.0325i   -0.1000 - 0.0727i   -0.1000 - 0.1376i   -0.1000 - 0.3078i

    Step(2): compute inverse DFT for one dimension


    >> XX=abs(ifft(Y))

    XX =

    1.0 3.0 5.0 7.0 9.0 11.0 13.0 15.0 17.0 19.0

Step(3): compute DFT for two dimensions

    >> A=[1 2 3 ; 4 5 6; 6 7 8]

    A =

    1 2 3

    4 5 6

    6 7 8

    >> B=fft2(A)

    B =

    42.0000 -4.5000 + 2.5981i -4.5000 - 2.5981i

    -12.0000 + 5.1962i 0 0

    -12.0000 - 5.1962i 0 0

Step(4): compute inverse DFT for two dimensions

    >> Bv= abs( ifft2(B))

    Bv =

    1 2 3

    4 5 6

    6 7 8
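fft/ifft compute the discrete Fourier transform; a naive O(n^2) version in plain Python makes the definition explicit and confirms the round trip on the same vector 1:2:20:

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform (what fft computes)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

def idft(X):
    """Inverse DFT: conjugated kernel plus 1/n scaling."""
    n = len(X)
    return [sum(X[m] * cmath.exp(2j * cmath.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]

x = list(range(1, 20, 2))            # 1 3 5 ... 19
X = dft(x)
print(round(X[0].real, 4))           # 100.0  (DC term = sum of x, i.e. 1.0000e+002)

back = [abs(v) for v in idft(X)]     # round trip recovers the signal
print([round(v, 4) for v in back])   # [1.0, 3.0, ..., 19.0]
```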


    Attribute Transformation

– Simple functions: x^k, log(x), e^x, |x|

    >> C2=A.^2

    C2 =

    1 4 9

    16 25 36

    36 49 64

    >> Lg=log(A)

    Lg =

    0 0.6931 1.0986

    1.3863 1.6094 1.7918

    1.7918 1.9459 2.0794

    >> Ex=exp(A)

    Ex = 1.0e+003 *

    0.0027 0.0074 0.0201

    0.0546 0.1484 0.4034

    0.4034 1.0966 2.9810

    >> ab=abs(A)

    ab =

    1 2 3

    4 5 6

    6 7 8


    – Standardization and Normalization

    >> x=1:2:20

    x =

    1 3 5 7 9 11 13 15 17 19

    >> m=mean(x)

    m = 10

    >> s=std(x)

    s = 6.0553

    >> y=(x-m)/s

    y =

-1.4863 -1.1560 -0.8257 -0.4954 -0.1651 0.1651 0.4954 0.8257 1.1560 1.4863
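The same z-score standardization in plain Python (statistics.stdev matches MATLAB's sample std):

```python
import statistics

x = list(range(1, 20, 2))        # 1 3 5 ... 19
m = statistics.mean(x)           # 10
s = statistics.stdev(x)          # sample standard deviation: 6.0553
y = [(v - m) / s for v in x]     # z-scores: mean 0, std 1

print(round(s, 4), round(y[0], 4))   # 6.0553 -1.4863
```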


    Distance measurement

>> p = [1 2 3]; q = [2 4 3]

q = 2 4 3

>> dist = norm(p - q, 2)

    dist = 2.2361

    >> L1= norm(p-q,1)

    L1 = 3

    >> L_inf= norm(p-q,inf)

    L_inf = 2

Minkowski Distance:

dist = ( sum_{k=1..n} |p_k - q_k|^r )^(1/r)

r = 2 : Euclidean distance, L2 norm
r = 1 : Manhattan (taxicab) distance, L1 norm
r = inf : "supremum" distance, Lmax / Linf norm
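The three norms above are special cases of the Minkowski distance; a small plain-Python sketch using the same p and q:

```python
def minkowski(p, q, r):
    """Minkowski distance: (sum_k |p_k - q_k|^r)^(1/r)."""
    diffs = [abs(a - b) for a, b in zip(p, q)]
    if r == float('inf'):              # supremum / L_inf norm
        return max(diffs)
    return sum(d ** r for d in diffs) ** (1 / r)

p, q = [1, 2, 3], [2, 4, 3]
print(round(minkowski(p, q, 2), 4))    # 2.2361  (Euclidean)
print(minkowski(p, q, 1))              # 3.0     (Manhattan)
print(minkowski(p, q, float('inf')))   # 2       (supremum)
```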


► Binary Data 'Similarity':

    >> p=[1 1 1 0 1]

    p = 1 1 1 0 1

    >> q=[1 1 0 1 0]

    q = 1 1 0 1 0

>> JaccardIndex = sum(p(:) & q(:)) / sum(p(:) | q(:))

JaccardIndex = 0.4000

>> Jaccard_distance = 1 - JaccardIndex

Jaccard_distance = 0.6000

► For Matrix:

Alice = [0 1 0;
         0 1 0;
         0 1 0];

% Carol tries to draw a line.
Carol = [0 1 0;
         0 1 0;
         1 1 0];

jaccardIndex_ac = sum(Alice(:) & Carol(:)) / sum(Alice(:) | Carol(:))

jaccardIndex_ac =

0.7500

% As expected, we can see that Alice's and Carol's drawings of a line are
% much MORE "similar" than Alice's and Bob's (0.2).
% Let's check the Jaccard distance.

jaccardDistance_ac = 1 - jaccardIndex_ac

% jaccardDistance_ac =
% 0.2500
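The same Jaccard computation in plain Python, for both the vector example and the (flattened) matrix example:

```python
def jaccard_index(a, b):
    """Jaccard index for binary vectors: |a AND b| / |a OR b|."""
    both = sum(1 for x, y in zip(a, b) if x and y)     # M11
    either = sum(1 for x, y in zip(a, b) if x or y)    # M11 + M10 + M01
    return both / either

p = [1, 1, 1, 0, 1]
q = [1, 1, 0, 1, 0]
print(jaccard_index(p, q))            # 0.4
print(1 - jaccard_index(p, q))        # Jaccard distance: 0.6

# Matrices compare the same way once flattened, as with Alice(:) and Carol(:)
alice = [0, 1, 0, 0, 1, 0, 0, 1, 0]
carol = [0, 1, 0, 0, 1, 0, 1, 1, 0]
print(jaccard_index(alice, carol))    # 0.75
```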


► Cosine Similarity: If d1 and d2 are two document vectors, then

cos(d1, d2) = (d1 . d2) / (||d1|| ||d2||),

where . indicates the vector dot product and ||d|| is the length of vector d.

Example:

d1 = 3 2 0 5 0 0 0 2 0 0
d2 = 1 0 0 0 0 0 0 1 0 2

cos(d1, d2) = 0.3150

    >> d1 = [3 2 0 5 0 0 0 2 0 0 ]

    d1 =

    3 2 0 5 0 0 0 2 0 0

    >> d2 = [1 0 0 0 0 0 0 1 0 2 ]

    d2 =

    1 0 0 0 0 0 0 1 0 2

    >> theta= dot(d1,d2)/(norm(d1,2)*norm(d2,2))

    theta = 0.3150
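A plain-Python cross-check of the cosine similarity, using the same document vectors:

```python
import math

def cosine_similarity(d1, d2):
    """cos(d1, d2) = (d1 . d2) / (||d1|| * ||d2||)."""
    dot = sum(a * b for a, b in zip(d1, d2))
    n1 = math.sqrt(sum(a * a for a in d1))
    n2 = math.sqrt(sum(b * b for b in d2))
    return dot / (n1 * n2)

d1 = [3, 2, 0, 5, 0, 0, 0, 2, 0, 0]
d2 = [1, 0, 0, 0, 0, 0, 0, 1, 0, 2]
print(round(cosine_similarity(d1, d2), 4))   # 0.315
```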


► Correlation

>> ps = (d1-mean(d1))/std(d1)

ps =

1.0279 0.4568 -0.6852 2.1700 -0.6852 -0.6852 -0.6852 0.4568 -0.6852 -0.6852

>> ds = (d2-mean(d2))/std(d2)

ds =

0.8581 -0.5721 -0.5721 -0.5721 -0.5721 -0.5721 -0.5721 0.8581 -0.5721 2.2883

>> cor = dot(ps,ds) / (length(d1)-1) % dot(ps,ds) alone gives 0.1633; dividing by n-1 = 9 gives the Pearson correlation

cor = 0.0181

p'_k = (p_k - mean(p)) / std(p)
q'_k = (q_k - mean(q)) / std(q)
correlation(p,q) = (p' . q') / (n - 1)
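A plain-Python version of the same calculation, standardizing both vectors and dividing by n-1 as the Pearson correlation requires:

```python
import statistics

def correlation(p, q):
    """Pearson correlation via standardized vectors: (p' . q') / (n - 1)."""
    mp, sp = statistics.mean(p), statistics.stdev(p)
    mq, sq = statistics.mean(q), statistics.stdev(q)
    ps = [(v - mp) / sp for v in p]      # z-scores of p
    qs = [(v - mq) / sq for v in q]      # z-scores of q
    return sum(a * b for a, b in zip(ps, qs)) / (len(p) - 1)

d1 = [3, 2, 0, 5, 0, 0, 0, 2, 0, 0]
d2 = [1, 0, 0, 0, 0, 0, 0, 1, 0, 2]
print(round(correlation(d1, d2), 4))   # 0.0181
```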


    Chapter(3) Data Exploration

    In our discussion of data exploration, we focus on

    – Summary statistics

    – Visualization

    – Online Analytical Processing (OLAP)

    Example: Iris Plant data set.

    Three flower types (classes):

    Setosa

    Virginica

    Versicolour

    Four (non-class) attributes

    ► Sepal width and length

    ► Petal width and length


Summary Statistics. Examples:
– location: mean
– spread: standard deviation
– Frequency and Mode
– Percentiles


    >> Iris= load('d:\Iris.txt') ;

    >>m=mean(Iris)

    m =

    5.8433 3.0540 3.7587 1.1987

    >> mode(Iris)

    ans =

    5.0000 3.0000 1.5000 0.2000

    >> median(Iris)

    ans =

    5.8000 3.0000 4.3500 1.3000

    >> C=cov(Iris)

    C =

    0.6857 -0.0393 1.2737 0.5169

    -0.0393 0.1880 -0.3217 -0.1180

    1.2737 -0.3217 3.1132 1.2964

    0.5169 -0.1180 1.2964 0.5824

Percentiles

Y = prctile(X,p) returns percentiles of the values in a data vector or matrix X for the percentages p in the interval [0,100].

>> setosa = Iris(1:50,:);
>> Y = prctile(setosa, [10 25 50 75 90])

    Y =

10th: 4.5500 3.0000 1.3000 0.1000



25th: 4.8000 3.1000 1.4000 0.2000
50th: 5.0000 3.4000 1.5000 0.2000
75th: 5.2000 3.7000 1.6000 0.3000
90th: 5.4500 3.9000 1.7000 0.4000

    Visualization

    Visualization Techniques: Histograms

>> figure, hist(Iris(:,1))

(figure: histogram of Iris column 1)


>> figure, hist(Iris(:,2))

>> figure, hist(Iris(:,3))

(figures: histograms of Iris columns 2 and 3)


    ► Two-Dimensional Histograms

>> figure, hist3(Iris(:,1:2))

    ► Visualization Techniques: Box Plots

(diagram: box plot anatomy — outliers, 10th, 25th, 50th, 75th and 90th percentiles)


    >> boxplot(Iris)

(figure: box plots of the four Iris attributes)


► Scatter Plot Array of Iris Attributes:

>> plotmatrix(Iris)

(figure: scatter plot matrix of the four Iris attributes)


► Parallel Coordinates Plots for Iris Data:

>> parallelcoords(Iris)

(figure: parallel coordinates plot of the Iris data; x-axis: Coordinate, y-axis: Coordinate Value)


    Chapter (8) Cluster Analysis

Cluster analysis finds groups of objects such that the objects in a group are similar (or related) to one another and different from (or unrelated to) the objects in other groups.

Two main approaches: Hierarchical Clustering and Partitional Clustering.

Clustering Algorithms: the K-means Method


>> x = Iris( randperm(150) , : ); % shuffle the 150 rows
>> [indx M] = kmeans(x,3);

indx is a 150-by-1 vector that assigns each row of x to cluster 1, 2 or 3.


M =

5.2161 3.5387 1.6806 0.3581
4.7091 3.1091 1.3955 0.1909
6.3010 2.8866 4.9588 1.6959

>> clusters = [x indx]; % append the cluster label as a 5th column
>> Y = sortrows(clusters, 5)

Y =

5.2000 3.4000 1.4000 0.2000 1.0000
5.5000 3.5000 1.3000 0.2000 1.0000
5.7000 3.8000 1.7000 0.3000 1.0000
5.4000 3.4000 1.5000 0.4000 1.0000
5.0000 3.4000 1.5000 0.2000 1.0000
5.0000 3.4000 1.6000 0.4000 1.0000
4.8000 3.4000 1.9000 0.2000 1.0000
5.0000 3.5000 1.6000 0.6000 1.0000
5.4000 3.7000 1.5000 0.2000 1.0000
5.0000 3.6000 1.4000 0.2000 1.0000
5.0000 2.3000 3.3000 1.0000 1.0000
5.1000 3.8000 1.9000 0.4000 1.0000
5.1000 3.5000 1.4000 0.3000 1.0000
5.1000 3.7000 1.5000 0.4000 1.0000
5.3000 3.7000 1.5000 0.2000 1.0000
5.5000 4.2000 1.4000 0.2000 1.0000
5.2000 3.5000 1.5000 0.2000 1.0000
5.1000 3.3000 1.7000 0.5000 1.0000
4.9000 2.4000 3.3000 1.0000 1.0000
5.1000 3.8000 1.6000 0.2000 1.0000
5.1000 3.4000 1.5000 0.2000 1.0000
5.7000 4.4000 1.5000 0.4000 1.0000
5.1000 2.5000 3.0000 1.1000 1.0000
5.0000 3.5000 1.3000 0.3000 1.0000
5.2000 4.1000 1.5000 0.1000 1.0000


    5.1000 3.8000 1.5000 0.3000 1.0000 5.1000 3.5000 1.4000 0.2000 1.0000 5.8000 4.0000 1.2000 0.2000 1.0000 5.4000 3.4000 1.7000 0.2000 1.0000 5.4000 3.9000 1.7000 0.4000 1.0000 5.4000 3.9000 1.3000 0.4000 1.0000 4.8000 3.1000 1.6000 0.2000 2.0000 4.7000 3.2000 1.3000 0.2000 2.0000 4.8000 3.0000 1.4000 0.1000 2.0000 4.5000 2.3000 1.3000 0.3000 2.0000 4.6000 3.2000 1.4000 0.2000 2.0000 4.7000 3.2000 1.6000 0.2000 2.0000 4.6000 3.1000 1.5000 0.2000 2.0000 4.9000 3.1000 1.5000 0.1000 2.0000 4.4000 3.0000 1.3000 0.2000 2.0000 4.3000 3.0000 1.1000 0.1000 2.0000 4.9000 3.1000 1.5000 0.1000 2.0000 4.4000 3.2000 1.3000 0.2000 2.0000 4.9000 3.0000 1.4000 0.2000 2.0000 5.0000 3.3000 1.4000 0.2000 2.0000 5.0000 3.0000 1.6000 0.2000 2.0000 4.8000 3.0000 1.4000 0.3000 2.0000 4.6000 3.6000 1.0000 0.2000 2.0000 4.8000 3.4000 1.6000 0.2000 2.0000 5.0000 3.2000 1.2000 0.2000 2.0000 4.9000 3.1000 1.5000 0.1000 2.0000 4.6000 3.4000 1.4000 0.3000 2.0000 4.4000 2.9000 1.4000 0.2000 2.0000 6.0000 2.2000 5.0000 1.5000 3.0000 5.7000 3.0000 4.2000 1.2000 3.0000 6.2000 3.4000 5.4000 2.3000 3.0000 5.6000 3.0000 4.1000 1.3000 3.0000 6.3000 2.3000 4.4000 1.3000 3.0000 5.8000 2.7000 5.1000 1.9000 3.0000 6.4000 3.2000 5.3000 2.3000 3.0000 6.7000 2.5000 5.8000 1.8000 3.0000 6.3000 2.7000 4.9000 1.8000 3.0000 7.7000 3.8000 6.7000 2.2000 3.0000


    6.4000 3.2000 4.5000 1.5000 3.0000 6.7000 3.3000 5.7000 2.5000 3.0000 5.8000 2.6000 4.0000 1.2000 3.0000 6.4000 2.8000 5.6000 2.2000 3.0000 5.6000 2.9000 3.6000 1.3000 3.0000 6.3000 3.4000 5.6000 2.4000 3.0000 7.9000 3.8000 6.4000 2.0000 3.0000 6.7000 3.1000 4.4000 1.4000 3.0000 6.4000 2.7000 5.3000 1.9000 3.0000 6.2000 2.9000 4.3000 1.3000 3.0000 7.0000 3.2000 4.7000 1.4000 3.0000 5.8000 2.7000 5.1000 1.9000 3.0000 7.2000 3.0000 5.8000 1.6000 3.0000 6.7000 3.0000 5.2000 2.3000 3.0000 6.1000 2.8000 4.7000 1.2000 3.0000 6.3000 3.3000 4.7000 1.6000 3.0000 7.2000 3.6000 6.1000 2.5000 3.0000 6.8000 3.2000 5.9000 2.3000 3.0000 6.3000 2.5000 5.0000 1.9000 3.0000 5.9000 3.2000 4.8000 1.8000 3.0000 6.4000 2.9000 4.3000 1.3000 3.0000 6.2000 2.2000 4.5000 1.5000 3.0000 6.6000 2.9000 4.6000 1.3000 3.0000 5.6000 3.0000 4.5000 1.5000 3.0000 7.2000 3.2000 6.0000 1.8000 3.0000 6.3000 2.9000 5.6000 1.8000 3.0000 7.4000 2.8000 6.1000 1.9000 3.0000 5.8000 2.7000 3.9000 1.2000 3.0000 5.7000 2.8000 4.1000 1.3000 3.0000 5.7000 2.9000 4.2000 1.3000 3.0000 6.5000 2.8000 4.6000 1.5000 3.0000 6.3000 2.5000 4.9000 1.5000 3.0000 6.9000 3.1000 5.4000 2.1000 3.0000 6.7000 3.0000 5.0000 1.7000 3.0000 6.0000 3.4000 4.5000 1.6000 3.0000 6.7000 3.3000 5.7000 2.1000 3.0000 6.9000 3.1000 5.1000 2.3000 3.0000 6.5000 3.0000 5.2000 2.0000 3.0000 6.1000 3.0000 4.9000 1.8000 3.0000 6.5000 3.0000 5.5000 1.8000 3.0000


    7.3000 2.9000 6.3000 1.8000 3.0000 5.6000 2.7000 4.2000 1.3000 3.0000 6.8000 2.8000 4.8000 1.4000 3.0000 6.9000 3.1000 4.9000 1.5000 3.0000 5.2000 2.7000 3.9000 1.4000 3.0000 6.7000 3.1000 5.6000 2.4000 3.0000 6.3000 3.3000 6.0000 2.5000 3.0000 5.5000 2.5000 4.0000 1.3000 3.0000 5.7000 2.5000 5.0000 2.0000 3.0000 6.1000 2.9000 4.7000 1.4000 3.0000 5.5000 2.4000 3.8000 1.1000 3.0000 7.7000 3.0000 6.1000 2.3000 3.0000 6.7000 3.1000 4.7000 1.5000 3.0000 6.1000 2.6000 5.6000 1.4000 3.0000 5.7000 2.6000 3.5000 1.0000 3.0000 5.7000 2.8000 4.5000 1.3000 3.0000 7.1000 3.0000 5.9000 2.1000 3.0000 6.6000 3.0000 4.4000 1.4000 3.0000 6.0000 3.0000 4.8000 1.8000 3.0000 6.5000 3.0000 5.8000 2.2000 3.0000 7.6000 3.0000 6.6000 2.1000 3.0000 5.0000 2.0000 3.5000 1.0000 3.0000 4.9000 2.5000 4.5000 1.7000 3.0000 5.4000 3.0000 4.5000 1.5000 3.0000 6.0000 2.7000 5.1000 1.6000 3.0000 6.9000 3.2000 5.7000 2.3000 3.0000 5.5000 2.6000 4.4000 1.2000 3.0000 6.2000 2.8000 4.8000 1.8000 3.0000 5.6000 2.5000 3.9000 1.1000 3.0000 5.6000 2.8000 4.9000 2.0000 3.0000 6.8000 3.0000 5.5000 2.1000 3.0000 6.0000 2.9000 4.5000 1.5000 3.0000 5.8000 2.7000 4.1000 1.0000 3.0000 6.4000 3.1000 5.5000 1.8000 3.0000 7.7000 2.8000 6.7000 2.0000 3.0000 6.3000 2.8000 5.1000 1.5000 3.0000 5.9000 3.0000 4.2000 1.5000 3.0000 5.9000 3.0000 5.1000 1.8000 3.0000 6.4000 2.8000 5.6000 2.1000 3.0000 6.5000 3.2000 5.1000 2.0000 3.0000


    5.5000 2.4000 3.7000 1.0000 3.0000 6.1000 2.8000 4.0000 1.3000 3.0000 6.1000 3.0000 4.6000 1.4000 3.0000 6.0000 2.2000 4.0000 1.0000 3.0000 5.5000 2.3000 4.0000 1.3000 3.0000 7.7000 2.6000 6.9000 2.3000 3.0000 5.8000 2.8000 5.1000 2.4000 3.0000
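kmeans above relies on the Statistics Toolbox; the core Lloyd's algorithm it implements (assign each point to its nearest centroid, then recompute centroids as cluster means) can be sketched in plain Python on a small hypothetical 2-D dataset:

```python
import math
import random

def kmeans(points, k, iters=20, seed=1):
    """Plain Lloyd's algorithm: assign to nearest centroid, recompute means."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # initial centroids drawn from the data
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[j].append(p)
        # Update step: centroid = mean of its cluster (keep old if empty)
        centroids = [
            tuple(sum(v) / len(cl) for v in zip(*cl)) if cl else centroids[j]
            for j, cl in enumerate(clusters)
        ]
    labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
              for p in points]
    return labels, centroids

# Two well-separated hypothetical 2-D blobs
pts = [(0.1, 0.2), (0.0, 0.0), (0.2, 0.1),
       (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, M = kmeans(pts, 2)
print(labels)   # the first three points share one label, the last three the other
```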


>> load hospital % loads the sample dataset array "hospital"
>> hospital(:, {'Sex','Age','Weight','Smoker'})

Sex Age Weight Smoker
YPL-320 Male 38 176 true
GLI-532 Male 43 163 false
PNI-258 Female 38 131 false
MIJ-579 Female 40 133 false
XLK-030 Female 49 119 false
TFP-518 Female 46 142 false
LPD-746 Female 33 142 true
ATA-945 Male 40 180 false
VNL-702 Male 28 183 false
LQW-768 Female 31 132 false
QFY-472 Female 45 128 false
UJG-627 Female 42 137 false
XUE-826 Male 25 174 false
TRW-072 Male 39 202 true
ELG-976 Female 36 129 false
KOQ-996 Male 48 181 true
YUZ-646 Male 32 191 true
XBR-291 Female 27 131 true
KPW-846 Male 37 179 false
XBA-581 Male 50 172 false
BKD-785 Female 48 133 false
JHV-416 Female 39 117 false
VWL-936 Female 41 137 false
AAX-056 Female 44 146 true
DTT-578 Female 28 123 true
FZR-250 Male 25 189 false
FZI-843 Female 39 143 false
PUE-347 Female 25 114 false
HLE-603 Male 36 166 false
FME-049 Male 30 186 true
AFK-336 Female 45 126 true
TQW-430 Female 40 137 false
LIM-480 Female 25 138 false
YYV-570 Male 47 187 false
MSL-692 Male 44 193 false
KKL-155 Female 48 137 false
WTL-804 Male 44 192 true
NGK-757 Female 35 118 false
FLX-785 Male 33 180 true
RYA-895 Female 38 128 false
VRH-620 Male 39 164 true
AFB-271 Male 44 183 false
RVS-253 Male 44 169 true


JQQ-692 Male 37 194 true
VDZ-577 Male 45 172 false
NFO-023 Female 37 135 false
SPK-046 Male 30 182 false
LQF-219 Female 39 121 false
HJQ-495 Male 42 158 false
EOT-439 Male 42 179 true
FLJ-908 Male 49 170 true
RBA-579 Female 44 136 true
HAK-381 Female 43 135 true
OIT-428 Female 47 147 false
DAU-529 Male 50 186 true
SJX-191 Female 38 124 false
JRV-811 Female 41 134 false
WCJ-997 Male 45 170 true
WAQ-577 Male 36 180 false
PPT-086 Female 38 130 false
MPF-827 Female 29 130 false
XAX-646 Female 28 127 false
VAO-708 Female 30 141 false
QEQ-082 Female 28 111 false
VPG-454 Female 29 134 false
RBO-332 Male 36 189 false
IJY-130 Female 45 137 false
HQI-880 Female 32 136 false
ISR-838 Female 31 130 false
NSK-403 Female 48 137 true
SCQ-914 Male 25 186 false
ILS-109 Female 40 127 true
VLK-852 Male 39 176 false
OJK-718 Female 41 127 false
JDR-456 Female 33 115 true
SRV-618 Male 31 178 true
OSJ-974 Female 35 131 false
LSL-639 Male 32 183 false
SMP-283 Male 42 194 false
QOO-305 Female 48 126 false
UDS-151 Male 34 186 false
YLN-495 Male 39 188 false
NSU-424 Male 28 189 true
WXM-486 Female 29 120 false
EHE-616 Female 32 132 false
ZGS-009 Male 39 182 true
HWZ-321 Female 37 120 true
GGU-691 Female 49 123 true
WUS-105 Female 31 141 true
TXM-629 Female 37 129 false


DGC-290 Male 38 184 true
AGR-528 Male 45 181 false
XBJ-540 Female 30 124 false
FCD-425 Male 48 174 false
HQO-561 Female 48 134 false
REV-997 Male 25 171 true
HVR-372 Male 44 188 true
MEZ-469 Male 49 186 false
BEZ-311 Male 45 172 true
ZZB-405 Male 48 177 false

--------------------------------------------------------------------------------------------------------------------------------

>> dsa = hospital(:, {'Sex','Age','Weight','Smoker'});
>> statarray = grpstats(dsa, 'Sex')

statarray =

          Sex       GroupCount   mean_Age   mean_Weight   mean_Smoker
Female    Female    53           37.717     130.47        0.24528
Male      Male      47           38.915     180.53        0.44681

>> dsa = hospital(:, {'Age','Weight','Smoker'});
>> statarray = grpstats(dsa, [], {'mean','min','max'})

statarray =

       GroupCount   mean_Age   min_Age   max_Age   mean_Weight   min_Weight   max_Weight
All    100          38.28      25        50        154           111          202

       mean_Smoker   min_Smoker   max_Smoker
All    0.34          false        true
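grpstats-style grouped statistics can be reproduced in plain Python; here on just the first five rows of the hospital listing above (so the exact means differ from the full-table results):

```python
from statistics import mean

# First few rows of the hospital listing above: (Sex, Age, Weight, Smoker)
rows = [
    ('Male',   38, 176, True),
    ('Male',   43, 163, False),
    ('Female', 38, 131, False),
    ('Female', 40, 133, False),
    ('Female', 49, 119, False),
]

stats = {}
for sex in ['Male', 'Female']:
    grp = [r for r in rows if r[0] == sex]
    stats[sex] = {
        'GroupCount': len(grp),
        'mean_Age': mean(r[1] for r in grp),
        'mean_Weight': mean(r[2] for r in grp),
        'mean_Smoker': mean(int(r[3]) for r in grp),   # fraction of smokers
    }

print(stats['Female']['mean_Weight'])   # (131 + 133 + 119) / 3
```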