Introduction to MATLAB: Data Analysis and Statistics
IAP 2007

Introduction to MATLAB

Violeta Ivanova, Ph.D.
Office for Educational Innovation & Technology

[email protected]
http://web.mit.edu/violeta/www

Topics

MATLAB Interface and Basics
Calculus, Linear Algebra, ODEs
Graphics and Visualization
Basic Programming
Programming Practice
Statistics and Data Analysis

Resources

Class materials: http://web.mit.edu/acmath/matlab/IAP2007
Previous sessions: InterfaceBasics, Graphics
This session: Statistics <.zip, .tar>

Mathematical Tools at MIT: http://web.mit.edu/ist/topics/math

MATLAB Help Browser

MATLAB
+ Data Analysis
+ Preparing Data for Analysis
+ Data Fitting Using Linear Regression

Curve Fitting Toolbox
+ Fitting Data

Statistics Toolbox
+ Descriptive Statistics
+ Linear Models
+ Hypothesis Tests
+ Statistical Plots

MATLAB Data Analysis

Preparing Data
Correlation
Basic Fitting

Data Input / Output

Import Wizard for data import
File->Import Data …

File input with load
B = load('datain.txt')

File output with save
save('dataout', 'A', '-ascii')
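The save and load calls above can be combined into a short round trip. This is a minimal sketch, assuming a small made-up matrix A and a temporary file name (dataout.txt in tempdir); neither comes from the course materials.

```matlab
% Hypothetical example matrix and file name (assumptions for illustration).
A = [1 2 3; 4 5 6];
outfile = fullfile(tempdir, 'dataout.txt');

% Write A as plain text; '-ascii' stores values with 8 significant digits.
save(outfile, 'A', '-ascii');

% Read the text file back into a numeric matrix.
B = load(outfile, '-ascii');

% Compare the matrix that was written with the one that was read.
disp(isequal(A, B))
```

Because '-ascii' keeps only 8 significant digits, values with more precision may not survive the round trip exactly; small integers like the ones here do.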

Missing Data

Removing missing data
Removing NaN elements from vectors
>> x = x(~isnan(x))
Removing rows with NaN from matrices
>> X(any(isnan(X),2),:) = []

Interpolating missing data
YI = interp1(X, Y, XI, 'method')
Methods: 'spline', 'nearest', 'linear', …
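As an illustration of both approaches, the sketch below removes NaN values and, alternatively, fills them by linear interpolation. The sample vectors t and y and the matrix X are made up for this example.

```matlab
% Hypothetical sample: y = 2*t with two missing readings.
t = 1:6;
y = [2 4 NaN 8 NaN 12];

% Option 1: drop the missing elements.
ok = ~isnan(y);
yClean = y(ok);

% Option 2: estimate the missing elements from their neighbours.
yFilled = y;
yFilled(~ok) = interp1(t(ok), y(ok), t(~ok), 'linear');

% Remove every row of a matrix that contains at least one NaN.
X = [1 2; NaN 3; 4 5];
X(any(isnan(X), 2), :) = [];

disp(yClean)
disp(yFilled)
disp(X)
```

Dropping values shortens the series, while interpolation keeps its length at the cost of inventing plausible values; which is appropriate depends on the analysis that follows.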

Correlation

Definition
Tendency of two variables to increase or decrease together.

Measure
Pearson product-moment coefficient

$$ \rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y} $$

Correlation Example
Import Data: cancersmoking.dat
Correlation coefficients (R) & p-values (P); pairs with P < 0.05 are significant at the 5% level

>> [R, P] = corrcoef(X);
>> [i, j] = find(P < 0.05);
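Since cancersmoking.dat is not reproduced here, the following sketch uses a small made-up data matrix to show the same calls. It also recomputes one coefficient from the Pearson definition (covariance divided by the product of standard deviations) as a cross-check.

```matlab
% Hypothetical data: y follows x closely, z is an unrelated column.
x = (1:10)';
y = 2*x + [0.3 -0.2 0.1 -0.4 0.2 -0.1 0.4 -0.3 0.1 -0.1]';
z = [3 1 4 1 5 9 2 6 5 3]';
X = [x y z];

% R holds pairwise correlation coefficients, P the matching p-values.
[R, P] = corrcoef(X);

% Row and column indices of variable pairs significant at the 5% level.
[i, j] = find(P < 0.05);

% Cross-check R(1,2) against the definition cov(x,y) / (std(x)*std(y)).
C = cov(x, y);
rho = C(1, 2) / (std(x) * std(y));

disp(R)
disp([i j])
```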

Data Statistics
Figure Editor: smokecancer.fig
Tools->Data Statistics

Basic Fitting
Figure Editor: Tools->Basic Fitting …
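The Basic Fitting tool is interactive, but a comparable polynomial fit can be sketched at the command line with polyfit and polyval. The data below are made up, and this is only an approximation of what the tool reports, not a description of its internals.

```matlab
% Hypothetical data lying roughly on the line y = 2*x + 1.
x = 0:5;
y = [1.1 2.9 5.2 7.1 8.8 11.0];

% Least-squares fit of a first-degree polynomial: p = [slope intercept].
p = polyfit(x, y, 1);

% Evaluate the fitted line and compute the residuals.
yfit = polyval(p, x);
resid = y - yfit;

% Norm of residuals, a simple summary of fit quality.
normResid = norm(resid);

disp(p)
disp(normResid)
```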

Statistics Toolbox

Probability Distributions
Descriptive Statistics
Linear & Nonlinear Models
Hypothesis Tests
Statistical Plots

Descriptive Statistics

Central tendency
>> m = mean(X)
>> gm = geomean(X)
>> med = median(X)
>> mo = mode(X)

Dispersion
>> s = std(X)
>> v = var(X)
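A small worked example of these functions, using a made-up vector. Note that geomean belongs to the Statistics Toolbox, while the others are part of base MATLAB.

```matlab
% Hypothetical sample data.
X = [2 4 4 8 16];

% Measures of central tendency.
m   = mean(X);      % arithmetic mean
gm  = geomean(X);   % geometric mean (Statistics Toolbox)
med = median(X);    % middle value
mo  = mode(X);      % most frequent value

% Measures of dispersion (both normalise by n-1 by default).
s = std(X);
v = var(X);

fprintf('mean %.4f, geomean %.4f, median %g, mode %g\n', m, gm, med, mo);
fprintf('std %.4f, var %.4f\n', s, v);
```

The geometric mean is smaller than the arithmetic mean here because the single large value (16) pulls the arithmetic mean upward more strongly.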

Probability Distributions

Probability density functions
>> Y = exppdf(X, mu)
>> Y = normpdf(X, mu, sigma)

Cumulative distribution functions
>> Y = expcdf(X, mu)
>> Y = normcdf(X, mu, sigma)

Parameter estimation
>> m = expfit(data)
>> [m, s] = normfit(data)
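The sketch below evaluates a few of these Statistics Toolbox functions on made-up inputs. The evaluation points and the data vector are assumptions chosen so the results are easy to check by hand.

```matlab
% Standard normal density and cumulative probability at -1, 0, 1.
mu = 0;
sigma = 1;
x = -1:1;
pdfVals = normpdf(x, mu, sigma);
cdfVals = normcdf(x, mu, sigma);

% Exponential CDF with mean 2, evaluated at 2: 1 - exp(-1).
expVal = expcdf(2, 2);

% Hypothetical measurements for parameter estimation.
data = [4.1 5.2 4.8 5.5 4.9 5.0 5.3 4.7];

% normfit returns the sample mean and sample standard deviation.
[m, s] = normfit(data);

% expfit returns the estimated mean of an exponential distribution.
lambdaHat = expfit(data);

disp(pdfVals)
disp(cdfVals)
fprintf('normfit: m = %.4f, s = %.4f; expfit: %.4f\n', m, s, lambdaHat);
```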

Statistical Plots
>> bp = boxplot(X, group)

Polynomial Fitting Tool
>> polytool(X, Y)

Distribution Fitting Tool
>> dfittool

Linear Models

Definition:

$$ y = X\beta + \varepsilon $$

y: n x 1 vector of observations
X: n x p matrix of predictors
β: p x 1 vector of parameters
ε: n x 1 vector of random disturbances

Linear Regression

Multiple linear regression
>> [B, Bint, R, Rint, stats] = regress(y, X)

B: vector of regression coefficients
Bint: matrix of 95% confidence intervals for B
R: vector of residuals
Rint: intervals for diagnosing outliers
stats: vector containing the R^2 statistic etc.

Residuals plot
>> rcoplot(R, Rint)
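A minimal sketch of a regress call on made-up data follows. One detail worth showing is that regress (Statistics Toolbox) does not add an intercept on its own, so the predictor matrix needs a column of ones. The backslash solve is included only as a cross-check of the coefficients.

```matlab
% Hypothetical data lying roughly on the line y = 2*x + 1.
x = (1:8)';
y = [3.1 4.9 7.2 9.0 10.8 13.1 15.0 16.9]';

% Predictor matrix: a column of ones for the intercept, then x.
X = [ones(size(x)) x];

% B = [intercept; slope], Bint = 95% confidence intervals,
% R = residuals, Rint = residual intervals, stats(1) = R^2.
[B, Bint, R, Rint, stats] = regress(y, X);

% Cross-check the coefficients with an ordinary least-squares solve.
Bcheck = X \ y;

fprintf('intercept %.4f, slope %.4f, R^2 %.4f\n', B(1), B(2), stats(1));

% rcoplot(R, Rint) would then plot the residuals with their intervals.
```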

Hypothesis Testing

Definition: use of statistics to determine the probability that a given hypothesis is true.

Null hypothesis (observations are the result of pure chance) and alternative hypothesis.

Test statistic to assess truth of null hypothesis.

P-value: probability of obtaining a test statistic at least as extreme as the one observed if the null hypothesis were true.

Comparison of P-value to acceptable α-value.
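The steps above can be sketched with a one-sample t-test (ttest, Statistics Toolbox), which tests the null hypothesis that the data come from a distribution with mean zero. Both samples below are made up for illustration.

```matlab
% Hypothetical sample whose mean is clearly away from zero.
x = [0.8 1.2 0.9 1.1 1.3 0.7 1.0 1.2];

% Hypothetical sample whose mean is exactly zero.
y = [0.3 -0.2 0.1 -0.4 0.2 -0.1 0.4 -0.3];

% Acceptable significance level (ttest uses 0.05 by default).
alpha = 0.05;

% h = 1 means the null hypothesis is rejected at the 5% level.
[hx, px] = ttest(x);
[hy, py] = ttest(y);

% The decision is the comparison of the p-value with alpha.
fprintf('x: h = %d, p = %.3g, reject = %d\n', hx, px, px < alpha);
fprintf('y: h = %d, p = %.3g, reject = %d\n', hy, py, py < alpha);
```

A non-rejection for y does not show that its true mean is zero, only that this sample gives no evidence against it.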

Analysis of Variance (ANOVA)

One-way ANOVA
>> anova1(X, group)

Multiple Comparisons

>> [p, tbl, stats] = anova1(X, group)
>> [c, m] = multcompare(stats)
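A hedged sketch of the two calls on made-up data with three groups. The 'off' argument to anova1 and the 'Display' option of multcompare suppress the figures; both functions require the Statistics Toolbox.

```matlab
% Hypothetical measurements: group B is shifted upward relative to A and C.
y = [5.1 4.9 5.4 5.0, 6.2 6.0 6.4 5.9, 5.2 5.0 5.3 4.8]';
group = [repmat({'A'}, 4, 1); repmat({'B'}, 4, 1); repmat({'C'}, 4, 1)];

% One-way ANOVA: p is the p-value for "all group means are equal".
[p, tbl, stats] = anova1(y, group, 'off');

% Pairwise comparisons: each row of c compares two groups, and
% m lists the estimated group means with their standard errors.
[c, m] = multcompare(stats, 'Display', 'off');

fprintf('ANOVA p-value: %.3g\n', p);
disp(c)
disp(m)
```

A small ANOVA p-value only says that some group differs; the rows of c indicate which pairs differ, since an interval that excludes zero marks a significant difference.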

More Built-In Functions

Two-way ANOVA
>> [P, tbl, stats] = anova2(X, reps)

Other hypothesis tests
>> H = ttest(X)
>> H = lillietest(X)

Data Analysis Exercises
Exercise One: dataanalysis.m, rfid.dat, barcode.dat

Correlation coefficient
Hypothesis testing
Statistical plots
ANOVA

Follow instructions in the m-file …

Curve Fitting Toolbox

Curve Fitting Tool
Goodness of Fit
Analyzing a Fit
Fourier Series Fit

Curve Fitting Tool
>> cftool

Goodness of Fit Statistics

Analyzing a Fit

Fourier Series Fit

Data Analysis Exercises
Exercise Two: regression.m, worlddata.dat, star.txt

Linear regression
Polynomial fitting
Probability density function fitting
Goodness of Fit

Follow instructions in the m-file …