
Wandi Huang Laboratory for Interdisciplinary Statistical Analysis Department of Statistics, Virginia Tech October, 2011


Laboratory for Interdisciplinary Statistical Analysis

Collaboration: Visit our website to request personalized statistical advice and assistance with:

Designing Experiments • Analyzing Data • Interpreting Results • Grant Proposals • Software (R, SAS, JMP, Minitab...)

LISA statistical collaborators aim to explain concepts in ways useful for your research.

Great advice right now: Meet with LISA before collecting your data.

All services are FREE for VT researchers. We assist with research—not class projects or homework.

LISA helps VT researchers benefit from the use of Statistics

www.lisa.stat.vt.edu

LISA also offers:
▪ Educational Short Courses: Designed to help graduate students apply statistics in their research
▪ Walk-In Consulting: M-F 1-3PM in 401 Hutcheson Hall and Wed. 1-3PM in the GLC for questions <30 mins

Introduction to Using JMP®

Wandi Huang
Laboratory for Interdisciplinary Statistical Analysis
Department of Statistics, Virginia Tech
http://www.lisa.stat.vt.edu/
October, 2011

Outline

▪ Introduction
▪ Getting Started
▪ Managing Data
▪ Visualizing Data
▪ Creating Summary Statistics
▪ Performing Basic Statistical Analysis
▪ Saving and Exporting Results
▪ Resources

About JMP®

JMP was developed by SAS Institute Inc., Cary, NC
Using JMP statistical software, you can:
▪ Interact with your graphs and data to discover patterns and relationships in your data
▪ See how the data and the model work together to produce the statistics
▪ Perform statistical summary and analysis

JMP Download and Installation

JMP license information
▪ All Virginia Tech students may download JMP free of charge by going to the Software Distribution Office's Network Software page and logging on using your PID and password
▪ http://www2.ita.vt.edu/software/student/products/sas/jmp/index.html
JMP 9 is available now for both Windows and Mac
Unzip the JMP 9 file, click on the ‘setup’ icon, and follow the instructions for installation

Prerequisites

Before you begin using JMP, note the following information:
▪ You can use many JMP features, such as data manipulation, graphs, and scripting features, without any statistical knowledge
▪ A basic understanding of statistical concepts, such as mean and variation, is recommended
▪ Analytical features require statistical knowledge appropriate for the feature

JMP Terminology

JMP platforms use these windows:
▪ Launch windows, where you set up and run your analysis
▪ Report windows, showing the output of your analysis
Report windows normally contain the following items:
▪ A graph of some type (such as a scatterplot or a histogram)
▪ Specific reports that you can show or hide using the disclosure button
▪ Platform options that are located within red triangle menus

Outline: Getting Started

JMP Home Window (Windows Only)

Alt + Tab to switch among different windows
Ctrl + Q to quit

JMP Data Table

You can enter, view, edit, and manage data using data tables
In a data table, each variable is a column, and each observation is a row
To create a new data table, do one of the following:
▪ Select File > New > Data Table
▪ Press Ctrl + N
▪ Click on the first icon below the File menu

JMP Data Table

This shows an empty data table with no rows and one numeric column, labeled Column 1

Entering Data

Manually:
▪ Move the cursor onto a cell, click in the cell, and enter a value
Construct a formula to calculate column values:
▪ Open the formula editor by right-clicking the column name to which you want to apply the formula and selecting Formula…
▪ Or double-click the column name to which you want to apply the formula, then select Column Properties > Formula > Edit Formula
▪ Select an empty formula element in the formula editing area by clicking it
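As a rough illustration of what a column formula does (this is plain Python, not JMP's formula editor), the sketch below evaluates a formula once per row and stores the results in a new column. The table contents, the column names, and the helper `add_formula_column` are all hypothetical.

```python
# Hypothetical table: each key is a column name, each list index is a row.
table = {
    "SAT Math": [510, 540, 495],
    "SAT Verbal": [500, 525, 505],
}

def add_formula_column(table, name, formula):
    """Evaluate `formula` once per row and store the results as a new column."""
    n_rows = len(next(iter(table.values())))
    rows = [{col: values[i] for col, values in table.items()} for i in range(n_rows)]
    table[name] = [formula(row) for row in rows]

# Column formula: the row-wise sum of two existing columns.
add_formula_column(table, "SAT Total", lambda row: row["SAT Math"] + row["SAT Verbal"])
print(table["SAT Total"])
```

The point of the sketch is only that a formula column is computed from other columns row by row, so it updates whenever it is re-evaluated on changed data.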

Importing Data

You can import many file formats into JMP by default. For example:
▪ Comma-separated (.csv)
▪ .dat files that consist of text
▪ Microsoft Excel 1997–2003 (.xls)
▪ Plain text (.txt)
▪ SAS versions 6–9 on Windows (.sd2, .sd5, .sd7, .sas7bdat)
▪ SPSS files (.sav)
Other files, such as Microsoft Excel 2007 files, require specific Open Database Connectivity (ODBC) drivers

Import from Excel Files

Select File > Open or press Ctrl + O
Or, select all data in the Excel spreadsheet, copy, switch to JMP, create a new data table, and select Edit > Paste with Column Names
Exercise: Open the SAT.xls Excel file in JMP
▪ In the Open Data File window, change ‘All JMP Files’ to ‘All Files’
▪ Copy and paste data in SAT.xls to a JMP data table

Data Table Panels

There are three data table panels:
▪ Table panel
▪ Columns panel
▪ Rows panel
The data table panels are arranged to the left of the data grid
These panels contain information about the table and its contents

JMP Modeling Types

The modeling type of a variable can be one of the following types, shown with its corresponding icon:
▪ Continuous
▪ Ordinal
▪ Nominal
When you import data into JMP, it predicts which modeling types to use:
▪ Character data is considered nominal
▪ Numeric data is considered continuous
To change the modeling type, click on the modeling type icon next to the variable and make your selection
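The default rule described here (character data is nominal, numeric data is continuous) can be sketched in a few lines of Python. This only mimics the rule as stated on the slide; it is not how JMP itself is implemented, and the function name and sample columns are hypothetical.

```python
def default_modeling_type(values):
    """Numeric columns default to 'continuous'; anything else to 'nominal'."""
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "continuous" if is_numeric else "nominal"

# Hypothetical columns for illustration.
columns = {"State": ["VA", "NC", "MD"], "SAT Math": [510, 507, 515]}
types = {name: default_modeling_type(vals) for name, vals in columns.items()}
print(types)
```

A numeric column that really encodes categories (for example a year or an ID) would still come out as continuous under this rule, which is why the modeling type sometimes has to be changed by hand.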

Access Sample Data Tables

All of the examples in the JMP documentation suite use sample data. To access JMP’s sample data tables, select Help > Sample Data. From here, you can:
▪ Open the sample data directory
▪ Open an alphabetized list of all sample data tables
▪ Search for a sample data table within a category
Alternatively, the sample data tables are installed in the following directory:
▪ On Windows: C:\Program Files\SAS\JMP\9\Support Files <language>\Sample Data
▪ On Macintosh: /Library/Application Support/JMP/9/<language>/Sample Data

Saving JMP Sessions

A saved session can help get you back to a previous state without having to manually re-open files and re-run analyses
Select File > Save
By default, JMP asks whether you would like to save the state of your session each time you exit the program
[Screenshot: saving the session upon exiting]

Outline: Managing Data

Adding Rows

To add one or multiple new empty rows, you can take one of the following actions:
▪ Select Rows > Add Rows
▪ Double-click an empty row number area below the last row to add that many empty rows
▪ Double-click the gray lower triangular area in the upper left corner of the data grid. In the Add Rows… window:
  ▪ Enter the number of rows to add
  ▪ Specify where you would like to add them
▪ Right-click in an empty row below the last row, and select Add Rows…
  ▪ Enter the number of rows to add

Deleting Rows

To delete rows from the data grid, you can do one of the following:
▪ Highlight the rows that you want to delete, then select Rows > Delete Rows
▪ Right-click on the row numbers and select Delete Rows

Adding Columns

To add one or multiple new empty columns, you can take one of the following actions:
▪ Select Cols > New Column
▪ Double-click the empty space to the right of the last data table column
▪ Select Cols > Add Multiple Cols… (or double-click the gray upper triangular area in the upper left corner of the data grid). In the Add Multiple Cols… window:
  ▪ Enter the number of columns to add
  ▪ Specify if they are to be grouped
  ▪ Select a data type
  ▪ Enter their location
  ▪ Select the initial data values

Deleting Columns

To delete columns from the data grid, you can do one of the following:
▪ Highlight the columns that you want to delete, then select Cols > Delete Columns
▪ Right-click on the column names and select Delete Columns

Selecting/Deselecting Rows

Select or deselect rows:
▪ Select Rows > Row Selection > Go to Row… to select a certain row number
▪ Select Rows > Row Selection > Select All Rows
▪ Select Rows > Clear Row States
▪ Hold down Shift and click the gray lower triangular area in the upper left corner of the data grid to select all rows. Click again to deselect
To clear all highlights in the data table, press the ESC key on your keyboard

Selecting/Deselecting Columns

Select or deselect columns:
▪ Select Cols > Go to… to select a certain column number or name
▪ Hold down Shift and click the gray upper triangular area in the upper left corner of the data grid to select all columns. Click again to deselect
To clear all highlights in the data table, press the ESC key on your keyboard

Selecting Cells with Specific Values

Selecting cells that match the currently highlighted cell:
▪ Highlight the cells that contain the value(s) that you want to locate
▪ Select Rows > Row Selection > Select Matching Cells
Selecting cells that contain specific values:
▪ Select Rows > Row Selection > Select Where
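Conceptually, selecting rows by a condition amounts to testing every row against that condition and keeping the matching row numbers. The Python sketch below illustrates the idea only; the rows, their values, and the helper `select_where` are hypothetical and say nothing about how JMP performs the selection internally.

```python
# Hypothetical rows from an SAT-style table.
rows = [
    {"State": "Virginia", "SAT Math": 510},
    {"State": "Maryland", "SAT Math": 515},
    {"State": "Virginia", "SAT Math": 498},
]

def select_where(rows, predicate):
    """Return the 1-based row numbers whose values satisfy the predicate."""
    return [i for i, row in enumerate(rows, start=1) if predicate(row)]

selected = select_where(rows, lambda row: row["State"] == "Virginia")
print(selected)
```

The same helper works for numeric conditions, for example `lambda row: row["SAT Math"] > 500`.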

Show/Hide Data

You can suppress (hide) rows and columns so that they are included in analyses but do not appear in plots and graphs. To do so:
▪ Select Hide/Unhide from the Rows menu or Cols menu
▪ A mask icon appears beside the hidden row number or the column name, indicating that the row or column is hidden
To unhide rows or columns, select Hide/Unhide again

Include/Exclude Data

You can exclude data from calculations in analyses. For most platforms, excluded data are not hidden in plots. To do so:
▪ Select Exclude/Unexclude from the Rows menu or Cols menu
▪ A circle with a strikethrough appears beside either the row number or the column name, indicating that the row or column is excluded and not analyzed
To unexclude rows or columns, select Exclude/Unexclude again
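The difference between hiding and excluding is easy to mix up, so here is a small Python sketch of the two row states as the slides describe them: hidden rows stay in the analysis but leave the plot, while excluded rows leave the analysis but (for most platforms) stay in the plot. The data values and the field names `hidden` and `excluded` are made up for illustration.

```python
from statistics import mean

# Hypothetical rows, each with two independent row states.
rows = [
    {"y": 10.0, "hidden": False, "excluded": False},
    {"y": 12.0, "hidden": True,  "excluded": False},  # analyzed, not plotted
    {"y": 50.0, "hidden": False, "excluded": True},   # plotted, not analyzed
]

plotted = [r["y"] for r in rows if not r["hidden"]]
analyzed = [r["y"] for r in rows if not r["excluded"]]

print("points in plot:", plotted)
print("mean used in analysis:", mean(analyzed))
```

To drop a suspected outlier from both the plot and the analysis, a row would need both states set.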

Data Filter

The Data Filter gives you a variety of ways to identify subsets of data
Using Data Filter commands and options, you can interactively select complex subsets of data, hide these subsets in plots, or exclude them from analyses
Select Rows > Data Filter

Data Filter

Exercise: Select data for Virginia
▪ Open SAT data in JMP
▪ Select Rows > Data Filter
▪ Select State and click Add
▪ Check Select for Virginia
▪ You can also check Show or Include
▪ To deselect, click Clear
▪ To choose another variable, click Start Over

Data Filter

To select/show/include continuous variables such as time or weight:
▪ Use sliders to control selection
▪ Drag the end sliders to select the range you want
▪ Need specific end points? Click on those values

Outline: Visualizing Data

Histograms

Histograms visually display the distribution of your data
▪ For categorical (nominal or ordinal) variables, the histogram shows a bar for each level of the ordinal or nominal variable
▪ For continuous variables, the histogram shows a bar for grouped values of the continuous variable
Select Analyze > Distribution
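The two cases above differ only in how values are grouped before counting. A minimal Python sketch, assuming made-up data and an arbitrary bin width and starting point (JMP chooses its own defaults):

```python
import math
from collections import Counter

def level_counts(levels):
    """Categorical column: one bar (count) per distinct level."""
    return dict(Counter(levels))

def binned_counts(values, bin_width, start):
    """Continuous column: one bar per bin [lower, lower + bin_width)."""
    counts = Counter(math.floor((v - start) / bin_width) for v in values)
    return {start + k * bin_width: counts[k] for k in sorted(counts)}

# Hypothetical data for illustration.
regions = ["South", "West", "South", "Midwest"]
sat_math = [480, 495, 510, 515, 530, 560]

print(level_counts(regions))
print(binned_counts(sat_math, bin_width=25, start=475))
```

Changing `bin_width` regroups the same values into wider or narrower bars, which is the idea behind resizing the bars of a continuous histogram.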

Histograms

Exercise: Create a histogram for SAT Math
▪ Open SAT data in JMP
▪ Select Analyze > Distribution
▪ In the Select Columns box, select SAT Math > Y, Columns, then click on OK

Histograms

Interacting with the histogram
Change the orientation:
▪ Click on the ▼ red triangle menu > Histogram Options > Vertical
Display the count within each bar:
▪ Click on the ▼ red triangle menu > Histogram Options > Show Counts
Rescaling the axis (continuous variables only):
▪ Click and drag on an axis to rescale it
▪ Hover over the axis until you see a hand, double-click on the axis, and set the parameters in the X Axis Specification window
Resizing histogram bars (continuous variables only):
▪ Click on the ▼ red triangle menu > Histogram Options > Set Bin Width
▪ Hover over the axis until you see a hand, double-click on the axis, and set the increment in the X Axis Specification window

Histograms

Interacting with the histogram
▪ Clicking on a histogram bar highlights the bar and selects the corresponding rows in the data table
▪ The appropriate portions of all other graphical displays also highlight the selection

Scatterplots

Select Analyze > Fit Y by X
Exercise: Plot SAT Verbal vs. SAT Math
▪ Select Analyze > Fit Y by X
▪ Click SAT Verbal in the Select Columns box > Y, Response
▪ Click SAT Math in the Select Columns box > X, Factor
▪ Click OK

Scatterplots

Interacting with the scatterplots
Suppose we are interested in the point with both SAT Math and SAT Verbal greater than 600
▪ Point at this point and click on it
▪ The point gets highlighted
▪ The corresponding row (row 274) is also highlighted in the data table

Scatterplots

Interacting with the scatterplots
Suppose we are interested in all the points with both SAT Math and SAT Verbal > 580
▪ Shift-click on all the points that satisfy this condition
  • Or, drag a box over all these points
▪ To deselect, Ctrl-click

Scatterplots

Interacting with the scatterplots
Color the selected points red and change the symbol to an empty circle
▪ Right click on the scatterplot
▪ Row Colors
▪ Row Markers
▪ etc.

Scatterplots

Interacting with the scatterplots
Suppose those highlighted points are considered ‘outliers’ and need to be removed from the plot (or the analysis)
▪ Right click on the scatterplot
  ▪ Row Hide
  ▪ Row Exclude
▪ ▼ Red triangle menu > Script > Redo Analysis to update the plot

Scatterplot Matrix

Using the Scatterplot Matrix platform, you can assess the relationships between multiple variables simultaneously
A scatterplot matrix is an ordered collection of bivariate graphs
▪ Select Graph > Scatterplot Matrix
▪ Select Analyze > Multivariate Methods > Multivariate (continuous data only)
Exercise: Help > Sample Data > Iris
▪ Select Sepal length, Sepal width, Petal length, and Petal width and click Y, Columns
▪ Select Species and click Group
▪ Click OK
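To see why a scatterplot matrix grows quickly with the number of columns, the Python sketch below lists the distinct variable pairs for the four Iris measurement columns named in the exercise. It assumes each unordered pair is shown once; the actual panel layout in JMP depends on the matrix format chosen.

```python
from itertools import combinations

variables = ["Sepal length", "Sepal width", "Petal length", "Petal width"]

# Every unordered pair of columns gets its own bivariate graph.
pairs = list(combinations(variables, 2))

for x, y in pairs:
    print(f"{y} vs. {x}")
print("number of bivariate graphs:", len(pairs))
```

With k columns there are k(k-1)/2 such pairs, so four columns give six graphs.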

Scatterplot Matrix

To make the groupings stand out, you can:
▪ From the ▼ red triangle menu, select Density Ellipses
▪ From the ▼ red triangle menu, select Shaded Ellipses

Scatterplot 3D

The Scatterplot 3D platform shows the values of numeric columns in the associated data table in a rotatable, 3D view
Select Graph > Scatterplot 3D
Exercise:
▪ Help > Sample Data > Iris
▪ Select Graph > Scatterplot 3D
▪ Select Sepal length, Sepal width, Petal length, and Petal width and click Y, Columns
▪ Click OK

Scatterplot 3D

Information Displayed on the Scatterplot 3D Report

Scatterplot 3D

Normal Contour Ellipsoids
Exercise: Grouped normal contour ellipsoids
The ellipsoids cover 75% of the data points and are 50% transparent
The contours are color-coded based on species
▪ Help > Sample Data > Iris
▪ Select Graph > Scatterplot 3D
▪ Select Sepal length, Sepal width, Petal length, and Petal width and click Y, Columns
▪ Click OK
▪ ▼ Red triangle menu > Normal Contour Ellipsoids
▪ Select Grouped by Column
▪ Select Species
▪ Type 0.75 next to Coverage
▪ Type 0.5 next to Transparency
▪ Click OK

Scatterplot 3D

Example of Grouped Normal Contour Ellipsoids

Scatterplot 3D

If we select Nonpar Density Contour instead of Normal Contour Ellipsoids, we can create nonparametric density contours

Variability Charts

Variability charts are used when we have multiple categorical x variables and one y variable
Select Graph > Variability/Gauge Chart
Exercise:
▪ Help > Sample Data > Car Physical Data
▪ Select Graph > Variability/Gauge Chart
▪ Select Weight as Y, Response
▪ Select Country and Type as X, Grouping
▪ Click OK

Variability Charts

From the ▼ red triangle menu, you can:
▪ Connect Cell Means (blue lines are added)
▪ Uncheck Show Range Bars (easier to see points)
▪ Show Group Means (purple lines are added)

Bubble Plots

A bubble plot is a scatter plot that represents its points as circles, or bubbles. You can use bubble plots to:
▪ Dynamically animate bubbles using a time variable, to see patterns and movement across time
▪ Use size and color to clearly distinguish between different variables
Bubble plots can produce dramatic visualizations and readily show patterns and trends
Select Graph > Bubble Plot

Bubble Plots

Exercise:
▪ Open SAT data in JMP
▪ Graph > Bubble Plot
  ▪ Select SAT Verbal for Y
  ▪ Select SAT Math for X
  ▪ Select Region, State for ID
  ▪ Select Year for Time
  ▪ Select SAT % Taking (2004) for Sizes
  ▪ Select ACT % Taking (2004) for Coloring
  ▪ Click OK
  ▪ Click on one bubble > ▼ red triangle menu > Trail Lines
  ▪ ▼ Red triangle menu > Save for Adobe Flash platform (.SWF)…

Graph Builder

Graph Builder provides a platform where you can interactively create and modify graphs
Graph types include points, lines, bars, histograms, etc.
It allows you to explore relationships between several variables on the same graph
Select Graph > Graph Builder

Graph Builder

Exercise: Open SAT data
▪ Create a histogram for SAT Math

Graph Builder

Exercise: Open SAT data
▪ Create a histogram for SAT Math by Region

Graph Builder

Exercise: Open SAT data
▪ Create a histogram for SAT Verbal by Region
  ▪ Drag SAT Verbal and drop it on top of SAT Math
  ▪ Where to drop matters

Graph Builder

Exercise: Interaction plot
▪ Open Car Physical Data
▪ Select Graph > Graph Builder
▪ Click, drag and drop Weight to Y
▪ Click, drag and drop Type to X
▪ Click, drag and drop Country to Overlay
▪ Right click on the plot > Add > Line

Graph Builder

Exercise: Car Physical Data

Outline: Creating Summary Statistics

Numerical Summaries of Data

To generate numerical summaries of data, you can:
▪ Create a table that contains columns of summary statistics
▪ Tabulate data so it is displayed in a tabular format

Summarizing Columns

The Tables > Summary command calculates various summary statistics, including the mean, median, standard deviation, and minimum and maximum values
▪ Select Tables > Summary
▪ Select the columns you want to summarize in the Select Columns box
▪ A new data table is created to store all the summary statistics requested, but it is not saved when you close it unless you select File > Save As to give it a name and location
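For readers who want to check a summary table by hand, the statistics named above can be reproduced with Python's standard `statistics` module. The scores below are made up for illustration and are not taken from the SAT data; the sample (n - 1) standard deviation is assumed.

```python
import statistics

# Hypothetical column of scores.
sat_verbal = [500, 510, 520, 530, 540]

summary = {
    "N": len(sat_verbal),
    "Mean": statistics.mean(sat_verbal),
    "Median": statistics.median(sat_verbal),
    "Std Dev": statistics.stdev(sat_verbal),  # sample standard deviation
    "Min": min(sat_verbal),
    "Max": max(sat_verbal),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```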

Summarizing Columns

Exercise: Create summary statistics for SAT Verbal
▪ Open SAT data
▪ Select Tables > Summary
▪ Click SAT Verbal near upper left
▪ Click Statistics button and choose Mean
  • Can choose any statistic
  • Can choose more than one statistic – click Statistics again
▪ Click OK

Tabulating Data

Use the Tables > Tabulate command for constructing tables of descriptive statistics
The tables are built from grouping columns, analysis columns, and statistics keywords
Through its interactive interface for defining and modifying tables, the Tabulate command provides a powerful and flexible way to present summary data in tabular form
[Figure: examples of summary tables]

Tabulating Data

Creating a summary table with the Tabulate command is an iterative process:
▪ Click and drag the items (column names from the column list or statistics from the keywords list) from the appropriate list
▪ Drop the items on the dimension (row table or column table) where you want to place the items’ labels
▪ After creating a table, add to it by repeating the above process

Tabulating Data

When you drag and drop a variable, JMP populates the table automatically if the variable's role is obvious, as with keywords or character columns
Otherwise, a popup menu lets you choose the role for the variable:
▪ Add Grouping Columns – if you want to use the variables to categorize the data. For multiple grouping columns, Tabulate creates a hierarchical nesting of the variables
▪ Add Analysis Columns – if you want to compute the statistics of these columns

Tabulating Data

Exercise: Create descriptive statistics for SAT Math by Region
▪ Open SAT data
▪ Select Tables > Tabulate
▪ Click Region and drag and drop it into the Drop zone for columns
▪ Select Add Grouping Columns
▪ Click Mean and drag and drop it into the first blank cell on the third row
▪ Click Std Dev and drag and drop it just below Mean
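The table built in this exercise pairs one grouping column (Region) with one analysis column (SAT Math) and two statistics (Mean and Std Dev). A rough Python equivalent, using made-up records rather than the SAT data, might look like this:

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical (Region, SAT Math) records.
records = [
    ("South", 500), ("South", 520),
    ("West", 510), ("West", 530), ("West", 550),
]

# Grouping column: collect the analysis values for each level of Region.
by_region = defaultdict(list)
for region, score in records:
    by_region[region].append(score)

# Statistics keywords: one Mean and one Std Dev per group.
tabulated = {
    region: {"Mean": mean(scores), "Std Dev": stdev(scores)}
    for region, scores in by_region.items()
}

for region, stats in tabulated.items():
    print(region, stats)
```

Adding a second grouping column would correspond to keying `by_region` on a tuple such as `(region, state)`, which mirrors the hierarchical nesting that Tabulate creates.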

Tabulating Data

Exercise: Create descriptive statistics for SAT Math by Region

Outline: Performing Basic Statistical Analysis

Types of Data Analysis

One variable (univariate)
▪ Distribution
Two variables (bivariate)
▪ Fit Y by X
More than two variables
▪ Fit Model
More advanced features
▪ Modeling

Comparing Means

One-Sample t-Test
Data: Help > Sample Data > Fitness
Linnerud's Fitness data: fitting oxygen uptake to exercise and other variables. The original is in Rawlings (1988), but certain values of MaxPulse and RunPulse were changed for illustration. Names and Sex columns were contrived for illustration

Comparing Means

One-Sample t-Test
Example: Fitness
▪ Select Analyze > Distribution
▪ Select RunPulse > Y, Columns
▪ Click OK
▪ ▼ Red triangle menu next to RunPulse > Normal Quantile Plot
▪ ▼ Red triangle menu next to RunPulse > Continuous Fit > Normal
▪ ▼ Red triangle menu next to Fitted Normal > Goodness of Fit
▪ ▼ Red triangle menu next to RunPulse > Test Mean
▪ Enter 170 for Specify Hypothesized Mean to test if the mean of RunPulse equals 170
▪ Click OK
▪ Prob >|t| is 0.8485, so there is not enough evidence to reject the null hypothesis
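The statistic behind Test Mean is t = (sample mean - hypothesized mean) / (s / sqrt(n)), with n - 1 degrees of freedom. The Python sketch below computes it for made-up values; these are not the Fitness data, so the result is unrelated to the Prob >|t| reported above. The p-value is left out because the standard library has no t distribution.

```python
import math
import statistics

def one_sample_t(values, hypothesized_mean):
    """Return (t statistic, degrees of freedom) for H0: mean == hypothesized_mean."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    t = (statistics.mean(values) - hypothesized_mean) / standard_error
    return t, n - 1

run_pulse = [166, 170, 174, 178, 172]  # made-up values, not the Fitness data
t_stat, df = one_sample_t(run_pulse, 170)
print(f"t = {t_stat:.4f} on {df} degrees of freedom")
```

A t statistic close to zero, relative to its degrees of freedom, corresponds to a large Prob >|t| and therefore to weak evidence against the hypothesized mean.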

Comparing Means

Paired t-Test – used when you have two related measurements
Create a new column for ‘difference’
▪ Select Cols > New Column
▪ Type Difference in the Column Name box
▪ Select Cols > Formula
▪ Select col 1
▪ Select the subtraction sign
▪ Select col 2
▪ Click OK
▪ Click OK
Then perform the same procedures as for One-Sample t-Test
Or, select Analyze > Matched Pairs
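The recipe above (form the difference column, then run a one-sample test on it) can be written out directly. In this Python sketch the two measurement columns are made up, and the hypothesized mean difference is assumed to be 0.

```python
import math
import statistics

# Hypothetical paired measurements on the same five subjects.
col_1 = [71, 75, 70, 79, 71]
col_2 = [70, 72, 68, 75, 71]

# Step 1: the Difference column (col 1 minus col 2), one value per row.
difference = [a - b for a, b in zip(col_1, col_2)]

# Step 2: one-sample t statistic on the differences, testing mean difference = 0.
n = len(difference)
t_stat = statistics.mean(difference) / (statistics.stdev(difference) / math.sqrt(n))
df = n - 1
print(f"differences = {difference}, t = {t_stat:.4f}, df = {df}")
```

Because the test uses only the differences, between-subject variation drops out, which is the reason for pairing in the first place.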

Page 73: Wandi Huang Laboratory for Interdisciplinary Statistical Analysis Department of Statistics, Virginia Tech  October, 2011

Comparing Means

Two-Sample t-Test – used when you compare the means of two populations
Example: Fitness
▪ Select Analyze > Fit Y by X
▪ Choose Sex > X, Factor
▪ Choose RunPulse > Y, Response
▪ Click OK
▪ ▼ Red triangle menu next to Oneway Analysis of RunPulse by Sex > Normal Quantile Plot
▪ ▼ Red triangle menu next to Oneway Analysis of RunPulse by Sex > UnEqual Variances
▪ ▼ Red triangle menu next to Oneway Analysis of RunPulse by Sex > Means/Anova/Pooled t (for unequal variances, select t-test)
▪ Prob > |t| is 0.1835, so there is not enough evidence to reject the null hypothesis
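The choice between the pooled t and the unequal-variance t-test comes down to how the standard error and degrees of freedom are computed. The following Python sketch contrasts the two, using the usual pooled-variance formula and the Welch-Satterthwaite approximation. The function names and the two small groups are hypothetical, and p-values are not computed.

```python
import math
import statistics


def pooled_t(a, b):
    """Two-sample t statistic and df assuming equal variances."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)   # pooled variance
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """Two-sample t statistic and approximate df without assuming equal variances."""
    na, nb = len(a), len(b)
    qa = statistics.variance(a) / na
    qb = statistics.variance(b) / nb
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(qa + qb)
    df = (qa + qb) ** 2 / (qa ** 2 / (na - 1) + qb ** 2 / (nb - 1))
    return t, df


# Hypothetical groups with clearly different spreads.
group_1 = [1, 2, 3, 4, 5]
group_2 = [2, 4, 6, 8, 10]
t_pooled, df_pooled = pooled_t(group_1, group_2)
t_welch, df_welch = welch_t(group_1, group_2)
print(round(t_pooled, 3), df_pooled, round(t_welch, 3), round(df_welch, 2))
```

With equal group sizes, as here, the two t statistics coincide and only the degrees of freedom differ; with unequal group sizes both can change.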


ANOVA

One-Way ANOVA with two groups – used when you compare the means of two populations

Same as Two-Sample t-Test
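One way to see why the two analyses agree: with two groups, the one-way ANOVA F statistic equals the square of the pooled two-sample t statistic. The Python sketch below checks this numerically on hypothetical data; `oneway_f` and `pooled_t` are illustrative helper names, not JMP functions.

```python
import math
import statistics


def oneway_f(groups):
    """F statistic for one-way ANOVA: between-group mean square over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand = statistics.mean(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def pooled_t(a, b):
    """Pooled-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))


# Hypothetical two-group data.
group_1 = [1, 2, 3, 4, 5]
group_2 = [2, 4, 6, 8, 10]
f_stat = oneway_f([group_1, group_2])
t_stat = pooled_t(group_1, group_2)
print(round(f_stat, 4), round(t_stat ** 2, 4))
```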


ANOVA

One-Way ANOVA with more than two groups – used when you compare the means of more than two populations
Example: Help > Sample Data > Car Physical Data
▪ Select Analyze > Fit Y by X
▪ Select Country > X, Factor
▪ Select Weight > Y, Response
▪ Click OK
▪ ▼ Red triangle menu next to Oneway Analysis of Weight by Country > Normal Quantile Plot
▪ ▼ Red triangle menu next to Oneway Analysis of Weight by Country > UnEqual Variances


ANOVA

One-Way ANOVA with more than two groups
Example: Car Physical Data (cont.) – Residuals
▪ ▼ Red triangle menu next to Oneway Analysis of Weight by Country > Save > Save Residuals
▪ Rename Weight centered by Country as residual
▪ Select Analyze > Distribution > residual > Y, Columns > OK
▪ Select Continuous Fit > Normal > Goodness of Fit
▪ ▼ Red triangle menu next to Oneway Analysis of Weight by Country > Means/ANOVA
▪ Prob > F is 0.0001, which is strong evidence that at least one mean is statistically different from one of the other means
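The saved column name, Weight centered by Country, hints at what the residuals are: each observation minus the mean of its own group. The Python sketch below computes such residuals for three hypothetical groups and uses them to build the sums of squares behind the F ratio. Group labels, values, and function names are assumptions for illustration, not the Car Physical Data.

```python
import statistics


def centered_residuals(groups):
    """Map each group label to its values minus that group's mean."""
    return {
        label: [v - statistics.mean(vals) for v in vals]
        for label, vals in groups.items()
    }


def anova_summary(groups):
    """Return (SS between, SS within, F) for a one-way layout."""
    values = [v for vals in groups.values() for v in vals]
    grand = statistics.mean(values)
    ss_between = sum(
        len(vals) * (statistics.mean(vals) - grand) ** 2 for vals in groups.values()
    )
    # The within-group sum of squares is the sum of squared residuals.
    ss_within = sum(r ** 2 for res in centered_residuals(groups).values() for r in res)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return ss_between, ss_within, (ss_between / df_between) / (ss_within / df_within)


# Hypothetical response values for three groups.
weights = {"g1": [1, 2, 3], "g2": [2, 3, 4], "g3": [6, 7, 8]}
residuals = centered_residuals(weights)
ss_between, ss_within, f_stat = anova_summary(weights)
print(residuals["g1"], ss_between, ss_within, f_stat)
```

The normality check in the steps above is applied to these residuals rather than to the raw response, because the ANOVA assumption concerns the errors around each group mean.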


ANOVA

One-Way ANOVA with more than two groups
Example: Car Physical Data (cont.) – Contrasts
▪ ▼ Red triangle menu next to Oneway Analysis of Weight by Country > Compare Means > Each Pair, Student’s t
▪ The diamonds for 1 and 2 overlap – they are probably not different; the diamonds for 2 and 3 do not overlap – they are probably different
▪ The circles cannot be interpreted unless you interact with them – select a comparison circle to highlight it
▪ ▼ Red triangle menu next to Comparisons for each pair using Student’s t > Difference Matrix
▪ ▼ Red triangle menu next to Comparisons for each pair using Student’s t > Detailed Comparisons Report


ANOVA

One-Way ANOVA with more than two groups
Example: Car Physical Data (cont.) – Contrasts
▪ ▼ Red triangle menu next to Oneway Analysis of Weight by Country > Compare Means > All Pairs, Tukey HSD
▪ Use this test to control the experimentwise error rate at the significance level α (e.g. α=0.05)
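To see why the experimentwise error rate needs controlling, consider how quickly it grows when every pair is tested separately at level α. The Python sketch below uses the simplifying assumption that the pairwise tests are independent, which is not exactly true for comparisons sharing a group, so the result is only a rough guide. It does not implement Tukey's HSD itself, and the function name is hypothetical.

```python
import math


def familywise_error_rate(n_groups, alpha=0.05):
    """Approximate chance of at least one false positive across all pairwise tests.

    Assumes independent tests, which is a simplification.
    """
    n_pairs = math.comb(n_groups, 2)          # number of pairwise comparisons
    return n_pairs, 1 - (1 - alpha) ** n_pairs


n_pairs, fwer = familywise_error_rate(3)
print(n_pairs, round(fwer, 6))
```

With three groups there are three pairwise comparisons, and under this approximation the chance of at least one false positive is already well above the nominal 0.05.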


ANOVA

N-Way ANOVA – used when there is more than one categorical factor
Example: Car Physical Data
▪ Select Analyze > Fit Model
▪ Select Weight > Y
▪ Select Country, Type > Macros > Full Factorial
▪ Click Run
▪ ▼ Red triangle menu next to the response > Factor Profiling > Interaction Plots
▪ ▼ Red triangle menu next to the two-way interaction > LSMeans Plot
▪ The p-value for the interaction is smaller than 0.05, and not all the lines in the interaction plots are parallel – conclude there is a significant interaction between the factors
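Non-parallel lines in an interaction plot mean that the effect of one factor changes across the levels of the other. For a 2 x 2 layout this can be summarized as a difference of differences between cell means, sketched below in Python. The factor levels `a1`, `a2`, `b1`, `b2` and the cell values are hypothetical, and no significance test is performed.

```python
import statistics

# Hypothetical observations for each combination of two two-level factors.
cells = {
    ("a1", "b1"): [10, 12],
    ("a1", "b2"): [14, 16],
    ("a2", "b1"): [11, 13],
    ("a2", "b2"): [21, 23],
}
cell_means = {key: statistics.mean(vals) for key, vals in cells.items()}

# Effect of moving from b1 to b2, evaluated separately at each level of factor A.
effect_b_at_a1 = cell_means[("a1", "b2")] - cell_means[("a1", "b1")]
effect_b_at_a2 = cell_means[("a2", "b2")] - cell_means[("a2", "b1")]

# Parallel lines would make this difference of differences zero.
interaction_contrast = effect_b_at_a2 - effect_b_at_a1
print(effect_b_at_a1, effect_b_at_a2, interaction_contrast)
```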


ANOVA

N-Way ANOVA
Example: Car Physical Data – Contrasts
▪ ▼ Red triangle menu next to Country*Type > LSMeans Contrast
▪ Select the plus sign for USA, Compact; the minus sign for USA, Sporty > Done
▪ Prob > F is 0.03 – a US-made sporty car is heavier than a US-made compact car
▪ ▼ Red triangle menu next to Country*Type > LSMeans Contrast
▪ Select the plus sign for Japan, Sporty; the minus sign for USA, Sporty > Done
▪ Prob > F is 0.01 – a US-made sporty car is heavier than a Japan-made sporty car


Regression

Simple Linear Regression – used to assess the significance of the predictor in explaining the variability in the response
Example: Help > Sample Data > Fitness
▪ Select Analyze > Distribution
▪ Select Age, Shift-click MaxPulse > Y, Columns > OK
▪ Hold down Ctrl and click ▼ Red triangle menu next to Age > Display Options > More Moments
▪ Hold down Ctrl and click ▼ Red triangle menu next to Age > Normal Quantile Plot
▪ Hold down Ctrl and click ▼ Red triangle menu next to Age > Continuous Fit > Normal


Regression

Simple Linear Regression
Example: Fitness (cont.)
▪ Select Analyze > Fit Y by X
▪ Select Oxy > Y, Response
▪ Select Age and hold down Shift and click MaxPulse > X, Factor
▪ Click OK
▪ Select Oxy, Remove from X, Factor
▪ Click OK
▪ Hold down Ctrl and click ▼ Red triangle menu next to Bivariate Fit of Oxy By Age > Density Ellipse > 0.95
▪ Hold down Ctrl and click ▼ Red triangle menu next to Bivariate Fit of Oxy By Age > Fit Mean
▪ Hold down Ctrl and click ▼ Red triangle menu next to Bivariate Fit of Oxy By Age > Fit Line
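Fit Mean and Fit Line can be read as two competing models: a horizontal line at the mean of the response, and the least-squares line. The Python sketch below computes the least-squares slope and intercept by hand, along with R-squared, which measures how much the line improves on the mean-only model. The function name `fit_line` and the data are hypothetical, not the Fitness table.

```python
def fit_line(x, y):
    """Least-squares slope, intercept, and R-squared for y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n                      # the 'Fit Mean' baseline
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, 1 - ss_resid / ss_total


# Hypothetical predictor and response values.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r_squared = fit_line(x, y)
print(round(slope, 3), round(intercept, 3), round(r_squared, 3))
```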


Regression

Multiple Linear Regression – used to model the relationship between many continuous predictors and a single continuous response
Example: Help > Sample Data > Fitness
▪ Select Analyze > Fit Model
▪ Select Oxy > Y
▪ Select Age and Shift-click MaxPulse > Add
▪ Select Oxy, Remove from Model Effects
▪ Run
▪ ▼ Red triangle menu next to Response Oxy > Save Columns > Residuals
▪ Rename Residual Oxy as residual
▪ Select Analyze > Distribution > residual > Y, Columns > OK
▪ Select Continuous Fit > Normal > Goodness of Fit


Regression

Multiple Linear Regression
Example: Fitness (cont.) – Model selection
▪ ▼ Red triangle menu next to Response Oxy > Model Dialog
▪ Select RstPulse from the Model Effects list and select Remove
▪ Run
▪ Select Weight from the Model Effects list and select Remove
▪ Run


Regression

Multiple Linear Regression
Example: Fitness (cont.) – Model selection
▪ Select Analyze > Fit Model
▪ Select Oxy > Y
▪ Select Age and Shift-click MaxPulse > Add
▪ Select Oxy, Remove from Model Effects
▪ Select Standard Least Squares > Stepwise
▪ Run
▪ Direction: Forward > Go
▪ Run Model
▪ Direction: Backward > Enter All > Go
▪ Run Model


Regression

Multiple Linear Regression
Example: Fitness (cont.) – Add interaction and higher order terms
▪ Select Analyze > Fit Model
▪ Select Oxy > Y
▪ Select Age and Ctrl-click Runtime and RunPulse > Macros > Factorial to Degree (2 is used here)
▪ Run
▪ Select Analyze > Fit Model
▪ Select Oxy > Y
▪ Select Age and Ctrl-click Runtime and RunPulse > Macros > Polynomial to Degree (2 is used here)
▪ Run
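The two macros build different sets of model terms from the same selected columns. The Python sketch below lists the terms each would produce at degree 2, on the assumption that Factorial to Degree adds main effects plus all two-way crossings and Polynomial to Degree adds main effects plus each column's squared term. The helper names are hypothetical, and the ordering of terms in JMP's Model Effects list may differ.

```python
from itertools import combinations


def factorial_to_degree_2(factors):
    """Main effects plus every two-way interaction (assumed macro behavior)."""
    interactions = [f"{a}*{b}" for a, b in combinations(factors, 2)]
    return list(factors) + interactions


def polynomial_to_degree_2(factors):
    """Main effects plus a squared term for each factor (assumed macro behavior)."""
    return list(factors) + [f"{f}*{f}" for f in factors]


selected = ["Age", "Runtime", "RunPulse"]
factorial_terms = factorial_to_degree_2(selected)
polynomial_terms = polynomial_to_degree_2(selected)
print(factorial_terms)
print(polynomial_terms)
```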


ANCOVA

A model relating a categorical predictor and a continuous covariate to a single continuous response is known as an analysis of covariance (ANCOVA) model
ANOVA with categorical and continuous predictors
First, identify whether there is an interaction between the predictors
Example 1: DrugLBI – no interactions
Data:
▪ Help > Sample Data > DrugLBI
Notes:
▪ From Snedecor and Cochran, Statistical Methods, 1967
▪ Use Fit Model with 'LBS' as response (Y), 'Drug' and 'LBI' as effects (Xs)


ANCOVA

Example 1: DrugLBI – no interactions
▪ Select Analyze > Fit Model
▪ Select LBS > Y
▪ Select Drug, LBI > Macros > Full Factorial or Factorial to Degree
▪ Click Run
▪ P-value for Drug*LBI = 0.5606, greater than 0.05, indicating that Drug*LBI is not significant, thus can be removed from the model
▪ Examine the interaction in the Regression Plot: a linear regression line is drawn with a different color for each level of Drug. It may be difficult to interpret this graph for statistical significance of the interaction


ANCOVA

Example 1: DrugLBI – no interactions
Re-do the analysis without including the interaction term
▪ Select Analyze > Fit Model
▪ Select LBS > Y
▪ Select Drug, LBI > Add
▪ Click Run
▪ Effect Tests report that Drug is not significant (p-value = 0.1384), and LBI is significant (p-value < 0.0001); it appears that there is no difference among Drug types in the response LBS


ANCOVA

Example 2: Sawblade – model with interaction
Data:
▪ Import Sawblade.xls file to JMP
Notes:
▪ Fit a model to study the effect of blade material and blade speed on heat generation


ANCOVA

Example 2: Sawblade – model with interaction
▪ Select Analyze > Fit Model
▪ Select Heat > Y
▪ Select Material, Speed > Macros > Full Factorial or Factorial to Degree
▪ Click Run
▪ p-value for the interaction term Material*Speed < 0.0001, which is significant
▪ When there is a significant interaction, we cannot make a conclusion about Material or Speed alone; the effect of Material depends on the Speed of the blade
▪ To interpret the interaction, look at the Regression Plot: a linear regression line is drawn with a different color for each level of Material
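In an ANCOVA with interaction, each level of the categorical factor gets its own regression slope on the covariate, which is what the separately colored lines in the Regression Plot show. The Python sketch below fits one least-squares slope per level. The level names `m1` and `m2` and all numbers are hypothetical and are not taken from Sawblade.xls.

```python
def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical (covariate, response) values for two levels of a categorical factor.
data = {
    "m1": ([1, 2, 3], [2, 4, 6]),
    "m2": ([1, 2, 3], [5, 5.5, 6]),
}
slopes = {level: slope(x, y) for level, (x, y) in data.items()}
print(slopes)
```

Clearly different slopes across levels are the pattern a significant interaction term describes; similar slopes correspond to the no-interaction case, where a common slope is used.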


Saving Analyses to Data Table
To reproduce the previous analysis when you re-open the data table, you can:
▼ Red triangle menu > Script > Save Script to Data Table
Reproduce the analysis from the data table by ▼ Red triangle menu > Run Script


Outline

▪ Introduction
▪ Getting Started
▪ Managing Data
▪ Visualizing Data
▪ Creating Summary Statistics
▪ Performing Basic Statistical Analysis
▪ Saving and Exporting Results
▪ Resources


Saving Data Tables

You can save data tables in multiple formats:
▪ JMP data table (.jmp)
▪ SAS Transport File (.xpt)
▪ Excel File (.xls)
▪ Text File (.txt, .dat)
▪ etc.

Select File > Save As


Saving Reports

JMP saves reports in the following formats:
▪ JMP report (.jrp)
▪ Hypertext Markup Language (.htm, .html)
▪ Joint Photographic Experts Group (.jpg)
▪ Microsoft Word (.doc)
▪ Portable Document Format (.pdf)
▪ Portable Network Graphics (.png)
▪ Text File (.txt)
▪ etc.

Select File > Save As


Pasting Reports into Another Program

When you need to use JMP reports or data tables in another program, such as Microsoft Word or PowerPoint, you can copy and paste parts of them into that program's document.
▪ Click the selection tool
▪ Click and drag (or hold down Shift and click) to select items in a report window or data table
▪ Click the selected items and drag them from JMP to the other program
▪ Or, copy the selected items in JMP and paste them into the other program
Note:
▪ To copy all text (no graphs) from the active report window as unformatted text, select Edit > Copy As Text
▪ To copy only the graph (no text), right-click the graph and select Edit > Copy Picture


Pasting Reports into Another Program

Exercise:
▪ Bring up any analysis in JMP
▪ Press Alt and choose the selection tool
▪ Click on the plot
▪ Copy (Ctrl + C) from JMP, then Paste (or Paste Special) into the desired program


Outline

▪ Introduction
▪ Getting Started
▪ Managing Data
▪ Visualizing Data
▪ Creating Summary Statistics
▪ Performing Basic Statistical Analysis
▪ Saving and Exporting Results
▪ Resources


Resources

Help menu
Indexes
Tutorials
Books – JMP documentation
▪ Discovering JMP
▪ Using JMP
▪ Basic Analysis and Graphing
▪ DOE Guide
Sample Data


Resources

On-line resources
http://www.jmp.com/about/events/webcasts/ for webcasts and recorded demos
http://www.jmp.com/academic/ check out Learning Library
▪ JMP 9 Quick Guide


Resources

On-line resources
http://www.lisa.stat.vt.edu/ Welcome to LISA!
http://www.lisa.stat.vt.edu/?q=short_courses LISA short courses


References

JMP Sample Data
▪ Car Physical Data
▪ DrugLBI
▪ Fitness
▪ Iris
▪ SAT
▪ Saw Blade
JMP Documentation
▪ Using JMP
▪ Basic Analysis and Graphing
JMP® Software: ANOVA and Regression Course Notes


Thank You
