Making Tables and Figures with Stata Biostatistics 212 Lecture 6

Housekeeping

• Brackets indicate optional parts of a command (usually!)
• use vs. insheet
• Low p-value for heterogeneity ≠ important interaction
• “The strata never lie”
• Final projects
  – Read 1-page directions closely
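The first two housekeeping points can be illustrated with a short sketch. It assumes Stata's bundled auto dataset as a stand-in for course data, and the file name auto_example.csv is hypothetical. In recent Stata versions import delimited replaces insheet, so that is the command used here.

```stata
* Brackets in a syntax diagram mark optional pieces, for example:
*     summarize [varlist] [if] [in] [weight] [, options]
sysuse auto, clear                        // bundled example data (stand-in)
summarize                                 // no optional parts
summarize price                           // with a varlist
summarize price if foreign == 1, detail   // with an if condition and an option

* use reads a Stata-format .dta file; insheet (import delimited in
* newer versions) reads a delimited text file such as a .csv
tempfile stata_copy
save `stata_copy'
export delimited using "auto_example.csv", replace   // hypothetical file name
use `stata_copy', clear                              // load the .dta copy
import delimited using "auto_example.csv", clear     // load the .csv copy
```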

Today

• Organizing your Stata files
• Making a table (lab practice = Final Project)
• Making a figure (lab practice = Lab 6)

Organizing your Stata files

• Pitfalls
  – Proliferating datasets
  – Can’t remember what you did
  – Can’t remember why you did it
  – Can’t easily redo with new data

Organizing your Stata files

My system (it’s not perfect)

1) Import data into Stata
   a) Using a Stata command (e.g., insheet or import) within a do file
   b) Using another method (e.g., StatTransfer?) outside a do file, then SAVE the “raw” Stata file immediately
2) Write a do file that “cleans” your data, and saves it as a new clean dataset
3) Write do files for each component of your analysis
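A minimal sketch of steps 1 and 2 as a single do file. The raw CSV is created here from Stata's bundled auto data only so the example runs on its own; the variable heavy and the 3,000 lb cut-off are made up for illustration and are not part of the lecture.

```stata
* Stand-in for a real raw file (assumption): write bundled data to CSV
sysuse auto, clear
export delimited using "Raw data.csv", replace

* ----- Data prep.do (sketch) -----
capture log close
log using "Data prep.log", replace text

* 1) Import the raw data inside the do file
import delimited using "Raw data.csv", clear

* 2) Clean: derive and label variables (hypothetical example variable)
generate byte heavy = weight > 3000 if !missing(weight)
label variable heavy "Weight over 3,000 lb"
label define yesno 0 "No" 1 "Yes"
label values heavy yesno

* Save the result as a new clean dataset; the raw file is never overwritten
save "Clean data.dta", replace
log close
```

Rerunning this one file on a new Raw data.csv rebuilds Clean data.dta, which addresses the "can't easily redo with new data" pitfall.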

My organizational scheme

Raw data:   Raw data.csv
               ↓  Pre-process
In Stata:   Data prep.do, Data prep.log → Clean data.dta
               ↓
            Table 1.do, Table 2.do → Table 1.log, Table 2.log
               ↓  Cut and paste
            Table 1.xls, Table 2.xls
               ↓  Cut and paste
            Table 1.doc, Table 2.doc

Organizing your Stata files

• You will end up with:
  – 1 or 2 Stata datasets
    • Data, from Excel.dta (only if you import outside your do file)
    • Data.dta
  – 1 do file used for cleaning
    • Data prep.do
  – 1 do file to create each Table and Figure
    • Table 1.do, Figure 1.do, Text data.do, etc.
  – Matching log files (with the same names) for each do file
    • Data prep.log, Table 1.log, Figure 2.log, Text data.log, etc.
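One way to keep do file and log file names matched is to set the name once at the top of each do file. This is only a sketch: the local macro name is a hypothetical convention, and the bundled auto data stands in for Clean data.dta.

```stata
* ----- Table 1.do (sketch) -----
local name "Table 1"              // hypothetical: set once per do file
capture log close
log using "`name'.log", replace text   // log gets the same name as the do file

sysuse auto, clear                // stand-in for: use "Clean data.dta", clear
tabulate foreign
summarize price mpg

log close
```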

Organizing your Stata files

• Put them all in one folder called “Stata files” and sort by file type.

• Example

Any questions?

My organizational scheme, repeated with stages labeled Lecture 3, Lecture 5, and Lecture 7

Tables

• Two main purposes
  – Present the facts in a compact format
  – Provide side-by-side comparisons
• Six main components:
  – Data
  – Title, row heading, column headings
  – Row names
  – Footnotes

Browner, W. Publishing and Presenting Clinical Research

5 Steps to Making a Table

• Step 1: Decide what the Table will be about
  – Sketch it out on paper
    • Title, column headings, etc.

5 Steps to Making a Table

• Step 2: Make the dummy table
  – Excel or Word
  – Makes you specify what you actually want!
    • Row headings
    • Decide on category cut-offs, labels
    • Decide on reference categories for regression, etc.
    • Footnote liberally
  – Leave data cells blank

5 Steps to Making a Table

• Step 3: Write a do file that will produce each number you need
  – Iterative process, as you know
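A do file for Step 3 might look like the sketch below. It uses the bundled auto data as a stand-in; the comparison by foreign and the good_repair cut-off are made-up choices for illustration, not the CARDIA analysis.

```stata
sysuse auto, clear                 // stand-in for: use "Clean data.dta", clear

* Column n's for the table header
tabulate foreign

* Continuous rows: mean and SD in each column
tabstat price mpg weight, by(foreign) statistics(mean sd) format(%9.1f)

* Categorical row: column percentages and chi-squared p-value
generate byte good_repair = rep78 >= 4 if !missing(rep78)   // hypothetical cut-off
tabulate good_repair foreign, column chi2

* p-value for a continuous row
ttest mpg, by(foreign)
```

Each command fills one block of cells in the dummy table, so the do file can be read top to bottom alongside the table.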

5 Steps to Making a Table

• Step 4: Copy and Paste the data in
  – Copy and Paste each number, or
  – “Copy Table” (under the “Edit” menu)
    • http://www.stata.com/support/faqs/data/copytable.html
  – Minimize manual retyping, rounding
  – Use Excel to calculate and round for you

5 Steps to Making a Table

• Step 5: Format it so it looks nice
  – Standard, plain style, usually:
    • Horizontal lines, not vertical
    • Double-spaced
    • Footnotes: *, †, ‡, §, ║, ¶ (or a, b, c, d, …)
  – Create a template for yourself

Word vs. Excel for Tables

• Stata → Word
  – Fewer steps, fewer files
  – But…
    • Can’t cut and paste full tables
    • Doesn’t do any calculations for you
    • Formatting can become “corrupted”

Word vs. Excel for Tables

• Stata → Excel → Word
  – Can cut and paste values or whole tables
  – Set rounding, do calculations easily
  – Formatting easier?
  – Copy and Paste into Word (extra step)

Demo

• Table 1 for “Moderate drinking and coronary calcium in young adults: The CARDIA Study”
  – Basic content
  – Sketch
  – Generate numbers in Stata
  – Copy and paste into Word
  – Show final table
  – Demonstrate pasting a full table into Excel

Figures

• When to use a figure?
• Making a figure with Excel
• Making a figure with Stata

When to use a figure?

• When a graphical display of information more effectively conveys the intended message than words.

• “A picture is worth a thousand words”

Figures

• “A picture is worth a thousand words”

[Pie chart: Moderate alcohol consumption in CARDIA participants. No 52%, Yes 48%]

How many words is this picture worth?

48% of CARDIA participants consume alcohol moderately.

Worth = 7 words

Figures

• “A picture is worth a thousand words”

[Two pie charts, White and Black: Alcohol consumption, in drinks/day, in categories 0, <1, 1-1.9, 2+]

How many words is this picture worth?

Drinks/day    White (n=1935)    Black (n=1727)
0             40%               57%
0.1-0.9       39%               26%
1-1.9         13%               9%
2+            8%                8%

Worth = 1 small table?

(and avoid pie charts in general…)

Figures

• “A picture is worth a thousand words”

[Bar chart: Prevalence of coronary calcification in moderate drinkers and abstainers, by race and gender. Groups: Black women, White women, Black men, White men. Bars: Abstainer, Moderate drinker. Y-axis: Prevalence of coronary calcification, 0 to .2]

How many words is this picture worth?

Proportion with CAC

               Abstainer    Mod drinker
Black women    .047         .036
White women    .054         .049
Black men      .068         .132
White men      .180         .167

Can you see the interaction in this table without a figure?

(Figures are good for illustrating interactions)
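The proportions in the table above can be turned back into a grouped bar chart with a few lines of Stata. This is a sketch of one way to do it, not necessarily how the original figure was made; the variable names group, abstainer, moderate, and order are hypothetical.

```stata
clear
input str12 group abstainer moderate
"Black women" .047 .036
"White women" .054 .049
"Black men"   .068 .132
"White men"   .180 .167
end
generate order = _n                       // keep the groups in table order

* One pair of bars per group; with one row per group, the default
* (mean) statistic simply reproduces each proportion
graph bar abstainer moderate, over(group, sort(order))        ///
    legend(order(1 "Abstainer" 2 "Moderate drinker"))         ///
    ytitle("Prevalence of coronary calcification")            ///
    scheme(s1mono)
```

In the chart, the Black men pair is the only one where the moderate-drinker bar is clearly taller than the abstainer bar, which is the interaction the table makes harder to see.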

Figures

• “A picture is worth a thousand words”

[Scatter plot: Change in FEV1 (milliliters), -2000 to 2000, vs. pack-years of exposure to tobacco, 0 to 60. Series: Menthol smokers, Non-menthol smokers, Menthol regression, Non-menthol regression]

How many words is this picture worth?

Worth = 968 data points?

Nice to show actual data points along with main effect, if possible!
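A minimal sketch of overlaying raw data points and a fitted line for two groups. It uses the bundled auto data rather than the FEV1 data, so the variables and labels are stand-ins.

```stata
sysuse auto, clear
* Two scatter layers (the data) plus two linear-fit layers (the main effect)
twoway (scatter mpg weight if foreign == 1, msymbol(plus))            ///
       (scatter mpg weight if foreign == 0, msymbol(circle_hollow))   ///
       (lfit mpg weight if foreign == 1, lpattern(dash))              ///
       (lfit mpg weight if foreign == 0, lpattern(solid)),            ///
       legend(order(1 "Foreign" 2 "Domestic"                          ///
                    3 "Foreign fit" 4 "Domestic fit"))                ///
       ytitle("Mileage (mpg)") xtitle("Weight (lb)") scheme(s1mono)
```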

Making a figure

• With Excel
  – First make a TABLE in Excel!
    • Use Stata to generate numbers for the table
  – Create a figure from the Table using Excel tools
• With Stata
  – Use Stata commands to create the figure directly

Steps in making an Excel figure

• Sketch your figure

• Make a dummy TABLE

• Write a .do file to fill in the table

• Copy and paste from the log file or the results window into the Table

• Use the Chart Wizard to create the Figure

• Format, format, format until it looks nice

Example

• Figure 2 from Lipids paper

Steps in making a Stata figure

• Sketch your figure
• Write a .do file with a graph command
• Format, format, format until it looks nice
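As a rough sketch, a Stata figure do file can be as short as a data step, one graph command, and a save. The bundled auto data and the file name Figure 1.gph are stand-ins.

```stata
* ----- Figure 1.do (sketch) -----
sysuse auto, clear                 // stand-in for: use "Clean data.dta", clear

* One graph command creates the figure directly from the data
graph bar (mean) mpg, over(foreign)                 ///
    ytitle("Mean mileage (mpg)") scheme(s1mono)

* Save the figure so it can be reopened or edited later
graph save "Figure 1.gph", replace
```

Because the figure comes straight from the do file, rerunning it after the data change regenerates the figure with no copying and pasting.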

Pay attention to…

• Formatting
  – Make it look nice and professional, but not gaudy
    • Black and white, usually
  – The time-consuming part of making a figure is usually related to formatting.
• Labeling
  – Your figure should be understandable by itself, without the rest of the manuscript
  – All axes should be labeled.
  – Include important p-values
• The Figure Legend
  – Title, explanations, extra p-values, etc.
  – Separate section in manuscript or at bottom of page, depending on the journal

Stata vs. Excel for Figures

• Excel
  – Flexible and intuitive point-and-click figures
    • Easy to create and modify
    • Flexible, more options, error bars, adjusted estimates, good for bar graphs, etc.
  – But…
    • Requires an extra step: copy/pasting to Excel
    • Harder to reproduce
    • Much harder to do scatter plots

Stata vs. Excel for Figures

• Stata
  – Can create very customizable figures using 1 complex Stata command
    • Easy to recreate: simple do file
    • No error
    • Scatter plots are MUCH easier with Stata
  – But…
    • Harder to create the first time? No point and click
    • A little less flexible?
    • Difficult to format: the Graph Editor helps address this

Stata vs. Excel for Figures

Figure type             Easiest tool
Bar/Line* figures       Excel
Confidence intervals    Excel
Scatter plots           Stata
Box plots               Stata
Dot plots               Stata

* Where x-axis is categorical
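For the two plot types listed as easiest in Stata besides scatter plots, a minimal sketch with the bundled auto data (a stand-in; the graph names box_mpg and dot_mpg are arbitrary):

```stata
sysuse auto, clear

* Box plot of a continuous variable across categories
graph box mpg, over(foreign)                                   ///
    ytitle("Mileage (mpg)") scheme(s1mono) name(box_mpg, replace)

* Dot plot of a summary statistic across categories
graph dot (mean) mpg, over(rep78)                              ///
    ytitle("Mean mileage (mpg)") scheme(s1mono) name(dot_mpg, replace)
```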

Stata Demo

• Scatter plots: bmi vs. lipids
  – Iterative process of adding commands to do file
  – Cutting and pasting with substitution
  – Lowess smoother and linear fit lines
  – Post-graph editing
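The iterative approach can be sketched as follows: start from a bare scatter plot, then paste the command again and add one layer or option at a time. The bundled auto data stands in for the bmi and lipid variables used in the demo.

```stata
sysuse auto, clear

* Iteration 1: the bare scatter plot
twoway scatter price mpg

* Later iteration: same command with a lowess smoother, a linear fit,
* and labeling options added one at a time
twoway (scatter price mpg, msymbol(circle_hollow))            ///
       (lowess price mpg, lpattern(dash))                     ///
       (lfit price mpg, lpattern(solid)),                     ///
       legend(order(2 "Lowess smoother" 3 "Linear fit"))      ///
       ytitle("Price (US$)") xtitle("Mileage (mpg)") scheme(s1mono)
```

Keeping each iteration in the do file (or commenting out the older ones) documents how the final figure was reached.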

* Raw data points for each group, plus fitted lines (m, nm) for each group
twoway (scatter dfev1 cumpy10 if menthol1==1, msymbol(plus) msize(small) mcolor(black)) ///
       (scatter dfev1 cumpy10 if menthol1==0, msymbol(circle_hollow)) ///
       (line m cumpy10 if menthol1==1, sort clcolor(black) clpat(dash) clwidth(thick)) ///
       (line nm cumpy10 if menthol1==0, sort clcolor(black) clpat(solid) clwidth(thick)) ///
       , ytitle(Change in FEV1 (milliliters), size(large)) yscale(titlegap(5)) ///
       xtitle(Pack-years of exposure to tobacco, size(large)) ///
       xscale(titlegap(3)) ///
       legend(order(1 "Menthol smokers" 2 "Non-menthol smokers" 3 "Menthol regression" ///
                    4 "Non-menthol regression")) ///
       scheme(s1mono) ///
       graphregion(fcolor(none) lcolor(none) ifcolor(none) ilcolor(none)) ///
       plotregion(fcolor(none) lcolor(none) ifcolor(none) ilcolor(none))

[Resulting figure: Change in FEV1 (milliliters), -2000 to 2000, vs. pack-years of exposure to tobacco, 0 to 60. Legend: Menthol smokers, Non-menthol smokers, Menthol regression, Non-menthol regression]

Key points

• It’s worth putting thought into your file organization
• Tables:
  – First sketch it on paper
  – Next make your dummy table
  – Only then write your do file
• Figures:
  – Make it with Stata if you can
  – Use dialog boxes to get the correct syntax, paste into your do file
  – Make bar charts with Excel
• Document everything you do!

Today’s Lab

• You will create 4 figures using Stata.

• The focus will be on learning to use the dialog boxes to get the syntax right, and then transferring commands and options into a do file.
