Computing for Research I Spring 2011 Primary Instructor: Elizabeth Garrett-Mayer Stata Programming February 28


Computing for Research I
Spring 2011

Primary Instructor: Elizabeth Garrett-Mayer

Stata Programming
February 28


Some simple programming

• Once again, Princeton’s site has some great, easy info: http://data.princeton.edu/stata/programming.aspx

• We will discuss a few things:
– ‘macros’
– looping
– writing commands

• We will not discuss Mata, Stata’s powerful matrix programming language.


macros

• macro = a name associated with some text.
• macros can be local or global in scope.
• Example of use: shorthand for a repeated phrase
– graphics title
– set of ‘adjustment’ covariates
• syntax: local name content


Example: covariates

* use SCBC data
use "I:\Classes\StatComputingI\SCBC2004.dta", clear

* make tumor numeric and transform
gen sizen = real(tumor)
gen logsize = log(sizen)
replace logsize = . if sizen==999

regress logsize age black graden

* define local macro
local adjusters age black graden
regress logsize `adjusters'

regress logsize `adjusters' i.ercat
regress logsize `adjusters' i.prcat
regress logsize `adjusters' i.ercat i.prcat

NOTE: a macro reference must begin with the accent (`) at the upper left of the keyboard as the opening quote and end with the apostrophe (') next to the Enter key as the closing quote.


More examples

local erprknown ercat<9 & prcat<9
regress logsize `adjusters' i.ercat i.prcat if `erprknown'


Example: titles

* another example
infile str14 country setting effort change ///
    using http://data.princeton.edu/wws509/datasets/effort.raw, clear

graph twoway (lfitci change setting) ///
    (scatter change setting) ///
    , title("Fertility Decline by Social Setting") ///
    ytitle("Fertility Decline") ///
    legend(ring(0) pos(5) order(2 "linear fit" 1 "95% CI"))

local gtitles title("Fertility Decline by Social Setting") ytitle("Fertility Decline")

* with macro
graph twoway (lfitci change setting) ///
    (scatter change setting) ///
    , `gtitles' legend(ring(0) pos(5) order(2 "linear fit" 1 "95% CI"))

* without macro
graph twoway (lfitci change setting) ///
    (scatter change setting) ///
    , legend(ring(0) pos(5) order(2 "linear fit" 1 "95% CI"))


Storing results

• Stata commands (and new commands that you and others write) can be classified as follows:

– r-class: General commands such as summarize. Results are returned in r() and generally must be used before executing more commands.

– e-class: Estimation commands such as regress, logistic, etc., that fit statistical models. Results are returned in e() and remain there until the next model is estimated.


(continued)

– s-class: Programming commands that assist in parsing. These commands are relatively rare. Results are returned in s().

– n-class: Commands that do not save results at all, such as generate and replace.

– c-class: Values of system parameters and settings and certain constants, such as the value of π, which are contained in c().


Accessing returned values

• return list, ereturn list, sreturn list and creturn list display all the values stored in r(), e(), s() and c(), respectively.

• For example, after using summarize, r() will contain r(N), r(mean), r(sd), r(sum) etc.

• Elements of each of the vectors can be used when creating new variables. They can also be saved as macros.
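The points above can be sketched as follows. This is a minimal illustration, not part of the original slides; it assumes the auto dataset that ships with Stata so that it runs without the course data.

```stata
* load an example dataset that ships with Stata (assumption: not the course data)
sysuse auto, clear

* summarize is r-class: its results are returned in r()
summarize mpg
return list

* save returned values as local macros before another command overwrites r()
local n    = r(N)
local mean = r(mean)
display "N = `n', mean = `mean'"

* use a returned value when creating a new variable
summarize mpg
generate mpg_centered = mpg - r(mean)

* system parameters and constants are held in c()
display c(pi)
```

Copying r(N) and r(mean) into local macros right away is the safer habit, because the next r-class command replaces the contents of r().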


Using regression results

Although coefficients and standard errors from the most recent model are saved in e(), it is quicker to refer to them by using _b[varname] and _se[varname], respectively.

regress change setting effort

gen fitvals = setting*_b[setting] + effort*_b[effort] + _cons*_b[_cons]

predict fit


Storing results

* run regression and store r-squared value
regress change setting
local rsq = e(r2)
display `rsq'

* run new regression
regress change setting effort
display e(r2)

* see old saved r-squared
display `rsq'
* still there!


Global macros

• Global macros have names of up to 32 characters and, as the name indicates, have global scope.

• You define a global macro using global name [=] text and evaluate it using $name. (You may need to use ${name} to clarify where the name ends.)

• “I suggest you avoid global macros because of the potential for name conflicts.”

• A useful application, however, is to map the function keys on your keyboard. If you work on a shared network folder with a long name try something like this
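A minimal sketch of defining and evaluating a global macro, and of mapping a function key, might look like the following. The folder paths and the macro name datadir are hypothetical and are not taken from the original slides.

```stata
* define a global macro and evaluate it with $name
* (the path is hypothetical)
global datadir "C:/hypothetical/project/data"
display "$datadir"

* braces mark where the name ends when other text follows immediately;
* without them Stata would look for a macro called datadir_backup
display "${datadir}_backup"

* map a function key: the contents of global F5 are typed when F5 is pressed,
* and the trailing semicolon acts like pressing Enter
* (the folder is hypothetical)
global F5 "cd C:/hypothetical/shared/folder/with/a/long/name;"
```

Function-key globals are one of the few cases where a global is convenient, since the definition has to be visible to the whole session rather than to a single do-file.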


More on macros

• Macros can also be used to obtain and store information about the system or the variables in your dataset using extended macro functions.

• For example, you can retrieve variable and value labels, a feature that can come in handy in programming.

• There are also commands to manage your collection of macros, including macro list and macro drop. Type help macro to learn more.
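As a rough sketch of extended macro functions and macro management, assuming the auto dataset that ships with Stata (not the course data) and a throwaway global named demo:

```stata
* example dataset shipped with Stata (assumption: not the course data)
sysuse auto, clear

* extended macro function: retrieve the variable label of mpg
local vlab : variable label mpg
display "`vlab'"

* name of the value label attached to foreign, and the label text for value 1
local vlname : value label foreign
local txt : label (foreign) 1
display "`vlname': 1 = `txt'"

* list the macros currently defined, then drop a global
* (demo is a hypothetical macro created only for this illustration)
global demo "temporary"
macro list
macro drop demo
```

The colon after the macro name is what tells Stata that an extended macro function follows, rather than plain text or an expression.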


Looping

• foreach: loops over the items in a list, such as a set of variables

• forvalues: loops over a set of values (index)

• Also:
– while loops
– if and else sets of commands
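The looping constructs listed above can be sketched as follows. The example is illustrative only and assumes the auto dataset that ships with Stata rather than the course data.

```stata
* example dataset shipped with Stata (assumption: not the course data)
sysuse auto, clear

* foreach: loop over a list of variables
foreach v of varlist mpg weight length {
    quietly summarize `v'
    display "`v': mean = " r(mean)
}

* forvalues: loop over a range of index values
forvalues i = 1/3 {
    display "iteration `i'"
}

* while: loop with a counter held in a local macro
local i = 1
while `i' <= 3 {
    display "while pass `i'"
    local i = `i' + 1
}

* if / else: run one set of commands or the other
quietly summarize mpg
if r(mean) > 20 {
    display "mean mpg is above 20"
}
else {
    display "mean mpg is 20 or below"
}
```

In each loop the index (v or i) is a local macro, so it is referenced inside the braces with the same `name' quoting used for any other local.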


Programming

• ‘ado’ files
• create commands in an ado file and put them in the appropriate directory for Stata to find
• Can also create them in do files for local use
• See
– http://data.princeton.edu/stata/programming.aspx
– www.ssc.upenn.edu/scg/stata/stata-programming-1.ppt
– http://www.ssc.wisc.edu/sscc/pubs/stata_prog2.htm


Ado files

• An ado-file (“automatic do-file”) is a do-file that defines a Stata command. It has the file extension .ado.

• Not all Stata commands are defined by ado-files: some are built-in commands.

• The difference between a do-file and an ado-file is that when the name of the latter is typed as a Stata command, Stata will search for and run that file.

• For example, the program mysum could be saved in mysum.ado and used in future sessions
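The slides do not show how mysum is defined, so the following is only a hypothetical sketch of what such a program could look like. The body, the returned result and the version number are assumptions chosen for illustration.

```stata
* hypothetical contents of mysum.ado (not the definition used in the course)
program define mysum, rclass
    version 11
    * accept exactly one variable name; syntax stores it in the local varlist
    syntax varname
    quietly summarize `varlist'
    display "Mean of `varlist' = " r(mean)
    * pass the mean back to the caller in r(mean)
    return scalar mean = r(mean)
end
```

With this saved as mysum.ado in a folder on Stata's ado-path, typing mysum followed by a variable name would run it like any other command, and return list afterwards would show r(mean).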


Ado files

• Ado-files often have help (.hlp) files associated with them.

• There are three main sources of ado-files:
– Official updates from StataCorp.
– User-written additions (e.g. from the Stata Journal).
– Ado-files that you have written yourself.

• Stata stores these in different locations, which can be reviewed by typing sysdir.


Ado files

• Official updates are saved in the folder associated with UPDATES.

• User-written additions are saved in the folder associated with PLUS.

• Ado-files written by yourself should be saved in the folder associated with PERSONAL.

• If you have an Internet connection, official updates and user-written ado-files can be installed easily.

• To install official updates, type:

update from http://www.stata.com
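A short sketch of commands for inspecting these folders and installing additions is given below. The package name outreg2 is used only as an example of a user-written addition and is not mentioned in the original slides.

```stata
* show the folders Stata uses (UPDATES, PLUS, PERSONAL, ...)
sysdir

* show the full search path for ado-files
adopath

* check whether official updates are available, without installing anything
update query

* install a user-written package from the SSC archive into the PLUS folder
* (outreg2 is only an example package name)
ssc install outreg2
```

Running update query first is a cautious choice on shared or lab machines, where installing updates may require administrator rights.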