Stata Intro
Practice Exercises
2014 - Debby Kermer, George Mason University Libraries Data Services
Instructions
Create and run syntax to accomplish each task. Press the spacebar to see the next instruction, an answer or a hint.
Open the Pew Social Trends Dataset
use http://dataservices.gmu.edu/files/pew.dta
OR
File | Open… [type in] http://dataservices.gmu.edu/files/pew.dta
OR
Download the dataset from: http://www.pewsocialtrends.org/category/datasets/?download=5753
Hint: ___ http://dataservices.gmu.edu/files/pew.dta (the command is use; given at the workshop)
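If data are already in memory, a plain use may stop with an error rather than overwrite them. A minimal sketch, assuming the workshop URL above is still reachable from your machine:

```stata
* "clear" discards any data currently in memory before loading.
* Assumption: the URL from the slide is still online.
use http://dataservices.gmu.edu/files/pew.dta, clear

* Quick check that the file loaded: number of observations and variables.
describe, short
```

If the URL is unavailable, the same commands work with the path of a downloaded copy in place of the URL.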
Exercise 1
Using Help
Produce statistics about yrborn using the summarize command
summarize yrborn
Open the help for that command
help summarize
Modify the syntax to…
… use abbreviations
sum yrborn
or
sum yr
or
su y
… display additional statistics
sum yr, detail
<continues…>
Hints:
summarize
sum yr, _____
1a
Need to create yrborn?
generate yrborn = 2005 - age
summarize yrborn
… ignore those who refused to give their age
sum yr if (age != 99)
or
sum yr if (age < 99)
Now, summarize age, ignoring those who refused to answer
sum age if (age < 99)
… and ALSO display additional statistics
sum age if (age < 99), detail
sum yr if (________)
sum yr if (________)
3 hints
1b
Forgot which value meant refused?
label list AGE
Your result should look like ↓
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      yrborn |      2948    1963.089    18.01353       1915       1995
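To see what the if qualifier does without the Pew file, here is a small sketch on made-up data. The four ages are invented for illustration, with 99 standing in for "refused" as in the exercise.

```stata
* Toy data (hypothetical values, not from the Pew dataset).
clear
input age
25
40
67
99
end

generate yrborn = 2005 - age

* Only the three respondents who gave an age are summarized.
summarize yrborn if (age < 99)
assert r(N) == 3
assert r(min) == 1938
assert r(max) == 1980

* The detail option adds percentiles; the median of 25, 40, 67 is 40.
summarize age if (age < 99), detail
assert r(p50) == 40
```

The assert lines stay silent when the condition holds and stop with an error when it does not, which makes them a handy way to check your own syntax.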
Extra Challenge
Compare average age by Region (cregion)
tab cregion, sum(age)
Notice how this is a combination of both
tab cregion - frequencies for categorical variables
and
sum age - means for numeric variables
But, summarize is used as an option, so the comma and parentheses are necessary
hint
1c
See the help page we used as an example:
help tab
then
tabulate, summarize()
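The same pattern can be tried on the auto dataset that ships with Stata. This is only a stand-in: foreign plays the role of cregion and mpg the role of age.

```stata
* Stand-in example using Stata's built-in auto dataset.
sysuse auto, clear

* Frequencies for a categorical variable...
tabulate foreign

* ...means for a numeric variable...
summarize mpg

* ...and the combination: mean, std. dev. and frequency of mpg
* within each category of foreign.
tabulate foreign, summarize(mpg)
```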
Exercise 2
Indicator Variables
Make a new variable "voted" indicating those who voted in the '04 election. Voters should have a 1, non-voters should have a 0.
First, get information about the variable you will use:
codebook pvote04a
Then, create your variable:
generate voted = (pvote04a == 1)
Use tab pvote04a voted to check your work.
Hints:
codebook ________
generate voted ___________
generate voted = (________)
2a
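A logical expression in parentheses evaluates to 1 when true and 0 when false, which is why it can be used to build an indicator. One thing to watch: a missing value makes (x == 1) false, so it becomes 0 unless you exclude it. A small sketch on made-up data, where the variable vote and its codes (1 = voted, 2 = did not vote) are hypothetical and not taken from the Pew codebook:

```stata
* Toy data with hypothetical codes: 1 = voted, 2 = did not vote, . = missing.
clear
input vote
1
2
1
.
end

* Missing vote is "not equal to 1", so it is coded 0 here.
generate voted = (vote == 1)
assert voted == 0 if missing(vote)

* Adding an if qualifier leaves the missing case missing instead.
generate voted2 = (vote == 1) if !missing(vote)
assert missing(voted2) if missing(vote)

* Cross-tabulate to check the work, showing missing values too.
tabulate vote voted2, missing
```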
If you want, this is how you can label the variable "voted"
label variable voted "Voted in the '04 Election"
label define yesno 1 "Yes" 0 "No"
label values voted yesno
("yesno" is a made-up name, you can use anything)
Now, you try: label the variable "youth" appropriately
lab var youth "Youth: age < 30"
lab def under30 1 "< 30 yrs old" 0 "30 yrs and up"
lab val youth under30
2b
Need to create "youth"?
generate youth = (age < 30)
replace youth = . if (age == 99)
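To confirm that labels were attached as intended, you can list the value label and tabulate the variable. A minimal sketch on made-up ages (the three values are invented; 99 means "refused" as in the exercise):

```stata
* Toy data (hypothetical values).
clear
input age
22
45
99
end

generate youth = (age < 30)
replace youth = . if (age == 99)

label variable youth "Youth: age < 30"
label define under30 1 "< 30 yrs old" 0 "30 yrs and up"
label values youth under30

* Show the value-to-text mapping stored under the name under30.
label list under30

* The table now shows the label text; "missing" also shows the refusal.
tabulate youth, missing
```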
Extra Challenge
In one statement (i.e., one line of syntax), create a variable legal indicating only those of legal drinking age
gen legal = (age >= 21) & (age < 99)
or
gen legal = (age >= 21) if (age < 99)
Although both of the above are good, the values generated by these two commands are not identical. How do they differ?
2c
& recodes 99's as 0
if recodes 99's as missing

                                         Legal Drinker   Not Legal   No Age (99)
gen legal = (age >= 21) & (age < 99)           1             0            0
gen legal = (age >= 21) if (age < 99)          1             0            .

legal | Freq.
------+-------
    1 | 2,842
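The difference can be checked directly by creating both versions side by side. A small sketch on made-up ages; the variable names legal_and and legal_if are hypothetical, used only so both results can coexist:

```stata
* Toy data (hypothetical values): one minor, two adults, one refusal.
clear
input age
18
21
35
99
end

* Version 1: both conditions must be true; age 99 fails the second, so 0.
generate legal_and = (age >= 21) & (age < 99)

* Version 2: the expression is only evaluated where age < 99;
* every other observation is left missing.
generate legal_if = (age >= 21) if (age < 99)

* The two agree for everyone who gave an age...
assert legal_and == legal_if if (age < 99)

* ...and differ only for the refusal.
assert legal_and == 0 if (age == 99)
assert missing(legal_if) if (age == 99)

list
```

Which version is preferable depends on the analysis: with the if version, people of unknown age drop out of later tables and means instead of being counted as "not legal".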
Exercise 3
Illustrating Relationships
3a
Show the relationship between age group and voting rate
What variables can you use?
youth and voted
What command can you use? Open help.
help tab
then
tabulate twoway
Construct your syntax
tab youth voted ___________
Use options to include percentages, like this ↓

           |        voted
     youth |         0          1 |     Total
-----------+----------------------+----------
         0 |     18.68      81.32 |    100.00
         1 |     47.65      52.35 |    100.00
-----------+----------------------+----------
     Total |     23.05      76.95 |    100.00

          Pearson chi2(1) = 179.6007   Pr = 0.000

3 hints
tab youth voted, row nofreq chi2
So, is there a relationship between age and voting?
Among those younger than 30, 52% voted. But, among those 30 or older, 81% voted. Youth were less likely to have voted (p < .001).
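The same options can be tried on Stata's built-in auto dataset. This is a stand-in sketch: foreign takes the place of youth, and highmpg is a hypothetical indicator created here only for the illustration.

```stata
* Stand-in example using Stata's built-in auto dataset.
sysuse auto, clear

* Hypothetical indicator: 1 if mileage is above 20 mpg, 0 otherwise.
generate highmpg = (mpg > 20) if !missing(mpg)

* row    - show row percentages
* nofreq - suppress the raw counts
* chi2   - report Pearson's chi-squared test
tabulate foreign highmpg, row nofreq chi2

* tabulate leaves the test in r(); display it for reporting.
display "chi2 = " r(chi2) "   p = " r(p)
```

Row percentages answer "within each group, what share falls in each column?", which is the comparison the exercise asks for.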
3b
Extra Challenge
3c
That's All!
Thanks for trying the Stata Exercises.
If you have any questions about using Stata, contact Debby Kermer at [email protected] or see our online resources at: http://dataservices.gmu.edu/software/stata