Summer SAS Workshop Lecture 3


Summer 2007

SAS Workshop Website

www.musc.edu/~simpsona/SASWorkshop/


Part I of Lecture 3

Thinking through a programming problem
Programming logic
Subsetting data


How to program?

What are your goals?
What does your data look like?
How does your data need to look to accomplish your goals?
What is the first thing you type and run when you open SAS and want to start coding?


Dropping or Keeping Variables

You often get big data sets but want to use only part of them.

First, drop the variables that you don’t want (or keep the ones you do want).

Data newdata;
    set annie.olddataset;
    Keep name ssnumber visdate dob;
Run;

Or

Data newdata;
    set annie.olddataset (Keep= name ssnumber visdate dob);
Run;
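The slide shows only KEEP, but DROP can be written in the same two positions. A minimal sketch, assuming a small made-up data set in the WORK library (the `address` variable and the data values are hypothetical, standing in for `annie.olddataset`):

```sas
* Hypothetical stand-in for annie.olddataset;
data olddataset;
    input name $ ssnumber visdate :mmddyy10. dob :mmddyy10. address $;
    format visdate dob mmddyy10.;
    datalines;
Smith 1111 06/01/2007 02/14/1960 Main
Jones 2222 06/15/2007 09/30/1955 Oak
;
run;

* DROP as a statement: address is read in but not written out;
data newdata;
    set olddataset;
    drop address;
run;

* DROP= as a data set option: address is never read in;
data newdata2;
    set olddataset (drop=address);
run;
```

Both versions should leave the same four variables in the new data set. The data set option form is often preferred for very wide data sets because the unwanted variables are never read.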


Conditional Logic [If-Then-Else]

Frequently, you want an assignment statement to apply to some observations but not all - under some conditions, but not others.

This is also how you create new variables by recategorizing the old variables into new groupings.

1) IF condition THEN action;

2) IF condition THEN action;
   ELSE IF condition THEN action;
   ELSE IF condition THEN action;

3) IF condition THEN action;
   ELSE IF condition THEN action;
   ELSE action;
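As an illustration of the third form, here is a sketch that recategorizes a numeric variable into groups. The data set `patients`, the variable names, and the cut points are all hypothetical:

```sas
* Hypothetical data: one row per patient with an age in years;
data patients;
    input id age;
    datalines;
1 34
2 67
3 .
4 18
;
run;

data patients2;
    set patients;
    length agegroup $ 8;
    * Test for missing first, because a missing age would also pass age < 40;
    if age = . then agegroup = 'Missing';
    else if age < 40 then agegroup = 'Under 40';
    else if age < 65 then agegroup = '40 to 64';
    else agegroup = '65 plus';
run;

proc print data=patients2;
run;
```

Because each ELSE is only reached when the conditions before it were false, every observation lands in exactly one group.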


IF-THEN-ELSE Rules

A single IF-THEN statement can only have one action. If you add the keywords DO and END, then you can execute more than one action as a group (a DO group, not a loop).

You can also specify multiple conditions with the keywords AND and OR.

*Remember: SAS considers missing values to be smaller than non-missing values, so a condition such as x < 10 is true when x is missing.
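A short sketch of these rules, assuming a hypothetical data set `patients` with made-up variables `age` and `sbp`:

```sas
* Hypothetical data: age in years and systolic blood pressure;
data patients;
    input id age sbp;
    datalines;
1 34 118
2 67 152
3 . 160
;
run;

data flagged;
    set patients;
    length note $ 12;
    young = 0;
    * Without the age > . check, the row with missing age would also pass age < 40;
    if age > . and age < 40 then do;
        young = 1;
        note = 'Under 40';
    end;
    * OR is true when at least one of the comparisons is true;
    if age >= 65 or sbp >= 160 then note = 'Review';
run;
```

The DO group lets one IF-THEN statement set two variables. It runs at most once per observation, so it does not repeat anything.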


Comparison Operators

These operators can be coded using symbols or mnemonics.

Symbol   Mnemonic   Meaning
=        EQ         Equals
~=       NE         Not equal
>        GT         Greater than
<        LT         Less than
>=       GE         Greater than or equal
<=       LE         Less than or equal
&        AND        All comparisons must be true
|        OR         At least one comparison must be true


Subsetting

Often programmers find that they want to use some of the observations in a data set and exclude the rest. The most common way to do this is with a subsetting IF statement in a DATA step.

Syntax: IF expression;
Ex: IF Sex = 'f';


Subsetting (cont.)

If the expression is true, then SAS continues with the DATA step. If the expression is false, then no further statements are processed for that observation, the observation is not added to the data set being created, and SAS moves to the next observation.

While the subsetting IF statement tells SAS which observations to include, the DELETE statement tells SAS which observations to exclude:

IF expression THEN DELETE;
IF Sex = 'm' THEN DELETE; (same as IF Sex = 'f'; when Sex is always either 'm' or 'f')
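The two approaches can be compared side by side. This is a sketch on a made-up data set `people` (names and values are hypothetical), including one observation with a missing value for `sex` to show where the two forms differ:

```sas
* Hypothetical data. The period for Dee is read as a missing character value;
data people;
    input name $ sex $;
    datalines;
Ann f
Bob m
Cat f
Dee .
;
run;

* Subsetting IF keeps only rows where the expression is true;
data females;
    set people;
    if sex = 'f';
run;

* DELETE removes rows where the expression is true. Dee is kept here;
data notmales;
    set people;
    if sex = 'm' then delete;
run;
```

Here `females` ends up with two observations and `notmales` with three, because the observation with missing `sex` is neither 'f' nor 'm'.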


Open SAS code from website

Go through the code.
Run the program.
Questions?
How could we make a new data set with only Males in it?


Part II of Lecture 3

Merging data sets
SAS Functions


It’s all in the way that you look at things …


Combining Data Sets Using One-to-One Match Merge

Use a match merge when you have two data sets with related data and you want to combine them.

If you merge two data sets and they have variables with the same names (besides the BY variables), then the variables from the second data set will overwrite any variables having the same names in the first data set.

All observations from old data sets will be included in the new data set whether they have a match or not.


Match Merge Example

Proc Sort Data = Rat;
    BY RatID Date;
Run;

Proc Sort Data = Rat2;
    BY RatID Date;
Run;

DATA BigRat;
    MERGE Rat Rat2;
    BY RatID Date;
Run;
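To see the overwriting and the unmatched observations in action, here is a sketch with made-up contents for the two rat data sets. The variables `Weight` and `Dose` and all values are assumptions, and only `RatID` is used as the BY variable to keep the data short:

```sas
* Hypothetical first data set: weights only;
data rat;
    input RatID Weight;
    datalines;
1 250
2 265
3 240
;
run;

* Hypothetical second data set: doses plus another Weight variable;
data rat2;
    input RatID Dose Weight;
    datalines;
1 10 251
3 20 .
4 15 300
;
run;

proc sort data=rat;
    by RatID;
run;

proc sort data=rat2;
    by RatID;
run;

data bigrat;
    merge rat rat2;
    by RatID;
run;

proc print data=bigrat;
run;
```

All four rats appear in `bigrat`. Rat 2 has no match in `rat2`, so its `Dose` is missing. For rats 1 and 3, `Weight` comes from `rat2`, and for rat 3 that means the missing value replaces 240. Renaming one of the shared variables before merging avoids this kind of silent overwrite.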


SAS Functions

Built-in SAS functions are used to simplify some complex programming problems.

They usually perform arithmetic or mathematical calculations.

Syntax of a function used in an expression:
NewVar = FunctionName (VariableName);


Common Functions

Log ( );          Round (x, 1);
Log10 ( );        Mean ( );
Sin ( );          RANUNI ( );
Cos ( );          Put ( );
Tan ( );          Input ( );
Int ( );          Lag ( );
SQRT ( );         Dif ( );
Weekday ( );      N ( );
MDY ( , , );      NMISS ( );
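A sketch of several of these functions used in assignment statements. The toad data values below are made up for illustration and the new variable names are hypothetical:

```sas
* Hypothetical data. The period is a missing jump;
data toads;
    input ToadName $ Weight Jump1 Jump2 Jump3;
    datalines;
Lucky 2.3 1.9 . 3.0
Spot 4.6 2.5 3.1 0.5
;
run;

data toads2;
    set toads;
    LogWeight = log(Weight);                * natural log;
    RootWeight = sqrt(Weight);
    WholeWeight = int(Weight);              * integer part only;
    RoundWeight = round(Weight, 1);         * nearest whole number;
    AvgJump = mean(Jump1, Jump2, Jump3);    * mean of the non-missing values;
    NJumps = n(Jump1, Jump2, Jump3);        * count of non-missing values;
    NMissing = nmiss(Jump1, Jump2, Jump3);  * count of missing values;
    TestDate = mdy(7, 4, 2007);             * month then day then year;
    TestDay = weekday(TestDate);            * 1 is Sunday and 7 is Saturday;
    format TestDate mmddyy10.;
run;

proc print data=toads2;
run;
```

Note that `mean` skips missing arguments, whereas adding the jumps with `+` and dividing by 3 would give a missing result for Lucky.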


Part III of Lecture 3

Debugging


Debugging?

“If debugging is the process of removing bugs, then programming must be the process of putting them in.”

–From some strange, but insightful website


Syntactic Errors vs. Logic Errors

We will focus mainly on syntax errors; however, it is also possible for SAS to calculate a new variable using syntactically correct code that results in inaccurate calculations, i.e., a logic error.

For this reason, it is always wise to check values of a new variable against values of the original variable used in the calculation.
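One way to make that check is to list every combination of the original and new values. The sketch below uses a made-up data set (`patients`, `age`, and `young` are hypothetical names) containing a deliberate logic error:

```sas
* Hypothetical data with one missing age;
data patients;
    input id age;
    datalines;
1 34
2 67
3 .
;
run;

* Runs without any error, but the row with missing age is coded young = 1;
data patients2;
    set patients;
    if age < 40 then young = 1;
    else young = 0;
run;

* List each combination of the original and new values, including missing;
proc freq data=patients2;
    tables age*young / list missing;
run;
```

The PROC FREQ listing shows a row where `age` is missing and `young` is 1, which the log would never flag.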


READ THE LOG WINDOW!!

I know that I spout this all of the time, and that is because too many people begin skipping this step and then can’t figure out why their program isn’t working.

If you have an ERROR message, look at that line as well as a few of the lines above it.

Don’t ignore Warnings and Notes in the log simply because your program seems to have run. They could indicate a serious error that just did not happen to be syntactically incorrect. In this case, check your logic or add some Proc Prints to understand what is going on inside your program.


Debugging: The Basics

The better you can read and understand your program, the easier it is to find the problem(s).

Put only one SAS statement on a line.
Use indentation to show the different parts of the program within DATA and PROC steps.
Use comment statements GENEROUSLY to document your code.


Know your colors

Make sure that you are using the enhanced editor and know what code is generally what color (e.g., comments are green).


Scroll Up

Remember that your output and log windows are scrolled to the very bottom of the screen. Scroll ALL the way up and check the whole thing.

Look for common mistakes first (semicolons and spelling errors!).

Make sure you haven’t typed an ‘O’ where you want a ‘0’ or vice versa. This can cause SAS to think that your numeric or character variable should be changed to the other variable type. SAS may do this automatically when you don’t want it done!


What is wrong here?

*Read the data file ToadJump.dat using a list input
Data toads;
    Infile 'c:\MyRawData\ToadJump.dat';
    Input ToadName$ Weight Jump1 Jump2 Jump3;
Run;


Here is the log window…
___________________________________________________________
1    *Read the data file ToadJump.dat using the list input
2    Data toads;
3    Infile 'c:\MyRawData\ToadJump.dat';
     ------
     180
ERROR 180-322: Statement is not valid or it is used out of proper order.
4    Input ToadName$ Weight Jump1 Jump2 Jump3;
     -------
     180
ERROR 180-322: Statement is not valid or it is used out of proper order.
5    Run;
___________________________________________________________
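One reading of this log: the comment on the first line has no closing semicolon, so SAS treats everything up to the next semicolon, including `Data toads`, as part of the comment. The INFILE and INPUT statements then appear outside any DATA step, which is why each one is flagged as "not valid or used out of proper order". A sketch of the corrected program, assuming the raw data file exists at the path shown on the slide:

```sas
* Read the data file ToadJump.dat using list input;
data toads;
    infile 'c:\MyRawData\ToadJump.dat';
    input ToadName $ Weight Jump1 Jump2 Jump3;
run;
```

In the enhanced editor, the missing semicolon is also visible by color: the `Data toads;` line shows up in the comment color instead of as a statement.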


SAS is still running…

You need to check the message above the menu on the Log window. If it says, as in this example, "DATA STEP running", then steps must be taken to stop the program from running.

Even though SAS will continue to process other programs, results of such programs may be inaccurate, without any indication of syntax problems showing up in the log.


SAS is still running… (cont.)

Several suggestions to stop the program are:
Submit the following line: '; run;
Submit the following line: *))%*'''))*/;
If all else fails, exit SAS entirely (making sure that the revised program has been saved) and restart it.


TPA Data Practice

Go to the website and download the TPA sample data set. Save it in a place that you can successfully write the Libname to point to!

Either find a SAS program that you can change to fit the current problem or begin writing the code with a blank Editor page.

The Goal: See how much of Tables 1 and 2 from the published paper you can reproduce with your sample.

We will practice Proc Boxplot together.
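As a warm-up for that, here is a minimal PROC BOXPLOT sketch. It does not use the actual TPA sample: the data set `sample`, the variables `group` and `score`, and all values are made up, so substitute the real variable names once you have the data open:

```sas
* Hypothetical stand-in data: one outcome score per patient by treatment group;
data sample;
    input group $ score;
    datalines;
TPA 4
TPA 7
TPA 5
TPA 9
Placebo 8
Placebo 11
Placebo 10
Placebo 6
;
run;

* PROC BOXPLOT expects the data sorted by the group variable;
proc sort data=sample;
    by group;
run;

* One box per group, with score on the vertical axis;
proc boxplot data=sample;
    plot score*group;
run;
```

The PLOT statement takes the form analysis-variable*group-variable, so each level of `group` gets its own box.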