IOWA STATE UNIVERSITY Department of Animal Science Getting Your Data Into SAS (Chapter 2 in the Little SAS Book) Animal Science 500 Lecture No. 3 September 7, 2010




Arithmetic Operators

Symbol   Operation        Example   Result
+        addition         5 + 3     add two numbers together
-        subtraction      5 - 3     subtract 3 from 5
*        multiplication   2*y       multiply 2 by the value of Y
/        division         var/5     divide the value of VAR by 5
**       exponentiation   a**2      raise A to the second power

Notes:
- Subtraction and division also work with two variables, for example ending weight - beginning weight, or weight gain / days on test.
- The * is always required for multiplication; you cannot use 2(y) or 2y.
- Use ** for exponentiation. In the DATA step the ^ symbol is one of the NOT symbols, not an exponentiation operator, so a^2 should not be used.
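As a small illustration of the operators in the table, the DATA step below computes an average daily gain. This is only a sketch: the data set name, the variable names (begin_wt, end_wt, days_on_test), and the values are made up for the example and are not from the course data.

```sas
/* Hypothetical values, for illustration only */
data arith_demo;
    begin_wt     = 50;     /* beginning weight */
    end_wt       = 260;    /* ending weight */
    days_on_test = 120;

    total   = begin_wt + end_wt;      /* addition */
    gain    = end_wt - begin_wt;      /* subtraction */
    doubled = 2 * gain;               /* multiplication: the * is required */
    adg     = gain / days_on_test;    /* division */
    gain_sq = gain ** 2;              /* exponentiation */
run;

proc print data=arith_demo;
run;
```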


Comparison Operators

- Comparison operators set up a comparison, operation, or calculation with two variables, constants, or expressions within the data set being used.
  - If the comparison is true, the result is 1.
  - If the comparison is false, the result is 0.
- Comparison operators can be expressed as symbols or with their mnemonic equivalents, which are shown in the following table:


Comparison Operators

Symbol   Mnemonic Equivalent   Definition                  Example
=        EQ                    equal to                    a=3
^=       NE                    not equal to                a ne 3
¬=       NE                    not equal to
~=       NE                    not equal to
>        GT                    greater than                num>5
<        LT                    less than                   num<8
>=       GE                    greater than or equal to    sales>=300
<=       LE                    less than or equal to       sales<=100
         IN                    equal to one of a list      num in (3, 4, 5)
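The 1/0 results described for comparison operators can be seen by storing each comparison in a variable. The sketch below uses made-up values for num and sales; the result variable names are hypothetical.

```sas
data compare_demo;
    num   = 4;
    sales = 250;

    is_gt5    = (num > 5);            /* false, so 0 */
    is_lt8    = (num lt 8);           /* true, so 1 (LT is the mnemonic for <) */
    not_three = (num ne 3);           /* true, so 1 */
    big_sale  = (sales >= 300);       /* false, so 0 */
    in_list   = (num in (3, 4, 5));   /* true, so 1 */
run;

proc print data=compare_demo;
run;
```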


Logical (Boolean) Operators and Expressions

Symbol   Mnemonic Equivalent   Example
&        AND                   (a>b & c>d)
|        OR                    (a>b or c>d)
!        OR
¦        OR
¬        NOT                   not(a>b)
^        NOT
~        NOT

Logical operators, also called Boolean operators, are usually used in expressions to link sequences of comparisons.
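A minimal sketch of linking comparisons with logical operators follows. The variables a, b, c, and d match the examples in the table; their values and the other variable names are assumptions made for the illustration.

```sas
data logic_demo;
    a = 5; b = 3; c = 1; d = 2;

    both   = (a > b & c > d);     /* AND: 1 only if both comparisons are true; here 0 */
    either = (a > b or c > d);    /* OR: 1 if at least one comparison is true; here 1 */
    neg    = not (a > b);         /* NOT: reverses the result; here 0 */

    /* Logical expressions are commonly used in IF statements */
    length size $ 5;
    if a > b and d > c then size = 'large';
    else size = 'small';
run;

proc print data=logic_demo;
run;
```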


Finding your data

Most of the time your “raw” data files will be saved as external files:

1. Text files – Word, WordPerfect, Writer, etc.

2. Spreadsheets - Excel, Lotus, Quattro Pro, etc.

3. Other systems – Unix, Open VMS, etc.


Reading external files into SAS

The files containing your data will typically be stored

1. On the hard drive of the computer that you will ultimately use to analyze the data with SAS

2. Externally, for example on a
   - USB memory stick (flash memory)
   - External hard drive

You must get your data from “storage” into SAS to conduct the analyses.


Reading external files into SAS

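- Use the INFILE statement within a DATA step:

    Data mytrial;
    Infile 'c:\mydocument\trial.xls';
    Input statement (Input followed by the variable names)

- Remember to put the $ after the name of each character variable.
- You may have to tell SAS in which columns individual variables are found and where to place the decimal point.
- INFILE reads raw text, so spreadsheet data generally has to be saved as a text file (for example, .csv) before it can be read this way.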
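The points about the $ sign, column positions, and decimal placement can be sketched with column input. To keep the example self-contained it reads in-stream DATALINES instead of an external file; the variable names and values are hypothetical.

```sas
data mytrial;
    infile datalines;
    /* $ marks character variables; 1-4 and 6 are column positions;   */
    /* .1 places one implied decimal when the raw value has no period */
    input pig_id $ 1-4
          sex    $ 6
          on_wt    8-11  .1
          off_wt   13-16 .1;
    datalines;
P001 B 0552 2605
P002 G 0538 2571
;
run;

proc print data=mytrial;
run;
```

With this layout the raw value 0552 is read as 55.2 and 2605 as 260.5.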


Reading external files into SAS

    Data mytrial;
    Infile 'c:\mydocument\trial.xls' DLM=',';

- Many options are available to assist you when using the INFILE statement.
- DLM= specifies the delimiter that separates the variables in your raw data file. For example, dlm=',' indicates that a comma is the delimiter (e.g., a comma-separated file, .csv file).
- Or, dlm='09'x indicates that tabs are used to separate your variables (e.g., a tab-separated file).
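A short sketch of DLM= with comma-separated values is shown below. It uses in-stream DATALINES so that it runs without an external file; the variable names and values are made up for the illustration.

```sas
data comma_demo;
    infile datalines dlm=',';     /* a comma separates the values */
    input pig_id $ sex $ weight;
    datalines;
P001,B,260.5
P002,G,257.1
;
run;

proc print data=comma_demo;
run;

/* For a tab-separated file only the delimiter changes, e.g.        */
/*   infile 'your-file.txt' dlm='09'x;   (file name is a placeholder) */
```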


Reading external files into SAS

- Other options
  - DSD: the DSD option has 2 functions.
  - First, it recognizes two consecutive delimiters as a missing value.
  - For example, if your file contained the line 20,30,,50 then without DSD SAS will treat this as 20 30 50, but with the DSD option SAS will treat it as 20 30 . 50, which is probably what you intended.


Reading external files into SAS

- Other options
  - Second, the DSD option allows you to include the delimiter within quoted strings. For example, you would want to use the DSD option if you had a comma-separated file and your data included values like "George Bush, Jr.". With the DSD option, SAS will recognize that the comma in "George Bush, Jr." is part of the name, and not a separator indicating a new variable.
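Both DSD behaviors can be tried on a single line of data. This is a sketch using in-stream DATALINES; the variable names (name, v1 to v4) are hypothetical.

```sas
data dsd_demo;
    infile datalines dlm=',' dsd;
    length name $ 20;             /* room for the full quoted name */
    input name $ v1 v2 v3 v4;
    datalines;
"George Bush, Jr.",20,30,,50
;
run;

proc print data=dsd_demo;
run;
```

With DSD, the quoted comma stays inside name, and the two consecutive commas leave v3 missing while v4 receives 50.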


Reading external files into SAS

- Other options
  - FIRSTOBS= tells SAS on what line you want it to start reading your raw data file. (Default = 1)

If the first record(s) contain header information such as variable names, then set firstobs=n, where n is the record number where the data actually begin.

Example: Assume you are reading a comma-separated file or a tab-separated file where the variable names are on the first line.

Use firstobs=2 to tell SAS to begin reading at the second line (this skips the first line with the names of the variables).


Reading external files into SAS

- Other options
  - MISSOVER: this option prevents SAS from going to a new input line if it does not find values for all of the variables in the current line of data.

For example, you may be reading a space-delimited file that is supposed to have 10 values per line, but one of the lines has only 9 values.

Without the MISSOVER option, SAS will look for the 10th value on the next line of data.

With MISSOVER, SAS sets all remaining variables to missing when it reads a short line.


Reading external files into SAS

- Other options
  - MISSOVER

If your data are supposed to have only one observation for each line of raw data, then reading a value from the next line could cause errors throughout the rest of your data file. If you have a raw data file that has one record per line, this option is a prudent way to keep such errors from cascading through the rest of your data file.
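The effect of MISSOVER on a short line can be sketched with three values per line instead of ten. The data set and variable names are hypothetical.

```sas
data short_line;
    infile datalines missover;    /* do not move to the next line for missing values */
    input a b c;
    datalines;
1 2 3
4 5
7 8 9
;
run;

proc print data=short_line;
run;
```

With MISSOVER the second observation has c set to missing and the third line is read as its own observation. Without it, SAS would take the 7 from the third line as the value of c, and the rest of that line would be lost.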


Reading external files into SAS

- Other options
  - OBS=

Indicates which line in your raw data file should be treated as the last record to be read by SAS.

This is a good option to use for testing your program. For example, you might use obs=100 to just read in the first 100 lines of data while you are testing your program.


Reading external files into SAS

- Other options
- A typical INFILE statement for reading a comma-delimited file that contains the variable names in the first line of data would be:

    INFILE "test.txt" DLM=',' DSD MISSOVER FIRSTOBS=2;
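To try these options together without needing test.txt on disk, the sketch below first writes a small comma-delimited file to a temporary location and then reads it back. The fileref RAWDEMO, the variable names, and the values are assumptions made for the illustration; OBS=3 is added only to show how it limits reading while testing.

```sas
/* Write a small test file; RAWDEMO is a hypothetical temporary fileref */
filename rawdemo temp;

data _null_;
    file rawdemo;
    put 'pig_id,sex,weight';      /* line 1: variable names */
    put 'P001,B,260.5';
    put 'P002,G,';                /* weight is missing on this line */
    put 'P003,B,249.8';
run;

data test;
    /* FIRSTOBS=2 skips the header; OBS=3 stops after line 3 of the file */
    infile rawdemo dlm=',' dsd missover firstobs=2 obs=3;
    input pig_id $ sex $ weight;
run;

proc print data=test;
run;
```

Only lines 2 and 3 of the file are read here, so the data set holds two observations, the second with a missing weight.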


Reading external files into SAS

- Other options
  - LRECL= logical record length

LRECL= is really useful for Windows users.

By default, SAS on Windows assumes a logical record length of 256 (the default can differ between SAS releases).

If your data lines are longer than that, it may appear that SAS is not reading all of your data, or that beyond some point the variables are not being read.


Reading external files into SAS

- Other options
  - LRECL= logical record length

You can tell SAS exactly how long to make the record length on the FILENAME statement.

The option is lrecl= (logical record length) and it looks like this:

    filename myFile "c:\some directory\some file.txt" LRECL=400;

- This option is REQUIRED if the length of a data line is over 256.
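A rough sketch of LRECL= in use follows. It writes one line longer than 256 characters to a temporary file and reads it back; the fileref LONGREC and the variable names are hypothetical, and the temporary file stands in for a real path such as the one on the slide.

```sas
/* LRECL=400 applies to both writing and reading through this fileref */
filename longrec temp lrecl=400;

data _null_;
    file longrec;
    length filler $ 300;
    filler = repeat('x', 299);        /* 300 characters of x */
    put filler $char300. ' 42';       /* the number sits past column 300 */
run;

data wide;
    infile longrec truncover;
    input filler $char300. value;     /* value is read from beyond column 256 */
run;

proc print data=wide;
    var value;
run;
```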


Knowing what Options are Available

- Obviously you can look options up using:
  - SAS on-line help
  - SAS manuals and books
  - Other example programs

You can also determine what options are available using PROC OPTIONS:

    Proc Options;
    Run;
    Quit;

This lists the SAS system options and their current settings in the SAS log.
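As a brief sketch, PROC OPTIONS can list everything or be narrowed to one option. LINESIZE is used here only as an example of an option name.

```sas
/* List every system option with its current setting (written to the log) */
proc options;
run;

/* Show a single option, here LINESIZE */
proc options option=linesize;
run;
```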


Informats

- A host of selected informats is listed on pages 46-47 in The Little SAS Book, 4th Edition.
  - Different ways data can be formatted and read in SAS
  - Dates, times, and combined date-time values
  - Reading Julian dates
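A minimal sketch of reading dates, a Julian date, and a time with informats is shown below. The variable names and the data line are made up for the illustration; the informats used (MMDDYY10., DATE9., JULIAN7., TIME8.) are standard SAS informats.

```sas
data informat_demo;
    /* The colon lets each informat be used with space-separated values */
    input weigh_date :mmddyy10.    /* 09/07/2010 */
          ship_date  :date9.       /* 07SEP2010 */
          jul_date   :julian7.     /* 2010250 = day 250 of 2010 */
          weigh_time :time8.;      /* 08:30:00 */
    format weigh_date ship_date jul_date mmddyy10. weigh_time time8.;
    datalines;
09/07/2010 07SEP2010 2010250 08:30:00
;
run;

proc print data=informat_demo;
run;
```

All three date values in this line refer to the same day, September 7, 2010, so the printed dates should match.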


Titles and Footnotes

- SAS allows up to 10 lines of text at the top (titles) and the bottom (footnotes) of each page of output using the TITLE and FOOTNOTE statements.
  - TITLE<n> 'text';
  - FOOTNOTE<n> 'text';
  - Where n is the line number, which can range from 1 to 10 for each.
  - If the text is omitted, the title or footnote is deleted.
  - Otherwise it remains in effect until it is redefined.


Titles and Footnotes

- To have no titles you can include title;
- By default, SAS includes the date and page number at the top of each page of output.
- To get rid of these:
  - Specify nodate and/or nonumber in the OPTIONS statement.
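The TITLE, FOOTNOTE, and OPTIONS statements can be sketched as follows. The title and footnote text is made up, and SASHELP.CLASS is used only because it is a sample data set that ships with most SAS installations.

```sas
options nodate nonumber;            /* drop the date and page number */

title1 'Animal Science 500';        /* hypothetical title text */
title2 'Listing of SASHELP.CLASS';
footnote1 'Example output only';

proc print data=sashelp.class;
run;

title;       /* no text: cancels all titles */
footnote;    /* no text: cancels all footnotes */
```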


Temporary versus Permanent SAS Data Sets

- Temporary SAS data set
  - Only exists during the current job or session
  - It is erased by SAS when you finish and close down SAS
- Permanent SAS data set
  - Does not mean it is around forever
  - It remains stored even after you close your SAS session.
- If you use a data set more than once, it is more efficient to save it as a permanent SAS data set.


Temporary versus Permanent SAS Data Sets

- Using a permanent SAS data set allows you to skip the step of reading the raw data, whether you would otherwise use the Import Wizard or an INFILE statement.
- If you are going to modify your data set, it is likely easier to use a temporary SAS data set, for example when you
  - Need to add more data to the “final” data set
  - Have not checked the “final” data set for errors
  - Have other reasons
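One common way to create a permanent data set is a LIBNAME statement with a two-level data set name. The sketch below assumes a library reference MYLIB and a folder c:\mysaslib, both placeholders; the folder must already exist for the code to run.

```sas
/* MYLIB and the folder are placeholders; point LIBNAME at a folder that exists */
libname mylib 'c:\mysaslib';

data work.trial;          /* temporary: WORK is erased when the session ends */
    input pig_id $ weight;
    datalines;
P001 260.5
P002 257.1
;
run;

data mylib.trial;         /* permanent: stored as a file in the MYLIB folder */
    set work.trial;
run;
```

In a later session, the same LIBNAME statement makes mylib.trial available again without rereading the raw data.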


Listing the Contents of a SAS Data Set

- Proc Contents
  - Use Proc Contents data=yourdatasetname;
  - If you leave off the data= option, SAS will run Proc Contents on the last data set created.
  - It is a good way to check whether all of your data are being read correctly into SAS for further analyses.


Listing the Contents of a SAS Data Set

Output from Proc Contents:

1. Data Set Name – be sure you evaluated the correct data set

2. Observations – was the correct number of observations read in

3. Variables – was the correct number of variables identified

4. Created – date the data set was created

5. Label – a label you might have provided


Listing the Contents of a SAS Data Set

Output from Proc Contents: listing of variables in alphabetical order

The following output is created for each variable

1. Type – numeric or character

2. Length – storage size (in bytes)

3. Format – used for printing, if any (for example, a date may be displayed with WORDDATE)

4. Informat – used for input, if any (for example, MMDDYY10. for a date)

5. Label – variable label (e.g., date of birth, height in inches, weight in pounds)
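The sketch below builds a small data set with a format, an informat, and labels so that each of these attributes has something to show in the PROC CONTENTS output. The data set name, variable names, and values are hypothetical.

```sas
data work.growth;
    informat birth_date mmddyy10.;     /* informat stored with the variable */
    format   birth_date worddate.;     /* format used for printing */
    label    birth_date = 'Date of birth'
             weight     = 'Weight in pounds';
    input pig_id $ birth_date weight;
    datalines;
P001 03/15/2010 260.5
P002 03/17/2010 257.1
;
run;

/* Naming the data set explicitly avoids relying on the last one created */
proc contents data=work.growth;
run;
```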


Processing an Existing Data Set

- When you want to process an existing SAS data set
  - Use the SET statement rather than an INFILE statement
- Each time SAS encounters a SET statement, it reads one observation, including all of its variables, from the existing data set.


Processing an Existing Data Set

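    Data data1;
    Set data2;
    average_daily_gain = (offweight - onweight) / daysontest;
    Run;
    Quit;

Variable names cannot contain spaces, so the average daily gain is stored in average_daily_gain.

Again, if you do not specify the data set on which to perform the operations, the most recently created data set will be used.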
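A runnable version of this step needs data2 to exist first. The sketch below creates it from a few made-up records (the values and pig_id are assumptions) and then computes the average daily gain with SET.

```sas
data data2;   /* hypothetical test records */
    input pig_id $ onweight offweight daysontest;
    datalines;
P001 55 260 120
P002 53 257 118
;
run;

data data1;
    set data2;   /* reads one observation from data2 per iteration */
    average_daily_gain = (offweight - onweight) / daysontest;
run;

proc print data=data1;   /* DATA= names the data set explicitly */
run;
```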


Arithmetic Operators

Arithmetic operators indicate that an arithmetic calculation is performed, as shown in the following table: