Intro to SAS®


Objectives

List the components of a SAS program.
Open an existing SAS program and run it.


SAS Programs

A SAS program is a sequence of steps that the user submits for execution.

DATA steps are typically used to create SAS data sets.

PROC steps are typically used to process SAS data sets (that is, generate reports and graphs, edit data, and sort data).

[Diagram: Raw Data -> DATA Step -> SAS Data Set -> PROC Step -> Report]


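Step Boundaries

A step ends at a RUN statement or where the next DATA or PROC statement begins. Here the PROC PRINT step has no RUN statement, so it ends when the PROC MEANS step begins.

data work.staff;
   infile 'raw-data-file';
   input LastName $ 1-20 FirstName $ 21-30
         JobTitle $ 36-43 Salary 54-59;
run;

proc print data=work.staff;

proc means data=work.staff;
   class JobTitle;
   var Salary;
run;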


SAS Windowing Environment

Interactive windows enable you to interface with SAS.


Exercises

Open the SAS program "example.sas".
Submit the program and examine the results.

Data for today's class is located at http://www.missouri.edu/~baconr/sas/econ


Objectives

Learn the two fundamental SAS syntax rules.
Learn to create a SAS data set from a text file.
Write a DATA step to read a course data file.


Fundamental SAS Syntax Rules

SAS statements have these characteristics:
- usually begin with an identifying keyword
- always end with a semicolon

data staff;
   input LastName $ FirstName $ JobTitle $ Salary;
   datalines;
…insert text here…
run;

proc print data=staff;
run;


Reading Raw Data Files

Data for flights from New York to Dallas (DFW) and Los Angeles (LAX) is stored in a raw data file. Create a SAS data set from the raw data.

         1    1    2
1---5----0----5----0
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170
98212/12/00dfw  5 85
43912/13/00LAX 14196
98212/13/00DFW 15116
43112/14/00LaX 17166
98212/14/00DFW  7 88
11412/15/00LAX   187
98212/15/00DFW 14 31

Description               Columns
Flight Number             1-3
Date                      4-11
Destination               12-14
First Class Passengers    15-17
Economy Passengers        18-20


Creating a SAS Data Set

In order to create a SAS data set from a raw data file, you must do the following:
1. Start a DATA step and name the SAS data set being created (DATA statement).
2. Identify the location of the raw data file to read (INFILE statement).
3. Describe how to read the data fields from the raw data file (INPUT statement).

data SAS-data-set-name;
   infile 'raw-data-filename';
   input input-specifications;
run;

Raw Data File

         1    1    2
1---5----0----5----0
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170

SAS Data Set (created by the DATA step)

Flight   Date       Dest   First Class   Economy
439      12/11/00   LAX    20            137
921      12/11/00   DFW    20            131
114      12/12/00   LAX    15            170


Creating a SAS Data Set

General form of the DATA statement:

   DATA libref.SAS-data-set(s);

Example: This DATA statement creates a temporary SAS data set named dfwlax:

   data work.dfwlax;

Example: This DATA statement creates a permanent SAS data set named dfwlax:

   libname ia 'SAS-data-library';
   data ia.dfwlax;


Pointing to a Raw Data File

General form of the INFILE statement:

   INFILE 'filename' <options>;

Examples:

   z/OS (OS/390)   infile 'userid.prog1.dfwlax';
   UNIX            infile '/users/userid/dfwlax.dat';
   Windows         infile 'c:\workshop\winsas\prog1\dfwlax.dat';

The PAD option in the INFILE statement is useful for reading variable-length records typically found in Windows and UNIX environments.


Reading Data Fields

General form of the INPUT statement:

   INPUT input-specifications;

input-specifications
- names the SAS variables
- identifies the variables as character or numeric
- specifies the locations of the fields in the raw data
- can be specified as column, formatted, list, or named input.


Reading Data Using Column Input

Column input is appropriate for reading the following:
- data in fixed columns
- standard character and numeric data

General form of a column INPUT statement:

   INPUT variable <$> startcol-endcol . . . ;

The term standard data refers to character and numeric data that SAS recognizes automatically.

Examples of standard numeric data:

   15   -15   15.4   +1.23   1.23E3   -1.23E-3


Reading Data Using Column Input

         1    1    2
1---5----0----5----0
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170

input Flight $ 1-3 Date $ 4-11 Dest $ 12-14
      FirstClass 15-17 Economy 18-20;


Creating Temporary SAS Data Sets

Store the dfwlax data set in the work library.

data work.dfwlax;
   infile 'raw-data-file';
   input Flight $ 1-3 Date $ 4-11 Dest $ 12-14
         FirstClass 15-17 Economy 18-20;
run;

NOTE: The data set WORK.DFWLAX has 10 observations and 5 variables.
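Because 'raw-data-file' is only a placeholder, the step can be tried without an external file. The following is a minimal sketch, assuming the ten flight records are supplied in-stream with a DATALINES statement instead of an INFILE statement; the column positions are the ones given in the layout table.

```sas
/* Sketch: same column input as above, but reading in-stream data lines
   (an assumption made here so the step runs without an external file). */
data work.dfwlax;
   input Flight $ 1-3 Date $ 4-11 Dest $ 12-14
         FirstClass 15-17 Economy 18-20;
   datalines;
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170
98212/12/00dfw  5 85
43912/13/00LAX 14196
98212/13/00DFW 15116
43112/14/00LaX 17166
98212/14/00DFW  7 88
11412/15/00LAX   187
98212/15/00DFW 14 31
;
run;

/* List the observations to check the result. */
proc print data=work.dfwlax;
run;
```

Columns 15-17 are blank in the ninth record, so FirstClass should be read as a missing value for that observation.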


Assignment Statements: Creating Variables

data work.dfwlax;
   infile 'raw-data-file';
   input Flight $ 1-3 Date $ 4-11 Dest $ 12-14
         FirstClass 15-17 Economy 18-20;
   Total=FirstClass+Economy;
   LogEconomy=LOG(Economy);
run;

NOTE: The data set WORK.DFWLAX has 10 observations and 7 variables.
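One detail worth checking with assignment statements is how missing values behave. The sketch below is an illustration only: it uses the values from the ninth flight record, and the variable TotalSum and the use of the SUM function are additions that do not appear in the slides.

```sas
/* Sketch: arithmetic with a missing operand (values from the ninth record). */
data _null_;
   FirstClass = .;                          /* missing first-class count       */
   Economy    = 187;
   Total      = FirstClass + Economy;       /* + propagates the missing value  */
   TotalSum   = sum(FirstClass, Economy);   /* SUM ignores missing arguments   */
   LogEconomy = log(Economy);               /* LOG returns the natural log     */
   put Total= TotalSum= LogEconomy=;        /* write the results to the log    */
run;
```

Total is missing here while TotalSum is 187, so which form is appropriate depends on how missing passenger counts should be treated.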


Reading Data Using Formatted Input

Formatted input is appropriate for reading the following:
- data in fixed columns
- standard and nonstandard character and numeric data
- calendar values to be converted to SAS date values


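Working with Date Values

Date values that are stored as SAS dates are special numeric values.

A SAS date value is interpreted as the number of days between January 1, 1960, and a specific date.

An informat converts a calendar date to a SAS date value; a format displays a SAS date value as a calendar date.

Calendar date (read with an informat)   SAS date value   Displayed (with a format)
01JAN1959                               -365             01/01/1959
01JAN1960                                  0             01/01/1960
01JAN1961                                366             01/01/1961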
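The three values above can be checked directly. This is a small sketch using SAS date literals ('ddmmmyyyy'd) and the MMDDYY10. format; the variable names d1 through d3 are arbitrary.

```sas
/* Sketch: show the numeric value behind three calendar dates. */
data _null_;
   d1 = '01JAN1959'd;
   d2 = '01JAN1960'd;
   d3 = '01JAN1961'd;
   put d1= d2= d3=;                                  /* raw SAS date values  */
   put d1= mmddyy10. d2= mmddyy10. d3= mmddyy10.;    /* same values, formatted */
run;
```

The first PUT statement writes d1=-365 d2=0 d3=366 to the log (1960 is a leap year, hence 366); the second writes the same values as 01/01/1959, 01/01/1960, and 01/01/1961.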


Reading Data Using Formatted Input

General form of the INPUT statement with formatted input:

   INPUT pointer-control variable informat . . . ;

Formatted input is used to read data values by doing the following:
- moving the input pointer to the starting position of the field
- specifying a variable name
- specifying an informat


Reading Data Using Formatted Input

Pointer controls:

   @n   moves the pointer to column n.
   +n   moves the pointer n positions.

An informat specifies the following:
- the width of the input field
- how to read the data values that are stored in the field
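The two pointer controls can be combined in one INPUT statement. The sketch below is illustrative: the data set name work.routes is hypothetical, and the two in-stream records are taken from the flight data used throughout.

```sas
/* Sketch: @n for absolute positioning, +n for relative positioning. */
data work.routes;
   input @1 Flight $3.   /* read columns 1-3; the pointer is then at column 4 */
         +8 Dest $3.;    /* skip the 8-column date field, read columns 12-14  */
   datalines;
43912/11/00LAX 20137
92112/11/00DFW 20131
;
run;
```

After Flight is read, the pointer sits at column 4, so +8 moves it to column 12, where the destination begins.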


What Is a SAS Informat?

An informat is an instruction that SAS uses to read data values.

SAS informats have the following form:

   <$>informat-namew.<d>

   $               indicates a character informat
   informat-name   informat name
   w               total width of the field to read
   .               required delimiter
   d               number of decimal places


Selected Informats

w. standard numeric informat

Raw Data Value   Informat   SAS Data Value
1234567          8.0        1234567
1234.567         8.0        1234.567


Selected Informats

COMMAw. reads numeric data and removes selected nonnumeric characters such as dollar signs and commas.

Raw Data Value   Informat   SAS Data Value
$12,567          COMMA7.0   12567

MMDDYYw. reads dates of the form mm/dd/yy or mm/dd/yyyy.

Raw Data Value   Informat   SAS Data Value
10/29/01         MMDDYY8.   15277
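The numeric informats can be tried on single values without a raw data file. This sketch uses the INPUT function, which is not covered in the slides, to apply an informat to a character string; the variable names n1 through n3 are arbitrary.

```sas
/* Sketch: apply the w. and COMMAw. informats to the sample values. */
data _null_;
   n1 = input('1234567',  8.);        /* standard numeric, no decimal point   */
   n2 = input('1234.567', 8.);        /* decimal point in the data is kept    */
   n3 = input('$12,567',  comma7.);   /* dollar sign and comma are removed    */
   put n1= n2= n3=;
run;
```

The log line should read n1=1234567 n2=1234.567 n3=12567, matching the SAS Data Value column.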


Selected Informats

$w. standard character informat (removes leading blanks)

Raw Data Value   Informat   SAS Data Value
JAMES            $8.        JAMES


Converting Dates to SAS Date Values

SAS uses date informats to read and convert dates to SAS date values.

Examples:

Raw Data Value   Informat     Converted Value
10/29/2001       MMDDYY10.    15277
10/29/01         MMDDYY8.     15277
29OCT2001        DATE9.       15277
29/10/2001       DDMMYY10.    15277

The converted value, 15277, is the number of days between 01JAN1960 and 29OCT2001.
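The four conversions can be reproduced in a short step. This is a sketch that again relies on the INPUT function (an addition, not part of the slides) and on the default YEARCUTOFF= setting, under which the two-digit year 01 is read as 2001.

```sas
/* Sketch: four ways of writing 29 October 2001, each read with its informat. */
data _null_;
   d1 = input('10/29/2001', mmddyy10.);
   d2 = input('10/29/01',   mmddyy8.);    /* two-digit year: depends on YEARCUTOFF= */
   d3 = input('29OCT2001',  date9.);
   d4 = input('29/10/2001', ddmmyy10.);
   put d1= d2= d3= d4=;                   /* each should be 15277 */
run;
```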


Reading Data: Formatted Input

         1    1    2
1---5----0----5----0
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170

input @1 Flight $3. @4 Date mmddyy8.
      @12 Dest $3. @15 FirstClass 3.
      @18 Economy 3.;


Reading Data: Formatted Input

Raw Data File

         1    1    2
1---5----0----5----0
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170

data work.dfwlax;
   infile 'raw-data-file';
   input @1 Flight $3. @4 Date mmddyy8.
         @12 Dest $3. @15 FirstClass 3.
         @18 Economy 3.;
run;


Reading Data: Formatted Input

proc print data=work.dfwlax;
run;

                 The SAS System

                                 First
Obs   Flight    Date     Dest    Class    Economy
  1   439       14955    LAX        20        137
  2   921       14955    DFW        20        131
  3   114       14956    LAX        15        170
  4   982       14956    dfw         5         85
  5   439       14957    LAX        14        196
  6   982       14957    DFW        15        116
  7   431       14958    LaX        17        166
  8   982       14958    DFW         7         88
  9   114       14959    LAX         .        187
 10   982       14959    DFW        14         31

The values in the Date column are SAS date values.
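To display the stored date values as calendar dates, a format can be attached in the PROC step. This is a minimal sketch, assuming work.dfwlax was created by the formatted-input DATA step shown earlier; the FORMAT statement is an addition to the slide's PROC PRINT step.

```sas
/* Sketch: print the same data set, displaying Date as mm/dd/yyyy. */
proc print data=work.dfwlax;
   format Date mmddyy10.;
run;
```

With this format, the value 14955 is displayed as 12/11/2000. The stored value does not change; only its display does.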


List Input with the Default Delimiter

The data is not in fixed columns.
The fields are separated by spaces.
There is one nonstandard field.

50001 4feb1989 132 530
50002 11nov1989 152 540
50003 22oct1991 90 530
50004 4feb1993 172 550
50005 24jun1993 170 510
50006 20dec1994 180 520


Delimiters

Common delimiters are
- blanks
- commas
- tab characters

A space (blank) is the default delimiter.


List Input

General form of the INPUT statement for list input:

   INPUT var-1 $ var-2 . . . var-n;

You must specify the variables in the order that they appear in the raw data file.

Specify a $ after the variable name if it is character. No symbol after the variable name indicates a numeric variable.


Input Data

The second field is a date. How does SAS store date values?

50001 4feb1989 132 530
50002 11nov1989 152 540
50003 22oct1991 90 530
50004 4feb1993 172 550
50005 24jun1993 170 510
50006 20dec1994 180 520


Informats

To read nonstandard data, you must apply an informat.

General form of an informat:

   <$>INFORMAT-NAME<w>.<d>

Informats are instructions that specify how SAS reads raw data.


Specifying an Informat

To specify an informat when using list input, use the colon (:) format modifier in the INPUT statement between the variable name and the informat.

General form of a format modifier in an INPUT statement:

   INPUT variable : informat;


Reading a Delimited Raw Data File

data airplanes;
   infile 'raw-data-file';
   input ID $ InService : date9.
         PassCap CargoCap;
run;
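The following sketch runs the same list-input step on in-stream data so it can be tried without an external file. The DATALINES block (three of the sample records) and the PROC PRINT step with a DATE9. format are additions for illustration.

```sas
/* Sketch: list input with the colon modifier, reading space-delimited records. */
data airplanes;
   input ID $ InService : date9.   /* colon: read up to the next delimiter with DATE9. */
         PassCap CargoCap;
   datalines;
50001 4feb1989 132 530
50002 11nov1989 152 540
50003 22oct1991 90 530
;
run;

/* Display InService as a calendar date instead of a day count. */
proc print data=airplanes;
   format InService date9.;
run;
```

The colon modifier matters here because the date field varies in length (4feb1989 is eight characters, 11nov1989 is nine), so the informat is applied to whatever lies before the next delimiter rather than to a fixed nine columns.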


Non-Default Delimiter

The fields are separated by commas.

50001 , 4feb1989,132, 530
50002, 11nov1989,152, 540
50003, 22oct1991,90, 530
50004, 4feb1993,172, 550
50005, 24jun1993, 170, 510
50006, 20dec1994, 180, 520


Using the DLM= Option

The DLM= option sets a character or characters that SAS recognizes as a delimiter in the raw data file.

General form of the INFILE statement with the DLM= option:

   INFILE 'raw-data-file' DLM='delimiter(s)';

Any character you can type on your keyboard can be a delimiter. You can also use hexadecimal characters.
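As a sketch of the DLM= option in use, the step below reads three of the comma-delimited sample records in-stream. Using INFILE DATALINES in place of an external file is an assumption made so the example is self-contained.

```sas
/* Sketch: list input with a comma as the delimiter. */
data airplanes;
   infile datalines dlm=',';       /* DATALINES stands in for 'raw-data-file' */
   input ID $ InService : date9.
         PassCap CargoCap;
   datalines;
50001 , 4feb1989,132, 530
50002, 11nov1989,152, 540
50003, 22oct1991,90, 530
;
run;

/* For a tab-delimited file, the delimiter can be given as a
   hexadecimal constant, for example dlm='09'x on ASCII systems. */
```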


Multiple Records Per Observation

A raw data file has three records per employee. Record 1 contains the first and last names, record 2 contains the city and state of residence, and record 3 contains the employee's phone number.

Farr, Sue
Anaheim, CA
869-7008
Anderson, Kay B.
Chicago, IL
483-3321
Tennenbaum, Mary Ann
Jefferson, MO
589-9030


Desired Output

The SAS data set should have one observation per employee.

LName        FName      City        State   Phone
Farr         Sue        Anaheim     CA      869-7008
Anderson     Kay B.     Chicago     IL      483-3321
Tennenbaum   Mary Ann   Jefferson   MO      589-9030


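Reading Multiple Records per Observation

data address;
   /* List input gives character variables a default length of 8, which
      would truncate values such as Tennenbaum and Jefferson. The lengths
      below are illustrative choices. */
   length LName FName $ 20 City $ 25 State $ 2 Phone $ 8;
   infile 'raw-data-file' dlm=',';
   input #1 LName $ FName $
         #2 City $ State $
         #3 Phone $;
run;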