Lesson 5 - Topics • Formatting Output • Working with Dates • Reading: LSB:3:8-9; 4:1,5-7; 5:1-4


Lesson 5 - Topics

• Formatting Output

• Working with Dates

• Reading: LSB:3:8-9; 4:1,5-7; 5:1-4


Annotating SAS Output

• TITLE statements - label procedure output

• LABEL statements - label names of variables

• FORMAT statements - label values of variables


Standard Output

The FREQ Procedure

                                   Cumulative  Cumulative
clinic        Frequency   Percent   Frequency     Percent
---------------------------------------------------------
A                    18     18.00          18       18.00
B                    29     29.00          47       47.00
C                    36     36.00          83       83.00
D                    17     17.00         100      100.00

Annotated Output

Number of Patients by Clinic

The FREQ Procedure

Clinical Center

                                     Cumulative  Cumulative
clinic          Frequency   Percent   Frequency     Percent
-----------------------------------------------------------
Birmingham             18     18.00          18       18.00
Chicago                29     29.00          47       47.00
Minneapolis            36     36.00          83       83.00
Pittsburgh             17     17.00         100      100.00

• TITLE: 'Number of Patients by Clinic'

• LABEL for clinic: 'Clinical Center'

• FORMAT for clinic: A, B, C, D are displayed as Birmingham, Chicago, Minneapolis, Pittsburgh


Standard Output

The FREQ Procedure

                                   Cumulative  Cumulative
sebl_6        Frequency   Percent   Frequency     Percent
---------------------------------------------------------
1                    70     70.00          70       70.00
2                    23     23.00          93       93.00
3                     6      6.00          99       99.00
4                     1      1.00         100      100.00

Annotated Output

The FREQ Procedure

Patient Report Headaches

                                   Cumulative  Cumulative
sebl_6        Frequency   Percent   Frequency     Percent
---------------------------------------------------------
None                 70     70.00          70       70.00
Mild                 23     23.00          93       93.00
Moderate              6      6.00          99       99.00
Severe                1      1.00         100      100.00

• LABEL for sebl_6: 'Patient Report Headaches'

• FORMAT for sebl_6: 1, 2, 3, 4 are displayed as None, Mild, Moderate, Severe


TITLE STATEMENTS

PROC FREQ DATA=tdata;
  TABLES clinic group sex educ sebl_1 sebl_6;
  TITLE 'Distribution of Selected Variables';
  TITLE2 'on the TOMHS Dataset';
RUN;

• TITLE statements can go anywhere in the program. It is good practice to put them under the PROC they describe.

• Titles can be changed at any time.

• TITLEn 'text' is the general syntax, where n is the title line number (TITLE alone is the same as TITLE1).
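As a rough sketch of what "changing the titles at any time" looks like in practice, the steps below reuse the tdata dataset from the example above. The second title text ('Drug Treatment Groups') is made up for illustration. A new TITLE statement replaces the first title line and cancels any higher-numbered lines, and a TITLE statement with no text clears all titles.

```sas
PROC FREQ DATA=tdata;
  TABLES clinic;
  TITLE  'Distribution of Selected Variables';
  TITLE2 'on the TOMHS Dataset';
RUN;

PROC FREQ DATA=tdata;
  TABLES group;
  TITLE 'Drug Treatment Groups';   /* replaces TITLE1 and cancels TITLE2 */
RUN;

TITLE;   /* clears all titles for later steps */
```

Titles stay in effect for every later procedure until they are replaced or cleared, which is one reason to keep each TITLE statement next to the PROC it describes.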


Label Statements

LABEL clinic = 'Clinical Center';
LABEL group  = 'Drug Treatment Group';
LABEL educ   = 'Highest Education Attained';
LABEL sebl_1 = 'Patient Report Drowsiness';
LABEL sebl_6 = 'Patient Report Headaches';

LABEL statements can go anywhere in the DATA step or under a procedure (but not in between!).


Format Statements

FORMAT brthdate mmddyy10. ;
FORMAT group groupF. ;
FORMAT fever headache seF. ;
FORMAT clinic $clinicF. ;

Tells SAS to display the values of the variable according to the format.

• Format statements can go anywhere in the datastep or under a procedure

• There are built-in formats (e.g. for dates) and user-defined formats (which need to be defined using PROC FORMAT)

• A format can apply to more than one variable.

• Formats end with a period (.)

• Character formats begin with a $


How to Make User Defined FORMATS

PROC FORMAT;
  VALUE groupF 1 = 'Beta Blocker'
               2 = 'Calcium Channel Blocker'
               3 = 'Diuretic'
               4 = 'Alpha Blocker'
               5 = 'ACE Inhibitor'
               6 = 'Placebo';

  VALUE seF 1 = 'None'
            2 = 'Mild'
            3 = 'Moderate'
            4 = 'Severe';
RUN;

The word after VALUE (groupF, seF) is the name of the format. The format name does NOT have to be the name of a variable on the dataset. It cannot end in a number.


PROC FORMAT;
  VALUE $clinicF 'A' = 'Birmingham'
                 'B' = 'Chicago'
                 'C' = 'Minneapolis'
                 'D' = 'Pittsburgh' ;

Don't confuse the format with the variable(s) to be formatted!

From PROC FORMAT alone, SAS does not know which variables you plan to format with a given format. You need to apply the format to the variable using the FORMAT statement.


LOG FILE

PROC FORMAT;
  VALUE groupF 1 = 'Beta Blocker' 2 = 'Calcium Channel Blocker'
               3 = 'Diuretic' 4 = 'Alpha Blocker'
               5 = 'ACE Inhibitor' 6 = 'Placebo';
NOTE: Format GROUPF has been output.

  VALUE se 1 = 'None' 2 = 'Mild' 3 = 'Moderate' 4 = 'Severe';
NOTE: Format SE has been output.

  VALUE $clinicF 'A' = 'Birmingham' 'B' = 'Chicago'
                 'C' = 'Minneapolis' 'D' = 'Pittsburgh' ;
NOTE: Format $CLINICF has been output.

run;


* Applying the formats ;

PROC FREQ;
  TABLES clinic sebl_6;
  FORMAT clinic $clinicF. sebl_6 seF. ;
RUN;


* Program 7;
PROC FORMAT;
...
DATA tdata ;
  INFILE 'C:\SAS_Files\tomhs.data' ;
  INPUT @  1 ptid    $10.
        @ 12 clinic  $1.
        @ 25 group   1.
        @ 30 sex     1.
        @ 49 educ    1.
        @ 51 eversmk 2.
        @230 alcbl   1.
        @236 sebl_1  1.
        @246 sebl_6  1. ;

  LABEL clinic  = 'Clinical Center';
  LABEL group   = 'Drug Treatment Group';
  LABEL educ    = 'Highest Education Attained';
  LABEL sebl_1  = 'Patient Report Drowsiness';
  LABEL sebl_6  = 'Patient Report Headaches';
  LABEL alcbl   = 'Alcoholic Drinks Per Week';
  LABEL eversmk = 'Ever Smoke Cigarettes';
RUN;

PROC FREQ DATA=tdata;
  TABLES clinic sebl_6;
  FORMAT clinic $clinicF. sebl_6 seF. ;
RUN;


Items to Remember

• Formats need to be defined before you use them (PROC FORMAT).

• Formats are applied by using the FORMAT statement.

• Label and format statements in the datastep apply to all subsequent PROCs

• Label and format statements under a PROC apply only to that PROC
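The difference in scope between the last two items can be sketched with a small self-contained example. The dataset name demo and its three data lines are made up for illustration; the variable, label, and format are the ones used in the slides. The LABEL sits in the DATA step, so it is stored with the dataset, while the FORMAT is placed under only the first PROC FREQ.

```sas
PROC FORMAT;
  VALUE seF 1 = 'None' 2 = 'Mild' 3 = 'Moderate' 4 = 'Severe';
RUN;

DATA demo;                                     /* hypothetical dataset */
  INPUT clinic $ sebl_6;
  LABEL sebl_6 = 'Patient Report Headaches';   /* stored with the dataset */
  DATALINES;
A 1
B 2
C 1
;
RUN;

PROC FREQ DATA=demo;
  TABLES sebl_6;
  FORMAT sebl_6 seF.;    /* applies to this PROC only */
RUN;

PROC FREQ DATA=demo;
  TABLES sebl_6;         /* label still appears, values print as 1 and 2 */
RUN;
```

Moving the FORMAT statement into the DATA step would make both procedures print None and Mild instead of 1 and 2.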


Working With Dates: Dates Come in Many Ways

• 10/18/04

• 18/10/04

• 10/18/2004

• 18OCT2004

• 101804

• October 18, 2004

Need to know how to read in dates and then work with them.
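As an illustration, the sketch below reads the first five forms in the list with built-in date informats (MMDDYYw., DDMMYYw., and DATEw.), then prints them all with one format. The dataset name manyways and the variable names d1 to d5 are made up for this example, and the "October 18, 2004" form is left out of the sketch. Two-digit years are resolved by the YEARCUTOFF system option, so 04 should be read as 2004 under typical settings.

```sas
DATA manyways;                  /* hypothetical dataset and variable names */
  INPUT @1  d1 mmddyy8.         /* 10/18/04   */
        @10 d2 ddmmyy8.         /* 18/10/04   */
        @19 d3 mmddyy10.        /* 10/18/2004 */
        @30 d4 date9.           /* 18OCT2004  */
        @40 d5 mmddyy6. ;       /* 101804     */
  FORMAT d1-d5 date9.;
  DATALINES;
10/18/04 18/10/04 10/18/2004 18OCT2004 101804
;
RUN;

PROC PRINT DATA=manyways;
RUN;
```

If every informat matches its column, all five variables hold the same value and print identically.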


What do you want to do with dates?

• Display them

• Compare two dates: find the number of days between 2 dates

ndays = date2 - date1;    Will this work?

Problem: dates do not subtract well

What if:

  date2 =  03/02/2003
  date1 =  08/02/2002
           ==========
          -05/00/0001


DATA dates;
  INFILE DATALINES;
  INPUT @1 brthdate mmddyy10.; * Use informat;
DATALINES;
03/03/1971
02/14/1956
01/01/1960
;
PROC PRINT; VAR brthdate;
PROC PRINT; VAR brthdate; FORMAT brthdate mmddyy10.;

Output of the first PROC PRINT (no format):

Obs    brthdate
  1        4079
  2       -1417
  3           0      (Jan 1, 1960)

Output of the second PROC PRINT (with FORMAT brthdate mmddyy10.):

Obs      brthdate
  1    03/03/1971
  2    02/14/1956
  3    01/01/1960


When you read in a variable with a date informat

• SAS makes the variable numeric.

• SAS assigns the numeric value relative to January 1, 1960 (the number of days since that date).

This makes it easy to subtract two dates to get the number of days between the dates.

dayselapsed = date2 - date1;
FORMAT date1 date2 mmddyy10.;

Note: Once read in, SAS treats the variable as it does any numeric variable.
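A minimal sketch of this subtraction, using the two dates from the earlier "What if" example read as mm/dd/yyyy (August 2, 2002 and March 2, 2003). The dataset name elapsed is made up for illustration, and the dates are entered as SAS date constants of the form 'ddMONyyyy'd, which are an assumption of this sketch and not something the slides introduce.

```sas
DATA elapsed;                     /* hypothetical dataset name */
  date1 = '02AUG2002'd;           /* date constant: stored as days since Jan 1, 1960 */
  date2 = '02MAR2003'd;
  dayselapsed = date2 - date1;    /* 212 days */
  FORMAT date1 date2 mmddyy10.;   /* display only, stored values are unchanged */
RUN;

PROC PRINT DATA=elapsed;
RUN;
```

Because both values are plain day counts, the difference is a correct number of days, unlike the digit-by-digit subtraction that produced -05/00/0001.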


* Program 8 ;
DATA age;
  INFILE 'C:\SAS_Files\tomhs.data' ;
  INPUT @14 randdate mmddyy10.
        @34 brthdate mmddyy10.
        @74 date12   mmddyy10. ;

  agedays = randdate - brthdate ;
  ageyrs  = (randdate - brthdate)/365.25;
  ageint  = INT( (randdate - brthdate)/365.25);

  * Can also use YRDIF function;
  ageyrsX = yrdif(brthdate,randdate,'Actual');

  yrrand = YEAR(randdate);
RUN;


PROC PRINT DATA=age (obs=10);
  VAR brthdate randdate agedays ageyrs ageyrsX ageint ;
  TITLE 'Printing Dates Without a Date Format';
RUN;

PROC PRINT DATA=age (obs=10);
  VAR brthdate randdate agedays ageyrs ageyrsX ageint ;
  FORMAT brthdate mmddyy10. randdate mmddyy10.;
  TITLE 'Printing Dates With a Date Format';
RUN;


Printing Dates Without a Date Format

Obs    brthdate    randdate    agedays     ageyrs    ageyrsX    ageint
  1       -8589       10175      18764    51.3730    51.3739        51
  2       -6880       10239      17119    46.8693    46.8711        46
  3      -12572       10002      22574    61.8042    61.8055        61
  4       -9592       10175      19767    54.1191    54.1205        54
  5      -12996       10280      23276    63.7262    63.7268        63

The brthdate values are negative because all of the birth dates are before 1960.


Printing Dates With a Date Format

Obs      brthdate      randdate
  1    06/26/1936    11/10/1987
  2    03/01/1941    01/13/1988
  3    07/31/1925    05/21/1987
  4    09/27/1933    11/10/1987
  5    06/02/1924    02/23/1988
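The calculations from Program 8 can be checked by hand for the first observation (born 06/26/1936, randomized 11/10/1987). The sketch below is a self-contained check, not part of the original programs: it uses a DATA _NULL_ step with SAS date constants and writes the results to the log with PUT.

```sas
DATA _NULL_;                         /* no dataset is created */
  brthdate = '26JUN1936'd;           /* observation 1, entered as date constants */
  randdate = '10NOV1987'd;
  agedays  = randdate - brthdate;    /* days between the two dates */
  ageyrs   = agedays / 365.25;       /* age in years, with decimals */
  ageint   = INT(ageyrs);            /* age in whole years */
  yrrand   = YEAR(randdate);         /* calendar year of randomization */
  PUT agedays= ageint= yrrand=;      /* log shows agedays=18764 ageint=51 yrrand=1987 */
RUN;
```

The agedays and ageint values agree with the first row of the unformatted listing (18764 and 51).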


PROC FREQ DATA=age;
  TABLES yrrand ;
  TITLE 'Frequency Distribution of Year Randomized';
RUN;


Frequency Distribution of Year Randomized

The FREQ Procedure

                                   Cumulative  Cumulative
yrrand        Frequency   Percent   Frequency     Percent
---------------------------------------------------------
1986                  9      9.00           9        9.00
1987                 65     65.00          74       74.00
1988                 26     26.00         100      100.00