41
1 Copyright © 2009 i3 | CONFIDENTIAL Macrotize! A Beginner’s Guide to Designing and Writing Macros Stacey D. Phillips, i3 Statprobe, Madison, WI

Summary of Presentation

  • Upload
    tulia

  • View
    15

  • Download
    0

Embed Size (px)

DESCRIPTION

Macrotize! A Beginner’s Guide to Designing and Writing Macros Stacey D. Phillips, i3 Statprobe, Madison, WI. Summary of Presentation. Introduction to Macro Design and Writing Description of Real World Problem using Quality of Life Data (QOL) Overview of 5 Steps to Designing and Writing Macros - PowerPoint PPT Presentation

Citation preview

Page 1: Summary of Presentation

1Copyright © 2009 i3 | CONFIDENTIAL

Macrotize! A Beginner’s Guide to Designing and Writing Macros

Stacey D. Phillips, i3 Statprobe, Madison, WI

Page 2: Summary of Presentation

2Copyright © 2009 i3 | CONFIDENTIAL

Summary of Presentation Introduction to Macro Design and Writing Description of Real World Problem using Quality of

Life Data (QOL) Overview of 5 Steps to Designing and Writing

Macros 5 Steps In-Depth using QOL data Additional example of the 5 Steps using Vital Signs

data Conclusion Questions

Page 3: Summary of Presentation

3Copyright © 2009 i3 | CONFIDENTIAL

Introduction to Designing and Writing Macros

Learning to write macro code is a valuable skill for a SAS programmer though it can be intimidating to beginning programmers.

This presentation and the corresponding paper focuses more on the concepts of macro design/writing rather than specific macro code details.

There are many ways to think about macro design. I will present the five steps I use when developing a new macro.

The paper demonstrates the five steps using a real-world example involving Quality of Life data.

This presentation will include an additional example that uses the five steps.

Page 4: Summary of Presentation

4Copyright © 2009 i3 | CONFIDENTIAL

Why Use Macros?

The macro facility is a tool for extending and customizing SAS and for reducing the amount of text you must enter to do a particular task.

SAS macros are used to reduce the amount of code one has to write by automating similar operations that are likely to be repeated.

In general, it is best to use macro language when you find yourself writing the same or very similar code over and over. Macro code can be more difficult to debug in the short term but it may save you time in the long run.

Page 5: Summary of Presentation

5Copyright © 2009 i3 | CONFIDENTIAL

Some reasons you might consider using macro code…

Macros allow you to make a change in one location of your program so that SAS can cascade the change throughout your program.

Macros allow you to write a section of code once and use it over and over again in the same program or even in different programs.

Macros allow you to make programs data driven letting SAS decide what to do based on actual data values.

Page 6: Summary of Presentation

6Copyright © 2009 i3 | CONFIDENTIAL

Some examples of where macro code might be useful…

You find yourself doing the same statistical calculation in multiple programs so decide to create and store an external macro (macro contained in a completely separate program) in the macro library that can be called whenever these calculations are necessary.– Example: A macro named %count01 is created to count

subjects, calculate percentages and format the data into character variables appropriate for display in any RTF table.

Page 7: Summary of Presentation

7Copyright © 2009 i3 | CONFIDENTIAL

Some examples of where macro code might be useful…

You are producing the similar output (eg. a data set or table or graph, etc.) for different groups of a certain population.

– Example: A macro named %by_state is created to generate an economic profile table for each of the 50 states.

You want to run multiple statistical analyses with only slight modifications to the inputs.

– Example: A macro named %reg is created to run a series of regression models and includes parameters to specify both the independent and dependent variables for each individual model without retyping the whole PROC REG code.

Page 8: Summary of Presentation

8Copyright © 2009 i3 | CONFIDENTIAL

A Real-world Example Using Quality of Life (QOL) Regression Data

A statistician on my project ran many regression models and output each individual statistic into 225 separate SAS datasets.

The datasets needed to be accessed, manipulated and reformatted based on the table shell provided (see Handout reference #1).

The original task required six tables be created though this presentation with demonstrate only three tables as the process is the same whether three or six tables are created.

In the interest of simplicity, I will include only partial sections of the code needed for the various steps. The complete code can be found in Step 5 of the paper and in Handout reference #5.

Page 9: Summary of Presentation

9Copyright © 2009 i3 | CONFIDENTIAL

Some Things to Keep in Mind Paper is designed to be concept-based rather than code-

based. Don’t get too bogged down in the specific code…but do ask

questions if you have them. Code examples are excerpts in the interest of simplicity.

The complete code is in the paper. The ultimate goal of this example is to write a dynamic

macro that will be as data independent as possible. Because the PROC REPORT portion of the generated table

is not relevant to the included macro writing, I’ve left out the code.

Page 10: Summary of Presentation

10Copyright © 2009 i3 | CONFIDENTIAL

Table Shell (see Handout Reference #1)Table 1.1 Summary of Statistical Analysis of Time-Adjusted PRO AUC by Scale

EORTC QLQ - C30 QOL Analysis Set

Treatment Comparison

95% CI

Scale Treatment n * Mean

Estimate Comparison Comparison

Estimate Lower Upper

Appetite loss scale A x xx.xx A-C x.xx x.xxx x.xxx

B x xx.xx B-C x.xx x.xxx x.xxx

C x xx.xx

Constipation scale A x xx.xx A-C x.xx x.xxx x.xxx

B x xx.xx B-C x.xx x.xxx x.xxx

C x xx.xx

Diarrhoea scale A x xx.xx A-C x.xx x.xxx x.xxx

B x xx.xx B-C x.xx x.xxx x.xxx

C x xx.xx

Additional scales to be included in subsequent pages…

Page X of Y

Footnote 1 Footnote X Program: xxx.sas Output: xxxrtf Source Data: QOL

Page 11: Summary of Presentation

11Copyright © 2009 i3 | CONFIDENTIAL

Five Easy Steps to Designing and Writing Macros

Step 1 - Know where you are and where you’re going.

Step 2 - Pretend you’re not writing a macro.

Step 3 - Look for patterns. Step 4 - Vary one parameter at a time. Step 5 - Put it all together.

Page 12: Summary of Presentation

12Copyright © 2009 i3 | CONFIDENTIAL

Step 1 - Know where you are and where you are going.

Do you even need to write a macro? Sometimes utilizing macro code is unnecessary and cumbersome.

Macro code is more difficult to debug than normal SAS code so macro language should only be used when it will be advantageous.

The benefits of using SAS macro language include its adaptability and portability which make it ideal for reducing the amount of code necessary when you find yourself repeating code over and over.

How do we know when using a macro is a good idea? I usually start to think about writing a macro when I what I’m working on involves repetition or pattern.

Page 13: Summary of Presentation

13Copyright © 2009 i3 | CONFIDENTIAL

Step 1 - Know where you are and where you are going. (See Handout Reference #2)

Where are we?

Page 14: Summary of Presentation

14Copyright © 2009 i3 | CONFIDENTIAL

Step 1 - Know where you are and where you are going. (See Handout Reference #2)Where are we going?

Annotated Table Shell Based on QOL Data

Treatment Comparison

95% CI

Scale Treatment cohortcd n

Mean Estimate

Compar-ison

Comparison Estimate Lower Upper

Appetite loss scale A n.n m.estimate A-C d.estimate d.lower d.upper

Data: B n.n m.estimate B-C d.estimate d.lower d.upper

<n/m/d>appetite C n.n m.estimate

Constipation scale A n.n m.estimate A-C d.estimate d.lower d.upper

Data: B n.n m.estimate B-C d.estimate d.lower d.upper

<n/m/d>constipation C n.n m.estimate

Diarrhoea scale A n.n m.estimate A-C d.estimate d.lower d.upper

<n/m/d>diarrhoea B n.n m.estimate B-C d.estimate d.lower d.upper

C n.n m.estimate

Additional scales to be included in subsequent pages…

Page X of Y

Footnote 1 Footnote X Program: xxx.sas Output: xxx.rtf Source Data: QOL

Page 15: Summary of Presentation

15Copyright © 2009 i3 | CONFIDENTIAL

Step 2 - Pretend you’re not writing a macro.

Ignore the fact that you want to write a macro and write out your code in regular SAS code without any macro language.

Why would I suggest this? – It’s important to make sure your basic SAS code works

properly before you “macrotize.”– While using macro code will make your work more

efficient in the long-run, in the short-term it can make the task more complex if you don’t have your basic code hammered out.

– Sometimes writing out the code in its basic form helps to identify additional patterns that will be useful for macro writing.

Page 16: Summary of Presentation

16Copyright © 2009 i3 | CONFIDENTIAL

Step 2 - Pretend you’re not writing a macro.QOL example: I want to SET the required data for each scale

together then MERGE the “N”, “mean” and “estimate/CI” data together.

/*EXAMPLE: Setting the “N” datasets together to get the N statistics*/libname c30auc "/project/data/stats/output/PRO/C30/AUC";data N; set C30AUC.NAPPETITE C30AUC.NCONSTIPATION

C30AUC.NDIARRHOEA C30AUC.NEMOTIONAL C30AUC.NFATIGUE C30AUC.NGLOBAL C30AUC.NINSOMNIA C30AUC.NNAUSEA

C30AUC.NPAIN C30AUC.NPHYSICAL C30AUC.NROLE;

run;

proc sort; by scale cohort;run;

Page 17: Summary of Presentation

17Copyright © 2009 i3 | CONFIDENTIAL

/*Merge the “N”, “mean” and “estimate/CI” data together to get final dataset ALL used in generating the table*/

data ALL; merge N /* “N” data */ M /* “mean” data*/ D; /* “estimate/confidence interval” data */ by scale cohort;run;

Using the ALL data I can now go ahead and use PROC REPORT to generate my table.

While this code is simple enough, I’m really interested in getting SAS to figure out which datasets I need based on what’s in the directory so the individual scale datasets don’t need to be specified.

Page 18: Summary of Presentation

18Copyright © 2009 i3 | CONFIDENTIAL

Step 2 - Pretend you’re not writing a macro. I am interested in generating the tables independent of the

input data. In other words, I don’t want to specify the individual datasets for each scale (eg. NAPPETITE, DPAIN) but want SAS to figure it out for me.

The following code creates a variable that includes the list of relevant datasets from a specified directory using SASHELP tables.

SASHELP tables are outside the scope of this paper but if you’re interested I’ve included the reference to a paper I wrote several years ago regarding this topic.

Try not to get bogged down in the specific code but notice the patterns that I’ve highlighted which will help write our macro later.

Page 19: Summary of Presentation

19Copyright © 2009 i3 | CONFIDENTIAL

Get “N” Population Datadata N_data;    length N_data $200;  set vstable;  by libname memname;  retain N_data;  if first.libname then do;    N_data=trim(left(libname))||"."||trim(left(memname));;  end;  else do;    N_data=trim(left(N_data))||" "||trim(left(libname))||"."||

trim(left(memname));  end;  call symput("N_data",N_data);  if last.libname;run;

Page 20: Summary of Presentation

20Copyright © 2009 i3 | CONFIDENTIAL

Step 2 - Pretend you’re not writing a macro.The PROC PRINT of the newly created N_DATA variable containing

the dataset lists is below:N_data

C30AUC.NAPPETITE C30AUC.NCONSTIPATION C30AUC.NDIARRHOEA C30AUC.NEMOTIONAL C30AUC.NFATIGUE C30AUC.NGLOBAL C30AUC.NINSOMNIA C30AUC.NNAUSEA C30AUC.NPAIN C30AUC.NPHYSICAL C30AUC.NROLE

Note how this list matches the SET statement we previously wrote out manually earlier in Step 2:

data N; set C30AUC.NAPPETITE C30AUC.NCONSTIPATION

C30AUC.NDIARRHOEA C30AUC.NEMOTIONAL C30AUC.NFATIGUE C30AUC.NGLOBAL

C30AUC.NINSOMNIA C30AUC.NNAUSEA C30AUC.NPAIN C30AUC.NPHYSICAL

C30AUC.NROLE;run;

Page 21: Summary of Presentation

21Copyright © 2009 i3 | CONFIDENTIAL

Step 3 - Look for patterns and variability.

Now that you know your code is working well, it is time to start thinking about introducing macro code.

First, let’s revisit the annotated shell to look for patterns and variability in dataset and variable naming conventions.

Next, revisit the code from Step 2 for the N data and compare it to the same code for the M data.

The locations of the “N” and “M” differences demonstrate a great place to add a macro parameter.

Further, by noting the similarities in variable naming conventions (eg. mean, estimate, etc.) we know the data can be set easily together without the need to rename variables.

Page 22: Summary of Presentation

22Copyright © 2009 i3 | CONFIDENTIAL

Step 3 – Look for patterns and variability.Returning to the annotated shell, note the patterns in the dataset (eg. scales named the same thing in N/M/D datasets) as well as the variability (eg. each scale is prefixed differently).

Annotated Table Shell Based on QOL Data

Treatment Comparison

95% CI

Scale Treatment cohortcd n

Mean Estimate

Compar-ison

Comparison Estimate Lower Upper

Appetite loss scale A n.n m.estimate A-C d.estimate d.lower d.upper

Data: B n.n m.estimate B-C d.estimate d.lower d.upper

<n/m/d>appetite C n.n m.estimate

Constipation scale A n.n m.estimate A-C d.estimate d.lower d.upper

Data: B n.n m.estimate B-C d.estimate d.lower d.upper

<n/m/d>constipation C n.n m.estimate

Diarrhoea scale A n.n m.estimate A-C d.estimate d.lower d.upper

<n/m/d>diarrhoea B n.n m.estimate B-C d.estimate d.lower d.upper

C n.n m.estimate

Additional scales to be included in subsequent pages…

Page X of Y

Footnote 1 Footnote X Program: xxx.sas Output: xxx.rtf Source Data: QOL

Page 23: Summary of Presentation

23Copyright © 2009 i3 | CONFIDENTIAL

Get “N” Population Datadata N_data;    length N_data $200;  set vstable;  by libname memname;  retain N_data;  if first.libname then do;    N_data=trim(left(libname))||"."||trim(left(memname));;  end;  else do;    N_data=trim(left(N_data))||" "||trim(left(libname))||"."||

trim(left(memname));  end;  call symput("N_data",N_data);  if last.libname;run;

Page 24: Summary of Presentation

24Copyright © 2009 i3 | CONFIDENTIAL

Get “M” Population Datadata M_data;    length M_data $200;  set vstable;  by libname memname;  retain M_data;  if first.libname then do;    M_data=trim(left(libname))||"."||trim(left(memname));;  end;  else do;    M_data=trim(left(M_data))||" "||trim(left(libname))||"."||

trim(left(memname));  end;  call symput(“M_data",M_data);  if last.libname;run;

Page 25: Summary of Presentation

25Copyright © 2009 i3 | CONFIDENTIAL

Step 4 - Vary one parameter at a time.

It’s best to vary one parameter at a time and check your work before continuing to add additional parameters.

By varying only one parameter at a time the code will be easier to debug should there be problems.

See Handout Reference #3 for the macro %AUC. Similar in concept to varying one parameter at a time is to

work from the “inside to the outside.” When working with “inner” and “outer” macros (often

referred to as “nested”) it is best to work on the inner macros and then the outer macros to reduce complexity in debugging.

Page 26: Summary of Presentation

26Copyright © 2009 i3 | CONFIDENTIAL

Step 4 - Vary one parameter at a time.

Remember that we’re generating not one but three similar tables. The titles are below:

Table 1.1 Summary of Statistical Analysis of Time-Adjusted PRO AUC by Scale

EORTC QLQ - C30QOL Analysis Set

Table 1.2 Summary of Statistical Analysis of Time-Adjusted PRO AUC by Scale

EORTC QLQ – LC13QOL Analysis Set

Table 1.3 Summary of Statistical Analysis of Time-Adjusted PRO AUC by Scale

EORTC QLQ – TSQMQOL Analysis Set

Page 27: Summary of Presentation

27Copyright © 2009 i3 | CONFIDENTIAL

Step 4 - Vary one parameter at a time.

Similar in concept to varying one parameter at a time is to work from the “inside to the outside.”

When working with “inner” and “outer” macros (often referred to as “nested”) it is best to work on the inner macros and then the outer macros to reduce complexity in debugging.

See Handout Reference #4 for example of nested macros. Note that %AUC generates the interior of the table (eg.

columns, rows, etc.) and is the inner macro. Note that %PRO generates each individual table (eg.

t_pro_auc_c30) and is the outer macro.

Page 28: Summary of Presentation

28Copyright © 2009 i3 | CONFIDENTIAL

Step 5 - Put it all together.1. Do a complete run of your program and fix any errors,

warnings, etc. in the log file.

2. Review results and compare back to the original assignment. Have you included all of the necessary components?

3. Identify additional places of variability and if applicable, adjust the code and start step #5 again.

4. Once you’re convinced the output is what is desired, review the macro code and clean-up as necessary. Check for indentation for better readability and document with comments throughout the macro especially for large, nested macros.

5. See Handout Reference #5 for complete code.

Page 29: Summary of Presentation

29Copyright © 2009 i3 | CONFIDENTIAL

Another Example Using the Five Steps: Vital Sign Data

The Task:

1. Create two datasets containing vital sign data. 1. Blood Pressure Measurements (Systolic and Diastolic BP)

2. Non-Blood Pressure Measurements (Pulse, Respiration and Temperature).

1. Add unit labels to each vital sign parameters (eg. mmHG for BP measurements).

Page 30: Summary of Presentation

30Copyright © 2009 i3 | CONFIDENTIAL

Five Easy Steps to Designing and Writing Macros

Step 1 - Know where you are and where you’re going.

Step 2 - Pretend you’re not writing a macro.

Step 3 - Look for patterns. Step 4 - Vary one parameter at a time. Step 5 - Put it all together.

Page 31: Summary of Presentation

31Copyright © 2009 i3 | CONFIDENTIAL

Step 1 – Know where you are and where you’re going: Where are we?

SUBJECT RESULT TEST

1 96.000 PULSE

1 18.000 RESP

1 36.800 TEMP

1 92.000 SYSBP

1 60.000 DIABP

2 96.000 PULSE

2 18.000 RESP

2 36.700 TEMP

2 129.000 SYSBP

2 84.333 DIABP

Page 32: Summary of Presentation

32Copyright © 2009 i3 | CONFIDENTIAL

Step 1 – Know where you are and where you’re going: Where are we going?

Data set with Blood Pressure Parameters:

SUBJECT TEST RESULT UNIT

1 SYSBP 92.000 mmHg

2 SYSBP 129.000 mmHg

3 SYSBP 90.000 mmHg

4 SYSBP 120.667 mmHg

5 SYSBP 144.000 mmHg

1 DIABP 60.000 mmHg

2 DIABP 84.333 mmHg

3 DIABP 60.667 mmHg

4 DIABP 73.333 mmHg

5 DIABP 84.000 mmHg

Page 33: Summary of Presentation

33Copyright © 2009 i3 | CONFIDENTIAL

Step 1 – Know where you are and where you’re going: Where are we going?

Data set with non-Blood Pressure Parameters:

SUBJECT TEST RESULT UNIT

1 PULSE 96.00 Beats per minute

2 PULSE 96.00 Beats per minute

3 PULSE 77.00 Beats per minute

4 PULSE 74.00 Beats per minute

5 PULSE 80.00 Beats per minute

1 RESP 18.00 Breaths per minute

2 RESP 18.00 Breaths per minute

3 RESP 18.00 Breaths per minute

4 RESP 16.00 Breaths per minute

5 RESP 18.00 Breaths per minute

1 TEMP 36.80 Degrees Celsius

2 TEMP 36.70 Degrees Celsius

3 TEMP 37.06 Degrees Celsius

4 TEMP 36.60 Degrees Celsius

5 TEMP 36.89 Degrees Celsius

Page 34: Summary of Presentation

34Copyright © 2009 i3 | CONFIDENTIAL

Step 2 - Pretend you’re not writing a macro.data pulse;

length UNIT $50;

set vitals_v;

if test='PULSE';

UNIT='Beats per minute';

run;

data resp;

length UNIT $50;

set vitals_v;

if test='RESP';

UNIT='Breaths per minute';

run;

Page 35: Summary of Presentation

35Copyright © 2009 i3 | CONFIDENTIAL

Step 3 - Look for patterns.data pulse;

length UNIT $50;

set vitals_v;

if test=“PULSE”;

UNIT=“Beats per minute”;

run;

data resp;

length UNIT $50;

set vitals_v;

if test=“RESP”;

UNIT=“Breaths per minute”;

run;

Page 36: Summary of Presentation

36Copyright © 2009 i3 | CONFIDENTIAL

Step 4 - Vary one parameter at a time.

%macro inner(test=);

data &test; /*Vary the TEST parameter first*/

length UNIT $50;

set vitals_v;

if test=“&test”;

UNIT=“Beats per minute”; /*Vary the UNIT parameter next*/

run;

%mend inner;

%inner(test=PULSE)

%inner(test=RESP)

Page 37: Summary of Presentation

37Copyright © 2009 i3 | CONFIDENTIAL

Step 5 - Put it all together.%macro outer(name=,data=,label=);

%macro inner(test=,unit=);

data &test;

length UNIT $50;

set vitals_v;

if test="&test";

UNIT="&unit";

run;

%mend inner;

%inner(test=RESP,unit=Breaths per minute);

%inner(test=PULSE,unit=Beats per minute);

%inner(test=TEMP,unit=Degrees Celsius);

%inner(test=SYSBP,unit=mmHg);

%inner(test=DIABP,unit=mmHg);

data &name;

set &data;

run;

%mend outer;

%outer(name=BP,data=sysbp diabp,label=Blood Pressure Parameters);

%outer(name=NON_BP,data=pulse resp temp,label=non-Blood Pressure Parameters);

Page 38: Summary of Presentation

38Copyright © 2009 i3 | CONFIDENTIAL

Step 5 - Put it all together.%macro outer(name=,data=,label=);

%macro inner(test=,unit=);

data &test;

length UNIT $50;

set vitals_v;

if test="&test";

UNIT="&unit";

run;

%mend inner;

%inner(test=RESP,unit=Breaths per minute);

%inner(test=PULSE,unit=Beats per minute);

%inner(test=TEMP,unit=Degrees Celsius);

%inner(test=SYSBP,unit=mmHg);

%inner(test=DIABP,unit=mmHg);

data &name;

set &data;

run;

%mend outer;

%outer(name=BP,data=sysbp diabp,label=Blood Pressure Parameters);

%outer(name=NON_BP,data=pulse resp temp,label=non-Blood Pressure Parameters);

Page 39: Summary of Presentation

39Copyright © 2009 i3 | CONFIDENTIAL

Step 5 – Put it all together.

Data set with Blood Pressure Parameters:

SUBJECT TEST RESULT UNIT

1 SYSBP 92.000 mmHg

2 SYSBP 129.000 mmHg

3 SYSBP 90.000 mmHg

4 SYSBP 120.667 mmHg

5 SYSBP 144.000 mmHg

1 DIABP 60.000 mmHg

2 DIABP 84.333 mmHg

3 DIABP 60.667 mmHg

4 DIABP 73.333 mmHg

5 DIABP 84.000 mmHg

Page 40: Summary of Presentation

40Copyright © 2009 i3 | CONFIDENTIAL

Step 5 – Put it all together.Data set with non-Blood Pressure Parameters:

SUBJECT TEST RESULT UNIT

1 PULSE 96.00 Beats per minute

2 PULSE 96.00 Beats per minute

3 PULSE 77.00 Beats per minute

4 PULSE 74.00 Beats per minute

5 PULSE 80.00 Beats per minute

1 RESP 18.00 Breaths per minute

2 RESP 18.00 Breaths per minute

3 RESP 18.00 Breaths per minute

4 RESP 16.00 Breaths per minute

5 RESP 18.00 Breaths per minute

1 TEMP 36.80 Degrees Celsius

2 TEMP 36.70 Degrees Celsius

3 TEMP 37.06 Degrees Celsius

4 TEMP 36.60 Degrees Celsius

5 TEMP 36.89 Degrees Celsius

Page 41: Summary of Presentation

41Copyright © 2009 i3 | CONFIDENTIAL

Conclusion

Learning to write macro code is a useful skill for the SAS programmer.

Macros can make your programs more efficient and reduce the amount of code writing that needs to be done.

By learning a few basic concepts of macro writing, the beginning SAS programmer can easily get started in designing and writing macros.