26
Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc.

Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

Embed Size (px)

Citation preview

Page 1: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

Let SAS Do the Coding for You!

Robert Williams

Business Info Analyst Sr.

WellPoint Inc.

Page 2: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

2

Overview

• Coding Dilemma: Direct coding or “Robo-Coding”?

• Step 1: Setting up SAS Data Set for “Robo-Coding”

• Step 2a: Technique #1 – Macro variable• Generate SAS code using DATA _NULL_ and CALL SYMPUT to a macro

• Step 2b: Technique #2 – External SAS code file• Generate SAS code using DATA _NULL_ and PUT statements to write to

an external SAS program file

• Step 3: Run the auto-generated SAS codes

• Extra Example and Conclusion

Page 3: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

3

Coding Dilemma: Claims Data Extract Request

• Example data extraction asking for a subset of claims based on the following diagnosis and the corresponding diagnosis categories

• Muscle problems that are of a disabling nature, limited to: 356.1, 359.0-359.2, 045.9, 728.11

• Hemophilia: 286.0-286.4

• Acquired or congenital heart disease: 745.0-747.9; 424.0-429.9

Page 4: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

4

Coding Dilemma: 1-digit & 2-digit DX sub-codes

• Diagnosis (DX) codes can be cumbersome to code into the WHERE statement due to having to account for each 1-digit and 2-digit sub-codes.

• Example: To account for all medical DX codes from 359.0 to 359.2 (www.icd9data.com)

Page 5: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

5

Coding Dilemma: Write the code the hard way

data WORK.CLAIMS_SUBSET_DX_cat; set WORK.SESUG_samp_claims; where (IDCD_ID like "0459%" or IDCD_ID like "3561%" or IDCD_ID like "3590%" or IDCD_ID like "3591%" or IDCD_ID like "3592%" or IDCD_ID like "72811%" or IDCD_ID like "2860%" or IDCD_ID like "2861%" or IDCD_ID like "2862%" or IDCD_ID like "2863%" or IDCD_ID like "2864%" or IDCD_ID like "424%" or IDCD_ID like "425%" or IDCD_ID like "426%" or IDCD_ID like "427%" or IDCD_ID like "428%" or IDCD_ID like "429%" or IDCD_ID like "745%" or IDCD_ID like "746%" or IDCD_ID like "747%"); format DX_category $150.; select; when (substr(IDCD_ID,1,4) = '0459') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,4) = '3561') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,4) = '3590') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,4) = '3591') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,4) = '3592') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,5) = '72811') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,4) = '2860') DX_category = 'Hemophilia'; when (substr(IDCD_ID,1,4) = '2861') DX_category = 'Hemophilia'; when (substr(IDCD_ID,1,4) = '2862') DX_category = 'Hemophilia'; when (substr(IDCD_ID,1,4) = '2863') DX_category = 'Hemophilia'; when (substr(IDCD_ID,1,4) = '2864') DX_category = 'Hemophilia'; when (substr(IDCD_ID,1,3) = '424') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '425') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '426') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '427') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '428') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '429') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '745') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '746') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '747') DX_category = 'Heart disease'; OTHERWISE DX_category = "WHAT CATEGORY?!"; END; run;

Code all the LIKE

statements with %

Code all the WHEN

statements with

different sizes for SUBSTR

function

Page 6: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

6

Step 1: Setting up SAS Data File

• Create a SAS data set with key data. It can be created in a DATA step or imported from Excel.

DX_list DX_CATEGORY

0459 Muscle problems

3561 Muscle problems

3590 Muscle problems

3591 Muscle problems

3592 Muscle problems

72811 Muscle problems

2860 Hemophilia

2861 Hemophilia

2862 Hemophilia

2863 Hemophilia

2864 Hemophilia

424 Heart disease

425 Heart disease

426 Heart disease

427 Heart disease

428 Heart disease

429 Heart disease

745 Heart disease

746 Heart disease

747 Heart disease

Categories for the WHEN

statements

DX codes for WHERE statements

and WHEN statements

(without decimals)

Muscle problems: 356.1, 359.0-359.2, 045.9, 728.11Hemophilia: 286.0-286.4Heart disease: 745.0-747.9; 424.0-429.9

Page 7: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

7

Step 2a: Technique #1 – Macro Variable

• Using the SAS data set from previous slide (Step 1), this technique uses DATA _NULL_ to generate a data string for the WHERE statement like this“WHERE IDCD_ID like '747%' or IDCD_ID like '0459%' or …”

• Uses the RETAIN statement so the data string is retained for each DATA step iteration

• Uses the END statement and CALL SYMPUT function to output the data string to a macro variable. The CALL SYMPUT is invoked at the last iteration of the DATA step

Page 8: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

8

Step 2a: Technique #1 – The Robo-Coder

• DATA _NULL_ step code to generate the data string data _null_;

set WORK.SESUG_DX_LISTING END=last; format where_string $5000.; retain where_string; /* This is the first line for the where statement */ if _n_ = 1 then where_string = 'IDCD_ID like "'||trim(DX_LIST)||'%"'; /* subsequent lines that adds the "OR" line for the where statement */

else where_string = trim(where_string)||' '||'OR IDCD_ID like "'||trim(DX_LIST)||'%"';

/* At the end, this outputs entire where statement to the macro variable */ if last then call symput("IDCD_ID_where_statement",where_string); run;

At _n_=1, start the string. Then At _n_>1, “OR…” is added to the

data string

At _n_ = 1, where_string = IDCD_ID like "0459%"

At _n_ = 2, where_string = IDCD_ID like "0459%"

OR IDCD_ID like "3561%"

At the last iteration, the DATA _NULL_ outputs the where_string value as a SAS macro variable named: “IDCD_ID_WHERE_STATEMENT"

Page 9: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

9

Step 2a: Technique #1 – The Robo-Coder

• DATA _NULL_ step code to generate the data string

• Using the %PUT statement, the macro variable will look like this in the SAS log

data _null_; set WORK.SESUG_DX_LISTING END=last; format where_string $5000.; retain where_string; /* This is the first line for the where statement */ if _n_ = 1 then where_string = 'IDCD_ID like "'||trim(DX_LIST)||'%"'; /* subsequent lines that adds the "OR" line for the where statement */

else where_string = trim(where_string)||' '||'OR IDCD_ID like "'||trim(DX_LIST)||'%"';

/* At the end, this outputs entire where statement to the macro variable */ if last then call symput("IDCD_ID_where_statement",where_string); run;

%put &IDCD_ID_where_statement; IDCD_ID like "0459%" or IDCD_ID like "3561%" or IDCD_ID like "3590%" or IDCD_ID like "3591%" or IDCD_ID like "3592%" or IDCD_ID like "72811%" or IDCD_ID like "2860%" or IDCD_ID like "2861%" or IDCD_ID like "2862%" or IDCD_ID like "2863%" or IDCD_ID like "2864%" or IDCD_ID like "424%" or IDCD_ID like "425%" or IDCD_ID like "426%" or IDCD_ID like "427%" or IDCD_ID like "428%" or IDCD_ID like "429%" or IDCD_ID like "745%" or IDCD_ID like "746%" or IDCD_ID like "747%"

_n_ = 1 starts the string, after that, the “OR…” is added to the

data string

Page 10: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

10

Step 2b: Technique #2 – External Program File

• Using the SAS data set from Step 1, this technique uses DATA _NULL_ to write the series of SELECT/WHEN statements to an external SAS program file using the PUT statements

• Uses FILENAME statement to establish external SAS file reference i.e. “tmpwhen” to be used with FILE statement

• Uses LENGTH function to determine the size of the DX_LIST field of the SAS data set (from step 1) i.e.“str_length = put(length(DX_LIST),1.0);”

• Each DATA step iteration writes the each WHEN statement for the DX_ Category

Page 11: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

11

Step 2b: Technique #2 – External Program File

• Uses the END statement (inside the SET statement) to write the final the final two lines of the series of SELECT/WHEN statements i.e.“OTHERWISE DX_category="WHAT CATEGORY?!";END;”

• When the DATA _NULL_ is completed, the external SAS program file is created with SELECT/WHEN statement block generated by all the PUT statements inside the DATA _NULL_

Page 12: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

12

Step 2b: Technique #2 – The Robo-Coder

• DATA _NULL_ step code to write to external program

filename tmpwhen "C:\SESUG\tempwhenstatements.sas"; /* Makes a SAS program file */ data _null_; set WORK.SESUG_DX_LISTING END=last; format when_string $250.; file tmpwhen; /* This is the file the PUT statements writes to */ /* This sets the length for the SUBSTR function in the WHEN statement */ str_length = put(length(DX_LIST),1.0); /* Write first line which is just the SELECT statement */ if _n_ = 1 then put @1 'select;'; /* Write each WHEN statement for each DX with the corresponding category */

when_string = "when (substr(IDCD_ID,1,"||trim(str_length)||") = '"||trim(DX_LIST)||"') DX_category = '"||trim(DX_CATEGORY)||"';";

put @5 when_string; /* Finish writing the last two statements in the SELECT/WHEN statement block */ if last then do; put @5 'OTHERWISE DX_category = "WHAT CATEGORY?!";'; put @1 'END;'; end; run;

Filename statement established an external SAS file with reference name ”tmpwhen”.LENGTH function to get the size of DX_LIST field. It changes for each DATA step iteration.

At _n_ = 1, it writes to the SAS file these two lines select; when (substr(IDCD_ID,1,4) = '0459') DX_category = 'Muscle problems';

At _n_ = 2 and so on, it writes another line to the SAS file like this when (substr(IDCD_ID,1,4) = '3561') DX_category = 'Muscle problems';

At the last iteration, it writes these two lines to the SAS file OTHERWISE DX_category = "WHAT CATEGORY?!";END;

Page 13: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

13

Step 2b: Technique #2 – The Robo-Coder

• Opening up the SAS program name “tempwhenstatements.sas”, you will see this block of statements

select; when (substr(IDCD_ID,1,4) = '0459') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,4) = '3561') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,4) = '3590') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,4) = '3591') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,4) = '3592') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,5) = '72811') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,4) = '2860') DX_category = 'Hemophilia'; when (substr(IDCD_ID,1,4) = '2861') DX_category = 'Hemophilia'; when (substr(IDCD_ID,1,4) = '2862') DX_category = 'Hemophilia'; when (substr(IDCD_ID,1,4) = '2863') DX_category = 'Hemophilia'; when (substr(IDCD_ID,1,4) = '2864') DX_category = 'Hemophilia'; when (substr(IDCD_ID,1,3) = '424') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '425') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '426') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '427') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '428') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '429') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '745') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '746') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '747') DX_category = 'Heart disease'; OTHERWISE DX_category = "WHAT CATEGORY?!"; END;

Notice how the length size

changes for the SUBSTR function

Page 14: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

14

Step 3: Run the auto-generated SAS codes

• Invoke the macro variable inside the WHERE statement using the “&” sign or ampersand symbol

• Insert external SAS program using the %INCLUDE statement. Note: this line, “options source2;” will allow the external SAS code to display in the log.

/* Source2 option will allow the external SAS code to display in the log */ options source2; data WORK.CLAIMS_SUBSET_DX_cat; set WORK.SESUG_samp_claims;

/* The macro variable will put the WHERE conditions we created */ where (&IDCD_ID_where_statement); format DX_category $150.;

/* Inserts the external SAS code that we created the WHEN statements */ %include tmpwhen; run;

Invoke macro here

Insert external SAS code as referenced

with FILENAME

Page 15: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

15

Step 3: What the auto-generated code looks like

data WORK.CLAIMS_SUBSET_DX_cat; set WORK.SESUG_samp_claims; where (IDCD_ID like "0459%" or IDCD_ID like "3561%" or IDCD_ID like "3590%" or IDCD_ID like "3591%" or IDCD_ID like "3592%" or IDCD_ID like "72811%" or IDCD_ID like "2860%" or IDCD_ID like "2861%" or IDCD_ID like "2862%" or IDCD_ID like "2863%" or IDCD_ID like "2864%" or IDCD_ID like "424%" or IDCD_ID like "425%" or IDCD_ID like "426%" or IDCD_ID like "427%" or IDCD_ID like "428%" or IDCD_ID like "429%" or IDCD_ID like "745%" or IDCD_ID like "746%" or IDCD_ID like "747%"); format DX_category $150.; select; when (substr(IDCD_ID,1,4) = '0459') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,4) = '3561') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,4) = '3590') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,4) = '3591') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,4) = '3592') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,5) = '72811') DX_category = 'Muscle problems'; when (substr(IDCD_ID,1,4) = '2860') DX_category = 'Hemophilia'; when (substr(IDCD_ID,1,4) = '2861') DX_category = 'Hemophilia'; when (substr(IDCD_ID,1,4) = '2862') DX_category = 'Hemophilia'; when (substr(IDCD_ID,1,4) = '2863') DX_category = 'Hemophilia'; when (substr(IDCD_ID,1,4) = '2864') DX_category = 'Hemophilia'; when (substr(IDCD_ID,1,3) = '424') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '425') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '426') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '427') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '428') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '429') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '745') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '746') DX_category = 'Heart disease'; when (substr(IDCD_ID,1,3) = '747') DX_category = 'Heart disease'; OTHERWISE DX_category = "WHAT CATEGORY?!"; END; run;

It looks exactly the

same as slide 6 if I

hard coded it manually.

Page 16: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

16

Step 3: Result of the claims data extraction

• Here is the results of the claims data extraction as well as the DX categories.

Claim_ID DOS IDCD_ID DX_category

521959269 01MAR20132863 Hemophilia

620558593 01MAR20133591 Muscle problems

834446494 01MAR201342511 Heart disease

350650440 01MAR20132860 Hemophilia

999400510 04MAR20137467 Heart disease

295583464 04MAR201374510 Heart disease

294778600 04MAR20137452 Heart disease

725175228 06MAR20132860 Hemophilia

576261199 06MAR20134280 Heart disease

325277861 06MAR201335921 Muscle problems

702352846 07MAR20134254 Heart disease

575425830 07MAR20134281 Heart disease

740288358 07MAR20137464 Heart disease

245189238 08MAR20132860 Hemophilia

930080969 11MAR201342821 Heart disease

411880687 11MAR20133591 Muscle problems

931851740 14MAR20133591 Muscle problems

305813261 16MAR20133590 Muscle problems

Page 17: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

17

Extra Example

• Suppose, we get data with the patients’ diagnoses. We want to transpose the patients diagnosis to put the most severe diagnoses in the first column based on diag scores

NOTE: Some patient has

only one diag while others

has four diags (or maybe

more.)

DX 055 has a higher diag score than DX 058. So, it will be first for the first patient.

Page 18: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

18

Extra Example

SAS code to sort each patient ID with diag_score in descending sort before transposing the diags into columns.

/* Sort the diag data with the highest diag score on top for each patient */proc sort data=work.vasug_sample_diag;

by patient_id descending diag_score;run;/* Transpose the diag for each patient */proc transpose data=work.vasug_sample_diag out=work.trans_diags (drop= _name_ _label_) prefix=diag_;

var DIAG;by patient_id;

run;

Page 19: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

19

Extra Example

SAS code to sort each patient ID with diag_score in descending sort before transposing the diags into columns.

/* Sort the diag data with the highest diag score on top for each patient */proc sort data=work.vasug_sample_diag;

by patient_id descending diag_score;run;/* Transpose the diag for each patient */proc transpose data=work.vasug_sample_diag out=work.trans_diags (drop= _name_ _label_) prefix=diag_;

var DIAG;by patient_id;

run;

Page 20: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

20

Extra Example

It was nice to transpose the diags but the care managers don’t know who the patient is. We join with member demographics but we don’t know how many diag columns. So, use PROC CONTENTS (with OUT= and VARNUM options) to see all the diag columns from the PROC TRANSPOSE. Looks like at-most 5 diag columns

proc contents data=work.trans_diags out=work.trans_diag_var_list varnum noprint;run;

Page 21: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

21

Extra Example

Take advantage of the output from PROC CONTENTS to create the variable list for PROC SQL when joining to member table. Use marco variable method (technique #1)

data _null_; format string_diag_var_list $500.; /* Set it large enough */ set work.trans_diag_var_list end=last; retain string_diag_var_list " "; if _n_ = 2 then string_diag_var_list = NAME; /* This will be the first one */ else string_diag_var_list = trim(string_diag_var_list)||", "||trim(NAME); if last then call symput("diag_var_list",trim(string_diag_var_list));run;%put &DIAG_var_list; /* See what it looks like */

Page 22: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

22

Extra Example

Take advantage of the output from PROC CONTENTS to create the variable list for PROC SQL when joining to member table. Use marco variable method (technique #1)

data _null_; format string_diag_var_list $500.; /* Set it large enough */ set work.trans_diag_var_list end=last; retain string_diag_var_list " "; if _n_ = 2 then string_diag_var_list = NAME; /* This will be the first one */ else string_diag_var_list = trim(string_diag_var_list)||", "||trim(NAME); if last then call symput("diag_var_list",trim(string_diag_var_list));run;%put &DIAG_var_list; /* See what it looks like */

Page 23: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

23

Extra Example

Put the macro variable into the PROC SQLproc sql;create table work.Final_member_ordered_DIAGSas select distinct a.MEMBER_ID, a.MEMBER_NAME, a.AGE, a.GENDER, &DIAG_var_listfrom VASUG_sample_MEMBERS a left join work.trans_diags b on a.member_id=b.patient_id;quit;

PROC SQL after invoking the macro substitutionproc sql;create table work.Final_member_ordered_DIAGSas select distinct a.MEMBER_ID, a.MEMBER_NAME, a.AGE, a.GENDER, DIAG_1, DIAG_2, DIAG_3, DIAG_4, DIAG_5from VASUG_sample_MEMBERS a left join work.trans_diags b on a.member_id=b.patient_id;quit;

Page 24: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

24

Conclusion

• These two techniques saved us a significant amount of manual coding labor.

• This example may not seem like much coding but the idea will be useful if we have a large number of repetitive SAS statements to code in a dynamical process

• This idea is not limited to just making WHERE statements or SELECT/WHEN statements.

• The Robo-Coder can be expanded to include other SAS PROCs as well as dozen or hundreds of routine weekly/monthly reports (See my VASUG Spring 2011 presentation on the PROC REPORT example)

Page 25: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

25

Questions?

• Questions?

• SAS code and sample SAS data set is available. Send me an email.

• Contact:Robert WilliamsWellPoint Inc.4425 Corporation Ln.Virginia Beach, VA 23462 [email protected]

Page 26: Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc

26

References

• SAS Certification Prep Guide: Base Programming for SAS 9 Third Edition Cary, NC: SAS Institute Inc. 2011

• SAS Certification Prep Guide: Advanced Programming for SAS 9 Third Edition Cary, NC: SAS Institute Inc. 2011

• The Web's Free 2013 Medical Coding Reference. (http://www.icd9data.com/)

• Robo-SAS Coding? Yes! We can auto-generate SAS codes from a SAS data set. (http://vasug.org/about-2/past-presentations/)