Upload
mark-tabladillo
View
4.021
Download
3
Embed Size (px)
DESCRIPTION
The where expression allows for declaring filters on SAS System datasets. This presentation illustrates some uses in the data step and SAS Macro Language.
Citation preview
Introduction to Where ExpressionsMark Tabladillo, Ph.D.
Software Developer, MarkTab ConsultingAssociate Faculty University of PhoenixAssociate Faculty, University of Phoenix
January 30, 2007
IntroductionIntroduction
• WHERE expressions allow for processingWHERE expressions allow for processing subsets of observations
• WHERE expressions can be used in theWHERE expressions can be used in the DATA step or with PROC (procedure) statements
• This presentation will contain a series of features and examples of the WHERE pexpression
• We end with some intensive macros
WHERE-expression ProcessingWHERE expression Processing
• Enables us to conditionally select a subsetEnables us to conditionally select a subset of observations, so that SAS processes only the observations that meet a set ofonly the observations that meet a set of specified conditions.
http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999253.htm
Work Sales DatasetWork.Sales Datasetdata work.sales (drop=i randomState);
length state $2 sales 8 randomState 3;length state $2 sales 8 randomState 3;do i = 1 to 2500;
randomState = round(rand('gaussian',3,1)+0.5);if randomState in (1,2,3,4,5) then do;( )
select(randomState);when(1) state='TN';when(2) state='AL';when(3) state='GA';when(3) state= GA ;when(4) state='FL';when(5) state='MS';
end;sales = int(rand('gaussian',1000000,500000));output work.sales;
end;end;end;run;
Data Set Option or StatementData Set Option or Statementdata work.highSales;
set work.sales (where=(sales>1500000));run;
data work highSales;data work.highSales;set work.sales;where sales>1500000;run;
proc means data=work.sales;where sales>1500000;run;;
Data Set Option or StatementData Set Option or Statementdata work.lowSales;
set work.sales (where=(sales<0));run;
data work lowSales;data work.lowSales;set work.sales;where sales<0;run;
proc means data=work.sales (where=(sales<0));run;
Multiple ComparisonsMultiple Comparisonsdata work.highFloridaSales;
set work.sales (where=(sales>1500000 and state = 'FL'));run;
data work highFloridaSales;data work.highFloridaSales;set work.sales;where sales>1500000 and state = 'FL';run;
proc freq data=work.sales;tables state;where sales>1500000 and state = 'FL';;run;
SAS FunctionsSAS Functionsdata work.highFloridaSales;
set work.sales (where=(sales>1500000 and substr(state,1,1) = 'F'));run;
data work highFloridaSales;data work.highFloridaSales;set work.sales;where sales>1500000 and substr(state,1,1) = 'F';run;
proc means data=work.sales;where sales>1500000 and substr(state,1,1) = 'F';run;;
Comparison OperatorsComparison OperatorsPriority Order of
EvaluationSymbols Mnemonic
EquivalentEvaluation EquivalentGroup I right to left **
++-
ˆ ¬ ~ NOT>< MIN<> MAX
http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000780367.htm
Comparison OperatorsComparison OperatorsPriority Order of
EvaluationSymbols Mnemonic
EquivalentEvaluation EquivalentGroup II left to right *
//Group
IIIleft to right +
-Group
IVleft to right || ¦¦ !!
IV
http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000780367.htm
Comparison OperatorsComparison OperatorsPriority Order of
EvaluationSymbols Mnemonic
EquivalentEvaluation EquivalentGroup
Vleft to right < LT
<= LE= EQ
¬= NE>= GE> GT
IN
http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000780367.htm
Comparison OperatorsComparison OperatorsPriority Order of
EvaluationSymbols Mnemonic
EquivalentEvaluation EquivalentGroup
VIleft to right & AND
Group VII
left to right | ¦ ! OR
http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000780367.htm
Comparison OperatorsComparison Operatorsdata work.extremeNonGeorgia;
set work.sales (where=((sales<0 | sales>1500000) and state in ('TN','AL','FL','MS')));
run;
data work.extremeNonGeorgia;set work.sales;where (sales<0 | sales>1500000) and state in ('TN','AL','FL','MS');run;
data work.extremeNonGeorgia;set work.sales;;where ^ (0 <= sales <= 1500000) & state ne 'GA';run;
http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
“Between And”Between Anddata work.boundedNonGeorgia;
set work.sales (where=((sales between 1000000 and 1500000) & state in ('TN','AL','FL','MS')));
run;
data work.boundedNonGeorgia;set work.sales;where (sales between 1000000 and 1500000) &
t t i ('TN' 'AL' 'FL' 'MS')state in ('TN','AL','FL','MS');run;
http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
Contains ?Contains ?data work.LStates;
set work.sales (where=(state contains 'L'));run;
data work LStates;data work.LStates;set work.sales;where state contains 'L';run;
data work.LStates;set work.sales;where state ? 'L';;run;
http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
Is Null/Is MissingIs Null/Is Missingdata work.nullStates;
set work.sales (where=(state is null));run;
data work.missingStates;set work.sales (where=(state is missing));se o sa es ( e e (s a e s ss g));run;
data work.nullSales;set work sales;set work.sales;where sales is missing;run;
data work.nonNullSales;set work.sales;where sales is not missing;run;run;
http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
LikeLikedata work.likeL;
set work sales (where=(state like '%L'));set work.sales (where=(state like '%L'));run;
data work.likeL;set work.sales (where=(state like "%L"));run;
data work likeL;data work.likeL;set work.sales (where=(state like "%%L"));run;
data work.notLikeG;set work.sales;where state not like 'G_';run;run;
http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
Sounds Like (Soundex)Sounds Like (Soundex)data work.soundsLikeFill;
set work.sales (where=(state =* 'fill'));run;
data work notSoundsLikeTin;data work.notSoundsLikeTin;set work.sales;where state not =* 'tin';run;
http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
“Same And”Same Anddata work.boundedNonGeorgia;
set work.sales (where=((sales between 1000000 and 1500000) & state in ('TN','AL','FL','MS')));
run;
data work.boundedNonGeorgia;set work.sales;where (sales between 1000000 and 1500000);
h d t t i ('TN' 'AL' 'FL' 'MS')where same and state in ('TN','AL','FL','MS');run;
data work.boundedNonGeorgia;g ;set work.sales;where same and (sales between 1000000 and 1500000); where same and state in ('TN','AL','FL','MS');run;run;
http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
WHERE vs Subsetting IFWHERE vs. Subsetting IFTask Method
Make the selection in a procedure without using a WHERE expressionMake the selection in a procedure without using a preceding DATA step
WHERE expression
Take advantage of the efficiency available with an indexed data set
WHERE expression
Use one of a group of special operators, such as BETWEEN-AND, CONTAINS, IS MISSING or IS NULL, LIKE, SAME-AND, and Sounds-Like
WHERE expression
B th l ti thi th th i bl l b tti IFBase the selection on anything other than a variable value that already exists in a SAS data set. For example, you can select a value that is read from raw data, or a value that is calculated or assigned during the course f th DATA t
subsetting IF
of the DATA stepMake the selection at some point during a DATA step
rather than at the beginningsubsetting IF
Execute the selection conditionally subsetting IFExecute the selection conditionally subsetting IF
http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a001000521.htm
Intensive Dataset GenerationIntensive Dataset Generation%macro OurCentury();%local year interest;y ;%do year = 2001 %to 2100;
%let interest = %sysfunc(compound(1,.,0.05,%eval(&year.-2001)));data work.sales&year. (drop=i randomState index=(state sales));
length state $2 stateName $20 sales 8 randomState 3;g ;do i = 1 to 2500;
randomState = round(56*rand('uniform')+0.5);if randomState <= 56 and randomState not in (3,7,14,43,52) then do;
state = fipstate(randomState);p ( )stateName = fipnameL(randomState);sales = int(rand('gaussian',1000000*&interest.,500000*&interest.));output work.sales&year.;
end;end;run;
%end;%mend OurCentury;y%OurCentury;
Year/State DatasetsYear/State Datasets%macro SalesByYearState();%local year stateCode state;%do year = 2001 %to 2100;
%do stateCode = 1 %to 56;%if &stateCode ne 3 & &stateCode ne 7 & &stateCode ne 14 &%if &stateCode. ne 3 & &stateCode. ne 7 & &stateCode. ne 14 &
&stateCode. ne 43 & &stateCode. ne 52 %then %do;%let state = %sysfunc(fipstate(&stateCode.));data work.sales&year.&state.;
t k l &set work.sales&year.;where state = "&state.";run;
%end;;%end;
%end;%mend SalesByYearState;%SalesByYearState;%SalesByYearState;
Year/State High Sales DatasetsYear/State High Sales Datasets%macro HighSalesByYearState();%local year stateCode state interest keepDataset;%do year = 2001 %to 2100;
%let interest = %sysfunc(compound(1,.,0.05,%eval(&year.-2001)));%do stateCode = 1 %to 56;
%if &stateCode. ne 3 & &stateCode. ne 7 & &stateCode. ne 14 & &stateCode. ne 43 &&stateCode. ne 52 %then %do;&stateCode. ne 52 %then %do;%let state = %sysfunc(fipstate(&stateCode.));%let keepDataset = 0;data work.sales&year.&state.high;
set work.sales&year.;h t t "& t t " d l > 2000000*&i t twhere state = "&state." and sales > 2000000*&interest.;
call symput('keepDataset',left('1'));run;
%if not(&keepDataset.) %then %do;proc datasets lib=work nolist;p
delete sales&year.&state.high;run; quit;
%end;%end;
%end;%end;%end;%mend HighSalesByYearState;%HighSalesByYearState;
ConclusionConclusion
• The WHERE expression allows forThe WHERE expression allows for efficient observation processing in the DATA step and the PROC statementsDATA step and the PROC statements
• The SAS System Documentation provides specific details on the syntaxspecific details on the syntax
• Using macros increases the processing f WHERE ipower of WHERE expressions
Contact InformationContact Information
• Mark TabladilloMark TabladilloMarkTab Consultinghttp://www marktab com/http://www.marktab.com/