Chapter 3 Combining Data Vertically

  • Chapter 3: Combining Data Vertically

  • Section 3.1: Appending Raw Data Files

  • Objectives
      • Use the FSLIST procedure to view the values of a raw data file.
      • Create a SAS data set from multiple raw data files using the FILENAME statement.
      • Create a SAS data set from multiple raw data files using the FILEVAR= option.

  • Vertical Combination Methods
    Raw data can be combined vertically using several methods:
      • INFILE statement
      • FILENAME statement
      • FILEVAR= option
      • operating system techniques.

  • Raw Data Files
    You can use PROC FSLIST to view the values of a raw data file.
    General form of the PROC FSLIST statement:
        PROC FSLIST FILE = file-specification;
        RUN;

  • PROC FSLIST (c03s1d1)
        proc fslist file = 'month1.dat';
        run;

  • Reading Multiple Raw Data Files
    To read multiple raw data files, you can use
      • multiple INFILE statements
      • the FILENAME statement
      • the FILEVAR= option.
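
The multiple-INFILE approach can be sketched roughly as follows. This is only an illustration, assuming that month1.dat and month2.dat exist in the current directory and share the record layout used elsewhere in this section. It relies on the END= option, which is described later in this section; the variable names Last1 and Last2 are made up for the example.

```sas
/* Sketch only: one INFILE statement per raw data file.            */
/* Last1 and Last2 are illustrative END= variables.                */
data firstq;
   /* Read every record of the first file. */
   do until (Last1);
      infile 'month1.dat' end = Last1;
      input Flight $ Origin $ Dest $
            Date : date9. RevCargo : comma15.2;
      output;
   end;
   /* Then read every record of the second file. */
   do until (Last2);
      infile 'month2.dat' end = Last2;
      input Flight $ Origin $ Dest $
            Date : date9. RevCargo : comma15.2;
      output;
   end;
   /* All reading happens in one iteration, so end the step here. */
   stop;
run;
```

Each additional file needs its own INFILE and INPUT pair, which is why the FILENAME statement and the FILEVAR= option scale better as the number of files grows.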

  • FILENAME Statement Syntax
    General form of the FILENAME statement:
        FILENAME fileref ('external-file1' 'external-file2' ... 'external-filen');

    fileref is any SAS name that is eight characters or fewer.
    'external-file' is the physical name of an external file. The physical name is the name that is recognized by the operating environment.

  • Using the FILENAME Statement (c03s1d2)
        filename Q1 ('month1.dat' 'month2.dat' 'month3.dat');

        data firstq;
           infile Q1;
           input Flight $ Origin $ Dest $
                 Date : date9. RevCargo : comma15.2;
        run;

  • Business Task
    The programmers must provide reports of three months of data to IA executives. The three months are the current month and the previous two months (rolling quarter).
    (Figure: raw data files month8, month9, month10, month11, month12.)

  • INFILE Statement with FILEVAR= Option
    FILEVAR= variable names a variable whose change in value causes the INFILE statement to close the current input file and open a new one.
    General form of the FILEVAR= option:
        INFILE file-specification FILEVAR = variable;

  • INFILE Statement with FILEVAR= Option
        infile xxx filevar = NextFile;

    xxx is an arbitrarily named placeholder, not an actual filename or a fileref that was assigned to a file previously. SAS uses this placeholder for reporting processing information to the SAS log.
    NextFile contains the name of the raw data file to be read (month9.dat, month10.dat, or month11.dat).

  • Creating the File Name
    How can you change and assign the names of the three files to be read?
        month + 9 + .dat
        month + 10 + .dat
        month + 11 + .dat

  • Creating the File Name
        do I = 11,10,9;
           NextFile = "month"!!put(I,2.)!!".dat";
           infile xxx filevar = NextFile;
        end;

    When I = 11, NextFile = month11.dat
    When I = 10, NextFile = month10.dat
    When I = 9, NextFile = month 9.dat
    How can you eliminate the space?

  • COMPRESS Function
    General form of the COMPRESS function:
        COMPRESS(source<,characters-to-remove>)

    source specifies a source string that contains the characters to remove.
    characters-to-remove specifies the character or characters that SAS removes from the source string.
    Example:
        NextFile = compress(NextFile,' ');
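
To see the effect of COMPRESS on the generated file names, a small DATA _NULL_ step can write the values to the log before and after the blank is removed. This is a sketch for experimentation only; the variable name Raw is an assumption introduced for the example and does not appear in the course programs.

```sas
/* Sketch: compare the file name before and after COMPRESS.     */
/* Raw is an illustrative variable name.                        */
data _null_;
   do I = 11, 10, 9;
      /* put(I,2.) right-aligns the number, so 9 becomes ' 9'.  */
      Raw = "month" !! put(I, 2.) !! ".dat";
      /* Remove every blank from the constructed name.          */
      NextFile = compress(Raw, ' ');
      /* Write both versions to the SAS log for comparison.     */
      put I= Raw= NextFile=;
   end;
run;
```

For I = 9, Raw should contain the embedded blank (month 9.dat) while NextFile should not (month9.dat).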

  • Reading Raw Data (c03s1d3)
        data movingq;
           do I = 11,10,9;
              NextFile = "month"!!put(I,2.)!!".dat";
              NextFile = compress(NextFile,' ');
              infile xxx filevar = NextFile;
              input Flight $ Origin $ Dest $
                    Date : date9. RevCargo : comma15.2;
              output;
           end;
           stop;
        run;

    Why is the STOP statement needed?
    How many observations are in movingq?
    How can the DATA step read all the data from month9, month10, and month11?

  • INFILE Statement with END= Option
    General form of the END= option:
        INFILE file-specification END = variable;

    The END= option names a variable that SAS sets to
      • 0 when the current input data record is not the last in the input file
      • 1 when the current input record is the last in the input file.
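
A minimal way to watch the END= variable at work is to count the records in a single file and report the total once the last record has been read. The sketch below assumes that month9.dat exists in the current directory; the names LastObs and Count are illustrative.

```sas
/* Sketch: use END= to detect the last record of one raw file.  */
data _null_;
   infile 'month9.dat' end = LastObs;
   /* A null INPUT statement reads a record without creating    */
   /* any variables from it.                                     */
   input;
   /* The sum statement retains Count across iterations.        */
   Count + 1;
   /* LastObs is 1 only while the final record is current.      */
   if LastObs then put 'Records read: ' Count;
run;
```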

  • Reading Raw Data
        data movingq;
           do I = 11,10,9;
              NextFile = "month"!!put(I,2.)!!".dat";
              NextFile = compress(NextFile,' ');
              do until (LastObs);
                 infile xxx filevar = NextFile end = LastObs;
                 input Flight $ Origin $ Dest $
                       Date : date9. RevCargo : comma15.2;
                 output;
              end;
           end;
           stop;
        run;

    How can the program always read the current month and previous two months?

  • Reading the Current Month
        data movingq;
           drop MonNum MidMon LastMon I;
           MonNum = month(today());
           MidMon = MonNum-1;
           LastMon = MidMon-1;
           do I = MonNum, MidMon, LastMon;
              NextFile = "month"!!put(I,2.)!!".dat";
              NextFile = compress(NextFile,' ');
              do until (LastObs);
                 infile xxx filevar = NextFile end = LastObs;
                 input Flight $ Origin $ Dest $
                       Date : date9. RevCargo : comma15.2;
                 output;
              end;
           end;
           stop;
        run;

  • Calendar Logic
    What if the current month is January or February?

  • INTNX Function
    General form of the INTNX function:
        INTNX('interval',start-from,increment)

    'interval' specifies a character constant or variable that names a date, datetime, or time interval.
    start-from specifies a SAS expression that represents a SAS date, datetime, or time value identifying a starting point.
    increment specifies a negative or positive integer that represents the specific number of time intervals.
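
The January and February problem arises because subtracting 1 or 2 from the month number can yield 0 or -1. One way around it, sketched below, is to step back whole months with INTNX and only then extract the month number. The fixed date '15JAN2024'd is a hypothetical stand-in for today() so that the year boundary is visible; Back and MonDate are illustrative names.

```sas
/* Sketch: derive the rolling-quarter file names with INTNX.     */
/* '15JAN2024'd is a made-up date standing in for today().       */
data _null_;
   Today = '15JAN2024'd;
   do Back = 0 to 2;
      /* Move back 0, 1, and 2 months; INTNX returns the first   */
      /* day of the resulting month by default.                  */
      MonDate = intnx('month', Today, -Back);
      MonNum  = month(MonDate);
      NextFile = compress("month" !! put(MonNum, 2.) !! ".dat", ' ');
      put Back= MonDate= date9. NextFile=;
   end;
run;
```

With the January date above, the log should list month1.dat, month12.dat, and month11.dat, which are the names the subtraction-based logic cannot produce.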

  • Reading Multiple Raw Data Files (c03s1d4)
    This demonstration illustrates using the FILEVAR= option to read from multiple raw data files.
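
The demonstration program itself is not reproduced in these slides. The following is one possible way to combine the pieces covered in this section (FILEVAR=, END=, COMPRESS, and INTNX) into a rolling-quarter read; it is a sketch under the assumption that files named month1.dat through month12.dat exist in the current directory, and Back is an illustrative variable name.

```sas
/* Sketch only; not the course's c03s1d4 program.                */
data movingq;
   drop Back MonNum;
   do Back = 0 to 2;
      /* Current month, then the two months before it, with      */
      /* INTNX handling the wrap from January to December.       */
      MonNum = month(intnx('month', today(), -Back));
      NextFile = compress("month" !! put(MonNum, 2.) !! ".dat", ' ');
      /* Read the whole file named in NextFile.                  */
      do until (LastObs);
         infile xxx filevar = NextFile end = LastObs;
         input Flight $ Origin $ Dest $
               Date : date9. RevCargo : comma15.2;
         output;
      end;
   end;
   /* Everything is read in one iteration of the DATA step.      */
   stop;
run;
```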

  • Exercises
    This exercise reinforces the concepts discussed previously.

  • Section 3.2: Appending SAS Data Sets

  • Objectives
      • Append two SAS data sets using the APPEND procedure.

  • Vertical Combination Methods
    SAS data can be combined vertically using several methods:
      • PROC APPEND
      • PROC SQL INSERT INTO statement
      • PROC SQL OUTER UNION CORRESPONDING set operator
      • DATA step SET statement.
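
For comparison with PROC APPEND, the other three methods can be sketched on two tiny data sets. Everything here is made up for illustration: the data set names (q_base, q_new, q_set, q_union) and the data values are assumptions, not part of the course data.

```sas
/* Made-up sample data; names and values are illustrative only. */
data work.q_base;
   input Flight $ RevCargo;
   datalines;
IA100 1500
IA200 2300
;
run;

data work.q_new;
   input Flight $ RevCargo;
   datalines;
IA300 1800
;
run;

/* DATA step SET statement: reads both inputs, writes a new set. */
data work.q_set;
   set work.q_base work.q_new;
run;

proc sql;
   /* OUTER UNION CORRESPONDING: aligns columns by name.         */
   create table work.q_union as
      select * from work.q_base
      outer union corr
      select * from work.q_new;

   /* INSERT INTO: adds the rows of q_new to the existing q_base. */
   insert into work.q_base
      select * from work.q_new;
quit;
```

The SET statement and OUTER UNION CORRESPONDING both create a new table from all input rows, whereas INSERT INTO adds rows to a table that already exists.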

  • Using the APPEND Procedure
    You can use PROC APPEND to concatenate two SAS data sets.
    General form of the APPEND procedure:
        PROC APPEND BASE = SAS-data-set
                    DATA = SAS-data-set;
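
A minimal PROC APPEND run might look like the sketch below. The data set names app_base and app_new and their values are invented for the example; both data sets have the same variables, so no extra options are needed.

```sas
/* Made-up sample data; names and values are illustrative only. */
data work.app_base;
   input Flight $ RevCargo;
   datalines;
IA100 1500
IA200 2300
;
run;

data work.app_new;
   input Flight $ RevCargo;
   datalines;
IA300 1800
;
run;

/* Add the observations of app_new to the end of app_base. */
proc append base = work.app_base
            data = work.app_new;
run;

/* app_base should now hold the observations from both sets. */
proc print data = work.app_base;
run;
```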

  • Appending Fewer Variables
    PROC APPEND concatenates the data sets even though there might be variables in the BASE= data set that do not exist in the DATA= data set.
    (Figure: work.moredata.)

  • Partial Log

    proc append base=moredata data=partdata force; run;

    NOTE: Appending WORK.PARTDATA to WORK.MOREDATA.
    WARNING: Variable RouteID not appended because of type mismatch.
    WARNING: Variable Origin was not found on DATA file.
    WARNING: Variable Dest was not found on DATA file.
    WARNING: Variable SaleMon was not found on DATA file.
    WARNING: Variable CargoWgt was not found on DATA file.
    NOTE: FORCE is specified, so dropping/truncating will occur.
    NOTE: There were 15 observations read from the data set WORK.PARTDATA.
    NOTE: 15 observations added.
    NOTE: The data set WORK.MOREDATA has 40 observations and 19 variables.
    NOTE: PROCEDURE APPEND used:
          real time 0.14 seconds
          cpu time 0.02 seconds

  • FORCE Option
    The FORCE option enables PROC APPEND to concatenate the data sets even though there might be variables in the DATA= data set that do not exist in the BASE= data set.
    (Figure: work.moredata and work.partdata.)

  • Partial Log

    proc append base=partdata data=moredata force; run;

    NOTE: Appending WORK.MOREDATA to WORK.PARTDATA.
    WARNING: Variable Origin was not found on BASE file.
    WARNING: Variable Dest was not found on BASE file.
    WARNING: Variable SaleMon was not found on BASE file.
    WARNING: Variable CargoWgt was not found on BASE file.
    WARNING: Variable RouteID not appended because of type mismatch.
    NOTE: FORCE is specified, so dropping/truncating will occur.
    NOTE: There were 25 observations read from the data set WORK.MOREDATA.
    NOTE: 25 observations added.
    NOTE: The data set WORK.PARTDATA has 40 observations and 15 variables.
    NOTE: PROCEDURE APPEND used:
          real time 0.00 seconds
          cpu time 0.01 seconds
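
The behavior in the log above can be reproduced on a small scale. In this sketch the DATA= data set carries one variable (Origin) that the BASE= data set lacks; the data set names force_base and force_extra and all values are made up for illustration.

```sas
/* Made-up sample data; names and values are illustrative only. */
data work.force_base;
   input Flight $ RevCargo;
   datalines;
IA100 1500
;
run;

/* This data set has an extra variable, Origin. */
data work.force_extra;
   input Flight $ RevCargo Origin $;
   datalines;
IA200 2300 RDU
;
run;

/* FORCE lets the append proceed; Origin is dropped because it  */
/* does not exist in the BASE= data set.                        */
proc append base = work.force_base
            data = work.force_extra force;
run;
```

Without FORCE, PROC APPEND should stop with an error in this situation instead of appending, because the DATA= data set contains a variable that is not in the BASE= data set.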

  • Appending Variables with Different Attributes (c03s2d1)
    This demonstration illustrates what happens when the FORCE option is used to append data sets with variables of different lengths, labels, and types.

  • Exercises
    This exercise reinforces the concepts discussed previously.