The SAS System provides two declarative syntax languages for regular expressions: SAS and Perl. This presentation compares and contrasts these two complementary choices for SAS application developers.
Regular Expressions – SAS® (RX) vs. Perl (PRX)
Mark Tabladillo Ph.D.
April 10, 2005
© 2005, markTab Consulting, All Rights Reserved

Motivation
The SAS System Version 9 introduces Perl regular expressions (PRX)
Earlier software versions already had SAS regular expressions (RX)

Purpose
This presentation will compare and contrast the two types of regular expressions (RX and PRX) from both the functionality and performance viewpoints
The goal: Offer recommendations on when to use the two types
Application: Two generic examples will illustrate the recommended strategy

Outline
Background
Similarities between SAS (RX) and Perl Regular Expressions (PRX)
Unique Perl Regular Expression (PRX) Capabilities
Recommended Strategy for SAS (RX) and Perl Regular Expressions (PRX)
Two Examples of Recommended Strategy

Vocabulary
Pattern matching enables you to search for and extract multiple matching patterns from a character string in one step, as well as to make several substitutions in a string in one step.
Regular expressions are a pattern language which provides fast tools for parsing large amounts of text.
Metacharacters are special combinations of alphanumeric and/or symbolic characters which have specific meaning in defining a regular expression.
Character classes are single or combinations of alphanumeric and/or symbolic characters which represent themselves.

Is “One Step” Realistic?
Practical uses of regular expressions use more than one step
Regular expressions provide a powerful parsimonious syntax for string manipulation

When to Use Regular Expressions
Anything done in regular expressions could be coded another way
Many people do not use metacharacters in (for example) Google® searches
High-volume or complex string processing (such as in a data step) provides excellent potential

Why Regular Expressions can be Confusing
Regular expressions are a combination of:
– Alphanumeric and/or symbolic characters representing themselves (character classes)
– Special combinations of alphanumeric and/or symbolic characters (metacharacters) representing zero or more combinations of alphanumeric and/or symbolic characters
– Specially flagged combinations of alphanumeric and/or symbolic characters which would normally be interpreted as metacharacters, but instead represent themselves (character classes)
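
The three cases above can be sketched with the Perl (PRX) functions. This is only an illustration: the patterns and the sample strings are made up here and do not come from the presentation.

```sas
data _null_;
   length text $ 8;

   literal = prxparse('/abc/');    /* characters that represent themselves         */
   meta    = prxparse('/a.c/');    /* "." is a metacharacter: any single character */
   escaped = prxparse('/a\.c/');   /* "\." is flagged to mean a literal period     */

   /* Hypothetical sample values, not from the presentation. */
   do text = 'abc', 'a.c', 'axc';
      m_literal = prxmatch(literal, text);   /* 0 means no match */
      m_meta    = prxmatch(meta, text);
      m_escaped = prxmatch(escaped, text);
      putlog text= m_literal= m_meta= m_escaped=;
   end;

   call prxfree(literal);
   call prxfree(meta);
   call prxfree(escaped);
run;
```

The second pattern matches all three strings, while the escaped pattern matches only the string that really contains a period.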

Similarity One: Parse Function
PARSE is the core function of creating a regular expression in memory using metacharacters, and assigning this regular expression to a numeric SAS variable, called the regular expression ID.
The term ID refers to identification, and SAS will assign every PARSE function to a different and unique numeric value, and track those values automatically.
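
A minimal sketch of the parse step, assuming a simple three-digit pattern chosen only for illustration. RXPARSE and PRXPARSE each return a numeric ID; the variable names are hypothetical.

```sas
data _null_;
   /* Each parse call compiles a pattern and returns a numeric ID. */
   rx_id  = rxparse("$d$d$d");       /* SAS (RX) syntax: three digits   */
   prx_id = prxparse('/\d\d\d/');    /* Perl (PRX) syntax: three digits */
   putlog rx_id= prx_id=;

   /* Release the memory held by each compiled pattern. */
   call rxfree(rx_id);
   call prxfree(prx_id);
run;
```

The IDs are ordinary numeric variables, so they can be passed to the matching, substring, and change routines described in the following slides.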

Similarity One: Parse Function
The programming challenge is to create a regular expression which generically describes a character string pattern
Metacharacters for SAS (RX) and Perl (PRX) regular expressions are usually different, but either method can be used to create a similar if not identical result

Similarity One: Example
In this first example (SAS Institute, 2003), the goal is to find a pattern that matches (XXX) XXX-XXXX or XXX-XXX-XXXX for phone numbers in the United States.
– The first three digits are the area code, and by standardized rules, the area code cannot start with a zero or a one.
– The fourth through sixth digits are the prefix, and again by standard rules, the prefix also cannot start with a zero or one.
– The suffix may have any digit, including zero or one, in any of the four places.

Phone Number: Perl (PRX)
paren = "\([2-9]\d\d\) ?[2-9]\d\d-\d\d\d\d";
dash = "[2-9]\d\d-[2-9]\d\d-\d\d\d\d";
regexp = "/(" || paren || ")|(" || dash || ")/";
See the Paper for the full code and explanation
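
The full code is in the paper. As a rough sketch of how these three assignments might be used, the DATA step below compiles the pattern with PRXPARSE and tests a few values with PRXMATCH. The phone values are hypothetical test data, and the surrounding step is an assumption, not the author's program.

```sas
data _null_;
   length phone $ 20;

   /* Patterns copied from the slide. */
   paren  = "\([2-9]\d\d\) ?[2-9]\d\d-\d\d\d\d";
   dash   = "[2-9]\d\d-[2-9]\d\d-\d\d\d\d";
   regexp = "/(" || paren || ")|(" || dash || ")/";

   /* Compile once; PRXPARSE returns the numeric regular expression ID. */
   re = prxparse(regexp);
   if missing(re) then do;
      putlog "ERROR: invalid regular expression " regexp;
      stop;
   end;

   /* Hypothetical test values, not from the presentation. */
   do phone = "(919) 555-0142", "919-555-0142", "(119) 555-0142", "555-0142";
      position = prxmatch(re, phone);   /* 0 means no match */
      putlog phone= position=;
   end;

   call prxfree(re);
run;
```

The last two values break the area-code rules, so PRXMATCH should report no match for them.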

Phone Number: SAS (RX)
paren = "'('$'2-9'$d$d')'[' ']$'2-9'$d$d'-'$d$d$d$d";
dash = "$'2-9'$d$d'-'$'2-9'$d$d'-'$d$d$d$d";
regexp = paren || "|" || dash;
See the Paper for the full code and explanation
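
The SAS (RX) pattern can be exercised the same way with RXPARSE and RXMATCH. This is a sketch under the assumption that the pattern text above was recovered correctly from the slide; the phone values are hypothetical test data.

```sas
data _null_;
   length phone $ 20;

   /* Patterns as shown on the slide. */
   paren  = "'('$'2-9'$d$d')'[' ']$'2-9'$d$d'-'$d$d$d$d";
   dash   = "$'2-9'$d$d'-'$'2-9'$d$d'-'$d$d$d$d";
   regexp = paren || "|" || dash;

   /* Compile once; RXPARSE returns the numeric regular expression ID. */
   rx = rxparse(regexp);

   /* Hypothetical test values, not from the presentation. */
   do phone = "(919) 555-0142", "919-555-0142", "(119) 555-0142", "555-0142";
      position = rxmatch(rx, phone);   /* 0 means no match */
      putlog phone= position=;
   end;

   call rxfree(rx);
run;
```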

Comparing the Methods
A SAS Macro was created to compare the methods
One iteration did not show a difference, so the iterations were increased to 500
SAS (RX) wins at 3.69 seconds compared to Perl (PRX) at 3.80 seconds
Point: If speed is an issue, you may try the two methods to see who wins

Similarity Two: Matching
The matching function uses the regular expression to determine a specific numeric position in a string
The return from a match function is a number representing a character position

Similarity Three: Substring
The substring routine allows for inputting a regular expression and string, and outputting a position and length
Routines (unlike functions) can have variable numbers of inputs and outputs, as in the substring routine
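
A hedged sketch of the substring routines, using a made-up sentence and the dash-style phone pattern. CALL PRXSUBSTR returns a position and a length; CALL RXSUBSTR can also return an optional score, which illustrates the variable number of outputs.

```sas
data _null_;
   /* Hypothetical sample text, not from the presentation. */
   text = 'Call 919-555-0142 today';

   /* Perl (PRX): inputs are the ID and the string; outputs are position and length. */
   prx_id = prxparse('/[2-9]\d\d-[2-9]\d\d-\d\d\d\d/');
   call prxsubstr(prx_id, text, prx_pos, prx_len);
   if prx_pos > 0 then prx_found = substr(text, prx_pos, prx_len);
   putlog prx_pos= prx_len= prx_found=;

   /* SAS (RX): the routine can also return a third output, a score. */
   rx_id = rxparse("$'2-9'$d$d'-'$'2-9'$d$d'-'$d$d$d$d");
   call rxsubstr(rx_id, text, rx_pos, rx_len, rx_score);
   putlog rx_pos= rx_len= rx_score=;

   call prxfree(prx_id);
   call rxfree(rx_id);
run;
```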

Similarity Four: Change
The change routine allows for inputting a regular expression, a maximum number of times to replace, an old string, and outputs a new string
Both SAS (RX) and Perl (PRX) allow for changing a string in place
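
A small sketch of the change routine on the Perl (PRX) side, with a made-up string. The replacement text is written into the s/// expression itself. The SAS (RX) counterpart is CALL RXCHANGE, which is not shown here.

```sas
data _null_;
   length text new_text $ 40;

   /* Hypothetical sample text, not from the presentation. */
   text = 'cat, cat, cat';

   /* The substitution is part of the Perl expression. */
   prx_id = prxparse('s/cat/dog/');

   /* Four arguments: the result goes to NEW_TEXT; -1 means replace every match. */
   call prxchange(prx_id, -1, text, new_text);
   putlog text= new_text=;

   /* Three arguments: TEXT is changed in place, at most 2 replacements. */
   call prxchange(prx_id, 2, text);
   putlog text=;

   call prxfree(prx_id);
run;
```

With a limit of 2, only the first two occurrences are replaced and the third is left alone.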

Similarity Five: Free
The free routine releases the memory allocation for the regular expression
It is recommended to always include a FREE routine to prevent problems

Capture Buffers
Perl (PRX) regular expressions can use capture buffers, defined as part of a match explicitly specified in the Perl regular expression
The capture buffers are collectively a one-dimensional numbered array of results (starting at one, not zero)
Example: Parts of a phone number
More than one step is required

Unique Feature One: PRXPOSN Routine
The PRXPOSN routine finds the start position and length of a numbered capture buffer

Unique Feature Two: PRXPOSN Function
The PRXPOSN Function uses the positional capture buffer number to return the actual string in the capture buffer
This function is probably more useful than the PRXPOSN routine
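
A minimal sketch of capture buffers for the parts of a phone number, showing both the PRXPOSN routine and the PRXPOSN function. The pattern, the sample value, and the variable names are assumptions made for illustration.

```sas
data _null_;
   /* Hypothetical sample value, not from the presentation. */
   phone = '919-555-0142';

   /* Three pairs of parentheses define capture buffers 1, 2, and 3. */
   re = prxparse('/([2-9]\d\d)-([2-9]\d\d)-(\d\d\d\d)/');

   /* Step 1: match. Step 2: read the capture buffers. */
   if prxmatch(re, phone) then do;
      /* Routine: start position and length of capture buffer 2. */
      call prxposn(re, 2, start, len);
      putlog start= len=;

      /* Function: the text held in each capture buffer. */
      area_code = prxposn(re, 1, phone);
      prefix    = prxposn(re, 2, phone);
      suffix    = prxposn(re, 3, phone);
      putlog area_code= prefix= suffix=;
   end;

   call prxfree(re);
run;
```

The match has to happen first, which is why more than one step is required.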

Unique Feature Three: PRXPAREN
The PRXPAREN function assumes that the capture buffer was an ordered hierarchical array, and will return the highest non-missing capture buffer number
See the paper for an example
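
The paper has the author's example. As a separate, hypothetical sketch, the step below uses an alternation of three capture buffers so that PRXPAREN reports which one took part in the match.

```sas
data _null_;
   length text $ 20;

   /* Hypothetical pattern: each alternative is its own capture buffer. */
   re = prxparse('/(magazine)|(book)|(newspaper)/');

   /* Hypothetical sample values, not from the presentation. */
   do text = 'find book here', 'find magazine', 'nothing to find';
      if prxmatch(re, text) then do;
         paren = prxparen(re);   /* number of the last capture buffer that matched */
         putlog text= paren=;
      end;
      else putlog text= 'no match';
   end;

   call prxfree(re);
run;
```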

Unique Feature Four: PRXNEXT
Similar to PRXMATCH, the PRXNEXT routine will iteratively search a string for matches
Not based on the capture buffer
Useful when a string can have multiple, even overlapping, matches
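
A sketch of the iterative search, assuming a made-up sentence that contains several phone numbers. Each call to PRXNEXT continues from where the previous match ended.

```sas
data _null_;
   /* Hypothetical sample text, not from the presentation. */
   text = 'Office 919-555-0142, fax 919-555-0199, cell 704-555-0123';
   re = prxparse('/[2-9]\d\d-[2-9]\d\d-\d\d\d\d/');

   start = 1;
   stop  = length(text);

   /* Each call resumes after the previous match; position is 0 when none remain. */
   call prxnext(re, start, stop, text, position, len);
   do while (position > 0);
      found = substr(text, position, len);
      putlog found= position= len=;
      call prxnext(re, start, stop, text, position, len);
   end;

   call prxfree(re);
run;
```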

Unique Feature Five: PRXDEBUG
The PRXDEBUG routine writes debugging messages to the log
Provides insight into how regular expression functions and routines search through specific strings
Debugging works best when smaller pieces are checked first, building toward the whole regular expression
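
A minimal sketch of switching debugging on around one small piece of the phone pattern. The piece chosen and the sample value are assumptions; the debugging messages themselves appear in the SAS log.

```sas
data _null_;
   call prxdebug(1);                        /* turn debugging output on           */

   /* A small piece of the phone pattern, checked on its own first. */
   re = prxparse('/[2-9]\d\d-\d\d\d\d/');
   position = prxmatch(re, '555-0142');     /* hypothetical sample value          */
   putlog position=;

   call prxdebug(0);                        /* turn debugging output off          */
   call prxfree(re);
run;
```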

Recommended Strategy
Use the type which has the desired functionality
If you don’t know either, start with Perl regular expressions (PRX)
If you are looking at performance or speed issues, try tests both ways (RX and PRX)

Example One: Printer Names
The Universal Naming Convention describes printers as:
\\computer_name\printer_shared_name
The SYSPRINT option returns or sets the UNC printer name

Example One: Printer Name
Problem: A variety of legal UNC formats:
– \\computer_name\printer_shared_name
– (\\computer_name\printer_shared_name)
– (“\\computer_name\printer_shared_name’)
12 printers * 3 formats = 36 combinations
SAS (RX) could be used with 3 separate regular expressions
Perl (PRX) capture buffer used

Example One: PRX
'/(\\\\[-\\\w]+|[-\w]+)/'
The regular expression will extract the printer name, without the braces, or brackets, or quotation marks
See the paper for explanation
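
The paper explains the pattern in detail. As a hedged sketch of how it might be applied, the step below runs the pattern over three made-up values in the UNC formats listed earlier and reads capture buffer 1. The straight quotation marks in the third value are an assumption.

```sas
data _null_;
   length raw $ 60 printer $ 60;

   /* Pattern as shown on the slide; capture buffer 1 holds the printer name. */
   re = prxparse('/(\\\\[-\\\w]+|[-\w]+)/');

   /* Hypothetical values, not from the presentation. */
   do raw = '\\computer_name\printer_shared_name',
            '(\\computer_name\printer_shared_name)',
            '("\\computer_name\printer_shared_name")';
      if prxmatch(re, raw) then printer = prxposn(re, 1, raw);
      else printer = ' ';
      putlog raw= printer=;
   end;

   call prxfree(re);
run;
```

One capture buffer covers all three formats, which is the advantage over writing three separate SAS (RX) expressions.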

Example Two: Windows Subdirectory
Get the subdirectory from the longer string which started with the drive name and ended with a specific filename:
– X:\\Sub_Directory_1\Sub_Directory_2\...\Sub_Directory_N\Filename.Extension
As in the previous example, the original string includes the backslash, which is the Perl escape metacharacter and so must itself be escaped (\\) in the pattern

Example Two: Regular Expression
'/([A-Za-z]:[.-\\\w]+)\\([.-\w]+)\\([.-\w]+)/'
The regular expression creates three capture buffers, with the second capture buffer containing the string of interest
See the paper for a full explanation
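
The paper gives the full explanation. The sketch below is a variant, not the author's exact pattern: the hyphen is moved to the front of each character class so that it is clearly a literal hyphen rather than part of a range. The path is a made-up example.

```sas
data _null_;
   /* Hypothetical path, not from the presentation. */
   path = 'C:\Projects\Reports\summary.txt';

   /* Variant of the slide's pattern with the hyphen placed first in each class. */
   re = prxparse('/([A-Za-z]:[-.\\\w]+)\\([-.\w]+)\\([-.\w]+)/');

   if prxmatch(re, path) then do;
      leading_part = prxposn(re, 1, path);   /* drive and leading directories       */
      subdirectory = prxposn(re, 2, path);   /* last subdirectory before the file   */
      filename     = prxposn(re, 3, path);   /* file name with extension            */
      putlog leading_part= subdirectory= filename=;
   end;
   else putlog 'no match: ' path=;

   call prxfree(re);
run;
```

Because the first capture buffer is greedy, it takes as much of the path as it can, which leaves the last subdirectory in the second buffer and the file name in the third.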

Conclusion
With version 9, SAS programmers have two regular expression choices: SAS (RX) and Perl (PRX)
The presentation described similarities and differences, and offered a recommended strategy
The paper contains three detailed examples, and an annotated bibliography