JavaScript Regular Expressions

Page 1: JavaScript Regular Expressions

Copyright ©2005 Department of Computer & Information Science

JavaScript Regular Expressions

Page 2: JavaScript Regular Expressions

Goals

By the end of this unit you should …

• Understand what regular expressions are

• Be able to use regular expressions to match text against a particular string pattern

• Be able to use special regular expression characters to match multiple search terms against strings

Page 3: JavaScript Regular Expressions

What is a Regular Expression?

• A regular expression is a pattern of characters.

• We use regular expressions to search for matches in particular text.

• In JavaScript, we can use regular expressions by creating instances of the regular expression object, RegExp.

Page 4: JavaScript Regular Expressions

The Regular Expression Constructor

• We can declare a regular expression by using a constructor. General form:
var regExpName = new RegExp("RegExp", "flags");

• Example:
var searchTermRE = new RegExp("s", "gi");
(search for the letter “s” globally, ignore case)
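As a rough sketch of the constructor form in use (the sample text and the variable names `sampleText` and `found` are made up for illustration, not taken from the slides):

```javascript
// Build the expression from strings: pattern "s", flags "gi".
const searchTermRE = new RegExp("s", "gi");

// Hypothetical sample text.
const sampleText = "Sally sells seashells";

// With the global flag, match() collects every "s" or "S".
const found = sampleText.match(searchTermRE);
console.log(found.length);
```

The constructor form is mainly useful when the pattern is only known at run time, for example when it comes from user input.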

Page 5: JavaScript Regular Expressions

The Regular Expression Literal

• Declaring using a regular expression literal:
var searchTermRE = /X1X4/gi;

• When declaring regular expression literals, do NOT include quotation marks; instead, delimit the expression with a pair of forward slashes. By convention, variables acting as regular expressions end with the suffix “RE”. Flags come after the second forward slash.
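A small sketch comparing the two declaration styles; the names `literalRE` and `constructedRE` are illustrative only:

```javascript
// The same expression written as a literal and via the constructor.
const literalRE = /X1X4/gi;
const constructedRE = new RegExp("X1X4", "gi");

// Both carry the same pattern text and the same flags.
console.log(literalRE.source === constructedRE.source);
console.log(literalRE.flags === constructedRE.flags);
```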

Page 6: JavaScript Regular Expressions

Literal Characters

• A lot of the time, we use regular expressions to match specific patterns, like the word “java”:
var firstRE = /java/;
(would match within the words “java”, “javascript” and “javabeans”; it would not match “myJava” without the ignore case flag, because matching is case-sensitive by default)

• Matching character for character is termed matching literals.
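A minimal sketch of literal matching using `RegExp.test()`, which reports whether a pattern occurs anywhere in a string. The `words` array is an assumption built from the words mentioned on this slide:

```javascript
const firstRE = /java/;

// Words from the slide, tested one at a time.
const words = ["java", "javascript", "javabeans", "myJava"];
const results = words.map((word) => firstRE.test(word));

// "myJava" fails because the capital "J" does not match a lowercase "j".
console.log(results);
```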

Page 7: JavaScript Regular Expressions

Non-Printing Literal Characters

• We also consider some non-printing characters as literals:
– \t (tab character)
– \n (newline character)
– \0 (NUL character, null value)

Page 8: JavaScript Regular Expressions

Metacharacters

• Sometimes, we want to search not for specific patterns, but for parts of patterns.

• Consider searching for all lines that end with the letter “s”. To do so, we’ll need to use metacharacters:
var firstRE = /s$/;
(matches when the string ends with “s”)

Page 9: JavaScript Regular Expressions

What are Metacharacters?

• Metacharacters are characters used to represent special patterns that don’t necessarily fit in the range of standard letters and numbers (A-Z; a-z; 0-9, etc.).

• We often use symbols as metacharacters to indicate a special circumstance.

• Some of these symbols include:
$ . ^ * ?

Page 10: JavaScript Regular Expressions

Metacharacters as Literals

• What if I want to search for a literal symbol that is also used as a metacharacter? To search for a symbol as a literal and not as a metacharacter, we use the \ (backslash) to turn “off” the metacharacter property.
$ used as a metacharacter:
var firstRE = /s$/;
$ used as a literal character:
var firstRE = /\$/;
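The difference between the two uses of `$` can be sketched as follows; the test strings and the names `endsWithS` and `hasDollar` are invented for illustration:

```javascript
// "$" as a metacharacter: the string must end with "s".
const endsWithS = /s$/;
// "\$" as a literal: the string must contain a dollar sign.
const hasDollar = /\$/;

const checks = [
  endsWithS.test("5 dollars"), // ends with "s"
  endsWithS.test("$5"),        // does not end with "s"
  hasDollar.test("$5"),        // contains "$"
  hasDollar.test("5 dollars"), // contains no "$"
];
console.log(checks);
```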

Page 11: JavaScript Regular Expressions

Flags

• When searching, flags can help refine or expand a search.

• Flags modify a particular search to fit certain criteria.

• There are three common flags: the global flag, the ignore case flag and the multiline mode flag.

Page 12: JavaScript Regular Expressions

The Global flag

• In a regular expression without flags, JavaScript will return only the first instance of a search term:
var mySearchRE = /X1X4/;
(returns only the first instance of “X1X4”)

• To modify the search to include all instances of “X1X4”, we would use the global flag:
var mySearchRE = /X1X4/g;
(returns all instances of “X1X4”)
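A brief sketch of the effect of the global flag when used with `String.match()`; the sample string `text` is hypothetical:

```javascript
const text = "X1X4 and X1X4 again";

// Without "g": one match, plus details such as its index.
const first = text.match(/X1X4/);
// With "g": every match, as a plain array of strings.
const all = text.match(/X1X4/g);

console.log(first.length, first.index);
console.log(all.length);
```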

Page 13: JavaScript Regular Expressions

The Ignore Case flag

• In a regular expression without flags, JavaScript only returns an exact match:
var mySearchRE = /X1X4/;
(returns only an instance of “X1X4”, but not “x1x4” or “x1X4”, etc.)

• To modify the search to include instances of “X1X4”, regardless of case, we would use the ignore case flag:
var mySearchRE = /X1X4/i;
(returns an instance of “X1X4”, “x1x4”, “x1X4”, etc.)
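As an illustration, the sketch below filters a few spellings with and without the ignore case flag; the `variants` array is an assumption based on the spellings listed on this slide:

```javascript
const variants = ["X1X4", "x1x4", "x1X4"];

// Case-sensitive: only the exact spelling passes.
const exact = variants.filter((v) => /X1X4/.test(v));
// Case-insensitive: every spelling passes.
const anyCase = variants.filter((v) => /X1X4/i.test(v));

console.log(exact);
console.log(anyCase);
```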

Page 14: JavaScript Regular Expressions

The Multiline flag

• A single string may include newline characters. The multiline flag allows us to search at the beginning or end of a line, not just the beginning or end of a string. To turn it on:
var mySearchRE = /^X1X4/m;
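A minimal sketch of what the multiline flag changes, assuming a made-up two-line string named `twoLines`:

```javascript
// "X1X4" starts the second line, not the string.
const twoLines = "abc\nX1X4";

// Without "m", "^" only matches at the start of the whole string.
const withoutM = /^X1X4/.test(twoLines);
// With "m", "^" also matches right after each newline.
const withM = /^X1X4/m.test(twoLines);

console.log(withoutM, withM);
```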

Page 15: JavaScript Regular Expressions

Combining Flags

• We can also combine flags to expand our search:
var mySearchRE = /X1X4/gi;
(returns all instances of “x1x4”, “x1X4”, “X1x4” & “X1X4”)

Page 16: JavaScript Regular Expressions

Searching for Matches Only at the Beginning of a Line

• Consider the following string:
Jimmy the Scot scooted his scooter through the Park.
The park guard watched Jimmy do this.

• The code:
var mySearchRE = /^Jimmy/gm;
(would only return “Jimmy” from the first line)

• The ^ metacharacter says “look only for matches at the beginning of the string or line (multiline mode).”
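A sketch of this search, assuming the two sample lines are stored in one string joined by a newline (the variable name `story` is illustrative):

```javascript
const story =
  "Jimmy the Scot scooted his scooter through the Park.\n" +
  "The park guard watched Jimmy do this.";

// Anchored: only the "Jimmy" that starts a line is found.
const atLineStart = story.match(/^Jimmy/gm);
// Unanchored: both occurrences are found.
const anywhere = story.match(/Jimmy/g);

console.log(atLineStart.length, anywhere.length);
```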

Page 17: JavaScript Regular Expressions

Searching for Matches Only at the End of a Line

• Consider the following string:
Jimmy the Scot scooted his scooter through the Park.
The park guard watched Jimmy do this.

• The code:
var mySearchRE = /his\.$/gm;
(would only return “his.” from the second line; the \. matches the period that ends the line, which /his$/ alone would not get past)

• The $ metacharacter says “look only for matches at the end of the string or line (multiline mode).”
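A sketch of an end-of-line search on the sample text, assuming both lines sit in one string joined by a newline. Because each sample line ends with a period, the sketch escapes it as `\.` so that the pattern can reach the end of the line:

```javascript
const story =
  "Jimmy the Scot scooted his scooter through the Park.\n" +
  "The park guard watched Jimmy do this.";

// "his" followed by a literal period at the end of a line.
const withPeriod = story.match(/his\.$/gm);
// Without the period, nothing sits directly before a line end.
const withoutPeriod = story.match(/his$/gm);

console.log(withPeriod);
console.log(withoutPeriod);
```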

Page 18: JavaScript Regular Expressions

Using Boundaries

• Consider the following string:
Jimmy the Scot scooted his scooter through the Park.
The park guard watched Jimmy do this.

• To search for all instances of the word “the” we could use the whitespace metacharacter (\s):
var mySearchRE = /\sthe\s/gim;
(Each match includes the whitespace on either side. If the second line is searched as its own string, “The” is ignored, since it has no whitespace before it. If both lines are in one string, the newline before “The” counts as whitespace and “The” matches too.)

Page 19: JavaScript Regular Expressions

Using Boundaries

• Consider the following string:
Jimmy the Scot scooted his scooter through the Park.
The park guard watched Jimmy do this.

• Instead of using a space character, we can use the boundary (\b). The boundary metacharacter matches the position between a word character and a non-word character. With \b at the beginning of a search pattern, the match must start where a word starts; with \b at the end, the match must end where a word ends. Using both finds the pattern only as a whole word:
var mySearchRE = /\bthe\b/gim;
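To compare the whitespace and boundary approaches, here is a sketch that assumes the two sample lines are joined by a newline in a single string named `story`:

```javascript
const story =
  "Jimmy the Scot scooted his scooter through the Park.\n" +
  "The park guard watched Jimmy do this.";

// \b matches a position, so only the word itself is returned.
const wholeWords = story.match(/\bthe\b/gim);
// \s consumes whitespace, so each match includes it
// (the newline before "The" counts as whitespace).
const spaced = story.match(/\sthe\s/gim);

console.log(wholeWords);
console.log(spaced);
```

The boundary version is usually the safer choice, since it also finds the word at the very start or end of a string and next to punctuation.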

Page 20: JavaScript Regular Expressions

Using Boundaries (continued)

• Our string:
Jimmy the Scot scooted his scooter through the Park.
The park guard watched Jimmy do this.

• Code:
var mySearchRE = /\bt/gim;

• Searches for every “t” that begins a word (“the”, “through”, “the”, “The”, “this”). A “t” in the middle or at the end of a word is ignored.
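A sketch of the leading-boundary search on the sample text, again assuming both lines are joined by a newline in one string:

```javascript
const story =
  "Jimmy the Scot scooted his scooter through the Park.\n" +
  "The park guard watched Jimmy do this.";

// Each match is just the single letter that starts a word.
const leadingTs = story.match(/\bt/gim);

console.log(leadingTs);
```

Words such as “Scot”, “scooted” and “watched” contribute nothing, because their “t” does not follow a word boundary.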

Page 21: JavaScript Regular Expressions

Searching for Multiple Patterns at the Same Time

• Consider the following string:
lop, mop, bop, sop, pop, gop, top, fop

• To search for all instances that end with “op” we would use a wildcard character (.), which matches any single character except a newline. The wildcard does not make a search global, so the global flag is still needed to return every word:
var mySearchRE = /.op/g;
(returns all the words)
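A quick sketch of how the wildcard behaves with and without the global flag when passed to `String.match()`; the variable names are illustrative:

```javascript
const list = "lop, mop, bop, sop, pop, gop, top, fop";

// Without "g", only the first match is returned.
const firstOnly = list.match(/.op/);
// With "g", every three-letter word ending in "op" is returned.
const everyWord = list.match(/.op/g);

console.log(firstOnly[0]);
console.log(everyWord.length);
```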

Page 22: JavaScript Regular Expressions

Searching for Multiple Patterns at the Same Time

• Consider the following string:
lop, mop, bop, sop, pop, gop, top, fop

• To search only for the instances that match “bop”, “lop” or “pop” we would use brackets ([]) to include the search characters, but exclude all others. The global flag returns every such instance rather than just the first:
var mySearchRE = /[blp]op/g;
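A sketch of the bracket search on the sample list, using the global flag so that `match()` returns every hit:

```javascript
const list = "lop, mop, bop, sop, pop, gop, top, fop";

// One character from the set b, l, p, followed by "op".
const chosen = list.match(/[blp]op/g);

console.log(chosen);
```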

Page 23: JavaScript Regular Expressions

Searching for Multiple Patterns at the Same Time

• Consider the following string:
lop, mop, bop, sop, pop, gop, top, fop

• We can also use ranges of letters in the brackets:
var mySearchRE = /[a-m]op/g;
(returns “bop”, “fop”, “gop”, “lop” and “mop”, but ignores all other words ending with “op”)
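The range search can be sketched the same way; matches come back in the order they appear in the string, not alphabetically:

```javascript
const list = "lop, mop, bop, sop, pop, gop, top, fop";

// Any one letter from "a" through "m", followed by "op".
const inRange = list.match(/[a-m]op/g);

console.log(inRange);
```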

Page 24: JavaScript Regular Expressions

Excluding Patterns

• Consider the following string:
lop, mop, bop, sop, pop, gop, top, fop

• To search for all instances that end with “op” except those that begin with “b”, “l” or “p”, we use the not metacharacter (^):
var mySearchRE = /[^blp]op/g;
(returns all the words except “bop”, “lop” and “pop”)

• Inside brackets, the ^ symbol means “not” and DOES NOT mean the beginning of a line!
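A sketch contrasting the two meanings of `^`, using the sample list and the global flag for the bracket search:

```javascript
const list = "lop, mop, bop, sop, pop, gop, top, fop";

// Inside brackets: any character that is NOT b, l or p, then "op".
const excluded = list.match(/[^blp]op/g);
// Outside brackets: "lop" must sit at the start of the string.
const anchored = /^lop/.test(list);

console.log(excluded);
console.log(anchored);
```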

Page 25: JavaScript Regular Expressions

Excluding Patterns

• Our string:
lop, mop, bop, sop, pop, gop, top, fop

• We can also use ranges of letters in the brackets:
var mySearchRE = /[^a-m]op/g;
(returns “sop”, “pop” and “top”, that is, all words except “bop”, “fop”, “gop”, “lop” and “mop”)

Page 26: JavaScript Regular Expressions

Other Metacharacters: ?, * and +

• To match zero or one occurrence of the preceding character:
var mySearchRE = /b?onk/;
(matches “bonk” or “onk”)

• To match zero or more occurrences of the preceding character:
var mySearchRE = /b*onk/;
(matches “bonk”, “onk” or “bbonk”)

• To match one or more occurrences of the preceding character:
var mySearchRE = /b+onk/;
(matches “bonk” or “bbonk”, but not “onk”)
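The three quantifiers can be compared side by side. In this sketch the patterns are wrapped in `^` and `$` (an addition not on the slide) so that each test covers the whole word instead of finding “onk” inside a longer word:

```javascript
const samples = ["onk", "bonk", "bbonk"];

// ?  zero or one "b"
const zeroOrOne = samples.filter((s) => /^b?onk$/.test(s));
// *  zero or more "b"
const zeroOrMore = samples.filter((s) => /^b*onk$/.test(s));
// +  one or more "b"
const oneOrMore = samples.filter((s) => /^b+onk$/.test(s));

console.log(zeroOrOne, zeroOrMore, oneOrMore);
```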

Page 27: JavaScript Regular Expressions

Other Metacharacters: { }

• To match a specific number of occurrences of the preceding character:
var mySearchRE = /go{2}p/;
(matches “goop”, but not “gop” or “gooop”)

• To match between n and m occurrences of the preceding character:
var mySearchRE = /go{1,3}p/;
(matches “gop”, “goop” or “gooop” only)
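A sketch of the brace quantifiers. A quantifier applies to the character immediately before it, so the braces are placed after the “o” here; the `^` and `$` anchors and the `candidates` array are additions for illustration:

```javascript
const candidates = ["gop", "goop", "gooop", "goooop"];

// Exactly two "o" characters between "g" and "p".
const exactlyTwo = candidates.filter((s) => /^go{2}p$/.test(s));
// Between one and three "o" characters.
const oneToThree = candidates.filter((s) => /^go{1,3}p$/.test(s));

console.log(exactlyTwo);
console.log(oneToThree);
```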

Page 28: JavaScript Regular Expressions

String.search() Method

• The String.search() method gives us the character position (index number) of where the search term starts, or -1 if there is no match.

• The String.search() method does not perform global searches and will ignore the “g” flag!
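A minimal sketch of `String.search()`; the sample string `phrase` is made up for illustration:

```javascript
const phrase = "lop, mop, bop";

// Index where the first match starts (counting from 0).
const found = phrase.search(/mop/);
// No match at all gives -1.
const missing = phrase.search(/zap/);
// The "g" flag makes no difference: still the first match only.
const withGlobal = phrase.search(/op/g);

console.log(found, missing, withGlobal);
```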

Page 29: JavaScript Regular Expressions

Open the file called introRegExp_01.html

Page 30: JavaScript Regular Expressions

String.match() Method

• The String.match() method returns an array containing all of the matches from a string when the regular expression uses the global flag. Without the global flag it returns only the first match, and it returns null if there is no match.

• Unlike the String.search() method, the String.match() method does perform global searches.
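A short sketch of the three outcomes of `String.match()`, using a made-up string named `phrase`:

```javascript
const phrase = "lop, mop, bop";

// Global: an array of every match.
const allMatches = phrase.match(/.op/g);
// Not global: the first match only, with its index.
const firstMatch = phrase.match(/.op/);
// No match: null, so check before using the result.
const noMatch = phrase.match(/zap/g);

console.log(allMatches);
console.log(firstMatch[0], firstMatch.index);
console.log(noMatch);
```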

Page 31: JavaScript Regular Expressions

Open the file called introRegExp_02.html

Page 32: JavaScript Regular Expressions

Summary

• We can use a regular expression to search for a pattern of characters.

• We can create a JavaScript regular expression by using the RegExp constructor or by creating a regular expression literal.

continued …

Page 33: JavaScript Regular Expressions

Summary

• We can use the String.search() method to find the first occurrence of a regular expression.

• We can use the String.match() method to return an array of all occurrences of a regular expression.

Page 34: JavaScript Regular Expressions

Resources

• JavaScript: The Definitive Guide, 4th Edition, by David Flanagan (O’Reilly, 2002)