
Regular Expressions in Scheme CS 480/680 – Comparative Languages


Regular Expressions in Scheme

Scheme supports two kinds of regular expressions:
• Regexp – a built-in data type; supports many, but not all, of the Perl/Ruby regular expression operations
• Pregexp – requires an external library (pregexp.ss), but is more powerful than even Ruby’s regular expression syntax


Built-in Regular Expressions

(regexp string) – takes a string argument and returns a regular expression object

(regexp-match regexp string) – matches the regexp object regexp against string
• Returns #f if there is no match
• If there is a match, returns a list of strings
  First: the matching part of string
  Others: the matches for any () pairs
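Because regexp-match returns either #f or a list, callers usually test the result before taking it apart. The following is a minimal sketch, assuming a recent Racket; describe-match is a hypothetical helper introduced only for illustration.

```scheme
#lang racket/base

;; describe-match is a hypothetical helper, not part of any library.
;; It separates the whole match (first element) from the () group matches.
(define (describe-match pattern text)
  (define result (regexp-match (regexp pattern) text))
  (if result
      (list 'whole (car result) 'groups (cdr result))
      'no-match))

(define hit  (describe-match "h(.*)o" "hello world")) ; (whole "hello wo" groups ("ello w"))
(define miss (describe-match "xyz" "hello world"))    ; no-match, since regexp-match gave #f
```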


Built-in Examples

The language supports most regexp constructs, but not all:
• +, *, +?, *?, [a-z], [^aeiou], (a|b), (?:no-list), ., ^, $, etc. are all supported
• \1, \2, etc. are not supported in the match pattern

See regexp1.scheme, regexp2.scheme

(regexp-match (regexp "ll") "hello")
>> ("ll")

(regexp-match (regexp "h(.*)o") "hello world")
>> ("hello wo" "ello w")


More Built-in functions

regexp-match-positions – like regexp-match, but reports (start . end) pairs
(regexp-match-positions (regexp "h(.*)o") "hello world")
>> ((0 . 8) (1 . 7))

(regexp-replace pattern string insert-string)
• \1, \2, etc. work in insert-string
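A short sketch of both functions, assuming a recent Racket. The pattern and input strings are illustrative choices. Note that inside a Scheme string literal the backslash must be doubled, so \1 is written "\\1".

```scheme
#lang racket/base

;; Two groups, one per word.
(define two-words (regexp "([a-z]+) ([a-z]+)"))

;; One (start . end) pair for the whole match, then one per () group.
(define positions (regexp-match-positions two-words "hello world"))
;; positions is ((0 . 11) (0 . 5) (6 . 11))

;; \2 and \1 in the insert string refer to the second and first groups.
(define swapped (regexp-replace two-words "hello world" "\\2 \\1"))
;; swapped is "world hello"
```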


pregexp.ss

The pregexp.ss (“Perl-like regular expressions”) library adds additional power

Beware: the syntax and return values of the functions are changed for this library!


Perl-like functions

(pregexp-match "pattern" "string")
(pregexp-match-positions "pattern" "string")
• "pattern" will automatically be compiled into a pregexp object

If calling repeatedly, it is more efficient to use (pregexp "pat") to compile first
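The compile-once idea can be sketched as follows. This is an assumption-laden example: it relies on a recent Racket, where pregexp is built in and the generic regexp-match accepts the compiled object, rather than on the pregexp-match function from pregexp.ss. The names word-rx and first-word are hypothetical.

```scheme
#lang racket/base

;; Compile the pattern once, outside the function that uses it repeatedly.
(define word-rx (pregexp "\\w+"))

;; first-word is a hypothetical helper: the first run of word characters, or #f.
(define (first-word s)
  (define m (regexp-match word-rx s))
  (and m (car m)))

(define firsts (map first-word '("hello world" "  scheme rocks" "!!!")))
;; firsts is ("hello" "scheme" #f)
```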


Other useful functions

(pregexp-split "pat" "string")
(pregexp-replace "pat" "text" "insert-string")
• (pregexp-replace* "pat" "text" "insert-string") – replaces all matches
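To illustrate the difference between replacing the first match and replacing all matches, here is a sketch that assumes a recent Racket and uses its built-in regexp-split, regexp-replace, and regexp-replace* with pregexp objects as stand-ins for the pregexp.ss functions named above. The sample strings are made up.

```scheme
#lang racket/base

;; Split on a comma followed by optional whitespace.
(define fields (regexp-split (pregexp ",\\s*") "a, b,c"))
;; fields is ("a" "b" "c")

(define digits (pregexp "\\d+"))

;; Only the first run of digits is replaced.
(define first-only (regexp-replace digits "room 12, floor 3" "#"))
;; first-only is "room #, floor 3"

;; The starred variant replaces every match.
(define every (regexp-replace* digits "room 12, floor 3" "#"))
;; every is "room #, floor #"
```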


Pregexp syntax

Supports all of the Ruby regexp syntax, including:
• Backreferences (\1, \2, etc.)
• Non-capturing clusters (?:blah) – no capture to \1 or to the result list

And, in addition:
• (?i:blah) – case-insensitive match
• (?x:blah) – space- and comment-insensitive match
• And more…
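A brief sketch of two of these features, backreferences and (?i:...), assuming a recent Racket with its built-in pregexp and regexp-match. The (?x:...) form is left out because this sketch does not assume it is available outside pregexp.ss.

```scheme
#lang racket/base

;; \1 must repeat whatever the first group matched, so this finds a doubled word.
(define doubled (regexp-match (pregexp "(\\w+) \\1") "it is is fine"))
;; doubled is ("is is" "is")

;; (?i:...) makes the enclosed part case-insensitive.
(define shout (regexp-match (pregexp "(?i:scheme)") "I like SCHEME"))
;; shout is ("SCHEME")

;; (?:...) groups without capturing, so the result list has only the whole match.
(define no-capture (regexp-match (pregexp "(?:ab)+") "xababy"))
;; no-capture is ("abab")
```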


Lookahead, lookbehind

"grey(?=hound)" – matches (and thus returns) grey, but only if the lookahead expression hound could match immediately after it
• With negative lookahead, grey matches only where hound does not follow:
(pregexp-match-positions "grey(?!hound)" "the gray greyhound ate the grey socks")
>> ((27 . 31))

(?=blah) – positive lookahead
(?!blah) – negative lookahead
(?<=blah) – positive lookbehind
(?<!blah) – negative lookbehind
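The four assertions can be compared side by side on the same sentence. This sketch assumes a recent Racket and uses its built-in pregexp with regexp-match-positions in place of pregexp-match-positions. In each case only grey (or gr.y) is part of the reported match, never the text inside the assertion.

```scheme
#lang racket/base

(define text "the gray greyhound ate the grey socks")

;; Positive lookahead: grey followed by hound.
(define pos-ahead (regexp-match-positions (pregexp "grey(?=hound)") text))
;; pos-ahead is ((9 . 13))

;; Negative lookahead: grey not followed by hound.
(define neg-ahead (regexp-match-positions (pregexp "grey(?!hound)") text))
;; neg-ahead is ((27 . 31))

;; Positive lookbehind: grey preceded by "the ".
(define pos-behind (regexp-match-positions (pregexp "(?<=the )grey") text))
;; pos-behind is ((27 . 31))

;; Negative lookbehind: gray or grey not preceded by "the ".
(define neg-behind (regexp-match-positions (pregexp "(?<!the )gr.y") text))
;; neg-behind is ((9 . 13))
```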


Backtracking

Both greedy and non-greedy regular expressions are matched using a backtracking process

Greedy example:
• (pregexp-match "a*a" "aaaa")

Non-greedy example:
• (pregexp-match "a*?a" "aaaa")
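The results of the two examples show the backtracking at work. This sketch assumes a recent Racket and uses its built-in pregexp with regexp-match. In the greedy case a* first consumes all four a's and then gives one back so the final a can match; in the non-greedy case a*? starts with zero a's and only takes more if the rest of the pattern fails.

```scheme
#lang racket/base

;; Greedy: a* takes as much as it can, then backtracks by one character.
(define greedy (regexp-match (pregexp "a*a") "aaaa"))
;; greedy is ("aaaa")

;; Non-greedy: a*? takes as little as it can, so the final a matches at once.
(define lazy (regexp-match (pregexp "a*?a") "aaaa"))
;; lazy is ("a")
```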


Disabling Backtracking

Sometimes, disabling backtracking can be useful:
• (pregexp-match "(?>a+)." "aaaa")
>> #f

• Equivalent to: (pregexp-match "a+[^a]" "aaaa")
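To see what the (?>...) cluster changes, compare it with the same pattern without it. This sketch assumes a recent Racket and uses its built-in pregexp with regexp-match. Once (?>a+) has consumed a run of a's it never gives any back, so the trailing . can only match a character that is not an a.

```scheme
#lang racket/base

;; With backtracking: a+ gives back one a so that . can match it.
(define with-backtracking (regexp-match (pregexp "a+.") "aaaa"))
;; with-backtracking is ("aaaa")

;; Without backtracking: (?>a+) keeps every a, leaving nothing for the dot.
(define atomic (regexp-match (pregexp "(?>a+).") "aaaa"))
;; atomic is #f

;; The atomic form still succeeds when a non-a character follows the run.
(define atomic-ok (regexp-match (pregexp "(?>a+).") "aaab"))
;; atomic-ok is ("aaab")
```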