View
241
Download
0
Category
Preview:
Citation preview
Automata and Formal Languages - CM0081Regular Expressions
Andrés Sicard-Ramírez
Universidad EAFIT
Semester 2018-2
Introduction
Finite automataMachine-like descriptions of regular languages.
Regular expressionsAlgebraic description of regular languagesDeclarative (“user-friendly”) way to express the strings that belong tothe languageExample: (01)(01)∗ + (010)(010)∗Uses
Search commands (e.g. Grep)Lexical-analyzer generators (e.g. Lex and Alex)Domain specific languages (DSLs)
Regular Expressions 2/40
Introduction
Finite automataMachine-like descriptions of regular languages.
Regular expressionsAlgebraic description of regular languages
Declarative (“user-friendly”) way to express the strings that belong tothe languageExample: (01)(01)∗ + (010)(010)∗Uses
Search commands (e.g. Grep)Lexical-analyzer generators (e.g. Lex and Alex)Domain specific languages (DSLs)
Regular Expressions 3/40
Introduction
Finite automataMachine-like descriptions of regular languages.
Regular expressionsAlgebraic description of regular languagesDeclarative (“user-friendly”) way to express the strings that belong tothe language
Example: (01)(01)∗ + (010)(010)∗Uses
Search commands (e.g. Grep)Lexical-analyzer generators (e.g. Lex and Alex)Domain specific languages (DSLs)
Regular Expressions 4/40
Introduction
Finite automataMachine-like descriptions of regular languages.
Regular expressionsAlgebraic description of regular languagesDeclarative (“user-friendly”) way to express the strings that belong tothe languageExample: (01)(01)∗ + (010)(010)∗
UsesSearch commands (e.g. Grep)Lexical-analyzer generators (e.g. Lex and Alex)Domain specific languages (DSLs)
Regular Expressions 5/40
Introduction
Finite automataMachine-like descriptions of regular languages.
Regular expressionsAlgebraic description of regular languagesDeclarative (“user-friendly”) way to express the strings that belong tothe languageExample: (01)(01)∗ + (010)(010)∗Uses
Search commands (e.g. Grep)
Lexical-analyzer generators (e.g. Lex and Alex)Domain specific languages (DSLs)
Regular Expressions 6/40
Introduction
Finite automataMachine-like descriptions of regular languages.
Regular expressionsAlgebraic description of regular languagesDeclarative (“user-friendly”) way to express the strings that belong tothe languageExample: (01)(01)∗ + (010)(010)∗Uses
Search commands (e.g. Grep)Lexical-analyzer generators (e.g. Lex and Alex)
Domain specific languages (DSLs)
Regular Expressions 7/40
Introduction
Finite automataMachine-like descriptions of regular languages.
Regular expressionsAlgebraic description of regular languagesDeclarative (“user-friendly”) way to express the strings that belong tothe languageExample: (01)(01)∗ + (010)(010)∗Uses
Search commands (e.g. Grep)Lexical-analyzer generators (e.g. Lex and Alex)Domain specific languages (DSLs)
Regular Expressions 8/40
Operations on Languages
DefinitionLet 𝐿 and 𝑀 be languages. Then
Union 𝐿 ∪𝑀 = {𝑤 ∣ 𝑤 ∈ 𝐿 or 𝑤 ∈ 𝑀}
Concatenation 𝐿.𝑀 = {𝑤 ∣ 𝑤 = 𝑥𝑦, 𝑥 ∈ 𝐿 and 𝑦 ∈ 𝑀}
Powers 𝐿0 = {𝜀}𝐿1 = 𝐿
𝐿𝑘+1 = 𝐿.𝐿𝑘
Kleene closure 𝐿∗ =∞⋃𝑖=0
𝐿𝑖
Regular Expressions 9/40
Operations on Languages
ExamplesIf 𝐿 = {0, 1}, then 𝐿∗ consists of all strings of 0’s and 1’s and theempty word.
If 𝐿 = {0𝑛 ∣ 𝑛 ≥ 1}, then 𝐿∗ = 𝐿 ∪ {𝜀}.If 𝐿 = {0, 11}, then 𝐿∗ consists of the empty word and those stringsof 0’s and 1’s such that the 1’s come in pairs.∅0 = {𝜀}∅𝑖 = ∅, for 𝑖 ≥ 1∅∗ = {𝜀}
Regular Expressions 10/40
Operations on Languages
ExamplesIf 𝐿 = {0, 1}, then 𝐿∗ consists of all strings of 0’s and 1’s and theempty word.If 𝐿 = {0𝑛 ∣ 𝑛 ≥ 1}, then 𝐿∗ = 𝐿 ∪ {𝜀}.
If 𝐿 = {0, 11}, then 𝐿∗ consists of the empty word and those stringsof 0’s and 1’s such that the 1’s come in pairs.∅0 = {𝜀}∅𝑖 = ∅, for 𝑖 ≥ 1∅∗ = {𝜀}
Regular Expressions 11/40
Operations on Languages
ExamplesIf 𝐿 = {0, 1}, then 𝐿∗ consists of all strings of 0’s and 1’s and theempty word.If 𝐿 = {0𝑛 ∣ 𝑛 ≥ 1}, then 𝐿∗ = 𝐿 ∪ {𝜀}.If 𝐿 = {0, 11}, then 𝐿∗ consists of the empty word and those stringsof 0’s and 1’s such that the 1’s come in pairs.
∅0 = {𝜀}∅𝑖 = ∅, for 𝑖 ≥ 1∅∗ = {𝜀}
Regular Expressions 12/40
Operations on Languages
ExamplesIf 𝐿 = {0, 1}, then 𝐿∗ consists of all strings of 0’s and 1’s and theempty word.If 𝐿 = {0𝑛 ∣ 𝑛 ≥ 1}, then 𝐿∗ = 𝐿 ∪ {𝜀}.If 𝐿 = {0, 11}, then 𝐿∗ consists of the empty word and those stringsof 0’s and 1’s such that the 1’s come in pairs.∅0 = {𝜀}∅𝑖 = ∅, for 𝑖 ≥ 1∅∗ = {𝜀}
Regular Expressions 13/40
Syntax: What the Regular Expressions Are
DefinitionLet Σ be an alphabet. The regular expressions on Σ are inductivelydefined by:
Basis step: 𝜀 and ∅ are regular expressions. If 𝑎 ∈ Σ then a is aregular expression.Inductive step: Let 𝐸 and 𝐹 be regular expressions. Then 𝐸 + 𝐹 ,𝐸𝐹 , 𝐸∗ and (𝐸) are regular expressions.
Regular Expressions 14/40
Syntax: What the Regular Expressions Are
DefinitionLet Σ be an alphabet. The regular expressions on Σ are inductivelydefined by:
Basis step: 𝜀 and ∅ are regular expressions. If 𝑎 ∈ Σ then a is aregular expression.
Inductive step: Let 𝐸 and 𝐹 be regular expressions. Then 𝐸 + 𝐹 ,𝐸𝐹 , 𝐸∗ and (𝐸) are regular expressions.
Regular Expressions 15/40
Syntax: What the Regular Expressions Are
DefinitionLet Σ be an alphabet. The regular expressions on Σ are inductivelydefined by:
Basis step: 𝜀 and ∅ are regular expressions. If 𝑎 ∈ Σ then a is aregular expression.Inductive step: Let 𝐸 and 𝐹 be regular expressions. Then 𝐸 + 𝐹 ,𝐸𝐹 , 𝐸∗ and (𝐸) are regular expressions.
Regular Expressions 16/40
Semantics: What Language Denotes a Regular Expression
DefinitionLet 𝐸 be a regular expression. The language denoted by 𝐸, denotedby 𝐿(𝐸), is inductively defined by:
Basis step:𝐿(𝜀) = {𝜀},𝐿(∅) = ∅,𝐿(a) = {𝑎}.
Inductive step: Let 𝐿(𝐸) and 𝐿(𝐹) be the languages denoted by theregular expressions 𝐸 and 𝐹 , then
𝐿(𝐸 + 𝐹) = 𝐿(𝐸) ∪ 𝐿(𝐹),𝐿(𝐸𝐹) = 𝐿(𝐸)𝐿(𝐹),𝐿(𝐸∗) = (𝐿(𝐸))∗,𝐿((𝐸)) = 𝐿(𝐸).
Regular Expressions 17/40
Semantics: What Language Denotes a Regular Expression
DefinitionLet 𝐸 be a regular expression. The language denoted by 𝐸, denotedby 𝐿(𝐸), is inductively defined by:
Basis step:𝐿(𝜀) = {𝜀},𝐿(∅) = ∅,𝐿(a) = {𝑎}.
Inductive step: Let 𝐿(𝐸) and 𝐿(𝐹) be the languages denoted by theregular expressions 𝐸 and 𝐹 , then
𝐿(𝐸 + 𝐹) = 𝐿(𝐸) ∪ 𝐿(𝐹),𝐿(𝐸𝐹) = 𝐿(𝐸)𝐿(𝐹),𝐿(𝐸∗) = (𝐿(𝐸))∗,𝐿((𝐸)) = 𝐿(𝐸).
Regular Expressions 18/40
Semantics: What Language Denotes a Regular Expression
DefinitionLet 𝐸 be a regular expression. The language denoted by 𝐸, denotedby 𝐿(𝐸), is inductively defined by:
Basis step:𝐿(𝜀) = {𝜀},𝐿(∅) = ∅,𝐿(a) = {𝑎}.
Inductive step: Let 𝐿(𝐸) and 𝐿(𝐹) be the languages denoted by theregular expressions 𝐸 and 𝐹 , then
𝐿(𝐸 + 𝐹) = 𝐿(𝐸) ∪ 𝐿(𝐹),𝐿(𝐸𝐹) = 𝐿(𝐸)𝐿(𝐹),𝐿(𝐸∗) = (𝐿(𝐸))∗,𝐿((𝐸)) = 𝐿(𝐸).
Regular Expressions 19/40
Precedence of Operators
Order of precedence, from highest to lowest: (), ∗, . and +.The operators . and + are left-associative.
Example01∗ + 1 = (0(1∗)) + 1
≠ (01)∗ + 1≠ 0(1∗ + 1)
Regular Expressions 20/40
Precedence of Operators
Order of precedence, from highest to lowest: (), ∗, . and +.The operators . and + are left-associative.
Example01∗ + 1 = (0(1∗)) + 1
≠ (01)∗ + 1≠ 0(1∗ + 1)
Regular Expressions 21/40
Languages Denoted by Regular Expressions
Example𝐸 𝐿(𝐸)a + b 𝐿(a) ∪ 𝐿(b) = {𝑎} ∪ {𝑏} = {𝑎, 𝑏}
a∗ {𝜀, 𝑎, 𝑎𝑎, 𝑎𝑎𝑎,…}
(a + b)(a + b) 𝐿(a + b).𝐿(a + b) ={𝑎, 𝑏}.{𝑎, 𝑏} = {𝑎𝑎, 𝑎𝑏, 𝑏𝑎, 𝑏𝑏}
a + (ab)∗ {𝑎, 𝜀, 𝑎𝑏, 𝑎𝑏𝑎𝑏, 𝑎𝑏𝑎𝑏𝑎𝑏,…}
(0 + 1)∗01(0 + 1)∗ {𝑥01𝑦 ∣ 𝑥, 𝑦 ∈ {0, 1}∗}
ai(a1 + a2 +⋯+ an)∗ {𝑤 ∈ Σ∗ ∣ 𝑤 starts by 𝑎𝑖}
Regular Expressions 22/40
Languages Denoted by Regular Expressions
Example𝐸 𝐿(𝐸)a + b 𝐿(a) ∪ 𝐿(b) = {𝑎} ∪ {𝑏} = {𝑎, 𝑏}
a∗ {𝜀, 𝑎, 𝑎𝑎, 𝑎𝑎𝑎,…}
(a + b)(a + b) 𝐿(a + b).𝐿(a + b) ={𝑎, 𝑏}.{𝑎, 𝑏} = {𝑎𝑎, 𝑎𝑏, 𝑏𝑎, 𝑏𝑏}
a + (ab)∗ {𝑎, 𝜀, 𝑎𝑏, 𝑎𝑏𝑎𝑏, 𝑎𝑏𝑎𝑏𝑎𝑏,…}
(0 + 1)∗01(0 + 1)∗ {𝑥01𝑦 ∣ 𝑥, 𝑦 ∈ {0, 1}∗}
ai(a1 + a2 +⋯+ an)∗ {𝑤 ∈ Σ∗ ∣ 𝑤 starts by 𝑎𝑖}
Regular Expressions 23/40
Languages Denoted by Regular Expressions
Example𝐸 𝐿(𝐸)a + b 𝐿(a) ∪ 𝐿(b) = {𝑎} ∪ {𝑏} = {𝑎, 𝑏}
a∗ {𝜀, 𝑎, 𝑎𝑎, 𝑎𝑎𝑎,…}
(a + b)(a + b) 𝐿(a + b).𝐿(a + b) ={𝑎, 𝑏}.{𝑎, 𝑏} = {𝑎𝑎, 𝑎𝑏, 𝑏𝑎, 𝑏𝑏}
a + (ab)∗ {𝑎, 𝜀, 𝑎𝑏, 𝑎𝑏𝑎𝑏, 𝑎𝑏𝑎𝑏𝑎𝑏,…}
(0 + 1)∗01(0 + 1)∗ {𝑥01𝑦 ∣ 𝑥, 𝑦 ∈ {0, 1}∗}
ai(a1 + a2 +⋯+ an)∗ {𝑤 ∈ Σ∗ ∣ 𝑤 starts by 𝑎𝑖}
Regular Expressions 24/40
Languages Denoted by Regular Expressions
Example𝐸 𝐿(𝐸)a + b 𝐿(a) ∪ 𝐿(b) = {𝑎} ∪ {𝑏} = {𝑎, 𝑏}
a∗ {𝜀, 𝑎, 𝑎𝑎, 𝑎𝑎𝑎,…}
(a + b)(a + b) 𝐿(a + b).𝐿(a + b) ={𝑎, 𝑏}.{𝑎, 𝑏} = {𝑎𝑎, 𝑎𝑏, 𝑏𝑎, 𝑏𝑏}
a + (ab)∗ {𝑎, 𝜀, 𝑎𝑏, 𝑎𝑏𝑎𝑏, 𝑎𝑏𝑎𝑏𝑎𝑏,…}
(0 + 1)∗01(0 + 1)∗ {𝑥01𝑦 ∣ 𝑥, 𝑦 ∈ {0, 1}∗}
ai(a1 + a2 +⋯+ an)∗ {𝑤 ∈ Σ∗ ∣ 𝑤 starts by 𝑎𝑖}
Regular Expressions 25/40
Languages Denoted by Regular Expressions
Example𝐸 𝐿(𝐸)a + b 𝐿(a) ∪ 𝐿(b) = {𝑎} ∪ {𝑏} = {𝑎, 𝑏}
a∗ {𝜀, 𝑎, 𝑎𝑎, 𝑎𝑎𝑎,…}
(a + b)(a + b) 𝐿(a + b).𝐿(a + b) ={𝑎, 𝑏}.{𝑎, 𝑏} = {𝑎𝑎, 𝑎𝑏, 𝑏𝑎, 𝑏𝑏}
a + (ab)∗ {𝑎, 𝜀, 𝑎𝑏, 𝑎𝑏𝑎𝑏, 𝑎𝑏𝑎𝑏𝑎𝑏,…}
(0 + 1)∗01(0 + 1)∗ {𝑥01𝑦 ∣ 𝑥, 𝑦 ∈ {0, 1}∗}
ai(a1 + a2 +⋯+ an)∗ {𝑤 ∈ Σ∗ ∣ 𝑤 starts by 𝑎𝑖}
Regular Expressions 26/40
Languages Denoted by Regular Expressions
Example𝐸 𝐿(𝐸)a + b 𝐿(a) ∪ 𝐿(b) = {𝑎} ∪ {𝑏} = {𝑎, 𝑏}
a∗ {𝜀, 𝑎, 𝑎𝑎, 𝑎𝑎𝑎,…}
(a + b)(a + b) 𝐿(a + b).𝐿(a + b) ={𝑎, 𝑏}.{𝑎, 𝑏} = {𝑎𝑎, 𝑎𝑏, 𝑏𝑎, 𝑏𝑏}
a + (ab)∗ {𝑎, 𝜀, 𝑎𝑏, 𝑎𝑏𝑎𝑏, 𝑎𝑏𝑎𝑏𝑎𝑏,…}
(0 + 1)∗01(0 + 1)∗ {𝑥01𝑦 ∣ 𝑥, 𝑦 ∈ {0, 1}∗}
ai(a1 + a2 +⋯+ an)∗ {𝑤 ∈ Σ∗ ∣ 𝑤 starts by 𝑎𝑖}
Regular Expressions 27/40
Languages Denoted by Regular Expressions
ExampleWrite a regular expression for the language 𝐿 defined by
𝐿 = {𝑤 ∈ {0, 1}∗ ∣ 0 and 1 alternate in 𝑤}.
Solution: (01)∗ + (10)∗ + 0(10)∗ + 1(01)∗
Other solution: (𝜀 + 1)(01)∗(𝜀 + 0)
ExampleThe regular expression
(10 + 0)∗(𝜀 + 1)denotes the set of strings of 0’s and 1’s that have no two adjacent 1’s.
Regular Expressions 28/40
Languages Denoted by Regular Expressions
ExampleWrite a regular expression for the language 𝐿 defined by
𝐿 = {𝑤 ∈ {0, 1}∗ ∣ 0 and 1 alternate in 𝑤}.
Solution: (01)∗ + (10)∗ + 0(10)∗ + 1(01)∗
Other solution: (𝜀 + 1)(01)∗(𝜀 + 0)
ExampleThe regular expression
(10 + 0)∗(𝜀 + 1)denotes the set of strings of 0’s and 1’s that have no two adjacent 1’s.
Regular Expressions 29/40
Languages Denoted by Regular Expressions
ExampleWrite a regular expression for the language 𝐿 defined by
𝐿 = {𝑤 ∈ {0, 1}∗ ∣ 0 and 1 alternate in 𝑤}.
Solution: (01)∗ + (10)∗ + 0(10)∗ + 1(01)∗
Other solution: (𝜀 + 1)(01)∗(𝜀 + 0)
ExampleThe regular expression
(10 + 0)∗(𝜀 + 1)denotes the set of strings of 0’s and 1’s that have no two adjacent 1’s.
Regular Expressions 30/40
Languages Denoted by Regular Expressions
ExampleWrite a regular expression for the language 𝐿 defined by
𝐿 = {𝑤 ∈ {0, 1}∗ ∣ 0 and 1 alternate in 𝑤}.
Solution: (01)∗ + (10)∗ + 0(10)∗ + 1(01)∗
Other solution: (𝜀 + 1)(01)∗(𝜀 + 0)
ExampleThe regular expression
(10 + 0)∗(𝜀 + 1)denotes the set of strings of 0’s and 1’s that have no two adjacent 1’s.
Regular Expressions 31/40
Languages Denoted by Regular Expressions
ExampleWrite a regular expression for denoting the set of strings over Σ = {0, 1}not ending in 01.
Solution: 𝜀 + 0 + 1 + (0 + 1)∗(00 + 10 + 11)
Regular Expressions 32/40
Languages Denoted by Regular Expressions
ExampleWrite a regular expression for denoting the set of strings over Σ = {0, 1}not ending in 01.Solution: 𝜀 + 0 + 1 + (0 + 1)∗(00 + 10 + 11)
Regular Expressions 33/40
Libraries
RemarkTheoretical regular expressions ≠ practical regular expressions.
Some programming languages with support to regular expressions.NET, C, Haskell, Java, Mathematica, MATLAB and Perl.
Regular Expressions 34/40
Libraries
RemarkTheoretical regular expressions ≠ practical regular expressions.
Some programming languages with support to regular expressions.NET, C, Haskell, Java, Mathematica, MATLAB and Perl.
Regular Expressions 35/40
Algorithms
AlgorithmsSee the Haskell implementation of some algorithms on regular expressionsin the course homepage.
Regular Expressions 36/40
Applications
Some programs that use regular expressionsGrep: print lines matching a patternAwk: pattern scanning and processing languageSed: stream editor for filtering and transforming textAlex, Flex and Lex: lexical-analyzer generatorsEmacs and Vim: test editorsMySQL, Oracle and PostgreSQL: databases
Regular Expressions 37/40
Applications
ReadingApplications of Regular Expressions (Hopcroft, Motwani and Ullman 2007,§ 3.3).
𝐿+ def= 𝐿𝐿∗ (one or many times operator)
𝐿? def= 𝜀 + 𝐿 (zero or one time operator)
Regular Expressions 38/40
An Implementation: A Regular Expression Matcher
“Rob’s implementation itself is asuperb example of beautiful code:compact, elegant, efficient, anduseful. It’s one of the best examplesof recursion that I have ever seen.”
Brian Kernighan, p. 3.
Regular Expressions 39/40
Recommended