Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
CITS2211 Discrete StructuresLectures for Semester 2 2017
Non-Regular Languages
October 30, 2017
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Highlights
We have seen that FSMs are surprisingly powerful
But we also saw some languages FSMs can not recognise
Now we will learn a useful theorem for testing whether a languageis regular or not: the pumping lemma for regular languages
We will study a new type of automata: pushdown automata
And a new class of languages: context-free languages
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Reading
Introduction to the Theory of Computation by Michael Sipser
Chapter 1: Regular LanguagesSection 1.4 The pumping lemma for regular languages
Chapter 2: Context-free languagesSection 2.2 Pushdown automata
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Lecture Outline
1 Non-regular languages
2 The pumping lemma for regular languages
3 Push-down automata
4 Context-free languages
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Non-regular languages
Q: Are all languages regular?A: No
We can use a diagonalization argument to show that there arenon-regular languages. Remember we used a diagonalizationargument to show that there are uncountable sets. Now we willshow that there must be some non-regular languages over thealphabet {0, 1}.
Firstly, notice that the set of regular expressions is countable— wecan arrange them in lexicographical order. Therefore we can forma complete list R1, R2, R3, . . ., of regular expressions.
Let L1, L2, . . ., be the corresponding list of regular languagesdescribed by the above regular expressions.
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
A diagonalization argument
Now we will form a new language L and show that it is not in thelist above.
Recall that the earlier diagonalization arguments formed a newobject that was definitely different from every object on thesupposedly complete list.
We will copy this strategy as follows:Consider the set of strings 1, 11, 111, 1111, and form the newlanguage L as follows:If 1 /∈ L1, then add it to L (else do nothing)If 11 /∈ L2, then add it to LIf 111 /∈ L3, then add it to LIf 1111 /∈ L4, then add it to L and so on...
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
A diagonalization argument (cont.)
Then clearly L is different from L1, L2, . . . and thus L is notregular.
This is an example of a non-constructive proof. We have shownthe existence of a non-regular language, but have not actuallyspecified one.
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
FSMs and regular languages
Consider the following recognizer.
s0 s1
1
10
0
What language does this machine recognize?
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
We can start by looking at some input strings to see if there is apattern...Accepted: 1, 01, 001Rejected: 0, 00, 10
We hypothesise that this FSM accepts strings ending in 1, and amoment’s thought shows that this is indeed correct.
Therefore the language accepted by this FSM is
L(M) = (0 + 1)∗1
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Another example: what language does this FSM recognise?
s0 s1
s2s3
0
1 0
1
01
0, 1
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
For this example we can see that if the FSM ever reaches state s3then it can never reach the accepting states (s1 or s2).
Therefore any accepted string must start with 0. Another 0 willchange the state to s3, and therefore the second character must bea 1. We continue this argument to see that the FSM will only be inan accepting state provided it has received a sequence of 01 pairs.As soon as this pattern is broken, the machine changes to state s3.
Therefore the language accepted by this FSM is
L(M) = (01)∗
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
The pumping lemma
We now turn to the problem of determining when a language isnon-regular, and hence cannot be recognized by a finite statemachine.
The main tool for this is a result called the pumping lemma whichcan be proved using the Pigeonhole Principle we studied a fewweeks ago.
The pumping lemma for regular languages was first stated byY. Bar-Hillel, Micha A. Perles, and Eli Shamir in 1961.
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Theorem: Pumping Lemma (for regular languages)
Theorem: If L is a regular language then
∃ an integer p called the pumping length of L such that∀ words w ∈ L where |w | > p∃ an expression w = xyz where
1 ∀ i ≥ 0. xy iz ∈ L2 |y | ≥ 13 |xy | ≤ p
That is, if L is regular then any sufficiently long word in thelanguage contains a non-empty substring that can be repeated anarbitrary number of times (0,1 or more).
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
How to use the pumping lemma
The pumping lemma is of the form R → P. That is IF a languageis regular (R) THEN certain properties must hold (P).
Usage 1: Assume R. From the lemma we also have P. Nowderive a contradiction. The contradiction tells us that theoriginal assumption R must be false. This way we can provethat a language is NOT regular.
Usage 2: R → P ↔ ¬P → ¬R. So if we can show ¬P thenby the pumping lemma we can deduce that the language NOTregular.
Note, that we can not say that if the pumping properties holdthen the language is regular, because we do not have P → R
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Use of pumping lemma
The trickiness of applying pumping lemma is using the ∀ and ∃conditions correctly. You need to practice!
To prove that a language is not regular, find some sufficiently longstring that is in the language, but which cannot be pumped.
The existence of such a string shows that the given language is notregular.
We use an “adversary” game argument [Hopcroft and Ullman]
Your choices in the game correspond to the ∀ quantifiers in thestatement of the Pumping Lemma (see above) and adversarychoices correspond to ∃ lines.
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Proof of pumping lemma
RTP.∀L. ∃p. ∀w . w ∈ L →
∃xyz . w = xyz ∧ |xy | ≤ p ∧ |y | ≥ 1 ∧ ∀i > 0. xy iz ∈ L
Proof. Suppose L is a regular language. Then it can be recognizedby a finite state machine M with p states.Now suppose that w = c1c2 . . . cn is a word of length n ≥ p. Thenconsider the states that occur when M is run with input string w .
s0 −→c1 s1 −→c2 s2 · · · −→cn sn
By the pigeonhole principle, at least two of these states must bethe same so let si and sj be the first two occurrences of the firstrepeated state.
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Pumping Lemma (cont)
Now setx = c1c2 . . . ci
y = ci+1ci+2 . . . cj
z = cj+1 . . . cn
Now we can see that y is a string that takes the finite statemachine “in a circle” from a state back to itself.This means that we can now “pump” the input string by repeatingthis portion as often as possible, and still get a string that isrecognized by the machine.Hence xz , xyyz , xyyyz and in general xy iz are all recognized by Mand hence in the language L. QED
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Adversary Argument using the Pumping Lemma
1 Select the language L to be proven non-regular
2 The adversary (she) picks the pumping constant p
3 You select some string w ∈ L (based on your knowledge of p)
4 She breaks w into any x , y , z she wants subject to theconstraints |xy | ≤ p ∧ |y | ≥ 1
5 You achieve a contradiction to the Pumping Lemma byshowing that for any x , y , z chosen by the adversary, ∃i sothat xy iz /∈ L. Your choice of i may depend on p, x , y , and w .
6 From this it can be concluded that L is not regular.
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Adversary Argument Example
Example Show that the language L is not regular where,
L = {w | w has an equal number of 0s and 1s}
ProofSuppose L is regular. Let the pumping length be p. Choosew = 0p1p. Clearly w ∈ L. We will see this is a useful example of wfor the proof (not all words in L are useful choices). For anyadversary choice of xyz = w , both x and y can only contain 0ssince |xy | ≤ p (constraint 3). Say x = 0m, y = 0n for somem + n ≤ p. Now, the pumping lemma states that xyyz ∈ L but weknow xyyz /∈ L because xyyz has more 0s than 1s. We havederived a contradiction. Therefore L is not regular. QED
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Take Care: Pumping Lemma uses → not ↔
Note that while the pumping lemma states that all regularlanguages do satisfy the conditions described above, the converseof this statement is not true. A language that satisfies thepumping conditions may still be non-regular.
Example: L = {aibjck | i , j , k ≥ 0 ∧ (i = 1→ j = k)}
a) show that L is not regularb) show that w = aibjck satisfies the pumping lemma conditions(for some i,j,k)c) explain why parts a) and b) do not contradict the pumpinglemma
This question is an exercise in this week’s tutorial.
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Context-Free Languages
Context-free languages can describes certain features with arecursive structure.
They are more powerful than FSMs because they have somelimited memory in the form of a stack.
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
What are non-regular languages like?
Context-free languages were first studied for understanding humanlanguages.For English we have the following loose rule
sentence → noun-phrase verb-phrase
which we interpret as saying“A valid sentence consists of a noun-phrase followed by averb-phrase”To complete the description, we then need to define noun-phrase,verb-phrase and so on, which are defined in the same way
noun-phrase → article nounverb-phrase → verb adverb
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Context Free languages for Computer Science
An important use of CF languages in Computer Science is thespecification and compilation of languages such as Java or SQL.
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Context Free languages and Automata
Context-free languages include all the regular languages and manymore.
For example, we saw that 0n1n was not a regular language, buttoday we will learn that it is a context-free language.
Context-free languages are precisely the class of languages that canbe recognised by pushdown automata, which are finite statemachines that use a stack as a memory device.
The most important formal languages for Computer Science, areprobably context-free languages, because many computerlanguages are context free languages (at least in part if not all).
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Grammars
Context-free languages can be specified using grammars.
For example, the language {0n1n|n ≥ 0} is generated by thegrammar
A→ 0A1 | ε
Example construction sequence: A, 0A1, 00A11, 000A111, 000111
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Grammars: the mechanics
A grammar grows all the strings of a language.
A grammar is a collection of substitution rules calledproductions.
Each rule has a left hand side symbol, an arrow and a righthand side.
Variable symbols are called non-terminals and usuallyrepresented by a capital letter.
Other symbols are from the alphabet of the language calledterminals and usually represented by a lower case letter.
One symbol is designated the start variable usually written S .
Strings in the language are grown by starting with the startsymbol and then replacing non-terminals according to theproduction rules.
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Grammar Example 1
The language of all expressions with balanced brackets is generatedby the grammar
S → SS | (S) | ε
Example construction sequence:
S ,SS ,SSS , (S)SS , ((S))SS , (())(S)S , (())()S , (())()(S), (())()()
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Grammar Example 2
A→ 0A1A→ BB → x
Example derivation: A, 0A1, 00A11, 00B11, 00x11
Rules can be written on separate lines (Example 2), or using | todenote a list of rules for the same non-terminal (Example 1).
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Context-free grammar definition
Definition: A context-free grammar is a 4-tuple (V ,Σ,R,S) where
1 V is a finite set called the variables (usually denoted bycapital letters)
2 Σ is a finite set, Σ ∩ V = ∅, called the terminals (usuallydenoted by lower case letters or symbols as the alphabet ofthe language)
3 R is a finite set of rules, with each rule of the form V → Xwhere X is a string of variables and terminals.
4 S ∈ V is the start variable
Sipser Definition 2.2, page 104 in the 3rd edition
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Idea of Push-Down Automata (PDA)
Context-free languages can be recognised by automata called PDAs
PDAs are similar to non-deterministic FSMs (NFSMs)
but they have an extra component called a stack
The stack provides extra memory, in addition to states
This memory allows PDA to do “counting” that an NFSM can not
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Formal defn of PDAs
Definition: A pushdown automata (PDA) is defined to be a6-tuple (Q,Σ, Γ,F , δ, q0,F ) where
Q is a finite set of states
Σ is a finite alphabet of input symbols
Γ is the finite stack alphabet
δ : Q × Σε × Γε → P(Q × Γε) is the transition function
q0 ∈ Q is the start state
F ⊆ Q is a set of accepting states (F may be the empty set)
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
PDA transitions
Note this definition is nearly the same as for NFSMs with theexception of the transition function and the addition of a stackalphabet Γ.
As well as changing state for a given input, a PDA may
read and pop a symbol from the stack
push a symbol onto the top of the stack.
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Writing PDA transitions
Transitions are written a, b → c where a is an input symbolIf the machine sees input a then it may replace b on top of thestack with cIn other words, b is the symbol popped off the stack and c is thesymbol pushed onto the stackIf b is ε (the empty symbol) then make the transition without anypop (read) operationIf c is ε then make the transition without any push (write)operation$ is a special symbol used to denote the bottom of the stack: itmeans the stack is empty
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
PDA Example for 0n1n
q1 q2
q3q4
ε, ε → $
1, 0 → ε
ε, $ → ε
0, ε → 0
1, 0 → ε
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Reading PDA transitions
ε, ε → $ Given no input and nothing to pop, put the empty stacksymbol onto the stack. All PDAs start with this transition0, ε → 0 On seeing an input 0, push a 0 onto the stack. Do notpop anything from the stack1, 0 → ε On seeing input 1 and a 0 on the stack, pop the 0 fromthe stack and do not push anything on. This step pairs off all the1s with the previously stored 0s.ε, $ → ε On seeing no input and an empty stack, accept thestring since we must now have seen the same number of 1s as 0s
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Palindrome language
Design a PDA to recognize the language { wwR | w ∈ {0, 1}∗ }.w is any binary string, wR means w written backwards. So forexample, 001110011100 is in the language.Here we will use non-determinism to “guess” when the middle ofthe string has been reached.Approach:
1 Push all the symbols read onto the stack
2 Guess you have reached the middle of the word
3 Then pop elements off the stack if they match the next inputsymbol
4 Accept the string if every popped symbol matches the input,and the stack empties at the same time the end of the inputis reached.
5 Reject otherwise
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Palindrome language PDA
q1 q2
q3q4
ε, ε → $
ε, ε → ε
ε, $ → ε
0, ε → 0 1, ε → 1
0, 0 → ε 1, 1 → ε
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Designing PDAs
See Sipser Lemma 2.21 for details (p117 in 3rd ed)
Idea: PDA accepts input w if grammar G generates it, byfollowing the derivations
Use non-determinism to allow for choice of productions.
Push the start symbol S onto the stack ε, ε→ S
If top of stack is a non-terminal (S) then non-deterministicallychoose any production and substitute S by the rule.
Since the rules generate more symbols we need a string ofpushes. No inputs are consumed at this stage.
If top of stack is a terminal symbol (0 or 1) then checkwhether it matches the next symbol in the input string
If they don’t match then go to a non-accept state, if they domatch then continue
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Designing PDAs (cont)
Example - see tutorial questions - to be done in class
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
How to show a language is context-free
Theorem: PDA are equivalent in power to context-free grammars.This is a useful result because it gives 2 options for proving that alanguage is CF
1 specify a PDA for the language
2 specify a CF grammar for language
Non-regular languages Pumping lemma Context-Free Languages Context Free Grammars Pushdown Automata
Backus-Naur form (for information)
Context-free grammars related to computer languages are oftengiven in a special shorthand notation known as Backus-Naur form.
<identifier> :: = <letter> | <identifer> <letter> |
<identifier> <digit>
<letter> :: = a | b | c | ... | z
<digit> :: = 0 | 1 | ... | 9
In BNF, the non-terminals are identifed by the angle brackets, andproductions with the same left-hand side are combined into asingle statement with the OR symbol.