Sommersemester 2014 - ti.inf.uni-due.de

“Automaten und Formale Sprachen”

alias

“Theoretische Informatik”

Summer semester 2014

Dr. Sander Bruggink
Exercise supervision: Jan Stückrath

Sander Bruggink Automaten und Formale Sprachen 1

Organisational Stuff and Introduction

Who are we?

Teacher: Dr. Sander Bruggink

Room LF 265

E-Mail: [email protected]

Teaching assistant: Jan Stückrath

Room LF 265

E-Mail: [email protected]

Tutors: Lars Stoltenow / Martin Weber

Lars Stoltenow: [email protected]

Martin Weber: [email protected]

Introduction

Who are you?

BAI

ISE

Others

Website

www.ti.inf.uni-due.de/teaching/ss2013/afs/

Moodle-Site

Appointments

Lecture:

Tuesday, 12pm–2pm, room LB 131

Exercise groups:

Group ISE: Tuesday, 8am–10am, room LE 120 (English), Martin Weber

Group BAI-1: Wednesday, 12pm–2pm, room LE 120, Lars Stoltenow

Group BAI-2: Thursday, 12pm–2pm, room LF 125, Lars Stoltenow

Group BAI-3: Thursday, 4pm–6pm, room LC 137, Martin Weber

Group BAI-4: Friday, 10am–12pm, room LC 137, Jan Stückrath

Advice about the exercises

Please try to split evenly among the exercise groups.

Visit the exercise groups and do the homework. The material of this lecture can only be mastered by frequent practice. Memorizing doesn't help much.

The exercise groups start in the third week of the semester. Thus, the first exercise groups take place from 22 to 25 April.

The exercise sheet is put online every week on Tuesday at the latest.

The written exercises must be handed in by Tuesday, 8am of the following week. In that week the exercise sheet is discussed in the exercise groups.

Handing in:

  in the letter box adjacent to room LF 259, or
  online through Moodle.

Please write your name, student number and group number clearly on your solution. Also write down the name of the lecture.

Exam

Oral exam in the module “Theoretische Informatik” (“Automaten und formale Sprachen” together with “Berechenbarkeit und Komplexität”)

For: BAI

Students who started this summer semester may choose to take the two oral exams of the module separately.

Written exam (Klausur)

For: BAI (PO 2012), ISE, minor-subject students (Nebenfach)

Bonus points

During the semester there will be 12 (or 11) exercise sheets of 20 points each.

If you receive 50% of the points, your exam grade improves by one grade level (for example 2,0 instead of 2,3).

You can obtain 10 extra bonus points by publicly presenting the answer to an exercise in the exercise group (this is possible only once).

For the oral exam of the module “Theoretische Informatik” you must obtain the bonus in both “Automaten und Formale Sprachen” and “Berechenbarkeit und Komplexität”.

Literature

We use the following book:

Uwe Schöning: Theoretische Informatik – kurzgefaßt. Spektrum, 2008 (5th edition).

Other relevant literature:

New edition of an old classic: Hopcroft, Motwani, Ullman: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 2001.

Organisational Stuff and Introduction Motivation / Introduction

Informal Overview

Automata

Finite representations of languages: finite automata, pushdown automata, (Turing machines), . . .

Other methods to finitely represent languages: grammars, regular expressions

and formal languages

Language = set of finite sequences of symbols (= words)

For example:

  Set of arithmetical expressions
  Set of syntactically correct Java programs
  Set of all German sentences
  Set of satisfiable logical formulas

Motivation: Vending Machine

[Figure: 50 cent, 20 cent and 10 cent coins. Image: Wikipedia]

Paying 70 cents with 50, 20 and 10 cent coins.

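Automaton:

[Figure: automaton with the states 0, 10, 20, 30, 40, 50, 60 and 70. Each transition is labelled with a coin (10, 20 or 50) and leads from state s to state s + coin, as long as the sum does not exceed 70.]

Language: sequences of coins (10, 20 or 50 cents) that are worth 70 cents.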
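The vending machine automaton can be simulated in a few lines. This is only a sketch: the names `COINS`, `PRICE`, `step` and `accepts` are chosen here for illustration and do not come from the lecture.

```python
# Coins accepted by the machine and the price, as in the vending machine example.
COINS = (10, 20, 50)
PRICE = 70

def step(state, coin):
    """Transition function: add the coin to the amount paid so far.

    Returns None when the automaton has no such transition
    (unknown coin, or the sum would exceed the price).
    """
    total = state + coin
    return total if coin in COINS and total <= PRICE else None

def accepts(coins):
    """Run the automaton from state 0; accept exactly when it ends in state 70."""
    state = 0
    for coin in coins:
        state = step(state, coin)
        if state is None:
            return False
    return state == PRICE

print(accepts([50, 20]))          # True
print(accepts([20, 20, 20, 10]))  # True
print(accepts([50, 50]))          # False
```

The accepted coin sequences are exactly the words of the language described above.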

Adventure-Problem

Warm-up: we consider the adventure problem, in which an adventurer searches for a path through an adventure.

(Later we will find out what this has to do with formal languages.)

[Figure: the example adventure, a map whose fields are numbered 1 to 16.]

Adventure Problem (Level 1)

Rules of the Adventure Problem:

The Treasure Rule

You must find at least two treasures.

The Door Rule

You can only go through a door if you have found a key before. (The key can be used arbitrarily many times.)

The Dragon Rule

Immediately after an encounter with a dragon, you must jump into a river, because otherwise the dragon will set you on fire. This no longer applies once you have found a sword, because then you can kill the dragon.

Remark: Dragons, treasures and keys are “refilled” after you have left the corresponding field.

We are looking for a path from a start state to an end state that satisfies all of the above conditions.

Question (Level 1)

Is there a solution in the example?

Yes! The shortest solution is: 1, 2, 3, 1, 2, 4, 10, 4, 5, 6, 4, 5, 6, 4, 11, 12 (length 16).

Is there a general solving procedure which, given an adventure in the form of a graph, can always determine whether there is a solution?

Yes! We will see this procedure in the lecture.

In order to be able to implement this procedure, we also need a formal description of the rules (door rule, dragon rule, treasure rule).

Adventure Problem (Level 2)

New Door Rule

The keys are magical and disappear immediately after being used to open a door. As soon as you go through a door, the door is locked again.

However, you can carry more than one key.

Questions (Level 2)

Is there a solution in the example?

Yes! The shortest solution is: 1, 2, 3, 1, 2, 4, 10, 4, 7, 8, 9, 4, 7, 8, 9, 4, 11, 12 (length 18).

Is there a general solving procedure?

Yes! We will see this procedure in the lecture.

Why is the new problem harder?

We have to “count” the keys.
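The difference between the two door rules can be made concrete with a small sketch. The encoding is an assumption made here for illustration: an object sequence is a string in which "K" stands for finding a key and "D" for passing a door; all other objects are ignored. For Level 1 one bit of memory suffices, while Level 2 needs an unbounded counter.

```python
def door_rule_level1(objects):
    """Level 1: one key, once found, opens every later door."""
    have_key = False                 # a single bit of memory is enough
    for obj in objects:
        if obj == "K":
            have_key = True
        elif obj == "D" and not have_key:
            return False
    return True

def door_rule_level2(objects):
    """Level 2: every door uses up one key, so the keys must be counted."""
    keys = 0                         # unbounded counter
    for obj in objects:
        if obj == "K":
            keys += 1
        elif obj == "D":
            if keys == 0:
                return False
            keys -= 1
    return True

print(door_rule_level1("KDD"))   # True
print(door_rule_level2("KDD"))   # False
print(door_rule_level2("KKDD"))  # True
```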

Adventure Problem (Expert Level)

New Dragon Rule

Swords become unusable through the dragon's blood as soon as one has killed a dragon. However, dragons are replaced after being killed.

Key Rule

A magic gate can only be passed when you don't own a key.

Sword Rule

A river can only be passed when you don't have a sword (otherwise, you'll drown).

Questions (Expert Level)

Is there a solution in the example?

Yes! The shortest solution is: 1, 2, 3, 1, 2, 4, 10, 4, 7, 8, 9, 4, 10, 4, 5, 6, 4, 11, 12 (length 19).

Is there a general solving procedure?

No! It is a so-called undecidable problem.

This is not discussed in this lecture, but in “Berechenbarkeit und Komplexität”.

Adventure Problem and Formal Languages

Automata:

  Adventure instances
  Automaton that accepts possible solutions

Languages:

  Possible object sequences of an adventure instance
  Object sequences that satisfy the treasure rule
  Object sequences that satisfy the dragon rule
  Object sequences that satisfy the door rule

Formal Languages

Questions

Typical questions here are:

Is a language L empty or does it contain (at least) one word? L = ∅?

Is a word w in the language? w ∈ L?

Are two languages included in one another? L1 ⊆ L2?

Depending on the language (or languages), these questions are either

decidable (there is a general procedure to solve the problem) or

undecidable

Adventure Problem and Formal Languages

The individual levels of the adventure belong to the following language classes:

Level 1 → regular languages

Level 2 → context-free languages

Expert level → Chomsky-0 languages (semi-decidable languages). These are discussed in “Berechenbarkeit & Komplexität”.

Organisational Stuff and Introduction Contents of the Lecture

For theoretical computer science

How can infinite structures be represented by finite descriptions (automata, grammars)?

There are numerous applications, for example in the following areas:

  searching in texts (regular expressions)
  syntax of (programming) languages and compiler construction
  modelling system behaviour
  verification of systems

Contents of the lecture

Automata and formal languages

Mathematical foundations and formal proofs

Languages, grammars and automata

Chomsky Hierarchy (different language classes)

Regular languages and context-free languages

How can we show that a language is not of a certain class?

Decision procedures

Closure properties (is the intersection of two regular languages also regular?)

Organisational Stuff and Introduction Mathematical Foundations and Formal Proofs

Sets

Set

A set M of elements can be written as an enumeration

M = {0, 2, 4, 6, 8, . . . }

or as the set of elements with a certain property

M = {n | n ∈ N and n even}

General format: M = {x | P(x)}

(M is the set of all elements x which satisfy property P.)

Remarks:

The elements of a set are unordered, that is, their order is not important. For example:

{1, 2, 3} = {1, 3, 2} = {2, 1, 3} = {2, 3, 1} = {3, 1, 2} = {3, 2, 1}

An element cannot occur in a set more than once. It is either in the set, or not. For example:

{1, 2, 3} ≠ {1, 2, 3, 4} = {1, 2, 3, 4, 4}
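These remarks can be tried out with Python's built-in `set` type, which behaves like a finite mathematical set. A small sketch; the cut-off `range(10)` is an assumption made here, because an infinite set cannot be enumerated.

```python
# Set comprehension: the finite part of {n | n ∈ N and n even} below 10.
M = {n for n in range(10) if n % 2 == 0}
print(M == {0, 2, 4, 6, 8})             # True

# The order of the elements does not matter.
print({1, 2, 3} == {3, 1, 2})           # True

# An element is either in the set or not; listing it twice changes nothing.
print({1, 2, 3, 4} == {1, 2, 3, 4, 4})  # True
print({1, 2, 3} != {1, 2, 3, 4})        # True
```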

Element of a set

We write a ∈ M, when an element a is contained in the set M.

Number of elements of a set

For a set M the number of elements of M is denoted by |M|.

Empty set

The empty set (set without elements) is denoted by ∅.

Subset

We write A ⊆ B when every element of A is also an element of B. A is then called a subset of B. The relation ⊆ is also called inclusion.

Example:

2 ∈ {1, 2, 3}? ✓

2 ⊆ {1, 2, 3}? ✗ (but {2} ⊆ {1, 2, 3} ✓)

{1, 2} ∈ {1, 2, 3}? ✗

{1, 2} ⊆ {1, 2, 3}? ✓

∅ ∈ {A, B, C}? ✗

∅ ⊆ {A, B, C}? ✓

Sets can also contain sets:

1 ∈ {{1}, {3, 4}}? ✗

{1} ∈ {{1}, {3, 4}}? ✓

{1} ⊆ {{1}, {3, 4}}? ✗

{{1}} ⊆ {{1}, {3, 4}}? ✓

Important: a ≠ {a}

Venn Diagrams

Venn diagrams are graphical representations of sets and the relationships between them.

[Figure: two Venn diagrams of sets A and B.]

Set operations

Union: A ∪ B = {e | e ∈ A or e ∈ B}
Intersection: A ∩ B = {e | e ∈ A and e ∈ B}
Difference: A \ B = {e | e ∈ A and e ∉ B}

[Figure: Venn diagrams of A ∪ B, A ∩ B and A \ B.]

Power set

Power set

Let M be a set. The set P(M) is the set of all subsets of M.

P(M) = {A | A ⊆ M}

We have: |P(M)| = 2^|M| (for a finite set M).
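For a small finite set the power set can simply be computed. A sketch using the standard library; the function name `power_set` is chosen here, and `frozenset` is used because Python sets can only contain hashable elements.

```python
from itertools import combinations

def power_set(m):
    """Return the set of all subsets of the finite set m."""
    items = list(m)
    return {
        frozenset(subset)
        for size in range(len(items) + 1)
        for subset in combinations(items, size)
    }

P = power_set({1, 2, 3})
print(len(P))                  # 8, which is 2 ** 3
print(frozenset() in P)        # True: the empty set is a subset of every set
print(frozenset({1, 3}) in P)  # True
```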

Tuples

Tuple

Besides sets we also use tuples, which are written with (round) parentheses: (a1, . . . , an)

In a tuple the elements are ordered. For example:

(1, 2, 3) ≠ (1, 3, 2)

An element can occur multiple times in a tuple. Tuples of different size are always unequal. For example:

(1, 2, 3, 4) ≠ (1, 2, 3, 4, 4)

A tuple (a1, . . . , an) consisting of n elements is called an n-tuple. A 2-tuple is also called a pair.

Cross Product

Cross product (or Cartesian product)

Let A, B be two sets. The set A × B is the set of all pairs (a, b), where a is an element of A and b an element of B.

A × B = {(a, b) | a ∈ A, b ∈ B}

We have: |A × B| = |A| · |B| (for finite sets A, B).
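A short sketch of the cross product using `itertools.product` from the standard library. The two example sets are chosen here for illustration.

```python
from itertools import product

A = {"f", "g", "h"}
B = {1, 2, 3, 4}

# A × B as a set of pairs (Python tuples of length 2).
AxB = set(product(A, B))

print(len(AxB) == len(A) * len(B))  # True: 3 * 4 = 12 pairs
print(("f", 1) in AxB)              # True
print((1, "f") in AxB)              # False: pairs are ordered
```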

Relations

Binary relation

Let A, B be two sets. A binary relation between A and B is a set of pairs R ⊆ A × B.

Example:

A = {f, g, h}
B = {1, 2, 3, 4}
R = {(f, 1), (f, 2), (g, 4), (h, 2)}

[Figure: the sets A and B with one arrow from A to B for each pair in R.]

The sets A and B can also be equal.

Relation over A

Let A be a set. A (binary) relation over A is a set of pairs R ⊆ A × A.

Example:

A = {1, 2, 3, 4, 5, 6}
R = {(1, 3), (1, 2), (2, 6), (3, 6), (4, 4), (5, 1), (5, 5), (5, 6)}

[Figure: the elements of A drawn as nodes, with one arrow for each pair in R.]

Properties of Relations

Let R ⊆ A × A be a relation from A to A.

R is reflexive when for all x ∈ A: x R x.

[Figure: element a with a loop.]

R is irreflexive when for all x ∈ A: not x R x.

[Figure: element a with a crossed-out loop.]

There are relations that are neither reflexive nor irreflexive.

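R is symmetric when for all x, y ∈ A it holds that when x R y, then y R x.

[Figure: elements a and b with arrows in both directions.]

R is antisymmetric when for all x, y ∈ A it holds that when x R y and y R x, then x = y.

[Figure: elements a and b (where a ≠ b) with arrows in both directions, crossed out.]

R is transitive when for all x, y, z ∈ A it holds that when x R y and y R z, then x R z.

[Figure: elements a, b and c with arrows from a to b, from b to c and from a to c.]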

Special Relations

A quasi-order (or pre-order) is a reflexive, transitive relation.

An order is a reflexive, transitive and antisymmetric relation.

An equivalence relation is a reflexive, transitive and symmetric relation.

[Figure: example graphs on the elements a, b, c, d for a quasi-order, an order and an equivalence relation.]
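For finite relations, the properties defined above can be checked mechanically. The sketch below represents a relation as a Python set of pairs; all function names are chosen here for illustration. The first example is the relation R over A = {1, . . . , 6} used earlier, which is neither reflexive nor irreflexive.

```python
def is_reflexive(R, A):
    return all((x, x) in R for x in A)

def is_irreflexive(R, A):
    return all((x, x) not in R for x in A)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_antisymmetric(R):
    return all(x == y for (x, y) in R if (y, x) in R)

def is_transitive(R):
    return all((x, z) in R for (x, y) in R for (y2, z) in R if y == y2)

def is_quasi_order(R, A):
    return is_reflexive(R, A) and is_transitive(R)

def is_order(R, A):
    return is_quasi_order(R, A) and is_antisymmetric(R)

def is_equivalence(R, A):
    return is_quasi_order(R, A) and is_symmetric(R)

A6 = {1, 2, 3, 4, 5, 6}
R6 = {(1, 3), (1, 2), (2, 6), (3, 6), (4, 4), (5, 1), (5, 5), (5, 6)}
print(is_reflexive(R6, A6), is_irreflexive(R6, A6))  # False False

A3 = {1, 2, 3}
LEQ = {(x, y) for x in A3 for y in A3 if x <= y}              # "less than or equal"
PARITY = {(x, y) for x in A3 for y in A3 if x % 2 == y % 2}   # "same parity"
print(is_order(LEQ, A3), is_equivalence(LEQ, A3))        # True False
print(is_order(PARITY, A3), is_equivalence(PARITY, A3))  # False True
```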

Functions

Let R ⊆ A× B be a relation from A to B.

R is total when for all x ∈ A there exists at least one y ∈ B such that x R y.

R is non-ambiguous when for all x ∈ A there exists at most one y ∈ B such that x R y.

Function

A function f : A → B is a total and non-ambiguous relation from A to B.

A function maps an element a ∈ A to an element f(a) ∈ B. Here, A is the domain and B the codomain.

Notation: f(x) = y when x f y.
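Whether a finite relation is a function can be checked directly from the two conditions. A sketch with names chosen here; the relation R is the example relation between {f, g, h} and {1, 2, 3, 4} from the binary relation slide, and F is a variant made up for illustration.

```python
def is_total(R, A):
    """Every x in A is related to at least one y."""
    return all(any(a == x for (a, _) in R) for x in A)

def is_non_ambiguous(R):
    """Every x is related to at most one y."""
    image = {}
    for a, b in R:
        if image.setdefault(a, b) != b:
            return False
    return True

def is_function(R, A):
    return is_total(R, A) and is_non_ambiguous(R)

A = {"f", "g", "h"}
R = {("f", 1), ("f", 2), ("g", 4), ("h", 2)}
F = {("f", 1), ("g", 4), ("h", 2)}

print(is_function(R, A))  # False: f is related to both 1 and 2
print(is_function(F, A))  # True
print(dict(F)["g"])       # 4, i.e. F(g) = 4
```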

Mathematical Statements

Statements (propositions) make assertions.

Basically, the language of mathematical statements is the language of (classical) predicate logic.

A statement is either true or false. (Law of excluded middle / Tertium non Datur)

As a basic principle, we only hold an assertion true if we can prove it. (Exception: axioms)

The following kinds of assertions exist:

Atomic assertions
  2 is a prime number              Prime(2)
  x is larger than 5               x > 5

Operators
  not P                            ¬P
  P and Q                          P ∧ Q
  P or Q                           P ∨ Q
  when P, then Q                   P → Q
  P if and only if Q               P ↔ Q

Quantifiers
  there exists an x such that P    ∃x P
  for all x P                      ∀x P

Translation of natural language to predicate logic

Translations:
  All P are Q.                     ∀x (P(x) → Q(x))
  There exists a P such that Q.    ∃x (P(x) ∧ Q(x))

Not all operators and quantifiers are explicitly given!

Examples:

“There is a prime number that is even.”

“All prime numbers are odd.”

“Let x be an even prime number. Then x < 10.”

“When x is an even number larger than 3, then x is not a prime number.”

Proving Assertions

We will now concentrate on how mathematical assertions of different types can be proved.

Why do we prove an assertion?

Answer: to convince ourselves and others that the assertion is true.

During the process of writing a proof there are various kinds of assertions:

assertions that can be used:

  assertions that have already been proved (propositions, theorems, lemmas, . . . );
  assertions that we have assumed true (hypotheses, premises, assumptions);
  axioms (assertions that we immediately recognize as being provable);

assertions that we have to prove or refute.

Implication

Implication (“if, then”)

“If P, then Q” (P → Q) is true when Q follows from P.

Using: If P is known, and P → Q is known, then Q is known (Modus Ponens).

Proving: To prove P → Q, assume P, and show under this assumption that Q is true.

Refuting: To refute P → Q, show that P is true, but Q isn’t.

Conjunction

Conjunction (“and”)

“P and Q” (P ∧ Q) is true when P and Q are both true.

Using: When P ∧ Q is known, then P and Q are also known separately (and can be used as premises).

Proving: To prove P ∧ Q, we have to prove P and prove Q.

Refuting: To refute P ∧ Q, we have to refute P or refute Q.

Disjunction

Disjunction (“or”)

“P or Q” (P ∨ Q) is true when P is true or Q is true.

Refuting: To refute P ∨ Q, one must refute P and refute Q.

Using: To prove R from P ∨ Q:

  Assume P, and show under that assumption that R is true.
  Assume Q, and show under that assumption that R is true.

Since R follows from both P and Q, and P or Q holds, R must also hold.

Proving: To prove P ∨ Q, you have to prove P or you have to prove Q.

Negation

Negation (“not”)

“Not P” (¬P) is true, when P is not true, and false, when P is true.

Using: Negations can be used to prove contradictions.

Proving: You prove ¬P by refuting P. (In many cases you need a proof by contradiction.)

Refuting: You refute ¬P by proving P.

Universal quantifier

Universal quantifier (“for all”)

“For all x it holds that P” (∀x P) is true when P holds for all objects x.

Using: When ∀x P is known, and you have an object a, you know that P holds for a (i.e. P[x/a] is true, where P[x/a] is P with all occurrences of x replaced by a).

Proving: Assume that a is an arbitrary object. Show that P holds for a. You cannot assume anything about a!

Refuting: Search for a counterexample; that is, search for an object a such that P doesn't hold for a.

Existential quantifier

Existential quantifier (“there is a”)

“There is a P” (∃x P) is true when an object x exists for which P holds.

Using: When ∃x P is known, you may introduce an object for which P holds (with an arbitrary name). You cannot assume any other things about this object.

Proving: Search for an example: an object a for which P is true.

Refuting: Assume that a is an arbitrary object and show that P does not hold for a. You cannot assume anything else about a.

Example

Prove the following theorem:

Theorem

Let A be a set, and ⪯ ⊆ A × A a quasi-order on A. Define the relation ≈ as follows:

x ≈ y whenever x ⪯ y and y ⪯ x.

Then ≈ is an equivalence relation.

Example of a counterexample

Jan claims that the following assertion is true:

Jan's “Theorem”

Let R ⊆ A × A be a binary relation. When R is symmetric and transitive, then R is reflexive.

Jan argues as follows: from a R b it follows by symmetry that b R a, and thus by transitivity that a R a.

All persons are fictitious. Any resemblance to real persons, living or dead, is purely coincidental.
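The claim can be refuted with a concrete counterexample, which a few lines of code can check. The sets below are chosen here for illustration: the element 3 is not related to anything, so the step "from a R b" in Jan's argument never gets started for a = 3.

```python
def is_reflexive(R, A):
    return all((x, x) in R for x in A)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    return all((x, z) in R for (x, y) in R for (y2, z) in R if y == y2)

A = {1, 2, 3}
R = {(1, 1), (1, 2), (2, 1), (2, 2)}   # 3 is related to nothing

print(is_symmetric(R))     # True
print(is_transitive(R))    # True
print(is_reflexive(R, A))  # False: (3, 3) is missing
```

The empty relation over a non-empty set is an even smaller counterexample.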

Subsets

By definition, “A ⊆ B” means the same thing as “for all x ∈ A it holds that x ∈ B”.

Proving A ⊆ B: Assume that x is an element of A. Show, under this assumption, that x ∈ B. You cannot assume any other things about x!

Refuting A ⊆ B: Search for a counterexample, that is, an object x ∈ A such that x ∉ B.

Equality of sets

Two sets A and B are equal, when A ⊆ B and B ⊆ A.

Proving A = B: Prove A ⊆ B and prove B ⊆ A.

Refuting A = B: Search for a counterexample, that is, an object x ∈ A such that x ∉ B, or an object x ∈ B such that x ∉ A.

Example

Prove the following theorem:

Theorem (Law of distributivity)

For sets A,B,C it holds that:

(A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C)
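Before proving the law, it can be tested on all subsets of a small universe. This is a sanity check, not a proof: it only covers the finitely many sets that are tried. The function names and the universe {1, 2, 3} are chosen here for illustration.

```python
from itertools import combinations, product

def subsets(universe):
    """All subsets of a finite universe, as Python sets."""
    items = list(universe)
    return [set(c) for size in range(len(items) + 1) for c in combinations(items, size)]

def distributivity_holds(universe):
    """Check (A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C) for all subsets A, B, C."""
    subs = subsets(universe)
    return all(
        (A & B) | C == (A | C) & (B | C)
        for A, B, C in product(subs, repeat=3)
    )

print(distributivity_holds({1, 2, 3}))  # True (8 * 8 * 8 = 512 combinations)
```

The actual proof follows the scheme for equality of sets: show both inclusions separately.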

Proof by contradiction

Proof by contradiction (Reductio ad absurdum)

Prove an assertion P by assuming its negation and then deducing a contradiction.

Example: Prove the following theorem:

Theorem (√2 is irrational)

√2 is irrational, that is, there are no p, q ∈ Z such that p/q = √2.

Induction

Induction is a proof method which can be used for sets that have smallest elements (formally: well-founded sets). For example: natural numbers (N)

When we want to show that all elements of such a set have a certain property, we can do this as follows:

Base case: Prove that all smallest elements of the set have the property (in the case of N: 0 or 1).

Induction case: Prove that an arbitrary element e (which is not one of the smallest elements) has the property, under the assumption that all smaller elements have the property.

When we have proven these two parts, we can deduce that the property holds for all elements of the set.

Example for Induction

Prove the following theorem:

Theorem:

For all n > 0 it holds that 1 + 2 + · · · + n = n · (n + 1) / 2.
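One possible way to write down the induction proof is sketched below. It uses the scheme described above with 1 as the smallest element; in the induction case only the assumption for the direct predecessor n is needed.

```latex
\textbf{Base case } (n = 1):\quad 1 = \frac{1 \cdot (1 + 1)}{2}.

\textbf{Induction case } (n \to n + 1):\quad
\text{assume } 1 + 2 + \dots + n = \frac{n \cdot (n + 1)}{2}. \text{ Then}
\begin{align*}
1 + 2 + \dots + n + (n + 1)
  &= \frac{n \cdot (n + 1)}{2} + (n + 1) \\
  &= \frac{n \cdot (n + 1) + 2 \cdot (n + 1)}{2} \\
  &= \frac{(n + 1) \cdot (n + 2)}{2},
\end{align*}
\text{which is the claimed formula for } n + 1.
```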

Words, Grammars and the Chomsky Hierarchy Languages and Grammars

Words

Alphabet

An alphabet Σ is a finite set of symbols.

Word

A word is a finite string of symbols from Σ.

The set of all words over Σ is denoted by Σ∗.

The empty word (the word of length 0) is denoted by ε.

The set of all non-empty words over Σ is denoted by Σ+.
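Σ∗ is infinite as soon as Σ is non-empty, but its words can be listed up to a given length. A sketch; the function name `words_up_to` and the example alphabet {a, b} are chosen here, and the empty word ε is represented by the empty string.

```python
from itertools import product

def words_up_to(sigma, max_len):
    """All words over the alphabet sigma of length 0..max_len, shortest first."""
    return [
        "".join(letters)
        for length in range(max_len + 1)
        for letters in product(sorted(sigma), repeat=length)
    ]

words = words_up_to({"a", "b"}, 2)
print(words)  # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']

# The same list without the empty word is the part of Σ+ up to length 2.
non_empty = [w for w in words if w != ""]
print(len(non_empty))  # 6
```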

Languages

Language

Let Σ be an alphabet. A (formal) language L over Σ is a set of words over Σ.

That is: L ⊆ Σ∗.

Example languages

Alphabets and languages:

Σ1 = {(, ), +, −, ∗, /, a}
L1 = {w ∈ Σ1∗ | w is an arithmetical expression}

Σ2 = {a, . . . , z, ä, ü, ö, ß, ., ,, :, . . .}
L2 = grammatically correct sentences of German

Σ3 = arbitrary
L3 = ∅, L3′ = {ε}

Typical languages over the alphabet Σ4 = {a, b}:
L4 = {w ∈ Σ4∗ | w contains aba as a subword}
L5 = {a^n b^n | n ∈ N}
L6 = {a^n b^n c^n | n ∈ N} (this language additionally needs the symbol c)

(where x^n = x . . . x, that is, x repeated n times)
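For these example languages, the question "w ∈ L?" is easy to answer with a program. A sketch with function names chosen here; it assumes that 0 ∈ N, so the empty word belongs to L5 and L6.

```python
def in_L4(w):
    """w is a word over {a, b} that contains aba as a subword."""
    return set(w) <= {"a", "b"} and "aba" in w

def in_L5(w):
    """w has the form a^n b^n for some n >= 0."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

def in_L6(w):
    """w has the form a^n b^n c^n for some n >= 0."""
    n = len(w) // 3
    return w == "a" * n + "b" * n + "c" * n

print(in_L4("bbabab"))  # True
print(in_L5("aabb"))    # True
print(in_L5("aab"))     # False
print(in_L6("aabbcc"))  # True
print(in_L6("aabbc"))   # False
```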

Grammars (introduction)

Languages are, in general, infinite: they may contain infinitely many words. We need finite representations ⇒ grammars

Grammars for natural languages: a means of representing all syntactically correct sentences.

Grammars in computer science: finitely many rules which generate all words in the language.

For example: Σ = {der, die, das, kleine, bissige, große, Hund, Katze, jagt}.

〈Satz〉 → 〈Subjekt〉〈Prädikat〉〈Objekt〉
〈Subjekt〉 → 〈Artikel〉〈Attribut〉〈Substantiv〉
〈Artikel〉 → ε
〈Artikel〉 → der
〈Artikel〉 → die
〈Artikel〉 → das
〈Attribut〉 → ε
〈Attribut〉 → 〈Adjektiv〉
〈Attribut〉 → 〈Adjektiv〉〈Attribut〉
〈Adjektiv〉 → kleine
〈Adjektiv〉 → bissige
〈Adjektiv〉 → große
〈Substantiv〉 → Hund
〈Substantiv〉 → Katze
〈Prädikat〉 → jagt
〈Objekt〉 → 〈Artikel〉〈Attribut〉〈Substantiv〉
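The rules above can be written down as a dictionary and used to generate sentences at random. This is only a sketch: the encoding (each non-terminal maps to a list of right-hand sides, ε is the empty list) and the names `GRAMMAR` and `generate` are chosen here for illustration.

```python
import random

# Non-terminals are the keys; every other string is a terminal symbol.
GRAMMAR = {
    "Satz": [["Subjekt", "Prädikat", "Objekt"]],
    "Subjekt": [["Artikel", "Attribut", "Substantiv"]],
    "Artikel": [[], ["der"], ["die"], ["das"]],
    "Attribut": [[], ["Adjektiv"], ["Adjektiv", "Attribut"]],
    "Adjektiv": [["kleine"], ["bissige"], ["große"]],
    "Substantiv": [["Hund"], ["Katze"]],
    "Prädikat": [["jagt"]],
    "Objekt": [["Artikel", "Attribut", "Substantiv"]],
}

def generate(symbol, rng):
    """Expand a symbol into a list of terminals, choosing rules at random."""
    if symbol not in GRAMMAR:          # terminal symbol: nothing to expand
        return [symbol]
    right_side = rng.choice(GRAMMAR[symbol])
    words = []
    for part in right_side:
        words.extend(generate(part, rng))
    return words

rng = random.Random(0)
for _ in range(3):
    print(" ".join(generate("Satz", rng)))
```

Every generated sentence is syntactically correct with respect to the rules, although it need not be correct German (for example "das große Hund jagt Katze").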

[Figure: derivation tree of the sentence “der kleine bissige Hund jagt die große Katze”, with root 〈Satz〉 and children 〈Subjekt〉, 〈Prädikat〉 and 〈Objekt〉.]

Grammars (definition)

Grammars consist of rules of the form

left-hand side → right-hand side

Two kinds of symbols can occur, on the left-hand side as well as on the right-hand side:

Non-terminals (the variables, from which parts of words are derived)

Terminals (the symbols which occur in the actual words)

Page 69: Sommersemester 2014 - ti.inf.uni-due.de

Grammars (definition)

Definition (Grammar)

A grammar G is a 4-tuple G = (V, Σ, P, S) such that the following conditions hold:

V is a finite set of non-terminals (or variables).

Σ is the finite alphabet (the set of terminals). It must hold that V ∩ Σ = ∅, that is, no symbol is both a terminal and a non-terminal.

P is a finite set of rules (also called productions), where P ⊆ (V ∪ Σ)+ × (V ∪ Σ)∗.

S ∈ V is the start variable.
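The 4-tuple can be written down directly as a small data structure. The following is only a sketch and not part of the lecture: the class name `Grammar` and the helper `validate` are assumptions made for illustration, and every symbol is assumed to be a single character so that words are ordinary strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """Hypothetical container for a grammar G = (V, Σ, P, S)."""
    variables: frozenset    # V
    terminals: frozenset    # Σ
    productions: frozenset  # P, a set of pairs (l, r) of strings
    start: str              # S

    def validate(self) -> bool:
        """Check the conditions of the definition (single-character symbols)."""
        symbols = self.variables | self.terminals
        return (
            not (self.variables & self.terminals)  # V ∩ Σ = ∅
            and self.start in self.variables       # S ∈ V
            and all(
                len(l) >= 1                         # l ∈ (V ∪ Σ)+
                and set(l) <= symbols
                and set(r) <= symbols               # r ∈ (V ∪ Σ)∗
                for l, r in self.productions
            )
        )


# A grammar with variables S, B, C and terminals a, b, c.
G = Grammar(
    variables=frozenset("SBC"),
    terminals=frozenset("abc"),
    productions=frozenset({
        ("S", "aSBC"), ("S", "aBC"), ("CB", "BC"), ("aB", "ab"),
        ("bB", "bb"), ("bC", "bc"), ("cC", "cc"),
    }),
    start="S",
)
print(G.validate())
```

Representing P as a set of pairs mirrors the condition P ⊆ (V ∪ Σ)+ × (V ∪ Σ)∗ literally; the empty word ε is the empty string.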

Page 70: Sommersemester 2014 - ti.inf.uni-due.de

Grammars (definition)

What do productions look like?

P ⊆ (V ∪ Σ)+ × (V ∪ Σ)∗

A production from P is a pair (l, r) of words over V ∪ Σ. A production is usually written l → r.

Both l and r consist of variables and terminal symbols.

l cannot be empty (a rule must always replace at least one symbol).

Words from (V ∪ Σ)∗ are also called sentential forms.

Page 71: Sommersemester 2014 - ti.inf.uni-due.de

Grammars (definition)

Conventions:

Variables: A, B, C, …, S, T, …

Terminal symbols: a, b, c, … and 0, 1, …

Words from (V ∪ Σ)∗ (or Σ∗): u, v, w, x, y, z, …

Notation:

The concatenation of two words u, v is denoted by uv.

Page 72: Sommersemester 2014 - ti.inf.uni-due.de

Grammars (example)

Example grammar

G = (V, Σ, P, S) with

V = {S, B, C}
Σ = {a, b, c}
P = {S → aSBC, S → aBC, CB → BC, aB → ab, bB → bb, bC → bc, cC → cc}

Page 73: Sommersemester 2014 - ti.inf.uni-due.de

Grammars (derivations)

How are the productions used to generate words from the start variable S?

Idea: When the grammar contains a production l → r, we may replace l by r.

Example:
Production: CB → BC
Derivation step: aab CB Bcca ⇒ aab BC Bcca, with x = aab, l = CB, r = BC and y = Bcca.

Page 74: Sommersemester 2014 - ti.inf.uni-due.de

Grammars (derivations)

How do we use productions to derive words from the start variable S?

Definition (Derivation step)

Let G = (V, Σ, P, S) be a grammar and let u, v ∈ (V ∪ Σ)∗ be words. We write

u ⇒G v (u directly derives v under G)

if u and v have the form

u = xly and v = xry,

where x, y ∈ (V ∪ Σ)∗ and l → r is a rule in P.
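A single derivation step can be sketched in a few lines of code. This is an illustration under assumptions, not the lecture's own material: the function name `successors` is hypothetical, symbols are single characters, and productions are given as pairs (l, r) of strings. The function collects every v with u ⇒ v by trying each rule at each position where its left-hand side occurs.

```python
def successors(u, productions):
    """Return the set of all v with u ⇒ v, i.e. u = x l y and v = x r y."""
    result = set()
    for l, r in productions:
        start = u.find(l)
        while start != -1:
            # x = u[:start], y = u[start + len(l):]
            result.add(u[:start] + r + u[start + len(l):])
            start = u.find(l, start + 1)
    return result


# Productions of a grammar over the variables S, B, C and terminals a, b, c.
P = [("S", "aSBC"), ("S", "aBC"), ("CB", "BC"), ("aB", "ab"),
     ("bB", "bb"), ("bC", "bc"), ("cC", "cc")]

print(successors("S", P))
print(successors("aabCBBcca", P))
```

The result is a set because the same word v may arise in more than one way, and it may be empty when no left-hand side occurs in u.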

Page 75: Sommersemester 2014 - ti.inf.uni-due.de

Grammars (derivations)

Derivation

A sequence of words w0, w1, w2, …, wn with w0 = S and

w0 ⇒G w1 ⇒G w2 ⇒G · · · ⇒G wn

is a derivation of wn (from S). The wi can contain both terminals and non-terminals. Such words are also called sentential forms.

In this case, we also write w0 ⇒∗G wn.

Page 76: Sommersemester 2014 - ti.inf.uni-due.de

Grammars and languages

Language generated by a grammar

The language generated by a grammar G = (V, Σ, P, S) is:

L(G) = {w ∈ Σ∗ | S ⇒∗G w}.

In other words: The language generated by G consists of those words that can be derived from the start variable S in one or more derivation steps and that consist only of terminal symbols.

Page 77: Sommersemester 2014 - ti.inf.uni-due.de

Which language does the example grammar generate?

Example Grammar

G = (V, Σ, P, S) with

V = {S, B, C}
Σ = {a, b, c}
P consists of:

S → aSBC
S → aBC
CB → BC
aB → ab
bB → bb
bC → bc
cC → cc

The above example grammar G generates the language

L(G) = {a^n b^n c^n | n ≥ 1}.

Here, a^n = a…a (a repeated n times).
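To make the claim plausible, here is one possible derivation of the word aabbcc (the case n = 2), written as a worked example. Each step is annotated with the production that is applied; the choice and order of steps is just one of several that work.

```latex
\begin{align*}
S &\Rightarrow aSBC   && (S \to aSBC)\\
  &\Rightarrow aaBCBC && (S \to aBC)\\
  &\Rightarrow aaBBCC && (CB \to BC)\\
  &\Rightarrow aabBCC && (aB \to ab)\\
  &\Rightarrow aabbCC && (bB \to bb)\\
  &\Rightarrow aabbcC && (bC \to bc)\\
  &\Rightarrow aabbcc && (cC \to cc)
\end{align*}
```

The rule CB → BC sorts the variables so that all B's precede all C's; only then can the remaining rules turn them into terminals from left to right.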

Page 78: Sommersemester 2014 - ti.inf.uni-due.de

Grammars and languages

Remark: Derivation is not a deterministic process but a non-deterministic one. For a word u ∈ (V ∪ Σ)∗ there may be no word v, or more than one word v, with u ⇒G v.

In other words: ⇒G is a relation, not a function.

This non-determinism can have two causes.

Page 79: Sommersemester 2014 - ti.inf.uni-due.de

Grammars and Languages

Two different rules can be applied.

In the example grammar:

S ⇒ aSBC, but also S ⇒ aBC
aaSBCBC ⇒ aaaBCBCBC, but also aaSBCBC ⇒ aaSBBCC

A rule can be applied in two different places.

In the example grammar:

aaaSBCBCBC ⇒ aaaSBBCCBC, but also aaaSBCBCBC ⇒ aaaSBCBBCC

Page 80: Sommersemester 2014 - ti.inf.uni-due.de

Grammars and languages

Further comments:

Derivations can be arbitrarily long without ever reaching a word that consists only of terminal symbols:

S ⇒ aSBC ⇒ aaSBCBC ⇒ aaaSBCBCBC ⇒ …

Sometimes a derivation ends in a deadlock, in which no rule can be applied although the word still contains non-terminal symbols:

S ⇒ aSBC ⇒ aaBCBC ⇒ aabCBC ⇒ aabcBC ⇏

A word is generated by a grammar if there is at least one derivation of the word from the start variable.

Page 81: Sommersemester 2014 - ti.inf.uni-due.de

Backus-Naur-Form

We will use the following shorthand notation for grammars:

When there are rules

A → w1
…
A → wn

we also write

A → w1 | · · · | wn

Page 82: Sommersemester 2014 - ti.inf.uni-due.de

Adventurous grammars

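Σ = {…} (seven pictorial symbols; the icons of the original slide are not reproduced here)

The Door Rule

You can only go through a door if you have found a key before. (This key can be used arbitrarily often.)

G1 = ({K, N, X}, Σ, P1, N), where P1 consists of the following productions (the terminal icons are missing, so some alternatives appear incomplete):

N → XN | K | ε

K → XK | K | K | ε

X → | | | |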

Page 83: Sommersemester 2014 - ti.inf.uni-due.de

Adventurous grammars

Σ = {…} (seven pictorial symbols; the icons of the original slide are not reproduced here)

New Door Rule (Level 2)

The keys are magical and disappear immediately after being used to open a door. As soon as you go through a door, the door is locked again.

G2 = ({S, X}, Σ, P2, S), where P2 consists of the following productions (the terminal icons are missing, so some alternatives appear incomplete):

S → XS | S | S | SS | ε

X → | | | |

Page 84: Sommersemester 2014 - ti.inf.uni-due.de

Adventurous grammars

G1: Grammar for the Door Rule (terminal icons not reproduced):

N → XN | K | ε

K → XK | K | K | ε

X → | | | |

G2: Grammar for the New Door Rule (terminal icons not reproduced):

S → XS | S | S | SS | ε

X → | | | |

Why is G2 more complex than G1? ⇒ Chomsky hierarchy


Page 85: Sommersemester 2014 - ti.inf.uni-due.de

Words, Grammars and the Chomsky Hierarchy The Chomsky Hierarchy

Chomsky Hierarchy

We classify grammars according to the form of their rules:

Chomsky Hierarchy for grammars

Chomsky Type 0: Every grammar is of type 0. (There are no constraints.)

Chomsky Type 1: For all rules l → r it holds that |l| ≤ |r|.

Chomsky Type 2: Additionally, it holds for all rules l → r that l ∈ V. (That is, l is a single variable.)

Chomsky Type 3: Additionally, it holds for all rules l → r that r = a or r = aB, for a ∈ Σ and B ∈ V.

Page 86: Sommersemester 2014 - ti.inf.uni-due.de

Special rule for ε

Special rule for ε (for Type 1, Type 2 and Type 3 grammars)

If S is the start symbol, the rule S → ε is allowed, provided that S does not occur on the right-hand side of any rule.
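The type conditions together with the special rule for ε can be checked mechanically. The sketch below is an illustration, not part of the lecture: the function name `chomsky_type` is hypothetical, symbols are assumed to be single characters, productions are pairs (l, r) of strings, and ε is the empty string. It returns the largest i such that the grammar is of Type i.

```python
def chomsky_type(variables, terminals, productions, start):
    """Return the largest i in {0, 1, 2, 3} such that the grammar is of Type i."""
    # Special rule: S → ε is tolerated if S occurs on no right-hand side.
    s_on_rhs = any(start in r for _, r in productions)
    rules = [(l, r) for l, r in productions
             if not (l == start and r == "" and not s_on_rhs)]

    # Type 1: |l| ≤ |r| for all rules.
    if not all(len(l) <= len(r) for l, r in rules):
        return 0
    # Type 2: additionally, l is a single variable.
    if not all(len(l) == 1 and l in variables for l, _ in rules):
        return 1

    # Type 3: additionally, r = a or r = aB.
    def right_linear(r):
        return (len(r) == 1 and r in terminals) or (
            len(r) == 2 and r[0] in terminals and r[1] in variables)

    if not all(right_linear(r) for _, r in rules):
        return 2
    return 3


V, SIGMA = {"S", "B", "C"}, {"a", "b", "c"}
P = [("S", "aSBC"), ("S", "aBC"), ("CB", "BC"), ("aB", "ab"),
     ("bB", "bb"), ("bC", "bc"), ("cC", "cc")]
print(chomsky_type(V, SIGMA, P, "S"))
```

Because each type adds a condition to the previous one, the checks are nested: the function stops at the first condition that fails.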

Page 87: Sommersemester 2014 - ti.inf.uni-due.de

Chomsky Hierarchy

Grammars (nested classes; each type adds a condition to the previous one):

Type 0: no constraints
Type 1: |l| ≤ |r|
Type 2: l ∈ V
Type 3: r = a or r = aB

Page 88: Sommersemester 2014 - ti.inf.uni-due.de

Chomsky Hierarchy

Names of grammar classes

Type 0: …

Type 1: context-sensitive grammars, monotonic grammars

Type 2: context-free grammars

Type 3: regular grammars

Page 89: Sommersemester 2014 - ti.inf.uni-due.de

Chomsky Hierarchy

Chomsky Hierarchy for languages

A language L ⊆ Σ∗ is of Type i (i ∈ {0, 1, 2, 3}) if there is a Type i grammar G with L(G) = L (that is, L is generated by G).

Names of language classes

Type 0: semi-decidable languages, recursively enumerable languages

Type 1: context-sensitive languages

Type 2: context-free languages, algebraic languages

Type 3: regular languages

Page 90: Sommersemester 2014 - ti.inf.uni-due.de

Chomsky Hierarchy

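[Figure: the grammar classes Type 3 ⊆ Type 2 ⊆ Type 1 ⊆ Type 0 drawn as nested sets, next to the corresponding language classes Type 3 ⊆ Type 2 ⊆ Type 1 ⊆ Type 0, which are in turn contained in the set of all languages]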

Page 92: Sommersemester 2014 - ti.inf.uni-due.de

Chomsky Type of a Grammar and Language

Context-free grammar G2:

S → X | ε
X → aXa | aa

Regular grammar G1:

S → aX | ε
X → aY | a
Y → aX

Both grammars generate the same language:

L(G1) = {a^n | n is even} = L(G2)

Which type does this language have? It is context-free, because it is generated by the context-free grammar G2. It is even regular, because it is also generated by the regular grammar G1.


Page 93: Sommersemester 2014 - ti.inf.uni-due.de

Words, Grammars and the Chomsky Hierarchy Word Problem for Context Sensitive Languages

Word Problem

Word Problem

Let a grammar G (of arbitrary type) and a word w ∈ Σ∗ be given. Decide whether w ∈ L(G).

Decidability of the word problem (Theorem)

The word problem is decidable for Type 1 grammars (and therefore also for regular and context-free grammars). That is: there is a procedure that decides whether w ∈ L(G).

Page 94: Sommersemester 2014 - ti.inf.uni-due.de

Word Problem for Type 1 Languages

Algorithm to solve the word problem for Type 1 grammars; it returns “true” if and only if w ∈ L(G).

input (G, w)
T := {S}
repeat
    T′ := T
    T := T′ ∪ {u | |u| ≤ |w| and u′ ⇒ u for some u′ ∈ T′}
until w ∈ T or T = T′
return w ∈ T
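The pseudocode translates almost line by line into a runnable program. The following is a sketch under assumptions rather than the lecture's reference implementation: the function name `word_problem` is hypothetical, symbols are single characters, and productions are pairs (l, r) of strings. The loop terminates because there are only finitely many words of length at most |w|, and the bound is safe because rules of a Type 1 grammar never shorten a word.

```python
def word_problem(productions, start, w):
    """Decide w ∈ L(G) for a Type 1 grammar by saturating the set T."""
    def step(u):
        # All v with u ⇒ v and |v| ≤ |w|.
        out = set()
        for l, r in productions:
            i = u.find(l)
            while i != -1:
                v = u[:i] + r + u[i + len(l):]
                if len(v) <= len(w):
                    out.add(v)
                i = u.find(l, i + 1)
        return out

    T = {start}
    while True:
        T_old = T
        T = T_old | {v for u in T_old for v in step(u)}
        if w in T or T == T_old:
            return w in T


P = [("S", "aSBC"), ("S", "aBC"), ("CB", "BC"), ("aB", "ab"),
     ("bB", "bb"), ("bC", "bc"), ("cC", "cc")]
print(word_problem(P, "S", "aabbcc"))
print(word_problem(P, "S", "aabbc"))
```

The procedure explores all sentential forms up to the length bound, so its running time can grow exponentially in |w|; it shows decidability, not efficiency.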


Page 95: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages

Page 96: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages

We will study regular languages for the next few weeks:

deterministic and non-deterministic finite automata

regular expressions

proving that a language is not regular: the Pumping Lemma

minimal automata and acceptance equivalence

closure properties and decision procedures


Page 97: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Finite Automata

Finite Automata

In this part we study regular languages, but first from a different viewpoint. Instead of Type 3 grammars we consider state-based automaton models, which can be viewed as “language acceptors”.

[Figure: a finite automaton with two states z1 and z2 and transitions labelled a and b]

Page 98: Sommersemester 2014 - ti.inf.uni-due.de

Deterministic Finite Automata

Graphical notation:

State: z
Initial state: z0
Final state: zE
Transition: from z1 to z2, labelled a

Example (Σ = {a, b}):

[Figure: a finite automaton with two states z1 and z2 and transitions labelled a and b]

Page 99: Sommersemester 2014 - ti.inf.uni-due.de

Run of a Deterministic Finite Automaton

Input: b a a b a a b

[Figure: step-by-step run of the two-state automaton (states z1 and z2) on this input; the input is accepted]

Page 107: Sommersemester 2014 - ti.inf.uni-due.de

Deterministic Finite Automata

Informal definition:

A deterministic finite automaton (DFA) consists of

states (exactly one of which is the initial state; some are final states)

a transition function

The following conditions hold:

The alphabet and the state set are finite.

The transition function maps each pair of a state and an alphabet symbol to exactly one successor state.

A word is accepted by the DFA if, starting at the initial state, the DFA reaches a final state after reading the word.

Page 108: Sommersemester 2014 - ti.inf.uni-due.de

Deterministic Finite Automata

Deterministic Finite Automaton (definition)

A (deterministic) finite automaton (DFA) M is a 5-tuple M = (Z, Σ, δ, z0, E), where

Z is the set of states,

Σ is the input alphabet (with Z ∩ Σ = ∅),

z0 ∈ Z is the initial state,

E ⊆ Z is the set of final states and

δ : Z × Σ → Z is the transition function.

Z and Σ must be finite sets.

Page 109: Sommersemester 2014 - ti.inf.uni-due.de

Deterministic Finite Automata

The transition function δ reads only a single symbol at a time. Therefore, we generalize it to a transition function δ̂ which reads entire words.

Multi-step transitions

For a given DFA M = (Z, Σ, δ, z0, E) we inductively define the function δ̂ : Z × Σ∗ → Z as follows:

δ̂(z, ε) = z

δ̂(z, ax) = δ̂(δ(z, a), x)

where z ∈ Z, x ∈ Σ∗ and a ∈ Σ.

Page 110: Sommersemester 2014 - ti.inf.uni-due.de

Deterministic Finite Automata

Accepted Language

The language accepted by a DFA M = (Z, Σ, δ, z0, E) is

T(M) = {x ∈ Σ∗ | δ̂(z0, x) ∈ E},

where δ̂ is the multi-step extension of δ to words.

In other words: The language can be obtained by enumerating all paths from the initial state to a final state and, for each path, concatenating the symbols on its transitions.
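The multi-step transition function and the acceptance condition can be sketched in code. This is an illustration under assumptions: the names `delta_hat` and `accepts` and the example automaton (words with an even number of a's) are hypothetical and not taken from the slides. The transition function δ is stored as a dictionary from (state, symbol) pairs to states.

```python
def delta_hat(delta, z, word):
    """Multi-step transitions: δ̂(z, ε) = z and δ̂(z, ax) = δ̂(δ(z, a), x)."""
    for a in word:
        z = delta[(z, a)]
    return z


def accepts(delta, z0, final, word):
    """x ∈ T(M) if and only if δ̂(z0, x) ∈ E."""
    return delta_hat(delta, z0, word) in final


# Hypothetical example DFA over {a, b}: accepts words with an even number of a's.
delta = {
    ("even", "a"): "odd", ("even", "b"): "even",
    ("odd", "a"): "even", ("odd", "b"): "odd",
}
print(accepts(delta, "even", {"even"}, "baab"))
print(accepts(delta, "even", {"even"}, "ab"))
```

The loop is the iterative form of the inductive definition: it consumes the first symbol, moves to the successor state and continues with the rest of the word. One dictionary lookup per symbol is all that is needed, which is why the word problem is cheap for DFAs.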

Page 111: Sommersemester 2014 - ti.inf.uni-due.de

Deterministic Finite Automata (example)

Let Σ = {a, b}.

Which language does the following DFA accept?

[Figure: DFA with the states z0, z1 and z2; its transitions are labelled a, a, (a, b), b and b]

Construct a DFA which accepts the following language:

L = {x ∈ Σ∗ | x starts with a and ends with b}

Page 112: Sommersemester 2014 - ti.inf.uni-due.de

Deterministic Finite Automata

DFAs → regular languages (Theorem)

Each language accepted by a DFA is regular.

Idea: States correspond to variables, and transitions correspond to productions.

Formally: We construct the grammar G = (V, Σ, P, S), where V = Z, S = z0 and P contains the following productions:

If ε ∈ T(M), then P contains the production S → ε.

For all z1 ∈ Z and a ∈ Σ:

If δ(z1, a) = z2, then (z1 → az2) ∈ P.
If additionally z2 ∈ E, then (z1 → a) ∈ P.
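The construction can be sketched as a short function. This is an illustration only: the name `dfa_to_grammar` and the example DFA are assumptions. Right-hand sides are represented as tuples of symbols, so that state names with several characters can serve as variables; the empty tuple stands for ε.

```python
def dfa_to_grammar(states, alphabet, delta, z0, final):
    """Return (V, Σ, P, S) with one variable per state of the DFA."""
    productions = set()
    if z0 in final:                          # ε ∈ T(M)
        productions.add((z0, ()))            # S → ε
    for z1 in states:
        for a in alphabet:
            z2 = delta[(z1, a)]
            productions.add((z1, (a, z2)))   # z1 → a z2
            if z2 in final:
                productions.add((z1, (a,)))  # z1 → a
    return set(states), set(alphabet), productions, z0


# Hypothetical example DFA over {a, b}: accepts words with an even number of a's.
delta = {
    ("even", "a"): "odd", ("even", "b"): "even",
    ("odd", "a"): "even", ("odd", "b"): "odd",
}
V, sigma, P, S = dfa_to_grammar({"even", "odd"}, {"a", "b"}, delta, "even", {"even"})
for left, right in sorted(P):
    print(left, "→", " ".join(right) or "ε")
```

A derivation in the resulting grammar follows a run of the automaton: the single variable at the end of the sentential form records the current state, and a rule z1 → a ends the derivation exactly when the run ends in a final state.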

Page 113: Sommersemester 2014 - ti.inf.uni-due.de

Non-deterministic Finite Automata

As opposed to grammars, there are no non-deterministic effects in DFAs. This means that, as soon as the next symbol is read, it is clear which state will be the next one.

But: in many cases, it is more natural to consider non-deterministic transitions. This often leads to smaller and clearer automata.

[Figure: a state z1 with two transitions labelled a, one to z2 and one to z3]

Page 114: Sommersemester 2014 - ti.inf.uni-due.de

Non-deterministic Finite Automata (Idea)

In a non-deterministic finite automaton there are, for each pair (z, a) of a state z and a symbol a, zero, one or several successor states.

[Figure: NFA with the states z0, z1, z2 and zE; transitions labelled a, b and c lead from z0 via z1 and z2 to zE, and two further transitions are labelled a, b, c]

A non-deterministic automaton chooses (non-deterministically) one of the possible successor states. A word is accepted by the automaton if, starting from an initial state, it can reach a final state after reading the word (provided that the automaton always chooses “correctly”).

Page 115: Sommersemester 2014 - ti.inf.uni-due.de

Run of a Non-deterministic Automaton

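Input: a b a b b a b

[Figure: step-by-step run of an NFA with the states z0, z1, z2 and z3 and transitions labelled a and b on this input; the input is accepted]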

Page 124: Sommersemester 2014 - ti.inf.uni-due.de

Non-deterministic Finite Automata

Definition: Non-deterministic Finite Automaton

A non-deterministic finite automaton (NFA) M is a 5-tuple M = (Z, Σ, δ, S, E), where

Z is the set of states,

Σ is the input alphabet (with Z ∩ Σ = ∅),

S ⊆ Z is the set of initial states,

E ⊆ Z is the set of final states and

δ : Z × Σ → P(Z) is the transition function.

Z and Σ must be finite.

Page 125: Sommersemester 2014 - ti.inf.uni-due.de

Non-deterministic Finite Automata

The transition function δ is again extended to a multi-step transition function δ̂:

Multi-step transitions

Let M = (Z, Σ, δ, S, E) be an NFA. We inductively define the function δ̂ : P(Z) × Σ∗ → P(Z) as follows:

δ̂(Z′, ε) = Z′

δ̂(Z′, ax) = ⋃_{z ∈ Z′} δ̂(δ(z, a), x)

where Z′ ⊆ Z, x ∈ Σ∗ and a ∈ Σ.

Page 126: Sommersemester 2014 - ti.inf.uni-due.de

Non-deterministic Finite Automata

Accepted Language

The language accepted by an NFA M = (Z, Σ, δ, S, E) is:

T(M) = {x ∈ Σ∗ | δ̂(S, x) ∩ E ≠ ∅},

where δ̂ is the multi-step extension of δ to sets of states and words.

In other words: a word w is accepted if there exists a path from an initial state to a final state whose transitions are labelled with the symbols of w. (It is possible that there are several such paths.)
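An NFA can be simulated by tracking the set of states it may currently be in. The sketch below is an illustration under assumptions: the names `nfa_delta_hat` and `nfa_accepts` and the example automaton (words over {a, b, c} that contain abc as a subword) are hypothetical. The transition function is a dictionary from (state, symbol) pairs to sets of states; a missing entry stands for the empty set.

```python
def nfa_delta_hat(delta, states, word):
    """Multi-step transitions on sets: start with Z′ and read the word symbol by symbol."""
    current = set(states)
    for a in word:
        # Union of δ(z, a) over all states z the NFA may currently be in.
        current = {z2 for z in current for z2 in delta.get((z, a), set())}
    return current


def nfa_accepts(delta, initial, final, word):
    """x ∈ T(M) if and only if δ̂(S, x) ∩ E ≠ ∅."""
    return bool(nfa_delta_hat(delta, initial, word) & set(final))


# Hypothetical NFA over {a, b, c}: accepts words containing abc as a subword.
delta = {
    ("z0", "a"): {"z0", "z1"}, ("z0", "b"): {"z0"}, ("z0", "c"): {"z0"},
    ("z1", "b"): {"z2"},
    ("z2", "c"): {"zE"},
    ("zE", "a"): {"zE"}, ("zE", "b"): {"zE"}, ("zE", "c"): {"zE"},
}
print(nfa_accepts(delta, {"z0"}, {"zE"}, "cabca"))
print(nfa_accepts(delta, {"z0"}, {"zE"}, "acb"))
```

The loop takes the union over all current states after each symbol, which yields the same result as the inductive definition; no guessing of the “correct” path is needed.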

Page 127: Sommersemester 2014 - ti.inf.uni-due.de

Non-deterministic Finite Automata

Differences between DFAs and NFAs

DFA: δ(z, a) ∈ Z
NFA: δ(z, a) ∈ P(Z)

In a DFA there exists, for each state z and alphabet symbol a, exactly one successor state.

In an NFA it is allowed that for some state z and alphabet symbol a there is more than one successor state: δ(z, a) = {z1, z2, …}.

In an NFA it is allowed that for some state z and alphabet symbol a there is no successor state: δ(z, a) = ∅.

A DFA is a restricted kind of NFA. We only need a small translation: δ(z, a) = z′ becomes δ(z, a) = {z′}.

Page 128: Sommersemester 2014 - ti.inf.uni-due.de

Non-deterministic Finite Automata

Let Σ = {a, b}.

Which language does the following NFA accept?

[Figure: NFA with the states 1, 2, 3 and 4; its transitions are labelled (a, b), a, (a, b) and (a, b)]

We try to find an NFA which accepts the following language:

L = {x | x starts with a and ends with b}

Page 129: Sommersemester 2014 - ti.inf.uni-due.de

From NFAs to DFAs

NFAs → DFAs (Theorem)

Every language which is accepted by an NFA is also accepted by a DFA.

Idea: We let the various “parallel universes” be simulated by a single automaton. This automaton remembers in which states the NFA can currently be.

That is, the states of the DFA are sets of states of the original NFA. For this reason, the construction is called the subset construction.

Page 130: Sommersemester 2014 - ti.inf.uni-due.de

Run of a Non-deterministic Finite Automaton

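Input: a b a b b a b

[Figure: step-by-step run of the NFA with the states z0, z1, z2 and z3 on this input, tracking all states the NFA can be in after each symbol; the input is accepted]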

Page 138: Sommersemester 2014 - ti.inf.uni-due.de

From NFAs to DFAs

Subset construction:

Let an NFA M = (Z, Σ, δ, S, E) be given. We construct a DFA M′ = (𝒵, Σ, δ′, z′0, E′) with:

𝒵 = P(Z)

δ′(Z′, a) = δ̂(Z′, a) for Z′ ⊆ Z, where δ̂ is the multi-step transition function of M

z′0 = S

E′ = {Z′ ⊆ Z | Z′ ∩ E ≠ ∅}

It holds that T(M) = T(M′).
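The subset construction can be sketched as follows. This is an illustration under assumptions, with a hypothetical function name and example NFA. Unlike the definition above, which takes all of P(Z) as the state set, the sketch only builds the subsets that are reachable from the initial state; this is a common practical shortcut and does not change the accepted language.

```python
from itertools import chain


def subset_construction(alphabet, delta, initial, final):
    """Build the reachable part of the DFA M′ from an NFA M."""
    start = frozenset(initial)                 # z′0 = S
    dfa_delta, todo, seen = {}, [start], {start}
    while todo:
        current = todo.pop()
        for a in alphabet:
            # δ′(Z′, a) is the union of δ(z, a) over all z ∈ Z′.
            target = frozenset(chain.from_iterable(
                delta.get((z, a), ()) for z in current))
            dfa_delta[(current, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    dfa_final = {s for s in seen if s & set(final)}   # Z′ ∩ E ≠ ∅
    return seen, dfa_delta, start, dfa_final


# Hypothetical NFA over {a, b}: accepts words that end with ab.
nfa_delta = {("q0", "a"): {"q0", "q1"}, ("q0", "b"): {"q0"}, ("q1", "b"): {"q2"}}
dfa_states, dfa_delta, dfa_start, dfa_final = subset_construction(
    "ab", nfa_delta, {"q0"}, {"q2"})
print(len(dfa_states))
```

For this example only three of the 2^3 = 8 possible subsets are reachable, which illustrates that the exponential bound is a worst case.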

Page 139: Sommersemester 2014 - ti.inf.uni-due.de

From NFAs to DFAs

Remarks about the subset construction:

Because |P(Z)| = 2^|Z|, the DFA has exponentially many states (in the worst case). Sometimes the automaton can be minimized afterwards.

However, in many cases even the smallest DFA which accepts a language is exponentially larger than the smallest NFA.

Page 140: Sommersemester 2014 - ti.inf.uni-due.de

NFAs, DFAs and Regular Grammars

We can now

convert NFAs to DFAs

convert DFAs to regular grammars

The direction regular grammar → NFA is still missing.

[Figure: diagram with the nodes regular grammar, DFA and NFA and the conversions shown so far]

Page 141: Sommersemester 2014 - ti.inf.uni-due.de

NFAs, DFAs and Regular Grammars

Regular grammars → NFAs (Theorem)

For each regular grammar G there is an NFA M such that L(G ) = T (M).

Construction. Let G = (V, Σ, P, S) be a regular grammar. We construct the NFA M = (Z, Σ, δ, S′, E), where

Z = V ∪ {X}, X ∉ V

S′ = {S}

E = {S, X} if (S → ε) ∈ P
E = {X} if (S → ε) ∉ P

B ∈ δ(A, a) if (A → aB) ∈ P

X ∈ δ(A, a) if (A → a) ∈ P

It holds that T(M) = L(G).
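The construction can be sketched in code. This is an illustration under assumptions: the function name `grammar_to_nfa` is hypothetical, symbols are single characters, productions are pairs (A, r) with r equal to "", "a" or "aB", and the additional final state is called "#" here (assumed not to be a variable) to avoid a clash with grammars that already use a variable X.

```python
FRESH = "#"  # plays the role of the new state X; assumed not to be in V


def grammar_to_nfa(variables, productions, start):
    """Build the NFA (Z, δ, S′, E) for a regular grammar."""
    delta = {}
    for A, r in productions:
        if len(r) == 2:      # A → aB gives B ∈ δ(A, a)
            delta.setdefault((A, r[0]), set()).add(r[1])
        elif len(r) == 1:    # A → a gives X ∈ δ(A, a)
            delta.setdefault((A, r), set()).add(FRESH)
    final = {start, FRESH} if (start, "") in productions else {FRESH}
    return set(variables) | {FRESH}, delta, {start}, final


# A regular grammar for {a^n | n is even}: S → aX | ε, X → aY | a, Y → aX.
P = {("S", "aX"), ("S", ""), ("X", "aY"), ("X", "a"), ("Y", "aX")}
states, delta, initial, final = grammar_to_nfa({"S", "X", "Y"}, P, "S")
print(sorted(states), sorted(final))
```

The state of the NFA plays the role of the variable at the end of the sentential form, and moving to the additional final state corresponds to finishing the derivation with a rule A → a.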

Page 142: Sommersemester 2014 - ti.inf.uni-due.de

Brief Summary

We can now transform deterministic finite automata (DFAs) into regular grammars, regular grammars into non-deterministic finite automata (NFAs), and NFAs into DFAs.

[Figure: cycle of conversions: regular grammar → NFA → DFA → regular grammar]

Result: Regular grammars, deterministic finite automata and non-deterministic finite automata describe the same class of languages, namely the class of regular languages.

Page 143: Sommersemester 2014 - ti.inf.uni-due.de

Brief Summary

Advantages and disadvantages of the formalisms:

Regular grammars

provide the connection to the Chomsky hierarchy
are used to generate languages
are non-deterministic → less efficient for deciding whether a certain word is in the language

NFAs

allow short, compact representations of languages
have an intuitive, graphical representation
are non-deterministic → less efficient for deciding whether a certain word is in the language

Page 144: Sommersemester 2014 - ti.inf.uni-due.de

Brief Summary

DFAs

can be exponentially larger than equivalent NFAs
allow for an efficient solution of the word problem (we only need to follow the transitions of the automaton and check whether a final state is reached)

However, all of these models take a lot of effort and space to write down. Thus, we need a more compact representation: so-called regular expressions.


Page 145: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Regular Expressions

Regular Expressions

Regular expression

Regular expressions are defined inductively as follows:

∅, ε and a (where a ∈ Σ) are regular expressions;

if α and β are regular expressions, then

(α | β), (αβ) and (α∗)

are regular expressions as well;

everything which cannot be generated by the above rules is not a regular expression.

Remark: Instead of (α | β), the notation (α + β) is also common.

Page 146: Sommersemester 2014 - ti.inf.uni-due.de

Regular Expressions

Now that we have fixed the syntax of regular expressions, we must determine their meaning, that is, which regular expression describes which language.

Language of a regular expression

L(∅) = ∅
L(ε) = {ε}
L(a) = {a}

L(α | β) = L(α) ∪ L(β)

L(αβ) = L(α)L(β), where L1L2 = {w1w2 | w1 ∈ L1, w2 ∈ L2} for two languages L1, L2.

L((α)∗) = (L(α))∗, where L∗ = {w1 … wn | n ∈ N0, wi ∈ L} for a language L.

Page 147: Sommersemester 2014 - ti.inf.uni-due.de

Regular Expressions

Let Σ = {a, b, c}

α1 = (ab | ba)
α2 = (ab | ba)∗
α3 = (ab | ba)c∗
α4 = (a | b | c)∗abc(a | b | c)∗

L5 = language of all words that start with a and end with bb
L6 = language of all words that contain an even number of a’s
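The examples can be tried out with the regular expression engine of a programming language. The sketch below uses Python's standard `re` module, whose syntax is a superset of the operators |, concatenation and ∗ used here. The expressions for L5 and L6 are possible answers chosen for illustration, not solutions taken from the slides, and the helper name `in_language` is hypothetical.

```python
import re

alpha1 = re.compile(r"ab|ba")
alpha2 = re.compile(r"(ab|ba)*")
alpha3 = re.compile(r"(ab|ba)c*")
alpha4 = re.compile(r"(a|b|c)*abc(a|b|c)*")

# Possible expressions for L5 and L6 (assumptions, not from the slides).
l5 = re.compile(r"a(a|b|c)*bb")
l6 = re.compile(r"((b|c)*a(b|c)*a)*(b|c)*")


def in_language(pattern, word):
    """A word is in the language if the expression matches the whole word."""
    return pattern.fullmatch(word) is not None


print(in_language(alpha2, "abba"))
print(in_language(alpha4, "acb"))
print(in_language(l6, "abca"))
```

`fullmatch` is used instead of `search` because L(α) contains only the words described by α in their entirety, not every word that has a matching part.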

Page 148: Sommersemester 2014 - ti.inf.uni-due.de

Regular Expressions

[Figure: diagram relating the four formalisms regular grammar, DFA, NFA and regular expression]

Page 150: Sommersemester 2014 - ti.inf.uni-due.de

Regular Expression → NFAs

Regular expressions → NFAs

For each regular expression γ there exists an NFA M with L(γ) = T (M).

Proof by induction on the structure of γ.

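Base step: For γ = ∅, γ = ε and γ = a there are obvious corresponding automata:

[Diagram: three automata: one with a single state z, one with a single state z, and one with an a-transition from z1 to z2]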


Regular Expression → NFA

Induction step

Let γ be an arbitrary composite regular expression.

This means that γ has one of the following forms:

γ = α | β
γ = αβ
γ = α∗

where α and β are shorter (and therefore smaller) regular expressions. By the induction hypothesis we can assume that there are automata Mα and Mβ such that L(α) = T(Mα) and L(β) = T(Mβ).


Regular Expression → NFA

Case 1: Let γ = (α | β). We have Mα and Mβ with T(Mα) = L(α) and T(Mβ) = L(β). We construct M with T(M) = L(α | β).

The state set of M is the disjoint union of both state sets. Also, the set of initial states of M is the union of the two sets of initial states, and the set of final states of M is the union of both sets of final states.

All transitions of Mα and Mβ are maintained.

Then it holds that T(M) = T(Mα) ∪ T(Mβ) = L(α) ∪ L(β).

[Diagram: Mα with initial states Sα and final states Eα, next to Mβ with initial states Sβ and final states Eβ]


Regular Expression → NFA

Case 2: Let γ = αβ.

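[Diagram: Mα (initial states Sα, final states Eα) followed by Mβ (initial states Sβ, final states Eβ); an a-transition into Eα is accompanied by a new a-transition into Sβ]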

It holds that T (M) = T (Mα)T (Mβ) = L(α)L(β)


Regular Expression → NFA

Case 2: Let γ = αβ. There are automata Mα and Mβ with T(Mα) = L(α) and T(Mβ) = L(β). We compose these automata as follows:

The state set of M is the disjoint union of the state sets of Mα and Mβ. M has the same initial states as Mα and the same final states as Mβ. (When ε ∈ L(α), then the initial states of Mβ are also initial states of M.)

All transitions of Mα and Mβ are preserved.

All states of Mα that have a transition to a final state of Mα obtain an additional transition with the same label to all initial states of Mβ.


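Regular Expression → NFA

Case 3: Let γ = (α)∗.

[Diagram: Mα with initial states Sα and final states Eα; an a-transition into Eα is accompanied by a new a-transition back into Sα; possibly one additional state]

It holds that T(M) = (T(Mα))∗ = (L(α))∗.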


Regular Expression → NFA

Case 3: Let γ = (α)∗. There is an automaton Mα with T(Mα) = L(α). From this automaton we construct the automaton M as follows:

All states and all initial and final states are preserved.

Additionally, all states which have an arrow to a final state of Mα obtain a new transition with the same label to all initial states of Mα.

When ε ∉ T(Mα), there is an additional state which is both initial and final.
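The three cases of the induction can be sketched in code. The following is an illustrative sketch under assumptions that are not in the slides: an NFA is a tuple (states, initial states, final states, transitions) without ε-transitions, regular expressions are nested tuples, and fresh state names come from a counter.

```python
import itertools

_fresh = itertools.count()  # supplies fresh state names, so unions stay disjoint

def base(kind, symbol=None):
    if kind == "empty":                       # one initial state, no final state
        z = next(_fresh)
        return ({z}, {z}, set(), set())
    if kind == "eps":                         # one state, both initial and final
        z = next(_fresh)
        return ({z}, {z}, {z}, set())
    p, q = next(_fresh), next(_fresh)         # z1 --a--> z2
    return ({p, q}, {p}, {q}, {(p, symbol, q)})

def accepts_eps(m):
    return bool(m[1] & m[2])                  # some initial state is final

def union(m1, m2):                            # Case 1
    return tuple(a | b for a, b in zip(m1, m2))

def concat(m1, m2):                           # Case 2
    initials = m1[1] | (m2[1] if accepts_eps(m1) else set())
    bridge = {(p, a, s) for (p, a, q) in m1[3] if q in m1[2] for s in m2[1]}
    return (m1[0] | m2[0], initials, set(m2[2]), m1[3] | m2[3] | bridge)

def star(m):                                  # Case 3
    states, initials, finals, trans = (set(part) for part in m)
    back = {(p, a, s) for (p, a, q) in trans if q in finals for s in initials}
    if not accepts_eps(m):                    # additional initial and final state
        z = next(_fresh)
        states.add(z); initials.add(z); finals.add(z)
    return (states, initials, finals, trans | back)

def build(regex):
    kind = regex[0]
    if kind in ("empty", "eps"):
        return base(kind)
    if kind == "sym":
        return base("sym", regex[1])
    if kind == "alt":
        return union(build(regex[1]), build(regex[2]))
    if kind == "cat":
        return concat(build(regex[1]), build(regex[2]))
    if kind == "star":
        return star(build(regex[1]))
    raise ValueError(f"unknown node: {kind}")

def accepts(m, word):
    current = set(m[1])
    for a in word:
        current = {q for (p, b, q) in m[3] if p in current and b == a}
    return bool(current & m[2])

ab = ("cat", ("sym", "a"), ("sym", "b"))
ba = ("cat", ("sym", "b"), ("sym", "a"))
m3 = build(("cat", ("alt", ab, ba), ("star", ("sym", "c"))))   # (ab | ba)c∗
print(accepts(m3, "bacc"), accepts(m3, "abba"))
```

The function accepts simulates the NFA by tracking the set of states reachable after each symbol, which is enough to compare the constructed automaton against the expected language on sample words.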


Regular Expressions

[Diagram: the four formalisms regular grammar, DFA, NFA and regular expression]


NFA → Regular Expression

NFAs → Regular expressions

For each NFA M there is a regular expression γ with T (M) = L(γ).


NFA → Regular Expression

We use the following state elimination algorithm, which transforms an NFA M into a regular expression. As intermediate steps we obtain automata whose transitions are labelled with regular expressions instead of alphabet symbols.

[Diagram: a transition from z1 to z2 labelled with a regular expression α]


NFA → Regular Expression

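Step 1: First we add a new initial state and a new final state and connect them to the old initial and final states by transitions labelled with ε.

[Diagram: ε-transitions from the new initial state to the old initial states S, and ε-transitions from the old final states E to the new final state]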


NFA → Regular Expression

Now we non-deterministically apply transformation rules that decrease the size of the automaton, while making sure that the new automaton still accepts the same language.

In the end, only the initial and the final state remain, connected by a single arrow which is labelled with the sought-after regular expression.

[Diagram: a single transition from z1 to z2 labelled γ]


NFA → Regular Expression

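Rule V: Two parallel arrows with the labels α and β can be fused into a single transition labelled with α | β.

[Diagram: two transitions from z1 to z2 labelled α and β ⇒ one transition from z1 to z2 labelled (α | β)]

The same holds if a state has two loops.

[Diagram: two loops at z labelled α and β ⇒ one loop at z labelled (α | β)]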


NFA → Regular Expression

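Rule S: Loops are removed by adding their label (augmented with ∗) to the labels of the following transitions, as follows:

[Diagram: a state z with a loop labelled α and outgoing transitions β1, …, βn to x1, …, xn ⇒ the state z without the loop and with outgoing transitions (α)∗β1, …, (α)∗βn to x1, …, xn]

This is only allowed if there is only a single loop on the state.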


NFA → Regular Expression

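Rule E: A state z is eliminated by connecting the states with transitions to z and the states with transitions from z with one another, as follows:

[Diagram: states x1, …, xn with transitions α1, …, αn into z, and transitions β1, …, βm from z to y1, …, ym ⇒ z is removed and every xi is connected to every yj by a transition labelled αiβj, that is, α1β1, …, α1βm, …, αnβ1, …, αnβm]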


NFA → Regular expression

Rule E may only be applied, if:

there is no loop attached to the removed state z and

the state z has at least one incoming and at least one outgoing edge.


NFA → Regular Expression

As soon as no rule can be applied any more, we have in general the following situation (and sometimes some additional dead ends and unreachable states):

[Diagram: a single transition from z1 to z2 labelled γ]

Then γ is the sought-after regular expression.

When there is no transition between the initial and the final state, then γ = ∅.
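The rules V, S and E can be combined into one elimination step per state. The sketch below is only one possible way to organise the algorithm and makes several assumptions of its own: transitions are triples (p, symbol, q), the result is written in Python's `re` syntax (ε is the empty pattern, `(?:…)` is plain grouping), the empty language is reported as None, and states without incoming or outgoing edges are simply dropped.

```python
import re
from itertools import product

def nfa_to_regex(states, initials, finals, transitions):
    """State elimination; returns a Python regex string, or None for ∅."""
    START, END = object(), object()            # Step 1: new initial/final state
    edges = {}

    def add(p, label, q):                      # Rule V: fuse parallel arrows
        if (p, q) in edges:
            label = f"(?:{edges[(p, q)]}|{label})"
        edges[(p, q)] = label

    for p, a, q in transitions:
        add(p, re.escape(a), q)
    for s in initials:
        add(START, "", s)                      # ε-transition
    for e in finals:
        add(e, "", END)                        # ε-transition

    for z in states:
        loop = edges.pop((z, z), None)         # Rule S: loop becomes (α)∗
        star = f"(?:{loop})*" if loop is not None else ""
        ins = [(p, l) for (p, q), l in edges.items() if q == z]
        outs = [(q, l) for (p, q), l in edges.items() if p == z]
        for p, _ in ins:
            del edges[(p, z)]
        for q, _ in outs:
            del edges[(z, q)]
        for p, alpha in ins:                   # Rule E: connect predecessors
            for q, beta in outs:               # with successors
                add(p, f"(?:{alpha}){star}(?:{beta})", q)

    return edges.get((START, END))

# NFA for the words over {0, 1} that end with 1.
regex = nfa_to_regex({1, 2}, {1}, {2}, {(1, "0", 1), (1, "1", 1), (1, "1", 2)})
for n in range(4):
    for letters in product("01", repeat=n):
        w = "".join(letters)
        print(repr(w), bool(re.fullmatch(regex, w)))
```

The resulting expression is usually far from minimal; the loop at the end merely compares it with the expected language on all short words.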


Regular Expressions

Practical applications of regular expressions:

Search and replace in text editors (Tools: vi, emacs, …)

Pattern matching and processing of large databases, for example for data mining. (Tools: grep, sed, awk, perl, …)

Translation of programming languages: lexical analysis, that is, the transformation of a string of symbols (the program) into a sequence of tokens in which keywords, identifiers, data, etc. are already identified. (Tools: lex, flex, …)


Regular Expressions in Practice

POSIX ERE-Syntax:

All symbols which do not mean something else are atomic symbols. For symbols with other meanings: \( = (, \[ = [, etc.

“αβ” “α|β” “α*”

“α?” ≡ (α | ε)
“α+” ≡ αα∗
“α{n}” ≡ α … α (n times α)

“.” ≡ a | · · · | z | 1 | · · · | 9 | % | # | · · ·
“[a1 … an]” ≡ (a1 | · · · | an)

“[^a1 … an]” ≡ any character except a1 … an

(,) group and store (useful for replacing; stored groups are denoted by \1, \2, …)

. . .


Regular Expressions in Practice

Goal: In an HTML file there are wiki links of the form [[text]]. We want to replace such links by HTML hyperlinks.

In HTML, a hyperlink looks as follows: <a href="A"> T </a>

So we want to replace [[x]] by <a href="x.html">x</a>.
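One way to perform this replacement is sketched below with Python's `re` module. This is an illustration, not the solution from the lecture: Python's syntax differs in details from POSIX ERE (for example in how `]` is written inside a bracket expression), and the function name `wiki_to_html` is made up for this sketch.

```python
import re

# \[\[ and \]\] match the literal double brackets; the group ([^\]]*) stores
# the link text, which must not contain a closing bracket.
WIKI_LINK = re.compile(r"\[\[([^\]]*)\]\]")

def wiki_to_html(text):
    """Replace every [[x]] by <a href="x.html">x</a>."""
    return WIKI_LINK.sub(r'<a href="\1.html">\1</a>', text)

print(wiki_to_html("See [[Home]] and [[About]]."))
```

The stored group is referenced twice in the replacement, once for the file name and once for the visible link text.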


Summary: Describe Regular Languages

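[Diagram: conversions between regular grammar, DFA, NFA and regular expression]

Between automata and regular grammars: arrow → production, production → arrow
NFA → DFA: subset construction
Regular expression → NFA: inductive algorithm
NFA → regular expression: state elimination algorithm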


The Pumping Lemma

We have now looked at four formalisms with which regular languages can be described: regular grammars, DFAs, NFAs and regular expressions.

Now we will learn about a proof method to show that a certain language is not regular.

[Diagram: the regular languages (regular grammars, DFAs, NFAs, regular expressions) inside the set of all languages, with the label "Pumping Lemma"]


Regular Languages The Pumping Lemma


The Pigeon Hole Principle

Pigeon hole principle

When you want to distribute m objects over n sets and m > n, then there must be at least one set which contains at least two objects.

The pigeon hole principle for finite automata

When an automaton with n states has a path of length m and m ≥ n, then there is at least one state which occurs on the path at least twice.


The Pumping Lemma

Each path with more transitions than the automaton has states contains a loop.

[Diagram: a path labelled u into a state z, a loop labelled v at z, and a path labelled w leaving z]

This loop can be traversed multiple times (or not at all). Thus, the word uvw is “pumped”, and one sees that the words uw, uv^2w, uv^3w, … are also in the language of the automaton.

Remark: We write v^i = v … v (i times).


The Pumping Lemma

[Diagram: a path labelled u into a state z, a loop labelled v at z, and a path labelled w leaving z]

We can also assume the following conditions for u, v, w, where n is the number of states of the automaton:

1. |v| ≥ 1: the loop is not trivial and contains at least one transition.

2. |uv| ≤ n: the state is reached for the second time after at most n transitions.
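The decomposition can be computed by following the run of a word and stopping at the first state that repeats. The sketch below assumes a DFA given as a dictionary `delta` mapping (state, symbol) to the next state; the example automaton (even number of a's) and the function names are chosen for this illustration only.

```python
def pump_decomposition(delta, start, x):
    """Split x = u v w at the first state that repeats on the run of x.

    Returns None if no state repeats (only possible for short words)."""
    seen = {start: 0}                  # state -> number of symbols read so far
    state = start
    for pos, a in enumerate(x, start=1):
        state = delta[state, a]
        if state in seen:              # the loop is closed at this state
            i = seen[state]
            return x[:i], x[i:pos], x[pos:]
        seen[state] = pos
    return None

def accepts(delta, start, finals, word):
    state = start
    for a in word:
        state = delta[state, a]
    return state in finals

# DFA with n = 2 states accepting the words with an even number of a's.
delta = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}
u, v, w = pump_decomposition(delta, 0, "abba")
print(u, v, w)
print([accepts(delta, 0, {0}, u + v * i + w) for i in range(4)])
```

Because the split happens at the first repetition, the loop v is non-empty and uv is read within at most n transitions, matching the two conditions above.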


The Pumping Lemma

Example: M = ({z0, z1, z2, zE}, {a, b, c}, δ, z0, {zE})

[Diagram: automaton with the states z0, z1, z2, zE and transitions labelled a, b and c]

x = ac · cc · a ∈ T(M), with u = ac, v = cc, w = a

uv^0w = ac · a ∈ T(M)
uv^2w = ac · cc · cc · a ∈ T(M)
…


The Pumping Lemma

From this property of finite automata we derive a property of regular languages.

Pumping property for regular languages

A language L has the pumping property when there exists a natural number n such that all words x ∈ L with |x| ≥ n can be decomposed as x = uvw such that the following conditions hold:

1. |v| ≥ 1,

2. |uv| ≤ n and

3. for all i = 0, 1, 2, … it holds that uv^i w ∈ L.

Pumping Lemma for Regular Languages (Theorem)

When L is a regular language, then L has the pumping property.


The Pumping Lemma

Pumping Lemma

Let L be a language.

L is regular ⇒ ∃n ∈ N ∀x ∈ L with |x| ≥ n ∃u, v, w such that x = uvw, |v| ≥ 1, |uv| ≤ n: ∀i ∈ N0: uv^i w ∈ L

Contrapositive:

∀n ∈ N ∃x ∈ L with |x| ≥ n ∀u, v, w such that x = uvw, |v| ≥ 1, |uv| ≤ n: ∃i ∈ N0: uv^i w ∉ L ⇒ L is not regular


The Pumping Lemma

Pumping Lemma (alternative formulation)

Let L be a language. Suppose that for each natural number n we can find a word x ∈ L with |x| ≥ n such that for all decompositions x = uvw with

1. |v| ≥ 1,

2. |uv| ≤ n

there is an i with uv^i w ∉ L. Then L is not regular.

That is, we must show that for each n (for each possible number of states) there is a word that has length at least n and has no “pumpable” decomposition.


The Pumping Lemma

“Cooking recipe” for the pumping lemma

Let L be a language (Example: {a^k b^k | k ≥ 0}). We want to show that it is not regular.

1. Take an arbitrary number n. This number cannot be freely chosen.

2. Choose a word x ∈ L with |x| ≥ n. To make sure that the word has at least length n, n should occur (for example as an exponent) in the description of the word.

Example: x = a^n b^n


3. Consider all possible decompositions x = uvw with the following constraints: |v| ≥ 1 and |uv| ≤ n.

Example: here there is only one kind of decomposition: u = a^j, v = a^l, w = a^m b^n with j + l + m = n and l ≥ 1.

4. Choose for each of these decompositions an i (in each case it can be a different i) such that uv^i w ∉ L. (In many cases i = 0 and i = 2 are good choices.)

Example: choose i = 2, then uv^2w = a^(j+2l+m) b^n ∉ L, since j + 2l + m ≠ n.
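The recipe can be checked mechanically for small values of n. The following sketch is a sanity check, not a proof (a proof must cover every n): it enumerates all admissible decompositions of x = a^n b^n and tests whether some i pumps the word out of L. The helper names and the bound `max_i` are assumptions of this sketch.

```python
def in_L(word):
    """Membership test for L = {a^k b^k | k >= 0}."""
    k = len(word) // 2
    return word == "a" * k + "b" * k

def decompositions(x, n):
    """All (u, v, w) with x = uvw, |v| >= 1 and |uv| <= n."""
    for i in range(n):                                # i = |u|
        for j in range(i + 1, min(n, len(x)) + 1):    # j = |uv|
            yield x[:i], x[i:j], x[j:]

def refutes_pumping(n, max_i=3):
    """True if every decomposition of a^n b^n can be pumped out of L."""
    x = "a" * n + "b" * n
    return all(any(not in_L(u + v * i + w) for i in range(max_i + 1))
               for u, v, w in decompositions(x, n))

print([refutes_pumping(n) for n in range(1, 8)])
```

Since |uv| ≤ n, the part v always consists of a's only, which is why pumping changes the number of a's but not the number of b's.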


Pumping Lemma

Example 1

Let Σ = {a, b}. Show that the language {a^(2^k) | k ∈ N} is not regular.

Example 2

Let Σ = {a, b, c}. Show that the language {a^k b^ℓ c^m | k ≥ 1, ℓ ≤ m} is not regular.


Remark on the Pumping Lemma

Faulty application of the Pumping Lemma

“When L has the pumping property, then L is regular.” ✗ (wrong)

There are non-regular languages that satisfy the pumping property.

For example: L = {a^k b^m c^m | k, m ≥ 1} ∪ {b^k c^m | k, m ≥ 0}.
L satisfies the pumping property.

But L is not regular (proof follows later).


Regular Languages Equivalence Relations and Minimal Automata

Brief Recapitulation: Equivalence Relations

What is an equivalence relation?

First we recall the definition of a relation:

Relation

A (binary, homogeneous) relation R on a set M is a subset R ⊆ M × M. Instead of (m1, m2) ∈ R we usually write m1 R m2.

Graphical representation:

(a, b) ∈ R is drawn as an arrow from a to b.


Brief Recapitulation: Equivalence Relations

Equivalence relation

An equivalence relation R on a set M is a relation R ⊆ M × M that has the following properties:

R is reflexive, that is, (a, a) ∈ R for all a ∈ M.

R is symmetric, that is, if (a, b) ∈ R, then also (b, a) ∈ R.

R is transitive, that is, from (a, b) ∈ R and (b, c) ∈ R it follows that (a, c) ∈ R.

Here, a, b and c are arbitrary elements of M.


Brief Recapitulation: Equivalence Relations

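[Diagrams: Reflexive: an arrow from a to itself. Symmetric: arrows between a and b in both directions. Transitive: arrows from a to b and from b to c, together with an arrow from a to c.]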


Equivalence Classes

Equivalence class

Let R be an equivalence relation on M and m ∈ M. The equivalence class [m]R of m is the set

[m]R = {n ∈ M | (n, m) ∈ R}

Sometimes one writes only [m] when it is clear which relation is meant.


Equivalence Classes

Properties of equivalence classes

Let R be an equivalence relation on M and m1, m2 ∈ M. Then it holds either that

[m1]R = [m2]R

or that

[m1]R ∩ [m2]R = ∅.

Additionally it holds that

M = ⋃_{m ∈ M} [m]R.

That is, two equivalence classes are either equal or disjoint. Additionally they completely cover M. It is also said: the equivalence classes form a partition of M.
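For finite sets these definitions can be checked directly. The sketch below represents a relation as a Python set of pairs and uses "same remainder modulo 3" on {0, …, 5} as an example relation; both the representation and the example are chosen for illustration only.

```python
def is_equivalence(M, R):
    """Check reflexivity, symmetry and transitivity of R ⊆ M × M."""
    reflexive = all((a, a) in R for a in M)
    symmetric = all((b, a) in R for (a, b) in R)
    transitive = all((a, d) in R
                     for (a, b) in R for (c, d) in R if b == c)
    return reflexive and symmetric and transitive

def equivalence_class(M, R, m):
    """[m]_R = {n ∈ M | (n, m) ∈ R}"""
    return frozenset(n for n in M if (n, m) in R)

def classes(M, R):
    """The set of all equivalence classes; duplicates collapse automatically."""
    return {equivalence_class(M, R, m) for m in M}

M = set(range(6))
R = {(a, b) for a in M for b in M if a % 3 == b % 3}
print(is_equivalence(M, R))
print(sorted(sorted(c) for c in classes(M, R)))
```

Collecting the classes in a set makes the partition property visible: the six elements yield only three distinct classes, which are pairwise disjoint and together cover M.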


Acceptance Equivalence

[Diagram: DFA with the states 1 to 6 over the alphabet {a, b}]

Is this the smallest automaton which accepts the same language?


Acceptance Equivalence

The states 4 and 5 are acceptance equivalent. The same holds for the states 2 and 3. These states can be merged:

[Diagram: merged automaton with the states 1, 2/3, 4/5 and 6]


Acceptance Equivalence

Acceptance equivalence (Definition)

Let a DFA M = (Z, Σ, δ, z0, E) be given. Two states z1, z2 ∈ Z are acceptance equivalent when it holds for all words w ∈ Σ∗ that:

δ(z1, w) ∈ E ⇐⇒ δ(z2, w) ∈ E.

Acceptance equivalent states can be merged ⇒ minimal automaton


Minimal Automata

Algorithm minimal automaton

Input: DFA M
Output: sets of acceptance equivalent states

1. Remove the states which are not reachable from the start state.

2. Create a table of all (unordered) state pairs {z, z′} with z ≠ z′.

3. Mark all pairs {z, z′} with z ∈ E and z′ ∉ E (or vice versa). (z, z′ are certainly not acceptance equivalent.)


4. For each unmarked pair {z, z′} and each a ∈ Σ, test whether {δ(z, a), δ(z′, a)} is already marked. If so, mark {z, z′}. (From z, z′ there are transitions to states that are not acceptance equivalent, so they cannot be acceptance equivalent.)

5. Repeat the previous step until no more changes in the table are possible.

6. For all pairs {z, z′} which are still unmarked, it holds that z and z′ are acceptance equivalent.
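The marking procedure can be sketched as follows. The code assumes that unreachable states have already been removed (step 1) and that the DFA is given as a dictionary `delta`. The example automaton is a six-state DFA with final state 6; its transitions are assumptions made for this sketch.

```python
from itertools import combinations

def acceptance_equivalent_pairs(states, alphabet, delta, finals):
    """Table-filling: return the pairs of states that stay unmarked."""
    pairs = [frozenset(p) for p in combinations(states, 2)]        # step 2
    marked = {p for p in pairs if len(p & finals) == 1}            # step 3
    changed = True
    while changed:                                                 # step 5
        changed = False
        for p in pairs:
            if p in marked:
                continue
            z1, z2 = tuple(p)
            for a in alphabet:                                     # step 4
                succ = frozenset({delta[z1, a], delta[z2, a]})
                if succ in marked:      # a one-element set is never marked
                    marked.add(p)
                    changed = True
                    break
    return {p for p in pairs if p not in marked}                   # step 6

# Hypothetical example DFA (all transitions are assumptions of this sketch).
delta = {
    (1, "a"): 3, (1, "b"): 2, (2, "a"): 1, (2, "b"): 4,
    (3, "a"): 1, (3, "b"): 5, (4, "a"): 6, (4, "b"): 5,
    (5, "a"): 6, (5, "b"): 4, (6, "a"): 6, (6, "b"): 6,
}
result = acceptance_equivalent_pairs([1, 2, 3, 4, 5, 6], "ab", delta, {6})
print(sorted(sorted(p) for p in result))
```

Unlike a hand-written table, the sketch does not record the order in which pairs are marked; storing a step counter alongside each marked pair would be a straightforward extension.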


Minimal Automata

Executing the minimal automaton algorithm on the following automaton:

[Diagram: the DFA with the states 1 to 6 from the slide "Acceptance Equivalence"]

Create a table of all pairs of states.

(1) Mark pairs of final and non-final states: {1, 6}, {2, 6}, {3, 6}, {4, 6}, {5, 6}

(2) Mark {2, 4} because δ(2, a) = 1, δ(4, a) = 6 and {1, 6} is marked

(3) Mark {3, 5} because δ(3, a) = 1, δ(5, a) = 6 and {1, 6} is marked

(4) Mark {2, 5} because δ(2, a) = 1, δ(5, a) = 6 and {1, 6} is marked

(5) Mark {3, 4} because δ(3, a) = 1, δ(4, a) = 6 and {1, 6} is marked

(6) Mark {1, 5} because δ(1, a) = 3, δ(5, a) = 6 and {3, 6} is marked

(7) Mark {1, 4} because δ(1, a) = 3, δ(4, a) = 6 and {3, 6} is marked

(8) Mark {1, 3} because δ(1, b) = 2, δ(3, b) = 5 and {2, 5} is marked

(9) Mark {1, 2} because δ(1, b) = 2, δ(2, b) = 4 and {2, 4} is marked

Final table (each entry is the step in which the pair was marked):

     1  2  3  4  5
2    9
3    8
4    7  2  5
5    6  4  3
6    1  1  1  1  1

The remaining state pairs {2, 3} and {4, 5} cannot be marked; they are acceptance equivalent.


Minimal Automata

Hints for the minimization algorithm:

Create the table in such a way that each pair of states occurs exactly once.

2, …, n vertically and 1, …, n − 1 horizontally.

Please specify which pairs of states are marked in which order!

(In Schöning's book only asterisks (∗) are used, but from those the order of and the reason for the markings cannot be reconstructed during correction.)


Myhill–Nerode Equivalence

Acceptance equivalence (Recapitulation)

Let a DFA M = (Z, Σ, δ, z0, E) be given. Two states z1, z2 ∈ Z are acceptance equivalent when it holds for all words w ∈ Σ∗ that:

δ(z1, w) ∈ E ⇐⇒ δ(z2, w) ∈ E.

When we extend acceptance equivalence to words (instead of states), we obtain the Myhill–Nerode equivalence.

Myhill–Nerode equivalence (Definition)

Let a language L and words x, y ∈ Σ∗ be given. We define an equivalence relation ≡L by: x ≡L y if and only if

for all z ∈ Σ∗ it holds that (xz ∈ L ⇐⇒ yz ∈ L).


Myhill–Nerode Equivalence

Let L = {a^k b^k | k ∈ N}. Is it the case that:

a^4b^3 ≡L a^3b^2 ?

a^2b^2 ≡L a^3b^2 ?

a^4b^2 ≡L a^3b^2 ?

abb ≡L baba ?

What are the Myhill–Nerode equivalence classes of the following languages?

L1 = {w ∈ {a, b}∗ | #a(w) even}
L2 = {w ∈ {a, b, c}∗ | the subword abc does not occur in w}
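Questions of this kind can be explored by searching for a distinguishing suffix z, that is, a word with xz ∈ L but yz ∉ L or vice versa. The sketch below does this by brute force for L = {a^k b^k}; it assumes k ≥ 1 and a search bound `max_len`, so a result of None only means that no distinguishing suffix up to that length exists, not that the two words are proven equivalent.

```python
from itertools import product

def in_L(word):
    """Membership test for L = {a^k b^k | k >= 1} (k >= 1 is an assumption)."""
    k = len(word) // 2
    return k >= 1 and word == "a" * k + "b" * k

def distinguishing_suffix(x, y, alphabet="ab", max_len=6):
    """Shortest z with (xz in L) != (yz in L), or None if none is found."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            z = "".join(letters)
            if in_L(x + z) != in_L(y + z):
                return z
    return None

print(distinguishing_suffix("aaaabbb", "aaabb"))   # a^4b^3 vs a^3b^2
print(distinguishing_suffix("aabb", "aaabb"))      # a^2b^2 vs a^3b^2
print(distinguishing_suffix("aaaabb", "aaabb"))    # a^4b^2 vs a^3b^2
print(distinguishing_suffix("abb", "baba"))
```

A returned suffix is a certificate that the two words are not equivalent; the empty string as a result means that the words are already separated by z = ε.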


Myhill–Nerode Equivalence

Myhill–Nerode equivalence and regularity (Theorem)

A language L ⊆ Σ∗ is regular if and only if ≡L has finitely many equivalence classes.


Myhill–Nerode Equivalence

L is regular ⇒ ≡L has finitely many equivalence classes:

Let L be a regular language and M = (Z, Σ, δ, z0, E) a DFA with T(M) = L. We define the equivalence relation ≡M by

x ≡M y ⇐⇒ δ(z0, x) = δ(z0, y) for x, y ∈ Σ∗.

The number of equivalence classes of ≡M is equal to the number of (reachable) states of M and is therefore finite.


Myhill–Nerode Equivalence

We can show that x ≡M y implies x ≡L y. Assume x ≡M y and take an arbitrary z ∈ Σ∗. It holds that:

xz ∈ L ⇐⇒ δ(z0, xz) ∈ E (Def. acc. language)

⇐⇒ δ(δ(z0, x), z) ∈ E (Def. δ)

⇐⇒ δ(δ(z0, y), z) ∈ E (x ≡M y)

⇐⇒ δ(z0, yz) ∈ E (Def. δ)

⇐⇒ yz ∈ L. (Def. acc. language)

From this, it follows that x ≡L y.

Therefore ≡M relates at most as many pairs of words as ≡L and thus has at least as many equivalence classes as ≡L. Since ≡M has only finitely many equivalence classes, it follows that ≡L has finitely many equivalence classes.


Myhill–Nerode Equivalence

≡L has finitely many equivalence classes ⇒ L is regular:

Assume that ≡L has finitely many equivalence classes. We construct the finite automaton M0 = (Z, Σ, δ, z0, E) for L, which is defined as follows:

Z = {[w]≡L | w ∈ Σ∗} (set of equivalence classes)

z0 = [ε]≡L

E = {[w]≡L | w ∈ L}

δ([w]≡L, a) = [wa]≡L


Myhill–Nerode Equivalence

From δ([w]≡L, a) = [wa]≡L it follows that δ([w]≡L, u) = [wu]≡L for all words u ∈ Σ∗.

It holds that

x ∈ T(M0) ⇐⇒ δ([ε], x) ∈ E (Def. acc. language)

⇐⇒ [x] ∈ E (see above)

⇐⇒ x ∈ L (Def. final states)

Therefore T(M0) = L. □


Myhill–Nerode Equivalence

With the Myhill–Nerode theorem we can show that a language is regular as well as that a language is not regular.

Examples:

The language L1 = {a^k b^k | k ≥ 0} has infinitely many equivalence classes and is not regular.

The language L2 = {a^n b^m c^m | n, m ≥ 1} ∪ {b^m c^n | n, m ≥ 1} has infinitely many equivalence classes and is not regular. (However, it does satisfy the conditions of the pumping lemma!)


Myhill–Nerode Equivalence

Let M0 be the DFA that is constructed from the equivalence classes. For an arbitrary DFA M with T(M) = T(M0) it holds that

≡M ⊆ ≡L = ≡M0.

That means that M0 can be constructed from M by merging acceptance equivalent states.

In other words: M0 is the minimal DFA for L. All other minimal DFAs that accept the same language are equal to M0 (up to renaming of states).


Minimal Automata Again

For non-deterministic automata the following holds:

There is no unique minimal NFA; there can be more than one.

The following NFAs accept the language L((0|1)∗1) and both have two states. (The language cannot be accepted with only a single state.)

[Diagram: two different NFAs, each with the states 1 and 2 and transitions labelled 0 and 1]


Minimal Automata Again

Let a DFA M be given. Then a minimal NFA that accepts T(M) always has at most as many states as M. (Because M is already an NFA itself.)

In fact, a minimal NFA can be exponentially smaller than the minimal DFA.

See for example the languages

Lk = {x ∈ {0, 1}∗ | |x| ≥ k, the k-th last symbol of x is 0}.


Regular Languages Closure Properties

Closure Properties

Closure properties (Definition)

Let a set M and an n-ary operator f : M × · · · × M → M be given. We say that a set M′ ⊆ M is closed under f when it holds for n arbitrary elements m1, …, mn ∈ M′ that f(m1, …, mn) ∈ M′.

Example: Let M = N and M′ = {i | i ∈ N and i is even}. Let f be the multiplication operator, that is, f(x, y) = x · y. M′ is closed under f, because the product of two even numbers is even.

Now, let g be defined as g(x) = x/2, where / is integer division. M′ is not closed under g, because for example g(6) = 6/2 = 3.


Closure Properties

Here we consider closure properties of the set of regular languages. Theinteresting questions are:

When L1, L2 are regular, are L1 ∪ L2, L1 ∩ L2, L1L2, the complement L̄1 = Σ∗ \ L1 and L1∗ also regular?

Short answer: The regular languages are closed under all the mentioned operations.

Sander Bruggink Automaten und Formale Sprachen 185

Page 222: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Closure Properties

Closure Properties

Closure under union

When L1 and L2 are regular, then L1 ∪ L2 is also regular.

Proof: Since L1 and L2 are regular by assumption, there are regular expressions α1 and α2 such that L(α1) = L1 and L(α2) = L2. The regular expression α1 | α2 generates the language L1 ∪ L2 and therefore that language is regular.

Sander Bruggink Automaten und Formale Sprachen 186

Page 223: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Closure Properties

Closure Properties

Closure under concatenation

When L1 and L2 are regular, then L1L2 = {w1w2 | w1 ∈ L1 and w2 ∈ L2} is also regular.

Proof: Since L1 and L2 are regular by assumption, there are regular expressions α1 and α2 such that L(α1) = L1 and L(α2) = L2. The regular expression α1α2 generates the language L1L2 and therefore that language is regular.

Sander Bruggink Automaten und Formale Sprachen 187

Page 224: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Closure Properties

Closure Properties

Closure under the star operation

When L is a regular language, then L∗ is also regular.

Proof: Since L is regular by assumption, there is a regular expression α with L(α) = L. The regular expression α∗ generates the language L∗. Because there is a regular expression for this language, it is regular.
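The three proofs (union, concatenation, star) all combine regular expressions. As a small sketch, the same combinations can be written with Python's re module; the expressions alpha1 and alpha2 are arbitrary examples, and only the operators |, concatenation, ∗ and + are used, since re also offers features that go beyond regular languages.

```python
import re

alpha1, alpha2 = "ab*", "c+"             # example expressions for L1 and L2

union  = f"(?:{alpha1})|(?:{alpha2})"    # generates L1 ∪ L2
concat = f"(?:{alpha1})(?:{alpha2})"     # generates L1L2
star   = f"(?:{alpha1})*"                # generates L1*

def accepts(regex, word):
    """True when the whole word is generated by the regular expression."""
    return re.fullmatch(regex, word) is not None

print(accepts(union, "abb"), accepts(union, "cc"), accepts(union, "abcc"))
print(accepts(concat, "abcc"), accepts(star, ""), accepts(star, "abbaab"))
```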

Sander Bruggink Automaten und Formale Sprachen 188

Page 225: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Closure Properties

Closure Properties

Closure under complement

When L is a regular language, then L̄ = Σ∗\L is also a regular language.

Remark: when we construct the complement, we must always specify in relation to which set we construct the complement. In this case this is the set Σ∗, the set of all words over Σ.

Sander Bruggink Automaten und Formale Sprachen 189

Page 226: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Closure Properties

Closure Properties

Proof: Since L is regular by assumption, there is a DFA M = (Z, Σ, δ, z0, E) with T(M) = L. This automaton is transformed into an automaton M′ for L̄ by exchanging final and non-final states. That is: M′ = (Z, Σ, δ, z0, Z\E).

Now we have: w ∈ L ⇐⇒ δ(z0, w) ∈ E ⇐⇒ δ(z0, w) ∉ Z\E ⇐⇒ w ∉ T(M′).

Since there is an automaton M′ for L̄, L̄ is regular.
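A minimal sketch of this construction, assuming a DFA stored as a dictionary-based total transition function (the concrete automaton and all names are made up for illustration): only the set of final states changes.

```python
def final_state(delta, start, word):
    """Follow the (total) transition function of a DFA along `word`."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state

# Hypothetical DFA M over {0, 1} accepting the words that end in 1.
Z = {"z0", "z1"}
delta = {("z0", "0"): "z0", ("z0", "1"): "z1",
         ("z1", "0"): "z0", ("z1", "1"): "z1"}
z0, E = "z0", {"z1"}

E_complement = Z - E          # final states of M' are exactly Z \ E

def in_L(word):
    return final_state(delta, z0, word) in E

def in_complement(word):
    return final_state(delta, z0, word) in E_complement

for w in ["", "0", "1", "10", "011"]:
    print(repr(w), in_L(w), in_complement(w))
```

The construction relies on M being deterministic and total; exchanging final and non-final states of an NFA does not in general yield the complement.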

Sander Bruggink Automaten und Formale Sprachen 190

Page 227: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Closure Properties

Closure Properties

Closure under complement

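[Figure: a DFA with states z0, z1, z2, zE over the alphabet {a, b, c}, shown twice: the original automaton and the automaton obtained by exchanging final and non-final states.]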

Sander Bruggink Automaten und Formale Sprachen 191

Page 228: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Closure Properties

Closure Properties

Closure under intersection

When L1 and L2 are regular languages, then L1 ∩ L2 is also regular.

Proof: It holds that L1 ∩ L2 = Σ∗\((Σ∗\L1) ∪ (Σ∗\L2)) (De Morgan), and we already know that regular languages are closed under union and complement.

Sander Bruggink Automaten und Formale Sprachen 192

Page 229: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Closure Properties

Closure Properties

Cross product construction for DFAs

There is also a direct construction. In this construction, two automata are synchronized with each other. This is done by building the cross product of the two state sets.

Let M1 = (Z1, Σ, δ1, s1, E1) and M2 = (Z2, Σ, δ2, s2, E2) be DFAs with T(M1) = L1 and T(M2) = L2. The following DFA M accepts the language L1 ∩ L2:

M = (Z1 × Z2,Σ, δ, (s1, s2),E1 × E2),

where δ((z1, z2), a) = (δ1(z1, a), δ2(z2, a)).

M accepts a word w if and only if both M1 and M2 accept it.
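The construction translates almost literally into code. The sketch below assumes DFAs given as tuples (states, delta, start, finals) with a total transition dictionary; the two example automata and all names are illustrative.

```python
def cross_product(dfa1, dfa2, alphabet):
    """Build M = (Z1 x Z2, Sigma, delta, (s1, s2), E1 x E2)."""
    (Z1, d1, s1, E1), (Z2, d2, s2, E2) = dfa1, dfa2
    states = {(p, q) for p in Z1 for q in Z2}
    delta = {((p, q), a): (d1[(p, a)], d2[(q, a)])
             for (p, q) in states for a in alphabet}
    finals = {(p, q) for p in E1 for q in E2}
    return states, delta, (s1, s2), finals

def accepts(dfa, word):
    _, delta, state, finals = dfa
    for a in word:
        state = delta[(state, a)]
    return state in finals

# M1: even number of 0s; M2: word ends in 1 (both over {0, 1}).
even0 = ({"e", "o"},
         {("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"},
         "e", {"e"})
ends1 = ({"n", "y"},
         {("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "n", ("y", "1"): "y"},
         "n", {"y"})

both = cross_product(even0, ends1, "01")
print(accepts(both, "001"), accepts(both, "01"), accepts(both, "00"))
```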

Sander Bruggink Automaten und Formale Sprachen 193

Page 230: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Closure Properties

Closure Properties

Cross product construction for NFAs

Let M1 = (Z1, Σ, δ1, S1, E1) and M2 = (Z2, Σ, δ2, S2, E2) be NFAs with T(M1) = L1 and T(M2) = L2. The following NFA M accepts the language L1 ∩ L2:

M = (Z1 × Z2,Σ, δ,S1 × S2,E1 × E2),

where δ((z1, z2), a) = δ1(z1, a)× δ2(z2, a).

M accepts a word w if and only if both M1 and M2 accept it.

Sander Bruggink Automaten und Formale Sprachen 193

Page 231: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Closure Properties

Closure Properties

Why are closure properties interesting?

To show that a language is regular. Complex regular languages can be built from simple ones.

To show that a language is not regular. Sometimes it is simpler to prove that the complement of a language, or the intersection of a language with a regular language, is not regular than to show directly that the language itself is not regular.

Sander Bruggink Automaten und Formale Sprachen 194

Page 232: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Closure Properties

Closure Properties: The Adventure Again

[Figure: the adventure map, a graph with 16 numbered locations (1 to 16).]

Sander Bruggink Automaten und Formale Sprachen 195

Page 233: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Closure Properties

Closure Properties: The Adventure Again

The Treasure Rule

You must find at least two treasures.

The Door Rule

You can only go through a door when you have found a key before. (This key can be used arbitrarily often and fits every door.)

Sander Bruggink Automaten und Formale Sprachen 196

Page 234: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Closure Properties

Closure Properties: The Adventure Again

The Dragon Rule

Immediately after the encounter with a dragon you have to jump into a river, because otherwise the dragon will set you on fire. This is no longer necessary once you have found a sword, because you can then kill the dragon.

Alphabet symbols:

Dragon (D):

Sword (W):

River (F):

Arch (B):

Door (T):

Key (L):

Treasure (A):

Sander Bruggink Automaten und Formale Sprachen 197

Page 235: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Closure Properties

Closure Properties: The Adventure Again

The rules can be described by the following (non-deterministic) finite automata.

[Figure: three finite automata, one each for the dragon rule (D), the door rule (T) and the treasure rule (A).]

Sander Bruggink Automaten und Formale Sprachen 198

Page 236: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Closure Properties

Closure Properties: The Adventure Again

Let M be the automaton, which describes the adventure map. Let

LM = T(M) be the set of all paths in the map from a start state to an end state,

LA = T (A) be the set of all paths that satisfy the treasure rule,

LT = T (T ) be the set of all paths that satisfy the door rule, and

LD = T (D) be the set of all paths that satisfy the dragon rule.

Let AM be the set of all paths through the adventure map which satisfy all conditions. Then:

AM = LM ∩ LA ∩ LT ∩ LD

Is there a solution to the adventure?

Sander Bruggink Automaten und Formale Sprachen 199

Page 237: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Algorithms for Problems

Algorithms

We now discuss whether there are procedures or algorithms to answer questions about regular languages.

The general form of the questions is:

Let regular languages L1, L2 be given. Does it hold for these languages that . . . ?

We assume, that the regular languages are given as DFAs, NFAs,grammars or regular expressions.

Sander Bruggink Automaten und Formale Sprachen 200

Page 238: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Algorithms for Problems

Algorithms

Problems

Word problem: Let a regular language L and w ∈ Σ∗ be given. Does w ∈ L hold?

Emptiness problem: Let a regular language L be given. Does L = ∅ hold?

Finiteness problem: Let a regular language L be given. Is L finite?

Intersection problem: Let two regular languages L1 and L2 be given. Does L1 ∩ L2 = ∅ hold?

Inclusion problem: Let two regular languages L1 and L2 be given. Does L1 ⊆ L2 hold?

Equivalence problem: Let two regular languages L1 and L2 be given. Does L1 = L2 hold?

Sander Bruggink Automaten und Formale Sprachen 201

Page 239: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Algorithms for Problems

Algorithms

Word problem (w ∈ L?)

Let a regular language L and w ∈ Σ∗ be given.

Solution: Determine a DFA M for L and track the state transitions of M during the reading of w.
Final state reached ⇒ w ∈ L
Non-final state reached ⇒ w ∉ L
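When L is given as an NFA, the DFA does not have to be built in advance. The sketch below (an illustrative variant, not the procedure from the slides verbatim) tracks the set of states the NFA can be in, which is exactly the current state of the subset-construction DFA. The example NFA is a two-state automaton for L((0|1)∗1), chosen for illustration.

```python
def nfa_accepts(delta, starts, finals, word):
    """Track the set of possible NFA states while reading `word`."""
    current = set(starts)
    for symbol in word:
        current = {t for q in current for t in delta.get((q, symbol), ())}
    return bool(current & set(finals))

# Example NFA for L((0|1)*1): state 1 loops, state 2 is final.
delta = {(1, "0"): {1}, (1, "1"): {1, 2}}
starts, finals = {1}, {2}

for w in ["", "1", "10", "0101"]:
    print(repr(w), nfa_accepts(delta, starts, finals, w))
```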

Sander Bruggink Automaten und Formale Sprachen 202

Page 240: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Algorithms for Problems

Algorithms

Emptiness problem (L = ∅?)

Let a regular language L be given.

Solution: Determine an NFA M for L.

L = ∅ ⇐⇒ there is no path from an initial to a final state.
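The emptiness test is a plain reachability search. A minimal sketch, assuming an NFA without ε-transitions stored as a dictionary from (state, symbol) to a set of successor states (names are illustrative):

```python
from collections import deque

def is_empty(delta, starts, finals):
    """Breadth-first search: L is empty iff no final state is reachable."""
    seen, queue = set(starts), deque(starts)
    while queue:
        q = queue.popleft()
        if q in finals:
            return False
        for (p, _symbol), targets in delta.items():
            if p == q:
                for t in targets:
                    if t not in seen:
                        seen.add(t)
                        queue.append(t)
    return True

delta = {(1, "a"): {2}, (3, "b"): {4}}
print(is_empty(delta, {1}, {2}))   # a final state is reachable from state 1
print(is_empty(delta, {1}, {4}))   # state 4 is not reachable from state 1
```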

Sander Bruggink Automaten und Formale Sprachen 203

Page 241: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Algorithms for Problems

Algorithms

Finiteness problem (is L finite?)

Let a regular language L be given.

Solution: Determine an NFA M for L.
L is infinite
⇐⇒ there are infinitely many paths from an initial to a final state in M
⇐⇒ there is a reachable cycle in M from which a final state is reachable
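One way to implement the cycle criterion is sketched below, assuming an NFA without ε-transitions (the representation and names are illustrative). First keep only the states that are reachable from an initial state and from which a final state is reachable; then check whether these states contain a cycle by repeatedly deleting states without remaining successors.

```python
def successors(delta, q):
    return {t for (p, _a), targets in delta.items() if p == q for t in targets}

def closure(seeds, step):
    """All states reachable from `seeds` via the one-step relation `step`."""
    seen, todo = set(seeds), list(seeds)
    while todo:
        q = todo.pop()
        for t in step(q):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen

def is_finite(delta, states, starts, finals):
    forward = closure(starts, lambda q: successors(delta, q))
    backward = closure(finals, lambda q: {p for p in states if q in successors(delta, p)})
    remaining = forward & backward       # states lying on some accepting path
    changed = True
    while changed:                       # delete states without remaining successors
        changed = False
        for q in list(remaining):
            if not (successors(delta, q) & remaining):
                remaining.discard(q)
                changed = True
    return not remaining                 # anything left lies on or leads into a cycle

loop_nfa = {(1, "0"): {1}, (1, "1"): {1, 2}}
print(is_finite(loop_nfa, {1, 2}, {1}, {2}))
print(is_finite({(1, "a"): {2}}, {1, 2}, {1}, {2}))
```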

Sander Bruggink Automaten und Formale Sprachen 204

Page 242: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Algorithms for Problems

Algorithms

Intersection problem (L1 ∩ L2 = ∅?)

Let regular languages L1 and L2 be given.

Solution: Determine DFAs M1 and M2 for L1 and L2 and construct their cross product. Now apply the emptiness test to the cross product.

Sander Bruggink Automaten und Formale Sprachen 205

Page 243: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Algorithms for Problems

Algorithms

Inclusion problem (L1 ⊆ L2?)

Let regular languages L1, L2 be given.

Solution: L1 ⊆ L2 holds if and only if L1 ∩ (Σ∗\L2) = ∅. Since intersection and complement can be determined constructively and an emptiness test exists, the inclusion problem can be solved.
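Complement, intersection and emptiness test can be combined into a single search. The sketch below assumes two total DFAs given as tuples (delta, start, finals); it explores the reachable pairs of states and looks for a pair that is final in M1 but not final in M2, which is a final state of the cross product of M1 with the complement of M2. The example automata are illustrative.

```python
def included(dfa1, dfa2, alphabet):
    """True iff T(M1) is a subset of T(M2), for total DFAs M1 and M2."""
    (d1, s1, E1), (d2, s2, E2) = dfa1, dfa2
    seen, todo = {(s1, s2)}, [(s1, s2)]
    while todo:
        p, q = todo.pop()
        if p in E1 and q not in E2:      # some word lies in L1 but not in L2
            return False
        for a in alphabet:
            nxt = (d1[(p, a)], d2[(q, a)])
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True

# Words ending in 11, and words ending in 1 (both over {0, 1}).
ends11 = ({(0, "0"): 0, (0, "1"): 1, (1, "0"): 0, (1, "1"): 2,
           (2, "0"): 0, (2, "1"): 2}, 0, {2})
ends1 = ({("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "n", ("y", "1"): "y"},
         "n", {"y"})

print(included(ends11, ends1, "01"), included(ends1, ends11, "01"))
```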

Sander Bruggink Automaten und Formale Sprachen 206

Page 244: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Algorithms for Problems

Algorithms

Equivalence problem (L1 = L2?)

Let regular languages L1, L2 be given.

Solution 1: Determine the minimal DFAs for L1 and L2. L1 and L2 are equal if and only if these DFAs are equal (possibly after renaming of states).

Solution 2: L1 = L2 holds if and only if L1 ⊆ L2 and L1 ⊇ L2. The inclusion problem is solvable.

Sander Bruggink Automaten und Formale Sprachen 207

Page 245: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Algorithms for Problems

Algorithms

Efficiency: The complexity of the described procedures depends on the representation of the regular languages.

For the equivalence problem:

L1, L2 given as DFAs ⇒ complexity O(n²) (quadratically many steps relative to the size of the input)

L1, L2 given as grammars, regular expressions or NFAs ⇒ the problem is NP-hard. This means, among other things, that no efficient algorithms to solve the problem are known.

More on complexity in the lecture “Berechenbarkeit und Komplexität” (Computability and Complexity).

Sander Bruggink Automaten und Formale Sprachen 208

Page 246: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Model Verification

[Figure: the system and the specification are each modelled as a finite automaton (system model and specification model); a model checker takes both models as input and answers Yes or No.]

Sander Bruggink Automaten und Formale Sprachen 209

Page 247: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Model Verification

In our case:

The system model is an NFA Sys that accepts all possible system runs.

The specification model is an NFA Spec that accepts all allowed system runs.

We verify a safety property. That means that Spec models the allowed system runs. We try to find out whether

L(Sys) ⊆ L(Spec)

Sander Bruggink Automaten und Formale Sprachen 210

Page 248: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Example: mutual exclusion

We consider two processes P1, P2 that try to access a shared resource.

Each process has a so-called critical area, in which it accesses the resource. At each time, only one process may be in its critical area.

The processes may use shared variables to synchronize. However, these variables are not semaphores, that is, there is no atomic operation which reads and writes a variable at the same time.

We want to show that mutual exclusion is ensured.

Sander Bruggink Automaten und Formale Sprachen 211

Page 249: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Attempt 1: Both processes P1, P2 use a single shared boolean variable f that is initialized with false.

Program code for P1, P2

while true do
  1: if (f = false) then
  2:   f := true
  3:   Enter critical area
       . . .
  4:   Leave critical area
  5:   f := false
     end
end
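Before modelling this program with automata, its behaviour can be explored directly. The following sketch is not the automata-based method of the lecture: it enumerates all interleavings of the two processes as states (pc1, pc2, f), under the modelling assumptions that every numbered line is one atomic step and that a process at line 4 is inside its critical area.

```python
def step(pc, f):
    """One atomic step of a process at line `pc`; returns (new pc, new f)."""
    if pc == 1:                        # test f; retry when f is true
        return (2, f) if not f else (1, f)
    if pc == 2:                        # f := true
        return 3, True
    if pc == 3:                        # enter critical area
        return 4, f
    if pc == 4:                        # leave critical area
        return 5, f
    return 1, False                    # line 5: f := false

def violation_reachable():
    """Search all interleavings for a state with both processes at line 4."""
    start = (1, 1, False)
    seen, todo = {start}, [start]
    while todo:
        pc1, pc2, f = todo.pop()
        if pc1 == 4 and pc2 == 4:
            return True
        n1, f_after_1 = step(pc1, f)
        n2, f_after_2 = step(pc2, f)
        for nxt in ((n1, pc2, f_after_1), (pc1, n2, f_after_2)):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return False

print(violation_reachable())
```

Under these assumptions the search finds a run in which both processes pass the test in line 1 before either of them executes line 2, so both end up in the critical area.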

Sander Bruggink Automaten und Formale Sprachen 212

Page 250: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Alphabet:

(f := true)1 : P1 sets f to true (f := true)2 : P2 sets f to true

(f := false)1 : P1 sets f to false (f := false)2 : P2 sets f to false

(f = true?)1 : P1 reads f and f=true (f = true?)2 : P2 reads f and f=true

(f = false?)1 : P1 reads f and f=false (f = false?)2 : P2 reads f and f=false

BkB1 : P1 enters CA BkB2 : P2 enters CA

VkB1 : P1 leaves CA VkB2 : P2 leaves CA

Sander Bruggink Automaten und Formale Sprachen 213

Page 251: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Procedure:

Specify automata P1 and P2 for both processes.

Specify an automaton F for the value of the variable.

Calculate the cross product MSys of the above three automata. This automaton models the combined behaviour of the system.

Specify an automaton MSpec for the specification.

Find out, whether L(MSys) ⊆ L(MSpec).

Sander Bruggink Automaten und Formale Sprachen 214

Page 252: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Modelling the processes:

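i: f := true
k: . . .

⇒ a transition from state i to state k labelled (f := true)1

i: if f = true then
j:   . . .
   end
k: . . .

⇒ a transition from state i to state j labelled (f = true?)1 and a transition from state i to state k labelled (f = false?)1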

Sander Bruggink Automaten und Formale Sprachen 215

Page 253: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Description of the runs of process i as a finite automaton:

[Figure: automaton Pi with states 1 to 5; transitions labelled (f = false?)i, (f := true)i, BkBi, VkBi, (f := false)i and (f = true?)i, and a loop labelled ∆i at every state.]

with ∆i = {(f := true)j, (f := false)j, (f = true?)j, (f = false?)j, BkBj, VkBj}, where j = 2 when i = 1, and j = 1 when i = 2.

Sander Bruggink Automaten und Formale Sprachen 216

Page 254: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Description of the boolean variable f by an automaton:

[Figure: automaton F with states 1 and 2, tracking the value of f; transitions are labelled with the read and write actions (f = false?)i, (f = true?)i, (f := false)i and (f := true)i of both processes, and both states carry a loop labelled ∆f.]

where ∆f = {BkB1,VkB1,BkB2,VkB2}.

Sander Bruggink Automaten und Formale Sprachen 217

Page 255: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

The language of all runs of the system is T (P1) ∩ T (P2) ∩ T (F ).

The automaton WA describes all runs that satisfy mutual exclusion (the two processes are never in their critical areas at the same time).

[Figure: automaton WA with states 1, 2 and 3; transitions labelled BkB1, VkB1, BkB2 and VkB2, and loops labelled Σ\{BkB1,BkB2}, Σ\{VkB1,BkB2} and Σ\{BkB1,VkB2}.]

Now we have to show that T (P1) ∩ T (P2) ∩ T (F ) ⊆ T (WA).

Sander Bruggink Automaten und Formale Sprachen 218

Page 256: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Encoding for Grail:

(f := true)1 a (f := true)2 A

(f := false)1 b (f := false)2 B

(f = true?)1 c (f = true?)2 C

(f = false?)1 d (f = false?)2 D

BkB1 x BkB2 X

VkB1 y VkB2 Y

Sander Bruggink Automaten und Formale Sprachen 219

Page 257: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Automaton files: p1.aut, p2.aut, f.aut, wa.aut.

Used Grail-tools:

fmcross aut1 < aut2 > res – generates the cross product of aut1 and aut2 and stores the result in res.

fmcment aut > res – generates the complement of aut and stores the result in res.

fmenum aut – enumerates the words that are accepted by aut.

Sander Bruggink Automaten und Formale Sprachen 220

Page 258: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Attempt 2: We now consider Lamport’s algorithm for mutual exclusion.

Here we consider two processes P1 and P2 with different program code and two shared boolean variables f1 and f2 (both initialized to false).

Sander Bruggink Automaten und Formale Sprachen 221

Page 259: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Process P1:

while true do
  1: f1 := true
  2: while (f2 = true?) do
       skip
     end
  3: Enter critical area
     . . .
  4: Leave critical area
  5: f1 := false
end

skip : Null-operation (does nothing)

Sander Bruggink Automaten und Formale Sprachen 222

Page 260: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Process P2:

while true do
  1: f2 := true
  2: if (f1 = true?) then
  3:   f2 := false
  4:   while (f1 = true?) do skip end
     else
  5:   Enter critical area
       . . .
  6:   Leave critical area
  7:   f2 := false
     end
end

Sander Bruggink Automaten und Formale Sprachen 223

Page 261: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

In this case we use the following alphabet Σ:

Σ = {(f1 := false)i , (f2 := false)i , (f1 = false?)i , (f2 = false?)i ,

(f1 := true)i , (f2 := true)i , (f1 = true?)i , (f2 = true?)i ,

BkB i ,VkB i | i ∈ {1, 2}}

Sander Bruggink Automaten und Formale Sprachen 224

Page 262: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Automaton for the process P1:

[Figure: automaton P1 with states 1 to 5; transitions labelled (f1 := true)1, (f2 = true?)1, (f2 = false?)1, BkB1, VkB1 and (f1 := false)1, with loops labelled ∆.]

Where ∆ contains all actions of process P2.

Sander Bruggink Automaten und Formale Sprachen 225

Page 263: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Automaton for the process P2:

[Figure: automaton P2 with states 1 to 7; transitions labelled (f2 := true)2, (f1 = true?)2, (f1 = false?)2, (f2 := false)2, BkB2 and VkB2, with loops labelled ∆.]

Where ∆ contains all actions of process P1.

Sander Bruggink Automaten und Formale Sprachen 226

Page 264: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Automata for the two variables:

[Figure: automata F1 and F2, each with states 1 and 2, tracking the values of f1 and f2; transitions are labelled with the read and write actions on the respective variable, with loops labelled ∆1 (in F1) and ∆2 (in F2).]

∆1 = {(f2 := true)2, (f2 := false)2, (f2 = true?)1, (f2 = false?)1, BkB1, BkB2, VkB1, VkB2}

Analogously for ∆2.

Sander Bruggink Automaten und Formale Sprachen 227

Page 265: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Encoding for Grail:

(f1 := true)1 a (f2 := true)2 A BkB1 x

(f1 := false)1 b (f2 := false)2 B BkB2 X

(f2 = true?)1 c (f1 = true?)2 C VkB1 y

(f2 = false?)1 d (f1 = false?)2 D VkB2 Y

Now the system runs are contained in the allowed runs. So, the algorithm is correct.

Sander Bruggink Automaten und Formale Sprachen 228

Page 266: Sommersemester 2014 - ti.inf.uni-due.de

Regular Languages Program Verification with Regular Languages

Application: Verification

Summary Verification:

We have modelled, with the help of finite automata, two protocols that are supposed to realise mutual exclusion.

With the help of the algorithms for the language inclusion and intersection problems, we have automatically tested whether or not these protocols are correct.

That means: automata and regular languages can be used for program verification.

Sander Bruggink Automaten und Formale Sprachen 229

Page 267: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages Context Free Grammars and Syntax Trees

Context Free Languages

Context free languages

Context free grammars and syntax trees

Word problem: the CYK-algorithm

Pumping lemma for context free languages

Automaton model: push-down automata

Closure properties and algorithms

Determinism/non-determinism in context free languages

Sander Bruggink Automaten und Formale Sprachen 230

Page 268: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages Context Free Grammars and Syntax Trees

Context Free Languages

Applications of context free languages

Describing the syntax of programming languages. Many techniques mentioned in the lecture are interesting for the construction of compilers.

Partly also for describing natural languages.

Sander Bruggink Automaten und Formale Sprachen 231

Page 269: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages Context Free Grammars and Syntax Trees

Context Free Grammars

A grammar is a 4-tuple G = (V, Σ, P, S), where V is a (finite) set of non-terminal symbols, Σ the (finite) alphabet, P a finite set of productions that consist of a left and a right side, and S ∈ V the start symbol.

A grammar is context free when the left sides of all rules consist of exactly one non-terminal symbol, and all right sides have at least one symbol.

Special rule for ε: When S is the start symbol, the rule S → ε may occur in the grammar, provided that S does not occur on the right side of any rule.

Sander Bruggink Automaten und Formale Sprachen 232

Page 270: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages Context Free Grammars and Syntax Trees

Context Free Grammars

Let Σ = {a, b}.

Example 1: Give a context free grammar G1 such that L(G1) = {a^n b^n | n ≥ 0}.

Example 2: Give a context free grammar G2 such that L(G2) = {a^k b^n a^m b^n | n, m, k ≥ 1}.

Sander Bruggink Automaten und Formale Sprachen 233

Page 271: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages Context Free Grammars and Syntax Trees

Special rule for ε again

Question: Is the special rule for ε really required? Can we always allow ε as right side?

Answer: yes, we can always allow ε as right side.

ε-free grammars (Theorem)

Let a context free grammar G = (V, Σ, P, S) be given, containing productions of the form A → w, where A ∈ V, w ∈ (V ∪ Σ)∗. Then there is a grammar G′ = (V, Σ, P′, S) containing productions of the form A → w, where w ∈ (V ∪ Σ)+, such that L(G) = L(G′).

Sander Bruggink Automaten und Formale Sprachen 234

Page 272: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages Context Free Grammars and Syntax Trees

Special rule for ε again

Method to remove ε-productions:

1 Determine the set of variables V1 ⊆ V such that V1 = {A ∈ V | A ⇒∗ ε}, that is, the set of all variables from which the empty word can be derived.

2 Add for each production B → xAy with A ∈ V1, x, y ∈ (V ∪ Σ)∗ a production B → xy to the set of productions. (This production “simulates” the removing of A.) Repeat this step until no new productions are added.

3 Remove all productions of the form A → ε.

4 If ε ∈ L(G) (that is, S ∈ V1), add a new start variable S′ and the productions S′ → ε and S′ → S.

Sander Bruggink Automaten und Formale Sprachen 235

Page 273: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages Context Free Grammars and Syntax Trees

Example: Removing ε-Productions

Let G = (V ,Σ,P,S), where V = {S ,X ,Y ,Z}, Σ = {a, b} and P =

S → XZ

X → aYb | ε

Y → bXa | bb

Z → ε | aSa
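The method for removing ε-productions can be sketched in code and applied to the grammar G above. The representation is an assumption made for illustration: productions are a dictionary from a variable to a list of right sides, each right side is a tuple of symbols, and the empty tuple stands for ε. Instead of iterating step 2, the sketch directly generates every variant of a right side in which some occurrences of variables from V1 are dropped, which gives the same set of productions.

```python
from itertools import product

def remove_epsilon(productions, start):
    # Step 1: compute V1, the variables from which ε can be derived.
    nullable, changed = set(), True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in nullable and any(all(s in nullable for s in b) for b in bodies):
                nullable.add(head)
                changed = True
    # Steps 2 and 3: add the shortened right sides, drop the empty ones.
    result = {}
    for head, bodies in productions.items():
        new_bodies = set()
        for body in bodies:
            options = [((s,), ()) if s in nullable else ((s,),) for s in body]
            for choice in product(*options):
                variant = tuple(s for part in choice for s in part)
                if variant:
                    new_bodies.add(variant)
        result[head] = new_bodies
    # Step 4: new start variable when ε is in the language.
    if start in nullable:
        new_start = start + "'"
        result[new_start] = {(), (start,)}
        start = new_start
    return result, start, nullable

G = {"S": [("X", "Z")],
     "X": [("a", "Y", "b"), ()],
     "Y": [("b", "X", "a"), ("b", "b")],
     "Z": [(), ("a", "S", "a")]}

new_productions, new_start, V1 = remove_epsilon(G, "S")
for head, bodies in new_productions.items():
    print(head, "->", sorted(bodies))
```

The result may contain chain productions such as S → X; these are the subject of a separate step when transforming into Chomsky normal form.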

Sander Bruggink Automaten und Formale Sprachen 236

Page 274: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages Context Free Grammars and Syntax Trees

Special rule for ε again

Because we can transform each grammar which is “almost” context free but contains the empty word as right side into a context free grammar, we will allow the empty word as right side of a production.

Sometimes it is convenient to assume that ε is not a right side in constructions and proofs.

Sander Bruggink Automaten und Formale Sprachen 237

Page 275: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages Context Free Grammars and Syntax Trees

Syntax Trees and Ambiguity

We consider the following example grammar that generates arithmetical expressions.

G = ({E ,T ,F}, {(, ), a,+, ∗},P,E )

with the following set of productions P:

E → T | E + T

T → F | T ∗ F

F → a | (E )

Sander Bruggink Automaten und Formale Sprachen 238

Page 276: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages Context Free Grammars and Syntax Trees

Syntax Trees and Ambiguity

For most words of the language there are two different derivations:

E ⇒ T ⇒ T ∗ F ⇒ F ∗ F ⇒ a ∗ F ⇒ a ∗ (E)

⇒ a ∗ (E + T) ⇒ a ∗ (T + T) ⇒ a ∗ (F + T)

⇒ a ∗ (a + T) ⇒ a ∗ (a + F) ⇒ a ∗ (a + a)

E ⇒ T ⇒ T ∗ F ⇒ T ∗ (E) ⇒ T ∗ (E + T)

⇒ T ∗ (E + F) ⇒ T ∗ (E + a) ⇒ T ∗ (T + a)

⇒ T ∗ (F + a) ⇒ T ∗ (a + a) ⇒ F ∗ (a + a) ⇒ a ∗ (a + a)

The first derivation is a so-called left derivation (we always replace the leftmost variable), while the second one is a right derivation (we always replace the rightmost variable).

Sander Bruggink Automaten und Formale Sprachen 239

Page 277: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages Context Free Grammars and Syntax Trees

Syntax Trees and Ambiguity

Constructing a syntax tree

From the derivations we construct a syntax tree by

Labelling the root of the tree with the start variable of the grammar.

For each application of a rule A → z, adding to the node A |z| children labelled with the symbols from z.

Syntax trees can be constructed for all derivations of context free grammars.

Sander Bruggink Automaten und Formale Sprachen 240

Page 278: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages Context Free Grammars and Syntax Trees

Syntax Trees and Ambiguity

We obtain the same syntax tree in both cases.

A grammar is called unambiguous when there is a single syntax tree for each word in the language.

A grammar is called ambiguous when there is a word in the generated language which has two or more syntax trees.

[Figure: the syntax tree of a ∗ (a + a), with root E.]

Sander Bruggink Automaten und Formale Sprachen 241

Page 279: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages Context Free Grammars and Syntax Trees

Syntax Trees and Ambiguity

Example of an ambiguous grammar

Let G = (V, Σ, P, S), where V = {S}, Σ = {(, )} and P =

S → (S) | SS | ε

Sander Bruggink Automaten und Formale Sprachen 242

Page 280: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages The Chomsky Normal Form

The Chomsky Normal Form

We now consider a convenient normal form.

Chomsky normal form (Definition)

A context free grammar G = (V, Σ, P, S) with ε ∉ L(G) is in Chomsky normal form when all productions have one of the following two forms:

A→ BC A→ a

Here, A,B,C ∈ V are variables and a ∈ Σ is an alphabet symbol.

Sander Bruggink Automaten und Formale Sprachen 243

Page 281: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages The Chomsky Normal Form

The Chomsky Normal Form

Transformation into Chomsky normal form (Theorem)

For each context free grammar G with ε ∉ L(G) there is a grammar G′ in Chomsky normal form with L(G) = L(G′).

Sander Bruggink Automaten und Formale Sprachen 244

Page 282: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages The Chomsky Normal Form

The Chomsky Normal Form

Procedure to transform a grammar into Chomsky normal form:

1 remove ε-productions (see slide 235)

2 remove chain productions (V1 → V2)

3 remove alphabet symbols from the right sides

4 split long right sides

Sander Bruggink Automaten und Formale Sprachen 245

Page 283: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages The Chomsky Normal Form

The Chomsky Normal Form

Step 2: Removing chain productions

There are two cases:

Case 1: A chain production is on a cycle A1 → A2 → · · · → Ak → A1 of productions. In this case we replace all variables A1, . . . , Ak by a single variable A and remove the chain productions.

Sander Bruggink Automaten und Formale Sprachen 246

Page 284: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages The Chomsky Normal Form

The Chomsky Normal Form

Step 2: Removing chain productions

Case 2: There is no cycle. In this case, the variables can be numbered A1, . . . , Ak such that Ai → Aj holds only when i < j. Now we iterate from high to low numbers (i = k−1, . . . , 1) and replace Ai → Aj by

Ai → x1 | · · · | xn,

when the rules with Aj on the left have the following form:

Aj → x1 | · · · | xn

(Introducing “shortcuts”)

Sander Bruggink Automaten und Formale Sprachen 247

Page 285: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages The Chomsky Normal Form

The Chomsky Normal Form

Step 3: Removing alphabet symbols

When a production A → w has more than one symbol on the right side (i.e., |w| > 1), each terminal symbol a in w is replaced by a new variable Ua. Additionally, productions Ua → a are added. Afterwards, terminal symbols occur only on right sides of length 1.

Only apply this step, when |w | > 1.

Sander Bruggink Automaten und Formale Sprachen 248

Page 286: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages The Chomsky Normal Form

The Chomsky Normal Form

Step 4: Split long right sides

In the last step, productions of the form A → B1 . . . Bk with k ≥ 3 are eliminated: add new variables C1, . . . , Ck−2, remove the original production and replace it by:

A→ B1C1

C1 → B2C2

...

Ck−2 → Bk−1Bk
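Step 4 can be written as a short loop. This is only a sketch: a right side is a tuple of variables, and fresh is a hypothetical helper that returns the name of the i-th new variable (in a full implementation the names must be unique across all productions).

```python
def split_long(head, body, fresh):
    """Replace head -> B1 ... Bk (k >= 3) by k-1 productions with two variables each."""
    rules, left = [], head
    for i, symbol in enumerate(body[:-2]):
        new_var = fresh(i + 1)                   # C1, C2, ..., C(k-2)
        rules.append((left, (symbol, new_var)))
        left = new_var
    rules.append((left, (body[-2], body[-1])))   # C(k-2) -> B(k-1) Bk
    return rules

for rule in split_long("A", ("B1", "B2", "B3", "B4"), lambda i: f"C{i}"):
    print(rule)
```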

Sander Bruggink Automaten und Formale Sprachen 249

Page 287: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages The Chomsky Normal Form

The Chomsky Normal Form

Example:

Let G = ({S, X, Y}, {a, b, c}, P, S) be a grammar, where P contains the following productions:

S → aXb

X → S | Y | aaSc | ε

Y → X | bbSc

Transform G into Chomsky normal form.

Sander Bruggink Automaten und Formale Sprachen 250

Page 288: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages The CYK-Algorithm

The CYK-Algorithm

The CYK algorithm: an efficient algorithm which decides whether a word w is generated by a context free grammar.

The algorithm was developed by John Cocke, Daniel Younger and Tadao Kasami.

It requires a context free grammar G in Chomsky normal form and a word w as input, and outputs whether w is in the language of G.

Sander Bruggink Automaten und Formale Sprachen 251

Page 289: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages The CYK-Algorithm

The CYK-Algorithm

Idea: Let the word x ∈ Σ∗ be given. We want to know from whichvariables the word can be derived.

Possibility 1: x = a ∈ Σ, that is, x consists of a single alphabet symbol. Then x can only be derived from variables A for which a production A → a exists.

Possibility 2: x = a1 . . . an with n ≥ 2. In this case a production A → BC must be applied first, and then one part a1 . . . ak of the word must be derived from B and the other part ak+1 . . . an from C (1 ≤ k < n).

Sander Bruggink Automaten und Formale Sprachen 252

Page 290: Sommersemester 2014 - ti.inf.uni-due.de

Context Free Languages The CYK-Algorithm

The CYK-Algorithm

Possibility 2 is schematically represented as follows:

A

B C

a1 . . . ak ak+1 . . . an


The CYK-Algorithm

However, it is not clear where the word x must be split, that is, how large the index k is.

Therefore: We need to try all possible values of k. This means:

Let a word x = a_1 . . . a_n be given. For all k with 1 ≤ k < n do the following:

Determine the set of variables V_1 from which we can derive a_1 . . . a_k.

Determine the set of variables V_2 from which we can derive a_{k+1} . . . a_n.

Check whether there are variables A, B, C with (A → BC) ∈ P, B ∈ V_1 and C ∈ V_2. In this case x can be derived from A.


The CYK-Algorithm

To avoid duplicate work, we apply methods of dynamic programming, which means:

we first determine all variables from which subwords of length 1 can be derived;

we then determine all variables from which subwords of length 2 can be derived;

. . .

finally we determine all variables from which x can be derived. When the start symbol S is among these variables, then x is in the language of the grammar.


The CYK-Algorithm

Notation: With x_{i,j} we denote the subword of x that starts at position i and has length j:

x = a_1 . . . a_n, x_{i,j} = a_i . . . a_{i+j−1}.

With this notation, the tree from before looks like this:

A

B C

x_{1,k} x_{k+1,n−k}
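In a programming language with 0-based strings, the 1-based notation x_{i,j} corresponds to a simple slice. The helper name `subword` below is chosen only for this illustration.

```python
def subword(x, i, j):
    """Return x_{i,j}: the subword of x starting at 1-based position i with length j."""
    return x[i - 1:i - 1 + j]


x = "aabcbc"
k = 2
# The two parts of the split shown in the tree: x_{1,k} and x_{k+1,n-k}
left, right = subword(x, 1, k), subword(x, k + 1, len(x) - k)
print(left, right)
```

Concatenating the two parts gives back x for every k with 1 ≤ k < n.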


The CYK-Algorithm

With T_{i,j} we denote the set of variables from which x_{i,j} can be derived.

T_{i,j} is determined as follows:

If j = 1, then T_{i,j} = {A | (A → x_{i,j}) ∈ P}

If j > 1, then T_{i,j} = {A | (A → BC) ∈ P and there is a k with 1 ≤ k < j such that B ∈ T_{i,k} and C ∈ T_{i+k,j−k}}


The CYK-Algorithm

Practical execution of the algorithm: We insert the sets of variables T_{i,j} in the following table:

          a_1        a_2        . . .   a_{n−1}     a_n
j = 1     T_{1,1}    T_{2,1}    . . .   T_{n−1,1}   T_{n,1}
j = 2     T_{1,2}    T_{2,2}    . . .   T_{n−1,2}
. . .     . . .      . . .      . . .
j = n−1   T_{1,n−1}  T_{2,n−1}
j = n     T_{1,n}


The CYK-Algorithm

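The variables in each cell derive the following subwords (table for n = 6; the cell T_{i,j} belongs to the subword a_i . . . a_{i+j−1}):

        a_1      a_2      a_3      a_4      a_5      a_6
j = 1   T_{1,1}  T_{2,1}  T_{3,1}  T_{4,1}  T_{5,1}  T_{6,1}
j = 2   T_{1,2}  T_{2,2}  T_{3,2}  T_{4,2}  T_{5,2}
j = 3   T_{1,3}  T_{2,3}  T_{3,3}  T_{4,3}
j = 4   T_{1,4}  T_{2,4}  T_{3,4}
j = 5   T_{1,5}  T_{2,5}
j = 6   T_{1,6}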


The CYK-Algorithm

To fill the cell T_{1,6} of the table for n = 6, every split of x = a_1 . . . a_6 into two parts is considered:

x = a_1 a_2 a_3 a_4 a_5 | a_6: (A → BC) ∈ P, B ∈ T_{1,5}, C ∈ T_{6,1} ⇒ A ∈ T_{1,6}

x = a_1 a_2 a_3 a_4 | a_5 a_6: (A → BC) ∈ P, B ∈ T_{1,4}, C ∈ T_{5,2} ⇒ A ∈ T_{1,6}

x = a_1 a_2 a_3 | a_4 a_5 a_6: (A → BC) ∈ P, B ∈ T_{1,3}, C ∈ T_{4,3} ⇒ A ∈ T_{1,6}

x = a_1 a_2 | a_3 a_4 a_5 a_6: (A → BC) ∈ P, B ∈ T_{1,2}, C ∈ T_{3,4} ⇒ A ∈ T_{1,6}

x = a_1 | a_2 a_3 a_4 a_5 a_6: (A → BC) ∈ P, B ∈ T_{1,1}, C ∈ T_{2,5} ⇒ A ∈ T_{1,6}


The CYK-Algorithm

input G = (V, Σ, P, S), w ∈ Σ∗

n := |w|
for i ∈ {1, . . . , n} do // upper row, subword length = 1
    T_{i,1} := {A | (A → w_{i,1}) ∈ P}
end
for j ∈ {2, . . . , n} do // rest of table, subword length ≥ 2
    for i ∈ {1, . . . , n − j + 1} do // start of the subword
        T_{i,j} := ∅
        for k ∈ {1, . . . , j − 1} do // try out possible splits
            T_{i,j} := T_{i,j} ∪ {A | (A → BC) ∈ P for some B ∈ T_{i,k}, C ∈ T_{i+k,j−k}}
        end
    end
end
if S ∈ T_{1,n} then return true else return false
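The pseudocode can be turned into a short runnable sketch. The Python version below is only one possible rendering: the function name `cyk`, the representation of productions as tuples, and the decision to reject the empty word are assumptions made for this illustration. The grammar used is the one from Example 1 of this section.

```python
def cyk(terminal_rules, binary_rules, start, word):
    """Return True if `word` can be derived from `start` (grammar in Chomsky normal form).

    terminal_rules: set of pairs (A, a) for productions A -> a
    binary_rules:   set of triples (A, B, C) for productions A -> BC
    """
    n = len(word)
    if n == 0:
        return False  # assumption: the empty word is not handled by this sketch
    table = {}  # table[(i, j)] plays the role of T_{i,j} (1-based start i, length j)
    for i in range(1, n + 1):  # subword length 1
        table[(i, 1)] = {A for (A, a) in terminal_rules if a == word[i - 1]}
    for j in range(2, n + 1):  # subword length >= 2
        for i in range(1, n - j + 2):  # start of the subword
            cell = set()
            for k in range(1, j):  # try out possible splits
                for (A, B, C) in binary_rules:
                    if B in table[(i, k)] and C in table[(i + k, j - k)]:
                        cell.add(A)
            table[(i, j)] = cell
    return start in table[(1, n)]


# Grammar of Example 1
TERMINAL_RULES = {("F", "a"), ("G", "b"), ("A", "a"), ("B", "b"), ("C", "c")}
BINARY_RULES = {
    ("S", "A", "D"), ("S", "F", "G"),
    ("D", "S", "E"), ("D", "B", "C"),
    ("E", "B", "C"),
    ("F", "A", "F"),
    ("G", "B", "G"), ("G", "C", "G"),
    ("H", "S", "C"),
}

print(cyk(TERMINAL_RULES, BINARY_RULES, "S", "aabcbc"))
```

A dictionary keyed by (i, j) is used instead of a two-dimensional array so that the indices match the notation T_{i,j} directly.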


The CYK-Algorithm

Example 1: Consider a context free grammar with the following productions:

S → AD | FG

D → SE | BC

E → BC

F → AF | a

G → BG | CG | b

H → SC

A→ a

B → b

C → c

Question: Let x = aabcbc. Is x ∈ L?


The CYK-Algorithm

Example 2: Consider a context free grammar with the following productions:

S → AB

A→ ab | aAb

B → c | cB

Question: Let x = aaabbbcc. Is x ∈ L? (This grammar is not in Chomsky normal form, so it has to be transformed first.)


The CYK-Algorithm

Complexity of the CYK-Algorithm

Let n = |x| be the length of the input word. Then:

We need at most n^2 cells in the table.

In order to fill in one cell of the table, up to 2 · n other cells must be considered.

The algorithm makes at most c · n^3 steps, where c is a constant. This is called cubic run time.
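The cubic bound can be checked with a short count. The following derivation is a sketch that counts, for every cell of row j, the j − 1 splits that are tried; the work per split depends only on the size of the grammar and is treated as constant.

```latex
\begin{align*}
\text{number of cells} &= \sum_{j=1}^{n} (n-j+1) = \frac{n(n+1)}{2} \le n^2, \\
\text{number of (cell, split) pairs} &= \sum_{j=2}^{n} (n-j+1)(j-1)
   = \sum_{m=1}^{n-1} (n-m)\,m
   = \frac{n^3-n}{6} < n^3 .
\end{align*}
```

Each split looks at two other cells, which matches the estimate of up to 2 · n cells per table entry.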


The CYK-Algorithm

The CYK algorithm is one of the most efficient algorithms that work for arbitrary context free grammars.

In practice, however, it is too slow to parse, for example, large Java programs.

There are also procedures which are much more efficient. However, they can only be used on a subclass of the context free languages. In practice, recursive descent parsers and LR(k) parsers are often used.


Pumping Lemma

How do we show that a language is not context free?

⇒ Pumping lemma for context free languages.

The statement

Every sufficiently long word uses a state twice.

for regular languages is replaced by

In a path of the syntax tree that represents the derivation of a sufficiently long word in a context free grammar, one of the variables occurs at least twice.


How Many Leaves Does a Syntax Tree Have?

How many leaves does a syntax tree of depth n have?

Depth = 0: 2^0 = 1 leaf (only the root S)

Depth = 1: 2^1 = 2 leaves

Depth = 2: 2^2 = 4 leaves

Depth = 3: 2^3 = 8 leaves

In general: Depth = n: at most 2^n leaves

This holds for a grammar in Chomsky normal form.

Conversely: a tree with 2^n leaves has at least depth n.


Pumping Lemma

This means:

Let G be a grammar in Chomsky normal form and k the number of variables in G.

For a word z ∈ L(G) with |z| > 2^k, the corresponding syntax tree has more than 2^k leaves.

This means that the depth of the syntax tree is larger than k (not counting the level of the leaves), and therefore there exists a path from the root to a leaf of length at least k + 1.

On this path there are at least k + 1 variables. Since G has only k variables, at least one variable occurs on this path at least twice.


Pumping Lemma

Syntax tree for a word z with |z| ≥ n = 2^k

Here, n is the “pumping lemma constant”

[Figure: syntax tree with root S above the word z; the lowest level is the level of the leaves (last derivation step).]


Pumping Lemma

There is a path with at least k + 1 inner nodes.

S

Word z


Pumping Lemma

On this path some variable occurs twice. For example, A.

A

A

S

Word z


Pumping Lemma

The word z is decomposed into five subwords u, v, w, x, y:

w is derived by the lower A: A ⇒∗ w

vwx is derived by the upper A: A ⇒∗ vwx

u v w x y

S

A

A


Pumping Lemma

We obtain three smaller syntax trees that can be put together again in different ways.

u v w x y

S

A

A


Pumping Lemma

By removing the middle subtree one obtains a syntax tree for uwy. Therefore, uwy ∈ L holds.

u y

w

S

A


Pumping Lemma

By duplicating the middle subtree one obtains a syntax tree for uv^2 w x^2 y. Therefore, uv^2 w x^2 y ∈ L holds.

u y

v w x

v x

S

A

A

A


Pumping Lemma

Additionally, one can require the following properties for v, w, x:

|vwx| ≤ n = 2^k

|vx| ≥ 1

u v w x y

S

A

A


Pumping Lemma

Pumping Lemma, uvwxy-Theorem (Theorem)

Let L be a context free language. There is a number n such that all words z ∈ L with |z| > n can be decomposed as z = uvwxy, such that the following properties hold:

1 |vx| ≥ 1,

2 |vwx| ≤ n and

3 for all i = 0, 1, 2, . . . it holds that uv^i w x^i y ∈ L.

Here, n = 2^{|V|} comes from the number of variables |V| of a context free grammar in Chomsky normal form for L.


Pumping Lemma

Pumping Lemma (alternative phrasing)

Let L be a language. Assume that for each number n a word z ∈ L with |z| > n can be chosen such that for all decompositions z = uvwxy with

1 |vx| ≥ 1 and

2 |vwx| ≤ n

there is a number i such that uv^i w x^i y ∉ L. Then L is not context free.


Pumping Lemma

Let L be a language. We want to use the pumping lemma to show that L is not context free.

Cooking recipe for the pumping lemma

1 Take an arbitrary number n.

2 Choose a word z ∈ L with |z| > n.

3 Consider all possible decompositions z = uvwxy with the restrictions that |vx| ≥ 1 and |vwx| ≤ n.

4 When for each such decomposition there is a number i such that uv^i w x^i y ∉ L, then the language L is not context free.


Pumping-Lemma

With the help of the pumping lemma, we can now show that the following languages are not context free:

Example 1:

L_1 = {a^m b^(m^2) | m > 0}

Example 2:

L_2 = {a^m b^m c^m | m > 0}
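For one fixed n, the recipe can be tried out mechanically. The Python sketch below enumerates all decompositions z = uvwxy with |vx| ≥ 1 and |vwx| ≤ n and checks whether each of them can be pumped out of the language. It is only an experiment for a single value of n and a small set of exponents, not a proof. The function names and the choice of the exponents 0 and 2 are assumptions made here.

```python
def decompositions(z, n):
    """Yield all (u, v, w, x, y) with z = uvwxy, |vx| >= 1 and |vwx| <= n."""
    length = len(z)
    for p in range(length + 1):                       # start of v
        for s in range(p, min(p + n, length) + 1):    # end of x
            for q in range(p, s + 1):                 # end of v
                for r in range(q, s + 1):             # start of x
                    if (q - p) + (s - r) >= 1:
                        yield z[:p], z[p:q], z[q:r], z[r:s], z[s:]


def refutes_pumping(z, n, member, exponents=(0, 2)):
    """True if every decomposition of z has an exponent i with u v^i w x^i y not in the language."""
    return all(
        any(not member(u + v * i + w + x * i + y) for i in exponents)
        for u, v, w, x, y in decompositions(z, n)
    )


def in_L2(s):
    """Membership test for L_2 = {a^m b^m c^m | m > 0}."""
    m = len(s) // 3
    return m > 0 and s == "a" * m + "b" * m + "c" * m


def in_anbn(s):
    """Membership test for the context free language {a^m b^m | m > 0}, for comparison."""
    m = len(s) // 2
    return m > 0 and s == "a" * m + "b" * m


n = 4
print(refutes_pumping("a" * n + "b" * n + "c" * n, n, in_L2))
print(refutes_pumping("a" * n + "b" * n, n, in_anbn))
```

For z = a^n b^n c^n no decomposition survives, because vwx is too short to touch all three letters. For the comparison language, the decomposition with v = a and x = b can be pumped freely.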


Push Down Automata

What is a fitting automaton model for context free languages?

Analogously to regular languages, we want to have an automaton model for context free languages.

Answer: push-down automata (German: Kellerautomaten)

These are automata that are equipped with an additional stack.


Push-Down Automata

We consider the language

L = {w$w^R | w ∈ {a, b, c, d}∗}

with Σ = {a, b, c, d, $}. Here w^R is the reverse of the word w (for example: (abc)^R = cba).


Push-Down Automata

To obtain an automaton model for context free languages,

we introduce a stack or push-down memory, on which a sequence of symbols of arbitrary length can be stored;

during the reading of a new symbol, the top symbol on the stack can be read and changed in the following ways:

either the stack is not changed, or
the top symbol on the stack is removed (pop) and replaced by a sequence of symbols (push).

At any other position the stack cannot be read or changed.


Push-Down Automata

Schematic representation of a push-down automaton:

[Figure: a push-down automaton reads the input and has access to a stack that contains the symbols A, B, C above the stack bottom symbol #.]


Push-Down Automata

We consider the language

L = {w$w^R | w ∈ {a, b, c, d}∗}

with Σ = {a, b, c, d, $}.

A push-down automaton recognizes the language in the following way:

A word w is read from left to right.

The automaton has two states:

State 1: Store the first part of the word.

State 2: Check the second part of the word.


Push-Down Automata

State 1: As long as $ has not been reached: push each symbol read onto the stack as the corresponding capital letter (a ↦ A, b ↦ B, . . . ).

When $ is read: don’t change the stack and change to state 2.

State 2: check for each symbol read whether the corresponding capital letter is the top symbol on the stack. If so, this symbol is removed.

When the read symbol and the top stack symbol don’t correspond, there are no possible transitions. The push-down automaton blocks and the word is not accepted.

When the two always correspond: the stack bottom symbol # is removed and the automaton accepts the word with an empty stack.
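The behaviour of the two states can be imitated directly in Python. This is a hand-written simulation of this one automaton, not a general push-down automaton. The function name `accepts` and the use of a Python list as the stack are choices made for this sketch.

```python
def accepts(word):
    """Simulate the two-state automaton for {w$w^R | w in {a, b, c, d}*}."""
    stack = ["#"]          # the stack initially holds only the stack bottom symbol
    state = 1
    for symbol in word:
        if state == 1:
            if symbol == "$":
                state = 2                      # stack unchanged, switch to checking
            elif symbol in "abcd":
                stack.append(symbol.upper())   # push the corresponding capital letter
            else:
                return False                   # symbol outside the alphabet
        else:
            if symbol in "abcd" and stack[-1] == symbol.upper():
                stack.pop()                    # symbols correspond: remove top symbol
            else:
                return False                   # no transition: the automaton blocks
    if state == 2 and stack == ["#"]:
        stack.pop()                            # remove # and accept with empty stack
        return True
    return False


for candidate in ["acad$daca", "$", "ab$ab", "ab$b"]:
    print(candidate, accepts(candidate))
```

Because this particular automaton never has to guess, a simple loop suffices. For languages such as {ww^R}, where the middle of the word has to be guessed, a search over several computations is needed.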


Push-Down Automata

Simulation

Run of the push-down automaton on the input acad$daca (the top stack symbol is on the left):

Step   State   Remaining input   Stack
0      1       acad$daca         #
1      1       cad$daca          A#
2      1       ad$daca           CA#
3      1       d$daca            ACA#
4      1       $daca             DACA#
5      2       daca              DACA#
6      2       aca               ACA#
7      2       ca                CA#
8      2       a                 A#
9      2       (none)            #
10     2       (none)            (empty)

After the last step the input has been read completely and the stack is empty, so the word is accepted.


Push-Down Automata

Push-down automaton (Definition)

A (non-deterministic) push-down automaton M is a 6-tuple M = (Z, Σ, Γ, δ, z_0, #), where

Z is the set of states,

Σ is the input alphabet (with Z ∩ Σ = ∅),

Γ is the stack alphabet,

z_0 ∈ Z is the initial state,

δ : Z × (Σ ∪ {ε}) × Γ → P_e(Z × Γ∗) is the transition function and

# ∈ Γ is the stack bottom symbol.


Push-Down Automata

Remarks about push-down automata:

Z, Σ and Γ must be finite sets again.

P_e(Z × Γ∗) denotes the set of all finite subsets of Z × Γ∗.

Abbreviation: PDA (push-down automaton) or KA (Kellerautomat)


Push Down Automata

Consider the transition function

δ : Z × (Σ ∪ {ε}) × Γ → P_e(Z × Γ∗)

When (z′, B_1 . . . B_k) ∈ δ(z, a, A), this means that

when the input symbol a is read in state z and the symbol A is the top symbol on the stack, then A is removed from the stack and replaced by B_1 . . . B_k (B_1 is the new top symbol) and the automaton changes to state z′.

It is also possible that a = ε. In this case no input symbol is read.


Push-Down Automata

We consider several cases of values of the transition function δ:

(z ′, ε) ∈ δ(z , a,A)

Symbol a is read

State changes from z to z ′

Symbol A is removed from the stack:

A


Push-Down Automata

(z ′,B) ∈ δ(z , a,A)

Symbol a is read

State changes from z to z ′

Symbol A on the stack is replaced by B:

A B


Push-Down Automata

(z ′,A) ∈ δ(z , a,A)

Symbol a is read

State changes from z to z ′

Symbol A stays on the stack:

AA


Push-Down Automata

(z ′,BA) ∈ δ(z , a,A)

Symbol a is read

State changes from z to z ′

Symbol B is put on the stack:

B

AA


Push-Down Automata

(z ′,B1 . . .Bk) ∈ δ(z , a,A)

Symbol a is read

State changes from z to z ′

Symbol A is replaced by several symbols:

A

. . .

B1

Bk


Push-Down Automata

Configuration (definition)

A configuration of a push-down automaton is a triple

k ∈ Z × Σ∗ × Γ∗.

Meaning of the components of k = (z ,w , γ) ∈ Z × Σ∗ × Γ∗:

z ∈ Z is the current state of the push-down automaton.

w ∈ Σ∗ is the input which still needs to be read.

γ ∈ Γ∗ is the current stack content. The top stack symbol is on the left.


Push-Down Automata

Transitions between configurations arise from the transition function δ:

Configuration transitions (definition)

It holds that

(z, aw, Aγ) ⊢ (z′, w, B_1 . . . B_k γ),

when (z′, B_1 . . . B_k) ∈ δ(z, a, A), and it holds that

(z, w, Aγ) ⊢ (z′, w, B_1 . . . B_k γ),

when (z′, B_1 . . . B_k) ∈ δ(z, ε, A).

In the first case a symbol is read from the input; in the second case no symbol is read.


Push-Down Automata

Accepted language (definition)

Let M = (Z, Σ, Γ, δ, z_0, #) be a push-down automaton. Then the language accepted by M is:

N(M) = {x ∈ Σ∗ | (z_0, x, #) ⊢∗ (z, ε, ε) for some z ∈ Z}.

The accepted language consists of all words for which it is possible to read the whole input and completely empty the stack. Since push-down automata are non-deterministic, there may also be computations for an accepted word that do not end with an empty stack.
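Acceptance with an empty stack can be explored with a small search over configurations. The sketch below is an illustration under several assumptions: states and stack symbols are single-character or short strings, δ is stored as a Python dictionary, ε is written as the empty string, and the example automaton for {a^n b^n | n ≥ 1} is made up for this purpose. The search is not guaranteed to terminate for automata whose ε-transitions can grow the stack without bound.

```python
from collections import deque


def accepts(delta, start_state, bottom, word):
    """Breadth-first search over configurations (state, remaining input, stack).

    delta maps (state, input symbol or "", top stack symbol) to a set of
    pairs (new state, string pushed in place of the top symbol).
    The stack is a string whose first character is the top symbol.
    """
    start = (start_state, word, bottom)
    seen = {start}
    queue = deque([start])
    while queue:
        state, rest, stack = queue.popleft()
        if rest == "" and stack == "":
            return True                       # input read and stack empty
        if stack == "":
            continue                          # no transition without a top symbol
        top, below = stack[0], stack[1:]
        # epsilon transitions: no input symbol is read
        successors = [(z, rest, push + below)
                      for z, push in delta.get((state, "", top), ())]
        # transitions that read the next input symbol
        if rest:
            successors += [(z, rest[1:], push + below)
                           for z, push in delta.get((state, rest[0], top), ())]
        for config in successors:
            if config not in seen:
                seen.add(config)
                queue.append(config)
    return False


# Hypothetical example automaton for {a^n b^n | n >= 1}
delta = {
    ("z0", "a", "#"): {("z0", "A#")},
    ("z0", "a", "A"): {("z0", "AA")},
    ("z0", "b", "A"): {("z1", "")},
    ("z1", "b", "A"): {("z1", "")},
    ("z1", "", "#"): {("z1", "")},
}

for candidate in ["aabb", "aab", "abab"]:
    print(candidate, accepts(delta, "z0", "#", candidate))
```

A word is accepted as soon as one computation reaches a configuration of the form (z, ε, ε). Other computations for the same word may block, which mirrors the remark about non-determinism.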


Push-Down Automata

At the start of each calculation the stack contains exactly the stack bottom symbol #.

The stack is unbounded and can grow arbitrarily large. There are infinitely many possible stacks; this distinguishes push-down automata from finite automata.

The push-down automata we consider always accept with an empty stack. Variants exist which have final states, like finite automata.


Push-Down Automata: Examples

Example 1: Let Σ = {a, b, $}. Give a push-down automaton M for the language N(M) = {w$w^R | w ∈ {a, b}∗}.

Example 2: Let Σ = {a, b}. Give a push-down automaton M′ for the language N(M′) = {ww^R | w ∈ {a, b}∗}.

Example 3: Let Σ = {a, b}. Give a push-down automaton K for the language N(K) = {a^n b^m | 1 ≤ n ≤ m}.


Push-Down Automata

Now we have to show that push-down automata really accept exactly the context free languages.

Context free grammar → push-down automaton (theorem)

For each context free grammar G there is a push-down automaton M with L(G) = N(M).


Push-Down Automata

Proof idea:

1 Use the stack to simulate the grammar. Derive a word of the language on the stack (guessing non-deterministically) and check whether the word corresponds to the input.

2 Problem: the stack cannot be used arbitrarily; we can only replace the top stack symbol. Solution: remove already derived parts of the word from the stack by comparing them with the input and removing them when they correspond.

3 Thus one can make sure that, whenever a derivation step is made, a variable is on top of the stack.


Push-Down Automata

More formally: let G = (V, Σ, P, S) be a context free grammar. We define the push-down automaton

M = ({z}, Σ, V ∪ Σ, δ, z, S)

with a single state z and the stack alphabet V ∪ Σ. The start variable S is the stack bottom symbol.

Transition function δ:

For each production (A → α) ∈ P with α ∈ (V ∪ Σ)∗, add (z, α) to the set δ(z, ε, A). (Derivation step on the stack without reading from the input.)

Additionally, add (z, ε) to δ(z, a, a) for each a ∈ Σ. (Comparing stack and input.)
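The two rules for δ translate almost literally into code. The Python sketch below builds δ for a grammar whose symbols are single characters and then searches for an accepting computation. The function names, the dictionary representation of δ and the pruning rule (more terminals on the stack than input symbols left) are assumptions of this sketch. The pruning is enough for the small bracket grammar used here but is not a general termination guarantee.

```python
from collections import deque


def grammar_to_pda(productions, terminals):
    """Build delta for the one-state automaton M = ({z}, Sigma, V u Sigma, delta, z, S)."""
    delta = {}
    for head, body in productions:
        # derivation step on the stack without reading input: (z, body) in delta(z, eps, head)
        delta.setdefault(("z", "", head), set()).add(("z", body))
    for a in terminals:
        # comparing stack and input: (z, eps) in delta(z, a, a)
        delta.setdefault(("z", a, a), set()).add(("z", ""))
    return delta


def accepts(delta, word, start_symbol, terminals):
    """Search for a computation (z, word, S) |-* (z, eps, eps)."""
    start = ("z", word, start_symbol)
    seen, queue = {start}, deque([start])
    while queue:
        state, rest, stack = queue.popleft()
        if not rest and not stack:
            return True
        if not stack:
            continue
        top, below = stack[0], stack[1:]
        successors = [(z, rest, push + below)
                      for z, push in delta.get((state, "", top), ())]
        if rest:
            successors += [(z, rest[1:], push + below)
                           for z, push in delta.get((state, rest[0], top), ())]
        for config in successors:
            # prune: every terminal on the stack still has to be matched against input
            pending = sum(symbol in terminals for symbol in config[2])
            if pending <= len(config[1]) and config not in seen:
                seen.add(config)
                queue.append(config)
    return False


# Bracket grammar S -> [S]S | eps
delta = grammar_to_pda([("S", "[S]S"), ("S", "")], "[]")
for candidate in ["[[]][]", "[[]", "]["]:
    print(candidate, accepts(delta, candidate, "S", "[]"))
```

The automaton alternates between expanding a variable on top of the stack and matching a terminal on top of the stack against the next input symbol.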


Push-Down Automata

We consider the following context free grammar with the two-element alphabet Σ = {[, ]}, which generates correctly parenthesized structures:

S → [S]S | ε

Exercise: transform the grammar into a push-down automaton and accept the word [ [ ] ] [ ].


Push-Down Automata

Simulation of the PDA on the word [ [ ] ] [ ]

The push-down automaton has the single state z and simulates the grammar

S → [S]S

S → ε

It passes through the following configurations (the top stack symbol is on the left):

(z, [ [ ] ] [ ], S)
⊢ (z, [ [ ] ] [ ], [S]S)      (apply S → [S]S)
⊢ (z, [ ] ] [ ], S]S)         (read [)
⊢ (z, [ ] ] [ ], [S]S]S)      (apply S → [S]S)
⊢ (z, ] ] [ ], S]S]S)         (read [)
⊢ (z, ] ] [ ], ]S]S)          (apply S → ε)
⊢ (z, ] [ ], S]S)             (read ])
⊢ (z, ] [ ], ]S)              (apply S → ε)
⊢ (z, [ ], S)                 (read ])
⊢ (z, [ ], [S]S)              (apply S → [S]S)
⊢ (z, ], S]S)                 (read [)
⊢ (z, ], ]S)                  (apply S → ε)
⊢ (z, ε, S)                   (read ])
⊢ (z, ε, ε)                   (apply S → ε)

The input has been read completely and the stack is empty, so the word is accepted.


Push-Down Automaton → Context Free Grammar

Now we want to show that for each push-down automaton there exists an equivalent context free grammar.

Push-down automaton → Context free grammar (theorem)

For each push-down automaton M there exists a context free grammar G with N(M) = L(G).


Push-Down Automaton → Context Free Grammar

Proof idea:

We want to describe which words can be read while a single symbol is removed from the stack. The language accepted by the automaton consists of all words that can be read while # is removed from the stack.

Removing from the stack means: in between, other symbols may be put on the stack, but in the end the stack must be one symbol smaller.


Push-Down Automaton → Context Free Grammar

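[Figure: height of the stack plotted against the number of input symbols read. At the beginning, A is the top stack symbol. In between, A can be replaced (for example by B); however, the stack does not get lower. The point where the stack is lower for the first time marks the end of the word a_1 a_2 . . . a_k, which is read while A is removed from the stack.]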


Push-Down Automaton → Context Free Grammar

Which productions are needed?

There are two possibilities:

A is removed directly after reading a:

⇒ production of the form "A" → a

After reading a, A is replaced by B1, . . . , Bn. Now we have to remove B1, . . . , Bn and concatenate the generated words.

⇒ production of the form "A" → a "B1" · · · "Bn"

But how do we also consider the states of the push-down automaton?


Push-Down Automaton → Context Free Grammar

Question: How do we consider the states of the push-down automaton?

Answer: We will generate a context free grammar with variables of the form (z, A, z′), meaning:

From (z, A, z′) we can derive exactly those words which the push-down automaton can read when it starts in state z, removes A from the stack, and ends in state z′.

(z, A, z′) ⇒∗ x ⇐⇒ (z, x, A) ⊢∗ (z′, ε, ε)


Push-Down Automaton → Context Free Grammar

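[Figure: height of the stack plotted against the number of input symbols read. At the start, A is the top stack symbol and the PDA is in state z. At the end, the stack is lower for the first time and the PDA is in state z′. The word a1 a2 . . . ak read in between is a word generated from the variable (z, A, z′).]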


Push-Down Automaton → Context Free Grammar

Formal definition:

Let a push-down automaton M = (Z, Σ, Γ, δ, z0, #) be given. We define a grammar G = (V, Σ, P, S) as follows:

Variables: V = {S} ∪ Z × Γ × Z (a new start variable S and variables of the form (z1, A, z2))


Push-Down Automata

The grammar consists of productions of the following forms (with a ∈ Σ ∪ {ε}):

S → (z0, #, z) for all z ∈ Z
(Begin)

(z, A, z′) → a when (z′, ε) ∈ δ(z, a, A)
(Symbol A can be removed immediately when reading a)

(z, A, z′) → a (z1, B1, z2)(z2, B2, z3) . . . (zk, Bk, z′) when (z1, B1 . . . Bk) ∈ δ(z, a, A), for all z2, . . . , zk, z′ ∈ Z
(Symbol A is replaced by B1 . . . Bk; these must be removed while passing through the states z1, . . . , zk and ending in z′)
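The three kinds of productions can be generated mechanically. The following is a minimal sketch, not part of the lecture material: the function name pda_to_cfg and the encoding of δ as a Python dictionary (with "" standing for ε and pushed symbols given as a tuple, top symbol first) are assumptions made only for this illustration.

```python
from itertools import product


def pda_to_cfg(states, delta, start_state, bottom):
    """Build the productions of the triple construction.

    delta maps (z, a, A) to a set of (z1, pushed) pairs, where a == "" stands
    for epsilon and pushed is a tuple of stack symbols (top symbol first).
    Variables are the string "S" and triples (z, A, z'); terminals are strings.
    """
    productions = []
    # S -> (z0, #, z) for all states z
    for z in states:
        productions.append(("S", ((start_state, bottom, z),)))
    for (z, a, A), targets in delta.items():
        terminal = (a,) if a else ()
        for z1, pushed in targets:
            k = len(pushed)
            if k == 0:
                # A is removed immediately: (z, A, z1) -> a
                productions.append(((z, A, z1), terminal))
                continue
            # Guess the intermediate states z2..zk and the final state z'.
            for rest in product(states, repeat=k):
                chain = (z1,) + rest
                body = tuple((chain[i], pushed[i], chain[i + 1]) for i in range(k))
                productions.append(((z, A, rest[-1]), terminal + body))
    return productions


# Hypothetical encoding of a small PDA for {a^n b^n | n >= 1}.
EXAMPLE_DELTA = {
    ("z1", "a", "#"): {("z1", ("A", "#"))},
    ("z1", "a", "A"): {("z1", ("A", "A"))},
    ("z1", "b", "A"): {("z2", ())},
    ("z2", "b", "A"): {("z2", ())},
    ("z2", "", "#"): {("z2", ())},
}

if __name__ == "__main__":
    for head, body in pda_to_cfg(["z1", "z2"], EXAMPLE_DELTA, "z1", "#"):
        print(head, "->", body if body else "epsilon")
```

Many of the generated variables are useless (they derive no word); the construction does not try to avoid them.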


Push-Down Automata

Example: Consider the push-down automaton

M = ({z1, z2}, {a, b}, {A,#}, δ, z1,#)

with the following transition function δ:

(z1, a,#)→ (z1,A#)

(z1, a,A)→ (z1,AA)

(z1, b,A)→ (z2, ε)

(z2, b,A)→ (z2, ε)

(z2, ε,#)→ (z2, ε)

It holds that N(M) = {a^n b^n | n ≥ 1}.

Exercise: Transform M into a context free grammar.
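To get a feeling for the automaton M before transforming it, one can simulate it on short words. The sketch below is an illustration under assumptions: the dictionary encoding of δ and the function name accepts_by_empty_stack are invented here, and the breadth-first search over configurations is only guaranteed to terminate for automata like this one, where ε transitions cannot make the stack grow forever.

```python
from collections import deque

# Transition function of the example PDA; "" stands for epsilon,
# pushed symbols are given as a tuple with the top symbol first.
DELTA = {
    ("z1", "a", "#"): {("z1", ("A", "#"))},
    ("z1", "a", "A"): {("z1", ("A", "A"))},
    ("z1", "b", "A"): {("z2", ())},
    ("z2", "b", "A"): {("z2", ())},
    ("z2", "", "#"): {("z2", ())},
}


def accepts_by_empty_stack(delta, start_state, bottom, word):
    """Search all reachable configurations (state, input position, stack)."""
    start = (start_state, 0, (bottom,))
    seen = {start}
    queue = deque([start])
    while queue:
        state, pos, stack = queue.popleft()
        if pos == len(word) and not stack:
            return True  # input consumed and stack empty
        if not stack:
            continue  # no transition is possible with an empty stack
        top, rest = stack[0], stack[1:]
        moves = [("", pos)]  # epsilon move
        if pos < len(word):
            moves.append((word[pos], pos + 1))  # move reading one symbol
        for symbol, next_pos in moves:
            for next_state, pushed in delta.get((state, symbol, top), ()):
                config = (next_state, next_pos, pushed + rest)
                if config not in seen:
                    seen.add(config)
                    queue.append(config)
    return False


if __name__ == "__main__":
    for w in ["ab", "aabb", "aab", "abb", ""]:
        print(repr(w), accepts_by_empty_stack(DELTA, "z1", "#", w))
```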


Push-Down Automata

Remark on the translations “context free grammar ↔ push-down automaton”:

For each push-down automaton there exists an equivalent push-down automaton with a single state.

To construct it, transform the push-down automaton into a context free grammar and then back into a push-down automaton. This works because the translation from context free grammars to push-down automata always produces push-down automata with a single state.


Interlude: Push-Down Automata with Final States

In the literature, push-down automata are often defined in such a way that they have final states and do not accept with an empty stack.

Push-down automaton with final states (definition)

A push-down automaton with final states M is a 7-tuple M = (Z, Σ, Γ, δ, z0, #, E), where

(Z ,Σ, Γ, δ, z0,#) is a push-down automaton,

E ⊆ Z is the set of final states


Interlude: Push-Down Automata with Final States

Accepted language of a push-down automaton with final states (definition)

Let M = (Z, Σ, Γ, δ, z0, #, E) be a push-down automaton with final states. The language accepted by M is:

N(M) = {x ∈ Σ∗ | (z0, x, #) ⊢∗ (z, ε, γ) for some z ∈ E, γ ∈ Γ∗}.

The following differences to acceptance with an empty stack exist:

The state which is reached in the end is required to be a final state.

In the end the stack is not necessarily empty.


Interlude: Push-Down Automata with Final States

Theorem

A language is accepted by a push-down automaton with final states if and only if it is accepted by a push-down automaton (accepting with an empty stack).

(Proof omitted.)


Interlude: Push-Down Automata with Final States

L = {w w^R | w ∈ {a, b}∗}.

M = ({z1, z2, z3}, {a, b}, {A, B, #}, δ, z1, #, {z3}), where δ is given by:

(z1, a, #) → (z1, A#)    (z1, a, A) → (z1, AA)    (z1, a, B) → (z1, AB)
(z1, b, #) → (z1, B#)    (z1, b, A) → (z1, BA)    (z1, b, B) → (z1, BB)
(z1, ε, #) → (z2, #)     (z1, a, A) → (z2, ε)     (z1, b, B) → (z2, ε)
(z2, a, A) → (z2, ε)     (z2, b, B) → (z2, ε)     (z2, ε, #) → (z3, #)


Closure Properties

Are the context free languages closed under the following operations?

Union (L1, L2 context free ⇒ L1 ∪ L2 context free) ?

Product/Concatenation (L1, L2 context free ⇒ L1L2 context free) ?

Star-Operation (L context free ⇒ L∗ context free) ?

Intersection (L1, L2 context free ⇒ L1 ∩ L2 context free) ?

Complement (L context free ⇒ Σ∗ \ L context free) ?


Closure Properties

Closure under union

When L1 and L2 are context free languages, then L1 ∪ L2 is also context free.

Proof: Let context free grammars

G1 = (V1, Σ, P1, S1) and G2 = (V2, Σ, P2, S2)

(with V1 ∩ V2 = ∅) be given for L1 = L(G1), L2 = L(G2). The grammar

G = (V1 ∪ V2 ∪ {S}, Σ, P1 ∪ P2 ∪ {S → S1, S → S2}, S),

where S ∉ V1 ∪ V2, is a context free grammar for L1 ∪ L2.


Closure Properties

Closure under product/concatenation

When L1 and L2 are context free languages, then L1L2 is also context free.

Proof: Let context free grammars

G1 = (V1, Σ, P1, S1) and G2 = (V2, Σ, P2, S2)

(with V1 ∩ V2 = ∅) be given for L1 = L(G1), L2 = L(G2). Then

G = (V1 ∪ V2 ∪ {S}, Σ, P1 ∪ P2 ∪ {S → S1S2}, S),

where S ∉ V1 ∪ V2, is a context free grammar for L1L2.


Closure Properties

Closure under the Star Operation

When L is a context free language, then L∗ is also context free.

Proof: Let a context free grammar

G1 = (V1,Σ,P1, S1)

for L = L(G1) be given. Then

G = (V1 ∪ {S}, Σ, P1 ∪ {S → ε, S → S1S}, S),

where S ∉ V1, is a context free grammar for L∗.
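The three grammar constructions for union, concatenation and star can be written down directly. The sketch below is only an illustration: the representation of a grammar as a tuple (variables, terminals, productions, start) with productions as (head, body) pairs, and the function names, are assumptions and not part of the lecture.

```python
def union(g1, g2, new_start="S"):
    """Grammar for L(g1) ∪ L(g2): add S -> S1 and S -> S2."""
    v1, sigma1, p1, s1 = g1
    v2, sigma2, p2, s2 = g2
    assert not (v1 & v2), "variable sets must be disjoint"
    assert new_start not in v1 | v2, "start variable must be new"
    productions = p1 | p2 | {(new_start, (s1,)), (new_start, (s2,))}
    return (v1 | v2 | {new_start}, sigma1 | sigma2, productions, new_start)


def concatenation(g1, g2, new_start="S"):
    """Grammar for L(g1)L(g2): add S -> S1 S2."""
    v1, sigma1, p1, s1 = g1
    v2, sigma2, p2, s2 = g2
    assert not (v1 & v2), "variable sets must be disjoint"
    assert new_start not in v1 | v2, "start variable must be new"
    productions = p1 | p2 | {(new_start, (s1, s2))}
    return (v1 | v2 | {new_start}, sigma1 | sigma2, productions, new_start)


def star(g1, new_start="S"):
    """Grammar for L(g1)*: add S -> epsilon and S -> S1 S."""
    v1, sigma1, p1, s1 = g1
    assert new_start not in v1, "start variable must be new"
    productions = p1 | {(new_start, ()), (new_start, (s1, new_start))}
    return (v1 | {new_start}, sigma1, productions, new_start)


if __name__ == "__main__":
    g_a = ({"A"}, {"a"}, {("A", ("a",))}, "A")  # generates {a}
    g_b = ({"B"}, {"b"}, {("B", ("b",))}, "B")  # generates {b}
    print(union(g_a, g_b))
    print(concatenation(g_a, g_b))
    print(star(g_a))
```

The assertions mirror the side conditions V1 ∩ V2 = ∅ and S ∉ V1 ∪ V2 from the proofs; if they do not hold, the variables have to be renamed first.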


Closure Properties

No closure under intersection

When L1 and L2 are context free languages, then L1 ∩ L2 is not necessarily context free.

Counter example: The languages

L1 = {a^j b^k c^k | j ≥ 0, k ≥ 0}
L2 = {a^k b^k c^j | j ≥ 0, k ≥ 0}

are context free. Their intersection

L1 ∩ L2 = {a^k b^k c^k | k ≥ 0}

is, however, not context free.


Closure Properties

Closure under intersection with regular languages

Let L be a context free language and R a regular language. Then L ∩ R is a context free language.

Idea: Similar to the construction of a cross product automaton of two finite automata, we can build the cross product of a push-down automaton and a finite automaton, which accepts the intersection of the two languages.


Closure Properties

Proof idea: Construction of a push-down automaton M′ for L ∩ R from a push-down automaton (with final states) M = (Z1, Σ, Γ, δ1, z0^1, #, E1) for L and a deterministic finite automaton A = (Z2, Σ, δ2, z0^2, E2) for R:

M′ = (Z1 × Z2, Σ, Γ, δ, (z0^1, z0^2), #, E1 × E2)

with

((z1′, z2′), B1 . . . Bk) ∈ δ((z1, z2), a, A), when (z1′, B1 . . . Bk) ∈ δ1(z1, a, A) and δ2(z2, a) = z2′

((z1′, z2), B1 . . . Bk) ∈ δ((z1, z2), ε, A), when (z1′, B1 . . . Bk) ∈ δ1(z1, ε, A)
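The definition of δ for the product automaton translates almost literally into code. The following sketch is an illustration under assumptions: the function name intersect_with_dfa and the dictionary encodings of δ1 and δ2 (with "" standing for ε) are made up for this example.

```python
def intersect_with_dfa(pda_delta, dfa_delta, dfa_states):
    """Transition function of the product of a PDA and a DFA.

    pda_delta maps (z1, a, A) to a set of (z1', pushed) pairs ("" = epsilon),
    dfa_delta maps (z2, a) to the successor state z2'.
    """
    delta = {}
    for (z1, a, A), targets in pda_delta.items():
        for z2 in dfa_states:
            if a == "":
                z2_next = z2  # epsilon move: the DFA does not move
            elif (z2, a) in dfa_delta:
                z2_next = dfa_delta[(z2, a)]  # both automata read a
            else:
                continue  # the DFA has no transition for a in z2
            delta[((z1, z2), a, A)] = {
                ((z1_next, z2_next), pushed) for z1_next, pushed in targets
            }
    return delta


if __name__ == "__main__":
    # Hypothetical toy automata: a PDA fragment and a DFA counting a's modulo 2.
    pda_delta = {
        ("p", "a", "#"): {("p", ("A", "#"))},
        ("p", "", "#"): {("q", ("#",))},
    }
    dfa_delta = {("even", "a"): "odd", ("odd", "a"): "even"}
    for key, value in intersect_with_dfa(pda_delta, dfa_delta, ["even", "odd"]).items():
        print(key, "->", value)
```

The start state of the product is the pair of both start states, and the final states are the pairs in E1 × E2, as in the definition of M′.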


Closure Properties

No closure under complement

When L is a context free language, then its complement Σ∗ \ L is not necessarily context free.

Proof: Assume that the context free languages are closed under complement. Because L1 ∩ L2 = Σ∗ \ ((Σ∗ \ L1) ∪ (Σ∗ \ L2)) and the context free languages are closed under union, they would then be closed under intersection as well, which is not the case. That is, we obtain a contradiction.


Closure Properties

Closure properties (summary)

The context free languages are closed under:

Union (L1, L2 context free ⇒ L1 ∪ L2 context free)

Product/Concatenation (L1, L2 context free ⇒ L1L2 context free)

Star-Operation (L context free ⇒ L∗ context free)

Intersection with a regular language (L context free, R regular ⇒ L ∩ R context free)

The context free languages are not closed under:

Intersection

Complement


Decidability

Are the following problems decidable?

Word problem: Let a context free language L and a word w ∈ Σ∗ be given. Does w ∈ L hold?

Emptiness problem: Let a context free language L be given. Does L = ∅ hold?

Finiteness problem: Let a context free language L be given. Is L finite?

Equivalence problem: Let two context free languages L1, L2 be given. Does L1 = L2 hold?

Intersection problem: Let two context free languages L1, L2 be given. Does L1 ∩ L2 = ∅ hold?

Decidable means that there exists an algorithm which solves the problem in every case.


Decidability

Decidability of the word problem

The word problem for context free languages is decidable.

We can solve it with the CYK algorithm.
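As a reminder of how this works, here is a compact sketch of the CYK algorithm. It assumes that the grammar is given in Chomsky normal form; the encoding of the rules as sets of tuples, the function name cyk and the example grammar are assumptions made for this illustration.

```python
def cyk(word, unit_rules, binary_rules, start):
    """Decide whether word is generated by a grammar in Chomsky normal form.

    unit_rules is a set of pairs (A, a) for productions A -> a,
    binary_rules is a set of triples (A, B, C) for productions A -> BC.
    """
    n = len(word)
    if n == 0:
        return False  # the empty word has to be treated separately in CNF
    # table[i][l - 1] holds all variables that derive word[i:i + l]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, symbol in enumerate(word):
        table[i][0] = {A for A, a in unit_rules if a == symbol}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for A, B, C in binary_rules:
                    if B in left and C in right:
                        table[i][length - 1].add(A)
    return start in table[0][n - 1]


# Hypothetical CNF grammar for {a^n b^n | n >= 1}:
# S -> AB | AT, T -> SB, A -> a, B -> b
UNIT_RULES = {("A", "a"), ("B", "b")}
BINARY_RULES = {("S", "A", "B"), ("S", "A", "T"), ("T", "S", "B")}

if __name__ == "__main__":
    for w in ["ab", "aabb", "aab", "ba"]:
        print(w, cyk(w, UNIT_RULES, BINARY_RULES, "S"))
```

The three nested loops over length, start position and split point give a running time that is cubic in the length of the word.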


Decidability

Decidability of the emptiness problem

The emptiness problem for context free languages is decidable.

Proof idea: Determine all productive variables, that is, all variables A for which an x ∈ Σ∗ exists with A ⇒∗ x. The language is empty if and only if the start variable S is not productive.


Decidability

Determine productive variables:

input Grammar G = (V, Σ, P, S)
T := {A ∈ V | there exists an (A → w) ∈ P with w ∈ Σ∗}
repeat
    P′ := {(A → w) ∈ P | for all variables B in w: B ∈ T}
    T := T ∪ {A | there exists a w such that (A → w) ∈ P′}
until T was not modified in the last iteration
return T

Determine whether the language of a grammar is empty:

input Grammar G = (V, Σ, P, S)
T := productive variables of G
return S ∉ T
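The two procedures above can be sketched in a few lines of Python. The representation of productions as (head, body) pairs and the function names are assumptions made for this illustration.

```python
def productive_variables(variables, productions):
    """Return all variables from which some word of terminals can be derived.

    productions is a collection of (head, body) pairs, body being a tuple
    of variables and terminal symbols.
    """
    productive = set()
    changed = True
    while changed:  # repeat until the set no longer grows
        changed = False
        for head, body in productions:
            if head in productive:
                continue
            # A production is usable once all variables in its body are productive.
            if all(sym in productive for sym in body if sym in variables):
                productive.add(head)
                changed = True
    return productive


def is_empty(variables, productions, start):
    """The language is empty iff the start variable is not productive."""
    return start not in productive_variables(variables, productions)


if __name__ == "__main__":
    variables = {"S", "A", "B"}
    # B can never get rid of itself, so S -> AB is useless.
    productions = [("S", ("A", "B")), ("A", ("a",)), ("B", ("B", "b"))]
    print(productive_variables(variables, productions))
    print(is_empty(variables, productions, "S"))
```

The first pass of the loop covers the initialisation step of the pseudocode, because a body without variables satisfies the condition trivially.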


Decidability

Decidability of the finiteness problem

The finiteness problem for context free languages is decidable.

Proof idea: Let a context free grammar be given. Let n = 2^|V|, that is, the pumping lemma number of the grammar. All words x ∈ L with |x| ≥ n can be pumped up, while all words x ∈ L with |x| ≥ 2n can be pumped down.

This means that L is infinite if and only if there is a word x ∈ L such that n ≤ |x| ≤ 2n. There are only finitely many such words, and so we can test all of them for membership in the language.


Decidability

Undecidability for context free languages

The following problems are undecidable for context free languages, which means that there are no procedures to solve them:

Equivalence problem: Let two context free languages L1, L2 be given. Does L1 = L2 hold?

Intersection problem: Let two context free languages L1, L2 be given. Does L1 ∩ L2 = ∅ hold?

Remark: In the lecture “Berechenbarkeit und Komplexität” it will be shown how we can prove that such problems are undecidable.


Decidability

The intersection problem is decidable when one of the two languages L1, L2 is known to be regular.

Decision procedure:

1. Construct the push-down automaton that accepts L1 ∩ L2.

2. Convert it to a context free grammar.

3. Check whether the language generated by the context free grammar is empty.


Deterministic Context Free Languages

We now consider a subclass of push-down automata that can be used to recognize languages deterministically and thus efficiently.

Deterministic Push-Down Automaton (definition)

A deterministic push-down automaton M is a 7-tuple M = (Z, Σ, Γ, δ, z0, #, E), where

(Z, Σ, Γ, δ, z0, #) is a push-down automaton,

E ⊆ Z is the set of final states, and

the transition function δ is deterministic, which means: for all z ∈ Z, a ∈ Σ and A ∈ Γ:

|δ(z , a,A)|+ |δ(z , ε,A)| ≤ 1.
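The determinism condition can be checked mechanically for a given transition function. The following is a small sketch; the dictionary encoding of δ (with "" standing for ε) and the function name is_deterministic are assumptions made for this illustration.

```python
def is_deterministic(delta, states, alphabet, stack_symbols):
    """Check |δ(z, a, A)| + |δ(z, ε, A)| <= 1 for all z, a and A.

    delta maps (z, a, A) to a set of (z', pushed) pairs; "" stands for epsilon.
    """
    for z in states:
        for A in stack_symbols:
            eps = len(delta.get((z, "", A), ()))
            if eps > 1:
                return False  # also covers the case of an empty alphabet
            for a in alphabet:
                if len(delta.get((z, a, A), ())) + eps > 1:
                    return False
    return True


if __name__ == "__main__":
    # Two alternatives for the same (state, symbol, stack symbol): not deterministic.
    guessing = {("z1", "a", "A"): {("z1", ("A", "A")), ("z2", ())}}
    print(is_deterministic(guessing, ["z1", "z2"], ["a", "b"], ["A", "#"]))
```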


Deterministic Context Free Languages

Differences between push-down automata and deterministic push-down automata:

Deterministic push-down automata have a set of final states and accept in a final state, not with an empty stack.

For each state z and each stack symbol A it holds that:

either there is at most one ε transition and no transition for any alphabet symbol,
or there is no ε transition and for each alphabet symbol there is at most one transition.

The definitions of configurations and transitions between configurations stay the same.


Deterministic Context Free Languages

Accepted language of a deterministic push-down automaton (definition)

Let M = (Z, Σ, Γ, δ, z0, #, E) be a deterministic push-down automaton. The language accepted by M is:

N(M) = {x ∈ Σ∗ | (z0, x, #) ⊢∗ (z, ε, γ) for some z ∈ E, γ ∈ Γ∗}.


Deterministic Context Free Languages

Deterministic context free languages

A language is deterministic context free if and only if it is accepted by a deterministic push-down automaton.


Deterministic Context Free Languages

Example: The language L = {w$w^R | w ∈ {a, b}∗} is deterministic context free.

M = ({z1, z2, z3}, {a, b, $}, {#, A, B}, δ, z1, #, {z3}),

where δ is defined as follows (we write (z, a, A) → (z′, x) when (z′, x) ∈ δ(z, a, A)):

(z1, a, #) → (z1, A#)    (z1, a, A) → (z1, AA)    (z1, a, B) → (z1, AB)
(z1, b, #) → (z1, B#)    (z1, b, A) → (z1, BA)    (z1, b, B) → (z1, BB)
(z1, $, #) → (z2, #)     (z1, $, A) → (z2, A)     (z1, $, B) → (z2, B)
(z2, a, A) → (z2, ε)     (z2, b, B) → (z2, ε)     (z2, ε, #) → (z3, #)
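Because at most one transition is applicable in every configuration, this automaton can be run directly without any search. The sketch below is an illustration under assumptions: the dictionary encoding of δ and the function name dpda_accepts are invented here, and the simple loop relies on the fact that this particular automaton has no infinite sequence of ε transitions.

```python
# Transition function of the DPDA for {w$w^R}; "" stands for epsilon,
# pushed symbols are given as a tuple with the top symbol first.
DELTA = {
    ("z1", "a", "#"): ("z1", ("A", "#")),
    ("z1", "a", "A"): ("z1", ("A", "A")),
    ("z1", "a", "B"): ("z1", ("A", "B")),
    ("z1", "b", "#"): ("z1", ("B", "#")),
    ("z1", "b", "A"): ("z1", ("B", "A")),
    ("z1", "b", "B"): ("z1", ("B", "B")),
    ("z1", "$", "#"): ("z2", ("#",)),
    ("z1", "$", "A"): ("z2", ("A",)),
    ("z1", "$", "B"): ("z2", ("B",)),
    ("z2", "a", "A"): ("z2", ()),
    ("z2", "b", "B"): ("z2", ()),
    ("z2", "", "#"): ("z3", ("#",)),
}
FINAL = {"z3"}


def dpda_accepts(word):
    """Run the DPDA on word; accept when all input is read in a final state."""
    state, stack, pos = "z1", ["#"], 0  # the top of the stack is the end of the list
    while True:
        if pos == len(word) and state in FINAL:
            return True
        if not stack:
            return False
        top = stack[-1]
        if (state, "", top) in DELTA:
            key = (state, "", top)  # epsilon move, no input consumed
        elif pos < len(word) and (state, word[pos], top) in DELTA:
            key = (state, word[pos], top)
            pos += 1
        else:
            return False  # the automaton is stuck
        state, pushed = DELTA[key]
        stack.pop()
        stack.extend(reversed(pushed))  # push so that pushed[0] ends up on top


if __name__ == "__main__":
    for w in ["$", "a$a", "ab$ba", "a$b", "ab$ab", "a$"]:
        print(repr(w), dpda_accepts(w))
```

Every loop iteration performs exactly one transition, which is where the linear running time for the word problem comes from.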


Deterministic Context Free Languages

Example 2: The language L = {w w^R | w ∈ {a, b}∗} is, however, not deterministic context free. (Without proof.)

We have already seen that this language is context free. That means that the class of context free languages and the class of deterministic context free languages are not equal.

The class of deterministic context free languages is a strict subset of the class of context free languages.


Deterministic Context Free Languages

Further remarks:

Efficiency: With deterministic push-down automata we have obtained a procedure for solving the word problem which runs in linear time (as a function of the number of input symbols).

For this, one lets the push-down automaton run on the word and checks whether a final state was reached in the end.

Deterministic context free grammars: There are also classes of grammars which correspond to the deterministic push-down automata.

These are, however, non-trivial; there are several forms. The most well-known is the class of so-called LR(k) grammars.


Deterministic Context Free Languages

Deterministic context free languages have different closure properties than context free languages.

Closure

Are the deterministic context free languages closed under the following operations?

Union (L1, L2 det. context free ⇒ L1 ∪ L2 det. context free) ?

Intersection (L1, L2 det. context free ⇒ L1 ∩ L2 det. context free) ?

Intersection with a regular language (L1 det. context free, L2 regular ⇒ L1 ∩ L2 det. context free) ?

Complement (L det. context free ⇒ Σ∗ \ L det. context free) ?


Deterministic Context Free Languages

Closure under complement

When L is a deterministic context free language, then its complement Σ∗ \ L is also deterministic context free.

Informal proof idea: Just like with DFAs, we can construct a push-down automaton for the complement by exchanging final and non-final states.


Deterministic Context Free Languages

No closure under intersection

When L1 and L2 are deterministic context free languages, then L1 ∩ L2 is not necessarily a deterministic context free language.

Proof: The example languages from the argument that context free languages are not closed under intersection are deterministic context free; however, their intersection is not even context free. So we can use the same counter example.

L1 = {a^j b^k c^k | j ≥ 0, k ≥ 0}
L2 = {a^k b^k c^j | j ≥ 0, k ≥ 0}

L1 ∩ L2 = {a^n b^n c^n | n ≥ 0}


Deterministic Context Free Languages

No closure under union

When L1 and L2 are deterministic context free languages, then L1 ∪ L2 is not necessarily a deterministic context free language.

Proof: From closure under union, together with closure under complement, it would follow that the deterministic context free languages are closed under intersection, because of the following equality: L1 ∩ L2 = Σ∗ \ ((Σ∗ \ L1) ∪ (Σ∗ \ L2)).


Deterministic Context Free Languages

Deterministic context free languages are closed under intersection with regular languages.

Closure under intersection with regular languages

Let L be a deterministic context free language and R a regular language. Then L ∩ R is a deterministic context free language.

Proof idea: Analogous to context free languages.


Deterministic Context Free Languages

Summary closure properties

Closed under     Regular   Det. CF   CF
Union            ✓         ✗         ✓
Concatenation    ✓         ✗         ✓
Kleene-Star      ✓         ✗         ✓
Intersection     ✓         ✗         ✗
Inters. w. RL    ✓         ✓         ✓
Complement       ✓         ✓         ✗


Decidability

Decidability with deterministic context free languages

The following problems are decidable for deterministic context free languages (represented by a deterministic push-down automaton):

Word problem: Let a deterministic context free language L and a word w ∈ Σ∗ be given. Does w ∈ L hold?

With a deterministic push-down automaton in linear time (as a function of the length of the input).

Emptiness problem: Let a deterministic context free language L be given. Does L = ∅ hold?

See the corresponding procedure for context free languages.


Decidability

Decidability with deterministic context free languages

Finiteness problem: Let a deterministic context free language L be given. Is L finite?

See the corresponding procedure for context free languages.

Equivalence problem: Let two deterministic context free languages L1, L2 be given. Does L1 = L2 hold?

Was an open problem for a long time. Only in 1997 was it shown to be decidable, by Géraud Sénizergues.


Decidability

Undecidability with deterministic context free languages

The following problem is not decidable for deterministic context free languages.

Intersection problem: Let two deterministic context free languages L1, L2 be given. Does L1 ∩ L2 = ∅ hold?

As with context free languages, this problem is decidable when one of the two languages is regular.


Parsing with ANTLR

Context free languages are often used to specify the syntax of computer languages (programming languages, markup languages, etc.).

A parser is a program module that reads the source code of a program and produces some representation of it (for example a syntax tree).

In this lecture we look at ANTLR v4 (ANother Tool for Language Recognition), a parser generator which can be used to generate a parser (in Java).


Parsing with ANTLR

In general, the parser of a compiler or an interpreter consists of two components:

The lexical analyser (lexer) groups characters into strings, the so-called tokens.

The parser itself analyses the sequence of tokens and constructs a syntax tree (or another representation).

ANTLR generates both components.


Parsing in Practice

Input: (8 + sqrt(16)) * 3

↓ Lexical analysis

Tokens: ( 8 + sqrt ( 16 ) ) * 3

↓ Parsing

Syntax tree:

〈expr〉
  〈expr〉
    (
    〈expr〉
      〈expr〉 8
      +
      〈expr〉 sqrt ( 〈expr〉 16 )
    )
  *
  〈expr〉 3

↓ Further processing


Interlude: Extended Backus-Naur-Form

In practice the Extended Backus-Naur-Form (EBNF) is often used to specify context free grammars. In EBNF the right sides of the productions are not words, but regular expressions.


Interlude: Extended Backus-Naur-Form

EBNF grammars can easily be transformed into normal context free grammars (B is a new variable in each case):

A → α(β1 | · · · | βn)γ  ⇒  A → αBγ and B → β1 | · · · | βn

A → αβ∗γ  ⇒  A → αγ | αBγ and B → β | βB

Abbreviations: α+ ≡ αα∗, α? ≡ (ε | α), etc.

Apply the above rules until the grammar does not contain “∗” and (nested) “|” operations anymore.
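The rule for “∗” can be illustrated with a few lines of code. This is only a sketch under assumptions: a production body is encoded as a tuple of symbols in which a starred group appears as the pair ("*", β), β is assumed to consist of plain symbols, and the function name eliminate_star and the fresh variable name are chosen freely.

```python
def eliminate_star(head, body, fresh):
    """Apply the rule A -> αβ*γ  ⇒  A -> αγ | αBγ, B -> β | βB once.

    body is a tuple of symbols; a starred group is written as ("*", beta),
    where beta is a tuple of plain symbols. fresh is the name of the new
    variable B. Only the first starred group in body is replaced.
    """
    for i, sym in enumerate(body):
        if isinstance(sym, tuple) and sym[0] == "*":
            alpha, beta, gamma = body[:i], sym[1], body[i + 1:]
            return [
                (head, alpha + gamma),             # A -> αγ
                (head, alpha + (fresh,) + gamma),  # A -> αBγ
                (fresh, beta),                     # B -> β
                (fresh, beta + (fresh,)),          # B -> βB
            ]
    return [(head, body)]  # nothing to do


if __name__ == "__main__":
    # program -> statement*
    for production in eliminate_star("program", (("*", ("statement",)),), "B"):
        print(production)
```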


Parsing with ANTLR

We will generate a parser which recognizes the following grammar:

program→ statement∗

statement → ID ‘=’ expression ‘ ←↩ ’ | expression ‘ ←↩ ’ | ‘ ←↩ ’

expression→ expression ‘*’ expression | expression ‘/’ expression

| expression ‘+’ expression | expression ‘-’ expression

| ‘sqrt’ ‘(’ expression ‘)’

| ‘(’ expression ‘)’

| ID | NUMBER

Terminal symbols are written in quotes. The tokens ID and NUMBER are recognized by the lexical analyser.

ANTLR resolves ambiguities “automatically”.


Parsing with ANTLR

Example input:

a = 4 + 2 * 5

b = (a / 2) * sqrt(16)

a + b


Parsing with ANTLR

Body of an ANTLR source file:

grammar 〈parser name〉 ;

Rules of the lexical analyser

Rules of the grammar


Parsing with ANTLR

The lexical analyser:

// lexical rules

NUMBER : [0-9]+ | [0-9]* '.' [0-9]+ ;

NEWLINE : '\r'? '\n' ;

SQRT : [sS][qQ][rR][tT] ;

ID : [a-zA-Z] [a-zA-Z0-9]* ;

WHITESPACE : [ \t]+ -> skip ;

Variables of the lexical analyser start with a capital letter.

The command “-> skip” makes sure that white space characters are not passed on to the parser.

“sqrt” is a SQRT token, although it is also matched by ID: when two rules match the same text, the rule that is listed first wins.
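The behaviour of these lexer rules can be imitated with regular expressions. The sketch below is not ANTLR and not generated by it; it is a hand-written illustration in which the function name tokenize and the extra OP rule for the operator and parenthesis literals are assumptions. It follows the matching strategy of ANTLR 4 lexers: the rule matching the longest text wins, and ties go to the rule listed first.

```python
import re

# Lexer rules in the same order as in the grammar. In NUMBER the decimal
# alternative is listed first because Python tries alternatives left to right.
TOKEN_RULES = [
    ("NUMBER", r"[0-9]*\.[0-9]+|[0-9]+"),
    ("NEWLINE", r"\r?\n"),
    ("SQRT", r"[sS][qQ][rR][tT]"),
    ("ID", r"[a-zA-Z][a-zA-Z0-9]*"),
    ("WHITESPACE", r"[ \t]+"),
    ("OP", r"[=()*/+\-]"),  # assumption: stands in for the literal tokens
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in TOKEN_RULES]


def tokenize(text):
    """Split text into (token name, matched text) pairs, skipping white space."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, regex in COMPILED:
            match = regex.match(text, pos)
            # Strict ">" keeps the earlier rule when two matches are equally long.
            if match and match.end() - pos > best_len:
                best_name, best_len = name, match.end() - pos
        if best_name is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if best_name != "WHITESPACE":  # corresponds to "-> skip"
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


if __name__ == "__main__":
    print(tokenize("b = (a / 2) * sqrt(16)\n"))
    print(tokenize("sqrtx"))
```

In this sketch “sqrt” becomes a SQRT token because SQRT is listed before ID, while “sqrtx” becomes an ID because ID matches more characters.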


Parsing with ANTLR

Translation of the rule

statement → ID ‘=’ expression ‘ ←↩ ’ | expression ‘ ←↩ ’ | ‘ ←↩ ’

statement : var=ID '=' expression NEWLINE # Assignment

| expression NEWLINE # PrintExpression

| NEWLINE # Empty

;

Here “var=” gives a name to a subtree, and “# Assignment”, “# PrintExpression” and “# Empty” give names to the alternatives.

Variables of the parser start with a non-capital letter.


Parsing with ANTLR

ANTLR tools:

antlr4: Reads a grammar and generates a lexical analyser and a parser (and Java classes which are used to further process the syntax tree generated by the parser).

grun: Program which calls a parser and shows the syntax tree (mainly used for debugging the grammar).


Parsing with ANTLR

Now we want to further process the syntax tree and execute the program.

ANTLR generates some classes that help us do this.

We write an implementation of ExpressionVisitor<T> to walk through the syntax tree and calculate the results.

For the grammar Expression.g, alternative X and return type T:

public T visitX(ExpressionParser.XContext ctx) {
    T value = ...;
    // Visit subtrees
    return value;
}


Example for:

expression : ...

| left=expression '+' right=expression # Plus

| ...

public Double visitPlus(ExpressionParser.PlusContext ctx) {
    return visit(ctx.left) + visit(ctx.right);
}


Parsing with ANTLR

For self study: http://www.antlr.org/.


Conclusion

[Figure: hierarchy of language classes, from largest to smallest: all languages, semi-decidable languages (type 0), context sensitive languages (type 1), context free languages (type 2), deterministic context free languages, regular languages (type 3).]

Topics covered: solving the word problem

Regular languages (3): Pumping Lemma for regular languages, regular grammars, DFAs and NFAs, regular expressions, Myhill–Nerode equivalence

Context free languages (2) and deterministic context free languages: Pumping Lemma for context free languages, context free grammars, push-down automata, deterministic push-down automata


Summary

Closed under               Regular   Det. CF   Context free
Union                      ✓         ✗         ✓
Concatenation              ✓         ✗         ✓
Kleene-Star                ✓         ✗         ✓
Intersection               ✓         ✗         ✗
Inters. with regular       ✓         ✓         ✓
Complement                 ✓         ✓         ✗

Problem decidable          Regular   Det. CF   Context free
Word problem               ✓         ✓         ✓
Emptiness                  ✓         ✓         ✓
Finiteness                 ✓         ✓         ✓
Intersection problem       ✓         ✗         ✗
Intersection with regular  ✓         ✓         ✓
Equivalence                ✓         ✓         ✗


Overview

Applications:

Regular languages

Verification
Searching and replacing in text editors
Lexical analysis

Context free languages

Specification of computer languages (programming languages, HTML, XML, arithmetic expressions, . . . )
Specification of natural languages
Verification: the stack is used to model function calls


Outlook “Berechenbarkeit und Komplexität”

Decidability:

the focus is on context sensitive languages, semi-decidable languages and undecidable languages

automaton model: Turing machine

Solving the word problem ⇐⇒ Solving a decision problem ⇐⇒ Computing the characteristic function

Undecidable problems / uncomputable functions

Complexity:

Complexity of algorithms ⇒ Complexity classes
