121
TRX: A Formally Verified Parser Interpreter Adam Koprowski Henri Binsztok MLstate, Paris, France http://mlstate.com European Symposium on Programming (ESOP ’10) 22 March A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 1 / 34

TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

TRX: A Formally Verified Parser Interpreter

Adam Koprowski Henri Binsztok

MLstate, Paris, Francehttp://mlstate.com

European Symposium on Programming (ESOP ’10)22 March

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 1 / 34

Page 2: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Outline

1 Context

2 Objective

3 PEGs: Parsing Expression Grammars

4 XPEGs: PEGs eXtended with with Semantic Actions

5 Termination Analysis for (X)PEGs

6 PEG interpreter in Coq

7 Performance evaluation

8 Related work

9 Conclusions

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 2 / 34

Page 3: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Context

SaaS (Software as a service) is a new model of software deploymentthat is gaining popularity.

OPA (One-Pot Application) is a new unified technology fordeveloping rich web applications.The language has built-in: persistency, parsing capabilities,client-code generation capabilities, concurrency, . . .

Safety is a distinguished feature of OPA:

OPA features a number of static checks for its applications,OPA has a verification platform build on top of it [WIP],OPA components will be subjected to formal verification [WIP].

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 3 / 34

Page 4: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Context

SaaS (Software as a service) is a new model of software deploymentthat is gaining popularity.

OPA (One-Pot Application) is a new unified technology fordeveloping rich web applications.The language has built-in: persistency, parsing capabilities,client-code generation capabilities, concurrency, . . .Safety is a distinguished feature of OPA:

OPA features a number of static checks for its applications,OPA has a verification platform build on top of it [WIP],OPA components will be subjected to formal verification [WIP].

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 3 / 34

Page 5: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Context

SaaS (Software as a service) is a new model of software deploymentthat is gaining popularity.

OPA (One-Pot Application) is a new unified technology fordeveloping rich web applications.The language has built-in: persistency, parsing capabilities,client-code generation capabilities, concurrency, . . .

Safety is a distinguished feature of OPA:

OPA features a number of static checks for its applications,OPA has a verification platform build on top of it [WIP],OPA components will be subjected to formal verification [WIP].

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 3 / 34

Page 6: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Context

SaaS (Software as a service) is a new model of software deploymentthat is gaining popularity.OPA (One-Pot Application) is a new unified technology fordeveloping rich web applications.The language has built-in: persistency, parsing capabilities,client-code generation capabilities, concurrency, . . .

Safety is a distinguished feature of OPA:OPA features a number of static checks for its applications,OPA has a verification platform build on top of it [WIP],OPA components will be subjected to formal verification [WIP].

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 3 / 34

Page 7: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Context

SaaS (Software as a service) is a new model of software deploymentthat is gaining popularity.

OPA (One-Pot Application) is a new unified technology fordeveloping rich web applications.The language has built-in: persistency, parsing capabilities,client-code generation capabilities, concurrency, . . .

Safety is a distinguished feature of OPA:

OPA features a number of static checks for its applications,OPA has a verification platform build on top of it [WIP],OPA components will be subjected to formal verification [WIP].

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 3 / 34

Page 8: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Context

SaaS (Software as a service) is a new model of software deploymentthat is gaining popularity.

OPA (One-Pot Application) is a new unified technology fordeveloping rich web applications.The language has built-in: persistency, parsing capabilities,client-code generation capabilities, concurrency, . . .

Safety is a distinguished feature of OPA:

OPA features a number of static checks for its applications,OPA has a verification platform build on top of it [WIP],OPA components will be subjected to formal verification [WIP].

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 3 / 34

Page 9: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Context

SaaS (Software as a service) is a new model of software deploymentthat is gaining popularity.

OPA (One-Pot Application) is a new unified technology fordeveloping rich web applications.The language has built-in: persistency, parsing capabilities,client-code generation capabilities, concurrency, . . .

Safety is a distinguished feature of OPA:

OPA features a number of static checks for its applications,OPA has a verification platform build on top of it [WIP],OPA components will be subjected to formal verification [WIP].

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 3 / 34

Page 10: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Context

SaaS (Software as a service) is a new model of software deploymentthat is gaining popularity.

OPA (One-Pot Application) is a new unified technology fordeveloping rich web applications.The language has built-in: persistency, parsing capabilities,client-code generation capabilities, concurrency, . . .

Safety is a distinguished feature of OPA:

OPA features a number of static checks for its applications,OPA has a verification platform build on top of it [WIP],OPA components will be subjected to formal verification [WIP].

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 3 / 34

Page 11: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Context

SaaS (Software as a service) is a new model of software deploymentthat is gaining popularity.

OPA (One-Pot Application) is a new unified technology fordeveloping rich web applications.The language has built-in: persistency, parsing capabilities,client-code generation capabilities, concurrency, . . .

Safety is a distinguished feature of OPA:

OPA features a number of static checks for its applications,OPA has a verification platform build on top of it [WIP],OPA components will be subjected to formal verification [WIP].

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 3 / 34

Page 12: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Objective

Parsing is a well-known and well-studied topic.

Typical approach to parsing is by means of parser generators:

... can we use formal mehods to verify some properties of generatedparsers?

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 4 / 34

Page 13: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Objective

Parsing is a well-known and well-studied topic.

Typical approach to parsing is by means of parser generators:

... can we use formal mehods to verify some properties of generatedparsers?

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 4 / 34

Page 14: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Objective

Parsing is a well-known and well-studied topic.

Typical approach to parsing is by means of parser generators:

... can we use formal mehods to verify some properties of generatedparsers?

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 4 / 34

Page 15: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Parsing expressions

Main reference

Bryan FordParsing expression grammars: a recognition-based syntacticfoundationPOPL ’04

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 5 / 34

Page 16: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Parsing expressions

Definition

Parsing expressions over non-terminals VN and terminals VT :

∆ := ε empty expression| [·] any character| [a] a terminal symbol (a ∈ VT )| [“s”] a literal (s ∈ S)| [a−z ] a range (a, z ∈ VT )| A a non-terminal (A ∈ VN)| e1; e2 a sequence (e1, e2 ∈ ∆)| e1/e2 a prioritized choice (e1, e2 ∈ ∆)| e∗ a zero-or-more greedy repetition (e ∈ ∆)| e+ a one-or-more greedy repetition (e ∈ ∆)| e? an optional expression (e ∈ ∆)| !e a not-predicate (e ∈ ∆)| &e an and-predicate (e ∈ ∆)

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 6 / 34

Page 17: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Parsing expressions

Definition

Parsing expressions over non-terminals VN and terminals VT :

∆ := ε empty expression| [·] any character| [a] a terminal symbol (a ∈ VT )| [“s”] := [s1] ; . . . ; [sn] a literal (s ∈ S)| [a−z ] := [a] / . . . / [z ] a range (a, z ∈ VT )| A a non-terminal (A ∈ VN)| e1; e2 a sequence (e1, e2 ∈ ∆)| e1/e2 a prioritized choice (e1, e2 ∈ ∆)| e∗ a zero-or-more greedy repetition (e ∈ ∆)| e+ := e; e∗ a one-or-more greedy repetition (e ∈ ∆)| e? := e/ε an optional expression (e ∈ ∆)| !e a not-predicate (e ∈ ∆)| &e := !!e an and-predicate (e ∈ ∆)

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 6 / 34

Page 18: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Parsing expressions

Definition

Parsing expressions over non-terminals VN and terminals VT :

∆ := ε empty expression| [·] any character| [a] a terminal symbol (a ∈ VT )| [“s”] := [s1] ; . . . ; [sn] a literal (s ∈ S)| [a−z ] := [a] / . . . / [z ] a range (a, z ∈ VT )| A a non-terminal (A ∈ VN)| e1; e2 a sequence (e1, e2 ∈ ∆)| e1/e2 a prioritized choice (e1, e2 ∈ ∆)| e∗ a zero-or-more greedy repetition (e ∈ ∆)| e+ := e; e∗ a one-or-more greedy repetition (e ∈ ∆)| e? := e/ε an optional expression (e ∈ ∆)| !e a not-predicate (e ∈ ∆)| &e := !!e an and-predicate (e ∈ ∆)

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 6 / 34

Page 19: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEG example

Example

A very simple PEG for mathematical expressions:

ws := ([ ] / [\t])∗number := [0−9]+term := ws number ws / ws [(] expr [)] ws

factor := term [∗] factor / termexpr := factor [+] expr / factor

Example

How about making addition left-associative?

expr := expr [+] factor / factor

... but that makes the grammar left-recursive, leading to anon-terminating parser.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 7 / 34

Page 20: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEG example

Example

A very simple PEG for mathematical expressions:

ws := ([ ] / [\t])∗number := [0−9]+term := ws number ws / ws [(] expr [)] ws

factor := term [∗] factor / termexpr := factor [+] expr / factor

Example

How about making addition left-associative?

expr := expr [+] factor / factor

... but that makes the grammar left-recursive, leading to anon-terminating parser.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 7 / 34

Page 21: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEG example

Example

A very simple PEG for mathematical expressions:

ws := ([ ] / [\t])∗number := [0−9]+term := ws number ws / ws [(] expr [)] ws

factor := term [∗] factor / termexpr := factor [+] expr / factor

Example

How about making addition left-associative?

expr := expr [+] factor / factor

... but that makes the grammar left-recursive, leading to anon-terminating parser.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 7 / 34

Page 22: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Examples ctd.

Example (Reserved words)

A PEG grammar for recognizing identifiers that are not reserved words:

identifier := !reserved letter+ wsreserved := IF / . . .

IF := [“if ”] !letter ws

Example (“Dangling” else)

Consider the following part of a CFG for the C language:

stmt := IF ( expr ) stmt| IF ( expr ) stmt ELSE stmt| . . .

According to this grammar there are two possible readings of a statement

if (e1) if (e2) s1 else s2

This ambiguity is easy to resolve using PEGs.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 8 / 34

Page 23: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Examples ctd.

Example (Reserved words)

A PEG grammar for recognizing identifiers that are not reserved words:

identifier := !reserved letter+ wsreserved := IF / . . .

IF := [“if ”] !letter ws

Example (“Dangling” else)

Consider the following part of a CFG for the C language:

stmt := IF ( expr ) stmt| IF ( expr ) stmt ELSE stmt| . . .

According to this grammar there are two possible readings of a statement

if (e1) if (e2) s1 else s2

This ambiguity is easy to resolve using PEGs.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 8 / 34

Page 24: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Formal definition of PEGs

Definition (PEG)

A parsing expressions grammar (PEG), G, is a quadruple(VN ,VT ,Pexp, es), where:

VT is a finite set of terminals,

VN is a finite set of non-terminals of the grammar,

Pexp is the interpretation of the productions of the grammar, i.e.,Pexp : VN → ∆ and

es is the start production of the grammar, es ∈ VN .

Definition

Semantics of PEGs:

(e, s)m ⊥: expression e fails on input string s.

(e, s)m √

s′ : expression e succeeds on s, leaving s ′ to be parsed.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 9 / 34

Page 25: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Formal definition of PEGs

Definition (PEG)

A parsing expressions grammar (PEG), G, is a quadruple(VN ,VT ,Pexp, es), where:

VT is a finite set of terminals,

VN is a finite set of non-terminals of the grammar,

Pexp is the interpretation of the productions of the grammar, i.e.,Pexp : VN → ∆ and

es is the start production of the grammar, es ∈ VN .

Definition

Semantics of PEGs:

(e, s)m ⊥: expression e fails on input string s.

(e, s)m √

s′ : expression e succeeds on s, leaving s ′ to be parsed.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 9 / 34

Page 26: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEGs semantics

Definition (PEGs semantics, (1/2))

(ε, s)1 √

s

(Pexp(p), s)m r

(p, s)m+1 r

([·], x :: xs)1 √

xs ([·], []) 1 ⊥

(x , x :: xs)1 √

xs (x , [])1 ⊥

x 6= y

(y , x :: xs)1 ⊥

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 10 / 34

Page 27: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEGs semantics

Definition (PEGs semantics, (1/2))

(ε, s)1 √

s

(Pexp(p), s)m r

(p, s)m+1 r

([·], x :: xs)1 √

xs ([·], []) 1 ⊥

(x , x :: xs)1 √

xs (x , [])1 ⊥

x 6= y

(y , x :: xs)1 ⊥

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 10 / 34

Page 28: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEGs semantics

Definition (PEGs semantics, (1/2))

(ε, s)1 √

s

(Pexp(p), s)m r

(p, s)m+1 r

([·], x :: xs)1 √

xs ([·], []) 1 ⊥

(x , x :: xs)1 √

xs (x , [])1 ⊥

x 6= y

(y , x :: xs)1 ⊥

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 10 / 34

Page 29: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEGs semantics

Definition (PEGs semantics (2/2))

(e1, s)m ⊥

(e1; e2, s)m+1 ⊥

(e1, s)m √

s′ (e2, s′)

n r

(e1; e2, s)m+n+1 r

(e1, s)m √

s′

(e1/e2, s)m+1 √

s′

(e1, s)m ⊥ (e2, s)

n r

(e1/e2, s)m+n+1 r

(e, s)m √

s′ (e∗, s ′) n √

s′′

(e∗, s)m+n+1 √

s′′

(e, s)m ⊥

(e∗, s)m+1 √

s

(e, s)m ⊥

(!e, s)m+1 √

s

(e, s)m √

s′

(!e, s)m+1 ⊥

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 11 / 34

Page 30: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEGs semantics

Definition (PEGs semantics (2/2))

(e1, s)m ⊥

(e1; e2, s)m+1 ⊥

(e1, s)m √

s′ (e2, s′)

n r

(e1; e2, s)m+n+1 r

(e1, s)m √

s′

(e1/e2, s)m+1 √

s′

(e1, s)m ⊥ (e2, s)

n r

(e1/e2, s)m+n+1 r

(e, s)m √

s′ (e∗, s ′) n √

s′′

(e∗, s)m+n+1 √

s′′

(e, s)m ⊥

(e∗, s)m+1 √

s

(e, s)m ⊥

(!e, s)m+1 √

s

(e, s)m √

s′

(!e, s)m+1 ⊥

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 11 / 34

Page 31: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEGs semantics

Definition (PEGs semantics (2/2))

(e1, s)m ⊥

(e1; e2, s)m+1 ⊥

(e1, s)m √

s′ (e2, s′)

n r

(e1; e2, s)m+n+1 r

(e1, s)m √

s′

(e1/e2, s)m+1 √

s′

(e1, s)m ⊥ (e2, s)

n r

(e1/e2, s)m+n+1 r

(e, s)m √

s′ (e∗, s ′) n √

s′′

(e∗, s)m+n+1 √

s′′

(e, s)m ⊥

(e∗, s)m+1 √

s

(e, s)m ⊥

(!e, s)m+1 √

s

(e, s)m √

s′

(!e, s)m+1 ⊥

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 11 / 34

Page 32: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEGs in a nutshell

Expressiveness:

X PEGs are unambiguous,X PEGs can handle lexical analysis.X PEGs support backtracking and unlimited lookahead7 ... but cannot easily handle left-recursion.X PEGs can parse all LL(k) and LR(k) languages... and more (including

non-context-free grammars).

Implementation:

X Allow very easy implementation by means of a recursive descent parser.7 Simple implementation can result in exponential time for parsing.X Using memoization ensures linear time complexity (packrat parsing),7 ... but is expensive in terms of memory consumption.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 12 / 34

Page 33: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEGs in a nutshell

Expressiveness:

X PEGs are unambiguous,X PEGs can handle lexical analysis.X PEGs support backtracking and unlimited lookahead7 ... but cannot easily handle left-recursion.X PEGs can parse all LL(k) and LR(k) languages... and more (including

non-context-free grammars).

Implementation:

X Allow very easy implementation by means of a recursive descent parser.7 Simple implementation can result in exponential time for parsing.X Using memoization ensures linear time complexity (packrat parsing),7 ... but is expensive in terms of memory consumption.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 12 / 34

Page 34: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Semantic Actions

So far we can recognize whether a string belongs to the languagedefined by G.

We need to be able to get a parse trace (AST).

For that we extend the type of parsing expressions ∆ to a family oftypes ∆α, where the index α is a type of the semantic valueassociated with the expression.

We also compositionally define default semantic values for all types ofexpressions.

And we introduce a new construct: coercion, e[7→]f , which converts asemantic value v associated with e to f (v).

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 13 / 34

Page 35: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Semantic Actions

So far we can recognize whether a string belongs to the languagedefined by G.

We need to be able to get a parse trace (AST).

For that we extend the type of parsing expressions ∆ to a family oftypes ∆α, where the index α is a type of the semantic valueassociated with the expression.

We also compositionally define default semantic values for all types ofexpressions.

And we introduce a new construct: coercion, e[7→]f , which converts asemantic value v associated with e to f (v).

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 13 / 34

Page 36: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Semantic Actions

So far we can recognize whether a string belongs to the languagedefined by G.

We need to be able to get a parse trace (AST).

For that we extend the type of parsing expressions ∆ to a family oftypes ∆α, where the index α is a type of the semantic valueassociated with the expression.

We also compositionally define default semantic values for all types ofexpressions.

And we introduce a new construct: coercion, e[7→]f , which converts asemantic value v associated with e to f (v).

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 13 / 34

Page 37: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Semantic Actions

So far we can recognize whether a string belongs to the languagedefined by G.

We need to be able to get a parse trace (AST).

For that we extend the type of parsing expressions ∆ to a family oftypes ∆α, where the index α is a type of the semantic valueassociated with the expression.

We also compositionally define default semantic values for all types ofexpressions.

And we introduce a new construct: coercion, e[7→]f , which converts asemantic value v associated with e to f (v).

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 13 / 34

Page 38: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Semantic Actions

So far we can recognize whether a string belongs to the languagedefined by G.

We need to be able to get a parse trace (AST).

For that we extend the type of parsing expressions ∆ to a family oftypes ∆α, where the index α is a type of the semantic valueassociated with the expression.

We also compositionally define default semantic values for all types ofexpressions.

And we introduce a new construct: coercion, e[7→]f , which converts asemantic value v associated with e to f (v).

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 13 / 34

Page 39: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Typing discipline

Definition (XPEGs typing)

ε ∈ ∆True [·] ∈ ∆char

a ∈ VT

a ∈ ∆char

A ∈ VN

A ∈ ∆Ptype(A)

e1 ∈ ∆α e2 ∈ ∆β

e1; e2 ∈ ∆α∗β

e1 ∈ ∆α e2 ∈ ∆α

e1/e2 ∈ ∆α

e ∈ ∆α

e∗ ∈ ∆listα

e ∈ ∆α

!e ∈ ∆True

e ∈ ∆α f : α→ β

e[7→]f ∈ ∆β

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 14 / 34

Page 40: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Typing discipline

Definition (XPEGs typing)

ε ∈ ∆True [·] ∈ ∆char

a ∈ VT

a ∈ ∆char

A ∈ VN

A ∈ ∆Ptype(A)

e1 ∈ ∆α e2 ∈ ∆β

e1; e2 ∈ ∆α∗β

e1 ∈ ∆α e2 ∈ ∆α

e1/e2 ∈ ∆α

e ∈ ∆α

e∗ ∈ ∆listα

e ∈ ∆α

!e ∈ ∆True

e ∈ ∆α f : α→ β

e[7→]f ∈ ∆β

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 14 / 34

Page 41: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Typing discipline

Definition (XPEGs typing)

ε ∈ ∆True [·] ∈ ∆char

a ∈ VT

a ∈ ∆char

A ∈ VN

A ∈ ∆Ptype(A)

e1 ∈ ∆α e2 ∈ ∆β

e1; e2 ∈ ∆α∗β

e1 ∈ ∆α e2 ∈ ∆α

e1/e2 ∈ ∆α

e ∈ ∆α

e∗ ∈ ∆listα

e ∈ ∆α

!e ∈ ∆True

e ∈ ∆α f : α→ β

e[7→]f ∈ ∆β

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 14 / 34

Page 42: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Typing discipline

Definition (XPEGs typing)

ε ∈ ∆True [·] ∈ ∆char

a ∈ VT

a ∈ ∆char

A ∈ VN

A ∈ ∆Ptype(A)

e1 ∈ ∆α e2 ∈ ∆β

e1; e2 ∈ ∆α∗β

e1 ∈ ∆α e2 ∈ ∆α

e1/e2 ∈ ∆α

e ∈ ∆α

e∗ ∈ ∆listα

e ∈ ∆α

!e ∈ ∆True

e ∈ ∆α f : α→ β

e[7→]f ∈ ∆β

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 14 / 34

Page 43: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Typing discipline

Definition (XPEGs typing)

ε ∈ ∆True [·] ∈ ∆char

a ∈ VT

a ∈ ∆char

A ∈ VN

A ∈ ∆Ptype(A)

e1 ∈ ∆α e2 ∈ ∆β

e1; e2 ∈ ∆α∗β

e1 ∈ ∆α e2 ∈ ∆α

e1/e2 ∈ ∆α

e ∈ ∆α

e∗ ∈ ∆listα

e ∈ ∆α

!e ∈ ∆True

e ∈ ∆α f : α→ β

e[7→]f ∈ ∆β

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 14 / 34

Page 44: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Derived operators

Example

The typed versions of our derived operators become:

[a−z ] : ∆char := [a] / . . . / [z ][“s”] : ∆string := [s0] ; . . . ; [sn] [7→] tuple2str

e+ : ∆listα := e; e ∗ [7→] λx . x1 :: x2

e? : ∆optionα := e [ 7→] λx .Some x/ ε [7→] λx .None

&e : ∆True := !!e

where tuple2str(c1, . . . , cn) = [c1; . . . ; cn].

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 15 / 34

Page 45: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Example ctd.

Example (Simple mathematical expressions ctd.)

We often want to ignore the semantical values attached to an expression:e[]] := e [7→] λx . I .

ws := ([ ] / [\t])∗ []]number := [0−9]+ [7→] digListToNat

term := ws number ws [ 7→] λx . x2

/ ws [(] expr [)] ws [ 7→] λx . x3

factor := term [∗] factor [7→] λx . x1 ∗ x3

/ termexpr := factor [+] expr [7→] λx . x1 + x3

/ factor

⇒ [“(1+2) * (3 * 4)”] = 36

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 16 / 34

Page 46: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Example ctd.

Example (Simple mathematical expressions ctd.)

We often want to ignore the semantical values attached to an expression:e[]] := e [7→] λx . I .

ws := ([ ] / [\t])∗ []]number := [0−9]+ [7→] digListToNat

term := ws number ws [ 7→] λx . x2

/ ws [(] expr [)] ws [ 7→] λx . x3

factor := term [∗] factor [7→] λx . x1 ∗ x3

/ termexpr := factor [+] expr [7→] λx . x1 + x3

/ factor

⇒ [“(1+2) * (3 * 4)”] = 36

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 16 / 34

Page 47: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

XPEGs

Definition (Extended Parsing Expressions Grammar (XPEG))

An extended parsing expressions grammar (XPEG), G, is a tuple(VN ,VT ,Ptype,Pexp, vstart), where:

VT is a finite set of terminals,

VN is a finite set of non-terminals of the grammar,

Ptype : VN → Type is a function that gives types of semantic valuesof all productions.

Pexp is the interpretation of the productions of the grammar, i.e.,Pexp : ∀p:VN

∆Ptype(p) and

vstart is the start production of the grammar, vstart ∈ VN .

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 17 / 34

Page 48: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (Semantics of XPEGs (1/2))

(ε, s)1 √ I

s

(Pexp(p), s)m r

(p, s)m+1 r

([·], x :: xs)1 √ x

xs ([·], []) 1 ⊥

(x , x :: xs)1 √ x

xs (x , [])1 ⊥

x 6= y

(y , x :: xs)1 ⊥

(e1, s)m ⊥

(e1; e2, s)m+1 ⊥

(e1, s)m √ v1

s′ (e2, s′)

n ⊥

(e1; e2, s)m+n+1 ⊥

(e1, s)m √ v1

s′ (e2, s′)

n √ v2

s′′

(e1; e2, s)m+n+1 √ (v1,v2)

s′′

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 18 / 34

Page 49: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (Semantics of XPEGs (1/2))

(ε, s)1 √ I

s

(Pexp(p), s)m r

(p, s)m+1 r

([·], x :: xs)1 √ x

xs ([·], []) 1 ⊥

(x , x :: xs)1 √ x

xs (x , [])1 ⊥

x 6= y

(y , x :: xs)1 ⊥

(e1, s)m ⊥

(e1; e2, s)m+1 ⊥

(e1, s)m √ v1

s′ (e2, s′)

n ⊥

(e1; e2, s)m+n+1 ⊥

(e1, s)m √ v1

s′ (e2, s′)

n √ v2

s′′

(e1; e2, s)m+n+1 √ (v1,v2)

s′′

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 18 / 34

Page 50: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (Semantics of XPEGs (1/2))

(ε, s)1 √ I

s

(Pexp(p), s)m r

(p, s)m+1 r

([·], x :: xs)1 √ x

xs ([·], []) 1 ⊥

(x , x :: xs)1 √ x

xs (x , [])1 ⊥

x 6= y

(y , x :: xs)1 ⊥

(e1, s)m ⊥

(e1; e2, s)m+1 ⊥

(e1, s)m √ v1

s′ (e2, s′)

n ⊥

(e1; e2, s)m+n+1 ⊥

(e1, s)m √ v1

s′ (e2, s′)

n √ v2

s′′

(e1; e2, s)m+n+1 √ (v1,v2)

s′′

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 18 / 34

Page 51: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (Semantics of XPEGs (2/2))

(e1, s)m √ v

s′

(e1/e2, s)m+1 √ v

s′

(e1, s)m ⊥ (e2, s)

n r

(e1/e2, s)m+n+1 r

(e, s)m √ v

s′ (e∗, s ′) n √ vs

s′′

(e∗, s)m+n+1 √ v ::vs

s′′

(e, s)m ⊥

(e∗, s)m+1 √ []

s

(e, s)m ⊥

(!e, s)m+1 √ I

s

(e, s)m √ v

s′

(!e, s)m+1 ⊥

(e, s)m ⊥

(e[7→]f , s)m+1 ⊥

(e, s)m √ v

s′

(e[7→]f , s)m+1 √ f (v)

s′

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 19 / 34

Page 52: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (Semantics of XPEGs (2/2))

(e1, s)m √ v

s′

(e1/e2, s)m+1 √ v

s′

(e1, s)m ⊥ (e2, s)

n r

(e1/e2, s)m+n+1 r

(e, s)m √ v

s′ (e∗, s ′) n √ vs

s′′

(e∗, s)m+n+1 √ v ::vs

s′′

(e, s)m ⊥

(e∗, s)m+1 √ []

s

(e, s)m ⊥

(!e, s)m+1 √ I

s

(e, s)m √ v

s′

(!e, s)m+1 ⊥

(e, s)m ⊥

(e[7→]f , s)m+1 ⊥

(e, s)m √ v

s′

(e[7→]f , s)m+1 √ f (v)

s′

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 19 / 34

Page 53: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (Semantics of XPEGs (2/2))

(e1, s)m √ v

s′

(e1/e2, s)m+1 √ v

s′

(e1, s)m ⊥ (e2, s)

n r

(e1/e2, s)m+n+1 r

(e, s)m √ v

s′ (e∗, s ′) n √ vs

s′′

(e∗, s)m+n+1 √ v ::vs

s′′

(e, s)m ⊥

(e∗, s)m+1 √ []

s

(e, s)m ⊥

(!e, s)m+1 √ I

s

(e, s)m √ v

s′

(!e, s)m+1 ⊥

(e, s)m ⊥

(e[7→]f , s)m+1 ⊥

(e, s)m √ v

s′

(e[7→]f , s)m+1 √ f (v)

s′

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 19 / 34

Page 54: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (Semantics of XPEGs (2/2))

(e1, s)m √ v

s′

(e1/e2, s)m+1 √ v

s′

(e1, s)m ⊥ (e2, s)

n r

(e1/e2, s)m+n+1 r

(e, s)m √ v

s′ (e∗, s ′) n √ vs

s′′

(e∗, s)m+n+1 √ v ::vs

s′′

(e, s)m ⊥

(e∗, s)m+1 √ []

s

(e, s)m ⊥

(!e, s)m+1 √ I

s

(e, s)m √ v

s′

(!e, s)m+1 ⊥

(e, s)m ⊥

(e[7→]f , s)m+1 ⊥

(e, s)m √ v

s′

(e[7→]f , s)m+1 √ f (v)

s′

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 19 / 34

Page 55: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Some meta-properties of XPEGs

Lemma

If (e, s) √ v

s then (e∗, s) 6 r for all r .

Theorem

If (e, s)m √ s′

v then s ′ is a suffix of s.

Theorem (unambiguity)

If (e, s)m1 r1 and (e, s)

m2 r2 then m1 = m2 and r1 = r2.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 20 / 34

Page 56: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Some meta-properties of XPEGs

Lemma

If (e, s) √ v

s then (e∗, s) 6 r for all r .

Theorem

If (e, s)m √ s′

v then s ′ is a suffix of s.

Theorem (unambiguity)

If (e, s)m1 r1 and (e, s)

m2 r2 then m1 = m2 and r1 = r2.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 20 / 34

Page 57: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Some meta-properties of XPEGs

Lemma

If (e, s) √ v

s then (e∗, s) 6 r for all r .

Theorem

If (e, s)m √ s′

v then s ′ is a suffix of s.

Theorem (unambiguity)

If (e, s)m1 r1 and (e, s)

m2 r2 then m1 = m2 and r1 = r2.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 20 / 34

Page 58: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Termination problem for PEGs

Ensuring termination of a PEG parser essentially essentially comes down totwo problems:

1 termination of all semantic actions in G and

2 completness of G with respect to PEGs semantics.

1 with Coq we get that for free.

2 in general: undecidablebut conservative analysis is sufficient in practice (essentially:complete)

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 21 / 34

Page 59: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Termination problem for PEGs

Ensuring termination of a PEG parser essentially essentially comes down totwo problems:

1 termination of all semantic actions in G and

2 completness of G with respect to PEGs semantics.

1 with Coq we get that for free.

2 in general: undecidablebut conservative analysis is sufficient in practice (essentially:complete)

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 21 / 34

Page 60: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Termination problem for PEGs

Ensuring termination of a PEG parser essentially essentially comes down totwo problems:

1 termination of all semantic actions in G and2 completness of G with respect to PEGs semantics.

Example

Remember:expr := expr [+] factor / factor.

But also:A := B / C !D A

if B may fail and C and D may succeed, the former without consuming anyinput ... and it can get even worse with mutual recursion.

1 with Coq we get that for free.2 in general: undecidable

but conservative analysis is sufficient in practice (essentially:complete)A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 21 / 34

Page 61: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Termination problem for PEGs

Ensuring termination of a PEG parser essentially essentially comes down totwo problems:

1 termination of all semantic actions in G and2 completness of G with respect to PEGs semantics.

Example

Remember:expr := expr [+] factor / factor.

But also:A := B / C !D A

if B may fail and C and D may succeed, the former without consuming anyinput ... and it can get even worse with mutual recursion.

1 with Coq we get that for free.2 in general: undecidable

but conservative analysis is sufficient in practice (essentially:complete)A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 21 / 34

Page 62: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Termination problem for PEGs

Ensuring termination of a PEG parser essentially essentially comes down totwo problems:

1 termination of all semantic actions in G and2 completness of G with respect to PEGs semantics.

Example

Remember:expr := expr [+] factor / factor.

But also:A := B / C !D A

if B may fail and C and D may succeed, the former without consuming anyinput ... and it can get even worse with mutual recursion.

1 with Coq we get that for free.2 in general: undecidable

but conservative analysis is sufficient in practice (essentially:complete)A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 21 / 34

Page 63: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Termination problem for PEGs

Ensuring termination of a PEG parser essentially essentially comes down totwo problems:

1 termination of all semantic actions in G and

2 completness of G with respect to PEGs semantics.

1 with Coq we get that for free.

2 in general: undecidablebut conservative analysis is sufficient in practice (essentially:complete)

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 21 / 34

Page 64: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Termination problem for PEGs

Ensuring termination of a PEG parser essentially essentially comes down totwo problems:

1 termination of all semantic actions in G and

2 completness of G with respect to PEGs semantics.

1 with Coq we get that for free.

2 in general: undecidablebut conservative analysis is sufficient in practice (essentially:complete)

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 21 / 34

Page 65: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Termination problem for PEGs

Ensuring termination of a PEG parser essentially essentially comes down totwo problems:

1 termination of all semantic actions in G and

2 completness of G with respect to PEGs semantics.

1 with Coq we get that for free.

2 in general: undecidablebut conservative analysis is sufficient in practice (essentially:complete)

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 21 / 34

Page 66: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEG analysis

We define three groups of properties over parsing expressions:

“0”: parsing expression can succeed without consuming any input,

“> 0”: parsing expression can succeed after consuming some input,

“⊥”: parsing expression can fail.

e ∈ P0 means that expression e has property “0”.e ∈ P≥0 := e ∈ P0 ∨ e ∈ P>0.

Theorem

For arbitrary e ∈ ∆ and s ∈ S:

if (e, s) √ v

s then e ∈ P0,

if (e, s) √ v

s′ and |s ′| < |s| then e ∈ P>0 and

if (e, s) ⊥ then e ∈ P⊥.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 22 / 34

Page 67: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEG analysis

We define three groups of properties over parsing expressions:

“0”: parsing expression can succeed without consuming any input,

“> 0”: parsing expression can succeed after consuming some input,

“⊥”: parsing expression can fail.

e ∈ P0 means that expression e has property “0”.e ∈ P≥0 := e ∈ P0 ∨ e ∈ P>0.

Theorem

For arbitrary e ∈ ∆ and s ∈ S:

if (e, s) √ v

s then e ∈ P0,

if (e, s) √ v

s′ and |s ′| < |s| then e ∈ P>0 and

if (e, s) ⊥ then e ∈ P⊥.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 22 / 34

Page 68: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEG analysis

We define three groups of properties over parsing expressions:

“0”: parsing expression can succeed without consuming any input,

“> 0”: parsing expression can succeed after consuming some input,

“⊥”: parsing expression can fail.

e ∈ P0 means that expression e has property “0”.e ∈ P≥0 := e ∈ P0 ∨ e ∈ P>0.

Theorem

For arbitrary e ∈ ∆ and s ∈ S:

if (e, s) √ v

s then e ∈ P0,

if (e, s) √ v

s′ and |s ′| < |s| then e ∈ P>0 and

if (e, s) ⊥ then e ∈ P⊥.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 22 / 34

Page 69: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (PEG analysis (1/2))

ε ∈ P0 [·] ∈ P>0 [·] ∈ P⊥

a ∈ VT

[a] ∈ P>0

a ∈ VT

[a] ∈ P⊥

e ∈ P⊥e∗ ∈ P0

e ∈ P>0

e∗ ∈ P>0

e ∈ P⊥!e ∈ P0

e ∈ P≥0

!e ∈ P⊥

? ∈ {0, > 0,⊥} A ∈ VN Pexp(A) ∈ P?A ∈ P?

e1 ∈ P⊥ ∨ (e1 ∈ P≥0 ∧ e2 ∈ P⊥)

e1; e2 ∈ P⊥

e1 ∈ P0 e2 ∈ P0

e1; e2 ∈ P0

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 23 / 34

Page 70: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (PEG analysis (1/2))

ε ∈ P0 [·] ∈ P>0 [·] ∈ P⊥

a ∈ VT

[a] ∈ P>0

a ∈ VT

[a] ∈ P⊥

e ∈ P⊥e∗ ∈ P0

e ∈ P>0

e∗ ∈ P>0

e ∈ P⊥!e ∈ P0

e ∈ P≥0

!e ∈ P⊥

? ∈ {0, > 0,⊥} A ∈ VN Pexp(A) ∈ P?A ∈ P?

e1 ∈ P⊥ ∨ (e1 ∈ P≥0 ∧ e2 ∈ P⊥)

e1; e2 ∈ P⊥

e1 ∈ P0 e2 ∈ P0

e1; e2 ∈ P0

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 23 / 34

Page 71: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (PEG analysis (1/2))

ε ∈ P0 [·] ∈ P>0 [·] ∈ P⊥

a ∈ VT

[a] ∈ P>0

a ∈ VT

[a] ∈ P⊥

e ∈ P⊥e∗ ∈ P0

e ∈ P>0

e∗ ∈ P>0

e ∈ P⊥!e ∈ P0

e ∈ P≥0

!e ∈ P⊥

? ∈ {0, > 0,⊥} A ∈ VN Pexp(A) ∈ P?A ∈ P?

e1 ∈ P⊥ ∨ (e1 ∈ P≥0 ∧ e2 ∈ P⊥)

e1; e2 ∈ P⊥

e1 ∈ P0 e2 ∈ P0

e1; e2 ∈ P0

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 23 / 34

Page 72: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (PEG analysis (1/2))

ε ∈ P0 [·] ∈ P>0 [·] ∈ P⊥

a ∈ VT

[a] ∈ P>0

a ∈ VT

[a] ∈ P⊥

e ∈ P⊥e∗ ∈ P0

e ∈ P>0

e∗ ∈ P>0

e ∈ P⊥!e ∈ P0

e ∈ P≥0

!e ∈ P⊥

? ∈ {0, > 0,⊥} A ∈ VN Pexp(A) ∈ P?A ∈ P?

e1 ∈ P⊥ ∨ (e1 ∈ P≥0 ∧ e2 ∈ P⊥)

e1; e2 ∈ P⊥

e1 ∈ P0 e2 ∈ P0

e1; e2 ∈ P0

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 23 / 34

Page 73: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (PEG analysis (2/2))

(e1 ∈ P>0 ∧ e2 ∈ P≥0) ∨ (e1 ∈ P≥0 ∧ e2 ∈ P>0)

e1; e2 ∈ P>0

e1 ∈ P⊥ e2 ∈ P⊥e1/e2 ∈ P⊥

? ∈ {0, > 0} e1 ∈ P? ∨ (e1 ∈ P⊥ ∧ e2 ∈ P?)

e1/e2 ∈ P?

Definition (Expression set of G)

E(G) = {e ′ | e ′ v e, e ∈ Pexp(A),A ∈ VN}

⇒ We compute those properties by iterating over E(G) from ∅ untilreaching a fixpoint.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 24 / 34

Page 74: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (PEG analysis (2/2))

(e1 ∈ P>0 ∧ e2 ∈ P≥0) ∨ (e1 ∈ P≥0 ∧ e2 ∈ P>0)

e1; e2 ∈ P>0

e1 ∈ P⊥ e2 ∈ P⊥e1/e2 ∈ P⊥

? ∈ {0, > 0} e1 ∈ P? ∨ (e1 ∈ P⊥ ∧ e2 ∈ P?)

e1/e2 ∈ P?

Definition (Expression set of G)

E(G) = {e ′ | e ′ v e, e ∈ Pexp(A),A ∈ VN}

⇒ We compute those properties by iterating over E(G) from ∅ untilreaching a fixpoint.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 24 / 34

Page 75: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (PEG analysis (2/2))

(e1 ∈ P>0 ∧ e2 ∈ P≥0) ∨ (e1 ∈ P≥0 ∧ e2 ∈ P>0)

e1; e2 ∈ P>0

e1 ∈ P⊥ e2 ∈ P⊥e1/e2 ∈ P⊥

? ∈ {0, > 0} e1 ∈ P? ∨ (e1 ∈ P⊥ ∧ e2 ∈ P?)

e1/e2 ∈ P?

Definition (Expression set of G)

E(G) = {e ′ | e ′ v e, e ∈ Pexp(A),A ∈ VN}

⇒ We compute those properties by iterating over E(G) from ∅ untilreaching a fixpoint.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 24 / 34

Page 76: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (PEG analysis (2/2))

(e1 ∈ P>0 ∧ e2 ∈ P≥0) ∨ (e1 ∈ P≥0 ∧ e2 ∈ P>0)

e1; e2 ∈ P>0

e1 ∈ P⊥ e2 ∈ P⊥e1/e2 ∈ P⊥

? ∈ {0, > 0} e1 ∈ P? ∨ (e1 ∈ P⊥ ∧ e2 ∈ P?)

e1/e2 ∈ P?

Definition (Expression set of G)

E(G) = {e ′ | e ′ v e, e ∈ Pexp(A),A ∈ VN}

⇒ We compute those properties by iterating over E(G) from ∅ untilreaching a fixpoint.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 24 / 34

Page 77: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (Checking WF)

e1 ∈WF e2 ∈WF

e1/e2 ∈WF ε ∈WF

A ∈ VN Pexp(A) ∈WF

A ∈WF [·] ∈WF

e1 ∈WF e1 ∈ P0 ⇒ e2 ∈WF

e1; e2 ∈WF

a ∈ VT

[a] ∈WF

e ∈WF, e /∈ P0

e∗ ∈WF

e ∈WF

!e ∈WF

⇒ Again iteration from ∅ until reaching a fixpoint

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 25 / 34

Page 78: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (Checking WF)

e1 ∈WF e2 ∈WF

e1/e2 ∈WF ε ∈WF

A ∈ VN Pexp(A) ∈WF

A ∈WF [·] ∈WF

e1 ∈WF e1 ∈ P0 ⇒ e2 ∈WF

e1; e2 ∈WF

a ∈ VT

[a] ∈WF

e ∈WF, e /∈ P0

e∗ ∈WF

e ∈WF

!e ∈WF

⇒ Again iteration from ∅ until reaching a fixpoint

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 25 / 34

Page 79: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (Checking WF)

e1 ∈WF e2 ∈WF

e1/e2 ∈WF ε ∈WF

A ∈ VN Pexp(A) ∈WF

A ∈WF [·] ∈WF

e1 ∈WF e1 ∈ P0 ⇒ e2 ∈WF

e1; e2 ∈WF

a ∈ VT

[a] ∈WF

e ∈WF, e /∈ P0

e∗ ∈WF

e ∈WF

!e ∈WF

⇒ Again iteration from ∅ until reaching a fixpoint

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 25 / 34

Page 80: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (Checking WF)

e1 ∈WF e2 ∈WF

e1/e2 ∈WF ε ∈WF

A ∈ VN Pexp(A) ∈WF

A ∈WF [·] ∈WF

e1 ∈WF e1 ∈ P0 ⇒ e2 ∈WF

e1; e2 ∈WF

a ∈ VT

[a] ∈WF

e ∈WF, e /∈ P0

e∗ ∈WF

e ∈WF

!e ∈WF

⇒ Again iteration from ∅ until reaching a fixpoint

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 25 / 34

Page 81: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (Checking WF)

e1 ∈WF e2 ∈WF

e1/e2 ∈WF ε ∈WF

A ∈ VN Pexp(A) ∈WF

A ∈WF [·] ∈WF

e1 ∈WF e1 ∈ P0 ⇒ e2 ∈WF

e1; e2 ∈WF

a ∈ VT

[a] ∈WF

e ∈WF, e /∈ P0

e∗ ∈WF

e ∈WF

!e ∈WF

⇒ Again iteration from ∅ until reaching a fixpoint

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 25 / 34

Page 82: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (Checking WF)

e1 ∈WF e2 ∈WF

e1/e2 ∈WF ε ∈WF

A ∈ VN Pexp(A) ∈WF

A ∈WF [·] ∈WF

e1 ∈WF e1 ∈ P0 ⇒ e2 ∈WF

e1; e2 ∈WF

a ∈ VT

[a] ∈WF

e ∈WF, e /∈ P0

e∗ ∈WF

e ∈WF

!e ∈WF

⇒ Again iteration from ∅ until reaching a fixpoint

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 25 / 34

Page 83: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Definition (Checking WF)

e1 ∈WF e2 ∈WF

e1/e2 ∈WF ε ∈WF

A ∈ VN Pexp(A) ∈WF

A ∈WF [·] ∈WF

e1 ∈WF e1 ∈ P0 ⇒ e2 ∈WF

e1; e2 ∈WF

a ∈ VT

[a] ∈WF

e ∈WF, e /∈ P0

e∗ ∈WF

e ∈WF

!e ∈WF

⇒ Again iteration from ∅ until reaching a fixpoint

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 25 / 34

Page 84: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Well-formedness of (X)PEGs

Definition (WF)

We say that G is well-formed if E(G) = WF.

Theorem

If G is well-formed then it is complete.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 26 / 34

Page 85: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Well-formedness of (X)PEGs

Definition (WF)

We say that G is well-formed if E(G) = WF.

Theorem

If G is well-formed then it is complete.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 26 / 34

Page 86: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Expressing PEGs in Coq

Example

Parameter prods : Enumeration.Parameter prods type : prods → Type.

Inductive PExp : Type → Type :=| Terminal : char → PExp char| NonTerminal : ∀ p,PExp (prods type p)| Star : ∀ A,PExp A→ PExp (list A)| Choice : ∀ A,PExp A→ PExp A→ PExp A| ...

⇒ Every such PEG is well-defined and well-typed.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 27 / 34

Page 87: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Expressing PEGs in Coq

Example

Parameter prods : Enumeration.Parameter prods type : prods → Type.

Inductive PExp : Type → Type :=| Terminal : char → PExp char| NonTerminal : ∀ p,PExp (prods type p)| Star : ∀ A,PExp A→ PExp (list A)| Choice : ∀ A,PExp A→ PExp A→ PExp A| ...

⇒ Every such PEG is well-defined and well-typed.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 27 / 34

Page 88: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEGs in Coq

Example (Coq’s grammar of our leading example)

Program Definition production p :=match p return PExp (prod type p) with| ws ⇒ (" " / "\t") [∗] [#]| number ⇒ ["0"−−"9"] [+] [→] digListToRat| term⇒ ws; number ; ws [→] (λv ⇒ A2 3 v)

/ ws; "("; expr ; ")"; ws [→] (λv ⇒ A3 5 v)| factor ⇒ term; "*"; factor [→] (λv ⇒ A1 3 v ∗ A3 3 v)

/ term| expr ⇒ factor ; "+"; expr [→] (λv ⇒ A1 3 v + A3 3 v)

/ factorend.

⇒ Thanks to notations and coercions this is not too different from whatwe had before...

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 28 / 34

Page 89: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEGs in Coq

Example (Coq’s grammar of our leading example)

Program Definition production p :=match p return PExp (prod type p) with| ws ⇒ (" " / "\t") [∗] [#]| number ⇒ ["0"−−"9"] [+] [→] digListToRat| term⇒ ws; number ; ws [→] (λv ⇒ A2 3 v)

/ ws; "("; expr ; ")"; ws [→] (λv ⇒ A3 5 v)| factor ⇒ term; "*"; factor [→] (λv ⇒ A1 3 v ∗ A3 3 v)

/ term| expr ⇒ factor ; "+"; expr [→] (λv ⇒ A1 3 v + A3 3 v)

/ factorend.

⇒ Thanks to notations and coercions this is not too different from whatwe had before...

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 28 / 34

Page 90: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Checking PEGs well-formedness in Coq

Example

Program Fixpoint wf compute (wf : PES .t | wf prop wf ){measure (wf measure wf )} :{wf : PES .t | wf prop wf } :=let wf ′ := wf derive wf inmatch PES .equal wf wf ′ with| true ⇒ wf| false ⇒ wf compute wf ′

end.

Theorem

wf ⊆ E(G) =⇒ wf derive wf ⊆ E(G)

wf ⊆ wf derive wf .

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 29 / 34

Page 91: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Checking PEGs well-formedness in Coq

Example

Program Fixpoint wf compute (wf : PES .t | wf prop wf ){measure (wf measure wf )} :{wf : PES .t | wf prop wf } :=let wf ′ := wf derive wf inmatch PES .equal wf wf ′ with| true ⇒ wf| false ⇒ wf compute wf ′

end.

Theorem

wf ⊆ E(G) =⇒ wf derive wf ⊆ E(G)

wf ⊆ wf derive wf .

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 29 / 34

Page 92: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEG parser

Example

Program Fixpoint parse(T : Type) (e : PExp T | is grammar exp e) (s : string)

: {r : ParsingResult T | ∃ n, [e, s ]⇒ [n, r ]} :=

match e with

| ...end.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 30 / 34

Page 93: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEG parser

Example

Program Fixpoint parse(T : Type) (e : PExp T | is grammar exp e) (s : string)

: {r : ParsingResult T | ∃ n, [e, s ]⇒ [n, r ]} :=

match e with| Terminal c ⇒

match s with| nil ⇒ Fail| x :: xs ⇒

match CharAscii .eq dec c x with| left ⇒ Ok xs c| right ⇒ Fail

endend

| ...end.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 30 / 34

Page 94: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEG parser

Example

Program Fixpoint parse(T : Type) (e : PExp T | is grammar exp e) (s : string)

: {r : ParsingResult T | ∃ n, [e, s ]⇒ [n, r ]} :=

match e with| NonTerminal p ⇒

parse (production p) s

| ...end.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 30 / 34

Page 95: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEG parser

Example

Program Fixpoint parse(T : Type) (e : PExp T | is grammar exp e) (s : string)

: {r : ParsingResult T | ∃ n, [e, s ]⇒ [n, r ]} :=

match e with| Choice e1 e2 ⇒

match parse e1 s with| PR ok s ′ v ⇒ Ok s ′ v| PR fail ⇒ parse e2 s

end

| ...end.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 30 / 34

Page 96: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEG parser

Example

Program Fixpoint parse(T : Type) (e : PExp T | is grammar exp e) (s : string)

: {r : ParsingResult T | ∃ n, [e, s ]⇒ [n, r ]} :=

match e with| Star e ⇒

match parse e s with| PR fail ⇒ Ok s [ ]| PR ok s ′ v ⇒

match parse (e [∗]) s ′ with| PR fail ⇒ Fail| PR ok s ′′ v ′ ⇒ Ok s ′′ (v :: v ′)

endend

| ...end.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 30 / 34

Page 97: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEG parser

Example

Program Fixpoint parse(T : Type) (e : PExp T | is grammar exp e) (s : string)

: {r : ParsingResult T | ∃ n, [e, s ]⇒ [n, r ]} :=

match e with| ...

end.

How about termination?

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 30 / 34

Page 98: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEG parser

Example

(e1, s1) � (e2, s2) ⇐⇒ ∃n1,r1,n2,r2(e1, s1)n1 r1∧(e2, s2)

n2 r2∧n1 > n2

Program Fixpoint parse(T : Type) (e : PExp T | is grammar exp e) (s : string)

{measure (e, s) (�)}: {r : ParsingResult T | ∃ n, [e, s ]⇒ [n, r ]} :=

match e with| ...

end.

How about termination?

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 30 / 34

Page 99: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

PEG parser

Example

Program Fixpoint parse(T : Type) (e : PExp T | is grammar exp e) (s : string)

{measure (e, s) (�)}: {r : ParsingResult T | ∃ n, [e, s ]⇒ [n, r ]} :=

match e with| ...

end.

+ 43 TCCs.

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 30 / 34

Page 100: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Evolution of TRX’s performance

0.00 0.01 0.01 0.02 0.02 0.03 0.03 0.04 0.04 0.05 0.05

0

10

20

30

40

50

60

70

80

90

100

XML performance

xml­cert­1

XML file size [MB]

Tim

e [s

]

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 31 / 34

Page 101: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Evolution of TRX’s performance

0.00 0.01 0.01 0.02 0.02 0.03 0.03 0.04 0.04 0.05 0.05

0

10

20

30

40

50

60

70

80

90

100

XML performance

xml­cert­1

XML file size [MB]

Tim

e [s

]

⇒ rev uses append at every step and hence is quadratic

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 31 / 34

Page 102: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Evolution of TRX’s performance

0.00 0.50 1.00 1.50 2.00 2.50 3.00 3.50 4.00 4.50

0

10

20

30

40

50

60

70

80

90

100

XML performance

xml­cert­1

xml­cert­2

XML file size [MB]

Tim

e [s

]

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 31 / 34

Page 103: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Evolution of TRX’s performance

0.00 0.50 1.00 1.50 2.00 2.50 3.00 3.50 4.00 4.50

0

10

20

30

40

50

60

70

80

90

100

XML performance

xml­cert­1

xml­cert­2

XML file size [MB]

Tim

e [s

]

⇒ We better implement [a−z ] as a primitive

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 31 / 34

Page 104: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Evolution of TRX’s performance

0.00 0.50 1.00 1.50 2.00 2.50 3.00 3.50 4.00 4.50

0

10

20

30

40

50

60

70

80

90

100

XML performancexml­cert­1

xml­cert­2

xml­cert­3

XML file size [MB]

Tim

e [s

]

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 31 / 34

Page 105: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Evolution of TRX’s performance

0.00 0.50 1.00 1.50 2.00 2.50 3.00 3.50 4.00 4.50

0

10

20

30

40

50

60

70

80

90

100

XML performancexml­cert­1

xml­cert­2

xml­cert­3

XML file size [MB]

Tim

e [s

]

⇒ ... and we better use binary N to compare characters

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 31 / 34

Page 106: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Evolution of TRX’s performance

0.00 0.50 1.00 1.50 2.00 2.50 3.00 3.50 4.00 4.50

0

10

20

30

40

50

60

70

80

90

100

XML performancexml­cert­1

xml­cert­2

xml­cert­3

xml­cert­4

XML file size [MB]

Tim

e [s

]

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 31 / 34

Page 107: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Evolution of TRX’s performance

0.00 0.50 1.00 1.50 2.00 2.50 3.00 3.50 4.00 4.50

0

10

20

30

40

50

60

70

80

90

100

XML performancexml­cert­1

xml­cert­2

xml­cert­3

xml­cert­4

XML file size [MB]

Tim

e [s

]

⇒ 85% of the time the parser is busy with GC! Let us tweak it

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 31 / 34

Page 108: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Evolution of TRX’s performance

0.00 0.50 1.00 1.50 2.00 2.50 3.00 3.50 4.00 4.50

0

10

20

30

40

50

60

70

80

90

100

XML performancexml­cert­1

xml­cert­2

xml­cert­3

xml­cert­4

xml­cert­5

XML file size [MB]

Tim

e [s

]

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 31 / 34

Page 109: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Evolution of TRX’s performance

0.00 0.50 1.00 1.50 2.00 2.50 3.00 3.50 4.00 4.50

0

10

20

30

40

50

60

70

80

90

100

XML performancexml­cert­1

xml­cert­2

xml­cert­3

xml­cert­4

xml­cert­5

XML file size [MB]

Tim

e [s

]

⇒ Can we do better than that?

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 31 / 34

Page 110: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

TRX: performance comparison with other tools

0.00 0.50 1.00 1.50 2.00 2.50 3.00 3.50 4.00 4.50

0

1

2

3

4

5

6

7

8

9

10

XML performance

aurochs

xml­light

xml­trx­int

xml­cert­5

XML file size [MB]

Tim

e [s

]

7x slower than Aurochs (but more robust?).

32x slower than xml-light.

Anonymous review quote

“the experimental results are not reallyimpressive, showing that the standardLALR(1) parser outperforms all thePEG-based ones.”

Anonymous review quote

“I would like to applaud [the authors] for attempting toimprove the performance of their toolkit and showingus a graph which suggests that a better tuned libraryand the elimination of interpretative overhead have agood chance of delivering competitive results. Themere fact that it is even feasible to consider’Performance comparison’ is progress indeed.”

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 32 / 34

Page 111: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

TRX: performance comparison with other tools

0.00 0.50 1.00 1.50 2.00 2.50 3.00 3.50 4.00 4.50

0

1

2

3

4

5

6

7

8

9

10

XML performance

aurochs

xml­light

xml­trx­int

xml­cert­5

XML file size [MB]

Tim

e [s

]

7x slower than Aurochs (but more robust?).

32x slower than xml-light.

Anonymous review quote

“the experimental results are not reallyimpressive, showing that the standardLALR(1) parser outperforms all thePEG-based ones.”

Anonymous review quote

“I would like to applaud [the authors] for attempting toimprove the performance of their toolkit and showingus a graph which suggests that a better tuned libraryand the elimination of interpretative overhead have agood chance of delivering competitive results. Themere fact that it is even feasible to consider’Performance comparison’ is progress indeed.”

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 32 / 34

Page 112: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

TRX: performance comparison with other tools

0.00 0.50 1.00 1.50 2.00 2.50 3.00 3.50 4.00 4.50

0

1

2

3

4

5

6

7

8

9

10

XML performance

aurochs

xml­light

xml­trx­int

xml­cert­5

XML file size [MB]

Tim

e [s

]

7x slower than Aurochs (but more robust?).

32x slower than xml-light.

Anonymous review quote

“the experimental results are not reallyimpressive, showing that the standardLALR(1) parser outperforms all thePEG-based ones.”

Anonymous review quote

“I would like to applaud [the authors] for attempting toimprove the performance of their toolkit and showingus a graph which suggests that a better tuned libraryand the elimination of interpretative overhead have agood chance of delivering competitive results. Themere fact that it is even feasible to consider’Performance comparison’ is progress indeed.”

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 32 / 34

Page 113: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

TRX: performance comparison with other tools

0.00 0.50 1.00 1.50 2.00 2.50 3.00 3.50 4.00 4.50

0

1

2

3

4

5

6

7

8

9

10

XML performance

aurochs

xml­light

xml­trx­int

xml­cert­5

XML file size [MB]

Tim

e [s

]

7x slower than Aurochs (but more robust?).

32x slower than xml-light.

Anonymous review quote

“the experimental results are not reallyimpressive, showing that the standardLALR(1) parser outperforms all thePEG-based ones.”

Anonymous review quote

“I would like to applaud [the authors] for attempting toimprove the performance of their toolkit and showingus a graph which suggests that a better tuned libraryand the elimination of interpretative overhead have agood chance of delivering competitive results. Themere fact that it is even feasible to consider’Performance comparison’ is progress indeed.”

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 32 / 34

Page 114: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Related work

Work on formally verified parsing:

SLR parser in HOL

Aditi Barthwal and Michael Norrish:Verified, Executable Parsing.ESOP ’09

library of parser combinators in Agda

Nils Anders Danielsson and Ulf NorellStructurally Recursive Descent Parsing.Draft, ’08

PEG parser in Ynot/Coq

Ryan Wisnesky and Gregory Malecha and Greg MorrisettCertified Web Services in Ynot.WWV ’09

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 33 / 34

Page 115: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Related work

Work on formally verified parsing:

SLR parser in HOL

Aditi Barthwal and Michael Norrish:Verified, Executable Parsing.ESOP ’09

library of parser combinators in Agda

Nils Anders Danielsson and Ulf NorellStructurally Recursive Descent Parsing.Draft, ’08

PEG parser in Ynot/Coq

Ryan Wisnesky and Gregory Malecha and Greg MorrisettCertified Web Services in Ynot.WWV ’09

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 33 / 34

Page 116: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Related work

Work on formally verified parsing:

SLR parser in HOL

Aditi Barthwal and Michael Norrish:Verified, Executable Parsing.ESOP ’09

library of parser combinators in Agda

Nils Anders Danielsson and Ulf NorellStructurally Recursive Descent Parsing.Draft, ’08

PEG parser in Ynot/Coq

Ryan Wisnesky and Gregory Malecha and Greg MorrisettCertified Web Services in Ynot.WWV ’09

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 33 / 34

Page 117: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Related work

Work on formally verified parsing:

SLR parser in HOL (termination not addressed)

Aditi Barthwal and Michael Norrish:Verified, Executable Parsing.ESOP ’09

library of parser combinators in Agda (type-indices ruling out left-recursion)

Nils Anders Danielsson and Ulf NorellStructurally Recursive Descent Parsing.Draft, ’08

PEG parser in Ynot/Coq (termination not addressed)

Ryan Wisnesky and Gregory Malecha and Greg MorrisettCertified Web Services in Ynot.WWV ’09

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 33 / 34

Page 118: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Conclusions

We presented an interpreter for PEGs developed in Coq,

along with the proofs of semantic preservation and terminationensuring total correctness of parsing.

Using Coq’s extraction capabilities this enables us to obtain formallycorrect parsers.

Thank you for your attention.

If you want to learn

more about OPA I will

be happy to discuss it

over lunch!

http://mlstate.com

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 34 / 34

Page 119: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Conclusions

We presented an interpreter for PEGs developed in Coq,

along with the proofs of semantic preservation and terminationensuring total correctness of parsing.

Using Coq’s extraction capabilities this enables us to obtain formallycorrect parsers.

Thank you for your attention.

If you want to learn

more about OPA I will

be happy to discuss it

over lunch!

http://mlstate.com

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 34 / 34

Page 120: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Conclusions

We presented an interpreter for PEGs developed in Coq,

along with the proofs of semantic preservation and terminationensuring total correctness of parsing.

Using Coq’s extraction capabilities this enables us to obtain formallycorrect parsers.

Thank you for your attention.

If you want to learn

more about OPA I will

be happy to discuss it

over lunch!

http://mlstate.com

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 34 / 34

Page 121: TRX: A Formally Verified Parser InterpreterAdam.Koprowski/pres/trx-esop-10.pdf · Parsing expression grammars: a recognition-based syntactic foundation POPL ’04 A.Koprowski, H.Binsztok

Conclusions

We presented an interpreter for PEGs developed in Coq,

along with the proofs of semantic preservation and terminationensuring total correctness of parsing.

Using Coq’s extraction capabilities this enables us to obtain formallycorrect parsers.

Thank you for your attention.

If you want to learn

more about OPA I will

be happy to discuss it

over lunch!

http://mlstate.com

A.Koprowski, H.Binsztok (MLstate) TRX: A Formally Verified Parser Interpreter ESOP 2010 34 / 34