Page 1: XM λ

XMλ

Page 2: XM λ

Contents

What is the problem?
Hosoya’s approach
Shields’ approach
XMLambda and the UHConclusion

Page 3: XM λ

What is the problem?

XML is a standard language of first-order, tree-like datatypes

XML works well for describing static documents, but documents are typically dynamic, generated by a server

Implementing a server for dynamic documents in conventional languages is hard:

no direct support for XML or scripting language syntax
no compile-time checks to ensure valid documents

Can custom languages developed for XML be embedded as combinatory libraries within a Haskell-like language?

Page 4: XM λ

element Msg = ( ( (To|Bcc)* & From), Body)
element To = String
element Bcc = String
element From = String
element Body = P*
element P = String

<Msg>
  <To>[email protected]</To>
  <Bcc>[email protected]</Bcc>
  <From>[email protected]</From>
  <Body>
    <P>Our presentation is finished!</P>
  </Body>
</Msg>

XML

Page 5: XM λ

element Msg = ( ( (To|Bcc)* & From), Body)
element To = String
element Bcc = String
element From = String
element Body = P*
element P = String

| : union
* : sequence
& : unordered tuple
, : ordered tuple

XML

Page 6: XM λ

What we are looking for:

XML → Functional Program.

document-type definition → type definitions

Regular expression → type

element → term

Document validation → type checking

Page 7: XM λ

Possible solutions

1. Using a universal datatype

data Element = Atom String
             | Node String [Element]

Page 8: XM λ

data Element = Atom String
             | Node String [Element]

Node "Msg"
  [ Node "To"   [Atom "[email protected]"]
  , Node "Bcc"  [Atom "[email protected]"]
  , Node "From" [Atom "[email protected]"]
  , Node "Body" [Node "P" [Atom "Our..."]]
  ]

No validation possible
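
A minimal Haskell sketch of why the universal datatype gives no static validation. The value `badMsg` and the helpers `childNames` and `hasFrom` are illustrative names that do not appear in the slides. The malformed message typechecks, so any check against the Msg content model has to happen at run time.

```haskell
-- Universal datatype from the slide, written in plain Haskell.
data Element
  = Atom String
  | Node String [Element]
  deriving (Eq, Show)

-- Hypothetical malformed message: no From, and a Body nested inside To.
-- The Msg content model forbids this, yet it typechecks as an Element.
badMsg :: Element
badMsg = Node "Msg" [Node "To" [Node "Body" []]]

-- Names of the direct child nodes of an element.
childNames :: Element -> [String]
childNames (Atom _)    = []
childNames (Node _ cs) = [n | Node n _ <- cs]

-- A run-time check for one rule only: is there a From child?
hasFrom :: Element -> Bool
hasFrom e = "From" `elem` childNames e
```

Here `hasFrom badMsg` evaluates to `False`, but the compiler never sees the error.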

Page 9: XM λ

Possible solutions

1. Using a universal datatype

2. Using newtype declarations

newtype Msg = Msg ([Either To Bcc], From, Body)
newtype From = From String
newtype To = To String
newtype Bcc = Bcc String
newtype Body = Body [P]
newtype P = P String

Page 10: XM λ

newtype Msg = Msg ([Either To Bcc], From, Body)
newtype From = From String
newtype To = To String
newtype Bcc = Bcc String
newtype Body = Body [P]
newtype P = P String

Msg ( [ Left (To "[email protected]"), Right (Bcc "[email protected]") ]
    , From "[email protected]"
    , Body [P "Our..."]
    )

Sound, but not complete.

Page 11: XM λ

Possible solutions

1. Using a universal datatype

2. Using newtype declarations

3. Using regular expression types as primitive

Hosoya

Page 12: XM λ

Possible solutions

1. Using a universal datatype

2. Using newtype declarations

3. Using regular expression types as primitive

4. Using Type-Indexed rows

Shields

Page 13: XM λ

Hosoya’s approach

Page 14: XM λ

Why Regular Expression Types?

Static typechecking: generated XML documents conform to the DTD
Or: invalid documents can never arise
For example: a <table> must have at least one <tr>

Page 15: XM λ

Why Regular Expression Patterns?

Convenient programming constructs for manipulating documents
For instance, jump over arbitrary-length data and extract specific data:

type Person = person[Name,Email*,Tel?]

match p with
  person[Name,Email+,Tel] -> …

Page 16: XM λ

XDuce: Values

Primitives represent XML documents (trees)

For example:

person[name["Joep"], email["[email protected]"]]

I.e. a value is a sequence of nodes

Page 17: XM λ

XDuce: Regular Expression Types

Types correspond to document schemas
Familiar XML regular expressions:

type Tel = tel[String]
type Tels = Tel*
type Recip = Bcc|Cc
(Name, Tel*), Addr
T? = T|()
T+ = T,T*

Page 18: XM λ

Subtyping

Many algebraic laws:

Associativity of concatenation and union: A|(B|C) = (A|B)|C
Commutativity of union: A|B = B|A

These laws are crucial for XML processing, but lead to complicated specification

Page 19: XM λ

Subtyping

Subtyping as set inclusion
First define which values belong to a type
One type is a subtype of another if the former denotes a subset of the latter
For example: (Name*, Tel*) <: (Name|Tel)*
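
The set-inclusion reading of subtyping can be sketched in Haskell as follows. This is only a bounded approximation over flat label sequences, using Brzozowski derivatives for matching; it is not the algorithm XDuce uses. All names (`Re`, `matches`, `subtypeUpTo`, and so on) are assumptions made for this example.

```haskell
-- Regular expressions over element labels (flat sequences only).
data Re
  = Empty            -- matches nothing
  | Eps              -- matches the empty sequence
  | Lab String       -- a single element with this label
  | Seq Re Re        -- concatenation  (T,U)
  | Alt Re Re        -- union          (T|U)
  | Star Re          -- repetition     T*
  deriving (Eq, Show)

-- Does the expression accept the empty sequence?
nullable :: Re -> Bool
nullable Empty     = False
nullable Eps       = True
nullable (Lab _)   = False
nullable (Seq r s) = nullable r && nullable s
nullable (Alt r s) = nullable r || nullable s
nullable (Star _)  = True

-- Brzozowski derivative: what remains to be matched after reading label x.
deriv :: String -> Re -> Re
deriv _ Empty     = Empty
deriv _ Eps       = Empty
deriv x (Lab y)   = if x == y then Eps else Empty
deriv x (Seq r s)
  | nullable r    = Alt (Seq (deriv x r) s) (deriv x s)
  | otherwise     = Seq (deriv x r) s
deriv x (Alt r s) = Alt (deriv x r) (deriv x s)
deriv x (Star r)  = Seq (deriv x r) (Star r)

-- A sequence belongs to a type if the final derivative is nullable.
matches :: Re -> [String] -> Bool
matches r = nullable . foldl (flip deriv) r

-- All label sequences of length at most n over the given alphabet.
wordsUpTo :: Int -> [String] -> [[String]]
wordsUpTo n alphabet =
  concatMap (\k -> sequence (replicate k alphabet)) [0 .. n]

-- Bounded approximation of T <: U: every sampled value of T is a value of U.
subtypeUpTo :: Int -> [String] -> Re -> Re -> Bool
subtypeUpTo n alphabet t u =
  all (\w -> not (matches t w) || matches u w) (wordsUpTo n alphabet)

-- The two types from the slide: (Name*, Tel*) and (Name|Tel)*.
namesThenTels, namesOrTels :: Re
namesThenTels = Seq (Star (Lab "Name")) (Star (Lab "Tel"))
namesOrTels   = Star (Alt (Lab "Name") (Lab "Tel"))
```

With these definitions, `subtypeUpTo 4 ["Name","Tel"] namesThenTels namesOrTels` is `True`, while swapping the two types gives `False` because the sequence Tel, Name belongs only to (Name|Tel)*.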

Page 20: XM λ

Pattern Matching: Exhaustiveness

type Person = person[Name,Email*,Tel?]

match p with
  person[Name,Email+,Tel?] -> …
  person[Name,Email*,Tel] -> …

Not exhaustive
Use subtyping to check: the input type must be a subtype of the union of the pattern types

Page 21: XM λ

Pattern Matching: Irredundancy

match p with
  person[Name,Email*,Tel?] -> …
  person[Name,Email+,Tel] -> …

Second clause redundant
A clause is redundant iff all the input values that can be matched by the pattern can also be matched by preceding patterns
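
The exhaustiveness and irredundancy checks on the Person patterns can be illustrated with a small Haskell sketch. It is a simplification: each person value is abstracted to the pair (number of Email children, Tel present), each clause is a hand-written predicate on that pair, and the checks run over a finite sample instead of using subtyping. All names are assumptions made for this example.

```haskell
-- A Person value, abstracted to what the patterns can observe:
-- the number of Email children and whether a Tel child is present.
type Shape = (Int, Bool)

-- Each clause is modelled as a predicate on shapes.
type Clause = Shape -> Bool

emailPlusTelOpt, emailStarTel, emailStarTelOpt, emailPlusTel :: Clause
emailPlusTelOpt (n, _) = n >= 1          -- person[Name,Email+,Tel?]
emailStarTel    (_, t) = t               -- person[Name,Email*,Tel]
emailStarTelOpt _      = True            -- person[Name,Email*,Tel?]
emailPlusTel    (n, t) = n >= 1 && t     -- person[Name,Email+,Tel]

-- Sample inputs of type person[Name,Email*,Tel?] with at most 3 emails.
samples :: [Shape]
samples = [(n, t) | n <- [0 .. 3], t <- [False, True]]

-- Inputs that no clause matches; empty means exhaustive on the samples.
unmatched :: [Clause] -> [Shape]
unmatched cs = [s | s <- samples, not (any ($ s) cs)]

-- Indices of clauses whose matches are all covered by earlier clauses.
redundant :: [Clause] -> [Int]
redundant cs =
  [ i
  | (i, c) <- zip [0 ..] cs
  , all (\s -> not (c s) || any ($ s) (take i cs)) samples
  ]
```

For the exhaustiveness slide, `unmatched [emailPlusTelOpt, emailStarTel]` returns `[(0,False)]`: a person with no Email and no Tel falls through. For the irredundancy slide, `redundant [emailStarTelOpt, emailPlusTel]` returns `[1]`, the second clause.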

Page 22: XM λ

Pattern Matching: Type Inference

type Name = name[String]

match (ps as Person*) with
  person[name[val n as String],Email*,Tel?],rest -> …

Avoid excessive type annotations
Use input type and pattern to infer types of
  bare variables (rest)
  bound variables (n)

Page 23: XM λ

Functions

First-order functions (explicitly typed):

fun f(P):T = e

For example:

fun tels(val ps as Person*):Tel* =
  match ps with
    person[Name,Email*,tel[val t]],rest -> tel[t],tels(rest)
    person[Name,Email*],rest -> tels(rest)
    () -> ()
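
For comparison, a rough Haskell counterpart of `tels`, with the empty-list case written out. The record type `Person` and its field names are assumptions: Haskell has no regular expression types, so Email* becomes a list and Tel? becomes a `Maybe`.

```haskell
-- Hypothetical Haskell counterpart of person[Name,Email*,Tel?].
data Person = Person
  { name   :: String
  , emails :: [String]
  , tel    :: Maybe String
  } deriving (Eq, Show)

-- Collect the telephone numbers, skipping persons without one.
tels :: [Person] -> [String]
tels []                           = []
tels (Person _ _ (Just t) : rest) = t : tels rest
tels (Person _ _ Nothing  : rest) = tels rest
```

The three equations mirror the three ways a Person* sequence can look: empty, headed by a person with a Tel, or headed by a person without one.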

Page 24: XM λ

Higher-order Functions

Functions as first-class citizens
Why desirable?

Abstraction

Not supported by XDuce
What is needed?

Subtyping for arrow types

So why not support higher-order functions?

Page 25: XM λ

Higher-order Functions

Function definitions given by fixed set G
G is used in T-APP (instead of standard rule)
Consequence: T-ABS fails
Fix: redefine T-APP
Type annotations needed for check of pattern match

Page 26: XM λ

Parametric Polymorphism

Generic typing using type variables instead of actual types
Why desirable?

Abstraction from structure of problem

What is needed?
Type abstraction
Type application

So why no parametric polymorphism?

Page 27: XM λ

Parametric Polymorphism

Problems:
forall X . (U|X) -> (T|X)

Pattern matching problems:
  Exhaustiveness / irredundancy checks
  Type inference

Typing constraints cannot be represented
forall X {U,T}.(U|X) -> (T|X)

Page 28: XM λ

Conclusions

Typed language with XML docs as primitive values

Regular expression types are fundamental
Regular expression pattern matching
No higher-order functions
No parametric polymorphism

Page 29: XM λ

Shields’ approach

“It is required that content models in element type declarations be deterministic”

Consequence 1:

regular expressions must be 1-unambiguous

Unions and unordered tuples are formed from distinct members.

( ( To , Bcc ) & (Bcc, To) ) is 1-unambiguous

( (Bcc, To) & Bcc ) is not

( (To | Bcc) & Bcc ) is not

Page 30: XM λ

Shields’ approach

“It is required that content models in element type declarations be deterministic”

Consequence 2:

it is possible to transform any XML element into a term:

* sequence → list
, tuple → tuple
| union → type-indexed sum
& unordered tuple → type-indexed product

| and & are both formed from Type-Indexed Rows

Page 31: XM λ

Type-Indexed Rows

A type-indexed row is a list of types

Type constructors
Empty: Row
(_#_): Type → Row → Row

For example: (Int # Bool # Empty)

Page 32: XM λ

Type-indexed product TIP: (All _): Row → Type

Type-indexed coproduct TIC: (One _): Row → Type

Page 33: XM λ

Insertion Constraints

Insertion constraints used to guarantee distinctness of elements:

a ins (Int # Bool # Empty)

constrains a to be any type other than Int or Bool

(List b) ins (Int # Bool # Empty)

is true

Page 34: XM λ

Type-indexed product TIP:
Triv: All Empty
(_ && _): extension

forall (a: Type) (b: Row) .

a ins b => a → All b → All (a#b)

Type-indexed coproduct TIC:
(Inj _): injection

forall (a: Type) (b: Row) .

a ins b => a → One (a#b)

Page 35: XM λ

let tuple = \(x && y && Triv) . (x, y)
in tuple (True && 1 && Triv)

Type checking:

Unify All(x#y#Empty) and All(Int#Bool#Empty)

Under constraint: x ins (y#Empty)

Overall term has type (Int, Bool) or (Bool, Int) !

Page 36: XM λ

Equality constraints

( c # d # Empty ) eq ( Int # Bool # Empty )

Propagates until sufficient information is found to be simplified

Page 37: XM λ

Simplifying constraints

Simple unification: (a → Int) eq (Bool → b)

a eq Bool, Int eq b

Row unification: (Int # a # Empty) eq (Bool # b # Empty)

(Int eq b), (a # Empty) eq (Bool # Empty)

insertion: (a,b) ins (Bool # c # Empty)

(a,b) ins (c # Empty)
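
The insertion rule can be sketched at the value level in Haskell. This is an illustration under simplifying assumptions, not Shields' constraint solver: rows are plain lists, the type language `Ty` is a small made-up one, and `mayUnify` is a conservative test that ignores repeated variables and the occurs check. A row member is dropped from the constraint once it can never unify with the inserted type.

```haskell
-- A tiny type language for the sketch (constructor names are illustrative).
data Ty
  = TInt
  | TBool
  | TVar String
  | TList Ty
  | TPair Ty Ty
  deriving (Eq, Show)

-- Could the two types be made equal by instantiating variables?
-- Conservative: may answer True for types that do not actually unify.
mayUnify :: Ty -> Ty -> Bool
mayUnify (TVar _) _              = True
mayUnify _ (TVar _)              = True
mayUnify TInt TInt               = True
mayUnify TBool TBool             = True
mayUnify (TList a) (TList b)     = mayUnify a b
mayUnify (TPair a b) (TPair c d) = mayUnify a c && mayUnify b d
mayUnify _ _                     = False

-- Outcome of simplifying a constraint  t ins row.
data Ins
  = Holds            -- t is distinct from every row member
  | Fails            -- t already occurs in the row
  | Residual [Ty]    -- undecided: t ins (the remaining members)
  deriving (Eq, Show)

-- Drop every row member that can never equal t; keep the rest.
simplifyIns :: Ty -> [Ty] -> Ins
simplifyIns t row
  | t `elem` row = Fails
  | null rest    = Holds
  | otherwise    = Residual rest
  where
    rest = filter (mayUnify t) row
```

On the slide's example, `simplifyIns (TPair (TVar "a") (TVar "b")) [TBool, TVar "c"]` gives `Residual [TVar "c"]`, matching (a,b) ins (c # Empty). Likewise `simplifyIns (TList (TVar "b")) [TInt, TBool]` gives `Holds`.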

Page 38: XM λ

Introducing fresh typenames

Monomorphic:
newtype xCoord = Int
All (xCoord # Int # Empty)

Polymorphic:
newtype xCoord = \ (a:Type).a
Allows same newtypes within a record !!

Introduction of opaque newtypes
Type arguments are ignored in insertion constraints:
newtype opaque xCoord = \(a:Type).a

Page 39: XM λ

XMLambda and UHConclusion

Why regular expression types (Hosoya)?
Fundamental regular expression types
Powerful pattern matching
No higher-order functions and polymorphism
Subtyping and parametric polymorphism?

Why type-indexed rows (Shields)?
Flexibility: more general than regular expression types
All nice characteristics of FP
Constraint system?