TI1220 Lecture 14: Domain-Specific Languages

TI1220 2012-2013: Concepts of Programming Languages

Eelco Visser / TU Delft

Lecture 14: Domain-Specific Languages

Linguistic Abstraction

Formalizing Design Patterns

Problem Domain

Solution Domain

implement

validate

Software Engineering

Software Reuse: Don’t Repeat Yourself

software reuse patterns

• copy and paste

• libraries

• frameworks

• service APIs

• design patterns

Linguistic Abstraction

identify pattern

use new abstraction

language A -> design abstraction -> language B

From Instructions to Expressions

mov &a, &c
add &b, &c
mov &a, &t1
sub &b, &t1
and &t1, &c

Source: http://sites.google.com/site/arch1utep/home/course_outline/translating-complex-expressions-into-assembly-language-using-expression-trees

c = a
c += b
t1 = a
t1 -= b
c &= t1

c = (a + b) & (a - b)

From Calling Conventions to Procedures

calc:
  push eBP            ; save old frame pointer
  mov eBP,eSP         ; get new frame pointer
  sub eSP,localsize   ; reserve place for locals
  .
  .                   ; perform calculations, leave result in AX
  .
  mov eSP,eBP         ; free space for locals
  pop eBP             ; restore old frame pointer
  ret paramsize       ; free parameter space and return

f(e1,e2,...,en)

push eAX            ; pass some register result
push byte[eBP+20]   ; pass some memory variable (FASM/TASM syntax)
push 3              ; pass some constant
call calc           ; the returned result is now in eAX

f(x) { ... }

http://en.wikipedia.org/wiki/Calling_convention

A structure is a collection of one or more variables, possibly of different types, grouped together under a single name for convenient handling. (Structures are called ``records'' in some languages, notably Pascal.)

struct point {
  int x;
  int y;
};

member

structure tag

Structures in C: Abstract from Memory Layout

Malloc/Free to Automatic Memory Management

/* Allocate space for an array with ten elements of type int. */
int *ptr = (int*)malloc(10 * sizeof (int));
if (ptr == NULL) {
  /* Memory could not be allocated, the program should handle
     the error here as appropriate. */
} else {
  /* Allocation succeeded. Do something. */
  free(ptr);   /* We are done with the int objects, and free the associated pointer. */
  ptr = NULL;  /* The pointer must not be used again, unless re-assigned to using malloc again. */
}

http://en.wikipedia.org/wiki/Malloc

int[] arr = new int[10];
/* use it; gc will clean up (hopefully) */

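#include <stdlib.h>

typedef struct Base {
  void (**vtable)();   /* pointer to the table of virtual functions */
  int x;
} Base;

void Base_print(Base* obj);   /* defined elsewhere */

void (*Base_Vtable[])() = { &Base_print };

Base* newBase(int v) {
  Base* obj = (Base*)malloc(sizeof(Base));
  obj->vtable = Base_Vtable;
  obj->x = v;
  return obj;
}

/* Dispatch through the table: call entry 0 of the object's vtable. */
void print(Base* obj) {
  obj->vtable[0](obj);
}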

class Base {
  Integer x;
  public Base(Integer v) {
    x = v;
  }
  public void print() {
    System.out.println("Base: " + x);
  }
}
class Child extends Base {
  Integer y;
  public Child(Integer v1, Integer v2) {
    super(v1);
    y = v2;
  }
  public void print() {
    System.out.println("Child: (" + x + "," + y + ")");
  }
}

Dynamic Dispatch

Polymorphic Higher-Order Functions

def map[A,B](f: A => B, xs: List[A]): List[B] = {
  xs match {
    case Nil() => Nil()
    case Cons(y, ys) => Cons(f(y), map(f, ys))
  }
}

def incList(xs: IntList): IntList = xs match {
  case Nil() => Nil()
  case Cons(y, ys) => Cons(y + 1, incList(ys))
}
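The incList function above is the map pattern specialised to "add one". A minimal sketch of that observation, assuming Scala's standard immutable List instead of the lecture's own Nil()/Cons classes (the object name MapSketch and the helper lengths are only illustrative):

```scala
object MapSketch {
  // Polymorphic higher-order map over the standard immutable List
  // (the slide uses hand-written Nil()/Cons case classes instead).
  def map[A, B](f: A => B, xs: List[A]): List[B] = xs match {
    case Nil     => Nil
    case y :: ys => f(y) :: map(f, ys)
  }

  // incList is map instantiated with the increment function.
  def incList(xs: List[Int]): List[Int] = map((x: Int) => x + 1, xs)

  // The same map works at other element types without modification.
  def lengths(xs: List[String]): List[Int] = map((s: String) => s.length, xs)
}
```

Once the traversal is captured in map, each specific list transformation reduces to the function passed in.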

Abstractions in Programming Languages

❖ Structured control-flow

★ if-then-else, while

❖ Procedural abstraction

★ procedures, first-class functions (closures)

❖ Memory management

★ garbage collection

❖ Data abstraction

★ abstract data types, objects

❖ Modules

★ inheritance, traits, mixins

“A programming language is low level when its programs require attention to the irrelevant”

Alan J. Perlis. Epigrams on Programming. SIGPLAN Notices, 17(9):7-13, 1982.

Do HLLs eliminate all irrelevant details?

What about

❖ data persistence

❖ data services

❖ concurrency

❖ distribution

❖ access control

❖ data invariants

❖ workflow

❖ ...

many of these concerns require programmatic encodings

What is the Next Level of Abstraction?

Problem Domain -> HLL -> Machine

Domain-Specific Languages

Problem Domain -> DSL -> HLL -> Machine

Example: Encoding Units

compiler

computer

input

input distance : Float;
input duration : Float;
output speed : Float := duration / distance;

error

wrong output

Impact of Software Errors

compiler

computer

error

Mars Climate Orbiter

Unit mismatch: Orbiter variables in Newtons, Ground control software in Pound-force.

Damage: ~350 M$

input distance : Float;
input duration : Float;
output speed : Float := duration / distance;

wrong output

Example: Explicit Representation of Units

computer

input distance : Meter;
input duration : Second;
output speed : Meter/Second := duration / distance;

compiler

formalize knowledge of application area (domain) in language

error
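The idea of making units explicit can be approximated inside a general-purpose language. The following is a minimal sketch in Scala, not the units language of the slide: the types Meter, Second and MeterPerSecond and the object name Units are assumptions made for illustration. Because division is only defined for distance over duration, the inverted formula from the slide is rejected at compile time.

```scala
object Units {
  // Hypothetical unit types; the names are assumptions of this sketch.
  final case class Second(value: Double)
  final case class MeterPerSecond(value: Double)
  final case class Meter(value: Double) {
    // Only distance / duration is defined, and it yields a speed.
    def /(t: Second): MeterPerSecond = MeterPerSecond(value / t.value)
  }

  def speed(distance: Meter, duration: Second): MeterPerSecond =
    distance / duration

  // The slide's formula does not type-check, because Second has no `/`:
  //   def wrong(distance: Meter, duration: Second): MeterPerSecond =
  //     duration / distance   // compile-time error
}
```

In this encoding the compiler reports the mistake in terms of Scala types and missing methods; a dedicated units language could report it in terms of the domain instead.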

DSLs Provide Domain-Specific ...

Abstractions

★ directly represent domain concepts

Concrete syntax

★ natural notation

Optimization

★ based on domain assumptions

Error checking

★ report errors in terms of domain concepts

Tool support

★ interpreter, compiler, code generator, IDE

Internal DSL

Library in HLL

★ Haskell, Scala, Ruby, ...

★ API is language

★ language features for ‘linguistic abstraction’

Advantages

★ host language = implementation language

Disadvantages

★ host language = implementation language (encoding)

★ no portability

★ no domain-specific errors, analysis, optimization

External DSL

Dedicated language

★ independent of host/target language (portable)

★ implementation with interpreter or compiler

Advantages

★ language tuned to domain

★ domain-specific errors, analysis, optimizations

Disadvantages

★ cost of learning new language

★ cost of maintaining language

Example DSLs (1)

Spreadsheet

★ formulas, macros

Querying

★ SQL, XQuery, XPath

Graph layout

★ GraphViz

Web

★ HTML, CSS, RSS, XML, XSLT

★ Ruby/Rails, JSP, ASP, JSF, WebDSL

Example DSLs (2)

Games

★ Lua, UnrealScript

Modeling

★ UML, OCL, QVT

Language engineering

★ YACC, LEX, RegExp, ANTLR, SDF

★ TXL, ASF+SDF, Stratego

Example: Linguistic Integration in WebDSL

browser server database

web app

Web Programming

browser server database

HTML, JS, CSS (browser); Java (server); SQL (database)

Web Programming = Distributed Programming

Concerns in Web Programming

Data Persistence

Access Control

Injection Attacks

Search

XSS

Data Validation

Data Binding

Routing

...

Complexity in Web Programming:

Multiple Languages x Multiple Concerns

Consistency not statically checked

Formalizing Navigation Logic

page blog(b: Blog, index: Int) {
  main(b){
    for(p: Post in b.recentPosts(index,5)) {
      section{
        header{ navigate post(p) { output(p.title) } }
        par{ output(p.content) }
        par{ output(p.created.format("MMMM d, yyyy")) }
      }
    }
  }
}
page post(p: Post) { ... }

Statically checked navigation

entity Blog {
  key :: String (id)
  title :: String (name)
  posts -> Set<Post> (inverse=Post.blog)
  function recentPosts(index: Int, n: Int): List<Post> {
    var i := max(1,index) - 1;
    return [p | p: Post in posts order by p.created desc limit n offset i*n].list();
  }
}
entity Post {
  key :: String (id)
  title :: String (name, searchable)
  content :: WikiText (searchable)
  blog -> Blog
}

Persistent Data Models

Generation of queries: no injection attacks

entity Assignment {
  key :: String (id)
  title :: String (name, searchable)
  shortTitle :: String
  description :: WikiText (searchable)
  course -> CourseEdition (searchable)
  weight :: Float (default=1.0)
  deadline :: DateTime (default=null)
  // ...
}
page assignment(assign: Assignment, tab: String) {
  main{
    progress(assign, tab)
    pageHeader{
      output(assign.title)
      breadcrumbs(assign)
    }
    // ...
  }
}

Persistent variables in WebDSL

http://department.st.ewi.tudelft.nl/weblab/assignment/752

objects are automatically persisted in database

page post(p: Post) { ... }
page editpost(p: Post) {
  action save() {
    return blog(p);
  }
  main(p.blog){
    form{
      formEntry("Title"){ input(p.title) }
      formEntry("Content") { input(p.content) }
      formEntry("Posted") { input(p.created) }
      submit save() { "Save" }
    }
  }
}

Forms & Data Binding

No separate controller!

access control rules

principal is User with credentials username, password

rule page blog(b: Blog, index: Int) { true }
rule page post(p: Post) { p.public || p.author == principal }
rule page editpost(p: Post) { principal == p.author }

extend entity User {
  password :: Secret
}

extend entity Blog {
  owner -> User
}

extend entity Post {
  public :: Bool
}

Declarative Access Control Rules

Linguistically Integrated

Persistent data model

Logic

Templates (UI, Email, Service)

Data binding

Access control

Data validation

Faceted search

Collaborative filtering

DSL Summary

software reuse through linguistic abstraction

• capture understanding of design patterns in language concepts

• abstract from accidental complexity

• program in terms of domain concepts

• automatically generate implementation

When to Use/Create DSLs?

Hierarchy of abstractions

• first understand how to program it

• make variations by copy, paste, adapt

• (avoid over-engineering)

• make library of frequently used patterns

• find existing (internal) DSLs for the domain

Time for a DSL?

• large class of applications using same design patterns

• design patterns cannot be captured in PL

• lack of checking / optimization for DSL abstractions

Language Engineering

object ExpParser extends JavaTokenParsers with PackratParsers {
  lazy val exp: PackratParser[Exp] =
    (exp <~ "+") ~ exp1 ^^ { case lhs~rhs => Add(lhs, rhs) } |
    exp1

  lazy val exp1: PackratParser[Exp] =
    (exp1 ~ exp0) ^^ { case lhs~rhs => App(lhs, rhs) } |
    exp0

  lazy val exp0: PackratParser[Exp] =
    number | identifier | function | letBinding |
    "(" ~> exp <~ ")"

  // ...

  def parse(text: String) = parseAll(exp, text)
}

syntax through parsers

sealed abstract class Value
case class numV(n: Int) extends Value
case class closureV(param: Symbol, body: Exp, env: Env) extends Value

def eval(exp: Exp, env: Env): Value = exp match {
  case Num(v) => numV(v)
  case Add(l, r) => plus(eval(l, env), eval(r, env))
  case Id(name) => lookup(name, env)
  case Let(name, e1, e2) => eval(e2, bind(name, eval(e1, env), env))
  case Fun(name, body) => closureV(name, body, env)
  case App(fun, arg) => eval(fun, env) match {
    case closureV(name, body, env2) => eval(body, bind(name, eval(arg, env), env2))
    case _ => sys.error("Closure expected")
  }
}

semantics through interpreter
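The interpreter on the slide relies on definitions that are not shown (Exp, Env, bind, lookup, plus). The following self-contained sketch fills them in under stated assumptions: the abstract syntax classes are reconstructed from the constructor names used in eval, and the environment is assumed to be an immutable Map. It is one plausible completion, not necessarily the version used in the lecture.

```scala
object Interp {
  // Abstract syntax (assumed; the slides only show the constructor names).
  sealed abstract class Exp
  case class Num(n: Int) extends Exp
  case class Add(lhs: Exp, rhs: Exp) extends Exp
  case class Id(name: Symbol) extends Exp
  case class Let(name: Symbol, bound: Exp, body: Exp) extends Exp
  case class Fun(param: Symbol, body: Exp) extends Exp
  case class App(fun: Exp, arg: Exp) extends Exp

  sealed abstract class Value
  case class numV(n: Int) extends Value
  case class closureV(param: Symbol, body: Exp, env: Env) extends Value

  // Environment as an immutable map (an assumption of this sketch).
  type Env = Map[Symbol, Value]

  def bind(name: Symbol, v: Value, env: Env): Env = env + (name -> v)

  def lookup(name: Symbol, env: Env): Value =
    env.getOrElse(name, sys.error("Unbound identifier: " + name.name))

  def plus(l: Value, r: Value): Value = (l, r) match {
    case (numV(a), numV(b)) => numV(a + b)
    case _                  => sys.error("Numbers expected")
  }

  def eval(exp: Exp, env: Env): Value = exp match {
    case Num(v)            => numV(v)
    case Add(l, r)         => plus(eval(l, env), eval(r, env))
    case Id(name)          => lookup(name, env)
    case Let(name, e1, e2) => eval(e2, bind(name, eval(e1, env), env))
    // A function value captures its defining environment (static scoping).
    case Fun(name, body)   => closureV(name, body, env)
    case App(fun, arg) => eval(fun, env) match {
      case closureV(name, body, env2) =>
        eval(body, bind(name, eval(arg, env), env2))
      case _ => sys.error("Closure expected")
    }
  }

  // let f = fun x -> x + 1 in f 41
  val example: Exp =
    Let(Symbol("f"),
        Fun(Symbol("x"), Add(Id(Symbol("x")), Num(1))),
        App(Id(Symbol("f")), Num(41)))
}
```

Evaluating Interp.example in the empty environment binds f to a closure and then applies it to 41.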

Traditional Compilers

ls

Course.java

javac -verbose Course.java

[parsing started Course.java]
[parsing completed 8ms]
[loading java/lang/Object.class(java/lang:Object.class)]
[checking university.Course]
[wrote Course.class]
[total 411ms]

ls

Course.class Course.java

Language Processors

syntax analysis

• parsing

• AST construction

static analysis

• name analysis

• type analysis

semantics

• generation

• interpretation

Integrated Development Environments (IDE)

Modern Compilers in IDEs

syntactic editor services

• syntax checking

• syntax highlighting

• outline view

• code folding

• bracket matching

semantic editor services

• error checking

• reference resolving

• hover help

• content completion

• refactoring

Eclipse Platform

runtime platform

• composition

• integration

development platform

• complex APIs

• abstractions for Eclipse IDEs

• concepts: editors, views, label provider, label provider factory, …

• tedious, boring, frustrating

Spoofax Language Workbench

declarative meta-languages

• syntax definition

• editor services

• term rewriting

implementation

• generic integration into Eclipse and IMP

• compilation & interpretation of language definitions

agile

• Spoofax & IDE under development in same Eclipse instance

• support for test-driven development

A Taste of Language Engineering with Spoofax

• abstract syntax trees

• declarative syntax definition

• name binding and scope

• transformation by term rewriting

EnFun: Entities with Functions

module blog
  entity String {
    function plus(that:String): String
  }
  entity Bool { }
  entity Set<T> {
    function add(x: T)
    function remove(x: T)
    function member(x: T): Bool
  }
  entity Blog {
    posts : Set<Post>
    function newPost(): Post {
      var p : Post := Post.new();
      posts.add(p);
      return p;
    }
  }
  entity Post {
    title : String
  }

Structure: Abstract Syntax

Signature & Terms

constructors
  Module  : ID * List(Definition) -> Module
  Imports : ID -> Definition

Module(
  "application",
  [Imports("library"), Imports("users"), Imports("frontend")]
)

Entities & Properties

constructors
  Entity : ID * List(Property) -> Definition
  Type   : ID -> Type
  New    : Type -> Exp

constructors
  Property   : ID * Type -> Property
  This       : Exp
  PropAccess : Exp * ID -> Exp

Module("users",
  [ Imports("library")
  , Entity("User",
      [ Property("email", Type("String"))
      , Property("password", Type("String"))
      , Property("isAdmin", Type("Bool"))
      ])
  ])

Parsing: From Text to Structure

Declarative Syntax Definition

entity User {
  first : String
  last : String
}

Entity("User", [
  Property("first", Type("String")),
  Property("last", Type("String"))
])

signature constructors
  Entity   : ID * List(Property) -> Definition
  Type     : ID -> Type
  Property : ID * Type -> Property

context-free syntax
  "entity" ID "{" Property* "}" -> Definition {"Entity"}
  ID                            -> Type       {"Type"}
  ID ":" Type                   -> Property   {"Property"}

Prototyping Syntax Definition

Context-free Syntax

constructors
  True  : Exp
  False : Exp
  Not   : Exp -> Exp
  And   : Exp * Exp -> Exp
  Or    : Exp * Exp -> Exp

context-free syntax
  "true"       -> Exp {"True"}
  "false"      -> Exp {"False"}
  "!" Exp      -> Exp {"Not"}
  Exp "&&" Exp -> Exp {"And"}
  Exp "||" Exp -> Exp {"Or"}

Lexical Syntax

lexical syntax
  [a-zA-Z][a-zA-Z0-9]* -> ID
  "-"? [0-9]+          -> INT
  [\ \t\n\r]           -> LAYOUT

constructors
  : String -> ID
  : String -> INT

scannerless generalized (LR) parsing

form of tokens (words, lexemes)

Ambiguity

isPublic || isDraft && (author == principal())

amb([
  And(Or(Var("isPublic"), Var("isDraft")),
      Eq(Var("author"), ThisCall("principal", []))),
  Or(Var("isPublic"),
     And(Var("isDraft"), Eq(Var("author"), ThisCall("principal", []))))
])

Disambiguation by Encoding Precedence

constructors
  True  : Exp
  False : Exp
  Not   : Exp -> Exp
  And   : Exp * Exp -> Exp
  Or    : Exp * Exp -> Exp

context-free syntax
  "true"       -> Exp {"True"}
  "false"      -> Exp {"False"}
  "!" Exp      -> Exp {"Not"}
  Exp "&&" Exp -> Exp {"And"}
  Exp "||" Exp -> Exp {"Or"}

context-free syntax
  "(" Exp ")"    -> Exp0 {bracket}
  "true"         -> Exp0 {"True"}
  "false"        -> Exp0 {"False"}
  Exp0           -> Exp1
  "!" Exp0       -> Exp1 {"Not"}
  Exp1           -> Exp2
  Exp1 "&&" Exp2 -> Exp2 {"And"}
  Exp2           -> Exp3
  Exp2 "||" Exp3 -> Exp3 {"Or"}
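The stratified grammar above (Exp0 to Exp3) maps directly onto a recursive-descent parser with one function per precedence level. The sketch below is a hand-written Scala illustration of that encoding over a pre-tokenised input; it is not how SDF or Spoofax parse, and the object name BoolParser and the token-list representation are assumptions.

```scala
object BoolParser {
  sealed trait Exp
  case object True extends Exp
  case object False extends Exp
  case class Not(e: Exp) extends Exp
  case class And(l: Exp, r: Exp) extends Exp
  case class Or(l: Exp, r: Exp) extends Exp

  // Each level returns the parsed expression and the remaining tokens.
  type Result = (Exp, List[String])

  // Exp0: "(" Exp ")" | "true" | "false"
  def exp0(ts: List[String]): Result = ts match {
    case "true" :: rest  => (True, rest)
    case "false" :: rest => (False, rest)
    case "(" :: rest =>
      val (e, after) = exp3(rest)
      after match {
        case ")" :: rest2 => (e, rest2)
        case _            => sys.error("expected )")
      }
    case _ => sys.error("unexpected input: " + ts.mkString(" "))
  }

  // Exp1: "!" Exp0 | Exp0
  def exp1(ts: List[String]): Result = ts match {
    case "!" :: rest =>
      val (e, after) = exp0(rest)
      (Not(e), after)
    case _ => exp0(ts)
  }

  // Exp2: Exp1 "&&" Exp2 | Exp1
  def exp2(ts: List[String]): Result = {
    val (l, rest) = exp1(ts)
    rest match {
      case "&&" :: more =>
        val (r, after) = exp2(more)
        (And(l, r), after)
      case _ => (l, rest)
    }
  }

  // Exp3: Exp2 "||" Exp3 | Exp2
  def exp3(ts: List[String]): Result = {
    val (l, rest) = exp2(ts)
    rest match {
      case "||" :: more =>
        val (r, after) = exp3(more)
        (Or(l, r), after)
      case _ => (l, rest)
    }
  }

  def parse(tokens: List[String]): Exp = exp3(tokens) match {
    case (e, Nil)  => e
    case (_, rest) => sys.error("trailing input: " + rest.mkString(" "))
  }
}
```

Because "&&" is handled one level below "||", an input such as true || false && true can only be read with the conjunction grouped first, so the ambiguity disappears at the cost of extra non-terminals.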

Declarative Disambiguation

context-free syntax
  "true"       -> Exp {"True"}
  "false"      -> Exp {"False"}
  "!" Exp      -> Exp {"Not"}
  Exp "&&" Exp -> Exp {"And", left}
  Exp "||" Exp -> Exp {"Or", left}
  "(" Exp ")"  -> Exp {bracket}

context-free priorities
  {left: Exp.Not} >
  {left: Exp.Mul} >
  {left: Exp.Plus Exp.Minus} >
  {left: Exp.And} >
  {non-assoc: Exp.Eq Exp.Lt Exp.Gt Exp.Leq Exp.Geq}

isPublic || isDraft && (author == principal())

Or(Var("isPublic"),
   And(Var("isDraft"), Eq(Var("author"), ThisCall("principal", []))))

Analysis: Name Resolution

+

Definitions and References

module test

entity String { }

entity User { first : String last : String }

definition

reference

Name Binding in IDE

From Tree to Graph

Module(
  "test",
  [ Entity("String", [])
  , Entity(
      "User",
      [ Property("first", )
      , Property("last", )
      ]
    )
  ]
)

NaBL: Name Binding Language

module names

imports
  include/Cam

namespaces
  Type Property Function Variable

rules

  Entity(name, None(), None(), _):
    defines Type name of type Type(name, [])
    scopes Type, Function, Property, Variable

  Type(name, _):
    refers to Type name
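The two rules above say that every entity defines a Type name and every type occurrence refers to one. As a rough illustration of what a name resolver derived from such rules has to do, here is a minimal Scala sketch over a simplified AST. The case classes and the functions definitions and unresolved are assumptions for this example; they are not part of NaBL or Spoofax, and scoping is ignored.

```scala
object NameResolution {
  // Simplified abstract syntax for entities and typed properties.
  case class Type(name: String)
  case class Property(name: String, tpe: Type)
  case class Entity(name: String, props: List[Property])
  case class Module(name: String, defs: List[Entity])

  // "defines Type name": every entity contributes one Type definition.
  def definitions(m: Module): Map[String, Entity] =
    m.defs.map(e => e.name -> e).toMap

  // "refers to Type name": report every type reference without a definition.
  def unresolved(m: Module): List[String] = {
    val defined = definitions(m)
    for {
      e <- m.defs
      p <- e.props
      if !defined.contains(p.tpe.name)
    } yield e.name + "." + p.name + ": " + p.tpe.name
  }

  // The "test" module from the Definitions and References slide.
  val test: Module = Module("test", List(
    Entity("String", Nil),
    Entity("User", List(
      Property("first", Type("String")),
      Property("last", Type("String"))))))
}
```

For the test module every reference resolves; adding a property whose type has no matching entity produces one entry in the unresolved list, which is the information an IDE would surface as an error marker.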

Transformation

Transformation by Strategic Rewriting

rules

  desugar: Plus(e1, e2) -> MethCall(e1, "plus", [e2])
  desugar: Or(e1, e2) -> MethCall(e1, "or", [e2])

  desugar: VarDeclInit(x, t, e) ->
    Seq([VarDecl(x, t), Assign(Var(x), e)])

strategies

  desugar-all = topdown(repeat(desugar))
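The combination of rewrite rules with a traversal strategy can be imitated in Scala to see what topdown(repeat(desugar)) does. This is a simplified sketch, not Stratego semantics in full: a strategy is modelled as a function that returns Some on success and None on failure, the AST is reduced to four constructors, and the helper names all, sequence and desugarAll are assumptions.

```scala
object Desugar {
  sealed trait Exp
  case class Var(name: String) extends Exp
  case class Plus(l: Exp, r: Exp) extends Exp
  case class Or(l: Exp, r: Exp) extends Exp
  case class MethCall(receiver: Exp, method: String, args: List[Exp]) extends Exp

  // A strategy either rewrites a term (Some) or fails (None).
  type Strategy = Exp => Option[Exp]

  // The two operator rules from the slide.
  val desugar: Strategy = {
    case Plus(e1, e2) => Some(MethCall(e1, "plus", List(e2)))
    case Or(e1, e2)   => Some(MethCall(e1, "or", List(e2)))
    case _            => None
  }

  // repeat(s): apply s until it fails; repeat itself never fails.
  def repeat(s: Strategy): Strategy = e => s(e) match {
    case Some(e2) => repeat(s)(e2)
    case None     => Some(e)
  }

  // Turn a list of results into a result list; fails if any element failed.
  def sequence(xs: List[Option[Exp]]): Option[List[Exp]] =
    if (xs.forall(_.isDefined)) Some(xs.flatten) else None

  // all(s): apply s to every direct subterm.
  def all(s: Strategy): Strategy = {
    case Plus(l, r) => for (a <- s(l); b <- s(r)) yield Plus(a, b)
    case Or(l, r)   => for (a <- s(l); b <- s(r)) yield Or(a, b)
    case MethCall(recv, m, args) =>
      for {
        r       <- s(recv)
        newArgs <- sequence(args.map(s))
      } yield MethCall(r, m, newArgs)
    case v: Var => Some(v)
  }

  // topdown(s): apply s at the root, then recursively to all subterms.
  def topdown(s: Strategy): Strategy = e => s(e).flatMap(all(topdown(s)))

  val desugarAll: Strategy = topdown(repeat(desugar))
}
```

The rules stay local one-step rewrites, while the strategy decides where and how often they are applied; swapping topdown for another traversal changes the order without touching the rules.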

Return-Lifting Applied

function fact(n: Int): Int {
  var res: Int;
  if(n == 0) {
    res := 1;
  } else {
    res := this * fact(n - 1);
  }
  return res;
}

function fact(n: Int): Int {
  if(n == 0) {
    return 1;
  } else {
    return this * fact(n - 1);
  }
}

Return-Lifting Rules

rules

  lift-return-all = alltd(lift-return; normalize-all)

  lift-return :
    FunDef(x, arg*, Some(t), stat1*) ->
    FunDef(x, arg*, Some(t), Seq([
      VarDecl(y, t),
      Seq(stat2*),
      Return(Var(y))
    ]))
    where y := <new>;
          stat2* := <alltd(replace-return(|y))> stat1*

  replace-return(|y) :
    Return(e) -> Assign(y, e)

Language Engineering Summary

apply linguistic abstraction to language engineering

• declarative languages for language definition

• automatic derivation of efficient compilers

• automatic derivation of IDEs

Research Agenda

Example: Explicit Representation of Units

computer

input distance : Meter;
input duration : Second;
output speed : Meter/Second := duration / distance;

compiler

formalize knowledge of application area (domain) in language

error

error

Problem: Correctness of Language Definitions

computer

compiler

Can we trust the compiler?

wrong output

input

program

type soundness: well-typed programs don’t go wrong

compiler

error

Challenge: Automatic Verification of Correctness

computer

compiler

wrong output

program

type soundness: well-typed programs don’t go wrong

type checker

code generator

input

Correctness Proof

Language Workbench

State-of-the-Art: Language Engineering

Syntax Checker

Name Resolver

Type Checker

Code Generator

focus on implementation; not suitable for verification

Compiler

Editor (IDE)

Tests

Formal Language Specification

State-of-the-Art: Semantics Engineering

Abstract Syntax

Type System

Dynamic Semantics

Transforms

focus on (only semi-automatic) verification; not suitable for implementation

Correctness Proof

Tests

Compiler

Editor (IDE)

Declarative Language Definition

My Approach: Multi-Purpose Language Definitions

Syntax Definition

Name Binding

Type System

Dynamic Semantics

Transforms

Compiler

Editor (IDE)

Correctness Proof

Tests

bridging the gap between language engineering and semantics engineering

Software Development on the Web

revisiting the architecture of the IDE

Exam

Material for exam

Slides from lectures

Tutorial exercises

Graded assignments

Sebesta: Chapters 1-13, 15

Programming in Scala: Chapters 1, 4-16, 19, 32-33

K&R C: Chapters 1-6

JavaScript Good Parts: Chapters 1-4

Content of exam

10% multiple choice questions about concepts

50% Scala programming (functional programming)

20% C programming (structures and pointers)

20% JavaScript programming (objects and prototypes)

Registration for Exam is Required

http://department.st.ewi.tudelft.nl/weblab/assignment/761 -> your submission

Good Luck!