University of Dublin Trinity College Department of Computer Science Simon Dobson [email protected] Fragmenting languages

University of Dublin Trinity College

Department of Computer Science

Simon Dobson [email protected] http://www.cs.tcd.ie/

Fragmenting languages


Introduction

We’re starting to see more emphasis on component-based software engineering

• compose and re-use relatively large-scale modules

Affects applications and languages

• languages for building component-based systems

• changes in emphasis in constructs

Languages themselves as component-based systems

• rapid experimentation

• domain-specific languages made easy

My aim: to tell you about some work we’ve been doing in this area


Outsourcing information systems

when(shop purchase made)
    accounts bill(shop buyer, shop total);
    stock withdraw(shop purchases);
    fulfilment ship(shop purchases);

Honest Ron’s Fulfilment Inc

Outsource fulfilment and warehousing

Widget Packers of London Ltd

Shop window

Accounts

Internet

Retain only core business services on-site

Program becomes a high-level script executed in response to events

Programmer deals with components ranging from objects to full-sized sub-systems
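
The "script executed in response to events" idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the `EventBus` class, the event name and the recorded tuples are hypothetical stand-ins for the shop, accounts, stock and fulfilment components, which in the scenario above would be remote sub-systems.

```python
from collections import defaultdict


class EventBus:
    """Hypothetical dispatcher mapping event names to registered handlers."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def when(self, event_name):
        """Decorator that registers a handler, mirroring the slide's when(...) form."""
        def register(handler):
            self._handlers[event_name].append(handler)
            return handler
        return register

    def emit(self, event_name, payload):
        """Run every handler registered for the event, in registration order."""
        for handler in self._handlers[event_name]:
            handler(payload)


# Stand-in for the remote components: each call is simply recorded.
log = []
bus = EventBus()


@bus.when("purchase made")
def on_purchase(shop):
    log.append(("accounts.bill", shop["buyer"], shop["total"]))
    log.append(("stock.withdraw", tuple(shop["purchases"])))
    log.append(("fulfilment.ship", tuple(shop["purchases"])))


bus.emit("purchase made", {"buyer": "alice", "total": 30, "purchases": ["widget"]})
```

The script itself stays three lines long; where each component lives and how it is reached is hidden behind the handler body.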


CORBA

From remote interaction...

Information exposed to browsers via the web

Selectively expose parts of the information system

These two worlds don’t interact as well as they might


…to enterprise embedding

Admit a remote component to an enterprise’s intranet, and run in a controlled environment

Embed part of one organisation into another

Keep a close eye on the embedded component’s activities and connections


Language challenges

Technologies

• component repositories, scripting languages

Do we understand composition?

• it becomes the key factor - “programming in the huge”

• the internals of the individual components are less interesting

• type systems, objects, exceptions, ...

Do we understand security, at this level?

• Java applets and ActiveX signed components are too coarse

What are the “right” languages for component systems?

• changes in programmer demographics

• low-level - control, generality, complexity

• high-level - domain-specific, simple, limited


How do we address these issues?

Need to experiment with languages quickly

• language - composition operators, re-use models

• trust management - security, authentication, webs of trust

• tools - selecting code, repositories, integration with OS services

• the balance between control and expressiveness

Not facilitated by current language technologies

• a compiler is a big piece of kit - hard to get source, hard to understand once you’ve got it

• barrier to entry into language design too high for many


A language as a component system

Many (most?) language features are orthogonal

• don’t need to know exactly what expressions are available to describe a sequential composition of them

Features tend to be compositional: they often work by combining sub-expressions without inspecting them

• this applies to syntax, types and behaviour

(fun( Int n ) n + 1)(12)

exp1 ; exp2

12

“hello”

auto Int with [| 1, 2, 3 |]

The details of the features aren’t really important
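
A minimal Python sketch of this compositionality, assuming a toy AST rather than Vanilla's real one: the class names `Num`, `Str`, `Seq` and `Add` are hypothetical. The point is that `Seq` evaluates its sub-expressions without inspecting what kind of expression they are, so a new feature (`Add`) can be introduced later without touching it.

```python
from dataclasses import dataclass


@dataclass
class Num:
    value: int

    def eval(self, env):
        return self.value


@dataclass
class Str:
    value: str

    def eval(self, env):
        return self.value


@dataclass
class Seq:
    """exp1 ; exp2 - evaluates both, yields the second; never looks inside them."""
    first: object
    second: object

    def eval(self, env):
        self.first.eval(env)
        return self.second.eval(env)


# A feature added afterwards: Seq needs no change to accommodate it.
@dataclass
class Add:
    left: object
    right: object

    def eval(self, env):
        return self.left.eval(env) + self.right.eval(env)


result = Seq(Str("hello"), Add(Num(12), Num(1))).eval({})
```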


What does this mean?

A “language” is simply a composition of fragments

• syntax, typing and behaviour

Specify the features independently, and compose

• there will be some dependencies…

• …but a lot more independence

Effects

• re-use - leverage feature independence

• easy experimentation - just a composition exercise

• extensibility - present features as syntax, or libraries, or hidden

• targeting - exactly the features you want, how you want them

• performance - generality brings overheads

• complexity - “so what language do we program in????”


Vanilla

We’ve been developing a language design system called Vanilla

• build interpreters from language fragments (“pods”)

• re-use existing fragments

• incrementally construct language tools

Standard pods cover a large sub-set of the language design space

• imperative, functional, “typeful”, object-based

• direct interaction with CORBA objects

• 100% pure Java

The idea is to be able to experiment with (and deploy) new language variants quickly and easily


Vanilla architecture

Parser

Type checker

Interpreter

Services

A core algorithm running over a set of components providing the actual functionality

Sub-typing

Attributes


Pods

The set of components needed to implement a language feature

• syntax - a parser component

• types - new types, the type-checking of constructs, ...

• behaviour - interpretation of the abstract syntax tree

• auxiliary services - initialisation, daemons, CORBA mappings, …

The critical observation is that language features are largely independent of each other

• how does it matter what type an object’s methods return?

• apply a construct uniformly across other constructs

Strive to make them independent

• although of course there are dependencies


Parsing

public export void TypeSpecifier() : { } { GroundType() | ConstructedType() }

Sequence()

Int

String

TypeSpecifier

A parser is typically “all one piece”

• complete syntax in a set of tokens and productions

A recursive use of the TypeSpecifier() production


Modular parsing

Sequence()

All(X <: T)

Int

String

TypeSpecifier

Parser components

• import tokens and productions from other components

• express bits of grammar

• recursive descent, parser combinators

The body of the universal is itself a TypeSpecifier()

Extend the production with another disjunct

Can result in ambiguities, but a good way to experiment with syntax
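
Extending a production with another disjunct can be sketched with a small recursive-descent parser in Python. This is an assumption-laden illustration, not Vanilla's parser: the `Production` class, the pre-split token lists and the tuple-shaped results are all hypothetical. Alternatives are tried in order, which sidesteps (rather than solves) the ambiguity issue noted above.

```python
class Production:
    """A grammar production whose disjuncts are supplied by separate components."""

    def __init__(self, name):
        self.name = name
        self.alternatives = []

    def extend(self, alternative):
        self.alternatives.append(alternative)

    def parse(self, tokens, pos):
        # Ordered choice: the first alternative that matches wins.
        for alternative in self.alternatives:
            result = alternative(tokens, pos)
            if result is not None:
                return result
        return None


type_specifier = Production("TypeSpecifier")


def literal(word):
    """Build an alternative that matches one ground-type keyword."""
    def parse(tokens, pos):
        if pos < len(tokens) and tokens[pos] == word:
            return (word, pos + 1)
        return None
    return parse


def sequence(tokens, pos):
    """Sequence ( TypeSpecifier ) - a recursive use of the shared production."""
    if tokens[pos:pos + 2] != ["Sequence", "("]:
        return None
    inner = type_specifier.parse(tokens, pos + 2)
    if inner is None:
        return None
    element, pos = inner
    if pos < len(tokens) and tokens[pos] == ")":
        return (("Sequence", element), pos + 1)
    return None


def universal(tokens, pos):
    """All ( X <: TypeSpecifier ) TypeSpecifier - supplied by a separate component."""
    if tokens[pos:pos + 2] != ["All", "("]:
        return None
    if pos + 3 >= len(tokens) or tokens[pos + 3] != "<:":
        return None
    variable = tokens[pos + 2]
    bound = type_specifier.parse(tokens, pos + 4)
    if bound is None:
        return None
    bound_type, pos = bound
    if pos >= len(tokens) or tokens[pos] != ")":
        return None
    body = type_specifier.parse(tokens, pos + 1)
    if body is None:
        return None
    body_type, pos = body
    return (("All", variable, bound_type, body_type), pos)


# The "core" component contributes the first three disjuncts.
for alternative in (literal("Int"), literal("String"), sequence):
    type_specifier.extend(alternative)

source = "All ( X <: Int ) Sequence ( Int )".split()
before = type_specifier.parse(source, 0)   # universal types not yet known
type_specifier.extend(universal)           # add the extra disjunct
after = type_specifier.parse(source, 0)
```

The same input is rejected before the universal-type component is added and accepted afterwards, with no change to the core alternatives.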


Component typing and behaviour

Both type-checking and interpretation walk the abstract syntax tree

A type (interpreter) component “expresses an interest” in particular AST node types

• assign a type

• determine absolutely that the node is type-incorrect

• “pass the buck” to another interested component

Overload syntax by having several interested pods

• risky, risky…

• …but sometimes fits really well, e.g. functions taking types, or the “dot” operator for addressing into various structures


Components

Sub-system

ASTUniversalType

ASTFunction

ASTStringType

Components express interest in the different node types

Try each interested component until one succeeds, one fails, or all decline to commit themselves
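
The "try each interested component" loop might look roughly like this in Python. It is a sketch, not Vanilla's implementation: the `TypeChecker` class, dictionary-shaped AST nodes and string type names are hypothetical, and treating "all decline" as a type error is an assumption the slides do not confirm. A component returns a type to succeed, raises to fail, or returns `None` to pass the buck.

```python
class TypeIncorrect(Exception):
    """Raised when a component decides a node is definitely ill-typed."""


class TypeChecker:
    def __init__(self):
        self._interested = {}  # node kind -> components, in registration order

    def register(self, kind, component):
        self._interested.setdefault(kind, []).append(component)

    def check(self, node):
        for component in self._interested.get(node["kind"], []):
            result = component(node, self)  # may raise TypeIncorrect
            if result is not None:
                return result  # this component committed to a type
        # Assumption: if every interested component declines, reject the node.
        raise TypeIncorrect("no component typed " + node["kind"])


def int_literal(node, checker):
    return "Int"


def string_literal(node, checker):
    return "String"


def int_add(node, checker):
    operands = (checker.check(node["left"]), checker.check(node["right"]))
    return "Int" if operands == ("Int", "Int") else None  # else pass the buck


def string_add(node, checker):
    operands = (checker.check(node["left"]), checker.check(node["right"]))
    if operands == ("String", "String"):
        return "String"
    if "String" in operands:
        raise TypeIncorrect("cannot add String to a non-String")
    return None


checker = TypeChecker()
checker.register("int", int_literal)
checker.register("string", string_literal)
checker.register("add", int_add)      # two pods overload the same syntax
checker.register("add", string_add)

one = {"kind": "int"}
hello = {"kind": "string"}
int_sum = checker.check({"kind": "add", "left": one, "right": one})
string_sum = checker.check({"kind": "add", "left": hello, "right": hello})
```

Registering two components for the same node kind is the overloading case: each one only commits when the operand types are its own.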


What’s in a pod

new UniversalType(…)

new IClosure(…)

ALL(X <: T) X

fun(X <: T) …

Concrete syntax to abstract syntax

Abstract syntax to type

Abstract syntax to behaviour

Behaviour may depend on attributes derived from type-checking

Parser

Type-checker

Interpreter

Attributes


Standard pods

Core

• ground types, simple constructed types, control structures

Functions

• higher-order closures with currying

Kinds

Universally polymorphic types

• like C++ templates without the re-compilation

Automorphic values

• dynamic typing

Objects

Client-side CORBA

Object types: haven’t done classes yet

The essentials of functional programming

Higher-order type systems

Currently an incomplete IDL mapping


Languages

A language, in Vanilla terms, is just a set of pods composed together within the framework

• language definition files name the classes

• omit CORBA support by omitting ORB services

// core pod
ie.tcd.cs.vanilla.syntax.Core
ie.tcd.cs.vanilla.types.CoreTypeComponent
ie.tcd.cs.vanilla.types.CoreSubtypeComponent
ie.tcd.cs.vanilla.interpreter.CoreInterpreterComponent
ie.tcd.cs.vanilla.interpreter.CoreInitialValueComponent
ie.tcd.cs.vanilla.interpreter.CORBA.CoreCorbaMapping

Parser

Typing and sub-typing

Interpreter

Default initial values

Core Vanilla => IDL
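
As a rough Python illustration of "a language is a list of component class names", the reader below turns a definition file into that list. The one-name-per-line layout and the `//` comment handling are assumptions inferred from the slide, and `read_language_definition` is a hypothetical helper, not part of Vanilla.

```python
def read_language_definition(text):
    """Return the component class names listed in a language definition."""
    names = []
    for line in text.splitlines():
        line = line.split("//", 1)[0].strip()  # drop comments and whitespace
        if line:
            names.append(line)
    return names


definition = """
// core pod
ie.tcd.cs.vanilla.syntax.Core
ie.tcd.cs.vanilla.types.CoreTypeComponent
ie.tcd.cs.vanilla.types.CoreSubtypeComponent
ie.tcd.cs.vanilla.interpreter.CoreInterpreterComponent
ie.tcd.cs.vanilla.interpreter.CoreInitialValueComponent
ie.tcd.cs.vanilla.interpreter.CORBA.CoreCorbaMapping
"""

components = read_language_definition(definition)

# Omitting CORBA support amounts to leaving the CORBA components out.
without_corba = [name for name in components if ".CORBA." not in name]
```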


Pod re-use

Pod usage

• change a pod in a language without re-writing the whole thing

• change a component of a pod, without changing the others

Same pods in several languages or variants

• e.g. implement all Pascal and a lot of Java using the standard pods

FOR i := 1 TO 10 DO j := j + f(i) END;

for(i = 1; i < 11; i = i + 1) j = j + f(i);

Basically we just eliminate some of the possibilities syntactically

ASTFor

ASTInteger

ASTAdd

...

ASTLess


Ions

Why do we name the super-class of a class?

• just need to know the (minimal) type of possible super-classes

Ions

• “class with a free super-class”

• bind to a real super-class before instantiating

• just change the pod implementing classes - nowhere else

Must have

Apply the same functionality to a family of legal super-classes

Don’t over-commit to the super-class

Reify ion with a particular super-class before instantiating the resulting class
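
Python has no ions, but a function from super-class to class gives a loose analogue. In this sketch, `logging_ion`, `Widget` and `Invoice` are hypothetical names, and the `hasattr` check is only a crude stand-in for the "minimal type" a legal super-class must have.

```python
def logging_ion(superclass):
    """A class body with a free super-class, bound when this function is called."""
    # Crude stand-in for the minimal type required of any legal super-class.
    if not hasattr(superclass, "describe"):
        raise TypeError("super-class must provide describe()")

    class Logged(superclass):
        def describe(self):
            return "[logged] " + super().describe()

    return Logged


class Widget:
    def describe(self):
        return "widget"


class Invoice:
    def describe(self):
        return "invoice"


# Reify the same ion against two different super-classes, then instantiate.
LoggedWidget = logging_ion(Widget)
LoggedInvoice = logging_ion(Invoice)
widget_text = LoggedWidget().describe()
invoice_text = LoggedInvoice().describe()
```

The added behaviour is written once and applied to a family of super-classes, none of which is named in the ion itself.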


Binding

Take tight control of binding for mobile code

• no covert channels, uniform access control

• dynamic bindings - re-bind to the “equivalent” object on entering a new environment

Retain static bindings

Add logging or encryption to channels

Disallow some bindings

Re-bind dynamic objects
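
The four binding treatments above could be sketched as a per-name policy applied when code enters a new environment. Everything here is hypothetical: the policy keywords, the `enter_environment` function and the choice to refuse any name the policy does not mention are assumptions made for illustration.

```python
def logged(name, function, log):
    """Wrap a channel so every use is recorded before being forwarded."""
    def wrapper(*args):
        log.append((name, args))
        return function(*args)
    return wrapper


def enter_environment(bindings, policy, environment, log):
    """Re-compute the bindings of mobile code for a new host environment."""
    rebound = {}
    for name, target in bindings.items():
        action = policy.get(name, "deny")  # assumption: unlisted names are refused
        if action == "static":
            rebound[name] = target                 # retain the original binding
        elif action == "dynamic":
            rebound[name] = environment[name]      # the local "equivalent" object
        elif action == "log":
            rebound[name] = logged(name, target, log)
        # "deny": the binding is simply not made available
    return rebound


audit = []
carried = {
    "pi": 3.14159,
    "printer": "home-printer",
    "send": lambda message: len(message),
    "filesystem": object(),
}
policy = {"pi": "static", "printer": "dynamic", "send": "log"}
host = {"printer": "office-printer"}

rebound = enter_environment(carried, policy, host, audit)
sent = rebound["send"]("hi")
```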


Active buildings

Quite a complex domain

• compound events and combinations, unstable predicates

• asymmetric communications, two distinct networks

• humans, agents and robots living together

High-level scripting to control building actions

when(person “simon” enters office “F35”) {
    execute(“quake2.exe”) on enya.dsg.cs.tcd.ie;
    tell(all in “dsg”) “Quake time!!”;
}

Code is implicitly event-driven

Complex location and group communication tasks hidden inside simple statements

“Place” could actually be quite complex

Details are written once and then hidden, rather than being exposed throughout the applications


CORBA integration - 1

Client-side CORBA interactions as standard

Import IDL directly, rather than through a stub compiler

• import an IDL file directly

• importing results in a set of Vanilla module and type declarations, in terms of the other pods

Build stub by binding an object type to an IOR

• each method on the object type induces a method stub

• use DII to actually make the call

Component-based CORBA mapping

• map from Vanilla value to IDL any

• each pod may add its own mapping
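
The idea that each method on an object type induces a stub which makes a dynamic call can be sketched in Python without any ORB. The `Stub` class and the `fake_invoke` function are hypothetical; `fake_invoke` merely stands in for a dynamic invocation and performs no real CORBA request.

```python
class Stub:
    """Proxy whose methods are induced from an object type at bind time."""

    def __init__(self, operations, invoke):
        self._operations = set(operations)  # operation names from the object type
        self._invoke = invoke               # performs the dynamic invocation

    def __getattr__(self, name):
        # Only reached for names not found normally, i.e. the induced methods.
        if name not in self._operations:
            raise AttributeError(name)
        return lambda *args: self._invoke(name, args)


calls = []


def fake_invoke(operation, args):
    """Stand-in for a dynamic invocation: records the request, returns a fixed value."""
    calls.append((operation, args))
    return 42


r = Stub({"lrand48"}, fake_invoke)
value = r.lrand48()
```

No stub code is generated ahead of time; the set of callable operations is fixed entirely by the object type supplied when the stub is bound.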


CORBA integration - 2

import idl "http://www.random.org/Random.idl";

Random r = bind(Random, ior "http://www.random.org/Random.ior");

Int i;
Sequence(Int) rs = for(i = 0; i < 10; i = i + 1) r.lrand48();
println(rs);

Vanilla’s random numbers are really random….

Pull the IDL definitions in off the web

Bind a type from the IDL to an IOR also pulled off the web

Call the stub like any other method


Conclusion

Component systems provide a new computer science context

We’ll need new language variants

• composition, security, trust management

• high-level, domain-specific

Vanilla provides an infrastructure for experimentation

• define language features as fragments, and compose

• lots of standard pods to re-use

• experiment quickly across the spectrum of language design
