November 1999
CHAIMS 1
Compiling High-level Access Interfaces for Multi-site Software
Stanford University
Objective: Investigate revolutionary approaches to large-scale software composition.
Approach: Develop & validate a composition-only language.
Contributions and plans:
• Hardware and software platform independence.
• Asynchrony by splitting up the CALL statement.
• Performance optimization by invocation scheduling.
• Potential for multi-site dataflow optimization.
www-db.stanford.edu/CHAIMS
CHAIMS: Mega-Programming Research
Presentation
• Motivation and Objectives
  – changes in software production
  – basis for new visions and education
• Concepts of CHAIMS
  – CHAIMS language
  – CHAIMS architecture and composition process
  – Scheduling
  – Dataflow optimization
• Status, Plans, Conclusions
Shift in Programming Tasks
[Figure: Coding versus Integration over time; time axis marked 1970, 1990, 2010]
Hypotheses
• After the Y2K effort no large software applications will be written from the ground up. They will always be composed using existing legacy code.
• Composition requires functionalities not available in current mainstream programming languages.
• Large-scale systems enable and require different optimizations.
• Composition programmers will use different tools from base programmers (type A versus type B [Belady]).
Languages & Interfaces
• Large languages intended to support both coding and composition have not been successful:
  – Algol 68
  – PL/1
  – Ada
  – CLOS
  In use: C, C++, Fortran, Java
• Databases are being successfully composed, using client-server and mediator architectures:
  – distribution -- exploit network capabilities
  – heterogeneity -- autonomy creates heterogeneity
  – simple schemas -- some human interpretation
  – service model -- public and commercial sources
Typical Scenario: Logistics
A general has to ship troops and/or equipment from San Diego NOSC to Washington DC:
– at different times, ship different kinds of materiel:
  » criteria for suitable means of transport differ
– not every airport is equally suited
– congestion, prices
– actual weather
– certain due or ready dates
Today: call different companies, look up information on the web, make reservations one by one.
Tomorrow: the system proposes shipping methods that take many conditions into account:
  » hand-coded systems
  » composition of processes
CHAIMS
[Figure labels:]
• Megaprogram for composition, written by a domain programmer.
• The CHAIMS system automates generation of the client for the distributed system.
• Megamodules, provided by various megamodule providers.
Megamodules - Definition
Megamodules are large, autonomous, distributed, heterogeneous services or processes.
• large: computation intensive, data intensive, ongoing processes (monitoring of the real world, simulation services)
• distributed: remote, available to more than one client
• heterogeneous: a variety of languages and systems, accessible by various distribution protocols
• autonomous: maintenance and control over resources remain with the provider, differing ontologies (==> SKC)
Examples:
– logistics: “find best transportation from A to B”, reservation systems
– genomics: compose various analysis tools (now manual control)
Architecture for today: Fat Clients
[Figure labels: Domain expert; Client computer; Control & Computation; I/O; Services a, b, c, d, e; Wrappers to resolve differences; Data Resources]
Service Architecture: Thin Clients
[Figure labels: Domain expert; Client workstation; Computation Services; IO modules; MEGA modules a, b, c, d, e; Data Resources; Sites]
Issues in Heavy-weight Services
Services are not free for a client:
• execution time of a service
• transfer time for data
• fees for services?
What the client applications need:
==> monitoring the progress of a service
==> choice among equivalent services, based on estimated waiting time and fees
==> high performance due to parallelism among distributed remote services
==> preliminary overview results, information to select level of accuracy / result size
==> effective optimization techniques
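The choice among equivalent services could be sketched as below. This is only an illustration: the `Estimate` record, the `choose_service` function, the cost model and all numbers are assumptions made for this sketch and are not part of CHAIMS.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """Hypothetical estimate a service reports before it is invoked."""
    service: str
    seconds: float  # estimated execution plus data transfer time
    fee: float      # estimated fee, 0.0 for a free service


def choose_service(estimates, fee_weight=1.0):
    """Return the estimate with the lowest combined cost.

    The cost model (seconds + fee_weight * fee) is an assumption made
    for this sketch; the slides do not prescribe one.
    """
    return min(estimates, key=lambda e: e.seconds + fee_weight * e.fee)


# Made-up numbers for two equivalent route-optimization services.
candidates = [
    Estimate("BestRoute", seconds=40.0, fee=5.0),
    Estimate("Optimum", seconds=25.0, fee=30.0),
]
fastest = choose_service(candidates, fee_weight=0.0)           # time only
cheapest_overall = choose_service(candidates, fee_weight=1.0)  # time and fees
```

With fees ignored the faster service wins; once fees are weighted in, the slower but cheaper service is chosen.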
Challenge in the new world: Empower Non-technical Domain Experts
Company providing services:
• domain experts of the domain of the service (e.g. weather)
• technical experts for programming for distribution protocols and setting up servers in a middleware system
• marketing experts
“Megaprogrammer”:
• is a domain expert of the domain that uses these services
• is not a technical expert of the middleware system or an experienced programmer
• wants to focus on the problem at hand (= results of using the megaprogram)
• e.g. scientist, logistics officer
A purely compositional language?
Which languages did succeed?
– Algol, Ada: integrated composition and computation
– C, C++: focus on computation
Why a new language?
– complexity: not all facilities of a common language (compare to the approach of Java)
– inhibiting traditional computational programming (compare C++ and Smalltalk concerning object-oriented programming)
– focus on the issue of composition, parallelism by natural asynchrony, and novel optimizations
CHAIMS “Logical” Architecture
[Figure layers, top to bottom:]
• Customer
• Megaprogram clients (in CHAIMS)
• Network/Transport (DCE, CORBA, ...)
• Megamodules (wrapped or native)
CHAIMS Physical Architecture
[Figure labels:]
• Megaprogram clients in CHAIMS
• Network: CORBA, Java RMI, DCE, DCOM, ...
• Megamodules (wrapped, native), each supporting setup, estimate, invoke, examine, extract, and terminate.
CALL statements - growth & split
CALL gained functionality with progress in scale of computing:
• Copying
• Code sharing
• Parameterized computation
• Objects with overloaded method names
• Remote procedure calls to distributed modules
• Constrained (black box) access to encapsulated data
CHAIMS decomposes CALL functions: Setup, Estimate, Invoke, Examine, Extract
CHAIMS Primitives
Pre-invocation:
  SETUP: set up the connection to a megamodule
  SETATTRIBUTES, GETATTRIBUTES: set and get global parameters in a megamodule
  ESTIMATE: get an estimate of execution time for optimization
Invocation and result gathering:
  INVOKE: start a specific method
  EXAMINE: test the status of an invoked method
  EXTRACT: extract results from an invoked method
Termination:
  TERMINATE: terminate a method invocation or a connection to a megamodule
Control:
  WHILE, IF
Utility:
  GETPARAM: get default parameters
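The primitives can be imitated in a few lines of Python to show how the two kinds of handles interact. This is a rough in-process sketch, not the CHAIMS protocol: the class names, the thread pool standing in for a remote megamodule, and the toy `all_routes` service with its made-up Chicago stop are all assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor

DONE, NOT_DONE = "DONE", "NOT_DONE"


class InvocationHandle:
    """Handle returned by INVOKE; supports EXAMINE, EXTRACT, TERMINATE."""

    def __init__(self, future):
        self._future = future

    def examine(self):
        return DONE if self._future.done() else NOT_DONE

    def extract(self):
        return self._future.result()  # blocks if the result is not ready

    def terminate(self):
        self._future.cancel()  # only has an effect if not yet started


class MegamoduleHandle:
    """Handle returned by SETUP (modeled here by the constructor)."""

    def __init__(self, methods, estimates):
        self._methods = methods      # method name -> Python callable
        self._estimates = estimates  # method name -> estimated seconds
        self._attributes = {}        # stored only; unused by the toy method
        self._pool = ThreadPoolExecutor(max_workers=4)

    def set_attributes(self, **attributes):
        self._attributes.update(attributes)

    def estimate(self, method):
        return self._estimates[method]

    def invoke(self, method, **params):
        future = self._pool.submit(self._methods[method], **params)
        return InvocationHandle(future)

    def terminate(self):
        self._pool.shutdown(wait=False)


def all_routes(pair_of_cities):
    """Toy stand-in for a route service; the Chicago stop is made up."""
    origin, destination = pair_of_cities
    return [(origin, destination), (origin, "Chicago", destination)]


route_mmh = MegamoduleHandle({"AllRoutes": all_routes}, {"AllRoutes": 2.0})
route_mmh.set_attributes(criterion="cost")
route_ih = route_mmh.invoke(
    "AllRoutes", pair_of_cities=("San Diego", "Washington DC"))
while route_ih.examine() != DONE:  # mirrors WHILE (ih.EXAMINE() != DONE) {}
    time.sleep(0.001)
routes = route_ih.extract()
route_mmh.terminate()
```

The point of the sketch is the split: the megamodule handle carries connection-wide state, while each INVOKE yields its own handle that can be examined and extracted later.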
Megaprogram Example: Overview
Megamodules and their methods:
• InputOutput: Input, Output
• RouteInfo: AllRoutes, CityPairList, ...
• AirGround: CostForGround, CostForAir, ...
• Routing: BestRoute, ...
• RouteOptimizer: Optimum, ...
General I/O megamodule (InputOutput):
» The Input function takes as parameter a default data structure containing names, types and default values for the expected input.
Travel information (RouteInfo, AirGround):
» Computing all possible routes between two cities.
» Computing the air and ground cost for each leg, given a list of city pairs and data about the goods to be transported.
Two megamodules that offer equivalent functions for calculating optimal routes (Routing, RouteOptimizer):
» Optimum and BestRoute both calculate the optimum route given routes and costs.
» Global variables: optimization can be done for cost or for time.
Megaprogram Example: Code

// Setup connections to megamodules.
io_mmh = SETUP ("InputOutput")
route_mmh = SETUP ("RouteInfo")
...
// Set global variables valid for all invocations of this client.
best2_mmh.SETATTRIBUTES (criterion = "cost")

// Get information from the megaprogram user about the goods
// to be transported and about the two desired cities.
cities_default = route_mmh.GETPARAM(Pair_of_Cities)
input_cities_ih = io_mmh.INVOKE ("input", cities_default)
WHILE (input_cities_ih.EXAMINE() != DONE) {}
cities = input_cities_ih.EXTRACT()
...
// Get all routes between the two cities.
route_ih = route_mmh.INVOKE ("AllRoutes", Pair_of_Cities = cities)
WHILE (route_ih.EXAMINE() != DONE) {}
routes = route_ih.EXTRACT()

// Get all city pairs in these routes.
// Calculate the costs of all the routes.
…

// Figure out the optimal megamodule for picking the best route.
IF (best1_mmh.ESTIMATE("Best_Route") < best2_mmh.ESTIMATE("Optimum") )
THEN {best_ih = best1_mmh.INVOKE ("Best_Route", Goods = info_goods, Pair_of_Cities = cities, List_of_Routes = routes, Cost_Ground = cost_list_ground, Cost_Air = cost_list_air)}
ELSE {best_ih = best2_mmh.INVOKE ("Optimum", Goods = info_goods, …

// Pick the best route and display the result.
...
// Terminate all invocations
best2_mmh.TERMINATE()
Operation of one Megamodule
• SETUP
• SETATTRIBUTES provides context
• ESTIMATE serves scheduling
• INVOKE initiates remote computation
• EXAMINE checks for completion
• EXTRACT obtains results
• TERMINATE I / ALL
[Figure: each step is annotated with the handle involved, either an M handle (megamodule handle) or an I handle (invocation handle).]
CHAIMS Megaprogramming Language
Purely compositional:
– only a variety of CALLs and control flow
– no primitives for input/output ==> instead use general and problem-specific I/O megamodules
– no primitives for arithmetic ==> use math megamodules
Splitting up the CALL statement:
– parallelism by asynchrony in a sequential program
– novel possibilities for optimizations
– reduction of complexity of integrated invoke statements
• higher-level language (assembler => HLLs, HLLs => composition/megamodule paradigm)
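The benefit of splitting the CALL statement can be illustrated with a small timing comparison. This is a sketch under assumptions: `slow_service` is a made-up stand-in for a remote method, and a local thread pool plays the role of the distribution system.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_service(name, seconds):
    """Made-up stand-in for a remote method that takes a while."""
    time.sleep(seconds)
    return f"{name} result"


pool = ThreadPoolExecutor(max_workers=2)

# Integrated CALL: each call blocks until its result is back.
start = time.perf_counter()
a_sync = slow_service("A", 0.2)
b_sync = slow_service("B", 0.2)
sequential = time.perf_counter() - start

# Split CALL: issue both INVOKEs first, EXTRACT afterwards.
start = time.perf_counter()
a_ih = pool.submit(slow_service, "A", 0.2)  # INVOKE
b_ih = pool.submit(slow_service, "B", 0.2)  # INVOKE
a_async = a_ih.result()                     # EXTRACT
b_async = b_ih.result()                     # EXTRACT
overlapped = time.perf_counter() - start
pool.shutdown()
```

The program text stays sequential in both variants; only the placement of invocation and extraction changes, which is what lets the two services run at the same time.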
Architecture: Creation Process
[Figure: the Megamodule Provider writes native programs or wraps non-CHAIMS compliant megamodules (MEGA modules a, b, c, d, e), with Wrapper Templates available, and adds information to the CHAIMS Repository.]
Architecture: Composition Process
[Figure: the Megaprogrammer writes the Megaprogram (in CHAIMS language); the CHAIMS Compiler generates the CSRT (compiled megaprogram); two arrows labeled "information" connect the CHAIMS Repository to this process.]
June 1998 CHAIMS 24
Runtime Architecture
[Figure: the CSRT (compiled megaprogram) communicates through the Distribution System (CORBA, RMI…) with MEGA modules a, b, c, d, e and the IO module(s).]
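Architecture: All
Active at different times
[Figure combining the creation, composition and runtime views:]
– Creation: the Megamodule Provider wraps non-CHAIMS compliant megamodules (MEGA modules a, b, c, d, e), with Wrapper Templates available, and adds information to the CHAIMS Repository.
– Composition: the Megaprogrammer writes the Megaprogram (in CHAIMS language); the CHAIMS Compiler generates the CSRT (compiled megaprogram); two arrows labeled "information" connect the CHAIMS Repository to this process.
– Runtime: the CSRT communicates with the MEGA modules through the Distribution System (CORBA, RMI…).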
Multiple Transport Protocols
[Figure labels: Megaprogrammer; CHAIMS language; Megaprogram; CHAIMS protocols (CORBA-idl, DCE-idl, Java-class); Megamodules]
The CHAIMS API defines the interface between megaprogrammer and megaprogram; the megaprogram is written in the CHAIMS language.
The CHAIMS protocols define the calls the megamodules have to understand. These protocols differ slightly between distribution protocols, and are defined by an IDL for CORBA, another IDL for DCE, and a Java class for RMI.
Data objects: Blobs
Minimal typing within CHAIMS:
• Integer and boolean only, for control
• All else is placed into Binary Large OBjects (Blobs), transparent to the compiler
Alternatives:
• ASN.1, with conversion routines
• XML
Example: Person_Information (complex)
  Name of Person (complex)
    First Name     string   Joe
    Last Name      string   Smith
  Personal Data (complex)
    Address
    Date of Birth  date     6/21/54
    Soc.Sec.No     string   345-34-345
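A small sketch of how such a blob might travel through a megaprogram follows. It is illustrative only: JSON is used here as a convenient stand-in for ASN.1 or XML, and `encode_blob` / `decode_blob` are hypothetical helper names, not CHAIMS routines.

```python
import json


def encode_blob(value):
    """Encode a structured value as an opaque byte string.

    JSON is an assumption of this sketch, standing in for ASN.1 or XML.
    """
    return json.dumps(value, sort_keys=True).encode("utf-8")


def decode_blob(blob):
    """Decode a blob back into a structured value (done by megamodules)."""
    return json.loads(blob.decode("utf-8"))


# Person_Information from the slide; Address is left out because the
# slide gives no value for it.
person_information = {
    "Name of Person": {"First Name": "Joe", "Last Name": "Smith"},
    "Personal Data": {"Date of Birth": "6/21/54", "Soc.Sec.No": "345-34-345"},
}

blob = encode_blob(person_information)  # produced by one megamodule
forwarded = blob                        # the megaprogram only passes it on
received = decode_blob(forwarded)       # interpreted by another megamodule
```

The megaprogram never inspects `blob`; only the producing and consuming megamodules need to agree on its encoding.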
Wrapper: CHAIMS Compliance
CHAIMS protocol - support all CHAIMS primitives
– if not native, achieved by wrapping legacy codes
• State management and asynchrony:
  » clientId (megamodule handle in CHAIMS language)
  » callId (invocation handle in CHAIMS language)
  » results must be stored for possible extraction(s) until termination of the invocation
• Data transformation:
  » BLOBs must be converted into the megamodule-specific data types (coding/decoding routines)
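The wrapper duties listed above might look roughly like this. It is a simplified sketch, not the CHAIMS wrapper templates: the `Wrapper` class, the JSON blob encoding, the synchronous execution inside `invoke`, and the toy `cost_for_ground` legacy function with its numbers are all assumptions.

```python
import itertools
import json


class Wrapper:
    """Keeps per-client, per-call state around plain legacy functions."""

    def __init__(self, legacy_functions):
        self._legacy = legacy_functions          # method name -> callable
        self._client_ids = itertools.count(1)
        self._call_ids = itertools.count(1)
        self._results = {}                       # (clientId, callId) -> blob

    def setup(self):
        return next(self._client_ids)            # clientId

    def invoke(self, client_id, method, blob):
        params = json.loads(blob.decode("utf-8"))    # decode BLOB
        result = self._legacy[method](**params)      # run legacy code
        call_id = next(self._call_ids)               # callId
        # Store the encoded result until the invocation is terminated.
        self._results[(client_id, call_id)] = json.dumps(result).encode("utf-8")
        return call_id

    def extract(self, client_id, call_id):
        return self._results[(client_id, call_id)]   # repeatable

    def terminate(self, client_id, call_id):
        self._results.pop((client_id, call_id), None)


def cost_for_ground(distance_km, tons):
    """Toy legacy function with a made-up cost formula."""
    return {"cost": distance_km * tons}


wrapper = Wrapper({"Cost_for_Ground": cost_for_ground})
client_id = wrapper.setup()
request = json.dumps({"distance_km": 3000, "tons": 2}).encode("utf-8")
call_id = wrapper.invoke(client_id, "Cost_for_Ground", request)
first = json.loads(wrapper.extract(client_id, call_id))
second = json.loads(wrapper.extract(client_id, call_id))
wrapper.terminate(client_id, call_id)
```

A real wrapper would run the legacy code asynchronously; it is executed inline here only to keep the state-management part easy to follow.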
Architecture: Three Views
• Composition View (megaprogram):
  – composition of megamodules
  – directing of opaque data blobs
• Data View:
  – exchange of data
  – interpretation of data
  – in/between megamodules
• Transport View: moving around data blobs and CHAIMS messages
Layers shown: CHAIMS Layer, Distribution Layer
Objective: Clear separation between composition of services, computation of data, and transport
Scheduler: Decomposed Execution
[Figure: timelines for the execution of a remote method in three modes: synchronous, asynchronous, and decomposed (no benefit for one module). A span of time is marked "available for other methods". Legend: s = setup / set attributes, i = invoke a method, e = extract results.]
Optimized Execution of Modules
[Figure: two timelines for modules M1 to M5, one non-optimized and one optimized by the scheduler according to estimates. M4 is annotated (<M1+M2) and M3 is annotated (>M1+M2). Legend: i = invoke a method, e = extract results, plus markers for data dependencies and for the execution of a module.]
Decomposed Parallel Execution
[Figure: timeline of modules M1 to M5, optimized by the scheduler according to estimates; M4 is annotated (<M1+M2) and M3 (>M1+M2). Legend: set up / set attributes, invoke a method, extract results.]
Long setup times occur, for instance, when a subset of a large database has to be loaded for a simple search, say Transatlantic flights for an optimal arrival.
Decomposed Optimized Execution
[Figure: two timelines of modules M1 to M5, the prior time and the time optimized by the scheduler according to estimates; M4 is annotated (<M1+M2) and M3 (>M1+M2). Legend: set up / set attributes, invoke a method, extract results.]
Scheduling: Simple Example
Each statement group carries two numbers: U = order in the unscheduled megaprogram, P = order in the automatically prescheduled megaprogram.

U P
1 1  cost_ground_ih = cost_mmh.INVOKE ("Cost_for_Ground", List_of_City_Pairs = city_pairs, Goods = info_goods)
2 3  WHILE (cost_ground_ih.EXAMINE() != DONE) {}
     cost_list_ground = cost_ground_ih.EXTRACT()
3 2  cost_air_ih = cost_mmh.INVOKE ("Cost_for_Air", List_of_City_Pairs = city_pairs, Goods = info_goods)
4 4  WHILE (cost_air_ih.EXAMINE() != DONE) {}
     cost_list_air = cost_air_ih.EXTRACT()
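The reordering in this example can be reproduced with a tiny prescheduling pass. This is a sketch of the idea only, not the CHAIMS compiler: the statement tuples and the `preschedule` function are hypothetical, and the pass simply hoists every INVOKE whose inputs are available while keeping the EXTRACTs in their original order.

```python
def preschedule(statements, available):
    """Reorder (kind, target, inputs) statements.

    Every INVOKE is moved to the earliest point where all of its inputs
    are available; EXTRACTs keep their original relative order.
    """
    available = set(available)
    pending = list(statements)
    ordered = []
    while pending:
        # Hoist all INVOKEs that can already be issued.
        ready = [s for s in pending
                 if s[0] == "INVOKE" and set(s[2]) <= available]
        for s in ready:
            ordered.append(s)
            available.add(s[1])
            pending.remove(s)
        # Then place the first EXTRACT whose invocation has been issued.
        for s in pending:
            if s[0] == "EXTRACT" and set(s[2]) <= available:
                ordered.append(s)
                available.add(s[1])
                pending.remove(s)
                break
        else:
            if pending and not ready:
                raise ValueError("statements with unsatisfiable inputs")
    return ordered


unscheduled = [
    ("INVOKE", "cost_ground_ih", ["city_pairs", "info_goods"]),
    ("EXTRACT", "cost_list_ground", ["cost_ground_ih"]),
    ("INVOKE", "cost_air_ih", ["city_pairs", "info_goods"]),
    ("EXTRACT", "cost_list_air", ["cost_air_ih"]),
]
prescheduled = preschedule(unscheduled, ["city_pairs", "info_goods"])
order = [target for _, target, _ in prescheduled]
```

For the two cost invocations, both INVOKEs end up ahead of the first EXTRACT, matching the prescheduled order shown on the slide.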
Iterated Invocations
[Figure: iterated invocations M6.1 to M6.5 on a time axis, compared with the prior time. Legend: set up / set attributes, invoke a method, extract results.]
Avoid repeated setups.
& Repeated Extractions
[Figure: three timelines for iterated invocations M6.1 to M6.5: prior time with distinct invocations; time with shared setup; time with shared setup and partial extract. Legend: set up / set attributes; invoke a method; extract results (partial for iterating, full for presentation).]
Avoid large extracts until satisfied.
Scheduling: Heuristics
INVOKE: call INVOKEs as soon as possible
» may depend on other data
» moving an INVOKE outside of an if-block depends on a cost function (ESTIMATE of this and following functions concerning execution time, dataflow and fees (resources))
EXTRACT: move EXTRACTs to where the result is actually needed
» no sense in checking or waiting for results before they are needed
» instead of waiting, poll all invocations and issue the next possible invocation as soon as its data can be extracted
TERMINATE: terminate invocations that are no longer needed (saves resources)
» not every method invocation has an extract (e.g. print-like functions)
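The polling heuristic can be sketched as a small dataflow loop. This is an illustration under assumptions rather than the CHAIMS scheduler: `run_dataflow`, the task table, the toy route and cost functions, and the made-up Chicago stop and cost figures are all hypothetical, and a thread pool stands in for the remote megamodules.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_dataflow(tasks, initial, poll_interval=0.01):
    """tasks maps a result name to (function, [input names]).

    Each invocation is issued as soon as all its inputs are extracted;
    running invocations are polled instead of being waited on one by one.
    """
    values = dict(initial)
    waiting = dict(tasks)
    running = {}
    issued = []
    with ThreadPoolExecutor() as pool:
        while waiting or running:
            # INVOKE everything whose inputs are available.
            ready = [n for n, (_, deps) in waiting.items()
                     if all(d in values for d in deps)]
            for name in ready:
                fn, deps = waiting.pop(name)
                running[name] = pool.submit(fn, *(values[d] for d in deps))
                issued.append(name)
            if not running:
                raise ValueError("tasks with unsatisfiable inputs")
            # EXAMINE all running invocations, EXTRACT the finished ones.
            for name in [n for n, f in running.items() if f.done()]:
                values[name] = running.pop(name).result()
            time.sleep(poll_interval)
    return values, issued


def all_routes(cities):
    origin, destination = cities
    return [(origin, destination), (origin, "Chicago", destination)]


def ground_cost(routes):
    return [len(route) * 100 for route in routes]


def air_cost(routes):
    return [len(route) * 150 for route in routes]


def best_route(routes, ground, air):
    index = min(range(len(routes)), key=lambda i: min(ground[i], air[i]))
    return routes[index]


tasks = {
    "routes": (all_routes, ["cities"]),
    "cost_ground": (ground_cost, ["routes"]),
    "cost_air": (air_cost, ["routes"]),
    "best": (best_route, ["routes", "cost_ground", "cost_air"]),
}
values, issued = run_dataflow(tasks, {"cities": ("San Diego", "Washington DC")})
```

Both cost invocations are issued in the same pass once the routes are extracted, so they run side by side before the final selection starts.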
Compiling into a Network
[Figure: two diagrams of a Mega Program and Modules A to F, one for the current CHAIMS system and one with distribution dataflow optimization. Legend: control flow, data flow.]
CHAIMS Implementation
• Specify minimal language
  – minimal functions: CALLs, While, If *
  – minimal typing {boolean, integer, string, handles, object}
    » objects encapsulated using ASN.1 standard
  – type conversion in wrappers, service modules *
• Compiler for multiple protocols (one-at-a-time, mixed *)
• Wrapper generation for multiple protocols
• Native modules for I/O, simple mathematics *, other
• Implement API for CORBA, Java RMI, DCE usage
• Wrap / construct several programs for simple demos
• Schedule optimization *
• Demonstrate use in heterogeneous setting
• Define full-scale demonstration
* in process
Conclusion: Research Questions
• Is a Megaprogramming language focusing only on composition feasible?
• Can it exploit on-going progress in client-server models and be protocol independent?
• Can natural parallelism for distributed services be effectively scheduled?
• Can high-level dataflow among distributed modules be optimized?
• Can CHAIMS express clearly a high-level distributed SW architecture?
• Can the approach affect SW process concepts and practice?
Conclusion: Questions not addressed
• Will one Client/Server protocol subsume all others?
  – distributed optimization remains an issue
• Synchronization / Concurrency Control
  – autonomy of sources negates current concepts
  – if modules share databases, then database locks may span setup/terminate all for a megaprogram handle
• Will software vendors consider moving to a service paradigm?
  – need CHAIMS demonstration for evaluation
Integration Science
[Figure: Integration Science shown together with three contributing fields:]
– Artificial Intelligence: knowledge mgmt, models, uncertainty
– Systems Engineering: analysis, documentation, costing
– Databases: access, storage, algebras