The Moving Code Project – Sharing Computing Components on the Web

Faculty of Environmental Sciences, Geoinformation Systems

Matthias Müller

We like to move it, move it! – the "Moving Code" idea

Cloud machine / Workstation GIS / Web Service

A piece of implemented computational logic:
- Well defined
- Tested
- Useful for others
- Robust
- … worth publishing

The roots of service-oriented processing …

- Think of a processing tool as a service ("get me a haircut"), but not necessarily a Web Service
- Services are well-defined
- Services are hopefully robust

Having WPS process descriptions for processing functions is a first major step towards comparing different functions at the interface level

Top-down approach: first agree on a convenient process interface between service provider and consumer, then start or adjust the implementation

Use cases

Process sensitive data (which cannot be transported over a network for legal reasons)

Publish well-defined processes from scientific research (imagine an impact factor for well-defined, well-programmed, well-tested models!)

Approach the scaling issue in a distributed environment (it is not economical to re-implement each new function in every product / library)

Share new processing functions in an interoperable fashion without having to provide a WPS server

Save bandwidth – move processes close to the data

Data: 5 GB

Coded process: 5 KB
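The size difference can be put into numbers with a little arithmetic. The sketch below assumes decimal units (1 GB = 10^9 bytes) and a hypothetical 100 Mbit/s link; the link speed is an illustrative value, not a figure from the slides, and latency and protocol overhead are ignored.

```python
# Idealized transfer-time comparison; the link speed is an assumed value.
DATA_BYTES = 5 * 10**9               # 5 GB of data (decimal units assumed)
CODE_BYTES = 5 * 10**3               # 5 KB of packaged code
LINK_BITS_PER_SECOND = 100 * 10**6   # hypothetical 100 Mbit/s link


def transfer_seconds(size_bytes: int, bits_per_second: int) -> float:
    """Return the idealized transfer time, ignoring latency and overhead."""
    return size_bytes * 8 / bits_per_second


size_ratio = DATA_BYTES // CODE_BYTES
data_seconds = transfer_seconds(DATA_BYTES, LINK_BITS_PER_SECOND)
code_seconds = transfer_seconds(CODE_BYTES, LINK_BITS_PER_SECOND)

print(f"size ratio: {size_ratio:,} : 1")
print(f"moving the data: {data_seconds:.0f} s")
print(f"moving the code: {code_seconds * 1000:.1f} ms")
```

Under these assumptions the data is a million times larger than the code, so shipping the process instead of the data turns minutes of transfer into well under a second.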

Requirements for code exchange across systems

1. What functionality is provided?

2. What software environment is required to run the code?

3. What hardware is required for robust execution?

4. What are the legal conditions for use?

5. (What is a proper packaging mechanism for shipping?)

Multi-level description approach

Adopted from service models in cloud environments:
- SaaS (Software as a Service): algorithmic functionality
- PaaS (Platform as a Service): required software environment
- IaaS (Infrastructure as a Service): required hardware resources

Plus exploitation rights

Related standards

- SaaS: OGC WPS Standard, Process Descriptions
- PaaS: (URI registry)
- IaaS: OCCI (Open Cloud Computing Interface specification)
- Exploitation Rights: Creative Commons Modular license

1. Functional Contracts – SaaS

WPS process descriptions can be used to describe process interfaces (without actually using the WPS protocol):
- What are the parameters of a process?
- What are they for?
- Are they inputs or outputs?
- Are they mandatory or optional (cardinality in general)?
- Which data format is required (e.g. GeoJSON, GeoTIFF, …)?
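The questions above map naturally onto a small data structure. The following is a minimal sketch, not the WPS schema: the class names, field names and the "buffer" example process are all hypothetical and only illustrate how direction, cardinality and format can be recorded and checked without any WPS server involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parameter:
    """One process parameter; field names are illustrative, not WPS elements."""
    identifier: str
    purpose: str              # what the parameter is for
    direction: str            # "input" or "output"
    min_occurs: int = 1       # 0 means optional
    max_occurs: int = 1
    formats: tuple = ()       # accepted data formats, e.g. ("GeoJSON",)


@dataclass(frozen=True)
class FunctionalContract:
    process_id: str
    parameters: tuple


def check_inputs(contract, supplied):
    """Return a list of cardinality violations for the supplied input values."""
    problems = []
    for parameter in contract.parameters:
        if parameter.direction != "input":
            continue
        count = len(supplied.get(parameter.identifier, []))
        if not parameter.min_occurs <= count <= parameter.max_occurs:
            problems.append(
                f"{parameter.identifier}: expected {parameter.min_occurs}"
                f"..{parameter.max_occurs} value(s), got {count}"
            )
    return problems


# Hypothetical example process used only for illustration.
BUFFER = FunctionalContract(
    process_id="buffer",
    parameters=(
        Parameter("features", "geometries to buffer", "input", formats=("GeoJSON",)),
        Parameter("distance", "buffer distance", "input"),
        Parameter("segments", "segments per quarter circle", "input", min_occurs=0),
        Parameter("result", "buffered geometries", "output", formats=("GeoJSON",)),
    ),
)

print(check_inputs(BUFFER, {"features": ["roads.geojson"]}))
```

Two implementations that accept the same contract object are comparable at the interface level, whatever software they run on.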

2. Contracted Platforms - PaaS

There is a limited number of well-known, coarse-grained APIs in the geospatial domain:
- GDAL (which usually includes NumPy)
- GRASS
- ArcGIS
- R
- Matlab
- …

Probably 90% of people will use these APIs to produce their computational logic

Business models of PaaS providers can support pre-defined platform configurations
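Matching a package against a pre-defined platform configuration can be as simple as a set comparison. This is a sketch under assumptions: the platform URIs below are made up for illustration, since the slides only state that platforms are identified through a URI registry.

```python
# Hypothetical platform identifiers; real registry URIs are not given in the slides.
REQUIRED_PLATFORMS = {
    "http://example.org/platform/gdal",
    "http://example.org/platform/python",
}

PROVIDER_CONFIGURATION = {
    "http://example.org/platform/gdal",
    "http://example.org/platform/python",
    "http://example.org/platform/r",
}


def missing_platforms(required, provided):
    """Return the required platform URIs that the provider does not offer."""
    return sorted(set(required) - set(provided))


print(missing_platforms(REQUIRED_PLATFORMS, PROVIDER_CONFIGURATION))
```

An empty result means the provider's configuration satisfies the platform contract of the package.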

3. Contracted Infrastructure – IaaS

Service-oriented operation means robust execution of a service under a broad range of conditions

… is a prerequisite to operate on large data volumes

Making algorithms fit for large amounts of data is not an easy task but worth the effort

The contracted infrastructure of a code package states the expected hardware requirements for a single execution
- This also helps to check the feasibility of concurrent executions on the same machine
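The feasibility check for concurrent executions follows directly from the per-execution requirements. The sketch below assumes that memory and CPU cores are the only constrained resources and that executions do not share either; the class name, field names and numbers are hypothetical and not taken from OCCI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hardware:
    """Illustrative resource description; not an OCCI data structure."""
    memory_mb: int
    cpu_cores: int


def max_concurrent_executions(per_execution: Hardware, machine: Hardware) -> int:
    """Return how many single executions fit on the machine at the same time."""
    if per_execution.memory_mb <= 0 or per_execution.cpu_cores <= 0:
        raise ValueError("per-execution requirements must be positive")
    return min(
        machine.memory_mb // per_execution.memory_mb,
        machine.cpu_cores // per_execution.cpu_cores,
    )


REQUIRED = Hardware(memory_mb=2048, cpu_cores=2)   # one execution (assumed values)
MACHINE = Hardware(memory_mb=16384, cpu_cores=4)   # target machine (assumed values)

print(max_concurrent_executions(REQUIRED, MACHINE))
```

In this example memory would allow eight executions, but the four cores limit the machine to two.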

Different implementations, same functional contracts

- SaaS: functionality, interface
- PaaS: interpreters, dependencies
- IaaS: architecture, memory, processor

[Figure: layered diagram with the labels Process (#!/bin/bash); Platform A, Platform B, Platform ...; System 1, System 2, System 3, System 4, System ...]

4. Exploitation Rights

Implementations will surely have associated usage rights or licensing conditions

Creative Commons is a widely used standardized approach, also for software licenses

… but there is also a need to support other types of open source or commercial licenses

Possibilities: use the structured CC approach to licensing, or reference a "traditional" license

Moving Code currently uses URLs to reference license documents but prefers CC

5. Packaging Mechanism – Workspaces and Containers

Structured ZIP archive:
- Package description (XML)
- Workspace: a transferable directory structure
  - Executable: a well-known container format (.py, .jar, …)
  - Additional elements (e.g. constant data, fine-grained dependencies)
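The archive layout can be sketched with the Python standard library. This is an illustration under assumptions, not the mc-schema format: the file name `packagedescription.xml`, the root element `packageDescription` and the folder name `workspace` are hypothetical, and the description contains only two fields.

```python
import io
import xml.etree.ElementTree as ET
import zipfile


def build_package(executable_name: str, executable_source: str) -> bytes:
    """Build an in-memory ZIP holding a package description and a workspace."""
    # Minimal, made-up description document; the real schema is richer.
    description = ET.Element("packageDescription")
    workspace = ET.SubElement(description, "workspace")
    ET.SubElement(workspace, "workspaceRoot").text = "workspace"
    ET.SubElement(workspace, "executableLocation").text = executable_name

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(
            "packagedescription.xml",
            ET.tostring(description, encoding="unicode"),
        )
        archive.writestr(f"workspace/{executable_name}", executable_source)
    return buffer.getvalue()


package = build_package("buffer.py", "print('hello from the package')\n")

# Read the archive back to show its structure.
with zipfile.ZipFile(io.BytesIO(package)) as archive:
    names = archive.namelist()

print(names)
```

Because the result is an ordinary ZIP file, any system can unpack and inspect it before deciding whether to execute the contained code.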

Towards a Web archive for self-contained, well-defined processing functions

Geoprocessing Repository / „Appstore“

MovingCode Framework – Components

- mc-schema: descriptive schema and packaging format
- mc-runtime: API for reading, writing and executing MC packages
- 52N WPS Framework: MC-Plugin
- <other products>: MC-Plugin
- Process Repository / "Appstore" / web-based toolbox: use a catalogue, download and use directly …

Delivery mechanism – Atom Feeds

- Can be subscribed to
- Human-readable: Web browsers, RSS readers / mail clients
- Machine-readable content
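The machine-readable side can be illustrated by reading a feed with the standard library. The feed below is invented for this sketch, and it assumes that each entry points to its package through a link with `rel="enclosure"`; the slides do not specify how packages are attached to entries.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom 1.0 XML namespace

# Hypothetical feed content; URLs and identifiers are placeholders.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example process repository</title>
  <entry>
    <title>Buffer</title>
    <id>urn:example:process:buffer</id>
    <updated>2013-01-01T00:00:00Z</updated>
    <link rel="enclosure" type="application/zip"
          href="http://example.org/packages/buffer.zip"/>
  </entry>
</feed>"""


def list_packages(feed_xml: str) -> list:
    """Return (title, download URL) pairs for all entries with an enclosure."""
    root = ET.fromstring(feed_xml)
    packages = []
    for entry in root.findall(f"{ATOM}entry"):
        title = entry.findtext(f"{ATOM}title")
        for link in entry.findall(f"{ATOM}link"):
            if link.get("rel") == "enclosure":
                packages.append((title, link.get("href")))
    return packages


print(list_packages(SAMPLE_FEED))
```

The same document renders as a readable list in a browser or feed reader, which is what makes Atom attractive as a delivery mechanism.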

Package Description: Schema

- mvc::ContractedFunctionality
- wps::ProcessDescription (inherits from WPS specification)
- mvc::ContractedPlatform
  + runtimeComponent [1..*]
- mvc::ContractedInfrastructure (inherits from OCCI specification)
- mvc::ExploitationRights
  + CreativeCommonsLicense [0..1]
  + TraditionalLicenseDocument [0..1]
- mvc::Workspace
  + workspaceRoot [1]
  + executableLocation [1]
  + containerType [1]
  + executionParameters [1..*]
- ... (inherits from other standard)

Association labels in the diagram: implements, complies, complies, liable to
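The multiplicities in the schema can be expressed and checked in a few lines. This is a rough sketch, not the mc-schema itself: the Python class layout is an assumption, `runtimeComponent` is assumed to belong to the platform description, and the functionality and infrastructure parts are left out because the diagram lists no attributes for them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExploitationRights:
    creative_commons_license: Optional[str] = None       # [0..1]
    traditional_license_document: Optional[str] = None   # [0..1]


@dataclass(frozen=True)
class ContractedPlatform:
    runtime_components: tuple                             # [1..*]


@dataclass(frozen=True)
class Workspace:
    workspace_root: str                                   # [1]
    executable_location: str                              # [1]
    container_type: str                                   # [1]
    execution_parameters: tuple                           # [1..*]


def multiplicity_errors(platform: ContractedPlatform, workspace: Workspace) -> list:
    """Return violations of the [1] and [1..*] multiplicities."""
    errors = []
    if len(platform.runtime_components) < 1:
        errors.append("runtimeComponent: at least one required")
    if len(workspace.execution_parameters) < 1:
        errors.append("executionParameters: at least one required")
    for name in ("workspace_root", "executable_location", "container_type"):
        if not getattr(workspace, name):
            errors.append(f"{name}: a non-empty value is required")
    return errors


# Hypothetical values for illustration only.
rights = ExploitationRights(creative_commons_license="http://example.org/license/cc")
platform = ContractedPlatform(runtime_components=("http://example.org/platform/python",))
workspace = Workspace("workspace", "buffer.py", "python-script", ("features", "distance"))

print(multiplicity_errors(platform, workspace))
```

The optional license fields need no check because `Optional` already allows zero or one value.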

Package Creation: Do it top down

Contract-first: Start with the WPS interface description

Specify container format and required software platform (tested so far: any command line, Java, ArcToolbox)

Map the I/O parameters from the WPS description to the container's interface

Specify hardware and license

Package and publish
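For a command-line container, the mapping step amounts to translating named process parameters into arguments. The sketch below is one possible approach, not the mc-runtime implementation: the flag names, the script name `./buffer.sh` and the parameter identifiers are hypothetical.

```python
import shlex

# Hypothetical mapping from process parameter identifiers to command-line flags.
PARAMETER_MAP = [
    ("features", "--in"),
    ("distance", "--distance"),
    ("result", "--out"),
]


def build_command(executable, parameter_map, values):
    """Translate named parameter values into an argument vector, in map order."""
    argv = [executable]
    for identifier, flag in parameter_map:
        # A parameter may occur several times; emit the flag once per value.
        for value in values.get(identifier, []):
            argv += [flag, str(value)]
    return argv


argv = build_command(
    "./buffer.sh",
    PARAMETER_MAP,
    {"features": ["roads.geojson"], "distance": [25], "result": ["out.geojson"]},
)

print(shlex.join(argv))
```

Keeping the mapping as data rather than code means the same runtime can drive any command-line tool once its package description is written.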

Summary / Remarks / Further thoughts

Relies on well-defined or „standardized“ environments

Easy to require but hard to enforce

Needs a common agreement on container types (command-line interfaces come out of the box)

The packaging scheme for reusable processing logic relies as far as possible on open standards

Sandbox testing of new packages (replicate the runtime environment)

Generate visibility for scientific models and analysis functions

A bazaar for processes?
