e-Business e-Science and the Grid Geoffrey Fox Professor of Computer Science, Informatics, Physics Pervasive Technology Laboratories Indiana University Bloomington IN 47401 Chief Technologist for Anabas Corporation [email protected] http://www.infomall.org http://www.grid2002.org


Page 1: e-Business e-Science and the Grid

e-Business e-Science and the Grid

Geoffrey Fox

Professor of Computer Science, Informatics, Physics

Pervasive Technology Laboratories, Indiana University, Bloomington IN 47401

Chief Technologist for Anabas Corporation

[email protected] http://www.infomall.org http://www.grid2002.org

Page 2: e-Business e-Science and the Grid

Grid Computing: Making The Global Infrastructure a Reality

Based on work done in preparing the book edited with Fran Berman and Anthony J.G. Hey

ISBN: 0-470-85319-0 Hardcover 1080 Pages Published March 2003 http://www.grid2002.org

Page 3: e-Business e-Science and the Grid

e-Business e-Science and the Grid

e-Business captures an emerging view of corporations as dynamic virtual organizations linking employees, customers and stakeholders across the world.

• The growing use of outsourcing is one example

e-Science is the similar vision for scientific research with international participation in large accelerators, satellites or distributed gene analyses.

The Grid integrates the best of the Web, traditional enterprise software, high performance computing and peer-to-peer systems to provide the information technology infrastructure for e-moreorlessanything.

A deluge of data of unprecedented and inevitable size must be managed and understood.

People, computers, data and instruments must be linked.

On-demand assignment of experts, computers, networks and storage resources must be supported.

Page 4: e-Business e-Science and the Grid

So what is a Grid?

Supporting human decision making with a network of at least four large computers, perhaps six or eight small computers, and a great assortment of disc files and magnetic tape units - not to mention remote consoles and teletype stations - all churning away. (Licklider 1960)

Coordinated resource sharing and problem solving in dynamic multi-institutional virtual organizations

Infrastructure that will provide us with the ability to dynamically link together resources as an ensemble to support the execution of large-scale, resource-intensive, and distributed applications.

Realizing the thirty-year dream of science fiction writers who have spun yarns featuring worldwide networks of interconnected computers that behave as a single entity.

Page 5: e-Business e-Science and the Grid

e-Science

e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it. This is a major UK Program

e-Science reflects the growing importance of international laboratories, satellites and sensors and their integrated analysis by distributed teams

CyberInfrastructure is the analogous US initiative

[Figure labels: Imaging Instruments; Computational Resources; Large-Scale Databases; Data Acquisition, Analysis; Advanced Visualization]

Grid Technology supports e-Science and CyberInfrastructure

Page 6: e-Business e-Science and the Grid

Global Terabit Research Network

The Grid software and resources run on top of high performance global networks

Page 7: e-Business e-Science and the Grid

Resources-on-demand

Computing-on-demand uses a dynamically assigned (shared) pool of resources to support excess demand in a flexible, cost-effective fashion

[Figure: Static Assignment with redundancy. Programs A to Z run on Computers 1 to 26; Programs A to Z are repeated on Computers 27 to 52 as Spares]

[Figure: Dynamic on-demand Assignment. Programs A to Z draw on Pool Computers 1 to N, with N < 52]
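The contrast between the two assignment styles can be sketched in a few lines of Python. This is a minimal illustration only: the `ResourcePool` class, its method names and the pool size of 30 are hypothetical and do not come from the slides; only the 26 programs and 52 statically assigned computers follow the figure.

```python
from typing import Dict, List


def static_machine_count(programs: int) -> int:
    """Static assignment with redundancy: one computer plus one spare per program."""
    return 2 * programs


class ResourcePool:
    """Hypothetical shared pool that hands out computers on demand."""

    def __init__(self, size: int) -> None:
        self.free: List[int] = list(range(1, size + 1))  # idle pool computers 1..N
        self.assigned: Dict[str, List[int]] = {}         # program name -> computers held

    def acquire(self, program: str, count: int = 1) -> List[int]:
        """Give `count` idle computers to `program`, or fail if the pool is exhausted."""
        if count > len(self.free):
            raise RuntimeError("pool exhausted")
        granted = [self.free.pop(0) for _ in range(count)]
        self.assigned.setdefault(program, []).extend(granted)
        return granted

    def release(self, program: str) -> None:
        """Return every computer held by `program` to the pool."""
        self.free.extend(self.assigned.pop(program, []))


pool = ResourcePool(size=30)          # assumed N = 30, smaller than the static 52
extra = pool.acquire("Program A", 2)  # Program A needs a second machine for a while
pool.release("Program A")             # and gives both back when demand drops
```

The point of the sketch is that spare capacity is shared: a program borrows machines only while it needs them, so the pool can be smaller than the 52 machines that static assignment with a dedicated spare per program requires.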

Page 8: e-Business e-Science and the Grid

e-Business and (Virtual) Organizations

Enterprise Grid supports information system for an organization; includes “university computer center”, “(digital) library”, sales, marketing, manufacturing …

Outsourcing Grid links different parts of an enterprise together (Gridsourcing)

• Manufacturing plants with designers
• Animators with electronic game or film designers and producers
• Coaches with aspiring players (e-NCAA or e-NFL etc.)

Customer Grid links businesses and their customers as in many web sites such as amazon.com

e-Multimedia can use secure peer-to-peer Grids to link creators, distributors and consumers of digital music, games and films respecting rights

Distance education Grid links teacher at one place, students all over the place, mentors and graders; shared curriculum, homework, live classes …

Page 9: e-Business e-Science and the Grid

e-Defense and e-Crisis

Grids support Command and Control and provide Global Situational Awareness

• Link commanders and frontline troops to each other and to archival and real-time data; link to what-if simulations
• Dynamic heterogeneous wired and wireless networks
• Security and fault tolerance essential

System of Systems; Grid of Grids

• The command and information infrastructure of each ship is a Grid; each fleet is linked together by a Grid; the President is informed by and informs the national defense Grid
• Grids must be heterogeneous and federated

Crisis Management and Response enabled by a Grid linking sensors, disaster managers, and first responders with decision support

Page 10: e-Business e-Science and the Grid

Some Important Classes of Grids

Computational Grids were origin of concepts and link computers across the globe – high latency stops this from being used as parallel machine

Knowledge and Information Grids link sensors and information repositories as in Virtual Observatories or BioInformatics

• More detail on next slide

Education Grids link teachers, learners, parents as a VO with learning tools, distant lectures etc.

e-Science Grids link multidisciplinary researchers across laboratories and universities

Community Grids focus on Grids involving large numbers of peers rather than focusing on linking major resources – links Grid and Peer-to-peer network concepts

Semantic Grid links Grid and AI communities with Semantic Web (ontology/meta-data enriched resources) and Agent concepts

Page 11: e-Business e-Science and the Grid

Information/Knowledge Grids

Distributed (10’s to 1000’s) data sources (instruments, file systems, curated databases …)

Data Deluge: 1 (now) to 100’s petabytes/year (2012)

• Moore’s law for Sensors

Possible filters assigned dynamically (on-demand)

• Run image processing algorithm on telescope image
• Run Gene sequencing algorithm on compiled data

Needs decision support front end with “what-if” simulations

Metadata (provenance) critical to annotate data

Integrate across experiments as in multi-wavelength astronomy

Data Deluge comes from pixels/year available

Page 12: e-Business e-Science and the Grid

2.4 Petabytes Today

Page 13: e-Business e-Science and the Grid

[Figure labels: Database; Closely Coupled Compute Nodes; Analysis and Visualization; Repositories, Federated Databases; Sensor Nets, Streaming Data; Loosely Coupled Filters]

SERVOGrid – Solid Earth Research Virtual Observatory will link Australia, Japan, USA ……

Page 14: e-Business e-Science and the Grid

DAME: Rolls Royce and UK e-Science Program, Distributed Aircraft Maintenance Environment

[Figure labels: In flight data; Airline; Maintenance Centre; Ground Station; Global Network such as SITA; Internet, e-mail, pager; Engine Health (Data) Center]

~ Gigabyte per aircraft per engine per transatlantic flight

~5000 engines

Page 15: e-Business e-Science and the Grid

NASA Aerospace Engineering Grid

• Lift Capabilities • Drag Capabilities • Responsiveness

• Deflection capabilities • Responsiveness

• Thrust performance • Reverse Thrust performance • Responsiveness • Fuel Consumption

• Braking performance • Steering capabilities • Traction • Dampening capabilities

Crew Capabilities: accuracy, perception, stamina, re-action times, SOP’s

Engine Models

Airframe Models

Wing Models

Landing Gear Models

Stabilizer Models

Human Models

Whole system simulations are produced by coupling all of the sub-system simulations

It takes a distributed virtual organization to design, simulate and build a complex system like an aircraft

Page 16: e-Business e-Science and the Grid

Virtual Observatory Astronomy Grid

Integrate Experiments

Radio Far-Infrared Visible

Visible + X-ray

Dust Map

Galaxy Density Map

Page 17: e-Business e-Science and the Grid

e-Chemistry Laboratory

Experiments-on-demand

[Figure labels: X-Ray e-Lab; Analysis; Properties; Properties e-Lab; Simulation; Video; Diffractometer; Globus; Structures Database; Grid Resources; Grid-enabled Output Streams]

Page 18: e-Business e-Science and the Grid

CERN LHC Data Analysis Grid

Page 19: e-Business e-Science and the Grid

Typical Grid Architecture

[Figure labels: Raw (HPC) Resources; Middleware; Database; Portal Services; System Services (repeated around the core); Application Service; User Services; “Core” Grid]

Page 20: e-Business e-Science and the Grid

SERVOGrid Requirements

Seamless Access to Data repositories and large scale computers

Integration of multiple data sources including sensors, databases, file systems with analysis system

• Including filtered OGSA-DAI (Grid database access)

Rich meta-data generation and access with SERVOGrid specific Schema extending openGIS (Geography as a Web service) standards and using Semantic Grid

Portals with component model for user interfaces and web control of all capabilities

Collaboration to support world-wide work

Basic Grid tools: workflow and notification

Page 21: e-Business e-Science and the Grid

Sources of Grid Technology

Grids support distributed collaboratories or virtual organizations integrating concepts from

The Web

Agents

Distributed Objects (CORBA Java/Jini COM)

Globus, Legion, Condor, NetSolve, Ninf and other High Performance Computing activities

Peer-to-peer Networks

With perhaps the Web and P2P networks being the most important for “Information Grids” and Globus for “Compute Grids”

Page 22: e-Business e-Science and the Grid

The Essence of Grid Technology?

We will start from the Web view and assert that basic paradigm is

Meta-data rich Web Services communicating via messages

These have some basic support from some runtime such as .NET, Jini (pure Java), Apache Tomcat+Axis (Web Service toolkit), Enterprise JavaBeans, WebSphere (IBM) or GT3 (Globus Toolkit 3)

• These are the distributed equivalent of operating system functions as in UNIX Shell
• Called Hosting Environment or platform

W3C standard WSDL defines IDL (Interface standard) for Web Services

Page 23: e-Business e-Science and the Grid

A typical Web Service

In principle, services can be in any language (Fortran .. Java .. Perl .. Python) and the interfaces can be method calls, Java RMI Messages, CGI Web invocations, totally compiled away (inlining)

The simplest implementations involve XML messages (SOAP) and programs written in net friendly languages like Java and Python

[Figure labels: Payment, Credit Card; Warehouse, Shipping control; WSDL interfaces; Security; Catalog; Portal Service; Web Services]
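As a rough illustration of the "XML messages (SOAP)" style mentioned on this slide, the sketch below builds and parses a SOAP 1.1 envelope using only the Python standard library. The service namespace `urn:example:shipping`, the operation name `ShipOrder` and the parameter `orderId` are hypothetical, chosen to echo the shipping service in the figure; no real service or WSDL is implied.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SHIP_NS = "urn:example:shipping"                        # hypothetical service namespace


def build_request(operation: str, params: dict) -> bytes:
    """Wrap an operation name and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SHIP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SHIP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_request(message: bytes):
    """Recover the operation name and parameters on the receiving side."""
    envelope = ET.fromstring(message)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]  # first child of the Body element
    operation = call.tag.split("}", 1)[1]          # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


message = build_request("ShipOrder", {"orderId": 42})
operation, params = parse_request(message)
```

Because the message is plain XML, the sender and receiver need not share a language or runtime; they only need to agree on the message structure, which is the role the slide assigns to the WSDL interfaces.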

Page 24: e-Business e-Science and the Grid

Services and Distributed Objects

A web service is a computer program running on either the local or remote machine with a set of well defined interfaces (ports) specified in XML (WSDL)

Web Services (WS) have many similarities with Distributed Object (DO) technology but there are some (important) technical and religious points (not easy to distinguish)

• CORBA Java COM are typical DO technologies
• Agents are typically SOA (Service Oriented Architecture)

Both involve distributed entities but Web Services are more loosely coupled

• WS interact with messages; DO with RPC (Remote Procedure Call)
• DO have “factories”; WS manage instances internally and interaction-specific state not exposed and hence need not be managed
• DO have explicit state (stateful services); WS use context in the messages to link interactions (stateful interactions)

Claim: DO’s do NOT scale; WS build on experience (with CORBA) and do scale
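One way to picture the difference between explicit object state and context carried in messages is the toy comparison below. It is a sketch under stated assumptions: the shopping-cart scenario, the `CartObject` class, the `cart_service` function and the message fields `action`, `item` and `context` are all hypothetical, and real Web Service stacks carry context in their own message formats.

```python
class CartObject:
    """Distributed-object style: the server-side instance holds explicit state."""

    def __init__(self) -> None:
        self.items = []

    def add(self, item: str) -> None:
        self.items.append(item)


def cart_service(message: dict) -> dict:
    """Service style: a stateless handler; the interaction state travels in the message."""
    context = dict(message.get("context", {}))  # copy, so the caller's message is untouched
    items = list(context.get("items", []))
    if message["action"] == "add":
        items.append(message["item"])
    context["items"] = items
    return {"status": "ok", "context": context}


# Each reply's context is sent back with the next request, linking the interactions.
first = cart_service({"action": "add", "item": "book"})
second = cart_service({"action": "add", "item": "film", "context": first["context"]})
```

In the service version any replica of `cart_service` can handle the second request, since nothing about the interaction lives on the server between calls.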

Page 25: e-Business e-Science and the Grid

Details of Web Service Protocol Stack

UDDI finds where programs are

• remote (distributed) programs are just Web Services
• (not a great success)

WSFL links programs together (under revision as BPEL4WS)

WSDL defines interface (methods, parameters, data formats)

SOAP defines structure of message including serialization of information

HTTP is negotiation/transport protocol

TCP/IP is layers 3-4 of OSI

Physical Network is layer 1 of OSI

[Figure: protocol stack with layers labelled UDDI or WSIL; WSFL; WSDL; SOAP or RMI; HTTP or SMTP or IIOP or RMTP; TCP/IP; Physical Network]

Page 26: e-Business e-Science and the Grid

Education as a Web Service

“Learning Object” XML standards already exist

Web Services for virtual university include:

• Registration
• Performance (grading)
• Authoring of Curriculum
• Online laboratories for real and virtual instruments
• Homework submission
• Quizzes of various types (multiple choice, random parameters)
• Assessment data access and analysis
• Synchronous Delivery of Curricula including Audio/Video Conferencing and other synchronous collaborative tools as Web Services
• Scheduling of courses and mentoring sessions
• Asynchronous access, data-mining and knowledge discovery
• Learning Plan agents to guide students and teachers

Page 27: e-Business e-Science and the Grid

Classic Grid Architecture

[Figure labels: Database; Netsolve; Computing; Security; Collaboration; Composition; Content Access; Resources; Clients, Users and Devices; Middle Tier: Brokers, Service Providers]

Middle Tier becomes Web Services

Page 28: e-Business e-Science and the Grid

Some Observations

“Traditional” Grids manage and share asynchronous resources in a rather centralized fashion

Peer-to-peer networks are “just like” Grids with different implementations of message-based services like registration and look-up

Collaboration systems like WebEx/Placeware (Application sharing) or Polycom (audio/video conferencing) can be viewed as Grids

Computers are fast and getting faster. One can afford many strategies that used to be unrealistic, including rich, usually XML-based, messaging

Web Services interact with messages

• Everything (including applications like PowerPoint) will be a Web Service?

• Grids, P2P Networks, Collaborative Environments are (will be) managed message-linked Web Services

Page 29: e-Business e-Science and the Grid

Peer to Peer Grid

A democratic organization

[Figure labels: Database; Peers; User Facing Web Service Interfaces; Service Facing Web Service Interfaces; Event/Message Brokers]
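The role an event/message broker plays between peers can be sketched as a small in-process publish/subscribe hub. This is only an illustration: the `MessageBroker` class, its `subscribe` and `publish` methods and the topic name `sensor/updates` are hypothetical, and a real broker network would route messages between machines rather than call local functions.

```python
from collections import defaultdict


class MessageBroker:
    """Hypothetical broker: peers never call each other, they only exchange messages."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)  # topic -> list of delivery callbacks

    def subscribe(self, topic: str, deliver) -> None:
        """Register a callable that receives every message published on `topic`."""
        self._subscribers[topic].append(deliver)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver `message` to all subscribers of `topic`; return how many received it."""
        receivers = self._subscribers.get(topic, [])
        for deliver in receivers:
            deliver(message)
        return len(receivers)


broker = MessageBroker()
inbox = []                                      # stands in for a peer's message queue
broker.subscribe("sensor/updates", inbox.append)
delivered = broker.publish("sensor/updates", {"reading": 7})
```

The publisher knows only the topic, not who is listening, which is what lets peers join and leave without the other services being reconfigured.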

Page 30: e-Business e-Science and the Grid

System and Application Services?

There are generic Grid system services: security, collaboration, persistent storage, universal access

• OGSA (Open Grid Service Architecture) is implementing these as extended Web Services

An Application Web Service is a capability used either by another service or by a user

• It has input and output ports – data is from sensors or other services

Consider Satellite-based Sensor Operations as a Web Service

• Satellite management (with a web front end)
• Each tracking station is a service
• Image Processing is a pipeline of filters – which can be grouped into different services
• Data storage is an important system service
• Big services built hierarchically from “basic” services

Portals are the user (web browser) interfaces to Web services
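The idea of image processing as a pipeline of filters that can be grouped into larger services can be sketched with ordinary function composition. The filter names `subtract_background` and `threshold`, the `group` helper and the pixel values are hypothetical; in a Grid setting each filter would be a separate Web Service with input and output ports rather than a local function.

```python
from functools import reduce
from typing import Callable, List

Image = List[int]                  # toy stand-in for image data
Filter = Callable[[Image], Image]  # a filter has one input port and one output port


def subtract_background(level: int) -> Filter:
    """Basic filter: remove a constant background level, clipping at zero."""
    return lambda image: [max(pixel - level, 0) for pixel in image]


def threshold(cutoff: int) -> Filter:
    """Basic filter: zero out pixels below the cutoff."""
    return lambda image: [pixel if pixel >= cutoff else 0 for pixel in image]


def group(*filters: Filter) -> Filter:
    """Build a bigger service from basic ones, applied left to right."""
    return lambda image: reduce(lambda data, f: f(data), filters, image)


cleanup_service = group(subtract_background(10), threshold(5))
# A grouped service is itself a filter, so services nest hierarchically.
full_service = group(cleanup_service, threshold(15))
result = cleanup_service([8, 12, 30, 14])
```

Because `group` returns something with the same shape as a basic filter, the decision about how many filters to bundle into one service can be changed without touching the filters themselves.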

Page 31: e-Business e-Science and the Grid

Satellite Science Grid Environment

[Figure labels: Sensor Data as a Web service (WS); Data Analysis WS; Sensor Management WS; Visualization WS; Simulation WS; Filter1 WS, Filter2 WS, Filter3 WS (Build as multiple Filter Web Services); Prog1 WS, Prog2 WS (Build as multiple interdisciplinary Programs)]

Page 32: e-Business e-Science and the Grid

What is Happening?

Grid ideas are being developed in (at least) two communities

• Web Service – W3C, OASIS
• Grid Forum (High Performance Computing, e-Science)

Service Standards are being debated

Grid Operational Infrastructure is being deployed

Grid Architecture and core software being developed

Particular System Services are being developed “centrally” – OGSA framework for this in

Lots of fields are setting domain specific standards and building domain specific services

There is a lot of hype

Grids are viewed differently in different areas

• Largely “computing-on-demand” in industry (IBM, Oracle, HP, Sun)
• Largely distributed collaboratories in academia

Page 33: e-Business e-Science and the Grid

OGSA OGSI & Hosting Environments

• Start with Web Services in a hosting environment

• Add OGSI to get a Grid service and a component model

• Add OGSA to get Interoperable Grid “correcting” differences in base platform and adding key functionalities

[Figure labels: Network; Hosting Environment for WS; OGSI on Web Services; Broadly applicable services: registry, authorization, monitoring, data access, etc., etc.; More specialized services: data replication, workflow, etc., etc.; Domain-specific services. Annotations: OGSA Environment; Possibly OGSA; Not OGSA; Given to us from on high]

Page 34: e-Business e-Science and the Grid

Technical Activities of Note

• Look at different styles of Grids such as Autonomic (Robust Reliable Resilient)

• New Grid architectures hard due to investment required

• Critical Services Such as
– Security – build message based not connection based
– Notification – event services
– Metadata – Use Semantic Web, provenance
– Databases and repositories – instruments, sensors
– Computing – Submit job, scheduling, distributed file systems
– Visualization, Computational Steering
– Fabric and Service Management
– Network performance

• Program the Grid – Workflow

• Access the Grid – Portals, Grid Computing Environments

Page 35: e-Business e-Science and the Grid

Issues and Types of Grid Services

• 1) Types of Grid
– R3
– Lightweight
– P2P
– Federation and Interoperability

• 2) Core Infrastructure and Hosting Environment
– Service Management
– Component Model
– Service wrapper/Invocation
– Messaging

• 3) Security Services
– Certificate Authority
– Authentication
– Authorization
– Policy

• 4) Workflow Services and Programming Model
– Enactment Engines (Runtime)
– Languages and Programming
– Compiler
– Composition/Development

• 5) Notification Services

• 6) Metadata and Information Services
– Basic including Registry
– Semantically rich Services and meta-data
– Information Aggregation (events)
– Provenance

• 7) Information Grid Services
– OGSA-DAI/DAIT
– Integration with compute resources
– P2P and database models

• 8) Compute/File Grid Services
– Job Submission
– Job Planning Scheduling Management
– Access to Remote Files, Storage and Computers
– Replica (cache) Management
– Virtual Data
– Parallel Computing

• 9) Other services including
– Grid Shell
– Accounting
– Fabric Management
– Visualization Data-mining and Computational Steering
– Collaboration

• 10) Portals and Problem Solving Environments

• 11) Network Services
– Performance
– Reservation
– Operations

Page 36: e-Business e-Science and the Grid

Technology Components of (Services in) a Computing Grid

[Figure labels: 1: Job Management Service (Grid Service Interface to user or program client); 1: Plan Execution; 2: Schedule and control Execution; 3: Access to Remote Computers; 4: Job Submittal; 5: Data Transfer; 6: File and Storage Access; 7: Cache Data Replicas; 8: Virtual Data; 9: Grid MPI; 10: Job Status; Remote Grid Service; Data]

Page 37: e-Business e-Science and the Grid

Conclusions

Grids are inevitable and pervasive

Can expect Web Services and Grids to merge with a common set of general principles but different implementations with different scaling and functionality trade-offs

Enough is known that one can start today

We will be flooded with data, information and purported knowledge

One should be preparing Grid strategies; understanding relevant Web and Grid standards and developing new domain specific standards

Note many existing (standards) efforts assume client-server and not a brokered service model; these will need to change!

Page 38: e-Business e-Science and the Grid

Grid Computing: Making The Global Infrastructure a Reality

Fran Berman, Anthony J.G. Hey, Geoffrey Fox

ISBN: 0-470-85319-0 Hardcover 1080 Pages Published March 2003 http://www.grid2002.org