Towards Autonomic e-Science Ecosystems


Cécile Germain-Renaud

Laboratoire de Recherche en Informatique, Université Paris-Sud, CNRS, INRIA

Outline

☀ Computational ecosystems
☂ The Clouds
☂ Challenges
☀ Autonomics

30/01/11 ASSYST meeting: Opening the Cloud

The requirements of e-science

“Cyberinfrastructure integrates hardware for computing, data and networks, digitally-enabled sensors, observatories and experimental facilities, and an interoperable suite of software and middleware services and tools…”

NSF’s Cyberinfrastructure Vision for 21st Century Discovery

An old dream

«A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high computational capabilities.» I. Foster, C. Kesselman, The Grid, 1998

UCLA press release on the creation of ARPANET, 1969

Grids are a reality

•  Several large deployments in routine production
    •  UK National Grid Service (NGS)
    •  European Grid Infrastructure (EGI-EGEE)
    •  TeraGrid
    •  Open Science Grid (OSG)
    •  DEISA
    •  …

The EGEE/EGI grid

LHC is the
•  Largest (26 km),
•  Fastest (14 TeV),
•  Coldest (1.9 K),
•  Emptiest (10⁻¹³ atm)
machine.

EGEE/EGI is the
•  Largest (40K CPUs),
•  Most distributed (250 sites),
•  Most used (300K jobs/day)
computer system.

Atlas Collaboration (one in four)
•  3000 scientists
•  38 countries
•  174 universities and labs

Cyberinfrastructure => Cyber-Ecosystems

21st century Science and Engineering: New Paradigms & Practices

•  Fundamentally data-driven/data intensive

•  Fundamentally collaborative

Source: M. Parashar eSI visitor Seminar /www.nesc.ac.uk/action/esi/

Unprecedented Opportunities

For science and engineering

•  Knowledge-based, information/data-driven, context/content-aware, computationally intensive, pervasive, ...
•  Holistic applications: integrate on-demand computations, experiments, observations, data, …
•  To manage, control, predict, adapt, optimize, …
•  New paradigms and practices for existing goals or new thinking

e-science ecosystems

•  A major requirement is pervasiveness: on-demand, integrated, transparent
•  Continuity, not revolution: we must learn from experience

Experience with the EGEE/EGI grid

[Figure: EGEE CPU usage (%) on a logarithmic scale from 0.10% to 100.00%, by category (AA, CC, ES, F, HEP, INF, LS, MV, OTH, UNK), for years Y0, Y1 and Y2]

Source: Report on Utilization of EGEE support services and infrastructure, May 2010

e-science ecosystems

•  A major requirement is pervasiveness: on-demand, integrated, transparent
•  Continuity, not revolution: we must learn from experience
•  Organized scientific communities are committed to globalized homogeneous systems. Individualized science is not (yet?). Heterogeneous high-level systems are still in the design stage.

Outline

•  Computational ecosystems
•  The Clouds
•  Challenges
•  Autonomics

A more pervasive technology

Source: William Vambenepe's Keynote at Cloud Connect 2010, http://stage.vambenepe.com/archives/1355

SaaS: Software as a Service

How to deliver/consume/manage such services

•  Cloud provides increased infrastructure flexibility: excellent, but not the bottleneck
•  Application or user-oriented flexibility
•  Control and orchestration of the holistic applications across specialized and heterogeneous components, whether local, in a grid or in a cloud
•  Agility as the capacity to reconfigure, reorganize the internal processes

«The bottom line is that any distinction between SaaS and POWA (Plain Old Web Applications) is at worst arbitrary and at best concerned with the business relationship between the provider and the consumer rather than technical aspects of the application.» Same source

The Grid experience

«Grids are defined by coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations. The sharing is, necessarily, highly controlled, with resource providers and consumers defining clearly and carefully just what is shared, who is allowed to share, and the conditions under which sharing occurs» Ian Foster, 2000

«A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high computational capabilities.» I. Foster, C. Kesselman, The Grid, 1998

Consumers

Different users and requirements across and within the collaborations

Providers

What about GPUs?

•  A new digital divide: HPC and personal computers moving to GPUs, business and e-science moving to clouds?
•  Grids might be amenable to GPUs; virtualized GPUs are a nascent research area/technology

The message

•  DEFINITELY NOT “Cloud is a buzzword”
•  A technology, not a silver bullet
•  Both e-science and business require
    •  Efficient integration of large datasets with computing
    •  Pervasiveness
•  e-science has specific requirements
    •  Organized sharing: data and funding; technical and political issues
    •  Performance: not always, but a strong cultural bias/feature.

Outline

•  Computational ecosystems
•  The Clouds
•  Challenges
•  Autonomics

The complexity crisis

Source: IDC 2008, retrieved from http://www.vmware.com/files/pdf/Virtualization-application-based-cost-model-WP-EN.pdf

The complexity crisis in action

Source: http://www.teach-ict.com/news/news_stories/news_computer_failures.htm

Implementing Pervasiveness and Sharing

Multi-scale feedbacks

Configuring the middleware

Source: James Casey’s talk at EGEE’09

Running the middleware

gLite prediction error for queuing time

User behavior

Users/filegroups/hosts with AVIZ GraphDice

Complexity AND uncertainty

•  As a distributed system
    •  Components and communications come and go: in dynamic (P2P) systems, but in managed systems as well
    •  CAP (Brewer’s) theorem: at most two of Consistency, Availability and Partition tolerance can be guaranteed
•  As a dynamic(al) system
    •  Entities change behavior as an effect of unexpected feedbacks, emergent behavior
    •  Self-organized criticality, minority games, ...
•  Lack of complete and common knowledge: information uncertainty
    •  Monitoring is distributed too
    •  Resolution and calibration
    •  Semantics and ontologies

Complexity AND uncertainty

For applications too

•  Opportunistic behaviors
    •  Space-time, accuracy, and more generally objective adaptivity
    •  Context-awareness as required by a CAP-prone environment
•  Dynamic and complex coupling and interactions
    •  multi-physics, multi-model, multi-resolution, …
•  Trust in data and software
    •  Not only for P2P systems

Challenges Summary

•  Current levels of scale, complexity and dynamism make it infeasible for humans to effectively manage and control systems and applications
•  Computing ecosystems, with their very large numbers of hardware and software components interacting with very large data, are complex systems that are currently very difficult to program
•  Computing ecosystems are difficult to manage because of the heterogeneity of workflows, datasets and operating environment.
•  The ability of an application to self-adapt by incorporating dynamic inputs along its execution needs to be formulated through a general and principled programming model

Outline

•  Computational ecosystems
•  The Clouds
•  Challenges
•  Autonomics

What is Autonomic Computing?

“Computing systems that manage themselves in accordance with high-level objectives from humans” Kephart and Chess, The Vision of Autonomic Computing, IEEE Computer, 2003

Milestones

•  IBM Vision and Manifesto, 2001
•  J. O. Kephart and D. M. Chess. The vision of autonomic computing. IEEE Computer, 36(1), 2003
•  IEEE International Conference on Autonomic Computing series, since 2004
•  IEEE Task Force on Autonomous and Autonomic Systems, 2006
•  ECML PKDD 2006 Tutorial/Workshop: Autonomic Computing: A New Challenge for Machine Learning, I. Rish and G. Tesauro
•  ACM Transactions on Autonomous and Adaptive Systems (TAAS), 2006
•  Autonomic Computing: Concepts, Infrastructure and Applications, M. Parashar and S. Hariri (Eds.), CRC Press, 2006
•  The NSF Center for Autonomic Computing, 2008
•  International Journal of Autonomic Computing (IJAC), Inderscience Publishers, 2009
•  Panel at the 1st GMAC workshop: The convergence of Grids, Clouds and Autonomics, 2009

Self-management

•  Self-Configuration: automated configuration of components, systems according to high-level policies; rest of system adjusts seamlessly.
•  Self-Healing: automated detection, diagnosis, and repair of localized software/hardware problems.
•  Self-Optimization: automatic and continual adaptive tuning of hundreds of parameters (database params, server params, …) affecting performance & efficiency.
•  Self-Protection: automated defense against malicious attacks or cascading failures; use early warning to anticipate and prevent system-wide failures.

The Autonomic Nervous System

•  The most sophisticated example of autonomic behavior.
•  Regulates and maintains homeostasis: maintains structure and functions by means of a multiplicity of dynamic equilibriums that are rigorously controlled by interdependent regulation mechanisms.
•  Not all parameters have the same urgency; essential parameters are monitored more closely.

Ashby’s Ultrastable System

Source: “Autonomic Computing: An Overview,” M. Parashar and S. Hariri, UPP 2004, Mont Saint-Michel, France, Editors: J.-P. Banâtre et al., LNCS, Springer Verlag, Vol. 3566, pp. 247-259, 2005.

A control theory vision

And/or Self-awareness

The MAPE-K loop

[Figure: the MAPE-K loop. An Autonomic Manager (Monitor, Analyze, Plan, Execute, around shared Knowledge) and a Managed Element, each labelled ES]

•  Environment sensors
•  Network instrumentation
•  Users context
•  Application requirements

High-dimensional, high-volume ‘raw’ data
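The loop in the figure can be sketched in a few lines of code. This is only an illustration under stated assumptions: the managed element (a pool of workers serving a fixed demand), the objective `target_load`, the `tolerance` dead band and all function names are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Knowledge:
    """Shared knowledge: the high-level objective plus observed history."""
    target_load: float = 0.7   # hypothetical objective: load per worker
    tolerance: float = 0.1     # hypothetical dead band around the objective
    history: list = field(default_factory=list)


class ManagedElement:
    """Toy managed element (an assumption): workers serving a fixed demand."""

    def __init__(self, workers: int = 2, demand: float = 4.0):
        self.workers = workers
        self.demand = demand

    def sense(self) -> float:
        """Sensor: current load per worker."""
        return self.demand / self.workers

    def actuate(self, delta: int) -> None:
        """Effector: add or remove workers, never dropping below one."""
        self.workers = max(1, self.workers + delta)


def monitor(element, k):
    """Monitor: record the raw measurement, return a short recent window."""
    k.history.append(element.sense())
    return k.history[-3:]


def analyze(window, k):
    """Analyze: deviation of the smoothed load from the objective."""
    return mean(window) - k.target_load


def plan(deviation, k):
    """Plan: choose a corrective action only outside the dead band."""
    if deviation > k.tolerance:
        return 1
    if deviation < -k.tolerance:
        return -1
    return 0


def execute(element, delta):
    """Execute: apply the planned action through the effector."""
    element.actuate(delta)


def run_loop(element, k, steps=20):
    for _ in range(steps):
        window = monitor(element, k)
        deviation = analyze(window, k)
        execute(element, plan(deviation, k))
    return element


knowledge = Knowledge()
element = run_loop(ManagedElement(), knowledge)
print(element.workers, round(element.sense(), 3))
```

The human only sets `target_load`; the loop itself decides when to add or remove workers, which is the sense of "high-level objectives" in the definition quoted earlier.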

The MAPE-K loop

State-Space and Data Abstraction

•  Streaming: on-line data mining, clustering, ...
•  Dimensionality reduction
•  Active learning
•  Ontological inference

High-dimensional, high-volume ‘raw’ data

Compressed, ‘informative’ data
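As one possible illustration of on-line clustering over a stream, the sketch below uses the sequential (MacQueen-style) k-means update, which sees each value once and keeps only the centroids and their counts. The choice of algorithm, the one-dimensional data and the function name `online_kmeans` are assumptions made for the example, not the method used in the talk.

```python
def online_kmeans(stream, initial_centroids):
    """Sequential k-means on a stream of 1-D values.

    Each value is seen once: it is assigned to the nearest centroid,
    which then moves towards it by a step of 1/count (a running mean).
    """
    centroids = list(initial_centroids)
    counts = [0] * len(centroids)
    for x in stream:
        # Assign the value to the nearest centroid.
        j = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
        counts[j] += 1
        # Incremental mean update: no past value needs to be stored.
        centroids[j] += (x - centroids[j]) / counts[j]
    return centroids, counts


# Hypothetical monitoring values drawn from two regimes.
stream = [1.0, 9.0, 1.2, 9.2, 0.8, 8.8]
centroids, counts = online_kmeans(stream, [0.0, 10.0])
print(centroids, counts)
```

The raw stream is reduced to two centroids and two counts, a small instance of turning high-volume data into a compressed summary.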

The MAPE-K loop

Learn predictive models
•  Classification, regression, time series, MCMC

Decision-making
•  Exploration vs Exploitation
•  Game theory, Risk analysis
•  Reinforcement Learning, bandits

Compressed, ‘informative’ data
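The exploration vs exploitation trade-off can be made concrete with a multi-armed bandit. The sketch below uses the epsilon-greedy strategy with Bernoulli rewards; the strategy, the reward probabilities and the function name are assumptions chosen for illustration (an arm could stand for a candidate allocation policy), not material from the talk.

```python
import random


def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit with Bernoulli rewards.

    With probability epsilon a random arm is explored; otherwise the arm
    with the best current estimate is exploited.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                           # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Running average of the rewards observed on this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return counts, estimates


# Hypothetical success probabilities of three candidate actions.
counts, estimates = epsilon_greedy([0.2, 0.5, 0.8])
print(counts, [round(e, 2) for e in estimates])
```

Every exploratory pull of a poor arm has a cost, which is the concern behind the "exploration penalties" item in the technical issues for RL.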

The MAPE-K loop

Knowledge-based, e.g. ontologies, a-priori models, intelligent initialisation

Or

Tabula-rasa Knowledge
•  Avoids knowledge-intensive model building

Criteria
•  Independent knowledge and learning
•  Theoretical guarantees of improvement

Technical issues: example for RL

Need enhancement to vanilla Reinforcement Learning
•  Observation uncertainty
•  Historical dependencies may exist: MDP might not be an exact model
•  Convergence not guaranteed
•  Lack of stationarity
•  Continuous state-action space requires approximations
•  Local vs global learning, because of curse of dimensionality
•  Exploration penalties might be excessive

An in-depth exploration of these issues: Gerald Tesauro et al. On the Use of Hybrid Reinforcement Learning for Autonomic Resource Allocation. Cluster Computing, 10(3):287-299, 2007.
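For reference, one common reading of "vanilla" reinforcement learning is the textbook tabular Q-learning update, written below. This is an assumption made for illustration: the slide does not name an algorithm, and the cited paper may use a different formulation. Here $\alpha$ is the learning rate, $\gamma$ the discount factor, and $r_{t+1}$ the reward observed after taking action $a_t$ in state $s_t$.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
```

The classical convergence result for this update assumes a stationary Markov decision process, a finite table of state-action values, and every state-action pair being visited infinitely often. Each of the issues listed above breaks one of these assumptions.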

Transversal issues

•  Limits
    •  Biological self-* (awareness, healing…) may/will ultimately fail, plus unforeseen threats
•  Overheads
    •  Designing, programming, executing, provisioning
•  Validation
    •  Extreme events: revisit traditional criteria, e.g. RMSE
•  Benchmarking under uncertainty
    •  Availability of reference datasets

www.grid-observatory.org

C. Germain-Renaud et al. The Grid Observatory, to appear in IEEE/ACM CCGRID'11
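One way to read the validation point about extreme events is that a single outlier can dominate RMSE and hide how good the typical prediction is. The sketch below compares RMSE with the median absolute error on made-up queuing times; the numbers are invented for illustration (they are not Grid Observatory data), and the median absolute error is only one possible alternative criterion.

```python
import math
from statistics import median


def rmse(actual, predicted):
    """Root mean squared error: squaring lets large errors dominate."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def median_abs_error(actual, predicted):
    """Median absolute error: robust to a few extreme values."""
    return median(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical queuing times in seconds: five ordinary jobs, one stuck job.
actual = [60, 62, 58, 61, 59, 3600]
predicted = [60, 60, 60, 60, 60, 60]

print(round(rmse(actual, predicted), 1))
print(median_abs_error(actual, predicted))
```

Neither number alone is satisfactory here: RMSE reports almost nothing except the stuck job, while the median ignores it entirely, so the criterion has to be chosen according to whether extreme events are the thing being predicted.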
