Upload
mikaela-mennen
View
223
Download
0
Embed Size (px)
Citation preview
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 1/35
AppScale:
Open-sourcePlatform-as-a-Service (PaaS)
System for Energy-AwareCloud Computing Research
Chandra KrintzAssociate Professor
Computer Science Dept andInstitute for Energy Efficiency
UCSB
Nov. 18, 2009
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 2/35
Cloud Computing
• Software systems for accessing easily and transparentlyscalable CPU/storage/network resources via a networkconnection or web interface – “as-a-service”
• On a rental basis
SLAs
Web Services
Virtualization
-• Pay-per-use, flat rate – E-comerced based• Resources are opaque
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 3/35
Cloud Computing
• Software systems for accessing easily and transparentlyscalable CPU/storage/network resources via a networkconnection or web interface – “as-a-service” – Culmination of grid/cluster/utility/elastic computing
– Advances in processor, virtualization, systems technology
• On a rental basis – via service-level agreements (SLAS) – Pay-per-use, flat rate – E-comerced based
– Resources are opaque
• Have experienced a rapid uptake in the commercial sector – Public clouds – you run your systems/apps on others’ systems
• Access small fraction of vast resource pools
– Offer availability guarantees and extreme scale
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 4/35
Layers
• Software systems for accessing easily and transparentlyscalable CPU/storage/network resources via a networkconnection or web interface – “as-a-service”
– Infrastructure, e.g. Amazon Web Services (AWS)• Provision isolated resources under contract
• ull-s ste i a es de lo ed o er irtual achine onitor
IaaS
– Platform, e.g. Google AppEngine (GAE), Microsoft Azure• Enable construction of network-accessible applications
• Complete software stack; Process-level runtime isolation
• Specialized/scalable runtime and library support
– Software, e.g. Salesforce• Remotely accessible and customizable applications
PaaS
SaaS
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 5/35
Cloudy Issues
• Many institutions and companies own IT infrastructure
• Public cloud features also useful for “on-premise” clouds
– Privacy of code and data – Avoids vendor “lock-in”, pay-per-use, and availability reliance
– Potential for hybrid and customized approaches
–
o en a or eas ng resource cons ra n s• Storage/data management, cpu/memory, communication
– Potential for investigating new energy-aware dist. systems
•
Public clouds are opaque – open APIs, closed implementation – How can we research what’s next if we can’t see what’s now?
• Can cloud fabrics support other application domains,
services, performance/availability requirements?
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 6/35
An Opening in the Clouds
• Open-source cloud computing systems from theUCSB Computer Science Department – Goal: Bring popular cloud fabrics to “on-premise” clusters that
are easy to use and are transparent
– To facilitate investigation of• Ener -e icient cloud com utin
– Services, underlying device technology, support technologies – Customization (availability, performance, application behavior)
• Hybrid cloud solutions (public and on-premise)
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 7/35
An Opening in the Clouds
• Open-source cloud computing systems from theUCSB Computer Science Department – Goal: Bring popular cloud fabrics to “on-premise” clusters that
are easy to use and are transparent – To facilitate investigation of
• Energy-efficient cloud computing – erv ce , un er y ng ev ce ec no ogy, u or ec no og e
– Customization (availability, performance, application behavior)• Hybrid cloud solutions (public and on-premise)
– By emulating key cloud layers from the commercial sector• Engender user community, access to real applications/users• Leverage extant software technologies
– Not replacement technology for any Public Cloud service
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 8/35
UCSB Cloud Computing
• Software systems for accessing easily and transparentlyscalable CPU/storage/network resources via a networkconnection or web interface – “as-a-service”
– Infrastructure, e.g. Amazon Web Services (AWS)• Eucalyptus: www.eucalyptus.com Prof. Rich Wolski
IaaS
–
a orm, e.g. oog e pp ng ne , croso zure• Enable construction of network-accessible applications• Complete software stack; Process-level runtime isolation• Specialized/scalable runtime and library support
– AppScale: appscale.cs.ucsb.edu• Web services based implementation of Google App Engine
– Complete application stack for MVC-based web applications – Written in Python or Java
PaaS
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 9/35
From Google App Engine
to AppScale• GAE is a full application stack that facilitates construction
of interactive webpages with a database backing store – Users develop Python and Java apps using well defined APIs
• Execute, test, debug the apps locally using the Google open-source software development kit (SDK)
–
SDK implements simple, slow, non-scalable versions of APIs – Generates indices that the application requires
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 10/35
From Google App Engine
to AppScale• GAE is a full application stack that facilitates construction
of interactive webpages with a database backing store – Users develop Python and Java apps using well defined APIs
• Execute, test, debug the apps locally using the Google open-source software development kit (SDK)
–
SDK implements simple, slow, non-scalable versions of APIs – Generates indices that the application requires
• Upload application to Google: YOUR_APPNAME.appspot.com
– Sandboxed execution, language/behavior restrictions
– Resource quotas via free or for-fee – Highly scalable API implementations (proprietary)
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 11/35
From Google App Engine
to AppScale• GAE is a full application stack that facilitates construction
of interactive webpages with a database backing store – Users develop Python and Java apps using well defined APIs
• Highly scalable proprietary implementation on Google resources
Datastore -> Bi table/Ma reduce
MemCache -> in-memory datastoreAuthentication -> Google Accounts
Mail -> GMail
URL-Fetch (for HTTP/S communication)
Images
Task Queues for short background jobs
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 12/35
AppScale
• An Open-source platform-as-a-service (PaaS) system thatemulates Google App Engine (GAE) – Extension of GAE SDK with API implementations replacedAppLoadBalancer (ALB) Database Master/Peer (DB M/P)
AppServer (AS) Database Slave/Peer (DB S/P)
GAE App
Developer
(AppScale Admin)
GAE App
Users
AppScale
tools
HTTPS
App
Controller
ALB
DB M/P
DB S/P
ASGAE App
UsersGAE App
Users
AppScale Cloud
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 13/35
An AppScale Image
• Implements every AppScale component – Can instantiate as a particular role (ALB, AS, DB)
– Can change functionality and instantiate itself as another• AppLoadBalancer (ALB)
– Controller of the system (Ruby-on-Rails application)
–
Starts other instances and instantiates each with its role – Monitors other components once instantiated
– Restarts lost components
– Current work: ALB availability (shadow ALB(s))
– GAE developers contact ALB to manage cloud• Via the AppScale Tools
• Stop/start cloud, upload/start/terminate GAE apps
• Command line and web page administrator status w/ load stats
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 14/35
An AppScale Image
• Implements every AppScale component – Can instantiate as a particular role (ALB, AS, DB)
– Can change functionality and instantiate itself as another• AppLoadBalancer (ALB)
– Controller of the system (Ruby-on-Rails application)
–
Starts other instances and instantiates each with its role – Monitors other components once instantiated
– Restarts lost components
• Monitors system statistics – CPU, Memory, page requests, network, application behavior
– Grow and shrink AppScale cloud on-demand
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 15/35
An AppScale Image
• Implements every AppScale component – Can instantiate as a particular role (ALB, AS, DB)
– Can change functionality and instantiate itself as another• AppServer (AS)
– Web front-end for the GAE applications (GAE API compliant)
–
Users of GAE apps interact with ASs directly• After being routed to an AS instance by the ALB upon first
access to a GAE application (for load balancing purposes)
– Python GAE interface
– Java GAE interface
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 16/35
An AppScale Image
• Implements every AppScale component – Can instantiate as a particular role (ALB, AS, DB)
– Can change functionality and instantiate itself as another• AppServer (AS)
– Web front-end for the GAE applications (GAE API compliant)
–
Users of GAE apps interact with ASs directly• After being routed to an AS instance by the ALB upon first
access to a GAE application (for load balancing purposes)
– Python GAE interface
– Java GAE interface• Extensions (non-GAE compliant)
– MapReduce integration via an API
– GAE developers write their own mappers/reducers• AppScale schedules and load balances
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 17/35
An AppScale Image
• Implements every AppScale component – Can instantiate as a particular role (ALB, AS, DB)
– Can change functionality and instantiate itself as another• AppDBMaster (DBM) and AppDBSlaves (DBS)
– Scalable distributed database (Datastore API) implementation
•
Type and replication specified at AppScale start by admin – Bigtable like options (schema-free, key-value store)
• HBase (Java) and Hypertable (C++)
• MongoDB (Python)
– Peer-to-peer options• Cassandra (Java), Voldemort (Java)
– In-Memory, Distibuted – MemchacheDB (Python)
–
Relational options• MySQL (sharded, Master/Slave)
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 18/35
AppScale
• Component interoperation – Google protocol buffers for all app data communication
– Inter-language IPC via Thrift
• Deploys and executes automatically over different cloud
a rics an virtua ization ayers – Eucalyptus
– Amazon’s EC2
– Xen
– KVM
– Soon to come…• VMWare/VCloud, IBM’s RC2, greater EC2 scale
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 19/35
AppScale Potential
• Current status – Stable system, many interesting database options
– International user community, real applications• AppScale is used for commercial & private endeavors
• Significant visibility and attention from blogosphere & industry
–
Web services application domain• W/ extension that spawns multi-language Map-Reduce jobs
• Research directions:
– Measurement and characterization of energy consumption• Multiple components, languages, resource requirements
• Parallelization and concurrency, multicore utilization
• Virtualization and IaaS layers
• Database technologies
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 20/35
AppScale Research Roadmap
• Support for Java and additional DBs
• ASs and DBs grow and shrink according to load and failure – Resource monitoring & allocation (SLA support)
• Automatic and dynamic renegotiation, improved scaling
• Per ormance ener and availabilit monitorin – Capture full-system behavior via sampling
• For debugging, performance/energy feedback, optimization
• Administrator/Developer control of
Replication of data for fault tolerance Scaling triggersType and amount of system monitoring Sandbox restrictions
Parallelism/Concurrency Energy/Power management
• Alternative computation models, e.g. streaming
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 21/35
• Support for novel/emerging application domains – Computationally and data intensive -- streaming, HPC, hybrids
•
IaaS + PaaS integration AppScale + Eucalyptus – Resource allocation, specialization/customization
• Automatic re-negotiation of SLAs
• -
AppScale Research Directions
, ,
– Application-level feedback to/from physical power/air systems – Isolation/energy tradeoffs, e.g. exploit shared memory
Multi-core server
VirtualizationTechnology (SW+HW)
Linux + EucalyptusAppScale
Applications/libraries
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 22/35
Lessons• Open-source and management of a user community• Positives:
– Real applications, real systems
• If it works here, it really works – Immediate feedback – helps debugging, evaluating efficacy of
more “theoretical” and “research” contributions, what is,
– Extensive visibility – helps with recruting, funding, impact
– Really fun and challenging
• Negatives – Immediate feedback – can be quite demanding and time
consuming to respond to users effectively• Not responding equal to not putting something out as open source
• Visibility can also be a bad thing!
– Engineering oriented
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 23/35
Cloud Computing at UCSB
Open-source implementations of popular cloud systems
• Platform-as-a-service (PaaS) framework
• Web services based implementation of
Google AppEngine APIs
• Runs over Eucalyptus, Amazon EC2, and
• RACELab students:
Chris Bunch & Navraj Chohan,Nupur Garg, Matt Hubert,
Puneet Lakhina, Yiming Li,
• Implements multiple database backends(Hbase, Hypertable, Cassandra, MySQL,…)
• Real use, real users, real impact
• International user community
http://appscale.cs.ucsb.eduLead: Chandra Krintz
Thanks!
, ,
Michal Weigel Visiting Scholar:Yoshihide Nomura (Fujitsu)
• Supported by: Google,
National Science Foundation,IBM Research
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 24/35
Cloud Computing at UCSB
Open-source implementations of popular cloud systems
• Infrastructure-as-a-service (IaaS) framework
•
Web services based implementation of elastic/utility/cloud computing infrastructure
• Linux image hosting via virtualization
• Emulates the Amazon AWS interface –
’
• Platform-as-a-service (PaaS) framework
•
Web services based implementation of Google AppEngine APIs
• Runs over Eucalyptus, Amazon EC2, and
virtualization layers (Xen/KVM)
•
•Real use, real users, real impact
• Distributed with Ubuntu
• Large international user community
http://www.eucalyptus.com
Lead: Rich Wolski
(Hbase, Hypertable, Cassandra, MySQL,…)• Real use, real users, real impact
• International user community
http://appscale.cs.ucsb.edu
Lead: Chandra Krintz
Thanks!
• RACELab students: Chris Bunch, Jovan Chohan, Navraj Chohan,Nupur Garg, Matt Hubert, Puneet Lakhina, Yiming Li, GauravMehta, Nagy Mostafa, Soo Hwan Park, Michal WeigelVisiting Scholar: Yoshihide Nomura (Fujitsu)
• Supported by: Google, National Science Foundation,IBM Research, You!?
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 25/35
Cloudy Issues
• Many institutions and companies own IT infrastructure
• Public cloud features also useful for “on-premise” clouds – Privacy of code and data
– Avoids vendor “lock-in”, pay-per-use, and availability reliance
–
– Potential for easing resource constraints• Storage/data management, cpu/memory, communication
•
Public clouds are opaque – open APIs, closed implementation – What is your application doing?
• Can cloud fabrics support other application domains,
services, performance/availability requirements?
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 26/35
AppScale
• Component interoperation – Component handshakes via socket messages
–
Google protocol buffers for all data communication – Inter-language IPC via Thrift
•
Deploys and executes automatically over different cloudfabrics and virtualization layers – Eucalyptus
– Amazon’s EC2
– Xen
– KVM
– Currently laying the groundwork with VMWare for an
AppScale + VCloud/VSphere effort
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 27/35
AppScale
• An Open-source platform-as-a-service (PaaS) system thatemulates Google App Engine (GAE)
– Extension of GAE SDK with API implementations replacedAppLoadBalancer (ALB) Database Master/Peer (DB M/P)
AppServer (AS) Database Slave/Peer (DB S/P)
GAE App
Developer
(AppScale Admin)
GAE App
Users
AppScale
tools
HTTPS
App
Controller
ALB
DB M/P
DB S/P
ASGAE App
UsersGAE App
Users
AppScale Cloud
MapReduce
Tasks
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 28/35
AppScale
• Extension of GAE SDK with API implementations replacedDatastore -> HBase, Hypertable, Cassandra, Voldemort, MySQL,
MongoDB, MemCacheDB in progress: Oracle’s TimesTenMapReduce -> Hadoop MemCache -> Python & Java
Authentication -> built-in, decoupled from Google Accounts
GAE App
Developer
(AppScale Admin)
GAE App
Users
AppScale
tools
HTTPS
App
Controller
ALB
DB M/P
DB S/P
ASGAE App
UsersGAE App
Users
AppScale Cloud
MapReduce
Tasks
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 29/35
RACELab Research
• Cross-language shared memory (Michal Weigel) – Replace high overhead communication protocols transparently
for co-located runtimes• Multi-language profiling and optimization (Nagy Mostafa)
– Track changes in software revisions that impact performance
– ross- anguage spec a zat on o nterpreters
• Computationally intensive AppScale (Chris Bunch) – MapReduce, HPC libraries
– Scheduling and load balancing
• Alternative data-intensive computational models in the cloud(Navraj Chohan) – Cloud storage and persistance of data
– Streaming data management
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 30/35
AppScale
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 31/35
AppScale
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 32/35
UCSB Open-Source Cloud Computing
• Eucalyptus Lead: Rich Wolski, MAYHEM Lab
– Elastic Utility Computing Architecture Linking YourPrograms T o Useful S ystems
– Infrastructure-as-a-service (IaaS) framework – Web services based implementation of elastic/utility/cloud
computing infrastructure
–
Linux image hosting via virtualization – Emulates the Amazon AWS interfaces
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 33/35
• Interface is based on Amazon’s published WSDL – EC2/S3/EBS
–
EC2 “Availability” zones correspond to individual clusters – Uses either Amazon tools or Eucalyptus replacements
– Web services and REST interface
•
Does not assume that worker nodes have publicly routableIP addresses
– All cloud images have access to a private network interface – Multiple networking options including elastic IPs and security
groups• WS-security for authentication
• Web-based administration and accounting
E l
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 34/35
Eucalyptus
• Availability – Via popular Linux distributions: Ubuntu, Debian, CentOS, …
•
Uptake – 19,400 downloads, > 71 countries
– May 2009: Key technology in thep at orm
– Fundamental technology of theUbuntu Oct. 2009 distribution
• 10,000,000 potential downloads
• Commercial spinoff: Eucalyptus Systems, Inc – Support open source community www.eucalyptus.com
– Customized cloud services/components/scalability/SLAs
– Hybrid clouds - High availability (now available!)
A S l
8/14/2019 Chandra Krintz
http://slidepdf.com/reader/full/chandra-krintz 35/35
AppScale
• Open-source platform-as-a-service (PaaS) thatemulates Google App Engine (GAE)
• GAE is a full application stack that facilitates constructionof interactive webpages with a database backing store
– Sandbox / restrictionsPure Python or Java, no thread/subprocess spawning, system calls
No writes to file system, reads only to static files uploaded w/app
Storage using key-value, schema-free datastore (Bigtable-based)
HTTP/S communication only, CGI to handle page requests
Limit on number of datastore elements accessed per request
Limit on response duration, task frequency, request rate
Enforced quotas (BW, CPU, requests/s, files, app size, …)