Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Basics of Cloud Computing
MTAT.08.027 Basics of Cloud Computing (3 ECTS) MTAT.08.027 Basics of Cloud Computing (3 ECTS)
MTAT.08.011 Basics of Grid and Cloud Computing
Satish [email protected]
Course Purpose
• Introduce cloud computing concepts
• Introduce cloud providers
• Introduction to distributed computing
algorithms like MapReducealgorithms like MapReduce
• Glance of research at distributed systems
group in cloud computing domain
• http://courses.cs.ut.ee/2012/cloud/
4/4/2012 Satish Srirama 2/33
Questions
• Is everyone comfortable with data structures?
• How comfortable you are with algorithms?
• How comfortable you are with programming?
– Java ? – Java ?
• External APIs?
– Python – I assume you are
– Web programming
• Were you able to submit all exercises in Grid part?
4/4/2012 Satish Srirama 3/33
Outline
• Cloud computing
• Cloud providers
• SciCloud - http://ds.cs.ut.ee/research/scicloud
• MapReduce• MapReduce
• MapReduce in different domains
• Cloud platforms - Aneka
• Mobile Cloud
4/4/2012 Satish Srirama 4/33
Grading
• Written exam – 50%
• Labs – 20%
– 4 lab exercises
– Active participation in the lectures (Max 5%)– Active participation in the lectures (Max 5%)
• Project – 30%
– 1 man week task/per person
– Duration of 4 weeks
– Groups should deliver bigger tasks
– Advertised after the 3rd lecture
4/4/2012 Satish Srirama 5/33
Course schedule
• 04.04 Basics of Cloud computing
• 11.04 Cloud Providers
• 18.04 MapReduce
No lectureNo lecture
– Me and Pelle will be on a EU project meeting
• 02.05 Working with Hadoop
• 09.05 MapReduce Algorithms
• 16.05 MapReduce in Information Retrieval
• 23.05 Aneka and other PaaS
4/4/2012 Satish Srirama 6/33
Course schedule - continued
• Labs
05-10.04 Starting with a cloud
12-17.04 Working with SciCloud
19-24.04 MapReduce. Projects will be advertised.
• Remaining lab will be advertised in sometime• Remaining lab will be advertised in sometime
• Projects will have 2 meetings
– Intermediary presentation
– Final presentation
• Deliverables will be discussed when projects are advertised
• Dates will be announced later based on consensus
4/4/2012 Satish Srirama 7/33
Course schedule - continued
• Exam – 30th May 2012
• Second exam – 5th June 2012
• Students who want their results by end of
May should submit their projects by 29th MayMay should submit their projects by 29th May
• Other students can submit projects by 12th
June
• Examination for second attempt students 13th
June
4/4/2012 Satish Srirama 8/33
CLOUD COMPUTING
Lecture 1
4/4/2012 Satish Srirama 9
WHAT IS CLOUD COMPUTING?
“It’s nothing new”“...we’ve redefined Cloud Computing to include everything that we already do... I don’t understand what we would do differently ... other than change the wording of some of our ads.”
“It’s a trap”“It’s worse than stupidity: it’s marketing hype. Somebody is saying this is inevitable—and whenever you hear that, it’s very likely to be a set of businesses campaigning to make it true.”
WHAT IS CLOUD COMPUTING?
No consistent answer!
Everyone thinks it is something else…
the wording of some of our ads.”
Larry Ellison, CEO, Oracle (Wall
Street Journal, Sept. 26, 2008)
campaigning to make it true.”
Richard Stallman, Founder, Free Software Foundation (The Guardian, Sept. 29, 2008)
Slide taken from Professor Anthony D. Joseph’s lecture at RWTH Aachen4/4/2012 Satish Srirama 10
What is Cloud Computing?
• Computing as a utility– Utility services e.g. water, electricity, gas etc
– Consumers pay based on their usage
• Cloud Computing characteristics – Illusion of infinite resources– Illusion of infinite resources
– No up-front cost
– Fine-grained billing (e.g. hourly)
• Gartner: “Cloud computing is a style of computing where massively scalable IT-related capabilities are provided ‘as a service’ across the Internet to multiple external customers”
Satish Srirama4/4/2012 11/33
Timeline
4/4/2012 Satish Srirama 12/33
How Cloud & Grid are related…
• Share a lot commonality– Intention, architecture and technology
• Differences– Programming model, business model, compute
model, applications, and Virtualization.model, applications, and Virtualization.
• The problems are mostly the same– Manage large facilities;
– Define methods by which consumers discover, request and use resources provided by the central facilities;
– Implement the often highly parallel computations that execute on those resources.
4/4/2012 Satish Srirama 13/33
Virtualization
• Virtualization techniques are the basis of the cloud computing
• Virtualization technologies partition hardware and thus provide flexible and scalable computing platforms
• Virtual machine techniques– VMware and Xen OS
App App App
OS OS– VMware and Xen
– OpenNebula
– Amazon EC2
• Grid– do not rely on virtualization as much as Clouds do, each
individual organization maintain full control of their resources
• For cloud, virtualization is almost an indispensable ingredient
Hardware
OS
Hypervisor
OS OS
Virtualized Stack
4/4/2012 Satish Srirama 14/33
4/4/2012 Satish Srirama
Enabling Grids for E-sciencE - EGEE
15/33
Clouds - Why Now (not then)?
• Commoditization of HW– x86 as universal ISA, plus fast virtualization
– Bet: Can statistically multiplex multiple instances onto a single box without interference between instances
• Web 2.0– Standard software stack, largely open source (LAMP)– Standard software stack, largely open source (LAMP)
– Asynchronous JavaScript and XML (AJAX)
• Novel economic model: fine grain billing– Earlier examples: Sun, Intel Computing Services—longer commitment,
more $$$/hour
• Infrastructure software: e.g. Google FileSystem, HDFS
• Operational expertise: failover, DDoS, firewalls...
• More pervasive broadband Internet
4/4/2012 Satish Srirama
ISA - Instruction Set Architecture
LAMP – Linux, Apache Http Server, MySQL, PHP, perl or python
16/33
Cloud Computing - Services• Software as a Service – SaaS
– A way to access applications hosted on the web through your web browser
• Platform as a Service – PaaS– A pay-as-you-go model for IT
resources accessed over the
SaaS
Facebook, Flikr, Myspace.com,
Google maps API, Gmail
Level of
Abstraction
resources accessed over the Internet
• Infrastructure as a Service –IaaS– Use of commodity computers,
distributed across Internet, to perform parallel processing, distributed storage, indexing and mining of data
– VirtualizationSatish Srirama
PaaS
Google App Engine, Force.com, Hadoop, Azure,
Amazon S3, etc
IaaSAmazon EC2, SciCloud, Joyent Accelerators, Nirvanix Storage Delivery Network, etc.
4/4/2012 17/33
Cloud Computing - Themes
• Massively scalable
• On-demand & dynamic
• Only use what you need - Elastic – No upfront commitments, use on short term basis
• Accessible via Internet, location independent• Accessible via Internet, location independent
• Transparent – Complexity concealed from users, virtualized,
abstracted
• Service oriented – Easy to use SLAs
Satish Srirama
SLA – Service Level Agreement
4/4/2012 18/33
Cloud Models
• Internal (private) cloud– Cloud with in an organization
• Community cloud– Cloud infrastructure jointly
owned by several organizations
• Public cloud• Public cloud– Cloud infrastructure owned by
an organization, provided to general public as service
• Hybrid cloud– Composition of two or more
cloud models
Satish Srirama4/4/2012 19/33
Short Term Implications of Clouds
• Startups and prototyping
– Minimize infrastructure risk
– Lower cost of entry
• Batch jobs• Batch jobs
• One-off tasks
– Washington post, NY Times
• Cost associatively for scientific applications
• Research at scale
4/4/2012 Satish Srirama 20/33
Cloud Application Demand
• Many cloud applications have cyclical demand curves– Daily, weekly, monthly, …
Demand
Re
so
urc
es
• Workload spikes are more frequent and significant– When some event happens like a pop star has expired:
• More # tweets, Wikipedia traffic increases
• 22% of tweets, 20% of Wikipedia traffic when Michael Jackson expired in 2009 – Google thought they are under attack
Demand
Time
Re
so
urc
es
4/4/2012 Satish Srirama 21/33
Economics of Cloud Users
• Pay by use instead of provisioning for peak
Capacity
Re
so
urc
es
Re
so
urc
es
Unused resources
Static data center Data center in the cloud
Demand
Time
Re
so
urc
es
Demand
Capacity
TimeR
eso
urc
es
4/4/2012 Satish Srirama 22/33
Unused resources
Economics of Cloud Users - continued
• Risk of over-provisioning: underutilization
– Huge sunk cost in infrastructureCapacity
Static data center
Demand
Time
Re
so
urc
es
4/4/2012 Satish Srirama 23/33
Economics of Cloud Users - continued
• Heavy penalty for under-provisioning
Re
so
urc
es
Demand
Capacity
Re
so
urc
es
Lost revenue
Lost users
Re
so
urc
es
Demand
Capacity
Time (days)1 2 3
Demand
Time (days)1 2 3
Re
so
urc
es
Demand
Capacity
Time (days)1 2 3
4/4/2012 Satish Srirama 24/33
Economics of Cloud Providers
• Building a very large-scale datacenter is very expensive
– $100+ Million (Minimum)
• Large Internet Companies Already Building Huge DCs
– Google, Amazon, Microsoft…
• 5-7x economies of scale [Hamilton 2008]• 5-7x economies of scale [Hamilton 2008]
ResourceCost in
Medium DC
Cost in
Very Large DCRatio
Network $95 / Mbps / month $13 / Mbps / month 7.3x
Storage $2.20 / GB / month $0.40 / GB / month 5.5x
Administration ≈140 servers/admin >1000 servers/admin 7.1x
4/4/2012 Satish Srirama 25/33
Economics of Cloud Providers -
continued• Power
Price per
KWH
Where Possible Reasons Why
3.6¢ Idaho Hydroelectric power; not sent long distance
10.0¢ California Electricity transmitted long distance over the
grid; limited transmission lines in Bay Area; no
• Cooling is also expensive– Build data centers near rivers
• Extra benefits– Amazon: utilize off-peak capacity
– Microsoft: sell .NET tools
– Google: reuse existing infrastructure
4/4/2012 Satish Srirama
grid; limited transmission lines in Bay Area; no
coal fired electricity allowed in California.
18.0¢ Hawaii Must ship fuel to generate electricity
26/33
Economics of Cloud Providers - Failures
• Cloud Computing providers bring a shift from high reliability/availability servers to commodity servers
– At least one failure per day in large datacenter
• Why?
– Significant economic incentives – much lower per-server – Significant economic incentives – much lower per-server cost
• Caveat: User software has to adapt to failures
– Very hard problem!
• Solution: Replicate data and computation
– MapReduce & Distributed File System (Will discuss later in Lecture 3)
4/4/2012 Satish Srirama 27/33
Adoption Challenges
Challenge Opportunity
Availability Multiple providers & Use elasticity to prevent DDoSattacks
Data lock-in StandardizationData lock-in Standardization
Data Confidentiality and Auditability
Encryption, VLANs, Firewalls; Geographical Data Storage
4/4/2012 Satish Srirama 28/33
Growth Challenges
Challenge Opportunity
Data transfer bottlenecks
FedEx-ing disks, Data Backup/Archival
Performance unpredictability
Improved VM support, flash memory, scheduling VMsunpredictability memory, scheduling VMs
Scalable storage Invent scalable store
Bugs in large distributed systems
Invent Debugger that relies on Distributed VMs
Scaling quickly Invent Auto-Scaler; Snapshots for conservation
4/4/2012 Satish Srirama 29/33
Policy and Business Challenges
Challenge Opportunity
Reputation Fate Sharing Offer reputation-guarding services like those for email
Software Licensing Pay-for-use licenses; Bulk use salesuse sales
4/4/2012 Satish Srirama 30/33
Long Term Implications of clouds
• Application software:
– Cloud & client parts, disconnection tolerance
• Infrastructure software:
– Resource accounting, VM awareness– Resource accounting, VM awareness
• Hardware systems:
– Containers, energy proportionality
4/4/2012 Satish Srirama 31/33
Next lecture
• Cloud providers
– Amazon EC2, S3, EBS
• Eucalyptus
• SciCloud• SciCloud
4/4/2012 Satish Srirama 32/33
References
• Several of the slides are taken from Prof. Anthony D. Joseph’s lecture at RWTH Aachen (March 2010)
• Papers to read
M. Armbrust et al., “Above the Clouds, A Berkeley – M. Armbrust et al., “Above the Clouds, A Berkeley View of Cloud Computing”, Technical Report, University of California, Feb, 2009.
• The Cloud: Battle of the Tech Titans
– Cover story in Businessweek– http://www.businessweek.com/magazine/content/11_11/b4219052599182.htm
4/4/2012 Satish Srirama 33/33