Copyright © 2012, Elsevier Inc. All rights reserved.
Distributed and Cloud Computing
Enabling Technologies and Distributed System Models
• Centralized computing
• Parallel computing
• Distributed computing
• Concurrent computing/programming
Centralized computing
• Computer resources (processors, memory, and storage) are fully shared and tightly coupled within one integrated OS.
• All computer resources are centralized in one physical system.
• Scalability and cost issues arise under heavy loads of user requests at global scale.
Parallel computing
• All processors are either tightly coupled with centralized shared memory or loosely coupled with distributed memory.
• Also called parallel processing.
• Interprocessor communication is accomplished through shared memory or via message passing.
• A computer system capable of parallel computing is commonly known as a parallel computer.
• Programs running on a parallel computer are called parallel programs.
Distributed computing
• A field of computer science/engineering that studies distributed systems.
• A distributed system consists of multiple autonomous computers, each having its own private memory, communicating through a computer network.
• A distributed system is a collection of independent entities that cooperate to solve a problem that cannot be solved individually.
• A collection of independent computers that appears to the users of the system as a single coherent computer.
Concurrent computing/programming
• Refers to the union of parallel computing and distributed computing, though biased practitioners may interpret the terms differently.
• Ubiquitous computing refers to computing with pervasive devices at any place and time using wired or wireless communication.
• The Internet of Things (IoT) is a networked connection of everyday objects including computers, sensors, humans, and more.
• The IoT is supported by Internet clouds to achieve ubiquitous computing with any object at any place and time.
• The term Internet computing is even broader and covers all computing paradigms over the Internet.
Data Deluge Enabling New Challenges
(Courtesy of Judy Qiu, Indiana University, 2011)
From Desktop/HPC/Grids to Internet Clouds in 30 Years
• HPC has moved from centralized supercomputers to geographically distributed desktops, clusters, and grids, and on to clouds, over the last 30 years.
• R&D efforts on HPC, clusters, grids, P2P, and virtual machines have laid the foundation of cloud computing, which has been greatly advocated since 2007.
• Computing infrastructure is located in areas with lower costs in hardware, software, datasets, space, and power – a move from desktop computing to datacenter-based clouds.
Interactions among 4 technical challenges: Data Deluge, Cloud Technology, eScience, and Multicore/Parallel Computing
(Courtesy of Judy Qiu, Indiana University, 2011)
Data Deluge
Refers to the situation where the sheer volume of new data being generated overwhelms the capacity of institutions to manage it and of researchers to make use of it.
eScience
E-Science or eScience is computationally intensive science that is carried out in highly distributed network environments, or science that uses immense data sets that require grid computing; the term sometimes includes technologies that enable distributed collaboration, such as the Access Grid.
Video
Clouds and Internet of Things
HPC: High-Performance Computing
HTC: High-Throughput Computing
P2P: Peer to Peer
MPP: Massively Parallel Processors
Source: K. Hwang, G. Fox, and J. Dongarra, Distributed and Cloud Computing, Morgan Kaufmann, 2012.
Computing Paradigm Distinctions
Centralized Computing
• All computer resources are centralized in one physical system.
Parallel Computing
• All processors are either tightly coupled with centralized shared memory or loosely coupled with distributed memory.
Distributed Computing
• A field of CS/CE that studies distributed systems. A distributed system consists of multiple autonomous computers, each with its own private memory, communicating over a network.
Cloud Computing
• An Internet cloud of resources that may be either centralized or decentralized. The cloud applies to parallel or distributed computing, or both. Clouds may be built from physical or virtualized resources.
Technology Convergence toward HPC for Science and HTC for Business: Utility Computing
(Courtesy of Raj Buyya, University of Melbourne, 2011)
2011 Gartner “IT Hype Cycle” for Emerging Technologies
(Figure compares the hype cycles from 2007 through 2011.)
33-Year Improvement in Processor and Network Technologies
Technologies for Network-based Systems
Multi-threading Processors
Four-issue superscalar (e.g., Sun UltraSPARC I)
• Implements instruction-level parallelism (ILP) within a single processor.
• Executes more than one instruction during a clock cycle by sending multiple instructions to redundant functional units.
Fine-grain multithreaded processor
• Switches threads after each cycle.
• Interleaves instruction execution.
• If one thread stalls, others are executed.
Coarse-grain multithreaded processor
• Executes a single thread until it reaches certain situations.
Simultaneous multithreaded processor (SMT)
• Instructions from more than one thread can execute in any given pipeline stage at a time.
5 Micro-architectures of CPUs
Each row represents the issue slots for a single execution cycle:
• A filled box indicates that the processor found an instruction to execute in that issue slot on that cycle.
• An empty box denotes an unused slot.
33-Year Improvement in Memory and Disk Technologies
Architecture of a Many-Core Multiprocessor GPU Interacting with a CPU Processor
Interconnection Networks
• SAN (storage area network) – connects servers with disk arrays
• LAN (local area network) – connects clients, hosts, and servers
• NAS (network attached storage) – connects clients with large storage systems
Datacenter and Server Cost Distribution
Virtual Machines
• Eliminate real-machine constraints, increasing portability and flexibility.
• A virtual machine adds software to a physical machine to give it the appearance of a different platform or of multiple platforms.
Benefits
• Cross-platform compatibility
• Increased security
• Enhanced performance
• Simplified software migration
Initial Hardware Model
All applications access hardware resources (i.e., memory, I/O) through system calls to the operating system (privileged instructions).
Advantages
• Design is decoupled (i.e., OS developers can work separately from the hardware developers).
• Hardware and software can be upgraded without notifying the application programs.
Disadvantages
• An application compiled on one ISA will not run on another ISA.
• Applications compiled for Mac use different operating system calls than applications designed for Windows.
• ISAs must support old software.
• Can often be inhibiting in terms of performance: since software is developed separately from hardware, software is not necessarily optimized for the hardware.
Virtual Machine Basics
• Virtualization software is placed between the underlying machine and conventional software; conventional software sees a different ISA from the one supported by the hardware.
• The virtualization process involves:
  – Mapping virtual resources (registers and memory) to real hardware resources.
  – Using real machine instructions to carry out the actions specified by the virtual machine instructions.
Three VM Architectures
System Models for Distributed and Cloud Computing
A Typical Cluster Architecture
Computational or Data Grid
A Typical Computational Grid
Peer-to-Peer (P2P) Network
• A distributed system architecture: each computer in the network can act as a client or server for other network computers.
• No centralized control.
• Typically many nodes, but unreliable and heterogeneous.
• Nodes are symmetric in function.
• Takes advantage of distributed, shared resources (bandwidth, CPU, storage) on peer nodes.
• Fault-tolerant and self-organizing.
• Operates in a dynamic environment where frequent joins and leaves are the norm.
Peer-to-Peer (P2P) Network
Overlay network – a computer network built on top of another network.
• Nodes in the overlay can be thought of as being connected by virtual or logical links, each of which corresponds to a path, perhaps through many physical links, in the underlying network.
• For example, distributed systems such as cloud computing, peer-to-peer networks, and client-server applications are overlay networks because their nodes run on top of the Internet.
The Cloud
• Historical roots in today’s Internet apps: search, email, social networks; file storage (Live Mesh, MobileMe, Flickr, …)
• A cloud infrastructure provides a framework to manage scalable, reliable, on-demand access to applications.
• A cloud is the “invisible” backend to many of our mobile applications.
• A model of computation and data storage based on “pay as you go” access to “unlimited” remote data center capabilities.
Basic Concept of Internet Clouds
• Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over a network (typically the Internet).
• The name comes from the use of a cloud-shaped symbol as an abstraction for the complex infrastructure it contains in system diagrams.
• Cloud computing entrusts remote services with a user’s data, software, and computation.
Service-oriented architecture (SOA)
• SOA is an evolution of distributed computing based on the request/reply design paradigm for synchronous and asynchronous applications.
• An application’s business logic or individual functions are modularized and presented as services for consumer/client applications.
• Key to these services is their loosely coupled nature; i.e., the service interface is independent of the implementation.
• Application developers or system integrators can build applications by composing one or more services without knowing the services’ underlying implementations. For example, a service can be implemented either in .NET or J2EE, and the application consuming the service can be on a different platform or language.
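To ground the loose-coupling point, here is a minimal sketch of a service using only the Python standard library (the endpoint behavior, port 8000, and the JSON field names a and b are hypothetical choices for illustration, not anything from the slides). The consumer depends only on the HTTP + JSON interface and never learns whether the implementation behind it is Python, .NET, or J2EE.

```python
# A toy "service" exposing one function over HTTP + JSON, stdlib only.
# The interface (POST a JSON body, get JSON back) is all a consumer
# ever depends on; the implementation behind it can change freely.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class AddService(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        request = json.loads(self.rfile.read(length))
        reply = json.dumps({"sum": request["a"] + request["b"]}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(reply)

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), AddService).serve_forever()
```

Any client on any platform can consume it, e.g. curl -s -X POST localhost:8000 -d '{"a": 2, "b": 3}' returns {"sum": 5}.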
Cloud Computing Challenges: Dealing with too many issues (Courtesy of R. Buyya)
• Virtualization
• QoS and Service Level Agreements
• Resource Metering, Billing, and Pricing
• Provisioning on Demand
• Utility & Risk Management
• Scalability
• Reliability
• Energy Efficiency
• Security, Privacy, and Trust
• Legal & Regulatory
• Software Eng. Complexity
• Programming Env. & Application Dev.
The Internet of Things (IoT)
(Figure: the Internet and Internet clouds supporting the Internet of Things, leading toward Smart Earth – an IBM dream.)
Video
Opportunities of IoT in 3 Dimensions
(courtesy of Wikipedia, 2010)
Transparent Cloud Computing Environment
Separates user data, application, OS, and space – good for cloud computing.
Parallel and Distributed Programming
Message-Passing Interface (MPI)
• The primary programming standard used to develop parallel and concurrent programs that run on a distributed system.
• MPI is essentially a library of subprograms that can be called from C or FORTRAN to write parallel programs running on a distributed system.
• The idea is to embody clusters, grid systems, and P2P systems with upgraded web services and utility computing applications.
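As a concrete sketch of message passing, the fragment below uses the mpi4py Python binding (an assumption for brevity; the slide names C and FORTRAN, where the corresponding calls are MPI_Comm_rank, MPI_Comm_size, and MPI_Reduce). Each process sums its own slice of 0..99 and rank 0 collects the total:

```python
# Minimal MPI sketch: each process computes a partial sum over its own
# stride of 0..99, then the partial sums are reduced onto rank 0.
from mpi4py import MPI

comm = MPI.COMM_WORLD          # default communicator: all processes
rank = comm.Get_rank()         # this process's id, 0..size-1
size = comm.Get_size()         # total number of processes

local = sum(range(rank, 100, size))             # this process's share
total = comm.reduce(local, op=MPI.SUM, root=0)  # message passing

if rank == 0:
    print(f"sum over {size} processes = {total}")   # expect 4950
```

Launched with, say, mpiexec -n 4 python partial_sum.py, all four processes run the same program and interact only through MPI calls.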
MapReduce
• A web programming model for scalable data processing on large clusters over large data sets.
• The model is applied mainly in web-scale search and cloud computing applications.
• The master node specifies a Map function to divide the input into subproblems.
• A Reduce function is then applied to merge all intermediate values with the same intermediate key.
• Highly scalable to explore high degrees of parallelism at different job levels.
• A typical MapReduce computation process can handle terabytes of data on tens of thousands or more client machines.
• Thousands of MapReduce jobs are executed on Google’s clusters every day.
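The toy, single-machine sketch below (written for this note, not Google's implementation) walks the word-count pattern through the phases the model implies: Map emits (key, value) pairs, a shuffle groups values by key, and Reduce merges all values sharing an intermediate key.

```python
# Toy MapReduce word count on one machine, to illustrate the model.
from collections import defaultdict

def map_phase(document):
    # Map: split the input into sub-problems, one (word, 1) pair per word.
    for word in document.split():
        yield word.lower(), 1

def reduce_phase(word, counts):
    # Reduce: merge all intermediate values with the same key.
    return word, sum(counts)

documents = ["the cloud scales", "the grid and the cloud"]

groups = defaultdict(list)          # shuffle: group values by key
for doc in documents:
    for word, count in map_phase(doc):
        groups[word].append(count)

results = [reduce_phase(w, c) for w, c in groups.items()]
print(sorted(results))  # [('and', 1), ('cloud', 2), ('grid', 1), ('scales', 1), ('the', 3)]
```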
Hadoop Library
• Offers a software platform originally developed by a Yahoo! group.
• The package enables users to write and run applications over vast amounts of distributed data.
• Users can easily scale Hadoop to store and process petabytes of data in the web space.
• Economical: comes with an open source version of MapReduce that minimizes overhead in task spawning and massive data communication.
• Efficient: processes data with a high degree of parallelism across a large number of commodity nodes.
• Reliable: automatically keeps multiple data copies to facilitate redeployment of computing tasks upon unexpected system failures.
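For flavor, here is a hedged sketch of the same word count written for Hadoop Streaming, which lets mappers and reducers be plain scripts that read stdin and write stdout; exact job-submission flags and jar paths vary by Hadoop distribution.

```python
# mapper.py -- emits "word<TAB>1" for every word on stdin.
import sys

for line in sys.stdin:
    for word in line.split():
        print(f"{word.lower()}\t1")
```

```python
# reducer.py -- Hadoop Streaming sorts mapper output by key, so equal
# words arrive on consecutive lines and can be summed in one pass.
import sys

current, total = None, 0
for line in sys.stdin:
    line = line.rstrip("\n")
    if not line:
        continue
    word, count = line.rsplit("\t", 1)
    if word != current:
        if current is not None:
            print(f"{current}\t{total}")
        current, total = word, 0
    total += int(count)
if current is not None:
    print(f"{current}\t{total}")
```

The pair can be dry-run locally with cat input.txt | python mapper.py | sort | python reducer.py, which mimics the map/sort/reduce pipeline.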
Video
Dimensions of Scalability
• Size – increasing performance by increasing machine size
• Software – upgrades to the OS, libraries, and new applications
• Application – matching problem size with machine size
• Technology – adapting the system to new technologies
Performance metrics
• CPU speed: million instructions per second (MIPS)
• Network bandwidth: megabits per second (Mbps)
• Distributed system throughput: tera floating-point operations per second (Tflops), transactions per second (TPS)
• Job response time and network latency: a low-latency, high-bandwidth interconnection is preferred
• System overhead is often attributed to OS boot time, compile time, I/O data rate, and the runtime support system
• QoS for Internet and web services; system availability and dependability; security
Amdahl’s Law
• Suppose a uniprocessor workstation executes a given program in time T minutes.
• Now the same program is partitioned for parallel execution on a cluster of many nodes.
• We assume that a fraction α of the code must be executed sequentially; therefore, (1 − α) of the code can be compiled for parallel execution by n processors.
• The total execution time of the program is then
  αT + (1 − α)T/n
  where the first term is the sequential execution time on a single processor and the second term is the parallel execution time on n processing nodes.
Amdahl’s Law: Speedup factor
• The speedup factor of using the n-processor system over a single processor is
  S = T(old)/T(new) = T/[αT + (1 − α)T/n] = 1/[α + (1 − α)/n]
• The maximum speedup of n is achieved only if the code is fully parallelizable, i.e., α = 0.
• As the cluster becomes sufficiently large (n → ∞), S approaches 1/α, an upper bound on the speedup S.
• The sequential bottleneck is the portion of the code that cannot be parallelized.
• If α = 0.25, then 1 − α = 0.75 and the maximum speedup is 4.
Fixed workload system efficiency
• Amdahl’s law assumes the same amount of workload for both sequential and parallel execution of the program, with a fixed problem size or data set (fixed-workload speedup).
• Executing a fixed workload on n processors leads to a system efficiency of
  E = S/n = 1/[αn + 1 − α]
• Efficiency is low for large cluster sizes: for α = 0.25 and n = 256, E ≈ 1.5%.
• Low usage of resources: the majority of the processors remain idle. (A numerical sketch of both the speedup and the efficiency follows.)
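The short Python sketch below (illustrative, not from the book) evaluates both formulas and reproduces the numbers quoted above:

```python
# Amdahl's fixed-workload speedup and efficiency:
#   S = 1 / (alpha + (1 - alpha)/n)     E = S/n = 1 / (alpha*n + 1 - alpha)
def amdahl_speedup(alpha: float, n: int) -> float:
    return 1.0 / (alpha + (1.0 - alpha) / n)

def amdahl_efficiency(alpha: float, n: int) -> float:
    return amdahl_speedup(alpha, n) / n

alpha = 0.25                    # 25% of the code is inherently sequential
for n in (4, 16, 256, 10**6):
    print(f"n={n:>7}  S={amdahl_speedup(alpha, n):6.3f}  "
          f"E={amdahl_efficiency(alpha, n):8.4%}")
# n=    256  S= 3.954  E= 1.5444%   <- the slide's ~1.5%
# As n -> infinity, S approaches 1/alpha = 4 while E collapses toward 0.
```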
Gustafson’s Law
• Scales the problem size to match the cluster capability (scaled-workload speedup).
• Let W be the workload in a given program. When using an n-processor system, the user scales the workload to
  W′ = αW + (1 − α)nW
• Parallel execution of the scaled workload W′ on n processors gives the scaled-workload speedup
  S′ = W′(new)/W(old) = [αW + (1 − α)nW]/W = α + (1 − α)n
• Thus the efficiency is
  E′ = S′/n = α/n + (1 − α)
• For α = 0.25 and n = 256, E′ ≈ 75%.
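A companion sketch (again illustrative Python) shows how scaling the workload rescues efficiency where Amdahl’s fixed workload could not:

```python
# Gustafson's scaled-workload speedup: the problem grows with the cluster,
#   W' = alpha*W + (1 - alpha)*n*W   =>   S' = alpha + (1 - alpha)*n
def gustafson_speedup(alpha: float, n: int) -> float:
    return alpha + (1.0 - alpha) * n

def gustafson_efficiency(alpha: float, n: int) -> float:
    return gustafson_speedup(alpha, n) / n    # = alpha/n + (1 - alpha)

alpha, n = 0.25, 256
print(gustafson_speedup(alpha, n))      # 192.25 (vs. Amdahl's ~3.95)
print(gustafson_efficiency(alpha, n))   # ~0.751, i.e. 75% vs. Amdahl's 1.5%
```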
A driving metaphor
Amdahl’s Law approximately suggests: suppose a car is traveling between two cities 60 miles apart, and has already spent one hour traveling half the distance at 30 mph. No matter how fast you drive the last half, it is impossible to achieve a 90 mph average before reaching the second city: since it has already taken you 1 hour and you only have a distance of 60 miles total, even going infinitely fast you would only achieve 60 mph.
A driving metaphor
Gustafson’s Law approximately states: suppose a car has already been traveling for some time at less than 90 mph. Given enough time and distance to travel, the car’s average speed can always eventually reach 90 mph, no matter how long or how slowly it has already traveled. For example, if the car spent one hour at 30 mph, it could achieve this by driving at 120 mph for two additional hours, or at 150 mph for one hour, and so on. (A quick arithmetic check of both metaphors follows.)
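The arithmetic behind both metaphors fits in a few lines (illustrative only):

```python
# Amdahl (fixed distance): 60 miles total, first 30 miles already took 1 hour.
# Even at infinite speed for the second half, average speed <= 60 mi / 1 h.
print(60 / 1.0)                    # 60.0 mph is the best possible average

# Gustafson (distance may grow): 1 hour at 30 mph, then 2 hours at 120 mph
# -> (30 + 240) miles over 3 hours = 90 mph average, as claimed.
print((30 + 120 * 2) / (1 + 2))    # 90.0
print((30 + 150 * 1) / (1 + 1))    # 90.0 (or one hour at 150 mph)
```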