Lecture - 1

“Moving Ahead” - from Clusters and Grids to Cloud computing

Salman Toor salman.toor@it.uu.se

Basic questions

• Why Cloud computing?

• What are the previous technologies?

• What was missing in the previous technologies?

• Will previous technologies be substituted?

• Can legacy applications run on Cloud platforms?

Were supercomputers the only source of large scale computing before Clouds?

ANSWER: NO

Distributed Computing Infrastructures (DCI)

• Cluster Computing
  – Accessible via Local Area Network (LAN)

• Grid Computing
  – Based on Wide Area Network (WAN)

• Cloud Computing
  – Next generation computing model

• Desktop Computing
• Utility Computing
• P2P Computing
• Pervasive Computing
• Ubiquitous Computing
• Mobile Computing

Contribution of large scale computing

• Areas in which large-scale computing plays an essential role:
  – Particle Physics
  – Bioinformatics
  – Computational Mathematics
  – Quantum Chemistry
  – …

Computing model

• Most large-scale applications, from both academia and industry, were designed for batch processing

• Batch Processing: a complete group of instructions, together with the required input data, submitted as one unit to accomplish a given task (often known as a job). No user interaction is possible during execution.
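The batch model can be illustrated with a minimal Python sketch. The `Job` and `BatchQueue` names are hypothetical and do not correspond to any real batch system; the only point is that instructions and input data are fixed at submission time and the job then runs to completion without user interaction.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List, Tuple


@dataclass
class Job:
    """A batch job: the instructions plus all the input data they need."""
    name: str
    instructions: Callable[[list], object]  # what to compute
    input_data: list                        # fixed at submission time


class BatchQueue:
    """Hypothetical first-in, first-out batch system (illustration only)."""

    def __init__(self) -> None:
        self._pending: Deque[Job] = deque()

    def submit(self, job: Job) -> None:
        # Submission only enqueues the job; nothing runs yet.
        self._pending.append(job)

    def run_all(self) -> List[Tuple[str, object]]:
        # Jobs run to completion in submission order, with no user input.
        results = []
        while self._pending:
            job = self._pending.popleft()
            results.append((job.name, job.instructions(job.input_data)))
        return results


queue = BatchQueue()
queue.submit(Job("sum-energies", sum, [1.5, 2.5, 4.0]))
queue.submit(Job("longest-read", max, [120, 75, 310]))
results = queue.run_all()
print(results)
```

Real batch systems add scheduling policies, resource requests and accounting on top of this basic submit-then-run pattern.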

Cluster computing

http://www.wikid.eu/index.php/Computer_Clustering

• A cluster is a type of parallel or distributed computer system, which consists of a collection of interconnected stand-alone computers working together as a single integrated computing resource

• First realised in the 1960s, but gained real momentum in the mid-1980s

• The aim is to move away from specialised supercomputing platforms and build a more general-purpose computing environment based on commodity hardware

http://www.cloudbus.org/papers/ic_cluster.pdf

• The concept of building computing clusters materialised with the tremendous growth in computer hardware

• In a typical scenario, cluster nodes (worker/slave/compute nodes) are dedicated resources with no external peripherals attached

• Specifically designed for batch processing

• Cluster Types:
  – Supercomputing clusters
  – Commodity hardware based clusters

• Well-known software for cluster computing:
  – HTCondor
  – Portable Batch System (PBS)
  – Load Sharing Facility (LSF)
  – Simple Linux Utility for Resource Management (SLURM)
  – Rocks
  – …

Cluster computing Advantages

• Uniform access to available resources
• Load balancing
• Various job scheduling techniques
• Cluster management tools
• User interfaces
  – single job submission
  – complex workflow management
• Fundamental level of security (in typical cases)
• Production-quality software is available

Cluster computing Disadvantages

• Applications need to adapt to the way the underlying infrastructure is designed
• Cluster software packages are not coherent with one another
• Steep learning curve
• Less secure (improved significantly over the years)
• Tightly coupled with the underlying resources
• Difficult to port new applications
• Applications need to stick with the available tools and libraries
• Non-standard interfaces

Cluster computing Current status

• Cluster computing is one of the most established ways of accessing a limited number of interconnected computational resources

• For example, hundreds of organisations in industry, government, and academia have used HTCondor

• Extensions such as the Directed Acyclic Graph Manager (DAGMan) in HTCondor are still in use to define complex workflows

https://research.cs.wisc.edu/htcondor/description.html
https://research.cs.wisc.edu/htcondor/dagman/dagman.html
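The idea behind a DAG-based workflow can be sketched in a few lines of Python. This is not DAGMan's input format or implementation; the job names are hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for the workflow manager to show how dependencies determine a valid execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs it depends on.
workflow = {
    "prepare": set(),
    "simulate_a": {"prepare"},
    "simulate_b": {"prepare"},
    "merge": {"simulate_a", "simulate_b"},
}

# static_order() yields every job only after all of its dependencies.
order = list(TopologicalSorter(workflow).static_order())
print(order)
```

The two `simulate_*` jobs have no dependency on each other, so a workflow manager would be free to run them in parallel once `prepare` has finished.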

Cluster computing Shortfalls

• Uniform access to a large number of resources
• A system that can handle complex and large workloads

• Possible next steps
  – Explore ways to find more resources
  – Uniform access to distributed computational resources
  – A bigger system for batch processing

Grid computing

• Definition - 1: (Computational Grid)

Grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed autonomous resources dynamically at runtime depending on their availability, capability, performance, cost, and users' quality-of-service requirements.

http://www.gridcomputing.com/gridfaq.html

• Definition - 2: (Computational Power Grids)

The computational power grid is analogous to the electric power grid: it allows geographically distributed resources to be coupled and offers consistent and inexpensive access to resources irrespective of their physical location or access point.

http://toolkit.globus.org/alliance/publications/papers/chapter2.pdf

The Anatomy of the Grid: Enabling Scalable Virtual Organizations
The Grid 2: Blueprint for a New Computing Infrastructure

Grid computing Vision

Grid computing Actual picture

http://kekcc.kek.jp/service/cc/uguide_en/10_1.system_tokutyou.html

Grid computing System components

• Application execution tools
• Multi-level scheduling
• Resource discovery
• Reliability
• Quality of Service (QoS)
• Resource allocation
• Metadata management
• Information extraction
• Runtime environments
• Security
• Data management
• Interoperability
• Virtual Organisation Management System (VOMS)
• …

Grid computing Virtual Organisation Management System (VOMS)

• Virtual Organisation

An abstract entity grouping Users, Institutions and Resources in the same administrative domain.

• Virtual Organisation Management System

VOMS is a system for managing authorisation data within multi-institutional collaborations. VOMS provides a database of user roles and capabilities and a set of tools for accessing and manipulating the database and using the database contents to generate Grid credentials for users when needed.

Article: From gridmap-file to VOMS: managing authorization in a Grid environment
http://toolkit.globus.org/grid_software/security/voms.php
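The role-database idea can be pictured with a toy Python sketch. Nothing here is the VOMS API or its credential format: `ROLE_DB`, `Credential`, `issue_credential` and `is_authorised` are hypothetical names, and the user and VO names are invented. The sketch only shows the flow from a database of roles to a credential that a resource can check.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

# Hypothetical authorisation database: (user, virtual organisation) -> roles.
ROLE_DB: Dict[Tuple[str, str], FrozenSet[str]] = {
    ("alice", "physics-vo"): frozenset({"analyst", "production"}),
    ("bob", "physics-vo"): frozenset({"analyst"}),
}


@dataclass(frozen=True)
class Credential:
    """Toy stand-in for a Grid credential carrying authorisation attributes."""
    user: str
    vo: str
    roles: FrozenSet[str]


def issue_credential(user: str, vo: str) -> Credential:
    """Look up the user's roles in the VO and embed them in a credential."""
    try:
        roles = ROLE_DB[(user, vo)]
    except KeyError:
        raise PermissionError(f"{user} is not a member of {vo}") from None
    return Credential(user=user, vo=vo, roles=roles)


def is_authorised(cred: Credential, required_role: str) -> bool:
    """A resource checks the attributes in the credential, not the database."""
    return required_role in cred.roles


cred = issue_credential("bob", "physics-vo")
print(is_authorised(cred, "analyst"), is_authorised(cred, "production"))
```

In a real deployment the credential would be cryptographically signed and time-limited; both aspects are left out of this sketch.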

Large Hadron Collider Computing Grid (LCG)

http://www.isgtw.org/feature/isgtw-feature-mega-grid-mega-science

Grid Computing Basic Workflow

[Figure: job workflow in gLite middleware. Components: UI (JDL), Resource Broker, Job Submission Service, Information Service, Computing Element, Storage Element. Labelled flows: voms-proxy-init; Job Submit Event; Input "sandbox"; Input "sandbox" + Broker Info; Expanded JDL; Globus RSL; Job Query; Job Status; Output "sandbox"; Publish; SE & CE info; DataSets info.]

Job workflow in gLite middleware: http://slideplayer.com/slide/2801198/
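One step of the workflow above, the resource broker choosing a computing element from what sites publish to the information service, can be sketched as follows. This is an illustration under stated assumptions, not gLite code: the class names, the fields (`free_slots`, `max_memory_gb`) and the "most free slots" policy are all hypothetical, and `JobDescription` is not real JDL.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ComputingElement:
    """What a site might publish to the information service (hypothetical)."""
    name: str
    free_slots: int
    max_memory_gb: int


@dataclass
class JobDescription:
    """Stand-in for the job description the user submits (not real JDL)."""
    executable: str
    memory_gb: int


def match(job: JobDescription,
          published: List[ComputingElement]) -> Optional[ComputingElement]:
    """Pick the suitable computing element with the most free slots."""
    candidates = [ce for ce in published
                  if ce.free_slots > 0 and ce.max_memory_gb >= job.memory_gb]
    return max(candidates, key=lambda ce: ce.free_slots, default=None)


info_service = [
    ComputingElement("ce-a", free_slots=0, max_memory_gb=64),
    ComputingElement("ce-b", free_slots=12, max_memory_gb=16),
    ComputingElement("ce-c", free_slots=40, max_memory_gb=4),
]
chosen = match(JobDescription("analysis.sh", memory_gb=8), info_service)
print(chosen.name if chosen else "no match")
```

Here `ce-a` is full and `ce-c` offers too little memory, so the job is matched to `ce-b`.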

Grid computing at CERN

• Large Hadron Collider (LHC) experiment at the European Organisation for Nuclear Research (CERN)

• The Grid runs more than two million jobs per day

• By 2013, the system held 100 PB of data, growing by 27 PB per year

• Expected to generate 400 PB of data by 2023

https://www.youtube.com/watch?v=7k3VnWXOjP4
http://home.web.cern.ch/about/updates/2013/02/cern-data-centre-passes-100-petabytes
http://www.hpcwire.com/2014/11/04/cern-details-openstack-journey/
http://home.web.cern.ch/about/computing
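As a rough sanity check on these figures, a linear extrapolation from the 2013 numbers can be computed directly. The assumption of a constant growth rate is ours, not the source's; real growth is unlikely to be exactly linear.

```python
BASE_YEAR = 2013
BASE_PB = 100             # stored data in 2013, taken from the slide
GROWTH_PB_PER_YEAR = 27   # yearly growth, taken from the slide


def projected_pb(year: int) -> int:
    """Linear extrapolation; assumes the growth rate never changes."""
    return BASE_PB + GROWTH_PB_PER_YEAR * (year - BASE_YEAR)


print(projected_pb(2023))  # 100 + 27 * 10 = 370
```

The result, 370 PB, is of the same order as the 400 PB expectation quoted for 2023.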

Grid computing Advantages

• Seamless access to geographically distributed resources

• Provides a means to accelerate collaborative science

• The concept of virtual organisations (VO) evolved with Grids

• Each site in the Grid system is fully autonomous

• Transparent access to the heterogeneous resources

• Enables large-scale batch processing

Grid Computing Disadvantages

• Complex system architecture

• Steep learning curve for the end user

• Only allows batch processing; no interactivity

• Difficult to attach a comprehensive economic model

• The sites are autonomous, but the software is tightly coupled to the underlying hardware

• Mostly available for academic and research activities

• Lack of a standard interface

• Static availability of resources

Grid computing Current status

• European Middleware Initiative (EMI)

• Compute Resources:
  – gLite Middleware
  – Advanced Resource Connector (ARC)
  – UNICORE

• Storage Resources:
  – dCache
  – CASTOR
  – DPM

• Advanced Resource Connector (ARC)

• Nordic Data Grid Facility (NDGF)
  – Storage/data grid based on the dCache software stack
  – Data is distributed over many computing centres across Scandinavia
  – Secure data access using a variety of protocols

http://neic.nordforsk.org/about/strategic-areas/tier-1

Grid computing Shortfalls

• Tight coupling with hardware resources
• User interfaces
• Limited user community
• Weak monitoring and billing system
• Limited user level access
• Complex software stack
• Security model
• Users and project management system

Possible next steps

• A system that can address these limitations

Cloud computing NIST definition

Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.

http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
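The "shared pool ... rapidly provisioned and released" part of the definition can be illustrated with a toy Python sketch. `ResourcePool` and the server and tenant names are hypothetical; real clouds add virtualisation, quotas, metering and much more.

```python
class ResourcePool:
    """Toy shared pool of interchangeable servers (illustration only)."""

    def __init__(self, servers):
        self._free = list(servers)
        self._leased = {}  # server -> tenant

    def provision(self, tenant):
        # On demand: succeeds immediately if capacity exists, no operator involved.
        if not self._free:
            raise RuntimeError("pool exhausted")
        server = self._free.pop()
        self._leased[server] = tenant
        return server

    def release(self, server):
        # Released resources return to the shared pool for other tenants.
        del self._leased[server]
        self._free.append(server)

    @property
    def available(self):
        return len(self._free)


pool = ResourcePool(["srv-1", "srv-2"])
first = pool.provision("team-a")
second = pool.provision("team-b")
pool.release(first)
third = pool.provision("team-c")
print(first == third, pool.available)
```

The server released by one tenant is immediately handed to another, which is the sharing behaviour the definition describes.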

Example from Software Engineering

[Figure labels: Waterfall Model, Spiral Model, Unified Modeling Language (UML), Grid Computing, Cloud Computing]

Strength of cloud computing

Cloud computing reduces the gap between the concept and the implementation by defining roles and responsibilities that allow:

• a level of abstraction
• Service Level Agreements (SLA)
• a paradigm shift from servers to *-as-a-service
• the possibility to attach an economic model
• on-demand resource availability

Cloud computing Roles and responsibilities

• Infrastructure provider

• Platform provider

• Software provider

• Network provider

Why Cloud Computing?

A well-defined economic model

• Driving force behind Cloud concept

• Public Clouds
  – Amazon
  – HP Helion Cloud
  – Intel Cloud

• Private or Community Clouds
  – Smog
  – ePouta
  – UberCloud

Complete isolation, direct access and full control of allocated resources

On-demand resource allocation: no job queues!

• No need for specialised static worker nodes

Loose coupling with the underlying resources

• Live or block-based VM migration

“Standard” interface to interact with the cloud resources

• Amazon EC2 and S3 APIs can be used to connect to an OpenStack-based cloud

• REST API based communication
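REST-based communication means a cloud resource is requested with an ordinary HTTP call carrying a structured payload. The sketch below only constructs such a request with Python's standard library and never sends it. The endpoint URL, token, and payload fields are assumptions made for illustration, loosely modelled on how compute APIs are commonly shaped; consult the provider's API reference for the real format.

```python
import json
from urllib.request import Request

# Hypothetical endpoint, token and identifiers; replace with real values.
API_URL = "https://cloud.example.org/compute/v2.1/servers"
TOKEN = "example-token"

payload = {"server": {"name": "demo-vm", "flavorRef": "small", "imageRef": "ubuntu"}}

request = Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json", "X-Auth-Token": TOKEN},
    method="POST",
)
# The request is only constructed here; urllib.request.urlopen(request) would send it.
print(request.get_method(), request.full_url)
```

Because the interface is plain HTTP plus JSON, any language or tool that can issue HTTP requests can drive the cloud, which is what makes such an interface feel "standard".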

Orchestration of scalable services

• Amazon EC2 (Compute)
• Amazon S3 (Storage)
• Amazon Elastic MapReduce
• OpenStack Sahara (virtual Hadoop cluster)
• OpenStack Trove (Database)

Minimal interaction with service providers

Are legacy applications portable to Clouds?

ANSWER: Yes

Cloud computing Computing model

• In addition to batch processing, the Cloud computing model provides interactive processing of complex applications

• Frameworks such as IPython and Jupyter notebooks extend web technologies for interactive computing

Wikipedia:

Interactive computing refers to software which accepts input from humans — for example, data or commands.

Introduction to SNIC Cloud (IaaS)

http://smog.uppmax.uu.se

Security on SNIC Cloud

What will SNIC Cloud provide?

• Resources
  – Compute
  – Storage
  – Network

• Users will have complete control over the allocated resources.

• With power comes responsibility!

Important

• Users can log in as the super-user (root) and can install or uninstall whatever they want.

• Question: What if, for “convenience”, I create a user account on my VM with the name “XXX” and password “XXX123”… Can I?

• The answer is YES, you can. But it may have serious consequences!

Consequences

• Since the VMs are reachable over the Internet, systems with weak passwords, and sometimes even with strong passwords, can get hacked.

• The attacker can do various things:
  – Destroy the data available on the VM
  – Corrupt the VM so that it is no longer usable
  – Launch an attack using your VM
  – Or even much more …

What should we do?

• Don’t use password-based logins!

• The convention is to use the SSH key-pair login mechanism.

• For this course, all students are required to always use SSH keys to access resources.
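One way to verify that a VM really refuses password logins is to inspect its SSH server configuration. The sketch below is a simplified check written for illustration: it looks for the OpenSSH `PasswordAuthentication` keyword in configuration text passed in as a string, and it deliberately ignores `Match` blocks and `Include` directives. The function name and the sample text are hypothetical.

```python
def password_login_disabled(sshd_config_text: str) -> bool:
    """Return True if the config text turns off password authentication.

    Simplified on purpose: ignores Match blocks and Include directives.
    """
    for raw_line in sshd_config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        keyword, value = parts
        if keyword.lower() == "passwordauthentication":
            # sshd uses the first value it finds for a keyword.
            return value.strip().lower() == "no"
    # OpenSSH's documented default is "yes" when the keyword is absent.
    return False


sample = """
# Example server settings
PubkeyAuthentication yes
PasswordAuthentication no
"""
print(password_login_disabled(sample))
```

On a typical Linux VM the text would come from `/etc/ssh/sshd_config`; reading that file usually requires administrator rights.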

What is an SSH key-pair?

• A public-key based authentication system used to identify users on SSH-enabled servers

• Based on a pair of keys
  – private key (user’s personal key)
  – public key (world-readable key)

• Users can generate RSA or DSA based keys
  – RSA (Rivest-Shamir-Adleman) keys have a minimum key length of 768 bits and the default length is 2048
  – The key length of DSA (Digital Signature Algorithm) is always 1024

https://wiki.archlinux.org/index.php/SSH_keys
https://help.ubuntu.com/community/SSH/OpenSSH/Keys

Key-Pair generation

• OpenStack-based key generation interface

• Command-line interface

$ ssh-keygen
or
$ ssh-keygen -t rsa -b 2048

Recommended
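When key generation has to be scripted, the same command can be assembled programmatically. The sketch below only builds the argument list and prints it; it does not run `ssh-keygen`. The helper name, the key path, the comment string and the "at least 2048 bits" guard are assumptions made for this example; `-t`, `-b`, `-f` and `-C` are standard `ssh-keygen` options.

```python
import shlex
from typing import List


def keygen_command(key_path: str, bits: int = 2048, comment: str = "") -> List[str]:
    """Build (but do not run) an ssh-keygen command for an RSA key pair."""
    if bits < 2048:
        # Guard chosen for this example, matching the key size used above.
        raise ValueError("use at least 2048 bits for RSA keys")
    command = ["ssh-keygen", "-t", "rsa", "-b", str(bits), "-f", key_path]
    if comment:
        command += ["-C", comment]
    return command


command = keygen_command("/tmp/course_key", comment="cloud-course")
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would execute it; `ssh-keygen` then prompts for a passphrase and writes the private key to the given path and the public key next to it with a `.pub` suffix.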