Grid Computing – Issues in Data grids and Solutions Sudhindra Rao




Grid Computing OSCAR Lab 2

Outline

• Grid Computing – introduction
• Computational Grids
• Data Grids
• Data Management
• Related Work
• Technologies – JavaSpaces, OceanStore
• Our research plan
• Discussion


What is grid computing?

• Use a network of PCs
• Faster networks, cheaper PCs, lots of idle time
• Easy to build, maintain, scale
• Generic solution for scientific and business problems alike
• Some form of grid computing: SETI@Home, Argonne National Lab, Google, etc.


Why today?

[Figure: forces converging on grid computing. Labels shown: Capabilities (security, manageability, agility); Goals (efficiency, profitability, control); uncertainty, complexity, distribution, new opportunities; world events; market dynamics; maturing technology.]


Applications – data grids

• Geographic distribution of data
• Computations on large-scale data

Compute-intensive analytics
• Value at risk
• Credit risk
• Real-time risk management
• Automated trade programs

OLAP data analysis
• Anti-money laundering
• Credit card (risk and customer data mining)
• Billing

Data center operations
• In-process system migration
• High fault tolerance
• Geographic data center independence for failover and business applications

Compute utility services
• Data center compute farms
• Corporate compute utility services creating a low-cost infrastructure similar to the electric grid


Distributed Computing Evolution

[Figure: evolution of distributed computing, from client/server to grid computing. Labels shown: pipes/sockets, clusters, data grids, utility service; middleware; data queues, publish/subscribe, smart routing; file sharing, CORBA, data translation.]


Compute grid

• Distributed pool of resources
• Completing a task for a user
• User requests and reserves resources
• Some kind of middleware manages resources and tasks
• Resilient and fault tolerant
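The points above can be sketched in a few lines. This is only an illustration under stated assumptions: `Worker` and `Middleware` are hypothetical names invented here, everything runs in one process, and a "failed" node is simulated with a flag. It shows a user reserving resources, the middleware running a task on the reserved pool, and the task falling over to another worker when one is down.

```python
class Worker:
    """A grid node; healthy=False simulates a failed machine."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.reserved_by = None

    def run(self, task):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is down")
        return task()


class Middleware:
    """Hypothetical resource manager: reserves workers and retries tasks."""

    def __init__(self, workers):
        self.workers = list(workers)

    def reserve(self, user, count):
        # Hand out the first `count` workers nobody has reserved yet.
        free = [w for w in self.workers if w.reserved_by is None]
        if len(free) < count:
            raise RuntimeError("not enough free workers")
        for w in free[:count]:
            w.reserved_by = user
        return free[:count]

    def release(self, user):
        for w in self.workers:
            if w.reserved_by == user:
                w.reserved_by = None

    def submit(self, user, task):
        # Try each worker reserved by this user until one succeeds.
        for w in self.workers:
            if w.reserved_by != user:
                continue
            try:
                return w.name, w.run(task)
            except RuntimeError:
                continue  # fault tolerance: move on to the next worker
        raise RuntimeError("all reserved workers failed")


grid = Middleware([Worker("pc1", healthy=False), Worker("pc2"), Worker("pc3")])
grid.reserve("alice", 2)  # reserves pc1 and pc2
ran_on, result = grid.submit("alice", lambda: sum(range(10)))
```

Here the task is first offered to `pc1`, which is down, and then completes on `pc2`. A real middleware layer would also handle queuing, remote execution, and failure detection, none of which is modeled here.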


Data grid

[Figure: a client connects to a server, and to a business application server, over a network pipe with 1-1 connectivity. A compute grid coordinates a set of tasks, and multiple applications/worker threads access a single data store.]


[Figure: the same setup with a data grid placed in front of data storage. The compute grid coordinates a set of tasks; the data grid manages data and eliminates data access bottlenecks.]


Data grid architecture

• Mechanism neutrality
• Policy neutrality
• Compatibility with compute grid
• Uniformity with information infrastructure

Services
• Storage service
• Grid storage API
• Metadata service
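One way to picture how the storage and metadata services fit together is the sketch below. It is an assumption-laden illustration, not an API from any grid toolkit: `StorageService`, `MetadataService`, `fetch`, the site names, and the file path are all hypothetical. The storage API hides how bytes are kept (mechanism neutrality), and the metadata service maps a logical name to the sites that hold a replica.

```python
class StorageService:
    """One storage site behind a uniform put/get API.

    Callers never see how the bytes are stored; here it is just a dict.
    """

    def __init__(self, site):
        self.site = site
        self._blobs = {}

    def put(self, physical_name, data):
        self._blobs[physical_name] = bytes(data)

    def get(self, physical_name):
        return self._blobs[physical_name]


class MetadataService:
    """Maps a logical file name to the (site, physical name) of each replica."""

    def __init__(self):
        self._replicas = {}

    def register(self, logical_name, site, physical_name):
        self._replicas.setdefault(logical_name, []).append((site, physical_name))

    def locate(self, logical_name):
        return list(self._replicas.get(logical_name, []))


def fetch(logical_name, metadata, sites):
    """Return the data from the first registered replica whose site is known."""
    for site, physical_name in metadata.locate(logical_name):
        if site in sites:
            return sites[site].get(physical_name)
    raise KeyError(logical_name)


# Hypothetical sites and file names, for illustration only.
sites = {"site_a": StorageService("site_a"), "site_b": StorageService("site_b")}
catalog = MetadataService()
sites["site_b"].put("/vol1/run42.dat", b"events")
catalog.register("run42", "site_b", "/vol1/run42.dat")
data = fetch("run42", catalog, sites)
```

The caller asks for `run42` by logical name only; which site serves it is decided by the catalog lookup, so replicas can be added or moved without changing client code.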


Data grid architecture

Expectations
• Coordination between compute and data grid
• Data delivery to facilitate task and resource management
• Sharing data distribution and location information
• Leveraging data locality

Guarantees
• Dependability
• Consistency
• Pervasiveness
• Security
• Inexpensive


Data delivery - QoS requirements

[Figure: data grid QoS level (Level 0, Level 1) plotted against application complexity (work, time, data, transactional). Example applications shown: Monte Carlo simulation, OLAP, real-time datamart. Application classes shown:]
• Batch, synchronous, static data, nontransactional
• Atomic, synchronous, static data, nontransactional
• Atomic, asynchronous, static data, nontransactional
• Atomic, asynchronous, dynamic data, nontransactional
• Batch, synchronous, static data, transactional
• Atomic, synchronous, static data, transactional
• Atomic, asynchronous, static data, transactional
• Atomic, asynchronous, dynamic data, transactional


Related Work

• Grid File System: provides file-system-like primitives; Level 0 QoS
• NFSv4: high performance, extensible, secure; still in development
• Secure File System: self-certifying paths, unique identifiers, global namespace, key-based certification


Technologies related to data grids - JavaSpaces

“Make Room for JavaSpaces, Part 1: Ease the Development of Distributed Apps with JavaSpaces” - Eric Freeman and Susanne Hupfer
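JavaSpaces is built around a shared space that processes `write` entries into, `read` copies from, and `take` entries out of, with lookups done by template matching. The sketch below is a Python analog of that idea, not the JavaSpaces API: the `TupleSpace` class is hypothetical, entries are plain dicts, a `None` field in a template acts as a wildcard, and blocking, leases, transactions, and notification are left out.

```python
class TupleSpace:
    """In-memory analog of a space.

    Entries are dicts; templates are dicts whose None values are wildcards.
    """

    def __init__(self):
        self._entries = []

    @staticmethod
    def _matches(entry, template):
        return all(
            key in entry and (value is None or entry[key] == value)
            for key, value in template.items()
        )

    def write(self, entry):
        # Store a copy so later changes by the writer do not leak in.
        self._entries.append(dict(entry))

    def read(self, template):
        """Return a copy of a matching entry, leaving it in the space."""
        for entry in self._entries:
            if self._matches(entry, template):
                return dict(entry)
        return None

    def take(self, template):
        """Remove and return a matching entry, or None if nothing matches."""
        for i, entry in enumerate(self._entries):
            if self._matches(entry, template):
                return self._entries.pop(i)
        return None


space = TupleSpace()
space.write({"kind": "task", "id": 1, "payload": "price bond A"})
space.write({"kind": "task", "id": 2, "payload": "price bond B"})

peeked = space.read({"kind": "task", "id": None})  # any task, not removed
taken = space.take({"kind": "task", "id": 2})      # a worker claims task 2
again = space.take({"kind": "task", "id": 2})      # already claimed
```

Because `take` removes the entry, two workers cannot both claim task 2; the second attempt finds nothing. That property is what makes a space usable as a work queue between loosely coupled producers and consumers.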


OceanStore

• Global replication of data
• Promiscuously caches data
• Version-based archival storage
• Applications can control their consistency requirements to manage performance
• Internal event monitors analyze access patterns to move data and provide redundancy


Grid Fabric - Integrasoft

• Business solution provided for financial institutions, share traders
• Designed to complement compute grid
• Works closely with compute grid to schedule tasks based on data availability
• Moves data closer to computation
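Scheduling on data availability can be illustrated with a small placement function. This is a generic sketch and makes no claim about how Grid Fabric works internally: the `schedule` function, the node names, and the dataset names are all hypothetical. Each task goes to the least-loaded node that already holds its input; only when no node holds the data is a transfer planned.

```python
def schedule(tasks, replicas, nodes):
    """Place each task on a node that already holds its input data.

    tasks:    {task_name: dataset_name}
    replicas: {dataset_name: set of node names holding a copy}
              (updated in place when a transfer is planned)
    nodes:    list of node names
    Returns (assignments, transfers), where transfers lists the
    (dataset, node) copies that must happen before tasks can run.
    """
    load = {n: 0 for n in nodes}
    assignments, transfers = {}, []
    for task, dataset in tasks.items():
        holders = [n for n in nodes if n in replicas.get(dataset, set())]
        # Prefer nodes with the data; fall back to any node otherwise.
        candidates = holders or nodes
        target = min(candidates, key=lambda n: load[n])
        if not holders:
            transfers.append((dataset, target))
            replicas.setdefault(dataset, set()).add(target)
        assignments[task] = target
        load[target] += 1
    return assignments, transfers


nodes = ["n1", "n2", "n3"]
replicas = {"trades": {"n2"}, "quotes": {"n1", "n3"}}
tasks = {"var": "trades", "credit": "quotes", "audit": "ledger"}
assignments, transfers = schedule(tasks, replicas, nodes)
```

In this run `var` lands on `n2` and `credit` on `n1` with no data movement, while `audit` needs `ledger`, which no node holds, so one transfer to the idle node `n3` is planned.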


SOA and Data grids

• Moore’s law and Metcalfe’s law
• Network-based computation and grid computing with SOA
• Intelligent infrastructure – SONA

[Figure: relationships among web services, business process, data grid, and state. Edge labels shown: "delivers", "has", "requires".]


Web 2.0


Our research – Motivation

Issues in data management
• Data tightly coupled to computation
• Data cached locally
• Distribution is haphazard and reuse is minimal
• Data pulled by computation – not delivered
• Mechanisms still improvise based on experience on smaller systems


Data Grid and DBMS

Grid DBMS
• Security
• Transparency
• Robustness
• Efficiency
• Intelligence
• Fragmentation
• Heterogeneity

Feature                 DBMS                  Data regions
Schema                  Tables                Ordered structure
Events                  Triggers              Events
Optimizations           Stored procedures     Distributed procedures
Indexing                Intra-table fields    Cross-structure
Locking                 Table/row level       Data atom level
Relation                Table joins           Data atom
Query                   SQL                   Programmatic string base
Repeated data access    Indexes               Tags


Data grids as extended DBMS

[Figure: the data grid, which eliminates data access bottlenecks, sits on a persistence mechanism with data regions in front of data storage. Links in the diagram indicate replicas and relations.]


Datacentric grids

• Automated space management and garbage collection
• Space and data objects lifetime mechanism
• I/O allocation on storage system
• Estimating access from magnetic storage
• Co-scheduling of compute and storage resources
• Space reservation dilemma
• Thin clients
• Code mobility towards data
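A lifetime mechanism with automated garbage collection could look roughly like the sketch below. It is one possible reading of the bullets above, not a design from the talk: `ManagedSpace` is a hypothetical class, time is passed in explicitly as a number so the example is deterministic, and object names are assumed to be unique. Every reservation carries an expiry time, and expired objects are reclaimed before a new reservation is admitted.

```python
class ManagedSpace:
    """Fixed-capacity store whose objects carry an expiry time."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._objects = {}  # name -> (size, expires_at)

    def used(self):
        return sum(size for size, _ in self._objects.values())

    def collect(self, now):
        """Garbage-collect every object whose lifetime has ended."""
        expired = [n for n, (_, exp) in self._objects.items() if exp <= now]
        for name in expired:
            del self._objects[name]
        return expired

    def reserve(self, name, size, lifetime, now):
        """Reserve space for `lifetime` time units.

        Expired objects are reclaimed first; returns False if the
        request still does not fit.
        """
        self.collect(now)
        if self.used() + size > self.capacity:
            return False
        self._objects[name] = (size, now + lifetime)
        return True


space = ManagedSpace(capacity=100)
first = space.reserve("a", 60, lifetime=10, now=0)   # fits
early = space.reserve("b", 60, lifetime=10, now=5)   # "a" is still live
later = space.reserve("b", 60, lifetime=10, now=10)  # "a" has expired
```

The second request is refused because `a` still occupies 60 of 100 units; once `a` reaches the end of its lifetime it is collected and the same request succeeds. The reservation dilemma named above shows up in the choice of `lifetime`: too short and data is evicted before it is reused, too long and space sits idle.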


Expected Results

• Can we move computation closer to data?
• Data grid with features of persistence?
• Performance improvement using tags?
• Loosely coupled data grid and compute grid?
• Scalability of unique naming in file systems?


Thank you!