Parallelisation in LarKC

LarKC Tutorial at ISWC 2009 - Parallelisation


DESCRIPTION

The aim of the EU FP 7 Large-Scale Integrating Project LarKC is to develop the Large Knowledge Collider (LarKC, for short, pronounced “lark”), a platform for massive distributed incomplete reasoning that will remove the scalability barriers of existing reasoning systems for the Semantic Web. The LarKC platform is available at larkc.sourceforge.net. This talk is part of a tutorial for early users of the LarKC platform and describes the parallelisation approach in the platform.


Parallelization and Distribution - Motivation

• Distribution
  – Make use of all (distributed) resources available
  – Use data that cannot be shipped (either because of size or because of security restrictions) => move computation to the data vs. move data to computation

• Parallelization
  – Make use of all resources available (e.g. if we have 17 machines, we would like them to work at the same time, not one after the other)
  – Either within 1 site (e.g. HPC cluster) or distributed (e.g. thinking@home)
  – Improve efficiency of computation
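
The "move computation to the data" idea can be illustrated with a minimal sketch. The site names, records and helper functions below are hypothetical and not part of LarKC; the sketch only compares how many items would have to cross the network under each approach.

```python
# Hypothetical sites, each holding data that stays local (an assumption).
SITES = {
    "site_a": list(range(0, 1000)),
    "site_b": list(range(1000, 1500)),
}


def matches(record):
    """Example predicate evaluated on each record (illustrative only)."""
    return record % 7 == 0


def move_data_to_computation(sites):
    """Ship every record to one place, then compute there."""
    shipped = [r for records in sites.values() for r in records]
    result = sum(1 for r in shipped if matches(r))
    return result, len(shipped)  # (answer, items transferred)


def move_computation_to_data(sites):
    """Evaluate the predicate at each site; ship back one count per site."""
    partial_counts = [
        sum(1 for r in records if matches(r)) for records in sites.values()
    ]
    return sum(partial_counts), len(partial_counts)  # (answer, items transferred)


answer_central, transferred_central = move_data_to_computation(SITES)
answer_local, transferred_local = move_computation_to_data(SITES)
print(answer_central, transferred_central)
print(answer_local, transferred_local)
```

Both functions return the same answer; only the amount of data that has to move differs, which is what matters when the data is too large or too sensitive to ship.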

Scalability

General concept:
  size N => time T
  size 2*N => time ≤ 2T (same resources) OR time ~T (double resources)
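
The rule of thumb above can be written as a small check on two measured run times. This is a sketch only: the function name and the `tolerance` used to interpret "time ~T" are assumptions, since the slide gives no numeric threshold.

```python
def scales(t_n, t_2n, resources_doubled, tolerance=0.1):
    """Check the doubling rule of thumb for two measured run times.

    t_n:       time for input size N
    t_2n:      time for input size 2*N
    tolerance: assumed slack for "time ~T" (not given in the source)
    """
    if resources_doubled:
        # Double the size on double the resources: time should stay close to T.
        return t_2n <= t_n * (1 + tolerance)
    # Double the size on the same resources: time should be at most 2T.
    return t_2n <= 2 * t_n


print(scales(10.0, 19.0, resources_doubled=False))
print(scales(10.0, 25.0, resources_doubled=False))
print(scales(10.0, 10.5, resources_doubled=True))
```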


Parallelization and Distribution strategies in LarKC

• Plug-in scope: scalability at plug-in level
  – “within a plug-in” parallelization
  – MPI, OpenMP, hybrid, …

• Platform scope: scalability at pipeline level
  – “across plug-ins” or “across instances of the same plug-in” parallelization
  – IBIS/JavaGAT, …, Grid middleware solutions
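
Parallelization "across instances of the same plug-in" can be sketched as splitting the input, running one instance per chunk, and merging the partial results. The `select` function is a hypothetical stand-in for a plug-in, not LarKC code, and Python threads here merely stand in for separate processes or machines.

```python
from concurrent.futures import ThreadPoolExecutor


def select(statements):
    """Stand-in for one plug-in instance: keep statements mentioning 'lark'."""
    return [s for s in statements if "lark" in s]


def split(items, parts):
    """Split items into `parts` chunks of near-equal size, preserving order."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def run_instances(items, instances=3):
    """Run one `select` instance per chunk and merge the partial results."""
    with ThreadPoolExecutor(max_workers=instances) as pool:
        # map() returns results in chunk order, so the merged list stays ordered.
        partials = list(pool.map(select, split(items, instances)))
    return [s for part in partials for s in part]


data = ["lark a", "owl", "lark b", "crow", "lark c"]
print(run_instances(data))
```

Parallelization within a plug-in (MPI, OpenMP) is a separate concern and stays inside the plug-in's own implementation, so it is not shown here.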


Parallelization and Distribution in the LarKC Platform – Local execution

Current Prototype

• Modular
• Pluggable
• Loose coupling between platform and plug-ins, and between plug-ins
• Support for coarse-grained parallelization (across plug-ins)

[Figure: the Decider, Plug-in Registry and Pipeline Support System, together with five plug-ins (Query Transformer, Identifier, Info. Set Transformer, Selecter, Reasoner), each shown with a Plug-in API and a Local Plug-in Manager]


Parallelization and Distribution in the LarKC Platform – Remote execution

Implementation in progress

+ Support for distributed remote execution

[Figure: the Decider, Plug-in Registry and Pipeline Support System, together with five plug-ins (Query Transformer, Identifier, Info. Set Transformer, Selecter, Reasoner), each shown with a Plug-in API and a Remote Plug-in Manager; five Stub Plug-in Managers are also shown]


Application of Parallelization and Distribution - Example

[Figure: example pipeline with Query Transformer, Identifier, Selecter 1, Selecter 2, Reasoner and Decider, annotated with “Parallelization across plug-ins”, “Parallelization within plug-in” and “Distribution”]

• Strategy for Parallelization and Distribution must be customized for every use case
  – Optimization of performance
• Automating this in Decider: maybe another research programme
• LarKC offers the necessary support for its deployment and execution
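
A hand-customized strategy of the kind described above might look like the following sketch, in which a decider runs two selecters at the same time and passes their combined output to a reasoner. All functions are hypothetical stand-ins named after the plug-in types on the slide; the stage ordering and the toy data are assumptions, not the LarKC API.

```python
from concurrent.futures import ThreadPoolExecutor


# Each function is a hypothetical stand-in for one plug-in type.
def query_transformer(query):
    return query.lower().split()


def identifier(keywords):
    # Pretend these are the resources identified for the keywords.
    return [f"{k}-doc{i}" for k in keywords for i in range(3)]


def selecter_1(resources):
    return [r for r in resources if r.endswith("0")]


def selecter_2(resources):
    return [r for r in resources if r.endswith("1")]


def reasoner(selection):
    return sorted(set(selection))


def decider(query):
    """Hand-written strategy: run the two selecters concurrently."""
    resources = identifier(query_transformer(query))
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(selecter_1, resources)
        second = pool.submit(selecter_2, resources)
        selection = first.result() + second.result()
    return reasoner(selection)


print(decider("Semantic Web"))
```

The choice of which stages to run in parallel is hard-coded in `decider`, which reflects the point that the strategy is customized per use case rather than derived automatically.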


High Performance and Distributed Computing support in LarKC (1/2)

LarKC supports large-scale HPC and distributed computing environments for executing plug-ins/pipelines

[Figure: layered architecture. Plug-in layer: Decider, Identifier, Reasoner. Platform layer: LarKC platform, LarKC Data Layer. Resource layer: User environment, High-performance and Grid (Cloud) environment, Data Storage. Other labels: Developer extensions; LarKC middleware adapters/extensions; Native middleware solutions; RDFStore; RDFDoc; High-performance and cluster systems; Public Desktop Grid; Volunteer resources; User desktop machine; Cloud resources]


High Performance and Distributed Computing support in LarKC (2/2)

Computing environments potentially supported by LarKC

• High-performance computing systems (clusters of SMP nodes)
• Public Desktop Grid (BOINC based)
• Public Desktop Grid (XtremWeb based)
• Volunteer resources
• Public Desktop Grid
• Local Desktop Grid
• High Performance Computing Grid infrastructure (e.g. EGEE, DEISA, etc.)
• Cloud computing environments
• Service Grid (e.g. EDGeS)

Implementation in progress
