
© 2007 The MathWorks, Inc.

Distributed Computing with MATLAB® in Grids

Silvina Grad-Freilich

Manager, Parallel Computing Technical Marketing

[email protected]


Transposing a Distributed Matrix

[Code listings, shown as images: Using FORTRAN and MPI; Using MATLAB and MPI]

Using MATLAB Distributed Arrays

P>> D = distribute(A)
P>> E = D'


The MathWorks at a Glance

Headquarters: Natick, Massachusetts USA

USA: California, Michigan, Washington DC, Texas

Europe: UK, France, Germany, Switzerland, Italy, Spain, Benelux

Worldwide training and consulting

Distributors in 25 countries

■ More than 1,000,000 users in 175+ countries
■ Over 1,600 employees worldwide

[Figure: Earth’s topography on an equidistant cylindrical projection, created with the MATLAB Mapping Toolbox]


The leading environment for technical computing

■ Numeric computation
■ Data analysis and visualization
■ The de facto industry-standard, high-level programming language for algorithm development
■ Toolboxes for signal and image processing, statistics, optimization, symbolic math, and other areas
■ Foundation of the MathWorks product family


The leading environment for modeling, simulating, and implementing communications systems and semiconductors

■ Foundation for Model-Based Design
■ Digital, analog, and mixed-signal systems, with floating- and fixed-point support
■ Algorithm development, system-level design, implementation, and test and verification
■ Optimized code generation for FPGAs and DSPs
■ Blocksets for signal processing, communications, video and image processing, and RF
■ Open architecture with links to third-party modeling tools, IDEs, and test systems


Key Industries

Core
■ Aerospace and Defense
■ Automotive
■ Communications, Electronics, Semiconductors, and Computers
■ Education

Emerging
■ Biotech, Pharmaceutical, and Medical
■ Financial Services
■ Industrial Equipment and Machinery
■ Instrumentation

Ongoing
■ Chemical and Petroleum
■ Earth and Ocean Sciences
■ Utilities and Energy


User Requirements: Two user communities

[Diagram: HPC users (C, Fortran) need easier programming; technical computing users need higher data volumes and compute intensity. Both meet at personal supercomputing with MATLAB.]


Personal Supercomputing with MATLAB

[Diagram: Distributed Computing Toolbox, with toolboxes and blocksets, connected to a computer cluster running the MATLAB Distributed Computing Engine: a scheduler and four workers, one per CPU]


Case #1

Distributing Tasks (Task Parallel)

[Figure: two charts of processes versus time]


Parallel for loops

% BER Simulations
for s = 0 : 1 : 10
    [t, BER, bits] = RunBERSim(s, maxBits, maxErr);
end


Parallel for loops

% BER Simulations
matlabpool(4);
parfor (s = 0 : 1 : 10)
    [t, BER, bits] = RunBERSim(s, maxBits, maxErr);
end
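A self-contained variant of this loop can be sketched as follows. RunBERSim is not shown in the slides, so the sketch swaps in a hypothetical stand-in, the closed-form bit error rate of BPSK over an AWGN channel, and treats the loop variable as Eb/N0 in dB. Both are assumptions made for illustration. Pool start-up is omitted here.

```matlab
% Hypothetical stand-in for RunBERSim: theoretical BPSK bit error rate
% over an AWGN channel at a given Eb/N0 in dB (an assumption, not the
% simulation used in the slides).
theoreticalBer = @(ebnoDb) 0.5 * erfc(sqrt(10 ^ (ebnoDb / 10)));

ber = zeros(1, 11);              % one slot per loop iteration
parfor s = 0 : 10
    % Iterations are independent, so they may run in any order.
    % Indexing by s + 1 makes ber a sliced output variable.
    ber(s + 1) = feval(theoreticalBer, s);
end
```

Writing each result into its own element keeps it available after the loop. Variables assigned without indexing inside a parfor body are temporaries and are not returned to the client.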


Research Engineers Advance Design of the International Linear Collider with MathWorks Tools

The Challenge
To design a control system for ensuring the precise alignment of particle beams in the International Linear Collider

The Solution
Use MATLAB, Simulink, the Distributed Computing Toolbox, and the Instrument Control Toolbox to design, model, and simulate the accelerator and alignment control system

The Results
■ Simulation time reduced by an order of magnitude
■ Development integrated
■ Existing work leveraged

“With the Distributed Computing Toolbox, we saw a linear improvement in speed. MathWorks tools have enabled us to accomplish work that was once impossible.”
Dr. Glen White, University of London


Case #2

Large Data Sets (Data Parallel)


Parallel Computing Capabilities used in Cancer Research

The Challenge
To improve a non-invasive breast cancer diagnostic technique based on acoustic imaging of micro-calcifications in breast tissue

The Solution
Use MATLAB, the PDE Toolbox, and the Distributed Computing Toolbox to develop and investigate algorithms, and to visualize results

The Results
■ No major effort required to convert serial code to parallel through use of distributed arrays
■ Computation time shortened by an order of magnitude

“With parallel MATLAB the solution for the entire scattering problem can be accomplished in less than 20 minutes on 12 processors compared to about 4 hours for the serial solution.”


Three elliptical scatterers; Neumann Boundary Conditions applied
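The quoted timings imply a speedup that is easy to check. The rough calculation below treats "about 4 hours" as 240 minutes and "less than 20 minutes" as an upper bound of 20, so both results are lower bounds rather than measured values.

```matlab
serialMinutes   = 4 * 60;   % "about 4 hours" for the serial solution
parallelMinutes = 20;       % upper bound: "less than 20 minutes"
numProcessors   = 12;

speedup    = serialMinutes / parallelMinutes;   % at least 12
efficiency = speedup / numProcessors;           % at least 1.0, roughly linear scaling
```

A speedup of about 12 on 12 processors is consistent with the "order of magnitude" reduction listed in the results.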


Distributed Arrays, Parallel Algorithms

■ Distributed arrays
  – Store segments of data across participating workers
  – Create from any MATLAB built-in class
    Examples: doubles, sparse, logicals, cell arrays, and arrays of structs
■ Parallel algorithms for distributed arrays
  – Matrix manipulation operations
    Examples: indexing, data type conversion, and transpose
  – Parallel linear algebra functions such as svd and lu
■ Data distribution
  – Automatic, specify your own, or change at any time
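A minimal sketch of this workflow, assuming a current Parallel Computing Toolbox release, where the constructor is named distributed rather than the 2007-era distribute used elsewhere in these slides. The matrix and variable names are illustrative only.

```matlab
A = magic(8);          % ordinary array on the client
D = distributed(A);    % segments of A are stored across the workers
E = D';                % transpose operates on the distributed data

Et = gather(E);        % collect the result back into an ordinary array
```

The transpose is written exactly as it would be for an ordinary matrix. Only the creation of D and the final gather differ from the serial code.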


Distributed arrays make conversion from serial to parallel code easier


• Parallel data types (distributed array) automatically propagated
• No code changes required


Over 150 Parallel Functions Available


Interactive Prototyping and Development


Interactive to Batch Execution


MPI-based Functions in Distributed Computing Toolbox

Use when a high degree of control over parallel algorithm is required.

■ High-level abstractions of MPI functions
  – labSendReceive, labBroadcast, and others
  – Send, receive, and broadcast any MATLAB data type
■ Automatic bookkeeping
  – Set-up: communication, ranks, etc.
  – Error detection: deadlocks and miscommunications
■ Pluggable
  – Use any MPI implementation that is binary compatible with MPICH-2
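A ring exchange is one case where this level of control is useful. The sketch below assumes an open pool of workers and uses the function names from the era of these slides (labindex, numlabs, labSendReceive). Recent releases rename them to spmdIndex, spmdSize, and spmdSendReceive. The ring pattern itself is an illustration and is not taken from the slides.

```matlab
spmd
    % Neighbours on a ring of numlabs workers (1-based ranks).
    right = mod(labindex, numlabs) + 1;
    left  = mod(labindex - 2, numlabs) + 1;

    % Send this worker's rank to the right and receive from the left
    % in one call, which avoids the deadlock of paired blocking sends.
    fromLeft = labSendReceive(right, left, labindex);
end
```

After the block, fromLeft is a Composite on the client, with one entry per worker holding the rank of that worker's left neighbour.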


Supported on all MATLAB platforms


Licensing: Distributed Computing Toolbox

■ Standard toolbox license (individual, concurrent, etc.)
■ Requires MATLAB
■ Allows up to 4 local workers

[Diagram: MATLAB, Simulink, blocksets, and toolboxes with the Distributed Computing Toolbox exchange jobs and results with the scheduler of the MATLAB Distributed Computing Engine, which exchanges tasks and results with the workers]


Run four local workers with a DCT license

■ Easy to experiment with explicit parallelism on multi-core machines
■ Rapidly develop parallel applications on local computer
■ Take full advantage of desktop power
■ No separate compute cluster required


Scale up to cluster configuration with no code changes

[Diagram: Distributed Computing Toolbox connected to a computer cluster running the MATLAB Distributed Computing Engine: a scheduler and four workers, one per CPU]


Licensing: MATLAB Distributed Computing Engine

■ One key required per worker
  – Packs of 8, 16, 32, 64, 128, etc.
  – Worker is a MATLAB session, not a processor
■ All-product install*
  – No code generation or deployment products

[Diagram: MATLAB, Simulink, blocksets, and toolboxes with the Distributed Computing Toolbox exchange jobs and results with the scheduler of the MATLAB Distributed Computing Engine, which exchanges tasks and results with the workers]


Dynamic Licensing

[Diagram: compute cluster with a scheduler and four workers, one per CPU]


Third Party Schedulers

[Diagram: client machine (Distributed Computing Toolbox, toolboxes, blocksets) connected through a third-party scheduler to a computer cluster of four CPUs running the MATLAB Distributed Computing Engine]


Open API for generic schedulers

Extended support for 3rd-party schedulers


Challenges in moving to the Grid

■ Technical: Integration with Grid middleware
  – Create and run batch jobs
  – Run jobs without requiring shared file system
  – Integration with local resource manager through Workload Management System
■ Business: Licensing model
  – Define policy on license management within the Grid framework


Open API for Generic Schedulers

[Diagram: client machine (Distributed Computing Toolbox, toolboxes, blocksets) connected through a third-party scheduler to a computer cluster of four CPUs running the MATLAB Distributed Computing Engine]


Challenges in moving to the Grid


Some of the issues to resolve:
1. DCT and Engine licensed by different organizations


Licensing Pilot for Third-Party Use

[Diagram: end users (researchers working for affiliated degree-granting organizations, such as University A) reach the HPC center (a degree- or non-degree-granting organization supporting researchers) through a portal]


Challenges in moving to the Grid


Some of the issues to resolve:
1. DCT and Engine licensed by different organizations
2. Commercial vs. academic use
3. Policy on license management within the EGEE framework


Pilot: EGEE – The MathWorks
Integrate distributed computing tools with EGEE middleware

Step 1: Research need and pre-setup
■ Survey EGEE virtual organizations on MATLAB use (EGEE)
■ Identify sites to be used in test (EGEE)
■ Provide trial licenses (MathWorks)

Step 2: Technical feasibility study
■ Integrate with local resource manager (EGEE)
■ Integrate with local resource manager through Workload Management System (MathWorks & EGEE)

Step 3: Define licensing model
■ Create model for Grid deployment within the EGEE framework (MathWorks)


Summary

■ Supercomputing with MATLAB is used by two user groups
  – HPC users - delivering the benefits of MATLAB
  – MATLAB users - delivering the power of HPC
■ Distributed Computing
  – Interactive prototyping and development of parallel MATLAB applications
    Interactive and batch execution modes
  – Trivial code changes required to distribute algorithms onto multiple processors
■ Deployment in Grids
  – Need to work on a viable business model for users, HPC centers, and commercial organizations
