Page 1

Devendar Bureddy

HPCAC-Stanford, Feb 8, 2017

In-Network Computing SHARP Technology for MPI Offloads

Page 2

Accelerating HPC applications and Artificial Intelligence

Accelerating HPC Applications

Significantly reduce MPI collective runtime

Increase CPU availability and efficiency

Enable communication and computation overlap

Enabling Artificial Intelligence Solutions to Perform Critical and Timely Decision Making

Accelerating distributed machine learning

Page 3

Switch-IB 2: Smart Switch

Switch-IB 2 Enables the Switch Network to Operate as a Co-Processor

SHARP Enables Switch-IB 2 to Manage and Execute MPI Operations in the Network

The world's fastest switch, with <90 nanosecond latency

36 ports, 100Gb/s per port, 7.2Tb/s throughput, 7.02 billion messages/sec

Adaptive routing, congestion control

Multiple topologies

Page 4

SHARP Performance Data – OSU Allreduce 1PPN, 128 nodes

Page 5

SHArP Performance Advantage

MiniFE is a Finite Element mini-application

• Implements kernels that represent implicit finite-element applications

[Chart: MiniFE AllReduce performance]

Page 6

Allreduce – Four Process Recursive Doubling

[Diagram: four processes (1–4) exchange and combine partial results pairwise, first with partners at distance 1 (step 1), then at distance 2 (step 2)]
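For reference, a minimal sketch of the recursive-doubling pattern the diagram illustrates — not SHARP code, just the host-based algorithm it replaces. It assumes a power-of-two number of ranks and reduces a single double with MPI_Sendrecv; the function name is illustrative:

/* Recursive-doubling allreduce sketch: log2(P) exchange steps, each rank ends with the full sum. */
#include <mpi.h>

double recursive_doubling_allreduce(double val, MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);                 /* assumed to be a power of two */

    for (int mask = 1; mask < size; mask <<= 1) {
        int partner = rank ^ mask;              /* exchange with the partner at distance `mask` */
        double remote;
        MPI_Sendrecv(&val, 1, MPI_DOUBLE, partner, 0,
                     &remote, 1, MPI_DOUBLE, partner, 0,
                     comm, MPI_STATUS_IGNORE);
        val += remote;                          /* combine the partial results */
    }
    return val;
}

With SHARP the same reduction is instead performed by the switches as data moves up the aggregation tree, so each host sends its contribution once and receives the final result.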

Page 7

SHARP: Scalable Hierarchical Aggregation Protocol

Reliable Scalable General Purpose Primitive

• In-network Tree based aggregation mechanism

• Large number of groups

• Multiple simultaneous outstanding operations

Applicable to Multiple Use-cases

• HPC Applications using MPI / SHMEM

• Distributed Deep Learning applications

Scalable High Performance Collective Offload

• Barrier, Reduce, All-Reduce, Broadcast

• Sum, Min, Max, Min-loc, Max-loc, OR, XOR, AND

• Integer and Floating-Point, 32 / 64 bit

[Diagram: a SHARP tree, with SHARP tree aggregation nodes and endnodes (processes running on HCAs) and the SHARP tree root]

Page 8

SHARP Tree

SHARP Operations are Executed by a SHARP Tree

• Multiple SHArP Trees are Supported

• Each SHArP Tree can handle Multiple Outstanding SHArP Operations

• Within a SHArP Tree, each Operation is Uniquely Identified by a SHArP-Tuple

- GroupID

- SequenceNumber

SHARP Tree is a Logical Construct

• Nodes in the SHArP Tree are IB Endnodes

• Logical tree defined on top of the physical underlying fabric

• SHArP Tree Links are implemented on top of the IB transport (Reliable Connection)

• Expected to follow the physical topology for performance but not required

[Diagram: the physical topology (switches/routers and HCAs) with one SHArP tree overlaid; legend: SHArP tree root, SHArP tree aggregation node, SHArP tree endnode (process running on HCA)]

Page 9

SHARP Tree (cont’d)

SHARP Tree is a Logical Construct

• Motivation to loosely follow the underlying Physical Topology for Performance

- Efficient use of Physical Link Bandwidth

- Shortest Overall Path Length from Leaves to Root

SHARP Group is a Tree Subset

• A sub-tree spanning a subset of the tree end-nodes

SHARP Tree Nodes are ‘Processes’

• Multiple SHARP Tree “Nodes” can reside on the same Physical HCA

[Diagram: the same SHArP tree shown as a logical view; legend: SHArP tree root, SHArP tree aggregation node, SHArP tree endnode (process running on HCA)]

Page 10

SHARP SW Overview

Release

| Version | MLNX OFED version     | Switch-IB 2 FW | HPC-X | UFM version |
|---------|-----------------------|----------------|-------|-------------|
| v1.0    | MLNX OFED 3.3-x.x.x   | 15.1100.0072   | 1.6   | -           |
| v1.1    | MLNX OFED 3.4-0.1.2.0 | 15.1200.0102   | 1.7   | -           |
| v1.2    | MLNX OFED 4.0-x.x.x   | 15.1200.0102   | 1.8   | 5.8         |

Prerequisites

• SHARP software modules are delivered as part of HPC-X

• Switch-IB 2 firmware 15.1200.0102 or later

• MLNX OS 3.6.1002 or later

• MLNX OFED 4.0

• MLNX OpenSM 4.7.0 or later (available with MLNX OFED 3.3-x.x.x or UFM 5.6)

Page 11

Mellanox HPC-X™ Scalable HPC Software Toolkit

Complete MPI, PGAS/OpenSHMEM and UPC package

Maximize application performance

For commercial and open source applications

Best out of the box experience

Page 12

Mellanox HPC-X - Package Contents

Allows fast and simple deployment of HPC libraries

Both Stable & Latest Beta are bundled

All libraries are pre-compiled

Includes scripts/module files to ease deployment

Package Includes

OpenMPI / OpenSHMEM

BUPC (Berkeley UPC)

UCX

MXM

FCA-2.5

FCA-3.x (HCOLL)

SHARP

KNEM

Allows fast intra-node MPI communication for large messages

Profiling Tools

Libibprof

IPM

Standard Benchmarks

OSU

IMB

Page 13

HPC-X/SHARP SW Architecture

HCOLL

• optimized collective library

libsharp_coll.so

• Implementation of the high-level SHARP API enabling SHARP collectives for MPI

• Uses the low-level libsharp.so API

libsharp.so

• Implementation of the low-level SHARP API

High level API

• Easy to use

• Easy to integrate with multiple MPIs (Open MPI, MPICH, MVAPICH*)

[Diagram: software stack — MPI (Open MPI) over HCOLL (libhcoll) over SHARP (libsharp/libsharp_coll) over the InfiniBand network]

Page 14

SHARP SW Architecture

[Diagram: Aggregation Nodes forming a tree in the fabric; compute nodes each running an MPI process and a SHARPD daemon; plus the Subnet Manager and the Aggregation Manager (AM)]

Page 15

SHARP SW Components

SHARP SW components:

• Libs

libsharp.so (low-level API)

libsharp_coll.so (high-level API)

• Daemons (sysadmin)

sharpd, sharp_am

• Scripts

sharp_benchmark.sh

sharp_daemons_setup.sh

• Utilities

sharp_coll_dump_config

sharp_hello

sharp_mpi_test

• Public API

sharp_coll.h

Page 16

SHARP: SHARP Daemons

sharp_am: Aggregation Manager daemon

• Runs on the same node as the Subnet Manager

• Resource manager

sharpd: SHARP daemon

• Runs on the compute nodes

• Lightweight process

• Almost 0% CPU usage

• Control path only

Page 17

SHARP: Configuring Subnet Manager

Edit the opensm.conf file.

Set the parameter “sharp_enabled” to “2” (i.e., add the line “sharp_enabled 2”).

Run OpenSM with the configuration file.

• % opensm -F <opensm configuration file> -B

To verify that the Aggregation Nodes were activated by OpenSM, run "ibnetdiscover". For example:

vendid=0x0

devid=0xcf09

sysimgguid=0x7cfe900300a5a2a0

caguid=0x7cfe900300a5a2a8

Ca 1 "H-7cfe900300a5a2a8" # "Mellanox Technologies Aggregation Node"

[1](7cfe900300a5a2a8) "S-7cfe900300a5a2a0"[37] # lid 256 lmc 0 "MF0;sharp2:MSB7800/U1" lid 512 4xFDR

Page 18

SHARP: Configuring Aggregation Manager

Create a configuration directory for the future SHArP configuration file.

• % mkdir $HPCX_SHARP_DIR/conf

Create the fabric.lst file.

• Copy the subnet LST file created by the Subnet Manager to the AM’s configuration files directory

• Rename it to fabric.lst :

- % cp /var/log/opensm-subnet.lst $HPCX_SHARP_DIR/conf/fabric.lst

Create root GUIDs file.

• Copy the root_guids.conf file (if one was used to configure the Subnet Manager) to $HPCX_SHARP_DIR/conf/root_guid.cfg, or

• Identify the root switches of the fabric and create a file with their node GUIDs.

• For example, with two root switches the file contains:

0x0002c90000000001

0x0002c90000000008

Create the sharp_am.conf file:

% cat > $HPCX_SHARP_DIR/conf/sharp_am.conf << EOF
fabric_lst_file $HPCX_SHARP_DIR/conf/fabric.lst
root_guids_file $HPCX_SHARP_DIR/conf/root_guid.cfg
ib_port_guid <PortGUID of the relevant HCA port or 0x0>
EOF

Page 19

SHARP: Running SHARP Daemons

Set up the daemons

• $HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh

Usage

• Usage: sharp_daemons_setup.sh <-s> <-r> [-p SHArP location dir] <-d daemon> <-m>

-s - Setup SHArP daemon

-r - Remove SHArP daemon

-p - Path to alternative SHArP location dir

-d - Daemon name (sharpd or sharp_am)

-m - Add monit capability for daemon control

• $HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh -s $HPCX_SHARP_DIR -d sharp_am

• % service sharp_am start

Page 20

SHARP: Running SHARP Daemons

sharp_am

• % $HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh -s $HPCX_SHARP_DIR -d sharp_am

• %service sharp_am start

• Log : /var/log/sharp_am.log

sharpd

• Conf file: $HPCX_SHARP_DIR/conf/sharpd.conf

ib_dev <relevant_hca:port>

sharpd_log_level 2

• % pdsh -w <hostlist> $HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh -s $HPCX_SHARP_DIR -d sharpd

• %pdsh -w jupiter[001-032] service sharpd start

• Log : /var/log/sharpd.log

Page 21

SHARP resource (quota) handling

Job Quota

• Max groups

Disable SHARP on the communicator if it fails to create a SHARP group

• Max OSTs

1 for blocking collectives

> 1 for non-blocking collectives & fragmentation

• Data size per OST

Maximum SHARP-supported payload (256 B)

• Max QPs

Need at least #PPN QPs/node

Efficient resource allocation is a hard problem

• Application profiling might help

• Per group

- Max OSTs

- Data size per OST

Page 22

MPI Collective offloads using SHARP

Enabled through FCA-3.x (HCOLL)

Flags

• HCOLL_ENABLE_SHARP (default: 0)

0 - Don't use SHArP

1 - probe SHArP availability and use it

2 - Force to use SHArP

3 - Force to use SHArP for all MPI communicators

4 - Force to use SHArP for all MPI communicators and for all supported collectives

• HCOLL_SHARP_NP (default: 2) – number-of-nodes (node leaders) threshold in a communicator for creating a SHArP group and using SHArP collectives

• SHARP_COLL_LOG_LEVEL

0 – fatal , 1 – error, 2 – warn, 3 – info, 4 – debug, 5 – trace

• HCOLL_BCOL_P2P_ALLREDUCE_SHARP_MAX

- Maximum allreduce size run through SHArP

Page 23

MPI Collectives offloads using SHARP

Resources (quota)

• SHARP_COLL_JOB_QUOTA_MAX_GROUPS

- #communicators

• SHARP_COLL_JOB_QUOTA_OSTS

- Parallelism on communicator

• SHARP_COLL_JOB_QUOTA_PAYLOAD_PER_OST

- Payload/OST

For complete list of SHARP COLL tuning options

- $HPCX_SHARP_DIR/bin/sharp_coll_dump_config -f


Page 24

MPI Collectives offloads using SHARP

$ mpirun --display-map --bind-to core --map-by node -np 32 -mca pml yalla \
    -mca btl_openib_warn_default_gid_prefix 0 -mca rmaps_dist_device mlx5_0:1 \
    -mca rmaps_base_mapping_policy dist:span -x MXM_RDMA_PORTS=mlx5_0:1 \
    -x HCOLL_MAIN_IB=mlx5_0:1 -x HCOLL_ENABLE_SHARP=1 \
    -x HCOLL_BCOL_P2P_ALLREDUCE_SHARP_MAX=4096 -x SHARP_COLL_LOG_LEVEL=3 \
    -x HCOLL_ENABLE_MCAST_ALL=0 -x SHARP_COLL_JOB_QUOTA_PAYLOAD_PER_OST=128 \
    /home/user02/hpcx-v1.7.405-gcc-MLNX_OFED_LINUX-3.4-1.0.0.0-redhat6.5-x86_64/ompi-v1.10/tests/osu-micro-benchmarks-5.2/osu_allreduce -i 10000 -x 1000 -m 256

# OSU MPI Allreduce Latency Test v5.2

# Size Avg Latency(us)

4 2.01

8 2.58

16 2.04

32 2.59

64 2.10

128 2.79

256 2.69


Page 25

SHARP API

Page 26

SHARP API

Datatypes

• 32 bit – INT/UINT/FLOAT

• 64 bit – LONG / Unsigned LONG / DOUBLE

Reduce Ops

• MIN

• MAX

• SUM

• LAND

• LOR

• BOR

• LXOR

• BXOR

• MAXLOC

• MINLOC

• PROD ( Not supported)

Page 27

SHARP API

Control Path

• SHARP init / finalize

• SHARP comm (group) create / destroy

Data path

• Collective operations

- Allreduce

- Barrier

- Bcast (not supported yet)

- Reduce (not supported yet)

• Blocking & Non-Blocking

Page 28

SHARP API

[Diagram: inside libsharp_coll.so — the SHARP API split into a control path and a data path, with a request pool, memory pool (mpool), and utilities (timer, env, data structures), layered on top of the low-level SHARP API in libsharp.so]

Page 29

SHARP API – Control path

SHARP job initialize

• API

int sharp_coll_init(struct sharp_coll_init_spec *sharp_coll_spec,
                    struct sharp_coll_context **sharp_coll_context);

• Code flow in HPC-X

MPI_Init() → hcoll_init() → sharp_coll_init()

• Input spec

struct sharp_coll_init_spec {
    int job_id;                      /* MPI Job ID */
    char *hostlist;                  /* optional in job scheduler case */
    int world_rank;
    int world_size;
    int (*progress_func)(void);      /* External progress func */
    struct sharp_coll_config config; /* SHARP COLL Configuration. See `sharp_coll_default_config' */
    struct sharp_coll_out_of_band_colls oob_colls;
};
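As a rough illustration only (not code from the slides), the sketch below shows how an MPI layer might fill the init spec and call sharp_coll_init. The job ID value, the 0-means-success return convention, and assigning sharp_coll_default_config directly are assumptions to check against sharp_coll.h, and the out-of-band bootstrap collectives are left out:

#include <mpi.h>
#include "sharp_coll.h"   /* public SHARP COLL API header */

struct sharp_coll_context *init_sharp(MPI_Comm comm)
{
    struct sharp_coll_init_spec spec = {0};
    struct sharp_coll_context *ctx = NULL;
    int rank, size;

    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    spec.job_id        = 12345;                     /* any job-unique ID, e.g. from the scheduler */
    spec.hostlist      = NULL;                      /* optional when a job scheduler is used */
    spec.world_rank    = rank;
    spec.world_size    = size;
    spec.progress_func = NULL;                      /* external progress function, if any */
    spec.config        = sharp_coll_default_config; /* assumed: default SHARP COLL configuration */
    /* spec.oob_colls = ...;  bootstrap (out-of-band) collectives must be provided by the caller */

    if (sharp_coll_init(&spec, &ctx) != 0)          /* assumed: 0 == success */
        return NULL;                                /* fall back to host-based collectives */
    return ctx;
}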

Page 30

High level API – Control path..

SHARP job finalize

• API

void sharp_coll_finalize(struct sharp_coll_context *context);

• Code flow in HPC-X

MPI_Finalize() → hcoll_finalize() → sharp_coll_finalize()

Page 31

High level API – Control path…

SHARP comm (group) create

• A SHARP comm is created for the HCOLL ptp subgroup

• API

int sharp_coll_comm_init(struct sharp_coll_context *context,
                         struct sharp_coll_comm_init_spec *spec,
                         struct sharp_coll_comm **sharp_coll_comm);

• Input spec

struct sharp_coll_comm_init_spec {
    int rank;            /* MPI rank */
    int size;            /* Communicator size */
    int is_comm_world;   /* is MPI_COMM_WORLD */
    void *oob_ctx;       /* External context for OOB functions */
};

• Code flow in HPC-X

MPI_Comm_create() → hcoll_create_context() → sharp_coll_comm_init()
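A short sketch of the corresponding call for a communicator, again only an assumed usage pattern (the 0-means-success convention is not stated in the slides); oob_ctx is whatever handle the caller's OOB functions expect:

#include <mpi.h>
#include "sharp_coll.h"

struct sharp_coll_comm *create_sharp_comm(struct sharp_coll_context *ctx,
                                          MPI_Comm comm, void *oob_ctx)
{
    struct sharp_coll_comm_init_spec spec = {0};
    struct sharp_coll_comm *scomm = NULL;
    int cmp;

    MPI_Comm_rank(comm, &spec.rank);
    MPI_Comm_size(comm, &spec.size);
    MPI_Comm_compare(comm, MPI_COMM_WORLD, &cmp);
    spec.is_comm_world = (cmp == MPI_IDENT);
    spec.oob_ctx       = oob_ctx;       /* handed back to the OOB callbacks */

    if (sharp_coll_comm_init(ctx, &spec, &scomm) != 0)   /* assumed: 0 == success */
        return NULL;                    /* e.g. group quota exhausted: fall back to HCOLL */
    return scomm;
}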

Page 32

High level API – Control path…

SHARP comm (group) destroy

• The SHARP comm created for the HCOLL ptp subgroup is destroyed when the communicator is freed

• API

void sharp_coll_comm_destroy(struct sharp_coll_comm *comm);

• Code flow in HPC-X

MPI_Comm_free() → hcoll_destroy_context() → sharp_coll_comm_destroy()

Page 33

High level API – data path

Barrier

int sharp_coll_do_barrier(struct sharp_coll_comm *comm);

Allreduce

int sharp_coll_do_allreduce(struct sharp_coll_comm *comm, struct sharp_coll_reduce_spec *spec);

- Reduce spec:

- Input Buffer types

Contig Buffer

Stream (not supported yet)

IOV (not supported yet)

struct sharp_coll_reduce_spec {
    int root;                              /* root rank number (ignored for allreduce) */
    struct sharp_coll_data_desc sbuf_desc; /* data to submit */
    struct sharp_coll_data_desc rbuf_desc; /* buffer to receive the result */
    enum sharp_datatype dtype;             /* data type */
    int length;                            /* reduce operation size */
    enum sharp_reduce_op op;               /* operator */
};
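A hedged sketch of a blocking allreduce through this API. The slides do not show the sharp_coll_data_desc layout or the enum value names, so fill_data_desc, SHARP_DTYPE_FLOAT and SHARP_OP_SUM below are placeholders to be matched against sharp_coll.h, and length is assumed to be an element count:

#include "sharp_coll.h"

/* Sketch only: sum `count` floats across the SHARP group with a blocking allreduce. */
int sharp_sum_floats(struct sharp_coll_comm *scomm,
                     const float *sendbuf, float *recvbuf, int count)
{
    struct sharp_coll_reduce_spec spec = {0};

    /* sbuf_desc / rbuf_desc describe the contiguous send/receive buffers;
     * fill_data_desc is a hypothetical helper — the real descriptor fields
     * are defined in sharp_coll.h and not reproduced in these slides. */
    fill_data_desc(&spec.sbuf_desc, (void *)sendbuf, count * sizeof(float));
    fill_data_desc(&spec.rbuf_desc, recvbuf, count * sizeof(float));

    spec.dtype  = SHARP_DTYPE_FLOAT;   /* assumed enum name for 32-bit FLOAT */
    spec.length = count;               /* reduce operation size (element count assumed) */
    spec.op     = SHARP_OP_SUM;        /* assumed enum name for SUM */
    spec.root   = 0;                   /* ignored for allreduce */

    return sharp_coll_do_allreduce(scomm, &spec);   /* assumed: 0 == success */
}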

Page 34

High level API – data path..

Non-blocking API

• Returns a request handle

• User can test/wait for completion

• API

- int sharp_coll_do_barrier_nb(struct sharp_coll_comm *comm, struct sharp_coll_request **req);

- int sharp_coll_do_allreduce_nb(.., .., struct sharp_coll_request **req);

- int sharp_coll_req_test(struct sharp_coll_request *req);

- int sharp_coll_req_wait(struct sharp_coll_request *req);

- int sharp_coll_req_free(struct sharp_coll_request *req);
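A sketch of the non-blocking flavor, overlapping the barrier with local work; the return-value conventions (0 == success for the start call, nonzero from sharp_coll_req_test once complete) are assumptions, and do_some_local_work is a hypothetical application routine:

#include "sharp_coll.h"

int overlapped_barrier(struct sharp_coll_comm *scomm)
{
    struct sharp_coll_request *req = NULL;
    int done = 0;

    if (sharp_coll_do_barrier_nb(scomm, &req) != 0)   /* assumed: 0 == success */
        return -1;

    for (int i = 0; i < 100 && !done; i++) {
        done = sharp_coll_req_test(req);   /* assumed: nonzero once the barrier completed */
        if (!done)
            do_some_local_work();          /* hypothetical: overlap computation */
    }
    if (!done)
        sharp_coll_req_wait(req);          /* block until completion */

    sharp_coll_req_free(req);
    return 0;
}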

Page 35

Summary

Scalable In-network computing

Easy to use

Easy to integrate API

Page 36

Thank You