28
Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer (University of Florida, DELL EMC, Intel Corporation) The 9 th Workshop on Big Data Benchmarks, Performance Optimization and Emerging Hardware | ASPLOS 2018 March 25, 2018

Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

Using FPGAs as MicroservicesDavid Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

(University of Florida, DELL EMC, Intel Corporation)

The 9th Workshop on Big Data Benchmarks, Performance Optimization and Emerging Hardware | ASPLOS 2018

March 25, 2018

Page 2: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

Contents

• Motivation

• Challenges of FPGA integration

• Proposed solutions

• Case study with Spark

• Key design elements

• Conclusions & future work

2D. Ojika

Page 3: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

Motivation

• FPGAs are increasingly being used in low-latency / high throughput computing (e.g., big data, machine learning)

• Cloud service providers (CSPs) continue to introduce FPGAs in their datacenters (e.g., Amazon, Microsoft) to meet computing demands

• Emerging cloud computing trends (microservices, serverless) show promise in hyperscale datacenters, increased developer productivity

3D. Ojika

Cloud

Developer

Page 4: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

FPGA + Microservices ?

4D. Ojika

Page 5: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

Overview of FPGAs

• FPGA: Field-programable gate array

• Contemporary FPGAs architectures incorporate more advanced resources:• Multiply-accumulate (MAC)

• Off-chip memory controllers

• High-speed serial transceivers

• Embedded, distributed memories

• Phase-locked loops (PLLs)

• Up to 2 million logic cells 5D. Ojika

CLB: configurable logic block

• A reconfigurable chip composed of CLBs

• Basic structure consists of: • Look-up table (LUT), Flip-Flop (FF), Routing wires, I/O pads

Page 6: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

Overview of FPGAs

• FPGA: Field-programable gate array

• Contemporary FPGAs architectures incorporate more advanced resources:• Multiply-accumulate (MAC)

• Off-chip memory controllers

• High-speed serial transceivers

• Embedded, distributed memories

• Phase-locked loops (PLLs)

• Up to 2 million logic cells

Key benefits: Flexibility, Power, Performance

6D. Ojika

CLB: configurable logic block

• A reconfigurable chip composed of CLBs

• Basic structure consists of: • Look-up table (LUT), Flip-Flop (FF), Routing wires, I/O pads

Page 7: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

What are Microservices

• Microservices: a collection of loosely-coupled software services

- Unlike Monolithic, provide lightweight (shared) services

- Scale consistently with changing workloads

• Our approach: use Microservices (as collection of loosely-coupled FPGA accelerator functions) to support FPGAs as “software services”

Monolith / Layered Microservice / Disaggregated

7D. Ojika

Page 8: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

What are Microservices

• Microservices: a collection of loosely-coupled software services

- Unlike Monolithic, provide lightweight (shared) services

- Scale consistently with changing workloads

• Our approach: use Microservices (as collection of loosely-coupled FPGA accelerator functions) to support FPGAs as “software services”

Monolith / Layered Microservice / Disaggregated

Key benefits: scalability, fault-tolerance, dynamic configuration, auto-deployment

8D. Ojika

Page 9: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

Challenges of FPGA integration

9D. Ojika

Page 10: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

1. Intricate software interface

• JVM-to-FPGAinterfacing is cumbersome

• How to keep FPGA accelerator fully-utilized

Expose all FPGA functions ?

10D. Ojika

Page 11: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

2. Non-trivial FPGA sharing and data movement

• FPGA and CPU thread co-existence is complicated

• Data transfer overheads can hurt performance

FPGA mgt. & data movement ?

11D. Ojika

Page 12: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

3. FPGA reconfiguration

• Reconfiguration can take milliseconds to a few seconds

• Certain applications may be intolerable to downtime

FPGA CPUApp

CPU fallback ?

Hiding reconfiguration time

12D. Ojika

Page 13: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

4. Challenging programming model

• Requirement on hardware-specific knowledge

• Long synthesis (compile) time

Programmer/ Developer

FPGA

hours of development / design iterations

HLS HDL binary

Programmer vs Developer ?

13D. Ojika

Page 14: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

Proposed Solution

• FaaM: a runtime framework for deploying FPGA accelerators as microservices

• Leverage optimized hardware libraries

• Share accelerator across threads, applications, users

• Platform-agnostic (OpenCL, etc.)

• Seamless integration with Java, support wide range of apps/frameworks

• Takes care of accelerator management (init./setup, buffer mgt., etc.)

14D. Ojika

Page 15: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

FPGA Microservices Overview

FaaM

FPGALibrary

Accelerator Manager

/w HW-accelerated function

15D. Ojika

Page 16: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

Experimenting with Spark

Data Shuffle/Sort in Spark

1. Map – convert raw input into key/value pairs. Output to memory (or disk)

2. Shuffle/Sort – All reduces retrieve all records from all mappers over network

3. Reduce – For each distinct key, do ‘something’ with all the corresponding values. Output to disk

…data movement can hurt performance

16D. Ojika

Page 17: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

Experiment Setup

Spark Worker

(Xeon+FPGA Server)

CPU FPGA

• Platform: Xeon+FPGA with Arria10 FPGA

• Hardware-accelerated function: DEFLATE

compression algorithm (level 9)

• Workload: Sort

• Evaluation technique: Focus on

compression of Spark Output RDDs

• With FaaM (FPGA Microservice):

• No change to application code

• Only update Spark config. files

17D. Ojika

Page 18: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

Performance Results

3.2X speedup4X memory saving

Functional Evaluation Performance Evaluation

FPGA as “drop-in replacement” via FaaM

18D. Ojika

Page 19: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

+ 2X improvement

*CPU used only as baseline.

System-level optimization #1: enable RDD caching

19D. Ojika

RDD: Resilient Distributed Disk

Page 20: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

System-level optimization #2: increase CPU count

Increased FPGA accelerator utilization, resulting in improved application performance

20D. Ojika

Page 21: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

Key design elements

21D. Ojika

Page 22: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

1. FPGA accelerator abstraction

Driver

AFU: Accelerator Functional Unit

Arria 10 FPGA

22

D. Ojika

Page 23: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

1. FPGA accelerator abstraction

Driver

Library

AFU: Accelerator Functional Unit

Arria 10 FPGA

New

23

D. Ojika

Page 24: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

2. Task scheduler

Task scheduler

Library

• JVM and native memory data transfers

can incur significant overhead

• Leverage non-blocking I/O; access

data in streaming fashion

• No garbage collection

• Buffer re-use among groups of SW

threads

• Hde data transfer latency by

overlapping comm. with comp. 24D. Ojika

Disk

Pinned buffer

(NIO)

FPGA-to-Host Memory Communication

Page 25: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

3. JVM runtime system

Task scheduler

Library

Runtime system

CPU-FPGA Thread Co-existence

25D. Ojika

• Keep FPGA busy all time with tasks

• Stateless

• Accelerator management, (buffer

size, int./clean-up, etc.)

Page 26: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

Buffer R/W performance

Sweet spot – resulting in almost

no performance loss (next slide)

for the runtime system.

26D. Ojika

Page 27: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

Compression: Raw Performance

Functional Evaluation Performance Evaluation

27D. Ojika

almost equalequal

Page 28: Using FPGAs as Microservices - University of Florida · 2018-03-30 · Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer

Conclusions & Future Work

• FPGAs compliment CPUs in certain compute-intensive task– 3.2X speedup, 4X memory footprint reduction in Spark (single-node deployment!)

– Potentially increased benefit in larger-scale scenarios

• FPGA accelerators can leverage microservices architecture – Key to efficiency is CPU-memory-FPGA interaction; accelerator management

– Optimizations to both low-level and high-level stack help improve performance

• Future work:– Performance evaluation with cloud and machine learning workloads

28D. Ojika