EIE: Efficient Inference Engine on Compressed Deep Neural Network

Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark Horowitz, Bill Dally

Stanford University

Jun 20 2016

Slides mostly from Han’s presentation

Presented by Sihang Liu




Background: Hardware for Deep Neural Networks

CPU, GPU, Tensor Processing Unit (TPU)


Background: Speedup from GPU

Source: nvidia.com


Deep Learning on Mobile


Difficulty?

Model Size!

Motherboard of a smartphone


Deep Compression

Problem: DNN model too large

Solution: Deep Compression


Deep Compression (cont.)


Pruning AlexNet & VGGNet

AlexNet

VGG-16
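
A minimal sketch of what pruning means here, assuming simple magnitude-based pruning: weights whose absolute value falls below a threshold are set to zero. The function names, the toy matrix, and the threshold value are illustrative assumptions, not figures from the slides, and the retraining that normally follows pruning is omitted.

```python
def prune_by_magnitude(weights, threshold):
    """Zero out every weight whose absolute value is below `threshold`.

    `threshold` is a hypothetical hyperparameter chosen for illustration.
    """
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def sparsity(weights):
    """Return the fraction of entries that are exactly zero."""
    total = sum(len(row) for row in weights)
    zeros = sum(1 for row in weights for w in row if w == 0.0)
    return zeros / total


# Toy 2x4 weight matrix (made-up values).
W = [[0.8, -0.05, 0.0, 0.3],
     [0.02, -0.6, 0.1, -0.01]]

pruned = prune_by_magnitude(W, threshold=0.2)
print(pruned)
print(sparsity(pruned))
```

The pruned matrix keeps only the large-magnitude weights, which is what makes a sparse storage format worthwhile afterwards.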


Deep Compression (cont.)

● No loss of accuracy
● 120X less energy


Accelerator for Compressed Sparse Neural Network

Problem: Irregular Computation Pattern

Solution: EIE accelerator
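
The irregular pattern can be seen in a small sparse matrix-vector product. The sketch below stores a pruned weight matrix in plain compressed sparse column (CSC) form and skips zero input activations. It is an illustration under assumptions (plain CSC, toy data, hypothetical function names), not the encoding or datapath of the EIE hardware itself.

```python
def dense_to_csc(matrix):
    """Convert a dense row-major matrix to CSC arrays (values, row_idx, col_ptr)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    values, row_idx, col_ptr = [], [], [0]
    for j in range(n_cols):
        for i in range(n_rows):
            if matrix[i][j] != 0:
                values.append(matrix[i][j])
                row_idx.append(i)
        # col_ptr[j + 1] marks where column j ends in `values`.
        col_ptr.append(len(values))
    return values, row_idx, col_ptr


def csc_matvec(values, row_idx, col_ptr, x, n_rows):
    """Compute y = W @ x, touching only nonzero weights and nonzero inputs."""
    y = [0.0] * n_rows
    for j, xj in enumerate(x):
        if xj == 0:
            continue  # a zero activation contributes nothing; skip its column
        for k in range(col_ptr[j], col_ptr[j + 1]):
            # Which output row gets updated depends on the data, not on k:
            # this data-dependent scatter is the irregular access pattern.
            y[row_idx[k]] += values[k] * xj
    return y


# Toy 4x3 sparse weight matrix and a sparse input vector (made-up values).
W = [[0, 2, 0],
     [1, 0, 0],
     [0, 0, 3],
     [0, 4, 0]]
x = [1, 0, 2]

values, row_idx, col_ptr = dense_to_csc(W)
y = csc_matvec(values, row_idx, col_ptr, x, n_rows=len(W))
print(values, row_idx, col_ptr)
print(y)
```

The inner loop length and the rows it writes to vary from column to column, which is why this workload maps poorly onto hardware tuned for dense, regular arithmetic.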


Distributed Storage and Processing
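
One way to picture distributed storage and processing is to interleave the rows of the weight matrix across processing elements (PEs), so that each PE stores and computes only its own slice of the output. The sketch below assumes row i is assigned to PE i mod N and simulates the PEs one after another; the function names and toy data are hypothetical.

```python
def partition_rows(matrix, num_pes):
    """Assign row i to PE (i mod num_pes); return one list of (row_index, row) per PE."""
    parts = [[] for _ in range(num_pes)]
    for i, row in enumerate(matrix):
        parts[i % num_pes].append((i, row))
    return parts


def distributed_matvec(matrix, x, num_pes):
    """Compute y = W @ x with each simulated PE producing only its own output rows."""
    y = [0.0] * len(matrix)
    for pe_rows in partition_rows(matrix, num_pes):
        # Every PE sees the same input vector x but only its local rows of W.
        for i, row in pe_rows:
            y[i] = float(sum(w * xj for w, xj in zip(row, x)
                             if w != 0 and xj != 0))
    return y


# Toy 4x3 sparse weight matrix and input vector (made-up values).
W = [[0, 2, 0],
     [1, 0, 0],
     [0, 0, 3],
     [0, 4, 0]]
x = [1, 0, 2]

parts = partition_rows(W, num_pes=2)
print([[i for i, _ in pe] for pe in parts])
print(distributed_matvec(W, x, num_pes=2))
```

Because each output row belongs to exactly one PE, no two PEs ever write the same result, so the partial computations need no merging step in this sketch.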


PE Architecture


Benchmark

• CPU: Intel Core-i7 5930k
• GPU: NVIDIA Titan X
• Mobile GPU: NVIDIA Jetson TK1


Scalability


Prediction Accuracy


Comparison: Throughput