  • Energy-Efficient Face Detection Using Andes RISC-V Processor

    Presenter: Chien-Hao Chen

    Advisor: Prof. Chen-Yi Lee

    Date: 2018/03/12


  • Outline

    • Introduction

    • Face Detector on Andes Processor

    • Experiment Result

    • Conclusion

    • Reference

  • Outline

    • Introduction

      − Motivation

      − Face Detection Model

    • Face Detector on Andes Processor

    • Experiment Result

    • Conclusion

    • Reference

  • Motivation

    • Cloud computing

    – Image uploaded to the cloud → processing → result returned

    • Edge computing

    – Image computed directly on the device → processing → result returned

  • Face Detection Model: MTCNN, 2016 [1]

    1. Resize the image and sample it with a sliding window

    2. P-Net (Proposal): find candidate bounding boxes

    3. R-Net (Refine): reject false candidates from P-Net

    4. O-Net (Output): refine the R-Net candidates into more accurate face regions

    [Figure: P-Net → R-Net → O-Net cascade. Image from "Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks", IEEE Signal Processing Letters (SPL), vol. 23, no. 10, pp. 1499-1503, 2016]
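Step 1 above (resizing plus sliding-window sampling) is often realized as an image pyramid. The sketch below only computes the resize factors. The function name `pyramid_scales` is hypothetical, the 12-pixel window and the 25-pixel minimum face size are taken from later slides of this deck, and the shrink factor 0.709 is an assumption borrowed from common public MTCNN implementations rather than from this presentation.

```python
def pyramid_scales(height, width, min_face=25, window=12, factor=0.709):
    """Return the resize factors for an image pyramid.

    min_face: smallest face side to detect, in pixels (25 as in the experiments).
    window:   side of the sliding window (12, the filter size used in the slides).
    factor:   shrink ratio between pyramid levels (assumed, not from the slides).
    """
    scales = []
    scale = window / min_face              # smallest face maps onto one window
    shortest = min(height, width) * scale  # shortest image side after resizing
    while shortest >= window:              # stop once the window no longer fits
        scales.append(scale)
        scale *= factor
        shortest *= factor
    return scales


scales = pyramid_scales(240, 320)
print(len(scales), scales[0])
```

Each returned factor corresponds to one resized copy of the image that P-Net scans, so a larger `min_face` directly reduces the number of levels and the work per frame.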

  • Face Detection Model

    • P-Net (Proposal):

      – Fully convolutional: 3 convolution layers and 1 max-pooling layer

      – Rough proposals

    • R-Net (Refine):

      – 3 convolution layers, 2 max-pooling layers and 1 fully connected layer

      – Rejects false proposals from P-Net

    • O-Net (Output):

      – 4 convolution layers, 3 max-pooling layers and 1 fully connected layer

      – More complex model

        → Rejects false results from R-Net

        → Better face bounding box positions

    [Figure: P-Net, R-Net and O-Net architectures. Image from the MTCNN paper [1]]
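The cascade idea (each network only sees what the previous one let through) can be illustrated with a toy filter. This is a minimal sketch, not the presenter's code: `run_cascade` and the score dictionaries are hypothetical stand-ins for the three networks, the P-Net and R-Net thresholds (0.6, 0.7) are the ones quoted in the experiment slides, and the O-Net threshold is assumed. Bounding-box regression and non-maximum suppression are left out.

```python
def run_cascade(candidates, stages):
    """Filter candidates through (score_fn, threshold) stages, cheapest first."""
    for score_fn, threshold in stages:
        candidates = [c for c in candidates if score_fn(c) > threshold]
        if not candidates:      # nothing left for the costlier stages
            break
    return candidates


# Toy candidates with made-up scores standing in for network outputs.
windows = [
    {"id": 0, "p": 0.90, "r": 0.80, "o": 0.95},
    {"id": 1, "p": 0.50, "r": 0.90, "o": 0.90},   # rejected by P-Net
    {"id": 2, "p": 0.70, "r": 0.60, "o": 0.90},   # rejected by R-Net
]
stages = [
    (lambda c: c["p"], 0.6),   # P-Net threshold from the experiment slides
    (lambda c: c["r"], 0.7),   # R-Net threshold from the experiment slides
    (lambda c: c["o"], 0.7),   # O-Net threshold: assumed, not given in the slides
]
kept = run_cascade(windows, stages)
print([c["id"] for c in kept])
```

Because most windows are discarded by the small P-Net, the larger R-Net and O-Net run on only a small fraction of the image.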

  • Outline

    • Introduction

    • Face Detector on Andes Processor

      − Hardware environment

      − Model Simplification and Acceleration

    • Experiment Result

    • Conclusion

    • Reference

  • Hardware environment

    • Andes RISC-V:

      − Processor: 60 MHz, 64-bit AndesCore

      − FPGA: Xilinx Kintex-7 XC7K410T

      − DRAM: 1 GB

      − Flash: 64 MB


  • Model Simplification and Acceleration: Model Simplification

    • Depth-wise separable convolution [3]
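To see why depth-wise separable convolution simplifies the model, it helps to count multiply-accumulate operations (MACs). The sketch below assumes the usual formulation: a per-channel k × k depth-wise filter followed by a 1 × 1 point-wise convolution. The function names and the 10 × 10 feature-map size are illustrative assumptions; the channel counts 16 and 32 appear on the depth-wise MTCNN slide.

```python
def standard_conv_macs(h, w, k, c_in, c_out):
    """MACs of a standard k x k convolution producing an h x w x c_out output."""
    return h * w * k * k * c_in * c_out


def depthwise_separable_macs(h, w, k, c_in, c_out):
    """MACs of a depth-wise k x k convolution followed by a 1 x 1 point-wise one."""
    depthwise = h * w * k * k * c_in      # one k x k filter per input channel
    pointwise = h * w * c_in * c_out      # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


# Illustrative layer: 3 x 3 kernel, 16 -> 32 channels, 10 x 10 output (assumed size).
std = standard_conv_macs(10, 10, 3, 16, 32)
sep = depthwise_separable_macs(10, 10, 3, 16, 32)
ratio = sep / std                          # equals 1/c_out + 1/k**2
print(std, sep, round(ratio, 4))
```

The ratio depends only on the kernel size and the number of output channels, so for 3 × 3 kernels the saving approaches a factor of 9 as the layer gets wider.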

  • Model Simplification and Acceleration: Depth-wise MTCNN

    • P-Net (Proposal):

      – Fully convolutional

      – 1 convolution layer: stride = 2 (channels: 10)

      – 2 DW convolution layers: stride = 1 (channels: 16, 32)

    • R-Net (Refine):

      – 1 convolution layer: stride = 2

      – 1 DW convolution layer: stride = 2

      – 1 DW convolution layer: stride = 1

      – 1 fully connected layer

    • O-Net (Output):

      – 1 convolution layer: stride = 2

      – 2 DW convolution layers: stride = 2

      – 2 convolution layers: stride = 1 (channels: 128, 128)

      – 1 fully connected layer
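The stride pattern listed for the simplified P-Net can be checked by propagating the spatial size layer by layer. The sketch assumes 3 × 3 kernels and no padding in every layer, which the slide does not state; under that assumption a 12 × 12 window collapses to a single output, and a 240 × 320 image gives the 115 × 155 map quoted later in the deck. `conv_out` and `p_net_out` are hypothetical helper names.

```python
def conv_out(size, kernel, stride, padding=0):
    """Spatial output size of one convolution layer along one axis."""
    return (size - kernel + padding) // stride + 1


# Strides of the simplified P-Net: 1 convolution (stride 2), 2 DW convolutions (stride 1).
P_NET_STRIDES = [2, 1, 1]


def p_net_out(size, kernel=3):
    """Propagate one spatial dimension through P-Net (3 x 3 kernels assumed)."""
    for stride in P_NET_STRIDES:
        size = conv_out(size, kernel, stride)
    return size


print(p_net_out(12), p_net_out(240), p_net_out(320))
```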

  • Model Simplification and Acceleration: Soft-max Approximation

    • Motivation

    • Example: for a P-Net input of size 240 × 320, output 1 has size 115 × 155 × 2 and output 2 has size 115 × 155 × 4, since

    $$H_{out} = \frac{H_{in} - H_{filter} + Padding}{Stride} + 1 = \frac{240 - 12 + 0}{2} + 1 = 115$$

    • Soft-max:

    $$\sigma(x, y) = \left( \frac{e^{x}}{e^{x} + e^{y}},\ \frac{e^{y}}{e^{x} + e^{y}} \right)$$

    → 6 exponentials and 2 divisions per output position

    • For the soft-max of output 1:

    → 115 × 155 × 6 = 106,950 (≈ 107k) exponentials

    → 115 × 155 × 2 = 35,650 (≈ 36k) divisions
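The arithmetic on this slide can be reproduced in a few lines. The sketch applies the output-size formula with the slide's values (filter size 12, stride 2, no padding) to both image dimensions and then counts the soft-max operations for output 1, using the slide's figure of 6 exponentials and 2 divisions per position. The helper name `output_size` is hypothetical.

```python
def output_size(size_in, size_filter=12, padding=0, stride=2):
    """H_out = (H_in - H_filter + Padding) / Stride + 1, with integer division."""
    return (size_in - size_filter + padding) // stride + 1


h_out = output_size(240)            # rows of the P-Net output map
w_out = output_size(320)            # columns of the P-Net output map
positions = h_out * w_out           # one 2-way soft-max per position

exponentials = positions * 6        # 6 exponentials per soft-max (slide's count)
divisions = positions * 2           # 2 divisions per soft-max

print(h_out, w_out, exponentials, divisions)
```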

  • Model Simplification and Acceleration: Soft-max Approximation

    • $\sigma(x, y) = \left( \frac{e^{x}}{e^{x} + e^{y}},\ \frac{e^{y}}{e^{x} + e^{y}} \right)$, with the first component compared against a threshold $P$:

    $$
    \begin{aligned}
    \frac{e^{x}}{e^{x} + e^{y}} &> P \\
    e^{x} &> P e^{x} + P e^{y} \\
    (1 - P)\, e^{x} &> P e^{y} \\
    \ln(1 - P) + x &> \ln P + y \\
    x &> \ln\left( \frac{P}{1 - P} \right) + y
    \end{aligned}
    $$

    where $\ln\left( \frac{P}{1 - P} \right)$ is a constant.

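  • Model Simplification and Acceleration: Soft-max Approximation

    • Example with $P = 0.7$: the decision boundary

    $$\frac{e^{x}}{e^{x} + e^{y}} = 0.7$$

    is the straight line

    $$x = \ln\left( \frac{0.7}{1 - 0.7} \right) + y$$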
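The derivation says that thresholding the soft-max output is equivalent to comparing the two raw scores after adding a precomputed constant. A minimal sketch of that check follows, assuming 0 < P < 1; the function names are hypothetical, and the reference `softmax2` is included only to confirm that both tests agree.

```python
import math


def softmax2(x, y):
    """Reference 2-way soft-max (shifted by the max for numerical stability)."""
    m = max(x, y)
    ex, ey = math.exp(x - m), math.exp(y - m)
    return ex / (ex + ey), ey / (ex + ey)


def make_threshold_test(p):
    """Return a test equivalent to softmax2(x, y)[0] > p, without exp or division."""
    offset = math.log(p / (1.0 - p))   # the constant ln(P / (1 - P)), computed once
    return lambda x, y: x > offset + y


passes = make_threshold_test(0.7)

# Compare both tests on a coarse grid of scores.
grid = [i * 0.5 for i in range(-8, 9)]
agree = all(
    (softmax2(x, y)[0] > 0.7) == passes(x, y)
    for x in grid
    for y in grid
)
print(agree, round(math.log(0.7 / 0.3), 4))
```

At run time each position then costs one addition and one comparison instead of several exponentials and divisions, which matters on a processor clocked at 60 MHz.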


  • Experiment Result

    • On the FDDB [4] database:

    • P-Net, R-Net threshold = 0.6, 0.7; min-face = 25 × 25

    Method   Accuracy @ FPPI 0.01   Accuracy @ FPPI 0.1   Accuracy @ FPPI 1.0   Speedup @ Andes RISC-V Processor
    MTCNN    84.95%                 92.40%                94.66%                -
    Ours     82.59%                 88.15%                90.68%                106x

    • FPPI: False Positives Per Image

  • Experiment Result

    • On the FDDB database:

    Method   Accuracy @ FPPI 1.0   Speedup @ Andes RISC-V Processor
    MTCNN    94.66%                -
    Ours     90.68%                106x

    Method       Accuracy @ FPPI 0.1   Accuracy @ FPPI 0.01   FPS (Titan X GPU)   FPS (1080-Ti)
    Brodmann17   89.25%                81.88%                 200                 90
    DeepIR       88.45%                82.16%

    • FPPI: False Positives Per Image

  • On the FDDB database:

    • Performance when faces smaller than 48 × 48 are not considered

    • P-Net, R-Net threshold = 0.9, 0.85; min-face = 48 × 48

    • P-Net, R-Net threshold = 0.6, 0.7; min-face = 48 × 48

    Method   Accuracy @ FPPI 0.01   Accuracy @ FPPI 0.1
    Ours     86.64%                 87.7%

    Method Accuracy @
