Energy-Efficient Face Detection Using Andes RISC-V Processor
Presenter: Chien-Hao Chen
Advisor: Prof. Chen-Yi Lee
Date: 2018/03/12
Outline
• Introduction
• Face Detector on Andes Processor
• Experiment Result
• Conclusion
• Reference
Outline
• Introduction
• Motivation
• Face Detection Model
• Face Detector on Andes Processor
• Experiment Result
• Conclusion
• Reference
Motivation
• Cloud computing
– Image upload to cloud → processing → result returned
• Edge computing
– Image directly computed → processing → result returned
Face Detection Model: MTCNN, 2016 [1]
1. Resize the image and sample it with a sliding window
2. P-Net (Proposal): find candidate bounding boxes
3. R-Net (Refine): reject the wrong candidates from P-Net
4. O-Net (Output): from the R-Net results, find more accurate face regions
Figure (P-Net, R-Net, O-Net): image from Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks, IEEE Signal Processing Letters (SPL), vol. 23, no. 10, pp. 1499-1503, 2016
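Step 1 (resizing before the sliding window) can be sketched as an image pyramid. This is a minimal sketch, not the presenter's code: the function name `pyramid_scales` is hypothetical, the 12-pixel network input and 25-pixel minimum face are taken from other slides in this deck, and the shrink factor 0.709 is an assumption borrowed from common MTCNN implementations.

```python
def pyramid_scales(height, width, min_face=25, net_input=12, factor=0.709):
    """Return the resize scales at which P-Net would be run.

    The first scale maps a min_face-sized face onto the net_input window;
    each further scale shrinks the image by `factor` (assumed value) until
    the shorter side no longer fits one net_input window.
    """
    scales = []
    scale = net_input / min_face
    min_side = min(height, width) * scale
    while min_side >= net_input:
        scales.append(scale)
        scale *= factor
        min_side *= factor
    return scales


scales = pyramid_scales(240, 320)
for s in scales:
    print(f"scale {s:.3f} -> {int(240 * s)} x {int(320 * s)}")
```

A larger minimum face size gives a smaller first scale and fewer pyramid levels, which is one reason the min-face setting matters for run time.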
Face Detection Model
• P-Net (Proposal):
  – Fully convolutional, with 3 convolution layers and 1 max pooling layer
  – Rough proposals
• R-Net (Refine):
  – 3 convolution, 2 max pooling and 1 fully connected layer
  – Rejects false proposals from P-Net
• O-Net (Output):
  – 4 convolution, 3 max pooling and 1 fully connected layer
  – More complicated model
    → Rejects false results from R-Net
    → Better face bounding box position
Outline
• Introduction
• Face Detector on Andes Processor
  − Hardware environment
  − Model Simplification and Acceleration
• Experiment Result
• Conclusion
• Reference
Hardware environment
Andes RISC-V:
− Processor: 64-bit AndesCore, 60 MHz
− FPGA: Xilinx Kintex-7 XC7K410T
− DRAM: 1 GB
− Flash: 64 MB
Model Simplification and Acceleration
1. Model simplification
• Depth-wise separable convolution [3]
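To see why depth-wise separable convolution simplifies the model, the sketch below compares multiply-accumulate (MAC) counts for a standard convolution and its depth-wise separable counterpart. The function names are hypothetical, and the 3 × 3 kernel and 10 × 10 output size are assumptions chosen for illustration; only the channel counts 16 and 32 appear in this deck.

```python
def standard_conv_macs(k, c_in, c_out, h_out, w_out):
    """MACs of a standard k x k convolution."""
    return k * k * c_in * c_out * h_out * w_out


def depthwise_separable_macs(k, c_in, c_out, h_out, w_out):
    """MACs of a k x k depth-wise convolution followed by a 1 x 1 point-wise one."""
    depthwise = k * k * c_in * h_out * w_out   # one k x k filter per input channel
    pointwise = c_in * c_out * h_out * w_out   # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


std = standard_conv_macs(3, 16, 32, 10, 10)
sep = depthwise_separable_macs(3, 16, 32, 10, 10)
print(std, sep, round(std / sep, 2))
```

The reduction factor is 1 / (1/c_out + 1/k²), so it grows with the number of output channels and the kernel size.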
Depth-wise MTCNN
• P-Net (Proposal): fully convolutional
  – 1 convolution layer: stride = 2 (channel: 10)
  – 2 DW convolution layers: stride = 1 (channels: 16, 32)
• R-Net (Refine):
  – 1 convolution layer: stride = 2
  – 1 DW convolution layer: stride = 2
  – 1 DW convolution layer: stride = 1
  – 1 fully connected layer
• O-Net (Output):
  – 1 convolution layer: stride = 2
  – 2 DW convolution layers: stride = 2
  – 2 convolution layers: stride = 1 (channels: 128, 128)
  – 1 fully connected layer
Model Simplification and Acceleration
2. Soft-max approximation
Motivation
• Ex: if the P-Net input size is 240 × 320, then output1 size is 115 × 155 × 2 and output2 size is 115 × 155 × 4:
$$H_{out} = \frac{H_{in} - H_{filter} + Padding}{Stride} + 1 = \frac{240 - 12 + 0}{2} + 1 = 115$$
• Soft-max:
$$\sigma(x, y) = \left( \frac{e^{x}}{e^{x} + e^{y}},\ \frac{e^{y}}{e^{x} + e^{y}} \right)$$
→ 6 exponentials & 2 divisions
• For the output1 soft-max:
→ 115 × 155 × 6 = 106,950 ≈ 107k exponentials
→ 115 × 155 × 2 = 35,650 ≈ 36k divisions
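The arithmetic on this slide can be reproduced with a few lines of Python. This is a sketch under the slide's own numbers (filter size 12, stride 2, padding 0, 6 exponentials and 2 divisions per soft-max); the function and constant names are hypothetical.

```python
def conv_output_size(size_in, filter_size, padding, stride):
    """Output size along one dimension, following the slide's formula."""
    return (size_in - filter_size + padding) // stride + 1


h_out = conv_output_size(240, 12, 0, 2)
w_out = conv_output_size(320, 12, 0, 2)

EXPS_PER_SOFTMAX = 6   # as counted on the slide
DIVS_PER_SOFTMAX = 2   # as counted on the slide

total_exps = h_out * w_out * EXPS_PER_SOFTMAX
total_divs = h_out * w_out * DIVS_PER_SOFTMAX
print(h_out, w_out, total_exps, total_divs)
```

These operations are spent on a single P-Net pass over one 240 × 320 input, which motivates replacing the soft-max with something cheaper.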
Soft-max approximation
• Soft-max of the two scores x and y:
$$\sigma(x, y) = \left( \frac{e^{x}}{e^{x} + e^{y}},\ \frac{e^{y}}{e^{x} + e^{y}} \right)$$
• The soft-max output is only compared against a threshold P, so the test can be rewritten step by step:
$$\frac{e^{x}}{e^{x} + e^{y}} > P$$
$$e^{x} > P e^{x} + P e^{y}$$
$$(1 - P)\, e^{x} > P e^{y}$$
$$\ln(1 - P) + x > \ln(P) + y$$
$$x > \ln\left(\frac{P}{1 - P}\right) + y$$
• For a fixed threshold P, the term $\ln\left(\frac{P}{1 - P}\right)$ is a constant.
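The equivalence derived above can be checked numerically. The sketch below is illustrative only: the function names and the sample score pairs are made up, and the threshold 0.6 is simply one of the values used later in the experiments.

```python
import math


def face_by_softmax(x, y, p):
    """Reference test: full soft-max, then compare with threshold p."""
    ex, ey = math.exp(x), math.exp(y)
    return ex / (ex + ey) > p


def face_by_offset(x, y, offset):
    """Approximation-free shortcut: compare x against y plus a precomputed constant."""
    return x > offset + y


p = 0.6
offset = math.log(p / (1 - p))  # computed once per threshold

# Hypothetical score pairs, chosen away from the decision boundary.
samples = [(-2.0, 1.0), (0.0, 0.0), (0.5, 0.0), (1.5, 0.2), (3.0, -1.0)]
agree = all(
    face_by_softmax(x, y, p) == face_by_offset(x, y, offset) for x, y in samples
)
print(agree)
```

Per output position the shortcut costs one addition and one comparison, with no exponential or division at run time. Scores that fall almost exactly on the boundary may differ between the two forms because of floating-point rounding.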
Soft-max approximation: example with P = 0.7
$$\frac{e^{x}}{e^{x} + e^{y}} = 0.7 \quad\Longleftrightarrow\quad x = \ln\left(\frac{0.7}{1 - 0.7}\right) + y$$
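For the P = 0.7 example, the constant evaluates to roughly 0.847, so a candidate is kept when x exceeds y by more than that margin. A small sketch, with hypothetical score pairs and variable names:

```python
import math

THRESHOLD = 0.7
OFFSET = math.log(THRESHOLD / (1 - THRESHOLD))  # ln(0.7 / 0.3), about 0.847

# Hypothetical (x, y) score pairs for four candidate windows.
scores = [(2.0, 0.5), (0.3, 0.0), (1.0, 0.1), (-1.0, 1.0)]

# Keep a window when x > ln(P / (1 - P)) + y; no exp() or division per window.
kept = [(x, y) for x, y in scores if x > OFFSET + y]
print(round(OFFSET, 4), kept)
```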
Experiment Result
• On FDDB [4] database:
• P-Net, R-Net threshold = 0.6, 0.7; min-face = 25x25

Method   Accuracy @ FPPI 0.01   Accuracy @ FPPI 0.1   Accuracy @ FPPI 1.0   Speedup @ Andes RISC-V Processor
MTCNN    84.95%                 92.40%                94.66%                -
Ours     82.59%                 88.15%                90.68%                106x

• FPPI: False Positives Per Image
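One way to read "Accuracy @ FPPI" is as the fraction of annotated faces found when the score threshold is lowered until the allowed number of false positives per image is reached. The sketch below follows that reading; it is an assumption about the metric, not the evaluation script used for the table, and the function name and toy detections are hypothetical.

```python
def recall_at_fppi(detections, num_faces, num_images, target_fppi):
    """Fraction of faces detected within a false-positive budget.

    detections: list of (score, is_true_positive) over the whole dataset.
    The budget is target_fppi * num_images false positives in total.
    """
    max_false_positives = target_fppi * num_images
    true_pos = 0
    false_pos = 0
    # Walk detections from most to least confident.
    for _score, is_tp in sorted(detections, key=lambda d: d[0], reverse=True):
        if is_tp:
            true_pos += 1
        else:
            if false_pos + 1 > max_false_positives:
                break  # the next false positive would exceed the budget
            false_pos += 1
    return true_pos / num_faces


# Toy data: 2 images, 5 annotated faces, 6 detections.
toy = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.5, False), (0.4, True)]
print(recall_at_fppi(toy, num_faces=5, num_images=2, target_fppi=0.5))
print(recall_at_fppi(toy, num_faces=5, num_images=2, target_fppi=1.0))
```

A looser FPPI budget admits lower-scoring detections, which is why the accuracy columns increase from FPPI 0.01 to FPPI 1.0.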
Experiment Result
• On FDDB database:

Method   Accuracy @ FPPI 1.0   Speedup @ Andes RISC-V Processor
MTCNN    94.66%                -
Ours     90.68%                106x

Method       Accuracy @ FPPI 0.1   Accuracy @ FPPI 0.01   FPS (Titan X GPU)   FPS (1080-Ti)
Brodmann17   89.25%                81.88%                 200                 90
DeepIR       88.45%                82.16%

• FPPI: False Positives Per Image
• On FDDB database:
• Performance without considering face sizes under 48x48
• P-Net, R-Net threshold = 0.9, 0.85; min-face = 48x48
• P-Net, R-Net threshold = 0.6, 0.7; min-face = 48x48

Method   Accuracy @ FPPI 0.01   Accuracy @ FPPI 0.1
Ours     86.64%                 87.7%
Method Accuracy @