TensorRT Optimizations for
Embedded Facial Recognition
Alexey Kadeishvili, CTO, Vocord
Vocord Company: Main Facts
www.vocord.com 2
■ Developer of video surveillance and video analytics systems since 1999
■ Deep expertise in facial recognition
■ Top-rated in NIST and Megaface face recognition tests
■ NVIDIA Metropolis program member
Our customers and partners
Notable figures
250+ projects for public and private sectors
140 million faces in enrollment database in a single project
200,000 cameras are managed by VOCORD video analysis software
350,000 API requests per month to the VOCORD FaceMatica cloud
Geography: Europe, Middle East, SE Asia, East Asia, Latin America, Oceania
Face recognition products
All products support NVIDIA GPUs
■ VOCORD FaceMatica: face recognition engine in a cloud
■ VOCORD NetCam: new-generation face recognition camera
■ VOCORD NanoFace: NVIDIA Jetson-based embedded face recognition solution
■ VOCORD FaceControl: “Faces in the crowd” FR system
■ VOCORD FaceControl 3D: free-flow 3D facial recognition
■ Face Recognition SDK: face recognition engine SDK
Main Factors Impacting Facial Recognition
■ Enrollment DB quality: something beyond control
■ Recognition engine: already works as in the Marvel movies
■ Inbound image quality
VOCORD Facial Recognition Engine
■ Top in the Megaface Face Scrub Open Challenge 2015-2018, with accuracy 91.76%
■ Top in the NIST Face Recognition Vendor Test 2016-2018: TPR at FPR 10^-4 = 98.7%, TPR at FPR 10^-6 = 96.6%
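A "TPR at a fixed FPR" figure like the ones above can be computed from raw match scores. The following is a minimal sketch, not the NIST protocol: the function name `tpr_at_fpr` and the toy scores are illustrative assumptions. It picks the lowest threshold at which the share of accepted impostor pairs does not exceed the target FPR, then reports the share of genuine pairs that score above it.

```python
import math


def tpr_at_fpr(genuine, impostor, target_fpr):
    """Return (threshold, tpr) for the lowest threshold that respects target_fpr.

    genuine:  similarity scores of same-person pairs
    impostor: similarity scores of different-person pairs
    A pair is accepted when its score is strictly above the threshold.
    """
    ranked = sorted(impostor, reverse=True)
    # How many impostor pairs may be accepted without exceeding the target FPR.
    allowed = math.floor(target_fpr * len(ranked) + 1e-9)
    if allowed >= len(ranked):
        threshold = float("-inf")  # every impostor pair may pass
    else:
        # At most `allowed` impostor scores lie strictly above this value.
        threshold = ranked[allowed]
    tpr = sum(score > threshold for score in genuine) / len(genuine)
    return threshold, tpr


# Toy scores (hypothetical, for illustration only).
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.35, 0.45, 0.5, 0.6, 0.75]

threshold, tpr = tpr_at_fpr(genuine_scores, impostor_scores, target_fpr=0.25)
print(f"threshold={threshold}, TPR={tpr:.2f}")
```

At FPR = 10^-6, at least a million impostor comparisons are needed before a single false accept can be counted, which is why such operating points are only measured on large test sets.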
Cross Nation Invariance
Source: NIST Face recognition vendor test, 2018
Pose Invariance
[Figure: FRR vs. FAR curves (FAR from 1.E-07 to 1.E00, FRR from 0 to 0.25) by pose angle. With an enrollment DB of <30˚ faces: Group 1, <10˚; Group 2, 10˚ to 30˚; Group 3, 30˚ to 45˚; Group 4, 45˚ to 60˚; Group 5, >60˚. A further curve shows >60˚ faces against a >60˚ enrollment DB.]
Image Resolution Impact
[Figure: sample face images at L = 48 pix and L = 24 pix. Plot: face identification probability (true identification rate**, 0.7 to 1.0) vs. pixels between eyes, L (12 to 72), with the optimal resolution and the recommended minimum marked.]
* L: the distance between the eyes, in pixels
** at FAR = 10^-4
How to improve recognition?
■ Recognition engine: already works as in the Marvel movies
■ Enrollment DB quality: something beyond control
■ The quality of acquired face images: point of growth
Different types of test datasets
NIST FRVT Report 2017 10 03
“Controlled” dataset
Algorithm A
Algorithm B
NIST FRVT Report 2017 10 03
“Uncontrolled” dataset
Algorithm A
Algorithm B
NIST FRVT Report 2017 10 03
Controlled vs. Uncontrolled (FRR log scale)
[Figure: FRR (0.1 to 0.7) vs. FAR (1.E-07 to 1.E-02) for Algorithm A and Algorithm B, each in a controlled and an uncontrolled environment.]
Controlled vs. Uncontrolled (linear scale)
[Figure: FRR (0.1 to 0.7) vs. FAR (1.E-07 to 1.E-02) for Algorithm A and Algorithm B, each in a controlled and an uncontrolled environment.]
Hit the bottom: Images from IP camera
The Advantages of Edge Video Analysis
■ Face recognition onboard
■ No compression artifacts: the image is taken directly from the sensor
■ Dynamic Region of Interest for every intelligent algorithm
■ Algorithm adjustment for a particular camera setup
[Image: VOCORD NetCam.AI edge video analytics camera]
Video Enhancement Onboard
[Figure panels: backlight, no enhancement; 12-bit image with static ROI; 12-bit image with dynamic ROI]
Dynamic ROI enhances image quality in the face area
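The effect of a dynamic ROI can be illustrated with a toy tone-mapping step. The slides do not say how the camera computes its enhancement, so the following is only a sketch under an assumption: a linear stretch from 12 bits to 8 bits whose black and white levels are measured inside a region of interest. The function names and the 4x4 "frame" are hypothetical. Measuring levels over the whole frame stands in for a static ROI, and measuring them inside the face box stands in for a dynamic one.

```python
def stretch_to_8bit(image, roi):
    """Map a 12-bit grayscale image to 8 bits using levels measured inside roi.

    image: list of rows of ints in 0..4095
    roi:   (top, left, bottom, right), with bottom and right exclusive
    """
    top, left, bottom, right = roi
    pixels = [image[r][c] for r in range(top, bottom) for c in range(left, right)]
    lo, hi = min(pixels), max(pixels)
    span = max(hi - lo, 1)  # avoid division by zero on a flat region
    # Values outside [lo, hi] are clipped to the 8-bit range.
    return [
        [min(255, max(0, round((p - lo) * 255 / span))) for p in row]
        for row in image
    ]


def face_contrast(image, roi):
    """Difference between the brightest and darkest output pixel in roi."""
    top, left, bottom, right = roi
    pixels = [image[r][c] for r in range(top, bottom) for c in range(left, right)]
    return max(pixels) - min(pixels)


# Hypothetical backlit scene: bright background, dark face in the centre.
frame = [
    [4000, 3900, 3800, 4000],
    [3900, 120, 200, 3800],
    [4000, 160, 280, 3900],
    [3800, 3900, 4000, 4000],
]
full_frame = (0, 0, 4, 4)  # stands in for a static ROI
face_box = (1, 1, 3, 3)    # stands in for a dynamic ROI placed on the face

static_out = stretch_to_8bit(frame, full_frame)
dynamic_out = stretch_to_8bit(frame, face_box)

print("face contrast, static ROI: ", face_contrast(static_out, face_box))
print("face contrast, dynamic ROI:", face_contrast(dynamic_out, face_box))
```

With levels taken from the face box, the face uses the full 8-bit range while the background clips to white, which is an acceptable trade when only the face matters to the recognition engine.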
VOCORD NetCam.AI HW Features
■ Automated lens control
■ High quality sensor
■ NVIDIA Jetson TX1 GPU
VOCORD NetCam.AI Tech Specs
Camera specs
Resolution: 3-5 Mpix
Temperature range: -25°C to +50°C
Ingress protection: IP67
Dimensions: 20 x 71 x 150 mm
Power consumption: 15 W
Built-in facial recognition engine specs
Min face resolution for face recognition: 12 pixels between the eyes
Number of faces detected in one frame: up to 25
Latency of biometric template extraction: up to 150 ms per face
Face recognition performance: up to 32 faces/s
Inference framework: TensorRT
Performance on Different Platforms
[Bar chart: performance by platform for the “Shallow”, “Medium” and “Deep” CNNs, in that order]
NVIDIA Jetson TX1: 32, 19, 12
Intel Movidius: 9, 6, 4
Qualcomm Snapdragon 820: 2.2, 1.4, 0.9
Higher FPS Improves Accuracy
[Figure: FRR vs. FAR (1.E-07 to 1.E-02) for the “Shallow”, “Medium” and “Deep” CNNs, in two modes: single face, and track (multiple faces).]
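One plausible reason a higher frame rate helps is that a track then contains more face images of the same person, so there are more chances that at least one of them is a good match. The slides do not describe how track-level matching is done. The sketch below assumes a simple rule, taking the best cosine similarity over all frames of the track, and the names `cosine` and `track_score` as well as the 2-D "templates" are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def track_score(frame_templates, enrolled):
    """Assumed fusion rule: the best per-frame similarity within the track."""
    return max(cosine(t, enrolled) for t in frame_templates)


# Hypothetical 2-D templates; real biometric templates have many more dimensions.
enrolled = [1.0, 0.0]
track = [
    [0.6, 0.8],  # poor frame, e.g. the face is turned away
    [0.8, 0.6],  # better frame
    [1.0, 0.1],  # near-frontal frame
]

single_face = cosine(track[0], enrolled)
whole_track = track_score(track, enrolled)
print(f"single face: {single_face:.3f}, track: {whole_track:.3f}")
```

Taking the maximum also raises impostor scores slightly, so the decision threshold has to be tuned on track-level scores rather than reused from single-image matching.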
TensorRT vs. MXNet Performance
Platform: NVIDIA Jetson TX1
[Bar chart, FPS]
“Shallow” CNN: TensorRT 32, MXNet 18
“Medium” CNN: TensorRT 19, MXNet 10
“Very” CNN: TensorRT 12, MXNet 6
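The speedup implied by this chart is simple arithmetic. The sketch below assumes the bar labels pair up as TensorRT 32 / 19 / 12 FPS against MXNet 18 / 10 / 6 FPS for the three networks, which is a reading of the chart rather than something the slide states in a table. The time per inference assumes one inference runs at a time.

```python
# FPS on NVIDIA Jetson TX1, as read from the bar labels (pairing is an assumption).
fps = {
    "Shallow": {"TensorRT": 32, "MXNet": 18},
    "Medium": {"TensorRT": 19, "MXNet": 10},
    "Very": {"TensorRT": 12, "MXNet": 6},
}


def speedup(entry):
    """How many times faster TensorRT is than MXNet for one network."""
    return entry["TensorRT"] / entry["MXNet"]


def ms_per_inference(frames_per_second):
    """Average time per inference in milliseconds, one inference at a time."""
    return 1000.0 / frames_per_second


for name, entry in fps.items():
    print(
        f"{name:8s} speedup x{speedup(entry):.2f}, "
        f"TensorRT {ms_per_inference(entry['TensorRT']):.1f} ms, "
        f"MXNet {ms_per_inference(entry['MXNet']):.1f} ms"
    )
```

Under this reading, TensorRT is roughly 1.8 to 2 times faster than MXNet on the same hardware, and the gap is at least as large for the deeper networks.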
WHAT’S THE PROFIT?
Face recognition systems architectures
Edge analytics system with VOCORD NetCam.AI cameras
■ Network: LAN, Wi-Fi
■ One archive server
■ 95% of processing is here (on the cameras)
VS
“Traditional” server architecture approach with regular IP cameras
■ Network: LAN
■ Data center with many expensive rack servers
■ 95% of processing is here (in the data center)
Cost-Efficiency: 100 High Loaded Cameras
Edge computing with VOCORD NetCam.AI
Cameras: USD 2,000 x 100 = USD 200,000
Server for matching and archive: USD 10,000
CAPEX: USD 210,000
Maintenance costs: power supply (800 W), bandwidth (2 Gbps), rack space
OPEX: USD 2,000 per year
VS
“Traditional” server architecture with IP cameras
Cameras: USD 500 x 100 = USD 50,000
Servers
Detection: 2 servers, 4xCPU 32 cores each: USD 60,000
Template extraction: 4 servers, 2 GPU Tesla P40 each: USD 120,000
Server for matching and archive: USD 10,000
CAPEX: USD 240,000
Maintenance costs: power supply (7-8 kW), bandwidth (2 Gbps), rack space
OPEX: USD 30,000 per year
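The figures on this slide can be rolled up into a total cost over several years. The sketch below assumes that the USD 2,000 per year OPEX (800 W) belongs to the edge setup and the USD 30,000 per year OPEX (7-8 kW) to the server setup, which matches the hardware listed but is a reading of the slide layout. The function name `total_cost` and the chosen time horizons are illustrative.

```python
CAMERAS = 100

# CAPEX is rebuilt from the line items on the slide; OPEX pairing is an assumption.
edge = {
    "capex": 2_000 * CAMERAS + 10_000,  # cameras + matching/archive server
    "opex_per_year": 2_000,
}
traditional = {
    # cameras + detection servers + template extraction servers + matching/archive
    "capex": 500 * CAMERAS + 60_000 + 120_000 + 10_000,
    "opex_per_year": 30_000,
}


def total_cost(architecture, years):
    """CAPEX plus OPEX accumulated over the given number of years, in USD."""
    return architecture["capex"] + architecture["opex_per_year"] * years


for years in (1, 3, 5):
    edge_cost = total_cost(edge, years)
    server_cost = total_cost(traditional, years)
    print(
        f"{years} yr: edge USD {edge_cost:,}, "
        f"traditional USD {server_cost:,}, "
        f"difference USD {server_cost - edge_cost:,}"
    )
```

The CAPEX gap is small (USD 30,000), so under these assumptions most of the saving comes from the running costs, which add USD 28,000 to the difference every year.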
WHAT’S NEXT?
• Uploading various video analytics algorithms
• Highly customized algorithms
• Interacting cameras as a part of IoT
• 3D vision
Open Platform: Easy Algorithm Uploading
■ Facial recognition
■ Behavioral analysis
■ License plate recognition
■ Emergency cases
■ Lost and found objects
■ Vehicle types
Camera-Dependent Algorithm Customization
Step 1. The camera collects images and uploads them to the server
Step 2. The neural network is retrained on the server using the new images
Step 3. The customized, lightweight neural network is uploaded back to the camera
Customization to restricted data
■ Deeper DNNs provide better performance on unrestricted data
■ On restricted data, the difference between a deep and a shallow network is negligible
[Figure: two FRR vs. FAR plots (FRR from 0 to 0.04), one for unrestricted data and one for restricted data, each comparing a “Deep” and a “Shallow” neural network.]
Intercamera Tracking
[Figure: views from NetCam.AI #1 and NetCam.AI #2, with the labels Face, Jeans and Bag]
Obtaining 3D Models
■ Building a 3D object from synchronous snapshots from multiple cameras
■ Feature preprocessing for conjugate points search
E-mail: sales@vocord.com
Website: www.vocord.com
Thank you for your attention! Questions?