SEOUL | Oct.7, 2016
Yukyung Choi Namil Kim Soonmin Hwang In So Kweon Jongchan Park
Thermal Image Enhancement Using Convolutional Neural Network
Visual Perception for Autonomous Driving During Day and Night
AGENDA
Introduction to our Vision
Introduction to Thermal Image Enhancement (TEN)
The Period to Explore GPUs for Image Processing
Toward the Next Generation of Vision Technology
Why do we use LWIR sensors?
KAIST Multi-spectral Recognition Dataset in day and night, IJRR under review.
Motorola DynaTAC iPhone
KYLE BEAN - HISTORY OF MOBILE EVOLUTION, 2012.
Yeoju City Phone Museum
expensive → cheap, heavy → light, industrial → commercial
Le Penseur
Our goal:
Given a single modality, let’s generate multispectral information.
Time-Invariant Spectral Image (LWIR) → “Transfer” → Visible-Spectral Context (Chromaticity, Depth, etc.)
(To appear at the IROS 2016 Exhibition)
What is the next generation vision sensor for use in any time of the day?
“Visual information gives you a wider view than radars.” - Angelova, IEEE Spectrum -
Why do we enhance LWIR images?
Thermal Infrared Camera
Audi A8 (2014)
Mercedes Benz-S (2014)
BMW X5 (2014)
Dashboard Installation
Night Driving Trailer
Long wave infrared (LWIR) vs. Visible (RGB), sample images at PM04 and AM02
wavelength: LWIR 8um~14um, Visible 400nm~700nm
light invariant: LWIR O, Visible X
But
texture/color: LWIR not enough, Visible enough
diffraction distortion: LWIR large, Visible small
resolution: LWIR small, Visible large
KAIST All-day dataset (FLIR A655sc, Point Grey Flea3)
Limitation #1: Resolution
Image Enlargement (SR): make an HR image from an LR image, 320x240 (*LR) → 640x480 (†HR)
Limitation #2: Diffraction Distortion (blur)
Detail Enhancement (DDE): improve the HR image for better visibility and recognition performance, 640x480 (HR) → 640x480 (Good Quality HR)
*LR: low resolution, †HR: high resolution
Image Enlargement (SR)
Which type of image is useful in thermal image enhancing?
Thermal image enhancing: when and where is it useful?
Visible images? or Thermal images? or some other imaging spectra?
Visibility? or Recognition performance?
Visible 0.4-0.7um
Short Wave IR 1.0-3.0um
Middle Wave IR 3.0-5.0um
Long Wave IR 8.0-14.0um
Thermal Infrared Face Recognition – A Biometric Identification Technique for Robust Security system, Refinements and New Ideas in Face Recognition.
Blur increases with wavelength (𝜆).
(1) Which type of image is useful in thermal image enhancing?
From “Global-Local Face Upsampling Network,” arXiv (27 Apr 2016).
Same or not? Input Bicubic Proposed GT
(2) Thermal image enhancing: when and where is it useful?
Image Enlargement : Architecture
(cin, cout, f, p): cin and cout are the numbers of input and output channels, f is the filter size, and p is the padding size.
Bicubic: Interpolation Layer (preprocessing)
Conv: Convolution Layer
ReLU: Rectified Linear Unit Layer
Pipeline: *LR → Bicubic → Feature Extraction → Mapping → Reconstruction → †HR, trained with MSE (pixel loss)
*LR: low resolution, †HR: high resolution
Limitation #1: Resolution
1) Shallow Network 2) MSE (pixel loss) 3) Bicubic Interpolation
Thermal Image Enhancement Using Convolutional Neural Network (to appear in IROS 2016)
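To make the (cin, cout, f, p) notation concrete, here is a minimal Python sketch of the arithmetic it implies for a stride-1 convolution. The three layer tuples are hypothetical values picked for illustration only (the slides do not list the actual sizes), and the `ConvSpec`, `output_size`, and `parameter_count` names are invented for this example.

```python
from typing import NamedTuple


class ConvSpec(NamedTuple):
    """One convolution layer in the slide's (cin, cout, f, p) notation."""
    cin: int   # number of input channels
    cout: int  # number of output channels
    f: int     # filter size (f x f)
    p: int     # zero padding on each side


def output_size(size: int, spec: ConvSpec) -> int:
    """Spatial size after a stride-1 convolution: size + 2p - f + 1."""
    return size + 2 * spec.p - spec.f + 1


def parameter_count(spec: ConvSpec) -> int:
    """Weights (cin * cout * f * f) plus one bias per output channel."""
    return spec.cin * spec.cout * spec.f * spec.f + spec.cout


# Hypothetical layer sizes, for illustration only (not taken from the slides).
layers = [
    ConvSpec(1, 64, 7, 3),   # feature extraction
    ConvSpec(64, 32, 3, 1),  # mapping
    ConvSpec(32, 1, 5, 2),   # reconstruction
]

size = 36  # a 36x36 input patch
for spec in layers:
    size = output_size(size, spec)
    print(spec, "->", size, "px,", parameter_count(spec), "parameters")
```

With p = (f - 1) / 2, as in these made-up layers, every stage keeps the spatial size, so the output can be compared pixel by pixel with the HR target under an MSE loss.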
Image Enlargement : Data
[RGB] “RGB 91” Dataset (gray channel)
[MWIR] “Thermal Stereo” Dataset
Pre-training: 64×64, 91 patches
Fine-tuning: 36×36, stride 6, 118,211 patches
No data augmentation
Batch size: 128
Learning rate: 0.001 (decreased by a factor of 10 every 30 epochs, until 60 epochs)
[LWIR] “Multimodal Stereo” Dataset
Train Data Test Data
Limitation #1: Resolution
(1) Which type of image is useful in thermal image enhancing?
[LWIR] “Multimodal Stereo” Dataset
Training parameter
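The training parameters for image enlargement can be sketched in plain Python as follows. This assumes patches are cut on a regular grid and that the learning rate drops in steps; `patches_per_image` and `learning_rate` are hypothetical helper names, and the 640x480 image in the usage lines is only an example size.

```python
def patches_per_image(height: int, width: int, patch: int, stride: int) -> int:
    """Number of patch x patch crops on a regular grid with the given stride."""
    if height < patch or width < patch:
        return 0
    rows = (height - patch) // stride + 1
    cols = (width - patch) // stride + 1
    return rows * cols


def learning_rate(epoch: int, base: float = 0.001,
                  drop_every: int = 30, factor: float = 10.0) -> float:
    """Step schedule: divide the base rate by `factor` every `drop_every` epochs."""
    return base / factor ** (epoch // drop_every)


# Fine-tuning setting from the slide: 36x36 patches, stride 6.
# The 640x480 image size is an assumption for the sake of the example.
print(patches_per_image(480, 640, patch=36, stride=6))

# Training runs until epoch 60, so only one drop (at epoch 30) takes effect.
for epoch in (0, 29, 30, 59):
    print(epoch, learning_rate(epoch))
```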
Comparison of performance in Visible and MWIR
RGB MWIR
TENet x2
Image Enlargement : (1) Which type of image is useful in thermal image enhancing?
Bicubic Gray-TENet MWIR-TENet
PSNR(dB) 39.2000 40.8257 33.3964
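For reference, PSNR in dB is 10·log10(peak² / MSE), so minimizing the MSE pixel loss also maximizes PSNR. The sketch below computes it in plain Python for images stored as nested lists. The 8-bit peak value of 255 is an assumption (the slides do not state the bit depth used for evaluation), and `mse` and `psnr` are illustrative names.

```python
import math


def mse(a, b):
    """Mean squared error between two images given as lists of rows."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    if len(flat_a) != len(flat_b) or not flat_a:
        raise ValueError("images must be non-empty and have the same size")
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / err)


reference = [[10, 20], [30, 40]]
estimate = [[11, 19], [30, 42]]
print(round(psnr(reference, estimate), 2), "dB")
```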
Limitation #1: Resolution
Image Enlargement : (2) when and where is it useful?
Panels (a)-(f): RGB, LWIR, and E-LWIR at AM11:00 and AM01:00
Pedestrian detection result: detections are produced by the KAIST-RCV algorithm. [1]
Visual odometry result: trajectories are estimated by Andreas’s algorithm. [2] GT: GPS/IMU data; axes in [m].
[1] Multispectral Pedestrian Detection: Benchmark Dataset and Baseline, CVPR 2015.
[2] StereoScan: Dense 3D Reconstruction in Real-time, IV 2011.
Limitation #1: Resolution
Detail Enhancement : Architecture
Conv: Convolution Layer
ReLU: Rectified Linear Unit Layer
BatchNorm: Batch Normalization Layer
*HR: input image, †HR: enhanced image, ‡R: residual image
Pipeline: *HR → Feature Extraction → Mapping → Reconstruction → ‡R → †HR
Limitation #2: Diffraction Distortion
(cin, cout, f, p): cin and cout are the numbers of input and output channels, f is the filter size, and p is the padding size.
Patent Pending* (to appear at KINPEX 2016)
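The legend (input *HR, residual ‡R, enhanced †HR) suggests residual learning. The sketch below shows one common way to use a residual, assuming the enhanced image is the input plus the predicted residual, clipped to an 8-bit range. Both the additive form and the clipping range are assumptions, not details confirmed by the slides, and the function names are hypothetical.

```python
def clip(value, low=0.0, high=255.0):
    """Keep a pixel value inside the valid intensity range."""
    return max(low, min(high, value))


def residual_target(degraded, clean):
    """Training target under the additive assumption: residual = clean - degraded."""
    return [[c - d for d, c in zip(d_row, c_row)]
            for d_row, c_row in zip(degraded, clean)]


def apply_residual(degraded, residual):
    """Enhanced image = degraded input + predicted residual, clipped to range."""
    return [[clip(d + r) for d, r in zip(d_row, r_row)]
            for d_row, r_row in zip(degraded, residual)]


degraded = [[10.0, 20.0], [30.0, 40.0]]
clean = [[12.0, 18.0], [30.0, 45.0]]

# A perfect residual prediction recovers the clean image from the input.
residual = residual_target(degraded, clean)
print(apply_residual(degraded, residual))
```

Predicting only the residual lets the network focus on the missing detail instead of re-synthesizing the whole image.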
Detail Enhancement : Data
Limitation #2: Diffraction Distortion
Train Data: [RGB] RGBT67 in KAIST all-day dataset (y channel); [LWIR] RGBT67 in KAIST all-day dataset
Test Data: [LWIR] RGBT67 in KAIST all-day dataset
Training parameter: 64×64, stride 32, 136,800 patches; data augmentation (up-down flip, left-right flip); batch size: 64; learning rate: 0.001
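The two flips used for data augmentation can be sketched in plain Python on patches stored as nested lists. The helper names are hypothetical, and whether the 136,800 patch count includes the flipped copies is not stated on the slide.

```python
def flip_up_down(patch):
    """Reverse the row order (mirror across the horizontal axis)."""
    return [list(row) for row in reversed(patch)]


def flip_left_right(patch):
    """Reverse each row (mirror across the vertical axis)."""
    return [list(reversed(row)) for row in patch]


def augment(patch):
    """Return the original patch plus its up-down and left-right flips."""
    original = [list(row) for row in patch]
    return [original, flip_up_down(patch), flip_left_right(patch)]


patch = [[1, 2],
         [3, 4]]
for variant in augment(patch):
    print(variant)
```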
(1) Which type of image is useful in thermal image enhancing?
Detail Enhancement : Data
Limitation #2: Diffraction Distortion
(1) Which type of image is useful in thermal image enhancing?
Comparison of performance in Visible and LWIR
Input RGB Model LWIR Model
NIQE* 4.69 3.64 3.8
PSNR 35.5 42.1 40.1
*NIQE: Image distortion score [2]
[2] No Reference Image and Video Quality Assessment, SPL 2013.
RESULT VIDEO
It’s time to explore GPUs for image processing
4th Industrial Revolution is just around the corner with GPUs
From NVIDIA
It’s time to explore GPUs for image processing
Device | Computational Time | Speed Up | Frames per Sec
CPU: i5-6600 (3.30GHz) | 616.34ms | x1 | 1.62
GTX 1080 | 2.94ms | x209.64 | 340.14
Jetson TX1 | 48.05ms | x12.83 | 20.82
*test image : 320x240 (with Caffe framework)
Image Enlargement
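The Speed Up and Frames per Sec columns follow directly from the measured per-image times. A small Python sketch of that arithmetic, using the timings reported above (the function names are made up for this example):

```python
# Per-image computation times in milliseconds, as reported on the slide.
timings_ms = {
    "CPU: i5-6600": 616.34,
    "GTX 1080": 2.94,
    "Jetson TX1": 48.05,
}


def speed_up(baseline_ms: float, device_ms: float) -> float:
    """How many times faster a device is than the baseline."""
    return baseline_ms / device_ms


def frames_per_second(time_ms: float) -> float:
    """Throughput when one frame takes `time_ms` milliseconds."""
    return 1000.0 / time_ms


baseline = timings_ms["CPU: i5-6600"]
for device, time_ms in timings_ms.items():
    print(f"{device}: x{speed_up(baseline, time_ms):.2f}, "
          f"{frames_per_second(time_ms):.2f} fps")
```

Small differences in the last digit relative to the table are expected, since the slide's figures were presumably computed from unrounded timings.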
Today’s Summary
Introduction of our vision
Talk about Thermal Image Enhancement Network (TEN)
Now let’s explore GPUs for image processing!
Toward the Next Generation of Vision Technology
THANK YOU
https://github.com/kaist-rcv/multispectral
Thank you to all my coworkers