oefpos09037 iis fah KLT A0.indd – Fast GPU-based KLT ...Fast GPU-based KLT feature point tracking...

Preview:

Citation preview

a TRADITION of INNOVATIONISO 9001:2008 certified

oef pos 09 037

16.48

35.57

82.54

1.92

3.44

7.32

0 20 40 60 80 100

SD

HD 720p

HD 1080p

Imag

e fo

rmat

Time [ms]

GPUCPU

HD 1080p video (Full-HD)

5500

7000

8500

10000

Imag

e fo

rmat

0 50 100 150 200Time [ms]

98.7

120.1

147.3

172.3

10.7

11.9

13.7

15.3

GPUCPU

GPU5000

CPU5000

GPU7000

CPU7000

GPU10000

CPU10000Features

04080

120160200240280320360400

Overall runtime for Full-HD video

TrackingEnf. Min. Dist.Selecting

www.jo

anne

um.at/iis

JOANNEUM RESEARCHForschungsgesellschaft mbH

Institute of Information Systems

Hannes Fassold

Steyrergasse 178010 GRAZ, AUSTRIA

Phone +43 316 876-1119Fax +43 316 876-1191

hannes.fassold@joanneum.at

iis@joanneum.atwww.joanneum.at/iis

This work was partially supported by the European Commission

under the contracts

FP7-215475, “2020 3D Media – Spatial Sound and Vision”,

www.20203dmedia.eu

and

FP7-216465, “SCOVIS – Self-Configurable Cognitive Video

Supervision”, www.scovis.eu

FP7-231161, “PrestoPRIME”, www.prestoprime.org

Fast GPU-based KLT feature point tracking using CUDA

IntroductionAutomatic selection and tracking of feature points is a basic task ■

for many CV algorithms (e. g. structure from motion, object tracking)

Very popular due to good performance: ■

KLT algorithm [Kanade, Lucas and Tomasi]

Optimized OpenCV implementation available, but is not realtime capable ■

(> 30 fps) for High Defi nition (HD) video

GPUs provide a lot of computation power and can be used ■

for general purpose compution (CUDA)

Goal: Port KLT feature point tracker ■

to CUDA to achieve realtime performance for HD video

Feature point selectionAlgorithm ■

Calculate cornerness for each pixel1. Enforce min. quality (5 – 10 % of max. cornerness) and do non-maxima 2. suppressionEnforce minimum distance between all feature points3.

First two steps are done on the GPU ■

Step 3 is inherently serial, therefore done on CPU ■

Alternative method for step 3 was implemented ■

(uses a mask image, much faster for many feature points)

Feature point trackingAlgorithm ■

Calculate image pyramids1. For each feature on every pyramid level 2. (from coarse to fi ne), calculate the optical fl ow iteratively, using Gauss-Newton method

All steps are done on the GPU ■

(each feature point is one thread)

For a low number of feature points (< 1000) ■

modern GPUs are under-utilized

Evaluation Tracker quality (subjective assessment) ■

Most of the feature points and their optical fl ow are the same ➜

for CPU and GPU implementationDifferences mainly for feature points ➜

which seem to be not correctly tracked by both implementations

Tracker runtime ■

CPU (OpenCV) routine uses IPP & OpenMP internally, ➜

runs on Intel Xeon QuadCore 2.4 GhzGPU routine runs on Geforce GX280 ➜

Achieves overall ➜ 5 – 10 times speedupHigher speedup for larger images and more feature points ➜

ConclusionGoal of HD realtime tracking achieved ■

Key for successfully porting an algorithm to the GPU ■

Parallelization of the algorithm ➜

into a large number of loosely coupled threads possibleKnowledge of the GPU architecture ➜

(e. g the different memory types and their properties)

Hannes Fassold, Jakub Rosner, Peter Schallauer and Werner Bailer

Figure 1: KLT feature points (green: GPU implementation, red: CPU implementation, yellow: both)

Figure 2: Runtime for feature point selection for different image sizes (without minimum distance enforcement)

Figure 3: Runtime for feature point tracking for Full-HD images for different numbers of feature points

Figure 4: Overall runtime of the KLT feature point tracker for Full-HD (1920 x 1080) video

Recommended