17
© Copyright Khronos Group, 2010 - Page 1 OpenCL Imaging on The GPU: Optical Flow James Fung Developer Technology, NVIDIA

OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 - Page 1

OpenCL Imaging on The GPU: Optical Flow

James FungDeveloper Technology, NVIDIA

Page 2: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 -

Page 2

Image Sequence: Middlebury “Minicooper” sequence, frames 10

Pyramidal Lucas KanadeOptical Flow in OpenCL

Algorithm by Jean-Yves Bouguet (Intel)

Implementation by James Fung (NVIDIA)

Page 3: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 -

Page 3

Optical Flow• Calculate movement of selected points in pairs of images

• Applications:

- Image stabilization

- Feature tracking

- Video encoding

• May be used to track

- a few select points: sparse optical flow

- All image points: dense optical flow- Computationally intensive!

Page 4: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 -

Page 4

Image Sequence: Middlebury “Minicooper” sequence, frames 10

Page 5: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 -

Page 5

Image Sequence: Middlebury “Minicooper” sequence, frames 10

Page 6: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 -

Page 6

Page 7: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 -

Page 7

Large Motion (3-7 pixels)

Small motion (< 3 pixels)

Calculate image mismatch vector bEstimate motion v = G-1 b(x’,y’)Update position (x’,y’) = ( x’+vx, y’+vy )Repeat for N iterations or until convergence

Algorithm

• For each point u on image I, find the corresponding

point v on image J

• Calculate spatial gradient G in vicinity of each point u

• Iteratively solve for flow:

Page 8: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 -

Page 8

G2x2 Flow (v)

Downsample Scharr Edge filter

Flow

Result

Solve for flow

v = G-1b

Precompute G from I

Iterate v = G-1b

Algorithm

Image I

Source image pair

Image J

Iterate v = G-1b

Iterate v = G-1b

Page 9: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 -

Page 9

Optical Flow in OpenCL• Embarrassingly parallel

• Computationally intensive

• Can take advantage of special-purpose GPU hardware:

- Texture cache for 2D spatial locality

- Hardware bilinear interpolation for sub-pixel lookups

- Local Memory

• Can use OpenCL-OpenGL interoperability for visualization

Page 10: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 -

Page 10

Large Motion (3-7 pixels)

Small motion (< 3 pixels)

Texture Cache• Different areas of image have different amounts of motion,

but motion is spatially coherent

- Lookup window offset varies inside the image

- Texture cache captures 2D locality

Page 11: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 -

Page 11

),(),(),( L

y

L

y

L

x

L

x

LL

k vgyvgxJyxIyxI

),(),(

),(),(

yxIyxI

yxIyxIb

yk

xkwp

wpy

wp

wpx

k

yy

yy

xx

xx

•Image level L

•Pixel center p

•Window w

•Guess g from previous level L+1

sampler_t bilinSample =... | CLK_FILTER_LINEAR;

…float Jsample = read_imagef( J_float,

bilinSampler, Jidx+(float2)(i,j) ).x;…

Hardware Interpolation• Sub-pixel accuracy & sampling is crucial

• Between iterations, x,y is non-integer

• Use texture hardware linear interpolation

Page 12: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 -

Page 12

Using OpenCL Local Memory• Lookup locations for I, Ix and Iy are static per iteration

• Can be explicitly cached in Local Memory

- Local memory is on-chip

- Local memory is faster to access, useful for repeated access

Page 13: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 -

Page 13

0

10

20

30

40

50

60

Texture Only 2 Tex, 1 Lmem 1 Tex, 2 Lmem Local Mem Only

Time(ms)

# Local/Texture Buffers

Effect of moving from Texture to Local Memory Usage

GTX 460 (7 SMs)

GTX 580 (16 SMs)

Middlebury “Minicooper” data set Frames: 10-11 Resolution: 640 x 480 GreyscaleConvergence: 0.0004pxFlow compute time for 3 pyramid levels shown

Page 14: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 -

Page 14

Optical Flow Demo

• Pyramidal Lucas Kanade

Optical Flow

• Visualization done on GPU by

sharing data between

OpenGL & OpenCL

Page 15: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 -

Page 15

Optical Flow Performance

0 10 20 30 40 50 60

32

16

8

4

Time (ms) (lower is better)

Ite

rati

on

sLK Pyramidal Optical Flow

GTX 460 & GTX 580

GTX 460

GTX 580

Middlebury “Minicooper” data set Frames: 10-11 Resolution: 640x480 GreyscaleConvergence: 0.0004pxPyramid 3-level flow compute time shown

Page 16: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 -

Page 16

OpenCL Performance Analysiswith the CUDA Visual Profiler 4.0

Page 17: OpenCL Imaging on The GPU: Optical Flow · Optical Flow Performance 0 10 20 30 40 50 60 32 16 8 4 Time (ms) (lower is better) s LK Pyramidal Optical Flow GTX 460 & GTX 580 GTX 460

© Copyright Khronos Group, 2010 -

Page 17

Thank You!

Contact:

James Fung

[email protected]