Page 1: Accelerating image recognition on mobile devices using GPGPU

MACHINE VISION GROUP

Accelerating image recognition on mobile devices using GPGPU

Miguel Bordallo¹, Henri Nykänen², Jari Hannuksela¹, Olli Silvén¹ and Markku Vehviläinen³

¹ University of Oulu, Finland
² Visidon Ltd., Oulu, Finland
³ Nokia Research Center, Tampere, Finland

Jari Hannuksela, Olli Silvén
Machine Vision Group, Infotech Oulu
Department of Electrical and Information Engineering
University of Oulu, Finland

Page 2: Accelerating image recognition on mobile devices using GPGPU

Contents

Introduction
Mobile Image Recognition
• Local Binary Pattern
Graphics processor as a computing engine
GPU accelerated image recognition
• LBP Fragment Shader implementation
• Image preprocessing
Experiments and results
• Speed
• Power consumption

Page 3: Accelerating image recognition on mobile devices using GPGPU

Motivation

• Face detection and recognition is a key component of future multimodal user interfaces

• Mobile computation power still not harnessed properly for real-time computer vision

• High demand computations compromise battery life.

• Need for energy and computationally efficient solutions

Page 4: Accelerating image recognition on mobile devices using GPGPU

Face analysis using local binary patterns

• Face analysis is one of the major challenges in computer vision

• LBP method has already been adopted by many leading scientists

• Excellent results in face recognition and authentication, face detection, facial expression recognition, gender classification

Page 5: Accelerating image recognition on mobile devices using GPGPU

Local Binary Pattern

Page 6: Accelerating image recognition on mobile devices using GPGPU

GPU as a computing engine

• Newer phones include a GPU chipset
• OpenGL ES as a highly optimized and attractive accelerator interface
• Emerging platforms (OpenCL EP) will facilitate using the GPU as a computing resource
• Compatible data formats for graphics and camera sub-systems desirable

GPU can be treated as an independent entity

Page 7: Accelerating image recognition on mobile devices using GPGPU

Fixed pipeline (OpenGL ES 1.1) vs. programmable pipeline (OpenGL ES 2.0)

Page 8: Accelerating image recognition on mobile devices using GPGPU

Stream processing (OpenGL) vs. shared memory processing (CUDA)

Page 9: Accelerating image recognition on mobile devices using GPGPU

OpenCL (Embedded Profile)

• Emerging platforms will offer needed flexibility
• OpenCL Embedded Profile is a subset of OpenCL
• Supports data and task parallel programming models
• Code executed concurrently on CPU & GPU (& DSP)
  – Other current and future resources are compatible
  – Easier programming in a heterogeneous processor environment

• High parallelization on image processing computations -> High efficiency

Page 10: Accelerating image recognition on mobile devices using GPGPU

GPU assisted face analysis process

Page 11: Accelerating image recognition on mobile devices using GPGPU

GPU-accelerated image recognition

• OpenGL ES 2.0:
  – Image features (LBP, ...) extraction
  – Image preprocessing
  – Image scaling
  – Displaying
• C code:
  – Camera control
  – Classification

Page 12: Accelerating image recognition on mobile devices using GPGPU

LBP fragment shader implementation

• Access the image via texture lookup
• Fetch the selected picture pixel
• Fetch the neighbours' values
• Compute binary vector
• Multiply by weighting factor

• Two versions:
  – Version 1: calculates LBP map in one grayscale channel
  – Version 2: calculates 4 LBP maps in RGBA channels
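
The per-pixel steps on this slide can be sketched on the CPU side as follows. This is a minimal illustration in Python, not the shader from the talk: the neighbour ordering, the `>=` threshold and the names `NEIGHBOUR_OFFSETS`, `WEIGHTS`, `lbp_pixel` and `lbp_map` are assumptions made for the example. It corresponds to version 1 (one grayscale channel); version 2 would apply the same computation to each of the four RGBA channels.

```python
# Offsets of the 8 neighbours, clockwise from the top-left (an assumed ordering).
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]
# Weighting factors: one power of two per neighbour.
WEIGHTS = [1 << i for i in range(8)]


def lbp_pixel(image, row, col):
    """Return the 8-bit LBP code of one interior pixel of a grayscale image."""
    centre = image[row][col]                      # fetch the selected pixel
    code = 0
    for (dr, dc), weight in zip(NEIGHBOUR_OFFSETS, WEIGHTS):
        neighbour = image[row + dr][col + dc]     # fetch a neighbour value
        bit = 1 if neighbour >= centre else 0     # one element of the binary vector
        code += bit * weight                      # multiply by the weighting factor
    return code


def lbp_map(image):
    """Compute the LBP map for all interior pixels (border pixels are skipped)."""
    height, width = len(image), len(image[0])
    return [[lbp_pixel(image, r, c) for c in range(1, width - 1)]
            for r in range(1, height - 1)]


image = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
print(lbp_map(image))  # [[120]]
```

In a fragment shader the two loops of `lbp_map` disappear, since the GPU runs the per-pixel function once for every output fragment.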

Page 13: Accelerating image recognition on mobile devices using GPGPU

Preprocessing

Create quad

Divide texture & convert to grayscale

Render each piece in one channel
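
One way to picture this preprocessing step is the CPU-side sketch below. It is an illustration under stated assumptions, not the implementation from the talk: the slide does not say how the texture is divided, so the split into four quadrants, the BT.601 luma weights and the names `to_gray` and `pack_quadrants` are all assumed for the example.

```python
def to_gray(rgb):
    """Convert one (r, g, b) pixel to an integer gray level (BT.601 weights, assumed)."""
    r, g, b = rgb
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def pack_quadrants(rgb_image):
    """Pack the four quadrants of an image into the R, G, B, A channels.

    The result has half the width and half the height of the input; each
    output pixel holds one gray value from each quadrant.
    """
    height, width = len(rgb_image), len(rgb_image[0])
    half_h, half_w = height // 2, width // 2
    gray = [[to_gray(px) for px in row] for row in rgb_image]
    packed = []
    for r in range(half_h):
        row = []
        for c in range(half_w):
            row.append((gray[r][c],                      # top-left     -> R
                        gray[r][c + half_w],             # top-right    -> G
                        gray[r + half_h][c],             # bottom-left  -> B
                        gray[r + half_h][c + half_w]))   # bottom-right -> A
        packed.append(row)
    return packed


tiny = [[(1, 1, 1), (2, 2, 2)],
        [(3, 3, 3), (4, 4, 4)]]
print(pack_quadrants(tiny))  # [[(1, 2, 3, 4)]]
```

With the image packed this way, a single pass over the smaller RGBA texture can process four grayscale pieces at once.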

Page 14: Accelerating image recognition on mobile devices using GPGPU

Experiments setup

• OMAP 3 family (OMAP3530)
  – ARM Cortex A8 CPU
  – PowerVR SGX535 GPU
• 3 set-ups:
  – Beagleboard revision 3
  – Zoom AM3517EVM (TI Sitara)
  – Nokia N900

Page 15: Accelerating image recognition on mobile devices using GPGPU

Processing times: LBP extraction

• Computing LBP in four channels (version 2) faster than computing in one
• CPU faster than GPU
• Concurrent execution of algorithms in GPU + CPU increases performance

Size        GPUv1    GPUv2    CPU      CPU & GPUv1   CPU & GPUv2
1024x1024   232 ms   180 ms   100 ms   116 ms        90 ms
512x512     76 ms    46 ms    25 ms    37 ms         23 ms
64x64       2 ms     1.5 ms   0.4 ms   1 ms          0.2 ms

Page 16: Accelerating image recognition on mobile devices using GPGPU

Processing times: Preprocessing

• GPU outperforms CPU in simple pixelwise operations (scaling + interpolation)
• Concurrent execution of algorithms in GPU + CPU slower than GPU alone due to data transfers

Size        GPU      CPU      CPU & GPU
1024x1024   35 ms    100 ms   54 ms
512x512     10 ms    25 ms    15 ms
64x64       0.2 ms   0.4 ms   0.4 ms

Page 18: Accelerating image recognition on mobile devices using GPGPU

Speed (II): Preprocessing

Size        GPU      CPU      GPU preprocessing & CPU LBP extraction
1024x1024   215 ms   205 ms   142 ms
512x512     56 ms    50 ms    40 ms
64x64       1.8 ms   1 ms     0.8 ms

Page 19: Accelerating image recognition on mobile devices using GPGPU

Power and Energy consumptions

• Power consumption of GPU and CPU is independent
• CPU – 190 mW
• GPU – 110 mW to 130 mW (increases with image size)
• Energy consumption depends on processing time
• GPU has smaller energy per operation.

Operation            GPU       CPU
Preprocessing        27 mJ     19 mJ
LBP                  5.3 mJ    10 mJ
Combined algorithm   32.3 mJ   28 mJ
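
The statement that energy depends on processing time follows from energy = power × time. The short Python sketch below only illustrates that relationship; the helper name `energy_mj` is made up for the example, and pairing the 190 mW CPU figure with a 100 ms run is an assumption, since the slide does not say which image size the energy table refers to.

```python
def energy_mj(power_mw, time_ms):
    """Energy in millijoules for a constant power draw (mW) over a time (ms)."""
    # mW * ms = microjoules, so divide by 1000 to get millijoules.
    return power_mw * time_ms / 1000.0


# Assumed pairing: CPU power from this slide, a 100 ms processing time.
print(energy_mj(190, 100))  # 19.0
```

Under that assumption the result equals the 19 mJ CPU entry in the table, which suggests the table was derived in this way, although the slides do not confirm it.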

Page 20: Accelerating image recognition on mobile devices using GPGPU

Summary

• GPUs can be used as general purpose processors
• New platforms will offer more efficiency and flexibility
• Non-optimized interfaces introduce excessive overheads

Page 21: Accelerating image recognition on mobile devices using GPGPU

Future directions

• Implementation of classifier
• Implementations in OpenCL
• Multi-scale LBP
• Implementation of other feature extraction methods

Page 22: Accelerating image recognition on mobile devices using GPGPU

Thank you!

• Any questions?

Thanks to Texas Instruments for the donation of the hardware

