Copyright © 2016 Embedded Vision Alliance 1 Computer Vision 2.0: Where We Are and Where We’re Going Jeff Bier Founder, Embedded Vision Alliance | President, BDTI May 3, 2016

"Computer Vision 2.0: Where We Are and Where We're Going," a Presentation from the Embedded Vision Alliance



Page 2

The Evolution of Vision Technology

Computer vision: research and fundamental technology for extracting meaning from images

Machine vision: factory applications

Embedded vision: thousands of applications

• Consumer, automotive, medical, defense, retail, gaming, security, education, transportation, …

• Embedded systems, mobile devices, PCs and the cloud

Page 3

Applications

Page 4

Applications: Natural User Interface

Source: 3rd-strike.com

Source: engadget.com

Source: stuff.tv

Page 5

Applications: Automotive Safety

Source: Subaru

Source: digitaltrends.com

Source: technologyreview.com

“Now, to win top overall safety scores from the IIHS, a car needs to have a forward-collision warning system with automatic braking. In addition, any autobrake system has to function effectively in formal track tests…”

“A 2009 study conducted by the IIHS found a 7 percent reduction in crashes for vehicles with a basic forward-collision warning system, and a 14 to 15 percent reduction for those with automatic braking.”

Page 6

Software-Defined Sensor

Source: videantis

Source: proctorcars.com

Source: optalert.com

Source: teslaliving.net

Page 7

Mercedes Magic Body Control

https://www.youtube.com/watch?v=940wGYCeQ68

Page 8

Applications: Keeping an Eye on Our Stuff

Source: bestbuy.ca

Source: Tend Insights

Source: Camio

Page 9

Keeping an Eye on Our Stuff (Industrial Version)

Source: govtech.com

Source: exacq.com

Source: technologyreview.com

Source: Kespry

Page 10

Autonomous Vehicles Come in Many Varieties

Source: nimblechapps.com

Source: digitaltrends.com

Source: forbes.com

Source: linuxgizmos.com

Page 11

DJI Phantom 4

https://www.youtube.com/watch?v=JJPSSqMQajA

Page 12

Applications: What’s the Future Hold?

• I can’t tell you for sure what the killer app is for computer vision.

• (If I could, I’d be rich … or at least I’d be Carnac the Magnificent.)

• But we do know a couple of things.

Page 13

Computer Vision is an Enabling Technology, Not an End in Itself

One thing we know about computer vision is that it will eventually be “invisible”

Shoebox speech recognition system, 1960s, IBM

Dragon, 1990s-2000s

iPhone, 2015

Page 14

We Also Know Apps Don’t Live Alone

• We’ve been here before; lots of times before, actually

• Example: RISC in the 1980s, digital signal processing (DSP) in the 1990s

• Search for applications enabled by a new technology …

• … leads to a scramble to figure out common algorithms and algorithmic building blocks …

• … which in turn drive processor architecture (“what do we do in hardware?”)

• … which in turn drives what apps are possible or easy

Diagram: Algorithms, Processors, Applications

Page 15

Algorithms

Page 16

Vision Algorithms Are Challenging

• Infinitely varying inputs in many applications

• Uncontrolled conditions: lighting, orientation, motion, occlusion

• Leads to ambiguity…

• Leads to the need for complex, multi-layered algorithms to extract meaning from pixels

• Plus:

• Lack of analytical models means exhaustive experimentation is required

• Numerous algorithms and algorithm parameters to choose from

Source: www.selectspecs.com
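To make "exhaustive experimentation" and "numerous algorithm parameters" concrete, here is a minimal sketch of a parameter sweep. Everything in it is an assumption made for illustration: the one-dimensional synthetic "scanline", the toy threshold detector, and the names `make_scene`, `detect` and `sweep` are hypothetical and do not come from the presentation. It only shows how parameter choices interact with uncontrolled lighting.

```python
from itertools import product


def make_scene(brightness):
    """Hypothetical 1-D scanline: dark background with one 4-pixel bright object."""
    background = [10 + brightness] * 8
    obj = [120 + brightness] * 4
    return background + obj + background


def detect(scanline, threshold, min_width):
    """Return True if at least min_width consecutive pixels exceed threshold."""
    run = 0
    for px in scanline:
        run = run + 1 if px > threshold else 0
        if run >= min_width:
            return True
    return False


def sweep(thresholds, widths, brightness_levels):
    """Score every (threshold, min_width) pair across lighting conditions."""
    results = {}
    for threshold, width in product(thresholds, widths):
        # Correct detections on scenes that contain the object.
        hits = sum(detect(make_scene(b), threshold, width) for b in brightness_levels)
        # False alarms on empty scenes with the same lighting.
        false_alarms = sum(
            detect([10 + b] * 20, threshold, width) for b in brightness_levels
        )
        results[(threshold, width)] = hits - false_alarms
    return results


scores = sweep(thresholds=[50, 100, 150], widths=[2, 4, 6], brightness_levels=[0, 40, 80])
best = max(scores, key=scores.get)
print(best, scores[best])
```

Even in this toy setting, a threshold that works in dim light raises false alarms in bright light, which is why the settings have to be found by trying combinations rather than derived analytically.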

Page 17

Source: xkcd.com

Page 18

Source: hitl.washington.edu/artoolkit

Source: xkcd.com

Page 19

Deep Neural Networks: Learning Machines

Source: NVIDIA

Page 20

Expanding Applicability of Deep Learning

• Originally used solely for classification, convnets are now also being used for:

• Detection

• Segmentation

• Sequences (e.g., video captioning)

• Visual motor control

Source: Long, Shelhamer, Darrell. CVPR’15

Source: Levine, Finn, Darrell, Abbeel, UC Berkeley
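The step from classification to segmentation can be illustrated with a toy sketch: the same per-class scoring is applied once to pooled features (one label per image) or at every pixel (one label per pixel). This is an illustration under stated assumptions, not the method of the cited work; the feature map, the weights and the function names are hypothetical, and a real convnet would learn both the features and the weights.

```python
def class_scores(feature, weights):
    """Linear class scores for one feature vector (like a 1x1 convolution at one pixel)."""
    return [sum(w * f for w, f in zip(row, feature)) for row in weights]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


def classify(feature_map, weights):
    """Image-level label: average the features over all pixels, then score once."""
    pixels = [px for row in feature_map for px in row]
    channels = len(pixels[0])
    pooled = [sum(px[c] for px in pixels) / len(pixels) for c in range(channels)]
    return argmax(class_scores(pooled, weights))


def segment(feature_map, weights):
    """Pixel-level labels: score every pixel with the same weights."""
    return [[argmax(class_scores(px, weights)) for px in row] for row in feature_map]


# Hypothetical 2x3 feature map with two channels per pixel.
feature_map = [
    [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9)],
    [(0.7, 0.3), (0.9, 0.1), (0.2, 0.8)],
]
# Hypothetical weights: class 0 follows channel 0, class 1 follows channel 1.
weights = [(1.0, 0.0), (0.0, 1.0)]

image_label = classify(feature_map, weights)
label_map = segment(feature_map, weights)
print(image_label, label_map)
```

The image-level answer hides the right-hand column that belongs to the other class; the label map keeps it, which is the extra information segmentation provides.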

Page 21

What Changes… and What Doesn’t?

Then:

• Needed many algorithm engineers

• Needed lots of compute for runtime

• We lacked an underlying theory of visual perception

• We struggled to implement what we could describe

Now:

• Need lots of training data

• Need lots of compute for runtime… and more for training

• We still lack the theory, but now have more general solutions

• We are increasingly able to implement what we can show

Page 22

Where Does That Leave Us?

• For many applications algorithms will converge around deep neural networks

• Some applications will include multiple deep learning modules

• We’ll also converge on a small set of other algorithms (i.e., not deep learning) for specific tasks

• E.g., SLAM, stereo correspondence, panoramic image stitching, …

Diagram: Algorithms, Processors, Applications

Page 23

System Architecture

Page 24

Every Computer Vision System Looks Something Like This

Block diagram: Camera, Local Processor, Network Connection, Cloud Backend
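The four blocks named on this slide can be sketched as a chain of stages. This is a minimal sketch under assumptions: the frame format, the "motion" flag and the function names are hypothetical placeholders, and a real system would replace each stage with actual capture, inference, transport and storage code.

```python
def camera(num_frames):
    """Yield synthetic frames; in this toy, every fourth frame contains motion."""
    for i in range(num_frames):
        yield {"id": i, "pixels": bytes(64), "motion": i % 4 == 0}


def local_processor(frames):
    """Edge stage: drop frames without motion and keep only compact metadata."""
    for frame in frames:
        if frame["motion"]:
            yield {"id": frame["id"], "summary": "motion", "size": len(frame["pixels"])}


def network(messages, sent_log):
    """Transport stage: record which messages were sent, then pass them on."""
    for msg in messages:
        sent_log.append(msg["id"])
        yield msg


def cloud_backend(messages):
    """Backend stage: collect the ids of everything that arrived."""
    return [msg["id"] for msg in messages]


sent_log = []
received = cloud_backend(network(local_processor(camera(10)), sent_log))
print(received)
```

How much work sits in `local_processor` versus `cloud_backend` is the design choice that the cloud and edge discussion is about.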

Page 25

Cloud, Edge or Both? Yes.

Page 26

Lots of Options, with Tradeoffs, Depending on What You’re Trying To Do

Chart axes (each Low to High): Cloud Use (Compute and/or Bandwidth); Local Processing Power

Examples plotted: CubeWorks; Camio; NAUTO; Facebook, Google, Clarif.ai, …; ADAS
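A rough bandwidth calculation shows why the cloud-use axis matters. The numbers below are assumptions chosen only for illustration (uncompressed 1920x1080 video at 3 bytes per pixel and 30 frames per second, compared with 100 bytes of results per frame); they do not come from the presentation, and real systems would normally compress video before sending it.

```python
def raw_bitrate(width, height, bytes_per_pixel, fps):
    """Bits per second needed to send uncompressed video."""
    return width * height * bytes_per_pixel * fps * 8


def metadata_bitrate(bytes_per_result, results_per_second):
    """Bits per second needed to send only the results of local processing."""
    return bytes_per_result * results_per_second * 8


# Assumed camera: 1920x1080, 3 bytes per pixel, 30 frames per second.
stream = raw_bitrate(1920, 1080, 3, 30)
# Assumed local processing output: 100 bytes of results per frame.
events = metadata_bitrate(100, 30)
ratio = stream / events

print(stream, events, ratio)
```

Under these assumptions the raw stream needs about 1.5 Gbit/s while the results need 24 kbit/s, so more local processing power buys a large reduction in cloud bandwidth.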

Page 27

Processors for Deploying Vision

Page 28

The Old Days:

“Any color you want so long as it’s beige”

Page 29

Today: General-Purpose Chips Used for Vision

Lots of options:

• PC CPU

• PC CPU + discrete or integrated GPU

• Mobile application processor (e.g., Qualcomm Snapdragon)

• CPU + discrete or integrated FPGA (Xilinx, Altera)

• DSPs (e.g., Texas Instruments ‘C6x)

Page 30

Trend: Vision-specific Processors

The options multiply like crazy (get it?)

Processor Chips:

• Analog Devices BF609

• Inuitive NU3000

• Mobileye EyeQ4

• Movidius Myriad 2

• NXP S32V

• Texas Instruments TDA3x, TDA2Ec, Jacinto 6 Entry

Processor Cores:

• Apical Spirit

• Cadence Vision P5, Vision P6

• CEVA XM-4

• Synopsys DesignWare EV

• Vivante VIP7000, GC7000-XS VX

Page 31

Trend: Heterogeneity

• Heterogeneity is great! It gives:

• Most efficient use of your resources (cost, speed, power)

• Insurance (you’re not committed to a particular platform or technology)

• But it comes at a cost: hard to program

• Where we are now: “deal with it”

Page 32

Future: Heterogeneity

• Long term:

• Heterogeneity in hardware becomes increasingly hidden through higher level abstractions

• More vision-specific co-processors, which are specialized for the “winning” algorithms

• A winnowing of architectures reduces diversity

Page 33

Development

Page 34

Development: The Old Days…

• Development centered around the PC

• Algorithms implemented from scratch

• Hand-optimized

Page 35

Development: Today

• OpenCV enables fast algorithm experimentation

• Toolkits from technology suppliers

• Functionality encapsulated in software modules

• Object detection, emotion analysis, SLAM, AR

• In OpenCV and elsewhere

• If you need to optimize: CUDA, OpenCL, NEON compiler intrinsics, etc.

Page 36

Development: Future

• Heterogeneity of hardware becomes hidden

• OpenVX: Abstracts hardware, not the algorithm

• Higher-level APIs: Abstract the algorithm and hardware

• Higher-level deep learning abstractions

• Automated optimization of neural networks

• Automated design and training of neural networks

• Development shifts from implementation to integration
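What "hiding" hardware heterogeneity behind an abstraction might look like can be sketched in a few lines. This is a hypothetical illustration in plain Python, not the OpenVX API or any vendor toolkit: the `Backend` classes, `pick_backend` and `blur` are invented names, and the "accelerator" simply reports that it is unavailable.

```python
class CpuBackend:
    """Reference implementation that is always available."""

    name = "cpu"

    def available(self):
        return True

    def box_blur(self, row):
        # 3-tap box filter with edge replication.
        padded = [row[0]] + list(row) + [row[-1]]
        return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3 for i in range(len(row))]


class AcceleratorBackend(CpuBackend):
    """Stand-in for a vision co-processor; assumed absent on this machine."""

    name = "accelerator"

    def available(self):
        return False


def pick_backend(preferred):
    """Return the first backend in preference order that reports itself available."""
    for backend in preferred:
        if backend.available():
            return backend
    raise RuntimeError("no backend available")


def blur(row, backends=(AcceleratorBackend(), CpuBackend())):
    """Application code calls blur() and never names the hardware."""
    return pick_backend(backends).box_blur(row)


chosen = pick_backend([AcceleratorBackend(), CpuBackend()])
result = blur([0, 3, 6])
print(chosen.name, result)
```

The caller expresses what to compute and the dispatch layer decides where it runs, which is the sense in which development shifts from implementation to integration.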

Page 37

The Business of Computer Vision

Page 38

Analogy to Wireless (Thanks, Raj!)

• Ubiquitous

• Invisible

• A gigantic creator of value

• Both for suppliers

• … and those who use it

Facebook stats:

• 1.5B monthly mobile active users

• 989M daily mobile active users

• 54% login ONLY from mobile

• 79% of ad revenue from mobile

Page 39

Intel’s Public Computer Vision Investments

Timeline chart, 2009 to 2015, of investments and acquisitions (chart amounts are estimates):

• Imagination: Mobile GPU, $38M investment, 6/2009

• CognoVision: Digital signage, $30M, 11/2010

• InVision Biometrics: 3D sensors, $50M, 11/2011

• Olaworks: Face recognition, $30.7M, 4/2012

• Tyzx: Stereo vision, Est. $50M, 2012

• Omek Interactive: Gesture recognition, $40M, 7/2013

• Prism Skylabs: Retail people tracking, $25M investment, 10/2013

• Emotient: Facial expression recognition, $6M, 2/2014

• Tobii: Eye tracking, $21M, 3/2014

• 3Gear: Gesture recognition, $1.9M, 4/2014

• EyeFluence: Eye tracking, Undisclosed, 11/2014

• Avegant: Glyph VR headset, $9.4M, 11/2014

• InVisage: Quantum film sensors, $32.5M, 12/2014

• Vuzix: Digital eyewear, $24.8M, 1/2015

Page 40

Conclusions

• Computer vision will become ubiquitous and invisible

• It will be a huge creator of value, both for suppliers and for those who leverage the technology in their applications

• Deep learning will become a dominant technique (but not the only technique)

• Computation distributed between the cloud and the edge

• Heterogeneity in hardware becomes increasingly hidden

• Development shifts from implementation to integration

• …Until the next disruptive technology emerges

Page 41

Embedded Vision Alliance Member Companies