1 Open Senses Bob Igo http://bob.igo.name CPOSC 2013

Bob Igo CPOSC 2013 · 2020. 1. 4. · 1 Open Senses Bob Igo CPOSC 2013


Open Senses

Bob Igo

http://bob.igo.name

CPOSC 2013


Robots Need Senses

● Vision
  – To detect
● Hearing
  – To hear and locate commands
● Touch
  – To manipulate
● Speech(*)
  – To argue


But I Don't Have a Robot!

● Many robots use commodity hardware.


Vision

● Uses
  – License plate recognition
  – Logo recognition
  – Motion detection


Vision: Face Detection

● Uses
  – Keep screen awake.
  – Lock screen.
  – Pause video when not watching.
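The "pause video when not watching" use can be sketched as a small debounce over per-frame face-detection results. This is only an illustration: the class name `PresenceMonitor` and the frame-count threshold are hypothetical, and the per-frame boolean is assumed to come from whatever face detector is in use.

```python
class PresenceMonitor:
    """Tracks viewer presence from per-frame face-detection results."""

    def __init__(self, absent_frames_to_pause=30):
        # Hypothetical debounce: this many consecutive frames without a face
        # is treated as "nobody is watching".
        self.absent_frames_to_pause = absent_frames_to_pause
        self.absent_frames = 0
        self.paused = False

    def update(self, face_seen):
        """Feed one frame's result; returns True while playback should pause."""
        if face_seen:
            # A face came back: reset the counter and resume.
            self.absent_frames = 0
            self.paused = False
        else:
            self.absent_frames += 1
            if self.absent_frames >= self.absent_frames_to_pause:
                self.paused = True
        return self.paused
```

Debouncing matters because a detector can miss a face for a frame or two (a turned head, a blink of bad lighting) without the viewer having left.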


Vision: Face Detection

● Project: OpenCV
  – Computer vision suite
  – Tons of features
  – Linux, Android, OSX, iOS, Windows
● Demo: ./facedetect.py
  – Angle and facial expression critical
    ● Tied to training data
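A minimal face-detection sketch with OpenCV's Python bindings follows. It is not the `facedetect.py` from the demo. It assumes the `opencv-python` package, which ships Haar cascade files under `cv2.data.haarcascades`; the function names and the `min_neighbors` default are illustrative choices. The bundled cascade used here is a frontal-face model, which is one reason face angle matters so much.

```python
def detect_faces(image_path, min_neighbors=5):
    """Return a list of (x, y, w, h) face boxes found in the image file."""
    import cv2  # imported lazily so largest_face() works without OpenCV

    # Assumption: opencv-python bundles this frontal-face Haar cascade.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # cascades work on grayscale
    boxes = cascade.detectMultiScale(
        gray, scaleFactor=1.1, minNeighbors=min_neighbors
    )
    return [tuple(int(v) for v in box) for box in boxes]


def largest_face(boxes):
    """Pick the box with the largest area, or None if no face was found."""
    return max(boxes, key=lambda b: b[2] * b[3], default=None)
```

Raising `min_neighbors` trades missed faces for fewer false positives.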


Vision: Face Detection

● Uses
  – Find human weak points
    ● Neck is positioned below the face area.
    ● Eye location often provided.
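"Below the face area" is plain rectangle arithmetic once a detector returns an (x, y, w, h) box. The sketch below assumes image coordinates with y growing downward; the half-face-height proportion is a made-up value for illustration.

```python
def region_below_face(face_box, image_height, height_ratio=0.5):
    """Return an (x, y, w, h) box directly beneath a detected face box.

    height_ratio is a hypothetical proportion: the region is that fraction
    of the face height, clipped to the bottom of the image.
    """
    x, y, w, h = face_box
    top = y + h  # first row below the face
    height = min(int(h * height_ratio), max(image_height - top, 0))
    return (x, top, w, height)
```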


Vision: Face Recognition

● Uses
  – Tagging/sorting of photos
  – Custom doorbell project
    ● e.g. "Skippy is here." instead of "ding-dong"
● Requires training
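The doorbell idea reduces to a lookup with a fallback. In this sketch the recognizer itself is out of scope: `name` and `confidence` stand in for the output of some trained face recognizer, and the 0-to-1 score with a 0.8 threshold is an assumption (real recognizers may report a distance instead, where lower is better).

```python
def doorbell_message(name, confidence, threshold=0.8):
    """Announce a recognized visitor, or fall back to the plain chime.

    name: label from a (hypothetical) trained recognizer, or None.
    confidence: assumed score in [0, 1], higher meaning more certain.
    """
    if name is None or confidence < threshold:
        return "ding-dong"
    return f"{name} is here."
```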


Vision: Face Recognition

● Uses
  – Identify resistance leaders for target prioritization.
  – Test disguise effectiveness.


Hearing: Localization

● Trivial to detect sound
  – Nontrivial to figure out its source.
● Uses
  – Determine room/zone occupancy
  – Target PTZ camera
● Projects
  – ManyEars
  – HARK
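To see why locating a source is harder than detecting it, consider the simplest case: two microphones and a far-away source. The sketch below is a textbook time-difference-of-arrival calculation, not the method ManyEars or HARK uses (those work with microphone arrays). The function name and the assumption that the inter-microphone delay is already known are illustrative; estimating that delay robustly is the hard part.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate value for dry air near 20 °C


def bearing_from_delay(delay_s, mic_spacing_m):
    """Angle of arrival in degrees (0 = straight ahead) for a far-field source.

    delay_s: arrival-time difference between the two microphones, in seconds.
    mic_spacing_m: distance between the microphones, in meters.
    """
    ratio = SPEED_OF_SOUND_M_S * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against measurement noise
    return math.degrees(math.asin(ratio))
```

A single pair cannot tell front from back, and a 20 cm spacing gives a maximum delay of under a millisecond, which hints at why dedicated multi-microphone hardware is used.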


Hearing: Localization

● Uses
  – Locate living humans


Localization: ManyEars

● Linux, OSX, Windows
● Specialized hardware
  – OpenHardware
  – 8 microphone inputs
  – Realtime constraints
  – CDN $1000 pre-made
  – CDN $670 DIY

8SoundsUSB


Localization: ManyEars


Localization: HARK

● Open Source
  – Only official support for Ubuntu
  – Based on ManyEars
● Localization + Separation + Recognition
● Specialized hardware
  – Not open

MicroCone, USD $360


Localization: HARK

● Each sound source can be localized.

● Simultaneous audio can be processed into separate audio channels.

● Speech recognition can be done on each channel.


Hearing: Speech Recognition

● Uses
  – Front-end to automation suite
  – Occupancy detection
● Project
  – Julius


Recognition: Julius

● Linux, Windows
● Continuous recognition
● Great for domain-constrained inputs.
● You need an acoustic model.


Recognition: Julius

● Things to change
  – A dictionary
    ● Words and the phonemes that make them.
      – e.g. [CALL] k ao l
  – A grammar
    ● What are the valid sentences in the domain?
      – e.g. SENT: CALL_V F_NAME_KENNETH
● Acoustic model: http://www.repository.voxforge1.org/downloads/Main/Tags/Releases/0_1_1-build726/

Example command:
● julius-4.2.3 -input mic -C ../julius_acoustic_models/julian.jconf
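To make the two examples concrete, here is a small sketch that splits them into their parts. It handles only lines shaped like the two examples on this slide; real Julius dictionary and grammar files carry more structure than this, and the function names are invented for illustration.

```python
def parse_dictionary_entry(line):
    """Split an entry like '[CALL] k ao l' into (word, [phonemes])."""
    label, *phonemes = line.split()
    if not (label.startswith("[") and label.endswith("]")) or not phonemes:
        raise ValueError(f"unexpected dictionary entry: {line!r}")
    return label[1:-1], phonemes


def parse_grammar_rule(line):
    """Split a rule like 'SENT: CALL_V F_NAME_KENNETH' into (name, [symbols])."""
    name, _, body = line.partition(":")
    symbols = body.split()
    if not name.strip() or not symbols:
        raise ValueError(f"unexpected grammar rule: {line!r}")
    return name.strip(), symbols
```

Read together, the grammar says which word categories may follow one another, and the dictionary says how each word in those categories sounds.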


Touch

● Uses
  – Avoid crushing delicate objects.
  – Simply detect contact.


Touch

● Uses
  – Crush delicate objects.


Touch

● Project
  – TakkTile
    ● Schematics CC BY-SA
    ● Firmware GPLv3+
    ● NOTE:
      – Terms of licenses may conflict with what they state on their website.
● Arduino, Ubuntu (via USB-I2C bridge ($44-$49))

DIY 3-sensor TakkTile
http://www.takktile.com/tutorial:thee-sensor-array (sic)


Touch: TakkTile

TakkStrip pre-made: $149 with rubber; $49 without.


Touch: TakkTile

● Technology
  – MEMS barometers
    ● robust and sensitive
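Detecting contact from a barometer-based sensor can be as simple as watching for a pressure rise over a resting baseline. The sketch below assumes a list of pressure readings in arbitrary units that increase when the sensor is pressed; the function name, the baseline window, and the threshold are hypothetical, and it does not use the TakkTile firmware or any of its interfaces.

```python
from statistics import mean


def detect_contact(samples, baseline_count=5, threshold=2.0):
    """Return the index of the first reading that rises more than
    `threshold` above the resting baseline, or None if none does.

    The baseline is the mean of the first `baseline_count` readings,
    which are assumed to be taken with nothing touching the sensor.
    """
    if len(samples) <= baseline_count:
        return None
    baseline = mean(samples[:baseline_count])
    for i in range(baseline_count, len(samples)):
        if samples[i] - baseline > threshold:
            return i
    return None
```

The same reading, compared against a second and higher limit, could tell a gripper when to stop squeezing.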


Touch: TakkTile


Speech Synthesis

● Uses
  – Give feedback without occupying your eyes
  – Provide complex information
  – Be one half of a speech interface


Speech Synthesis

● Uses
  – Communicate equipment needs to pre-uprising human population.
    ● e.g. "I need your clothes, your boots and your motorcycle."


Speech Synthesis: OpenMary

● Project: OpenMary
  – Linux, OSX, Solaris, Windows
  – client/server
  – "Emotional TTS"
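Because the design is client/server, any program that can make an HTTP request can ask for speech. The sketch below only builds a request URL. Treat every specific as an assumption to check against your installed version: the default port 59125, the `/process` endpoint, the parameter names, and the `en_GB` locale for the `dfki-poppy` voice.

```python
from urllib.parse import urlencode


def mary_request_url(text, voice="dfki-poppy", locale="en_GB",
                     host="localhost", port=59125):
    """Build a synthesis request URL for a locally running server.

    Assumed, not verified here: endpoint path, parameter names, port.
    """
    params = {
        "INPUT_TEXT": text,
        "INPUT_TYPE": "TEXT",
        "OUTPUT_TYPE": "AUDIO",
        "AUDIO": "WAVE_FILE",
        "LOCALE": locale,
        "VOICE": voice,
    }
    return f"http://{host}:{port}/process?{urlencode(params)}"
```

Fetching that URL (for example with `urllib.request.urlopen`) would, under those assumptions, return WAV audio to save or play.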


Speech Synthesis: OpenMary

● marytts-client.sh


Speech Synthesis: OpenMary


Speech Synthesis: OpenMary

● Get new voices
  – marytts-component-installer.sh


Speech Synthesis: OpenMary

● Poppy (dfki-poppy) is awesome.


Speech Synthesis: OpenMary

● Obadiah (dfki-obadiah) is super casual.


Available Demos

● OpenCV
  – Face detection
● OpenMary
  – Speech synthesis


References

● OpenCV project
  – http://opencv.org/
● OpenCV Face Recognition Training
  – http://docs.opencv.org/trunk/modules/contrib/doc/facerec/facerec_tutorial.html
● ManyEars
  – http://sourceforge.net/apps/mediawiki/manyears/index.php?title=Main_Page
● 8SoundsUSB
  – http://sourceforge.net/apps/mediawiki/eightsoundsusb/index.php?title=Main_Page
● HARK
  – http://winnie.kuis.kyoto-u.ac.jp/HARK/
● HARK video demo
  – http://www.youtube.com/watch?v=xpjPun7Owxg
● Julius
  – http://julius.sourceforge.jp/en_index.php
● TakkTile
  – http://www.takktile.com/
● Barometers as touch sensors
  – http://www.youtube.com/watch?v=0EMi_pcG9rE
● iRobot hand with TakkTile
  – https://www.youtube.com/watch?v=WvjzSrMbfLk
● OpenMary
  – http://mary.dfki.de/