EURON Summer School 2003
1
Real-Time Computer Vision for Human Interfaces
Yoshio Matsumoto
Nara Institute of Science and Technology, Japan
2
Introduction to Robotics Lab at NAIST
Nara
3
video
4
Face Measurement System and Its Applications
Yoshio Matsumoto
Nara Institute of Science and Technology, Japan
5
Motivation: Face Measurement
• For Human-Robot Interaction, natural ways of communication are required:
  • when a person teaches a task to a robot
  • when a person performs a cooperative task with a robot
• Head motion and gaze direction reflect the intention and attention of a person.
Goal of this research:
Head Motion => Gesture
Gaze Direction => Attention
Gesture and Attention => Intention
6
Gaze Tracking Techniques: Corneal Reflection
• Eye rotation is detected using IR reflection on the cornea
• Head should not move, or head pose is measured by a magnetic sensor etc.
• Head-mounted device prohibits natural behaviors
• Accurate under the best conditions, but those conditions are hard to maintain
• Binocular systems and external camera systems are also available
7
Gaze Tracking Techniques: Corneal Reflection (cont’d)
• Purkinje images appear as small white dots in close proximity to the (dark) pupil
• tracker calibration is achieved by measuring user gazing at properly positioned grid points (usually 5 or 9)
• tracker interpolates the point of regard (POR) on a perpendicular screen in front of the user
Figure: Pupil and Purkinje images as seen by eye tracker’s camera
8
Gaze Tracking Techniques: EOG (electro-oculography)
Figure: EOG measurement:
• relies on measurement of skin’s potential differences using electrodes placed around the eye
• most widely used method some 20 years ago (still used today)
• measures eye movements relative to head position
• not generally suitable for POR measurement (unless head is also tracked)
9
Gaze Tracking Techniques: Scleral Contact Lens/Search Coil
Figure: Scleral coil:
• search coil embedded in contact lens and electromagnetic field frames
• possibly most precise
• similar to electromagnetic position/orientation trackers used in motion-capture
10
Gaze Tracking Techniques: Scleral Contact Lens/Search Coil (cont’d)
Figure: Example of scleral suction ring insertion:
• most intrusive method
• insertion of lens requires care
• wearing of lens causes discomfort
• highly accurate, but limited measurement range (~5°)
• measures eye movements relative to head position
• not generally suitable for POR measurement (unless head is also tracked)
11
Real-Time Vision for Face Measurement
• One of the important topics in PUI research
• Being actively studied at Microsoft, IBM, MIT, CMU, Univ. of Illinois etc.
  Toyama, MSR (1998)
  Colmenarez, UIUC (1997)
• Few of them can measure the quantitative gaze direction
• Most of them use monocular camera systems
12
Our Approach
• Stereo camera system for 3D measurement
• 3D facial model
• Feature tracking by normalized correlation
• 3D model fitting algorithm based on spring model
• Real-time Face Tracking
• Gaze direction, blinking and lip motion are additionally measured
13
1. Algorithm of Face Measurement
14
System Configuration
• PC with Pentium III 450MHz or higher
• OS: Linux 2.2 or 2.4
• Stereo Camera Pair
15
3D Facial Model
• Corners of eyes and mouth are used for tracking
• 3D Facial Model = Template Images + 3D coordinates
3D coordinates of features:
(-50.7, 13.4, -4.2)
(-18.2, 12.8, -1.9)
( 16.1, 12.8, -2.1)
( 48.0, 11.0, -10.8)
(-25.9, -58.7, 10.1)
( 27.6, -58.6, 4.5)
Figure labels: Template image for feature tracking; Template image of whole face
16
3D Model Fitting for Face Tracking
Minimizing
$$E = \sum_{i=0}^{n-1} W_i \, (R x_i + t - y_i)^T (R x_i + t - y_i)$$
• $x_i$: Coordinate of each feature in the model
• $y_i$: Coordinate of each measurement
• $W_i$: Reliability for each measurement (0..1)
• $R$: Rotation matrix
• $t = (x, y, z)$: Translation vector
• $n$: # of features
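The fitting error E above can be evaluated directly. The following is a minimal sketch, assuming NumPy arrays of shape (n, 3) for the model features and measurements. The names `fitting_error` and `weighted_rigid_fit` are hypothetical, and the minimizer shown is the closed-form weighted SVD (Kabsch) solution, a swapped-in technique used here for illustration rather than the gradient-based spring method of the slides.

```python
import numpy as np

def fitting_error(R, t, x, y, w):
    """E = sum_i W_i (R x_i + t - y_i)^T (R x_i + t - y_i)."""
    r = x @ R.T + t - y                  # residual of each feature, shape (n, 3)
    return float(np.sum(w * np.sum(r * r, axis=1)))

def weighted_rigid_fit(x, y, w):
    """Closed-form minimizer of E (weighted Kabsch); illustrative only."""
    wn = w / w.sum()
    cx = wn @ x                          # weighted centroid of the model features
    cy = wn @ y                          # weighted centroid of the measurements
    H = (x - cx).T @ (wn[:, None] * (y - cy))   # 3x3 weighted covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cy - R @ cx
    return R, t
```

Given measured feature positions `y` and reliabilities `w`, `weighted_rigid_fit(x, y, w)` returns a pose whose error can be checked with `fitting_error`.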
17
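3D Model Fitting for Face Tracking (cont’d)
Assumption: the displacement between the previous and current frame is small.
Figure: model and measurement at time t and at time t + dt. The pose (R, t) with t = (x, y, z) at time t is updated by a small rotation and translation increment, giving t = (x + Δx, y + Δy, z + Δz) at time t + dt.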
18
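3D Model Fitting for Face Tracking (cont’d)
Gradient 3D model fitting method based on virtual spring model
• At time t + dt, displacement due to the change in rotation and translation still remains between model and measurement.
• Stiffness of spring $k_i = W_i$ (Reliability)
• $R$ and $t$ are acquired simultaneously
Figure: virtual springs with stiffness k1, k2, k3 exert forces F1, F2, F3 between the model features and their measurements.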
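One way to picture the virtual spring model is as an iteration in which each spring pulls its model feature toward the measurement with force F_i = k_i (y_i - p_i). The sketch below is an assumption-laden illustration: the slides do not state the update rule, so the net-force translation step, the net-torque rotation step, and the names `small_rotation` and `spring_fit` are all hypothetical.

```python
import numpy as np

def small_rotation(omega):
    """Rotation matrix for the rotation vector omega (Rodrigues' formula)."""
    angle = np.linalg.norm(omega)
    if angle < 1e-12:
        return np.eye(3)
    k = omega / angle
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def spring_fit(x, y, w, R, t, iterations=100):
    """Refine (R, t) from the previous frame's pose using spring forces."""
    for _ in range(iterations):
        p = x @ R.T + t                       # current model feature positions
        f = w[:, None] * (y - p)              # spring forces, stiffness k_i = W_i
        c = np.average(p, axis=0, weights=w)  # weighted centroid of the model
        r = p - c
        shift = f.sum(axis=0) / w.sum()       # translation step from the net force
        torque = np.cross(r, f).sum(axis=0)   # net torque about the centroid
        inertia = np.sum(w * np.sum(r * r, axis=1))
        dR = small_rotation(torque / inertia) # rotation step from the net torque
        R = dR @ R                            # rotate about c, then translate
        t = dR @ (t - c) + c + shift
    return R, t
```

Because the pose of the previous frame is used as the starting point, the small-displacement assumption keeps the iteration close to the minimum of E, and rotation and translation are updated together in each step.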
19
Result of Face Tracking
20
Accuracy of Face Tracking
Figure: Estimated Angle [deg] vs. Angle of Rotation Stage [deg], both axes from 0 to 80.
Measurable Range
• Pitch: -80 ~ +80 [deg]
• Yaw: -35 ~ +35 [deg]
• Roll: -35 ~ +35 [deg]
Accuracy
• Approx. 2 [deg]
21
Estimation of Gaze Direction
(Top View)
22
Result of Gaze Measurement
• The subject looks at markers ① to ⑩, placed at intervals of 10cm
• The intersection of the 3D gaze vector and the board is displayed as a fixation point
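The fixation point described on this slide is a ray-plane intersection. A minimal sketch follows, assuming the board is given by a point on it and its normal vector; the function name `fixation_point` and its parameters are hypothetical.

```python
import numpy as np

def fixation_point(eye_pos, gaze_dir, plane_point, plane_normal):
    """Intersect the gaze ray with a plane; return None if there is no hit."""
    eye_pos = np.asarray(eye_pos, dtype=float)
    gaze_dir = np.asarray(gaze_dir, dtype=float)
    plane_point = np.asarray(plane_point, dtype=float)
    plane_normal = np.asarray(plane_normal, dtype=float)
    denom = gaze_dir @ plane_normal
    if abs(denom) < 1e-9:          # gaze is parallel to the board
        return None
    s = ((plane_point - eye_pos) @ plane_normal) / denom
    if s < 0:                      # board is behind the viewer
        return None
    return eye_pos + s * gaze_dir
```

For example, an eye at the origin looking along (0.1, 0, 1) at a board in the plane z = 1000 mm fixates the point 100 mm to the side of the board's origin.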
23
Result of Gaze Measurement
24
Accuracy: Approx. 5[deg]
Result of Gaze Measurement
25
Result of Face Measurement
26
Hardware Configuration
Selectable from the combinations below, depending on requirements and costs:
• Desktop PC
• Notebook PC
• NTSC Camera×2
• NTSC Camera×2 (Multiplexed)
• IEEE1394 Camera×2
• IEEE1394 High-Speed Camera (80Hz)
Figure labels: 30Hz, 1.5kg, US$2000; IEEE1394 High-Speed Stereo Camera, 80Hz
27
Field Multiplexing Device
• Multiplexes stereo video streams into a single video stream.
• Vertical resolution of each video signal becomes half.
• Can be used with any conventional image processing system.
Figure: device terminals SyncOut, Vin 1, Vin 2, GND, Vcc, MixOut
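On the software side, a field-multiplexed frame can be separated again by taking alternate scan lines, which is why each image has half the vertical resolution. The sketch below assumes that even rows carry the field from Vin 1 and odd rows the field from Vin 2; that assignment and the name `split_fields` are assumptions, not taken from the slides.

```python
import numpy as np

def split_fields(frame):
    """Split an interlaced, field-multiplexed frame into two half-height images."""
    first = frame[0::2]    # even scan lines (assumed: camera on Vin 1)
    second = frame[1::2]   # odd scan lines (assumed: camera on Vin 2)
    return first, second
```

A 480-line captured frame thus yields two 240-line images, one per camera.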
28
Attention Recognition
The object that a person is looking at is recognized by calculating θi for all objects.
Figure: Head Position (x, y, z), Gaze Vector (gx, gy, gz), and the angle θi between the gaze vector and the direction to each object (Obj 1, Obj 2, ..., Obj i).
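The angle test on this slide can be sketched as follows: θi is the angle between the gaze vector and the direction from the head position to object i, and the object with the smallest θi is taken as the attended one. The acceptance threshold `max_angle_deg` and the name `attended_object` are hypothetical additions for illustration.

```python
import numpy as np

def attended_object(head_pos, gaze_vec, objects, max_angle_deg=10.0):
    """Return (name, theta_deg) of the object closest to the gaze direction.

    objects maps a name to a 3D position. Returns (None, None) if no object
    lies within max_angle_deg of the gaze vector (threshold is an assumption).
    """
    head_pos = np.asarray(head_pos, dtype=float)
    g = np.asarray(gaze_vec, dtype=float)
    g = g / np.linalg.norm(g)
    best_name, best_angle = None, None
    for name, pos in objects.items():
        d = np.asarray(pos, dtype=float) - head_pos
        cos_theta = np.clip(g @ d / np.linalg.norm(d), -1.0, 1.0)
        theta = float(np.degrees(np.arccos(cos_theta)))
        if theta <= max_angle_deg and (best_angle is None or theta < best_angle):
            best_name, best_angle = name, theta
    return best_name, best_angle
```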
29
Attention Recognition
30
Attention Recognition
31
Spotting gesture recognition based on Continuous DP Matching using head motions (velocity, angular velocity)
Gesture Recognition
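Spotting with DP matching means that a gesture template is matched against the incoming motion stream without knowing where the gesture starts. The sketch below is a simplified start-point-free DP (subsequence DTW) written for illustration; it is not claimed to reproduce the exact Continuous DP recurrence used in the system, and the names `continuous_dp_scores` and `spot_gesture` are hypothetical. Each frame is a feature vector such as (velocity, angular velocity).

```python
import numpy as np

def continuous_dp_scores(template, stream):
    """Normalized cost of the best template match ending at each stream frame."""
    template = np.asarray(template, dtype=float)
    stream = np.asarray(stream, dtype=float)
    m, n = len(template), len(stream)
    # Local distance between every template frame i and stream frame j.
    d = np.linalg.norm(template[:, None, :] - stream[None, :, :], axis=2)
    D = np.empty((m, n))
    D[0, :] = d[0, :]                  # a match may start at any time
    for i in range(1, m):
        D[i, 0] = D[i - 1, 0] + d[i, 0]
        for j in range(1, n):
            D[i, j] = d[i, j] + min(D[i - 1, j], D[i - 1, j - 1], D[i, j - 1])
    return D[-1] / m

def spot_gesture(template, stream, threshold):
    """Indices of stream frames at which a gesture is judged to end."""
    scores = continuous_dp_scores(template, stream)
    return [j for j, s in enumerate(scores) if s <= threshold]
```

In an online setting only the previous column of `D` is needed, so the score can be updated frame by frame as head motion is measured.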
32
Gesture Recognition
33
Gesture Recognition
34
Specs of Developed System
Accuracy
• Head pos: 2mm, dir: 2deg
• Gaze dir: 5deg
Processing Speed
• 30Hz ~ 80Hz (depending on camera)
35
Software Configuration
36
Software Configuration
Commercial Face Recognition Software
37
Software Configuration
Video Capturing Library
38
Software Configuration
Application Software
39
Potential Application Areas
Since this system is non-contact, passive, and inexpensive, it can be applied to many application areas where conventional systems cannot be used.
• Human Interfaces
  • Computer Interface (e.g. Hands-free mouse)
  • Robot Interface (e.g. Eye-contact communication)
  • Safety System (e.g. Driver support)
  • Assistive Products for the disabled
• Human Modeling
  • Cognitive Science (Experiment on visual cognition)
  • Ergonomics (e.g. Human-friendly design)
40
2. Application to Human Interfaces
41
2.1 Computer Interfaces
42
Direct Usage of Head Movements: Hands-Free Mouse
43
How to Use Eye Movements for Computer Interfaces?
• Eye movement [Glenstrup 95]
  – Convergence
  – Rolling
  – Saccades
  – Pursuit motion
  – Nystagmus
  – Drift and micro-saccades
  – Physiological nystagmus
• Need to refine raw data:
  – distinguish Fixations from Saccades
44
How to Use Eye Movements for Computer Interfaces?
Saccade/Fixation detection
• Velocity-Threshold
  – Saccades > 300 deg/sec.
  – Fixations < 100 deg/sec.
  – Usual threshold 200 deg/sec.
• Dispersion-Threshold
  – $D = (\max x - \min x) + (\max y - \min y)$
  – Threshold set such that the visual angle is between 0.5 and 1 deg.
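Both detection rules on this slide are simple to state in code. The following minimal sketch assumes gaze samples given as (x, y) visual angles in degrees at a fixed sampling interval `dt`; the function names and the default threshold of 200 deg/sec (the "usual threshold" from the slide) are illustrative choices.

```python
import numpy as np

def is_saccade(angles_deg, dt, threshold=200.0):
    """Velocity-threshold rule: True where the inter-sample speed exceeds threshold.

    angles_deg has shape (n, 2); the result has n - 1 entries, one per interval.
    """
    angles_deg = np.asarray(angles_deg, dtype=float)
    speed = np.linalg.norm(np.diff(angles_deg, axis=0), axis=1) / dt
    return speed > threshold

def dispersion(window_deg):
    """Dispersion D = (max x - min x) + (max y - min y) of a window of samples."""
    window_deg = np.asarray(window_deg, dtype=float)
    x, y = window_deg[:, 0], window_deg[:, 1]
    return float((x.max() - x.min()) + (y.max() - y.min()))
```

With the dispersion rule, a window of samples is treated as a fixation when `dispersion(window)` stays below the chosen threshold of roughly 0.5 to 1 deg.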
45
How to Use Eye Movements for Computer Interfaces?
• Command-based Interface
  – Obvious application: selection of objects (menu selection, window scrolling, …), pointing.
  – Midas Touch problem: the eyes are not a control device.
  – Use dwell time to trigger a selection.
• Non-Command Interfaces
  – The computer monitors the user’s actions instead of waiting for the user’s command.
  – Potential applications: user support, e.g. detect difficulties and provide translation support for difficult words.
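Dwell-time selection avoids the Midas Touch problem by triggering only when the gaze stays on the same target long enough. A minimal sketch follows, assuming time-stamped samples that already name the target under the gaze (or `None`); the 0.5 sec default and the name `dwell_select` are hypothetical.

```python
def dwell_select(samples, dwell_time=0.5):
    """Return [(timestamp, target)] selections triggered by dwelling on a target.

    samples is a sequence of (timestamp_sec, target) pairs, where target is the
    object under the gaze or None. Each uninterrupted dwell fires at most once.
    """
    selections = []
    current, start, fired = None, None, False
    for ts, target in samples:
        if target != current:
            # Gaze moved to a different target: restart the dwell timer.
            current, start, fired = target, ts, False
        elif target is not None and not fired and ts - start >= dwell_time:
            selections.append((ts, target))
            fired = True
    return selections
```

A brief glance at a menu item therefore does nothing, while resting the gaze on it for the dwell time selects it once.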
46
Pro-Active Dictionary: How to Detect User Difficulties?
Gaze Pattern in
Normal Reading
Gaze Pattern when
Difficulties Encountered
47
Pro-Active Dictionary: Implementation
Figure: block diagram with the components Gaze Tracking Module, Data buffering & Feedback, Assistance Module, IE Web Browser and BHO, spanning Windows and Linux; the inputs are User Action and Gaze.
48
Pro-Active Dictionary: Demonstration
49
2.2 Robot Interfaces
50
Intelligent Wheelchair
PC
CCD cameras
Ultrasonic sensors
Laser Scanner
Notebook PC
CCD cameras
51
Intelligent Wheelchair
How is facial information used?
• Gesture
  – Nodding -> To start
  – Shaking -> To stop
• Face Direction
  – To determine the direction to move
• Gaze Direction
  – To determine if the user is concentrating
52
Intelligent Wheelchair
53
Intelligent Wheelchair
Various lighting conditions
54
Intelligent Wheelchair
55
Intelligent Wheelchair
Sensor-based collision avoidance and wall following
56
Result of Experiment (Trajectory)
Figure: wheelchair trajectory from start to goal (scale bar: 1m)
57
Result (Input and Output Values)
Figure: Face Direction, Steering Angle and Velocity vs. Time [sec] (0 to 50); vertical axis [deg] from -80 to 80 (Left / Right). Annotated phases: Turn left, Stop, Follow wall, Turn right.
58
Intelligent Wheelchair
How to detect concentration of the user?
Figure: Face Direction (Horizontal) and Gaze Direction (Horizontal) [deg] (-30 to 30) vs. Time [sec] (0 to 24). Annotated intervals: Looking Aside, Careful Operation.
59
Intelligent Wheelchair
Poster
Estimation of User’s Attention
60
Intelligent Wheelchair
Estimation of User’s Attention
Posters
Gaze direction
Robot
61
Intelligent Wheelchair
Estimation of User’s Attention
• Paying attention to the poster: gaze is concentrated on the position of the poster.
• Not paying attention to the poster: gaze is distributed over various positions.
62
Intelligent Wheelchair
Estimation of User’s Attention
63
Receptionist Robot ASKA
64
Receptionist Robot ASKA
Platform for experiments in the real world
Components of the receptionist robot:
• Speech Recognition and Voice Synthesis
• Human and Face Recognition
• NLP Dialog Model
• Gesture Generation and Autonomous Mobility
• Information Retrieval from Databases
65
ASKA: System Overview
PC for gesture
Camera
Microphone
Speaker
PC for recognition and speech
PC for vision
PC for server
66
Hardware Configuration
• Telecamera: measurement of facial information
• Wide-range camera: human detection and tracking
• Base: tmsuk04 (tmsuk Co., LTD.)
• Degrees of freedom: Chest 1 [DOF], Arms 14 [DOF], Hands 6 [DOF]
67
Software Configuration
Figure: modules Person Finder, Speech Recognizer, Talker, Body Gesture Controller and Head Motion Controller, connected to a Server (Data).
68
Receptionist Robot ASKA
69
Problem to Solve with Vision
Problem
Wrong responses to:
• Background noises
• Utterances which are not spoken to ASKA
Purpose of This Research:
• To recognize the utterance period
• To detect whether the user speaks to ASKA or not
For natural human-robot interaction.
70
Software Configuration
Figure: modules Person Finder, Speech Recognizer, Talker, Body Gesture Controller and Head Motion Controller, connected to a Server (Data).
1. Human detection and tracking
2. Detection of the sign of utterance
71
Pilot Study
Dialog Experiment
• Measured information
  • Head orientation
  • Gaze
  • Lip motion
  • Blink
• Subjects
  • 10
• Speech sentences
  • Fixed sentence: 5
  • Free sentence: 5
72
Information at Utterance
Figure (Subject A): time series of head orientation (pitch and yaw angle), gaze (pitch and yaw angle), lip distance (width and height), blink, and intensity, with the utterance period marked.
73
Sign of Utterance: Start
Figure (Subject B): head orientation and gaze (pitch and yaw) [deg] vs. time [sec], with the utterance period marked.
The face and gaze are directed to the object before the utterance.
74
Detection of Start of Utterance
Sign of Utterance: Start
Face and gaze vectors are within the threshold area => the user has an intention of utterance.
Allowable area, from the center of ASKA’s face:
• Gaze vector area: 300 [mm]
• Face vector area: 400 [mm]
Figure labels: Face direction area, Gaze area
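The threshold test on this slide can be sketched as two ray-plane intersections. The sketch below assumes that the 300 mm and 400 mm values are radii around the center of ASKA's face, measured in a plane through that center; this interpretation, the plane normal parameter, and the names `hit_distance` and `intends_to_speak` are assumptions for illustration.

```python
import numpy as np

GAZE_RADIUS_MM = 300.0   # allowable area for the gaze vector (from the slide)
FACE_RADIUS_MM = 400.0   # allowable area for the face vector (from the slide)

def hit_distance(origin, direction, center, normal):
    """Distance [mm] from center to where the ray meets the plane, or inf."""
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    center = np.asarray(center, dtype=float)
    normal = np.asarray(normal, dtype=float)
    denom = direction @ normal
    if abs(denom) < 1e-9:
        return float("inf")          # ray is parallel to the plane
    s = ((center - origin) @ normal) / denom
    if s <= 0:
        return float("inf")          # plane is behind the user
    return float(np.linalg.norm(origin + s * direction - center))

def intends_to_speak(head_pos, face_vec, gaze_vec, aska_center, aska_normal):
    """True if both face and gaze vectors fall inside their allowable areas."""
    gaze_ok = hit_distance(head_pos, gaze_vec, aska_center, aska_normal) <= GAZE_RADIUS_MM
    face_ok = hit_distance(head_pos, face_vec, aska_center, aska_normal) <= FACE_RADIUS_MM
    return bool(gaze_ok and face_ok)
```

Using a looser limit for the face vector than for the gaze vector matches the slide's thresholds: the head only needs to be turned roughly toward the robot, while the eyes must be on it.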
75
Experiment : Detection of Utterance Sign
76
Demonstration : Interaction Based on Utterance Sign
77
3. Application to Human Modeling
78
Measurement of Infants
79
Measurement of Infants
80
Measurement of Infants
Measured Results
Figure: Face Direction (Horizontal) vs. Time [sec]; legend: Missing Data, Measured Data.
81
Measurement of TV Game Players
82
Measurement of TV Game Players
Frequency of Blinking
Figure: blinking frequency over time [s] in four panels: 1st Play (Level: Hard), 2nd Play (Level: Hard), 3rd Play (Level: Hard), 4th Play (Level: Easy).
This result coincides with psychological reports.
83
System overview
Measurement of Drivers
84
Software Configuration
Measurement of Drivers
85
Measurement of Drivers
86
Measurement of Drivers
87
Experiment at night
Measurement of Drivers
88
Measurement of Drivers
Horizontal head movements at curves
Figure: two panels, Ave. Speed: 20 [km/h] and Ave. Speed: 40 [km/h]; axis labels: Horizontal Head Position [mm], Yaw Rate of Vehicle [deg/sec], Time [min].
Not passive head movements due to centrifugal force, but active head movements to cope with centrifugal force.
89
Measurement of Drivers
Vertical head position in long driving
Figure: Vertical Head Position [mm] (0 to 100) vs. Time [min] for Subject A (0 to 80), Subject B (0 to 60) and Subject C (0 to 60).
• Body posture changed after long driving?
• Body lowered due to the softness of the seat?
90
Measured fixation point and yaw rate of vehicle
Measurement of Drivers
91
Attention Recognition
Measurement of Drivers
92
Attention Recognition
Measurement of Drivers
93
Where are you looking when lane changing?
Measurement of Drivers
Time[s]
94
Measurement of Patients in PET
To compensate head motion during measurement
95
Summary
Face Measurement System
• Accuracy: Head pos: 2mm, dir: 2deg; Gaze dir: 5deg
• Processing Speed: 30Hz ~ 80Hz (depending on camera & CPU)
Applications
• Human Interfaces: Robot Interaction, Computer Interface
• Human Modeling: Psychology / Ergonomics, Cognitive Science
96
Fin