Towards a Reactive Virtual Trainer
Zsófia Ruttkay, Job Zwiers, Herwin van Welbergen, Dennis Reidsma
HMI, Dept. of CS, University of Twente Amsterdam, The Netherlands
zsofi@cs.utwente.nl
page 2
Overview
RVT usage
Related work
RVT technological challenges
– Architecture
– Integration of reactive and proactive actions
– Multi-modal sync
A close look at clapping - demos
page 3
RVT usage
RVT = IVA (intelligent virtual agent) with the expert and psychological knowledge of a real physiotherapist, to be used e.g. to:
– prevent RSI for computer workers
– preserve/restore weight and physical condition as (personal) trainer
– act as physiotherapist to cure illnesses affecting motion
RVT is a medium and an empathic consultant
Relevance for society
– ageing population, unhealthy lifestyle
– human experts: few in number, expensive, available only at certain locations
RVT usage context
– PC + 1-2 cameras in a normal setting (homes, offices)
– ‘instructed’ by authorized person (may be the user, as well as developer)
– can be adapted/extended
page 4
Related work
System | Trainer calibration | Medium/consultant | Input | Feedback | Motion demo/correction | Exercise revision | Authoring
J. Davis, A. Bobick: Virtual PAT, MIT, 1998. | 1 | movie split | 2 cam. | assessment | - | - | script
S-P. Chao et al.: Tai Chi synthesizer, 2004. | 1 | m | - | - | - | - | nl script
W. IJsselsteijn et al. (Philips): Fun and Sports: Enhancing the Home Fitness Experience, Proc. of ICEC 2004. | 1 | c | heart-rate | assessment | - | - | ?
Sony’s EyeToy: Kinetic ‘game’, 2005 | 2 | m/c | 1 cam. | general, well-placed | d | - | by user from pre-set choice/types
T. Bickmore: Laura & FitTrack | 1 | c | data to be typed | assessment | - | - | closed?
page 5
Own related work – Virtual Rap Dancer
page 6
Own related work – Virtual Conductor
page 7
RVT technological challenges
Vision-based perception, may be extended with biosignals
Reactive on exercise performance, physical state, overall performance
Smalltalk, exercise correction, plan revision
RVT body and motion parameters adaptable/calibrated
Authoring by human
Extensible by expert (new exercises)
Motion with music, speech or clapping (also as input for tempo)
Playground for multi-modal output generation
“Exercise motion intelligence”: timing, concatenation, idle poses, …
page 8
RVT architecture
[Architecture diagram; component labels:]
– Calibration of user
– Multi-sensor integration
– Authoring scenario
– Exercise scenario revision
– Optical motion tracking
– Motion interpretation
– Motion specification
– Biosensing module(s)
– Acoustic beat tracking
– VT
– Monitoring the user
– Multi-modal feedback
– Motion demonstration
– Presentation of feedback of VT
– Planning action of VT
– Human expert
– User
– Interfaces
page 9
Multi-modal sync
Exercises are executed using several modalities
– Body movement
– Speech
– Music
– Sound (clap, foot tap)
Challenges
– Synchronization
– Monitoring user => real time (re)planning
• Exaggeration to point out details
• Speed up / slow down
• Feedback/correction
• …
page 10
Synchronization: related work
Classic approach in speech/gesture synchronization:
– Speech leads, gesture follows
MURML (Kopp et al.)
– No leading modality
– Planning in sequential chunks containing one piece of speech and one aligned gesture
– Co-articulation at the border of chunks
BML (Kopp, Krenn, Marsella, Marshall, Pelachaud, Pirker, Thórisson, Vilhjalmsson)
– No leading modality
– Synchronized alignment points in behavior phases
– For now, aimed mainly at speech/gesture synchronization
– In development
page 11
Synchronization: own previous work
Virtual Dancer
• Synchronization between music (beats) and dance animation
• Dance move selection by user interaction
Virtual Presenter
• Synchronization between speech, gesture, posture and sheet display
• Leading modality can change over time
GESTYLE markup language with par/seq and wait constructs
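The timing semantics of par/seq and wait constructs can be illustrated with a minimal sketch. This is not GESTYLE syntax and not the authors' implementation: the class names `Act`, `Wait`, `Seq` and `Par` and the `schedule` function are hypothetical, and the only assumption is that `seq` runs its children one after another while `par` starts them all at the same time.

```python
from dataclasses import dataclass


@dataclass
class Act:
    """A single behavior (speech, gesture, sound) with a fixed duration in seconds."""
    name: str
    duration: float


@dataclass
class Wait:
    """A pause of `duration` seconds that produces no event."""
    duration: float


@dataclass
class Seq:
    """Children run one after another."""
    children: list


@dataclass
class Par:
    """Children all start at the same time."""
    children: list


def schedule(node, start=0.0):
    """Return (events, end_time); each event is (name, start, end) in seconds."""
    if isinstance(node, Act):
        end = start + node.duration
        return [(node.name, start, end)], end
    if isinstance(node, Wait):
        return [], start + node.duration
    if isinstance(node, Seq):
        events, t = [], start
        for child in node.children:
            child_events, t = schedule(child, t)
            events += child_events
        return events, t
    if isinstance(node, Par):
        events, end = [], start
        for child in node.children:
            child_events, child_end = schedule(child, start)
            events += child_events
            end = max(end, child_end)
        return events, end
    raise TypeError(f"unknown node type: {type(node).__name__}")


# Hypothetical example: say "one" while the clap starts 0.1 s later.
plan = Par([Act("speech: one", 0.3), Seq([Wait(0.1), Act("clap", 0.2)])])
events, end = schedule(plan)
for name, t0, t1 in events:
    print(f"{name}: {t0:.2f}s to {t1:.2f}s")
```

A fixed-duration tree like this has one leading timeline; it does not capture a leading modality that changes over time.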
page 12
Close look at clapping
stroke (hold) retraction (hold)
page 13
Clapping Exercise
page 14
Close look at clapping
Start with a simple clap exercise and see what we run into
The clap exercise:
– Clap for the tempo of the beat of a metronome (later: of music)
– When the palms touch, a clap sound is heard
– Count while clapping, using speech synthesis
• Possible alignment at: word start/end, phonological peak start/center/end
• For now, we pick the center of the phonological peak, but we do generate the other alignment points for easy adaptation
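As a rough illustration of aligning the counting speech to the beat, the sketch below computes when each counted word has to start so that a chosen alignment point lands on the metronome beat. The word timings, the `WordTiming` class and all function names are made up for the example; a real speech synthesizer would supply the actual timing data.

```python
from dataclasses import dataclass


@dataclass
class WordTiming:
    """Hypothetical timing of one synthesized word; offsets in seconds from word start."""
    text: str
    duration: float
    peak_start: float
    peak_end: float

    def alignment_points(self):
        """All candidate alignment points, so the choice can be changed later."""
        return {
            "word_start": 0.0,
            "peak_start": self.peak_start,
            "peak_center": (self.peak_start + self.peak_end) / 2,
            "peak_end": self.peak_end,
            "word_end": self.duration,
        }


def beat_times(bpm, n_beats, first_beat=0.0):
    """Times of the metronome beats in seconds."""
    period = 60.0 / bpm
    return [first_beat + i * period for i in range(n_beats)]


def schedule_counting(words, bpm, first_beat=0.0, align="peak_center"):
    """Return (text, speech_start, beat_time) so that the chosen point hits the beat."""
    plan = []
    for word, beat in zip(words, beat_times(bpm, len(words), first_beat)):
        offset = word.alignment_points()[align]
        plan.append((word.text, beat - offset, beat))
    return plan


# Made-up timings for two counted words.
words = [WordTiming("one", 0.30, 0.10, 0.20), WordTiming("two", 0.28, 0.08, 0.18)]
for text, start, beat in schedule_counting(words, bpm=60, first_beat=1.0):
    print(f"{text}: start speech at {start:.2f}s, peak center on the beat at {beat:.2f}s")
```

Passing `align="word_start"` or another key switches the alignment point without touching the rest of the plan, which is the reason for generating all points up front.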
page 15
Two examples for multi-modal sync
Specification in BMLT
Planning in real-time – under/overspecification!
page 16
What if we speed up the tempo?
The clapping animation should be faster
Possibilities:
– Lower amplitude?
– Linear speedup?
– Speedup of stroke?
– Speedup of retraction?
– A combination of above?
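The options above can be compared with a small sketch that fits one clap cycle (stroke plus retraction) into a shorter beat period. The function names, the strategy labels and the constant-average-speed assumption behind the amplitude rule are illustrative choices, not the planner used in the RVT.

```python
def speed_up_clap(stroke, retraction, period, strategy="linear"):
    """Return (stroke, retraction) durations in seconds that fill exactly `period`."""
    total = stroke + retraction
    if period > total:
        raise ValueError("period is longer than the clap cycle; this is a slow-down")
    if strategy == "linear":
        # Scale both phases by the same factor.
        scale = period / total
        return stroke * scale, retraction * scale
    if strategy == "stroke":
        # Keep the retraction as it is and shorten only the stroke.
        new_stroke = period - retraction
        if new_stroke <= 0:
            raise ValueError("retraction alone does not fit into the period")
        return new_stroke, retraction
    if strategy == "retraction":
        # Keep the stroke as it is and shorten only the retraction.
        new_retraction = period - stroke
        if new_retraction <= 0:
            raise ValueError("stroke alone does not fit into the period")
        return stroke, new_retraction
    raise ValueError(f"unknown strategy: {strategy}")


def amplitude_for_constant_speed(amplitude, old_duration, new_duration):
    """Lower the amplitude so the average hand speed stays the same (an assumption)."""
    return amplitude * new_duration / old_duration


# Made-up base clap: 0.2 s stroke, 0.4 s retraction; metronome at 120 bpm (0.5 s period).
for strategy in ("linear", "stroke", "retraction"):
    s, r = speed_up_clap(0.2, 0.4, period=0.5, strategy=strategy)
    print(f"{strategy}: stroke {s:.3f}s, retraction {r:.3f}s")
print("amplitude:", amplitude_for_constant_speed(0.30, old_duration=0.6, new_duration=0.5))
```

A combination of strategies could be expressed by applying the amplitude rule first and a timing strategy afterwards.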
page 17
What if we slow down the metronome?
Slower clapping? (movies here)
– Linear slowdown?
– Slowdown of stroke?
– Slowdown of retraction?
– Hold at end of retraction (hands open)?
– Hold after stroke (clap)?
– A combination of above?
Back to idle position?
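For the slow-down case, a sketch of how the spare time in a longer beat period could be distributed is given below. The strategy names and the `max_hold` threshold for going back to the idle position are hypothetical parameters introduced for illustration only.

```python
def slow_down_clap(stroke, retraction, period, strategy="hold_open", max_hold=1.0):
    """Return one clap cycle as a list of (phase, duration) pairs filling `period` seconds."""
    total = stroke + retraction
    slack = period - total
    if slack < 0:
        raise ValueError("period is shorter than the clap cycle; this is a speed-up")
    if strategy == "linear":
        # Stretch both phases by the same factor, no hold.
        scale = period / total
        return [("stroke", stroke * scale), ("retraction", retraction * scale)]
    if strategy == "hold_closed":
        # Hold after the stroke, with the palms touching.
        return [("stroke", stroke), ("hold", slack), ("retraction", retraction)]
    if strategy == "hold_open":
        # Hold at the end of the retraction, with the hands open.
        # If the gap is too long, go back to the idle position instead.
        filler = "idle" if slack > max_hold else "hold"
        return [("stroke", stroke), ("retraction", retraction), (filler, slack)]
    raise ValueError(f"unknown strategy: {strategy}")


# Made-up base clap: 0.2 s stroke, 0.4 s retraction; metronome at 40 bpm (1.5 s period).
for strategy in ("linear", "hold_closed", "hold_open"):
    print(strategy, slow_down_clap(0.2, 0.4, period=1.5, strategy=strategy))
# A very slow beat (2.0 s period) exceeds max_hold, so the hands return to idle.
print(slow_down_clap(0.2, 0.4, period=2.0))
```

Each returned cycle describes phase durations only; a scheduler would still have to shift the cycle so that the end of the stroke lands on the beat.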
page 18
Open issues on planning
What do real humans do?
Do the semantics of a motion (clap) change if we change its amplitude or velocity profile?
– E.g. emotions, individual features
Smooth tempo changes
Automatic concatenation and inserted idle poses
Appropriate high-level parameters
– Related (e.g. amplitude/speed)?
– Different from the parameters for communicative gestures (e.g. by Pelachaud)?
Amplitude and motion path specification
Is our synchronization system capable of re-planning in real time?