Bangpeng Yao Li Fei-Fei
Computer Science Department, Stanford University, USA
Modeling Mutual Context of Object and Human Pose
in Human-Object Interaction Activities
Introduction
Modeling mutual context of object and pose
Model learning
Model inference, object detection, and human pose estimation
Experiments
Conclusion
Outline
Human pose estimation & Object detection
Introduction
[Figure: example image annotated with body parts (right arm, left arm, torso, right leg, left leg) and the object (tennis racket)]
Challenging:
Introduction
Mutual context: human pose estimation and object detection can facilitate the recognition of each other.
Introduction
Mutual context vs. no mutual context
Introduction
HOI activity
A: activity class, e.g., tennis serve, volleyball smash
O: object, e.g., tennis racket, volleyball
H: human pose
P: body parts
f: visual features
Each activity class A has more than one type of human pose H.
HOI activity
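$\Psi = \sum_{e \in E} w_e \psi_e$
$e \in E$: an edge of the model
$\psi_e$: the potential function of edge $e$
$w_e$: the weight of edge $e$
$\psi_e(A, O)$, $\psi_e(A, H)$, $\psi_e(O, H)$: frequencies of co-occurrence between A, O, and H
$\psi_e(O, P_n)$, $\psi_e(H, P_n)$, $\psi_e(P_m, P_n)$: spatial relationships among the object and body parts, computed from their relative (position, orientation, scale)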
The model
The potentials $\psi_e(O, f_O)$ and $\psi_e(P_n, f_{P_n})$ model the dependence of the object and of each body part on their corresponding image evidence.
The model
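As a rough illustration of how the potentials combine, the sketch below scores one hypothesis as a weighted sum of edge potentials, with disconnected edges contributing 0. The edge names, potential values, and weights are made-up placeholders for illustration, not values from the talk.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]  # e.g. ("A", "O") or ("O", "P1"); naming is hypothetical


def model_score(potentials: Dict[Edge, float],
                weights: Dict[Edge, float]) -> float:
    """Weighted sum of potential values over the edges present in the model.

    Edges missing from `weights` are treated as disconnected (contribute 0).
    """
    return sum(weights.get(edge, 0.0) * value
               for edge, value in potentials.items())


# Hypothetical potential values for one image and one (A, O, H) hypothesis.
example_potentials = {
    ("A", "O"): 0.8,   # co-occurrence of activity and object
    ("A", "H"): 0.5,   # co-occurrence of activity and pose
    ("O", "P1"): 0.3,  # spatial relation between object and one body part
    ("O", "fO"): 0.9,  # object versus its image evidence
}
# ("O", "P1") has no weight here, so it behaves like a disconnected edge.
example_weights = {("A", "O"): 1.0, ("A", "H"): 2.0, ("O", "fO"): 0.5}

score = model_score(example_potentials, example_weights)
```

Comparing this score across candidate activity classes and poses is one simple way to pick the best-scoring hypothesis for an image.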
Co-occurrence context for the activity class, object, and human pose
Multiple types of human pose for each activity
Spatial context between object and body parts
Properties of the model
The learning step needs to achieve two goals: structure learning and parameter estimation.
Structure learning: discover the hidden human poses and the connectivity among the object, human pose, and body parts.
Parameter estimation: learn the potential weights that maximize the discrimination between different activities.
Model learning
Objective: learn the connectivity pattern between the object, the human pose, and the body parts.
Method: a hill-climbing approach with a tabu list.
Structure learning
The hill-climbing approach adds or removes edges one at a time until a maximum is reached.
Hill-climbing structure learning
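The search can be sketched as follows, assuming a generic scoring function over edge sets. The scoring function, the tabu-list size, and the fixed step budget are assumptions made for illustration; the talk does not specify them. Each step toggles exactly one edge (add or remove) and skips structures visited recently.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Edge = Tuple[str, str]
Structure = FrozenSet[Edge]


def hill_climb(candidate_edges: Iterable[Edge],
               score: Callable[[Structure], float],
               start: Structure = frozenset(),
               tabu_size: int = 10,
               max_steps: int = 100) -> Structure:
    """Greedy add/remove edge search that avoids recently visited structures."""
    candidates = list(candidate_edges)
    current, best = start, start
    tabu = [start]  # most recently visited structures
    for _ in range(max_steps):
        # Each neighbor differs from the current structure by exactly one edge.
        neighbors = [current ^ {edge} for edge in candidates]
        allowed = [s for s in neighbors if s not in tabu]
        if not allowed:
            break  # every neighbor was visited recently
        current = max(allowed, key=score)
        tabu = (tabu + [current])[-tabu_size:]
        if score(current) > score(best):
            best = current
    return best


# Toy score (hypothetical): reward edges in a target set, penalize the rest.
target = frozenset({("O", "H"), ("H", "P1")})
all_edges = [("O", "H"), ("H", "P1"), ("O", "P1")]


def toy_score(structure: Structure) -> float:
    return len(structure & target) - len(structure - target)


learned = hill_climb(all_edges, toy_score)
```

The tabu list lets the search step through equal or worse structures without immediately undoing its last move, which is why the sketch returns the best structure seen rather than the last one visited.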
[Figure label: Human pose]
Objective: obtain a set of potential weights that maximizes the discrimination between different classes of activities.
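Training samples: $(x_i, c_i, y_i)$
$x_i$: the potential function values, with disconnected edges set to 0
$c_i$: the human pose H (sub-class label)
$y_i$: the class label A
If a sample has sub-class $c_i$, then $y(c_i) = y_i$
$w_r$: the weight vector for the r-th sub-class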
Max-margin parameter estimation
$\|\cdot\|_2$: the L2 norm
$\beta$: a normalization constant
Multiclass SVM
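To make the sub-class idea concrete, here is a minimal sketch of a multiclass hinge-style objective in which each weight vector belongs to one pose sub-class and margins are enforced only against sub-classes of a different activity. The exact objective shown in the talk is not recoverable from the slide text, so the form below (an L2 regularizer plus a slack penalty scaled by `beta`) is an assumption, and all inputs are toy values.

```python
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def subclass_hinge_objective(W, X, c, subclass_to_class, beta=1.0):
    """Regularized hinge objective over pose sub-classes (assumed form).

    W: one weight vector per sub-class r.
    X: one vector of potential-function values per training sample.
    c: sub-class (pose) index of each sample.
    subclass_to_class: activity class that each sub-class belongs to.
    """
    regularizer = 0.5 * sum(dot(w, w) for w in W)
    total_slack = 0.0
    for x, ci in zip(X, c):
        own = dot(W[ci], x)
        # Only sub-classes of a different activity compete with the true one.
        rivals = [r for r in range(len(W))
                  if subclass_to_class[r] != subclass_to_class[ci]]
        worst = max((1.0 - (own - dot(W[r], x)) for r in rivals), default=0.0)
        total_slack += max(0.0, worst)  # slack is zero when the margin holds
    return regularizer + beta * total_slack


# Toy data: two sub-classes, each belonging to a different activity.
W_toy = [[1.0, 0.0], [0.0, 1.0]]
X_toy = [[2.0, 0.0], [0.0, 2.0]]
c_toy = [0, 1]
objective = subclass_hinge_objective(W_toy, X_toy, c_toy, [0, 1])
```

Because sub-classes of the same activity are never treated as rivals, a sample is not penalized for scoring highly under another pose of its own activity.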
Using only one human pose per HOI class is not enough to characterize all the images in that class well.
Analysis of our learning algorithm
Given a new test image, our objectives are:
- estimate the pose of the human
- detect the object that is interacting with the human
Model inference, object detection, and human pose estimation
Cricket - defensive shot (player and cricket bat)
Cricket - bowling (player and cricket ball)
Croquet - shot (player and croquet mallet)
Tennis - forehand (player and tennis racket)
Tennis - serve (player and tennis racket)
Volleyball - smash (player and volleyball)
30 images for training, 20 for testing
The sports dataset
Better object detection
[Figure: object detection results compared across three methods: sliding window detector, pedestrian as context, and our method]
Better object detection
Pose estimation is still difficult.
Multiple poses are better than only one pose.
Better pose estimation
Upper: our method
Lower left: object detection by a scanning window
Lower right: pose estimation by the state-of-the-art pictorial structure method
Note: Gupta et al. predominantly use the background scene context.
Combining object and pose for HOI activity classification
Treat object and human pose as the context of each other in different HOI activity classes
Structure learning method: discovers important connectivity patterns between objects and human poses.
Further improvements:
- incorporate useful background scene context to facilitate the recognition of foreground objects and activities
- deal with more than one object
Conclusion
Thanks!!!