Tiny People Posevgg/publications/2018/... · 2018-12-04 · Tiny People dataset • We introduce a...

Preview:

Citation preview

• At the end of each trial, the correct category was revealed and the subjects recorded the accuracy of their category guess.

Tiny People dataset

• We introduce a new Tiny People dataset, which contains 200 real scene images

capturing people at distance (the dataset is used for testing only)

Tiny People Pose

Lukáš Neumann and Andrea Vedaldi

Introduction

1. Andriluka, M., Pishchulin, L., Gehler, P., Bernt, S.: 2d human pose estimation: New benchmark and state of the art analysis, CVPR 2014

2. Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation, ECCV 2016

3. Cao, Z., Simon, T., Wei, S.E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields, CVPR 2017

4. Hu, P., Ramanan, D.: Finding tiny faces, CVPR 2017

Visual Geometry Group, Department of Engineering Science, University of Oxford

• Goal: recognizing pose from tiny images of people, down to 24px high

• Tiny People are important in surveillance, autonomous driving, etc.

• Challenge: low resolution data is extremely ambiguous

• Approach

• Models data uncertainty via probability distributions

• A CNN that outputs a distribution over possible body joints from a small

image of a person

• A probabilistic variant of keypoint localization using dense heat maps

• Evaluation

• Downsampled standard benchmarks (MPII Human Pose and MS-COCO)

• New Tiny People dataset of “real” low-resolution people

lukas@robots.ox.ac.uk

• Our method outperforms both standard models

References

This research was supported by

Probabilistic Formulation

• Our model emits a continuous Gaussian distribution for each keypoint u by

estimating Gaussian Mixture Model parameters over a coarse 16 × 16 feature

map Ωd generated over the whole image

• Output: a continuous distribution over possible body joint configurations:

• Training: minimum neg. log-likelihood of the ground truth keypoint locations

MPII Human Pose dataset

• Evaluation on the standard MPII Human Pose dataset1, where we reduced the

resolution to approximate people seen at a distance

• Standard-height (128px): comparable to the state-of-the-art Stacked Hourglass2

and better than Part Affinity Fields3

• Small people (<64px): our method performs significantly better

detection accuracy model surpriseregression error

person height histogram

of human pose datasets

• We also train the Tiny Faces detector4 for person detection and use it as an

input for our method

• We have shown modelling uncertainty explicitly in a deep network can significantly boosts accuracy for ambiguous data, such as low-resolution pose

recognition

Standard Formulation

• The standard approach2 for landmark detection is to output one heatmap per

joint

• Heatmaps are fitted to the ground truth data by a L2 per-pixel regression of a

heuristic Gaussian-like kernel around the ground truth landmark location

• Limitations:

• Heatmaps only convey information about the

location – they may appear to encode uncertainty, but they do not

• The accuracy is bound by the heatmap

resolution

• The model allows for modelling data uncertainty (by increasing the variance of

corresponding Gaussians), as well as for sub-pixel accuracy

Recommended