Proceedings of the IEEE 2010 Antonio Torralba, MIT Jenny Yuen, MIT Bryan C. Russell, MIT

LabelMe: Online Image Annotation and Applications

Outline

Introduction

Web Annotation and Data Statistics
-A. Data Set Evolution and Distribution of Objects
-B. Study of Online Labelers

The Space of LabelMe Images
-A. Distribution of Scene Types
-B. The Space of Images
-C. Recognition by Scene Alignment

Beyond 2-D Images
-A. From Annotations to 3-D
-B. Video Annotation

Conclusion

Introduction
From small data sets to large data sets
In 2005, the online tool LabelMe was created
LabelMe provides functionality for drawing polygons to outline the spatial extent of objects in images

Web Annotation and Data Statistics
-A. Data Set Evolution and Distribution of Objects
-B. Study of Online Labelers

The Features of the LabelMe Database
-Object class recognition
-Learning about objects embedded in a scene
-High-quality labeling
-Many diverse object classes
-Many diverse images
-Many noncopyrighted images
-Open and dynamic

Data Set Evolution and Distribution of Objects (1/2)

(a) Number of annotated objects
(b) Number of images with at least one annotated object
(c) Number of unique object descriptions

Data Set Evolution and Distribution of Objects (2/2)

The observation suggests two learning problems:
1) Learning from few training samples (N -> 1)
2) Learning with millions of samples (N -> ∞)

Study of Online Labelers
From July 7, 2008 to March 19, 2009

(a) Number of new annotations provided by individual users
(b) Distribution of the length of time it takes to label an object

The Space of LabelMe Images
-A. Distribution of Scene Types
-B. The Space of Images
-C. Recognition by Scene Alignment

Distribution of Scene Types (1/1)
Let's start from cognitive psychology
Next, we study how many configurations of 4 objects are present
The distribution follows a power law (n = 1, 2, 4, 8)

The Space of Images (1/3)
Define "Semantic Distance":

1) Assign each pixel to a single object category
2) Divide the image into N × N nonoverlapping windows and build a histogram for each window
3) Use spatial pyramid matching over object labels
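The three steps above can be sketched in a few lines of Python. This is a simplified illustration, not the implementation behind LabelMe: it assumes each image is already given as a 2-D grid of object labels (step 1), and it averages the grid levels with equal weights, whereas spatial pyramid matching normally weights levels differently. The function names, the `levels` parameter, and the choice of histogram intersection as the window comparison are all assumptions made for the example.

```python
from collections import Counter


def window_histograms(label_map, n):
    """Split a 2-D label map into n x n windows; return one normalized label histogram per window."""
    rows, cols = len(label_map), len(label_map[0])
    hists = []
    for i in range(n):
        for j in range(n):
            r0, r1 = i * rows // n, (i + 1) * rows // n
            c0, c1 = j * cols // n, (j + 1) * cols // n
            counts = Counter(
                label_map[r][c] for r in range(r0, r1) for c in range(c0, c1)
            )
            total = sum(counts.values())
            # Empty windows (image smaller than the grid) get an empty histogram.
            hists.append({k: v / total for k, v in counts.items()} if total else {})
    return hists


def histogram_intersection(h1, h2):
    """Overlap between two normalized histograms, in [0, 1]."""
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())


def semantic_distance(map_a, map_b, levels=(1, 2, 4)):
    """1 minus the mean window-wise histogram intersection, averaged over grid sizes.

    Equal level weights are an assumption of this sketch.
    """
    similarity = 0.0
    for n in levels:
        hists_a = window_histograms(map_a, n)
        hists_b = window_histograms(map_b, n)
        matches = sum(histogram_intersection(a, b) for a, b in zip(hists_a, hists_b))
        similarity += matches / (n * n)
    return 1.0 - similarity / len(levels)
```

Two images with the same labels in the same places get distance 0; images that contain the same objects in a different spatial layout agree at the coarsest level but are penalized at the finer ones.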

Process of Defining Semantic Distance(2/3)

The Space of Images (3/3)
A visualization of 12201 images that are fully annotated

Recognition by Scene Alignment
Given a new image as input, we use the GIST descriptor to compute its distance to the images in the database

The Power of a Large Scale Database
A simple algorithm provides an upper bound: find the nearest neighbor of the input image and use its annotation as a labeling of the input image

This result gives us a hint about "How many more images do we need to label?"
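The nearest-neighbor labeling described above can be illustrated with a minimal Python sketch. It assumes every database image has already been reduced to a fixed-length descriptor vector (for example a precomputed GIST descriptor, which is not computed here) and compares descriptors with plain Euclidean distance. The function name and the `(descriptor, annotation)` pair layout are hypothetical.

```python
import math


def nearest_neighbor_labeling(query_descriptor, database):
    """Return (distance, annotation) of the database entry closest to the query.

    `database` is a list of (descriptor, annotation) pairs, where each
    descriptor is a sequence of floats with the same length as the query.
    """
    descriptor, annotation = min(
        database, key=lambda item: math.dist(query_descriptor, item[0])
    )
    return math.dist(query_descriptor, descriptor), annotation


# Toy database: 2-D vectors stand in for real image descriptors.
toy_database = [
    ([0.0, 0.0], ["sky", "road"]),
    ([1.0, 1.0], ["building", "car"]),
    ([5.0, 5.0], ["sea"]),
]
distance, labels = nearest_neighbor_labeling([0.9, 1.2], toy_database)
```

The transferred labeling can only be as good as the closest match, which is why the quality of the result depends on how densely the database covers the space of images.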

Beyond 2-D Images
-A. From Annotations to 3-D
-B. Video Annotation

From Annotations to 3-D (1/7)
The object labels contain some implicit information, which can be recovered by analyzing the overlap between object boundaries

Object types
-Ground objects
-Standing objects
-Attached objects

Relations between objects
-Supported-by
-Part-of

From Annotations to 3-D (2/7)
Learning the relationships between objects

1) part-of: evaluate the frequency of high relative overlap between polygons
2) supported-by: the bottom part of the object's polygon lies inside the supporting object
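The two geometric cues above can be sketched for a single pair of polygons as follows. This is an illustration under stated assumptions, not the method used by LabelMe: relative overlap is approximated by sampling points on a regular grid, "bottom part" is taken to mean the vertices in the lowest 10% of the polygon's height, and image coordinates are assumed to have y growing downward. The function names and the `step` and `bottom_fraction` parameters are hypothetical. Learning the relations would additionally require counting how often these cues fire across many annotated images.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if `point` lies strictly inside `polygon` (list of (x, y))."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through the point
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def relative_overlap(part, whole, step=1.0):
    """Approximate fraction of the area of `part` that falls inside `whole`."""
    xs = [p[0] for p in part]
    ys = [p[1] for p in part]
    in_part = in_both = 0
    y = min(ys) + step / 2
    while y < max(ys):
        x = min(xs) + step / 2
        while x < max(xs):
            if point_in_polygon((x, y), part):
                in_part += 1
                if point_in_polygon((x, y), whole):
                    in_both += 1
            x += step
        y += step
    return in_both / in_part if in_part else 0.0


def bottom_inside(obj, support, bottom_fraction=0.1):
    """True if all of the lowest vertices of `obj` lie inside `support`.

    Assumes image coordinates, where larger y means lower in the image.
    """
    ys = [p[1] for p in obj]
    cutoff = max(ys) - bottom_fraction * (max(ys) - min(ys))
    bottom_vertices = [p for p in obj if p[1] >= cutoff]
    return all(point_in_polygon(p, support) for p in bottom_vertices)
```

For example, a wheel polygon drawn entirely inside a car polygon has a relative overlap of 1.0 (a part-of candidate), while a car whose lower edge sits inside a road polygon passes the `bottom_inside` test (a supported-by candidate).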

From Annotations to 3-D(3/7)

From Annotations to 3-D (4/7)
Reconstructing a 3-D model for the input image

1) Define the object type
2) Define the polygon edge type
3) Compute the real distance between objects

Object type
-Ground objects (green)
-Standing objects (red)
-Attached objects (yellow)

Edge type
-Contact (white)
-Attached (gray)
-Occlusion (black)

From Annotations to 3-D(5/7)

From Annotations to 3-D (6/7)
More labeling improves the quality
However, if the labeling goes wrong ...

From Annotations to 3-D(7/7)

Video Annotation(1/1)

Conclusion
LabelMe is a web-based tool that allows the labeling of objects and their location in images

LabelMe has collected a large annotated database of images with many different scene and object classes

LabelMe can recover the 3-D description of an image

The next goal is expanding the database to video, a promising direction for computer vision and computer graphics
