Object Recognition as Machine Translation
Learning a Lexicon for a Fixed Image Vocabulary
Miriam Miklofsky

Lexicons

A vocabulary of terms used in a subject
A specialized list of terms
Devices that predict one representation given another representation

Dataset

Aligned bitext
Annotated images
Images with regions
It is unknown which region of an image goes with which word from the text

EM
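Since the dataset does not say which region goes with which word, EM can estimate the lexicon from co-occurrence alone. The following is a minimal sketch, assuming a simplified IBM Model 1 style update in which each quantized region ("blob") predicts a word with probability p(word | blob). The function name `em_lexicon`, the data layout, and the toy blob and word labels are hypothetical and are not taken from the slides.

```python
from collections import defaultdict


def em_lexicon(pairs, iterations=20):
    """Estimate t[(word, blob)] = p(word | blob) from (blobs, words) pairs.

    Assumption: a simplified IBM Model 1 style update; each word of an
    image is explained by exactly one of that image's blobs.
    """
    blobs_all = {b for blobs, _ in pairs for b in blobs}
    words_all = {w for _, words in pairs for w in words}
    # Start from a uniform table: every blob predicts every word equally.
    t = {(w, b): 1.0 / len(words_all) for w in words_all for b in blobs_all}

    for _ in range(iterations):
        count = defaultdict(float)  # expected (word, blob) pairings
        total = defaultdict(float)  # expected pairings per blob
        # E-step: share each word among the blobs of its own image,
        # in proportion to the current table.
        for blobs, words in pairs:
            for w in words:
                norm = sum(t[(w, b)] for b in blobs)
                for b in blobs:
                    frac = t[(w, b)] / norm
                    count[(w, b)] += frac
                    total[b] += frac
        # M-step: renormalize so that p(. | blob) sums to one.
        for key in t:
            _, b = key
            t[key] = count[key] / total[b] if total[b] > 0 else 0.0
    return t


# Toy data: three "images", each a list of blob labels plus its words.
pairs = [
    (["b_sky", "b_grass"], ["sky", "grass"]),
    (["b_sky", "b_water"], ["sky", "water"]),
    (["b_grass", "b_tiger"], ["grass", "tiger"]),
]
lexicon = em_lexicon(pairs)
```

On this toy data the table concentrates on the pairings that co-occur most consistently, so `b_sky` ends up predicting `sky` even though no image ever states that correspondence directly.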

Clustering

K-means clustering
  Vector quantize the image region representation
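Vector quantization with k-means can be sketched as follows. This is an illustrative NumPy version, assuming each region is described by a fixed-length feature vector; the function names, the random initialization, and the toy features are assumptions, not details from the slides.

```python
import numpy as np


def assign_to_centers(features, centers):
    """Return, for every feature vector, the index of its nearest center."""
    dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans_quantize(features, k, iterations=50, seed=0):
    """Cluster region features with plain k-means.

    Returns (labels, centers); labels[i] is the discrete "blob" token
    that replaces the continuous feature vector of region i.
    """
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: initialize centers from k distinct data points.
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iterations):
        labels = assign_to_centers(features, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign_to_centers(features, centers), centers


# Toy region features forming two well-separated groups.
region_features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = kmeans_quantize(region_features, k=2)
```

After quantization every region is represented by a single cluster index, which plays the role of a word in the image "language".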

Kullback-Leibler divergence
Relative entropy
Measure of the difference between two probability distributions over the same event space
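For discrete distributions p and q over the same events, the standard definition is D(p || q) = sum over x of p(x) log(p(x) / q(x)). A small sketch of that formula follows; the function name and the example distributions are illustrative only.

```python
import math


def kl_divergence(p, q):
    """Return D(p || q) in nats for two discrete distributions.

    p and q are sequences of probabilities over the same events.
    Terms with p(x) = 0 contribute nothing; if q(x) = 0 where
    p(x) > 0, the divergence is infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same event space")
    total = 0.0
    for p_x, q_x in zip(p, q):
        if p_x == 0:
            continue
        if q_x == 0:
            return math.inf
        total += p_x * math.log(p_x / q_x)
    return total


same = kl_divergence([0.5, 0.5], [0.5, 0.5])      # identical distributions
forward = kl_divergence([0.9, 0.1], [0.5, 0.5])
backward = kl_divergence([0.5, 0.5], [0.9, 0.1])  # differs from forward
```

The divergence is zero only when the two distributions agree, and it is not symmetric, so it is a measure of difference rather than a true distance.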

Evaluation

Auto-annotate images
  Quantize regions
  Use lexicon to determine word
  Annotate image with word
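The three annotation steps above can be sketched as one function. This is a hedged illustration, assuming the cluster centers come from an earlier k-means step and the lexicon is stored as a matrix whose entry `[b, w]` is p(word w | blob b); the names `annotate_image`, `centers`, `lexicon`, and `vocabulary` and the toy values are hypothetical.

```python
import numpy as np


def annotate_image(region_features, centers, lexicon, vocabulary):
    """Predict one word per region and collect them as the annotation.

    region_features: (n_regions, d) feature vectors of one image
    centers:         (n_blobs, d) cluster centers used for quantization
    lexicon:         (n_blobs, n_words) table, lexicon[b, w] = p(word w | blob b)
    vocabulary:      list of n_words strings
    """
    region_features = np.asarray(region_features, dtype=float)
    centers = np.asarray(centers, dtype=float)
    lexicon = np.asarray(lexicon, dtype=float)
    # Step 1: quantize each region to its nearest cluster center.
    dists = np.linalg.norm(
        region_features[:, None, :] - centers[None, :, :], axis=2
    )
    blobs = dists.argmin(axis=1)
    # Step 2: look up the most probable word for each blob.
    region_words = [vocabulary[int(lexicon[b].argmax())] for b in blobs]
    # Step 3: the image annotation is the set of predicted words.
    return region_words, sorted(set(region_words))


# Toy inputs: two blobs, two words, one image with three regions.
centers = [[0.0, 0.0], [10.0, 10.0]]
lexicon = [[0.8, 0.2], [0.1, 0.9]]
vocabulary = ["sky", "grass"]
region_words, annotation = annotate_image(
    [[0.5, 0.2], [9.0, 11.0], [0.1, 0.1]], centers, lexicon, vocabulary
)
```

The per-region words correspond to the correspondence task, while the collected set is what gets compared with the image's true annotation.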

Results - Annotation

Base results
  80 words of the 371-word vocabulary could be predicted
Retraining
  Similar results, but some words with higher recall and precision
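Per-word recall and precision for annotation can be computed as in the sketch below. It assumes the usual definitions (recall is correct predictions over images that truly carry the word, precision is correct predictions over images where the word was predicted); the function name and the toy annotations are made up for illustration and are not results from the slides.

```python
def word_recall_precision(predicted, actual, word):
    """Return (recall, precision) of one word over a set of images.

    predicted, actual: lists of sets of words, one set per image,
    aligned so that predicted[i] and actual[i] describe the same image.
    """
    correct = sum(1 for p, a in zip(predicted, actual) if word in p and word in a)
    n_predicted = sum(1 for p in predicted if word in p)
    n_actual = sum(1 for a in actual if word in a)
    recall = correct / n_actual if n_actual else 0.0
    precision = correct / n_predicted if n_predicted else 0.0
    return recall, precision


# Toy annotations for three images (illustrative only).
predicted = [{"sky", "grass"}, {"sky"}, {"water"}]
actual = [{"sky"}, {"water"}, {"water", "sky"}]
sky_scores = word_recall_precision(predicted, actual, "sky")
water_scores = word_recall_precision(predicted, actual, "water")
```

A word counts as predictable only if it is ever predicted correctly, which is why a vocabulary can contain many words with zero recall.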

Results (cont.)

Null probability
  Recall decreases
  Precision increases
Clustering of like words
  Recall values of clusters are higher than for single words
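One way to read the null-probability result is as a confidence threshold: a region is left unlabeled when its best word is not probable enough. The sketch below assumes that interpretation; the threshold mechanism, the function name, and the probabilities are assumptions made for illustration and are not specified on the slides.

```python
def predict_with_null(word_probs, threshold):
    """Return the most probable word, or None (null) if it is too unlikely.

    word_probs: dict mapping each word to p(word | blob)
    threshold:  minimum probability required to commit to a word
    """
    best_word = max(word_probs, key=word_probs.get)
    if word_probs[best_word] >= threshold:
        return best_word
    return None  # refuse to predict: the region is assigned null


confident = predict_with_null({"sky": 0.7, "water": 0.3}, threshold=0.5)
uncertain = predict_with_null(
    {"sky": 0.4, "water": 0.35, "grass": 0.25}, threshold=0.5
)
```

Refusing uncertain predictions removes some correct guesses along with many wrong ones, which is consistent with recall going down while precision goes up.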

Results - Correspondence

Base results
  Some good words, with up to 70% correct prediction
Null prediction
  Predict good words with greater probability
Word clustering
  Prediction rate generally increases

Evaluation

Human evaluation
  Images viewed by hand
  Somewhat subjective

EM (cont.)

KL Divergence
