
Semantic and Diverse Summarization of Egocentric Photo Events

Aniol Lidon Baulida

Master in Computer Vision (UAB, UPC, UPF, UOC)

Advisors:
Xavier Giró-i-Nieto, Image Processing Group, Universitat Politècnica de Catalunya
Petia Radeva, Barcelona Perceptual Computing Lab, Universitat de Barcelona

1

Collaboration

Barcelona Perceptual Computing Laboratory:

Marc Bolaños, Petia Radeva

Image Processing Group:

Xavier Giró

Grup de Recerca Cervell, Cognició i Conducta:

Maite Garolera

Institute of Creative Media Technologies:

Matthias Zeppelzauer

2

Motivation

• In 2013, 44.4 million people worldwide were living with dementia.
• “Cognitive Stimulation Therapy”

3

Motivation

• Lifelogging with the Narrative Clip.
• Up to 2,000-3,000 images per day!
• Summarization is needed.

4

Goal

5

RELEVANCE

DIVERSITY

Automatically summarize events:
• Sorting by priority.
• Trade-off between relevance and diversity.
• Obtaining sorted ranks.

State of the art

• This project continues the work started by Ricard Mestre:
  – Event segmentation and selection of the most repeated image in an event.

• Off-the-shelf algorithms used:
  – Informativeness network: provided by Marc Bolaños (to be published).
  – Blur detection: Crete et al., “The blur effect: perception and estimation with a new no-reference perceptual blur metric.”
  – Saliency maps: provided by Kevin McGuinness (to be published).
  – Face detection: Zhu et al., “Face detection, pose estimation, and landmark localization in the wild.”
  – Object candidates: Arbelaez et al., “Multiscale Combinatorial Grouping.”
  – Object detector: Hoffman et al., “Large Scale Detection through Adaptation.”
  – Affective: Campos et al., “Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction.”

8

Pipeline

9


Prefiltering

11

Aim: Removing uninformative images.

Informativeness network

Fine-tuning by Human Annotations

Filtering out: Discarding absolutely uninformative frames.
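The filtering step above can be sketched as a simple threshold on the network's score. The function name and the threshold value are illustrative assumptions, not the project's actual code:

```python
def prefilter(frames, informativeness_scores, threshold=0.1):
    """Keep only frames whose informativeness score reaches the threshold.

    `informativeness_scores` would come from the informativeness CNN
    fine-tuned on human relevant/irrelevant annotations; the threshold
    value 0.1 is a hypothetical choice for illustration.
    """
    return [f for f, s in zip(frames, informativeness_scores) if s >= threshold]
```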

Pipeline

12


Relevance

14

What is relevance? Frame-level:

• Repeated.
• Unusual.
• WHAT? Representative of an activity.
• WHO? Social interactions.
• WHERE? Environment.
• WHEN an event has occurred.
• HOW the activity occurred.

Relevance

15

What is relevance? Frame-level:

• WHAT? Representative of an activity:
  • Saliency maps
  • Object detection

• WHO? Social interactions:
  • Face detection
  • Sentiment analysis (affectivity)

Relevance Ranking: pipeline

16

Prefiltering

Diversity re-ranking

Relevance ranking: Saliency maps

SalNet CNN

Aim: Determining interesting zones.

Scoring for relevance: Averaging all saliency-map values.

17
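The saliency scoring rule above is simple enough to sketch directly. This is an illustrative reading (mean over a 2-D SalNet saliency map), not the project's actual code:

```python
import numpy as np

def saliency_relevance(saliency_map):
    """Relevance score of a frame: the average of all saliency-map values.

    `saliency_map` is assumed to be a 2-D array of per-pixel saliency
    (e.g. SalNet output), higher meaning more interesting.
    """
    return float(np.asarray(saliency_map).mean())
```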

Relevance ranking

18

Objects

LSDA: Large Scale Detection through Adaptation

Object Detector

Aim: Finding well defined objects.

Scoring for relevance: Summing the confidence scores of all detected objects.

Relevance ranking

19

Faces

Face detection, pose estimation, and landmark localization in the wild.

Aim: Finding well defined faces.

Scoring for relevance: Exponentially weighted sum of all face-detection confidences.
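The two scoring rules for objects and faces can be sketched as below. The exp-based weighting is only one plausible reading of "summing exponentially" the face confidences; treat it as an assumption:

```python
import math

def object_relevance(detections):
    """Relevance from objects: plain sum of all detected objects'
    confidence scores (e.g. LSDA detections as (label, score) pairs)."""
    return sum(score for _, score in detections)

def face_relevance(confidences):
    """Relevance from faces: exponentially weighted sum of face-detection
    confidences, so high-confidence faces dominate. The exp(c) - 1 form
    (zero confidence contributes nothing) is an illustrative assumption."""
    return sum(math.exp(c) - 1.0 for c in confidences)
```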

Relevance Ranking: pipeline

20

Prefiltering

Diversity re-ranking

Pipeline

21


Diversity re-ranking

Re-ranking by Soft Max Diversity Fusion

23

Color similarity

Faces similarity

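A greedy sketch of the diversity re-ranking idea: at each step a candidate is penalized by its maximum similarity (e.g. color or face similarity) to the items already selected. The linear trade-off and the `alpha` parameter are assumptions for illustration, not the exact Soft Max Diversity Fusion formulation:

```python
import numpy as np

def diversity_rerank(relevance, similarity, alpha=0.5):
    """Greedy diversity re-ranking.

    relevance  : length-N list of relevance scores
    similarity : N x N matrix, similarity[i][j] between items i and j
    alpha      : trade-off between relevance and diversity (assumed)
    Returns item indices in re-ranked order.
    """
    remaining = list(range(len(relevance)))
    first = int(np.argmax(relevance))      # seed with the most relevant item
    ranked = [first]
    remaining.remove(first)
    while remaining:
        best, best_score = None, float("-inf")
        for i in remaining:
            # Penalize by the max similarity to anything already picked.
            max_sim = max(similarity[i][j] for j in ranked)
            score = alpha * relevance[i] - (1 - alpha) * max_sim
            if score > best_score:
                best, best_score = i, score
        ranked.append(best)
        remaining.remove(best)
    return ranked
```

With this rule, a near-duplicate of an already-selected image drops down the rank even when its own relevance is high.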

Similarity measure

26

ImageNet

Euclidean distance between features (L2 norm).

CNN trained with ImageNet DB (1000 classes) using CaffeNet Architecture.

Fully connected layer 8 removed.
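The similarity measure above boils down to an L2 distance between penultimate-layer CNN features (CaffeNet trained on ImageNet, with the fc8 classifier removed). A minimal sketch:

```python
import numpy as np

def feature_distance(f1, f2):
    """Euclidean (L2) distance between two CNN feature vectors,
    e.g. fc7 activations of a CaffeNet trained on ImageNet."""
    return float(np.linalg.norm(np.asarray(f1) - np.asarray(f2)))
```

A similarity score can then be derived from the distance, e.g. as 1 / (1 + distance).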

Pipeline

27


Assessment

29

Validation of automatic approach

Manually annotated summaries

• 7 datasets with labelled ground-truth
• 2 online questionnaires
• Mean Opinion Score

Psychologists feedback:

INTERMEDIATE VALIDATION FINAL EVALUATION

Subjective problem

30

Precision between the GROUND-TRUTH and SELECTED image sets.

Metric

31

Mean Normalized Sum of Max Similarities (MNSMS)

For each rank n, every ground-truth image is matched against the top-n results of the sorted list and contributes its maximum similarity; these maxima are summed.

Normalization in both axes:
• Y: divide by the number of ground-truth samples.
• X: reshape samples to N bins.

The final score is the AUC of the MNSMS(n) curve.

[Figure: the MNSMS(n) curve built step by step for n = 1…4, matching the ground-truth set against the sorted result list.]
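As a hedged sketch of MNSMS, assuming pairwise similarities are precomputed and the result list is already sorted: dividing by the number of ground-truth samples is realized here as a mean, and the X-axis binning is omitted for brevity:

```python
import numpy as np

def mnsms(similarity, n):
    """MNSMS at rank n: mean over ground-truth images of the maximum
    similarity to any of the top-n results.

    similarity[g][r] is the similarity between ground-truth image g and
    result image r, with results already sorted by the ranking under test.
    """
    top = np.asarray(similarity)[:, :n]
    return float(top.max(axis=1).mean())

def mnsms_auc(similarity):
    """Area under the MNSMS(n) curve, averaged over all ranks."""
    n_results = np.asarray(similarity).shape[1]
    return float(np.mean([mnsms(similarity, n) for n in range(1, n_results + 1)]))
```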

Assessment

36

Validation of automatic approach

Manually annotated summaries

• 7 datasets with labelled ground-truth
• MNSMS (ImageNet) AUC

• 2 online questionnaires
• Mean Opinion Score

Psychologists feedback:

INTERMEDIATE VALIDATION FINAL EVALUATION

Intermediate validation

37

Prefiltering

• Informativeness network
• Hand-crafted estimators
• No prefiltering

Intermediate validation

38

Saliency relevance:
• SalNet
• SalNet + Gaussian

Objects relevance:
• LSDA (object detector)
• MCG (object candidates)

[Charts: Saliency Relevance AUC (SalNet vs. SalNet + Gauss) and Objects Relevance AUC (LSDA vs. MCG); AUC axis from 0.70 to 0.90.]

Intermediate validation

Affective relevance:
• Positive
• Negative
• Extremum
• Random

Sentiment analysis CNN
• 2 classes: positive / negative

39

Assessment

40

Validation of automatic approach

Manually annotated summaries

• 7 datasets with labelled ground-truth
• MNSMS (ImageNet) AUC

• 2 rounds of online questionnaires
• Mean Opinion Score

Psychologists feedback:

INTERMEDIATE VALIDATION FINAL EVALUATION

Final evaluation

41

SIMILARITY

• ImageNet CNN (fc8 removed)

• Places CNN (fc8 removed)

• LSDA (only spatial NMS)

• Fusion (ImageNet + Places + LSDA)

(Diversity re-ranking + Weight fusion in MNSMS)

Final evaluation

43

MEAN OPINION SCORE

• ImageNet configuration

• Uniform Sampling

• Ground-truth (previous manual annotation)

Final results

Representativeness of summaries:

Preferred summary:

Mean Opinion Score (1 = worst, 5 = best)

45

Generalization: MediaEval diverse task

• APPLICATION: Finding more information about a place to visit.
• GOAL: Provide a ranked list of Flickr photos for a predefined set of queries. The refined list should be both relevant to the query and also diverse.

46

A. Lidon, M. Bolaños, M. Seidl, X. Giro-i-Nieto, P. Radeva, and M. Zeppelzauer, “UPC-UB-STP @ MediaEval 2015 diversity task: Iterative reranking of relevant images,” in MediaEval 2015 Workshop, Wurzen, Germany, 2015.

[Chart: Run 1 F1@20 (Visual), values between 0.40 and 0.56.]

Conclusions

• Contributions:
  – Mean Normalized Sum of Max Similarities (MNSMS).
  – New criterion for semantic diversity (based on LSDA).
  – New method for diversity fusion.
  – Online evaluation questionnaires.

47

Conclusions

• Tested in two applications:
  – Memory reinforcement for mild dementia.
  – Diverse Social Images Task from the scientific MediaEval benchmark.

• Mean Opinion Score of 4.6 out of 5.00.

• Publications:
  – Working-notes paper in the MediaEval challenge.
  – Article in “Wearable and Ego-vision Systems for Augmented Experience” of the journal IEEE Transactions on Human-Machine Systems.

• Code available: https://imatge.upc.edu/web/resources/semantic-and-diverse-summarization-egocentric-photo-events-software

48

Future work

• Explore further relevance criteria.
• Higher level of semantics.
• Determine the summary length automatically.

49

Thanks for your attention!

50

Prefiltering

51

Hand-crafted estimators:
• Blur (Crete et al.)
• Black
• Burned
• Color mean

Informativeness network

• CNN trained with ImageNet + Places.
• Fine-tuned with human annotations: relevant / irrelevant.

By Marc Bolaños (UB)

Relevance ranking

52

Affective

• VitorNet CNN (2-class sentiment predictions)

By Victor Campos (UPC)

Relevance ranking

53

Late fusion

• Score normalization:
  • By rank
  • By score

• Aggregate scores

Weights will be learned using MNSMS.
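The two normalization options and the weighted aggregation can be sketched as follows; the min-max form of score normalization and the example weights are illustrative assumptions:

```python
import numpy as np

def normalize_by_score(scores):
    """Min-max normalize raw scores to [0, 1]."""
    s = np.asarray(scores, dtype=float)
    return (s - s.min()) / (s.max() - s.min())

def normalize_by_rank(scores):
    """Replace each score by its descending rank, scaled to [0, 1];
    the best item gets 1.0, the worst 0.0."""
    s = np.asarray(scores, dtype=float)
    order = np.argsort(np.argsort(-s))        # 0 = best-ranked item
    return 1.0 - order / (len(s) - 1)

def late_fusion(score_lists, weights):
    """Weighted aggregation of several normalized relevance scores;
    in the project the weights would be learned by optimizing MNSMS."""
    return sum(w * normalize_by_score(s) for w, s in zip(weights, score_lists))
```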

Similarity measure

54

ImageNet

Places

LSDA

CNN trained with ImageNet DB (1000 classes) using CaffeNet Architecture.

Fully connected layer 8 removed.

CNN trained with Places (476 classes) DB using CaffeNet Architecture.

Fully connected layer 8 removed.

Object detector: Large Scale Detection through Adaptation (7,500 classes).
Knowledge transfer: classifiers trained without bounding-box-annotated data are turned into detectors.
Two post-processing steps of non-maxima suppression.

Result: MediaEval diverse task

• APPLICATION: Finding more information about a place to visit.
• GOAL: Provide a ranked list of Flickr photos for a predefined set of queries. The refined list should be both relevant to the query and also diverse.

Pipeline stages:
• Ranking for relevance: Informativeness network, Textual.
• Filtering: Keep the N% top results.
• Distance computation: ImageNet, Places, Textual.
• Diversity: Diverse top results.

Result: MediaEval diverse task

• APPLICATION: Finding more information about a place to visit.
• GOAL: Provide a ranked list of Flickr photos for a predefined set of queries. The refined list should be both relevant to the query and also diverse.

[Chart: results for the Visual, Textual, Multi and Crediv. Multi runs, with the recommended run marked.]