Concept-Oriented Indexing of Video Databases: Towards Semantic Sensitive Retrieval and Browsing. J. Fan, H. Luo, K. Ahmed. Presented by Zeehasham Rasheed


Page 1: Presented by Zeehasham Rasheed

Concept-Oriented Indexing of Video Databases:

Towards Semantic Sensitive Retrieval and Browsing

J. Fan, H. Luo, K. Ahmed

Presented by Zeehasham Rasheed

Page 2: Presented by Zeehasham Rasheed

Outline

  • Introduction
  • Proposed Framework
  • Semantic Video Classification
  • Semantic Video Database
  • Performance Analysis
  • Conclusion and Future Work

Page 3: Presented by Zeehasham Rasheed

Introduction

Digital video now plays an important role in medical education.

Several content-based video retrieval (CBVR) systems have been proposed.

The challenging problems are the semantic gap, semantic video classification, and video database indexing.

Page 4: Presented by Zeehasham Rasheed

Proposed Framework

  • Semantic video content framework using principal video shots
  • Semantic video concept model
  • Semantic video classifier training framework
  • Concept-oriented video database

Page 5: Presented by Zeehasham Rasheed

Challenging Issues

The performance of semantic video classifiers largely depends on the quality of the features, and automatic extraction of semantic video objects is in general very hard.

Page 6: Presented by Zeehasham Rasheed

Existing CBVR systems are unable to support video access at the semantic level because of the semantic gap.

To bridge the semantic gap, the rule-based approach uses domain knowledge to define rules for extracting semantic video concepts.

Page 7: Presented by Zeehasham Rasheed

Semantic video classification techniques can be divided into rule-based approaches and statistical approaches.

The rule-based approach makes it easy to insert, delete, and modify existing rules.

The statistical approach uses machine learning techniques.

Page 8: Presented by Zeehasham Rasheed

Semantic Sensitive Video Content Analysis

It is necessary to understand which video patterns are suitable for interpreting certain domains, such as medical education.

A good semantic-sensitive video content framework should be able to enhance the quality of the features.

Page 9: Presented by Zeehasham Rasheed

The authors developed a novel framework that uses principal video shots for video content representation and feature extraction.

Based on the knowledge of medical consultants, a set of multimodal salient objects and semantic medical concepts has been designed.

Page 10: Presented by Zeehasham Rasheed

Multimodal salient objects include visual, auditory, and image-textual salient objects.

Visual salient objects include human faces, blood-red regions, and skin regions.

Auditory salient objects include single human speech and multiple human speech.

Page 11: Presented by Zeehasham Rasheed

Semantic medical concepts include lecture presentation, gastrointestinal surgery, dialog, and traumatic surgery.

All of these are required to select the principal video shots.

Page 12: Presented by Zeehasham Rasheed

Semantic Video Concept and Database Modeling

Which database model can be used to support a concept-oriented video database?

The authors proposed a novel framework to organize large-scale video collections into a domain-dependent concept hierarchy.

Page 13: Presented by Zeehasham Rasheed

The deepest level of the concept hierarchy is defined as the domain-dependent elementary semantic medical concepts.

For example, five different types of principal video shots, such as human faces, slides, text lines, and human speech, are related to the elementary semantic medical concept “Lecture Presentation”.
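As a minimal sketch of how this relationship might be held in memory, the Python snippet below maps an elementary concept to the principal video shot types related to it. Only the "Lecture Presentation" entry comes from the slide; the dictionary layout and the helper name `concepts_for_shot` are illustrative assumptions, not the authors' data model.

```python
# Elementary semantic medical concepts (deepest level of the hierarchy),
# each mapped to the principal video shot types related to it.
# Only "Lecture Presentation" is taken from the slides; the layout is assumed.
ELEMENTARY_CONCEPTS = {
    "Lecture Presentation": {"human face", "slides", "text lines", "human speech"},
}


def concepts_for_shot(shot_type, concepts=ELEMENTARY_CONCEPTS):
    """Return, sorted, the elementary concepts related to a shot type."""
    return sorted(name for name, shots in concepts.items() if shot_type in shots)
```

Looking up `"slides"` with this table returns the single concept "Lecture Presentation"; a shot type that no concept lists returns an empty list.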

Page 14: Presented by Zeehasham Rasheed

Semantic Video Classification

The major step is to classify the principal video shots into the most relevant elementary semantic medical concepts.

The one-against-all rule is used to label the training samples, where X denotes the perceptual features and C the semantic label of a sample.
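One-against-all labeling can be sketched as follows: for a chosen concept, every training sample carrying that label becomes a positive example and all others become negative. The function name `one_against_all_labels`, the (X, C) tuple layout, and the +1/-1 encoding are assumptions made for illustration.

```python
def one_against_all_labels(samples, target_concept):
    """Relabel (X, C) pairs as positive (+1) or negative (-1) for one concept.

    samples: list of (features, concept_label) tuples (hypothetical layout)
    """
    return [(x, 1 if c == target_concept else -1) for x, c in samples]
```

Running this once per elementary concept yields one binary training set per concept.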

Page 15: Presented by Zeehasham Rasheed

The posterior probability that a principal video shot with features X can be assigned to the elementary semantic medical concept C is determined within a Bayesian framework.
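The slide's own equation did not survive extraction. In its generic textbook form, Bayes' rule gives the posterior as shown below, where P(X | C_j) is the class-conditional density and P(C_j) the prior. The index j and the concept count N_c are notation introduced here and may differ from the authors' formulation.

```latex
P(C_j \mid X) = \frac{P(X \mid C_j)\, P(C_j)}{\sum_{i=1}^{N_c} P(X \mid C_i)\, P(C_i)}
```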

Page 16: Presented by Zeehasham Rasheed

Finally, to obtain a better likelihood and hence higher classification accuracy, the authors used the maximum a posteriori (MAP) probability as the classifier.

Page 17: Presented by Zeehasham Rasheed

The MAP estimate can be obtained with the expectation-maximization (EM) algorithm.

The authors call their variant the adaptive EM algorithm.
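The slides do not spell out the adaptive variant, so the sketch below shows only the standard EM iteration for a one-dimensional Gaussian mixture, to illustrate the E-step and M-step that any such variant builds on. It is a maximum-likelihood version with fixed component count; the function names, the 1-D restriction, and the `min_var` floor are assumptions, not the authors' algorithm.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, iterations=50, min_var=1e-6):
    """Standard EM for a 1-D Gaussian mixture; returns (means, variances, weights)."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(joint) or 1e-300  # guard against numerical underflow
            resp.append([p / total for p in joint])
        # M-step: re-estimate parameters from the responsibility-weighted data.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj == 0:
                weights[j] = 0.0  # component received no mass; switch it off
                continue
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)  # floor avoids a collapsed component
            weights[j] = nj / len(data)
    return means, variances, weights
```

With two well-separated clusters and rough initial means, the estimated means settle near the cluster centers and the weights near the cluster proportions.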

Page 18: Presented by Zeehasham Rasheed

Testing

The principal video shots and their features are extracted from the test video clips.

Linear discriminant analysis is used to obtain more representative features.

The classifier is then given an unlabeled principal video shot and its feature values.

Page 19: Presented by Zeehasham Rasheed

The shot is finally assigned to the elementary semantic medical concept that corresponds to the maximum posterior probability.
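The assignment step can be sketched as an argmax over per-concept posteriors. In this hypothetical helper the likelihood values P(X | C) are assumed to have been evaluated already by the trained class models; the function name `map_classify` and the dictionary inputs are illustrative, not part of the original system.

```python
def map_classify(likelihoods, priors):
    """Return the concept with maximum posterior and the normalized posteriors.

    likelihoods: dict concept -> P(X | C) evaluated at the shot's features
    priors:      dict concept -> P(C)
    Assumes at least one concept has a non-zero likelihood and prior.
    """
    joint = {c: likelihoods[c] * priors[c] for c in likelihoods}
    evidence = sum(joint.values())
    posteriors = {c: p / evidence for c, p in joint.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors
```

Because the evidence term is the same for every concept, the argmax over the unnormalized products gives the same label; the normalization is kept only so the returned values can be read as probabilities.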

Page 20: Presented by Zeehasham Rasheed

Concept-Oriented Video Database Organization

The following technique is used to support statistical video database indexing:

Each database node (semantic medical concept node) is described by its semantic label (keyword), its visual summary, and the statistical properties of the class distribution.
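A node of this kind might be modeled as below. This is a sketch under stated assumptions: the class name `ConceptNode`, the field names, and the content of `visual_summary` and `class_stats` are hypothetical stand-ins for the three descriptors named above, and `find` shows one simple way a keyword could be resolved to a node.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptNode:
    """One semantic medical concept node in the concept-oriented database."""
    label: str                                           # semantic label (keyword)
    visual_summary: list = field(default_factory=list)   # e.g. representative shot IDs (assumed)
    class_stats: dict = field(default_factory=dict)      # class-distribution statistics (keys assumed)
    children: list = field(default_factory=list)         # sub-concept nodes in the hierarchy

    def find(self, keyword):
        """Depth-first search for the node whose label matches a keyword."""
        if self.label.lower() == keyword.lower():
            return self
        for child in self.children:
            hit = child.find(keyword)
            if hit is not None:
                return hit
        return None
```

A keyword query then reduces to `root.find("lecture presentation")`, which returns the matching node or `None`.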

Page 21: Presented by Zeehasham Rasheed

Each database node is represented by a set of parameters.

Page 22: Presented by Zeehasham Rasheed

Hierarchical Video Retrieval

The goal is an intuitive approach that lets naive users specify queries.

Query Concept Specification via Browsing: users can quickly get a good idea of the video content by browsing the visual summaries of the semantic medical concept nodes.

They can pick one or multiple video clips as their query.

Page 23: Presented by Zeehasham Rasheed

Query Concept Specification via Keywords: keywords are the most useful way for naive users to specify their queries at the semantic level.

Query Concept Specification via Pattern Combination: users can express a query by using general combinations of principal video shots.

Page 24: Presented by Zeehasham Rasheed

Query Concept Evaluation for Query-by-Example

After the query concepts are interpreted from the selected video clips, the search is performed.

Users can then label the retrieved videos as relevant or irrelevant.

Several steps are taken to improve the search results for the next iteration.

Page 25: Presented by Zeehasham Rasheed

Informative sample selection: irrelevant video data samples that were obtained in the previous query and are located in the nearest-neighbor sphere are used to shrink the sampling area.

In this way, irrelevant samples are removed from the sampling area of the current query iteration.
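One simple reading of this step is sketched below: candidates are the samples inside the query's nearest-neighbor sphere, minus those the user marked irrelevant in the previous iteration. The function name `next_sampling_area`, the fixed `radius`, the Euclidean distance, and the ID-to-vector dictionary are all assumptions; the authors' actual shrinking procedure may differ.

```python
import math


def next_sampling_area(query, samples, radius, irrelevant_ids):
    """Return IDs of samples inside the query's nearest-neighbor sphere,
    excluding those marked irrelevant in the previous iteration.

    samples: dict mapping a sample ID to its feature vector (hypothetical layout)
    """
    return [
        sid for sid, features in samples.items()
        if math.dist(query, features) <= radius and sid not in irrelevant_ids
    ]
```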

Page 26: Presented by Zeehasham Rasheed

Best Search Direction Prediction: relevance feedback improves the query results and reduces the number of query iterations. The best search direction for the next query iteration can be predicted by combining such nearest-neighbor sphere reductions.

Page 27: Presented by Zeehasham Rasheed

Query Refinement: only the previous query vector and the positive samples are used to determine the query vector for the next iteration. The update is based on Rocchio’s formula.
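A positive-only Rocchio-style update can be sketched as a weighted sum of the previous query vector and the centroid of the samples marked relevant. The weights `alpha` and `beta` and their default values are assumptions chosen for illustration; the slides do not give the values the authors used.

```python
def refine_query(query, positives, alpha=1.0, beta=0.75):
    """Rocchio-style update using only the previous query and positive samples.

    query:     previous query vector (list of floats)
    positives: feature vectors of samples the user marked relevant
    """
    if not positives:
        return list(query)  # no feedback: keep the query unchanged
    dim = len(query)
    centroid = [sum(p[i] for p in positives) / len(positives) for i in range(dim)]
    return [alpha * q + beta * c for q, c in zip(query, centroid)]
```

Dropping the negative-sample term of the full Rocchio formula leaves the query moving only toward the relevant examples.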

Page 28: Presented by Zeehasham Rasheed

Performance Analysis

Benchmark metrics:

1- Classification accuracy (misclassification ratio versus classification accuracy ratio)
2- Retrieval accuracy (precision versus recall)
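For reference, the two kinds of metric can be computed as in the sketch below, using their standard definitions. The function names and the use of ID collections for retrieved and relevant items are illustrative choices.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def classification_accuracy(predicted, actual):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)
```

The misclassification ratio is one minus the classification accuracy.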

Page 29: Presented by Zeehasham Rasheed

Experiments

Our experiments are conducted on two image/video databases: a skin database (i.e., marked face database) from Purdue University and a medical video database. The skin database consists of 1265 face images, of which 150 are selected as the labeled samples for classifier training.

The medical video database includes more than 35000 principal video shots from 45 h of MPEG medical videos, of which 1500 principal video shots are selected as the training samples and labeled by our medical consultant.

Page 30: Presented by Zeehasham Rasheed
Page 31: Presented by Zeehasham Rasheed
Page 32: Presented by Zeehasham Rasheed

Conclusion and Future Work

The adaptive EM algorithm has improved the classification accuracy.

A novel semantic-sensitive video content framework based on principal video shots has been proposed.

Future work is to obtain more accurate estimates by using unlabeled data.

Page 33: Presented by Zeehasham Rasheed

Questions ???