A Novel Framework for Semantic Annotation and Personalized Retrieval of Sports Video
IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 10, NO. 3, APRIL 2008
Outline
Introduction
Semantic Annotation of Sports Video
    Text Analysis
    Video Analysis
    Text/Video Alignment
    Video Annotation and Indexing
Personalized Video Retrieval
Experiment and Evaluation
Text Analysis
Caption text overlaid on the video: recognizing overlaid caption text with OCR is unreliable because of the quality of broadcast sports video.
Closed caption: a speech-to-text transcript, so it contains a lot of information irrelevant to the game and lacks a well-defined structure.
Web-casting text is another text source related to sports video.
It is available on many sports websites, such as BBC and ESPN, and can be easily accessed during or after the game.
Its content is more focused on the events of sports games and has a well-defined structure.
Because web-casting text is the text counterpart of broadcast sports video, it includes detailed information about each event in a game.
The analysis of web-casting text consists of three steps:
    ROI Segmentation
    Keyword Identification
    Text Event Detection
ROI Segmentation
Keyword Identification
Text Event Detection
Example 1 (soccer):
79:19 Goal by Didier Drogba (Chelsea) drilled left-footed from right side of six-yard box (6 yards). Chelsea 4-1 Bayern Munich
Example 2 (basketball):
8:52 Kobe Bryant makes 17-foot two point shot (Smush Parker assists). LA Lakers 9-11 Denver
The presentation style of the event for soccer and basketball in web-casting text is slightly different, but the event and event semantics can be easily extracted and represented using a common structure as follows.
<Event> by <Player> of <Team> at <Time>
Goal by Frank Lampard of Chelsea at 58:58 (soccer)
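The mapping from a raw web-casting line to the common structure can be sketched as below. This is only an illustration, not the detection method of the paper: the two regular expressions, the `TextEvent` class, and the `parse_line` and `render` helpers are hypothetical and were fitted to the two example lines above only. The basketball example does not tie the player to a team, so the team is left empty in that case.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class TextEvent:
    event: str
    player: str
    team: Optional[str]  # None when the line does not name the player's team
    time: str


# Hypothetical patterns, fitted only to the two example lines above.
SOCCER = re.compile(
    r"^(?P<time>\d+:\d{2}) (?P<event>\w+) by (?P<player>[^(]+?) \((?P<team>[^)]+)\)"
)
BASKETBALL = re.compile(
    r"^(?P<time>\d+:\d{2}) (?P<player>.+?) makes (?P<event>.+?shot)"
)


def parse_line(line: str) -> Optional[TextEvent]:
    """Extract event, player, team and time from one web-casting line."""
    m = SOCCER.match(line)
    if m:
        return TextEvent(m["event"], m["player"], m["team"], m["time"])
    m = BASKETBALL.match(line)
    if m:
        return TextEvent(m["event"], m["player"], None, m["time"])
    return None  # line does not describe an event these patterns know


def render(e: TextEvent) -> str:
    """Format an event as '<Event> by <Player> of <Team> at <Time>'."""
    team = f" of {e.team}" if e.team else ""
    return f"{e.event} by {e.player}{team} at {e.time}"


soccer_line = (
    "79:19 Goal by Didier Drogba (Chelsea) drilled left-footed from right "
    "side of six-yard box (6 yards). Chelsea 4-1 Bayern Munich"
)
print(render(parse_line(soccer_line)))
```

A real system would need one pattern set per event keyword and per website layout; the point here is only that both presentation styles reduce to the same record.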
Video Analysis
    Shot Classification
    Replay Detection
    Video Event Modeling
Event with replay: far view shot, close-up shots, replay, close-up shots, far view shot
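The shot pattern of an event with replay can be searched for in a sequence of classified shots, as in the sketch below. It assumes that shot classification and replay detection have already produced one label per shot; the label names, the one-letter encoding, and `find_replay_events` are hypothetical, and allowing several consecutive replay shots is an assumption as well. It is a plain pattern match, not the event model used in the paper.

```python
import re

# Hypothetical one-letter codes for the three shot classes.
CODES = {"far": "F", "closeup": "C", "replay": "R"}

# far view, one or more close-ups, one or more replay shots,
# one or more close-ups, far view. The closing far view is a lookahead
# so that it can also open the next event.
PATTERN = re.compile(r"F(C+R+C+)(?=F)")


def find_replay_events(shot_labels):
    """Return (first_shot, last_shot) index pairs, both far view shots."""
    encoded = "".join(CODES[label] for label in shot_labels)
    return [(m.start(), m.end()) for m in PATTERN.finditer(encoded)]


shots = ["far", "closeup", "closeup", "replay", "closeup", "far", "far", "closeup", "far"]
print(find_replay_events(shots))
```

Each returned pair gives the indices of the opening and closing far view shots of one candidate event. Labels outside `CODES` raise a `KeyError`, which keeps unexpected classifier output from being silently ignored.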
Text/Video Alignment
Event Moment Detection
    Clock Digits Location
    Clock Digits Recognition
Event Boundary Detection
    Hidden Markov Model (HMM)
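Once the game clock has been located and recognized, the time stamp of a text event can be mapped to a video frame. The sketch below covers only this event moment step and leaves HMM-based boundary detection out. It assumes a hypothetical list of `(frame_index, clock_seconds)` samples produced by clock digit recognition, sorted by frame, with a clock that counts up as in soccer; a count-down clock, as in basketball, would need the readings converted first.

```python
from bisect import bisect_left


def to_seconds(stamp: str) -> int:
    """Convert a 'MM:SS' time stamp from the web-casting text to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def event_moment_frame(clock_samples, event_time):
    """Return the first frame whose recognized clock reaches event_time.

    clock_samples: list of (frame_index, clock_seconds), sorted by frame,
    with non-decreasing clock values (assumed count-up game clock).
    Returns None if the clock never reaches the event time.
    """
    target = to_seconds(event_time)
    clocks = [clock for _, clock in clock_samples]
    i = bisect_left(clocks, target)
    if i == len(clocks):
        return None
    return clock_samples[i][0]


# Made-up samples: one clock reading every 25 frames.
samples = [(0, 0), (25, 1), (50, 2), (75, 3)]
print(event_moment_frame(samples, "0:02"))
```

The binary search keeps the lookup cheap even when a clock reading is stored for every frame of a full game.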
Video Annotation and Indexing
For each game, we annotate the video at two levels.
L1: an overall game summary, including game name, date, place, teams, audience size, scores, etc.
L2: an annotation of each event in the video, using the text semantics extracted from the text event and the video boundaries obtained from text/video alignment:
<Event> <Priority> by <Player> of <Team> at <Time> <VideoStartFrame> <VideoEndFrame>
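An L2 annotation record and a simple index over it can be sketched as follows. The field names follow the template above; the integer type of `priority`, the `build_index` and `query` helpers, and all frame numbers and priority values in the sample data are assumptions made for illustration, not details taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class EventAnnotation:
    event: str
    priority: int           # assumption: the source does not define this field's type
    player: str
    team: str
    time: str
    video_start_frame: int
    video_end_frame: int


def build_index(annotations):
    """Index annotations by (field, value) for event, player and team."""
    index = {}
    for a in annotations:
        for key in (("event", a.event), ("player", a.player), ("team", a.team)):
            index.setdefault(key, []).append(a)
    return index


def query(index, **criteria):
    """Return annotations matching every given field=value criterion."""
    items = list(criteria.items())
    if not items:
        return []
    # Look up the first criterion in the index, then filter by the rest.
    candidates = index.get(items[0], [])
    return [a for a in candidates
            if all(getattr(a, field) == value for field, value in items[1:])]


# Frame numbers and priorities are placeholders, not measured values.
annotations = [
    EventAnnotation("Goal", 1, "Frank Lampard", "Chelsea", "58:58", 1000, 1500),
    EventAnnotation("Goal", 1, "Didier Drogba", "Chelsea", "79:19", 2000, 2600),
]
index = build_index(annotations)
for hit in query(index, event="Goal", player="Frank Lampard"):
    print(hit.time, hit.video_start_frame, hit.video_end_frame)
```

Storing the start and end frames with each record lets a retrieval request return the matching video segment directly.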
Experiment and Evaluation
Text Event Detection
Precision and recall reach 100% for all events, with one exception: the precision of the shot event for soccer is 97.1%.
Shot Classification and Replay Detection
Evaluation on Personalized Retrieval