Automatic transcription of video files
Carlos Turró
Universitat Politècnica de València
Agenda
• Why automatic transcription
• State of the art: The transLectures project
• Automatic transcription of Lecture Recordings: The Opencast Project
• Notes & the near future
Why automatic transcription of video files?
• Accessibility
• Searching within a video file
• Searching across a video repository
• Topic identification
• …and much more
The transLectures project
• Development of an Automatic Speech Recognition (ASR) engine for lectures & educational content
• Development of translation tools for that content
• Implementation
• Case studies: Videolectures.NET & Polimedia (UPV video repository)
• Real-life evaluation
• Integration into Opencast
http://www.translectures.eu
transLectures partners
12 Nov 2013
No.  Name                                          Country
1    Universitat Politècnica de València (MLLP)    Spain
2    Xerox SAS                                     France
3    Institut Jožef Stefan                         Slovenia
3+   Knowledge for All Foundation                  UK
4    RWTH Aachen University                        Germany
5    EML – European Media Laboratory               Germany
6    DDS – Deluxe Digital Studios                  UK
36 Months
November 2014
Statistical transcription (and translation)
Sound -> ASR Engine (Acoustic Model, Language Model) -> TRANSCRIPTION
Statistical transcription (and translation): building the models
Manually transcribed voice -> Modeling Engine -> Acoustic Model, Language Model
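The two models on these slides are usually combined through the standard decision rule of statistical speech recognition. The notation is introduced here for illustration and does not come from the slides: X is the sequence of acoustic observations extracted from the sound, and W is a candidate word sequence.

```latex
\hat{W} = \arg\max_{W} P(W \mid X) = \arg\max_{W} \underbrace{P(X \mid W)}_{\text{acoustic model}} \; \underbrace{P(W)}_{\text{language model}}
```

The second equality follows from Bayes' rule, because P(X) does not depend on W and can be dropped from the maximisation.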
Architecture of transLectures
[Diagram labels: Lecture, Slides, Extra content, Language Model, Transcription, Translation, Intelligent interaction, Result]
Languages
• Transcription (ASR)
  • EN
  • SL
  • ES
• Translation (MT)
  • EN>SL, SL>EN
  • EN>ES, ES>EN
  • EN>FR
  • EN>DE
Transcription and Translation Platform
Transcription and Translation Platform API
Transcription and Translation Platform
• Post-editing web interface (in HTML5)
Example video
• https://media.upv.es/?id=b444d12e-db23-9a4f-9b3b-d1d9275d4cb4
Scientific evaluations
• WER = Word Error Rate
• The lower the better
• Typically, a human transcriber has a WER of around 12%
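As a rough illustration of how WER is commonly computed, the sketch below counts the word-level substitutions, deletions and insertions needed to turn the reference transcript into the recogniser output, and divides by the number of reference words. The function name and the whitespace tokenisation are assumptions made for this example; real evaluations normally also normalise case and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    hypothesis = "the cat sit on mat"
    print(f"WER: {word_error_rate(reference, hypothesis):.1f}")
```

Because insertions count as errors, WER can exceed 100 when the hypothesis is much longer than the reference.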
Beyond transLectures
WER
Language      M10    M17
Dutch         25.7   24.5
Italian       21.2   17.7
Portuguese    45.9   43.0
Spanish       15.9   14.4
Estonian      N/A    27.1
French        N/A    22.7
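To make the progress between the M10 and M17 columns easier to compare across languages, the following sketch computes the relative WER reduction for the four languages that have both values. The figures are copied from the slide; the variable and function names are invented for this example.

```python
# WER values from the slide as (M10, M17); Estonian and French are left out
# because they have no M10 value.
wer_by_language = {
    "Dutch": (25.7, 24.5),
    "Italian": (21.2, 17.7),
    "Portuguese": (45.9, 43.0),
    "Spanish": (15.9, 14.4),
}


def relative_reduction(before: float, after: float) -> float:
    """Relative improvement in percent of the earlier value."""
    return 100.0 * (before - after) / before


reductions = {
    language: relative_reduction(m10, m17)
    for language, (m10, m17) in wer_by_language.items()
}

if __name__ == "__main__":
    for language, reduction in reductions.items():
        print(f"{language}: {reduction:.1f}% relative WER reduction")
```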
The Opencast Community is…
Universities, companies and people:
• concerned with academic video
• attracted to the Opencast values of openly exchanging ideas, experience, knowledge and code
• committed to building and maintaining a robust, flexible, high-quality open source lecture capture and academic video management solution.
Now also part of
Full-featured Lecture Recording ecosystem
Who uses Opencast?
Around the world, with especially strong adoption in Europe.
43 Adopters with public information (May 2014)
30+ commercial partner clients
http://opencast.org/matterhorn-adopters
Yesterday’s tweet
Indexing in Opencast
• Opencast has built-in OCR indexing capabilities
Video (slides) -> OCR (hunspell) -> Word list filter -> Apache Lucene search server
• New operations can be added
Video (slides) -> transcription (tL) -> Apache Lucene search server
or
Video (slides) -> OCR (hunspell) -> transcription (tL) -> Word list filter -> Apache Lucene search server
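The idea of adding operations to an indexing chain can be sketched as a list of functions applied in order. This is only an illustration and does not use Opencast's actual workflow format or API: the operation names, the `VOCABULARY` word list and the pre-supplied `slide_text` and `speech_text` strings (standing in for real OCR and ASR output) are all hypothetical.

```python
import re
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Operation = Callable[[State], State]

# Hypothetical word list used by the filter step.
VOCABULARY = {"speech", "recognition", "acoustic", "language", "model"}


def _tokens(text: str) -> List[str]:
    return re.findall(r"[a-z]+", text.lower())


def ocr(state: State) -> State:
    # Stand-in for OCR: the slide text is assumed to be already extracted.
    return {**state, "words": state["words"] + _tokens(state["slide_text"])}


def transcription(state: State) -> State:
    # Stand-in for ASR: the spoken text is assumed to be already transcribed.
    return {**state, "words": state["words"] + _tokens(state["speech_text"])}


def word_list_filter(state: State) -> State:
    return {**state, "words": [w for w in state["words"] if w in VOCABULARY]}


def run(operations: List[Operation], slide_text: str, speech_text: str) -> List[str]:
    """Apply the operations in order and return the words to be indexed."""
    state: State = {"slide_text": slide_text, "speech_text": speech_text, "words": []}
    for operation in operations:
        state = operation(state)
    return state["words"]


if __name__ == "__main__":
    slides = "Acoustic Modle"  # OCR output with a recognition error
    speech = "the acoustic model and the language model"
    print(run([ocr, word_list_filter], slides, speech))
    print(run([ocr, transcription, word_list_filter], slides, speech))
```

In this sketch, adding the transcription step is just one more entry in the list, which mirrors how the second and third chains on the slide extend the first one.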
Why do I need an indexing server?
• Powerful, accurate and efficient search algorithms
  • ranked searching -- best results returned first
  • many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
  • fielded searching (e.g. title, author, contents)
  • sorting by any field
  • multiple-index searching with merged results
  • allows simultaneous update and searching
  • flexible faceting, highlighting, joins and result grouping
  • fast, memory-efficient and typo-tolerant suggesters
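Two of the features listed above, ranked searching and fielded searching, can be illustrated with a toy in-memory inverted index. This is a deliberately simplified sketch and not how Apache Lucene is implemented: the `TinyIndex` class is invented for this example, and raw term counts stand in for the more elaborate relevance scoring a real search server uses.

```python
from collections import Counter, defaultdict
from typing import List


class TinyIndex:
    """Toy inverted index with per-field postings and count-based ranking."""

    def __init__(self) -> None:
        # postings[field][term] maps a document id to the term's count.
        self.postings = defaultdict(lambda: defaultdict(Counter))

    def add(self, doc_id: str, **fields: str) -> None:
        for field, text in fields.items():
            for term in text.lower().split():
                self.postings[field][term][doc_id] += 1

    def search(self, query: str, field: str = "contents") -> List[str]:
        """Return document ids for the given field, best match first."""
        scores: Counter = Counter()
        for term in query.lower().split():
            for doc_id, count in self.postings[field][term].items():
                scores[doc_id] += count
        return [doc_id for doc_id, _ in scores.most_common()]


if __name__ == "__main__":
    index = TinyIndex()
    index.add("v1", title="Speech recognition", contents="acoustic model and language model")
    index.add("v2", title="Opencast", contents="search the language of lectures")
    index.add("v3", title="Language model basics", contents="indexing slides")
    print(index.search("language model"))           # ranked search over contents
    print(index.search("language", field="title"))  # fielded search over titles
```

With transcriptions indexed in a field of their own, the same kind of query could be restricted to what was said in a lecture rather than what appeared on its slides.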
Demo on searching
• https://media.upv.es
Notes & the near future
• ASR technology is good enough for automated transcription of videos
  … given good enough sound
• There are lecture recording systems that let you plug in transcriptions for searching
  … like Opencast
• There are still things to solve
  • Transcription speed (good progress so far)
  • Topic identification
  • Adding more languages
Thanks!
Questions?
Learning more…
transLectures
http://translectures.eu
Video in a multilingual context (EMMA)
http://association.media-and-learning.eu/portal/resource/ml-webinar-video-multilingual-context
Opencast State of the Project
http://lanyrd.com/2015/apereo/sdmpry/