THALES COMMUNICATIONS IST Proposal MobiNews Meeting - June 10th, 2003 “Automatic and Personalised Compilation of Broadcast News with Audio Playback on Mobile Devices” June 10th, 2003 François CAPMAN, PhD Research Engineer, Technologies Radio & Signal Unit [email protected] Tel : +33 (0) 1 46 13 29 63 Fax : +33 (0) 1 46 13 25 55


THALES COMMUNICATIONS

IST Proposal

MobiNews Meeting - June 10th, 2003

“Automatic and Personalised Compilation of Broadcast News with Audio Playback on Mobile Devices”

June 10th, 2003

François CAPMAN, PhD

Research Engineer, Technologies Radio & Signal Unit

[email protected]

Tel : +33 (0) 1 46 13 29 63

Fax : +33 (0) 1 46 13 25 55


The information contained in this document is the exclusive property of the THALES Group. It must not be disclosed without the written consent of THALES Communications.

Radio & Signal Technologies TBU - THALES COMMUNICATIONS

MobiNews Workshop Agenda

• 10h00 - 10h15 Agenda, objectives of the meeting

• 10h15 - 10h30 Presentation of MobiNews IST proposal, current status

• 10h30 - 11h30 Presentation of each organisation 1 (5 min / 10 min)

• 11h30 - 11h45 Break

• 11h45 - 12h15 Presentation of each organisation 2 (5 min / 10 min)

• 12h15 - 12h45 Definition of contributions and overall structure of the project

• 12h45 - 13h45 Lunch

• 13h45 - 15h15 Detailed structure of the project, description of work-packages

• 15h15 - 15h45 Other topics (additional partners, ...)

• 15h45 - 16h00 Further steps, planning for the proposal

• 16h00 - 16h30 Discussion - Conclusion


IST Objectives (2nd Call)

Call 2: publication 17/6/2003, closing 15/10/2003; indicative budget of around 525 MEuros (80% pre-distributed).

Objectives covered in Call 2

• Advanced displays
• Optical, opto-electronic, & photonic functional components
• Open development platforms for software and services
• Cognitive systems
• Embedded systems
• Applications and services for the mobile user and worker (60 MEuros)
• Cross-media content for leisure and entertainment (55 MEuros)
• GRID-based Systems for solving complex problems
• Improving Risk management
• eInclusion

Specific Targeted Research Project (STREP) : 2.5 / 3.0 MEuros (Funding)


IST Objectives (2nd Call)

2.3.2.7 Cross-media content for leisure and entertainment

Objective: To improve the full digital content chain, covering creation, acquisition, management and production, through effective multimedia technologies enabling multi-channel, cross-platform access to media, entertainment and leisure content in the form of film, music, games, news and alike. It will accelerate take up in B2B, B2C and C2C, currently hampered by insufficient productivity, convergence and high cost.

Focus is on:

– Developing technologies supporting the creation of new, compelling forms of content for interactive, creative or artistic consumption. Research should aim at advancing imaging technologies and audio-visual representation, multi-dimensional immersive environments and experience portals, as well as virtual, augmented and mixed reality technologies featuring higher levels of quality and accuracy. Device adaptivity and contextualisation, personalisation and (emotive) feedback, and ability to capture real-time, multimodal and multisensorial input will be embedded as needed.

– Developing integrated content programming environments allowing to retrieve content from different sources, types and locations, and to store, compress and categorise it, with a view to realising programming appropriate to a particular audience and delivery channel, including interactive TV, e-cinema, radio, online games and music.


IST Objectives (2nd Call)

2.3.2.6 Applications and Services for the Mobile User and Worker

Objective: To foster the emergence of a rich landscape of innovative applications and services for the mobile user and worker and to support the use and development of new work methods and collaborative work environments. These should be based on interoperable mobile, wireless technologies and the convergence of fixed and mobile communication infrastructures. Such applications and services will enable new business models, new ways of working, improved customer relations and government services in any context. The target applications and services will be capable of being seamlessly accessed and provided anywhere, anytime and in any context.

Focus is on:

– The integration of technologies into a wide range of innovative mobile and multimodal applications and services including workplace designs that enhance creativity and productivity. (Intelligent, adaptive and self-configuring services that deploy wearable interfaces and enable automatic context-sensitivity, user profiling and personalisation in a trusted and secure environment as well as multi-lingual and multi-cultural presentation, and multiple modes of interaction)

– Addressing the major hurdles for the deployment of applications and services for the mobile user.


MobiNews Proposal

• Targeted Application
  • Automatic compilation of broadcast news (audio, text) with audio playback on mobile devices (2.5G, 3G).
  • Access to personally selected text and audio news from a service/source provider using the Multimedia Messaging Service (MMS) transmission protocol.

• Expected Features
  • Fast and reliable access to synthetic newscast on a regular basis (daily, weekly, …) or upon request.
  • Access to various identified sources within the same compilation, using the scheduled programme.
  • Automatic server-based generation of the synthetic newscast, with low-cost MMS / WAP 2.0 transmission towards mobile devices.
  • User-defined profile for automatic download.
  • Enhanced Man Machine Interface (MMI) for query submission, keyword-based search, ...


MobiNews Proposal

• Technical Objectives
  • Audio data and text data structuring:
    • automatic / semi-automatic segmentation (speaker tracking, scheduled programme, …)
    • classification, discrimination (speech, music, jingles, …)
    • transcription and information retrieval (word-spotting, keywords, …)
    • automatic summarisation
  • Very Low Bit Rate (VLBR) wide-band speech compression (with optional scalable audio stage).
  • Text-To-Speech (TTS) synthesis for audio playback of the transmitted text component (optional voice conversion, style / prosody mimicking).
  • Software optimisation (complexity and memory) of VLBR decoder and TTS modules for embedded solutions on mobile devices (downloadable as plug-ins).
  • Enhanced interface for mobile products (Natural Language Processing (NLP), …)
  • Demonstrator with MMS link between a PC-based server and a handheld mobile terminal.


MobiNews Proposal

[Figure: MMS service architecture. Block labels: Multimedia Library; Push / Pull Content Services; MMS Composer; MMS Centre; External Application; Content Provider; MMS Subscribers.]


MobiNews Proposal

[Figure: MobiNews service chain. Block labels: Content Providers; Service Provider; Audio & Text Data Structuring; Audio / Text Database; Automatic Compilation of Synthetic Newscast; Personalised Request; Predefined Profile; MMS-formatted Compression; MMS newscast; Mobile Customers.]


VLBR compression for MobiNews

Targeted duration: 10 to 15 minutes in one single MMS

VLBR between 800 and 1200 bits/sec

AMR mode   Bit rate (kbit/sec)   Duration of 30 kbytes MMS   Duration of 100 kbytes MMS
0          6.60                  36.4 sec                    2 min 01 sec
1          8.85                  27.1 sec                    1 min 30 sec
2          12.65                 19.0 sec                    1 min 03 sec
3          14.25                 16.8 sec                    56.1 sec
4          15.85                 15.1 sec                    50.5 sec
5          18.25                 13.2 sec                    43.8 sec
6          19.85                 12.1 sec                    40.3 sec
7          23.05                 10.4 sec                    34.7 sec
8          23.85                 10.1 sec                    33.5 sec

Time-equivalent duration of MMS messages using AMR speech encoder.

Duration     30 kbytes MMS    60 kbytes MMS    100 kbytes MMS
5 minutes    800 bits/sec     1600 bits/sec    2667 bits/sec
10 minutes   400 bits/sec     800 bits/sec     1333 bits/sec
15 minutes   267 bits/sec     533 bits/sec     889 bits/sec

Targeted average bit rates for compressing messages of a specified duration into one MMS.
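
The arithmetic behind these two tables can be checked with a short script. This is a minimal sketch, not part of the proposal: the function names are hypothetical, and it assumes 1 kbyte = 1000 bytes, which is the convention that reproduces the listed AMR durations. It also ignores any MMS header or packaging overhead.

```python
# Illustrative arithmetic only. Assumption: 1 kbyte = 1000 bytes (8000 bits),
# and the whole MMS payload is available for audio (no packaging overhead).

# Bit rates of the nine AMR modes listed on the slide, in kbit/sec.
AMR_MODES_KBPS = [6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85]


def playback_seconds(mms_kbytes: float, rate_kbps: float) -> float:
    """Seconds of audio that fit in an MMS of the given size at a fixed bit rate."""
    return mms_kbytes * 8.0 / rate_kbps


def target_bit_rate(mms_kbytes: float, minutes: float) -> float:
    """Average bit rate (bits/sec) needed to fit `minutes` of audio in one MMS."""
    return mms_kbytes * 8000.0 / (minutes * 60.0)


if __name__ == "__main__":
    # Duration of a 30-kbyte and a 100-kbyte MMS for each AMR mode.
    for mode, rate in enumerate(AMR_MODES_KBPS):
        short = round(playback_seconds(30, rate), 1)
        long_ = round(playback_seconds(100, rate), 1)
        print(f"mode {mode}: {rate} kbit/sec -> {short} s / {long_} s")

    # Average bit rate needed for 5, 10 and 15 minutes in 30/60/100 kbytes.
    for minutes in (5, 10, 15):
        rates = [round(target_bit_rate(size, minutes)) for size in (30, 60, 100)]
        print(f"{minutes} min: {rates} bits/sec")
```

Under these assumptions, a 10 to 15 minute newscast only fits in a single MMS at rates roughly an order of magnitude below the lowest AMR mode, which is the motivation for the VLBR target.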


MobiNews Work Packages

• Definition of Work Packages

• WP 1 Project management

• WP 2 Analysis of the needs, analysis of the market, dissemination

• WP 3 Broadcast radio news databases (specifications, collection, recordings)

• WP 4 Audio and text data structuring

• WP 5 Very-Low Bit Rate (VLBR) compression for synthetic newscast

• WP 6 Text-To-Speech (TTS) synthesis for mobile devices

• WP 7 MMS-based demonstrator (Server and mobile applications, MMI, …)

• WP 8 Evaluation methodology, field trials, analysis


MobiNews Consortium

• Thales Communications (France)

• L.I.A. (France)

• E.N.S.T. (France)

• E.S.I.E.E. (France)

• Elan Speech (France)

• Brno University of Technology (Czech Republic)

• Multitel (Belgium)

• INESC-ID (Portugal)

• PT Inovação, Voice services and platforms Dept (Portugal)

• Radio France Multimedia (France)

• Belga Press Agency (Belgium)

• Portuguese Radio/TV (Portugal) ???


Presentation of organisations

• 1 - Gwenaël Guilmin (Thales Communications)

• 2 - Bertrand Ravera : RNRT project proposal Mobi-Info

• 2 - Corinne Fredouille (L.I.A.)

• 3 - Maurice Charbit (E.N.S.T.)

• 4 - Geneviève Baudoin (E.S.I.E.E.)

• 5 - Jacques Toën (ELAN SPEECH)

• 6 - Petr Motlicek (BRNO University of Technology)

• 7 - Stéphane Deketelaere (MULTITEL)

• 8 - Isabel Trancoso (INESC-ID)

• 9 - Nuno Beires (PT INOVACAO)

• 10 - Caroline Roy (RADIO France MULTIMEDIA)

General Presentation and Potential Contributions to MobiNews


Contributions

• Thales Communications:
  • speech segmentation / classification
  • Very Low Bit Rate speech compression using parametric approaches
  • optimisation of VLBR for a mobile plug-in

• E.N.S.T.:
  • voice conversion using improved HNM synthesis
  • joint optimisation of speech units for coding and synthesis

• E.S.I.E.E.:
  • Very Low Bit Rate speech compression using recognition/synthesis
  • Very Low Bit Rate speech compression using parametric approaches
  • voice conversion
  • joint optimisation of speech units for coding and synthesis

• BRNO University of Technology:
  • Very Low Bit Rate speech compression using recognition/synthesis


Contributions

• ELAN SPEECH:
  • distributed architecture (mobile/server) for speech synthesis
  • optimisation for a mobile plug-in
  • voice personalisation, voice conversion

• INESC-ID and L.I.A.:
  • audio data structuring

• MULTITEL:
  • Man-Machine Interface, Natural Language Processing

• PT INOVACAO:
  • MMS synthetic newscast packaging
  • MMS-based demonstrator

• Radio France Multimedia and Belga Press Agency (+ Portuguese TV/radio):
  • specifications
  • news content provider
  • evaluation


MobiNews: WORKPLAN

• WP 2: Analysis of the market, … needs, dissemination:

• WP2.1: Analysis of the market: existing services

• WP2.2: Analysis of the needs: limitations of the existing services

• WP2.3: Dissemination: valorisation of the outcome of the project, standardisation, ...


MobiNews: WORKPLAN

• WP 3: Broadcast radio news databases

• WP3.1: Audio databases (collection, recordings, annotation, meta-data, …)

• WP3.2: Text databases (collection, annotation, meta-data, …)

• WP3.3: Service specifications (features, user acceptance, …)


MobiNews: WORKPLAN

• WP 4: Audio and Text data Structuring

• WP4.1: Low-level segmentation
  • speech/non-speech discrimination (silence, noise, pause, speech, music, jingle, …)
  • speaker characterisation (identification, tracking, segmentation, clustering, …)

• WP4.2: High-level segmentation
  • speech-to-text transcription
  • story segmentation, topic detection, tracking and classification

• WP4.3: Customisation
  • text summarisation, audio summarisation
  • constrained summarisation (profile-driven, queries-driven, duration, multi-sources, …)
  • meta-data information
  • evaluation methodology (reference human-built summaries, quiz scores, …)
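
As one hypothetical illustration of the constrained summarisation item in WP4.3 (profile-driven, limited in duration), the sketch below selects already-summarised stories for a newscast. The `Story` class, the keyword-overlap score and the greedy strategy are all assumptions made for illustration; the proposal does not specify a selection method.

```python
from dataclasses import dataclass


@dataclass
class Story:
    """Hypothetical record for one summarised news story."""
    title: str
    keywords: frozenset
    duration_s: float  # playback duration of the summarised story, in seconds


def compile_newscast(stories, profile_keywords, max_duration_s):
    """Greedy selection: best profile match first, within the duration budget."""
    profile = frozenset(profile_keywords)
    # Rank by number of keywords shared with the profile; on ties, prefer
    # the shorter story so that more items fit in the budget.
    ranked = sorted(stories, key=lambda s: (-len(s.keywords & profile), s.duration_s))
    selected, total = [], 0.0
    for story in ranked:
        if not (story.keywords & profile):
            continue  # no keyword in common with the user profile
        if total + story.duration_s <= max_duration_s:
            selected.append(story)
            total += story.duration_s
    return selected, total


if __name__ == "__main__":
    stories = [
        Story("Cup final", frozenset({"sport", "football"}), 120),
        Story("Budget vote", frozenset({"politics"}), 200),
        Story("Season review", frozenset({"sport"}), 400),
        Story("Weather", frozenset({"weather"}), 60),
    ]
    chosen, total = compile_newscast(stories, {"sport", "football", "politics"}, 600)
    print([s.title for s in chosen], total)
```

A greedy pass is used here only because it is short and easy to follow; a real system would likely weigh relevance against duration more carefully and handle multiple sources.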


MobiNews: WORKPLAN

• WP 5: VLBR Speech / Audio compression

• WP5.1: Segmental-based parametric compression of synthetic newscast
  • audio stream analysis and segmentation
  • optimised compression of structured messages
  • scalable solutions (bit-rate and bandwidth)

• WP5.2: Compression based on natural speech units indexing
  • optimised HNM-based speech synthesis
  • speaker-independent mode (speaker adaptation, voice conversion)
  • joint optimisation of units for both synthesis and coding
  • compression of synthesis units for memory storage optimisation


MobiNews: WORKPLAN

• WP 6: Text-To-Speech synthesis for mobile devices

• WP6.1: Voice conversion / customisation

• WP6.2: Optimisation for mobile terminals
  • complexity reduction
  • memory storage
  • distributed software architecture


MobiNews: WORKPLAN

• WP 7: User-centred design of the MMI (Man Machine Interface)

• WP7.1: Server-based application
  • optimised entries for the definition of user profile, user queries, ...

• WP7.2: Mobile embedded application
  • design of an efficient mobile interface with emphasis on ease of use and acceptability (= usability)


MobiNews: WORKPLAN

• WP 8: MMS-based demonstrator

• WP8.1: Server-based applications
  • module for data structuring
  • module for audio compression
  • MMS packaging

• WP8.2: Mobile devices embedded applications
  • MMS de-packaging
  • optimised plug-in for text-to-speech synthesis
  • optimised plug-in for audio decompression


MobiNews: WORKPLAN

• WP 9: Evaluation methodology, Field trials, Analysis

• WP9.1: Evaluation methodologies
  • audio quality for speech synthesis and compression
  • evaluation of synthetic newscast (summarisation)
  • evaluation of MMI (queries, profile, …)

• WP9.2: Field trials and analysis
  • quiz score methods
  • … ?


Administrative Issues

• The project proposal will include:
  • A1 form: proposal acronym, proposal number, proposal title, estimated duration (30 months?), keyword codes, abstract (co-ordinator)
  • A2 form: participant submission form (for each participant)
  • A3 form: financial information (co-ordinator)
  • B part: non-anonymous description of scientific/technological objectives