Speech Technology and Big Data

Nick Campbell

Speech Communication Lab

Trinity College Dublin, Ireland

*Speech Technology & big data

*Who am I . . .

*TCD – Stokes Professor (Dublin)

*CNGL – PI – Delivery & Interaction

*ELRA – board member / VP – speech

*ISCA – board member – workshops

*IEEE – Signal Processing Society – SLTC member

*ATR/NiCT – research director (Japan)

*Speech Prosody 2014 (Dublin) host

*Speech scientist/researcher/corpus analyst

*Speech research insights

*AT&T Bell Labs

*The ideas people – think ‘BIG’

*IBM UK Scientific Centre

*The corpus people – ‘collect it all’

*ATR basic telecom research

*The fundamentals - learn how to ‘infer’ from it

*collecting speech data

*we used to be considered BIG – speech data (and now multimedia) gobbled up memory

*I collected 1500 hours of everyday chat/daily conversations in 2000 (at 1 GB per minute) – it took 5 years to process!

*now Apple, Google, Microsoft, ... collect that much each minute (but the secret is in the metadata)

*we need accessible data & tools for everybody!
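To put the figures on this slide in perspective, here is a rough back-of-envelope estimate of the storage involved. It is only a sketch: it assumes the quoted rate of 1 GB per minute applies uniformly to all 1500 hours and uses decimal units (1 TB = 1000 GB). The function and variable names are illustrative.

```python
# Back-of-envelope storage estimate for the corpus described above.
# Assumption: the quoted "1GB per minute" applies uniformly to all
# 1500 hours, and decimal units are used (1 TB = 1000 GB).

HOURS_RECORDED = 1500    # from the slide
GB_PER_MINUTE = 1        # from the slide
MINUTES_PER_HOUR = 60


def corpus_size_gb(hours: float, gb_per_minute: float) -> float:
    """Return the total corpus size in gigabytes."""
    return hours * MINUTES_PER_HOUR * gb_per_minute


total_gb = corpus_size_gb(HOURS_RECORDED, GB_PER_MINUTE)
total_tb = total_gb / 1000

print(f"{total_gb:,.0f} GB, or about {total_tb:.0f} TB")
```

Under these assumptions the collection comes to roughly 90 TB, which helps explain why processing it took years.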

*protecting speech data

*but we need to manage privacy issues first!

*identifying speech data

*and we need a way to protect IP as well

*written publications have the ISBN standard

*work is now underway (cf. ELRA & COCOSDA) to institute the ISLRN (International Standard Language Resource Number) for Language Resources

*researchers need to get credit for corpora as well as for publishing research results

*The community needs a way to identify, acknowledge, attribute, and reference data
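As a rough illustration of what identifying and referencing a corpus could look like in practice, the sketch below models a citable resource record. Everything in it is hypothetical: the class, its field names, the citation layout, and the placeholder identifier are assumptions for illustration, not the ISLRN scheme or any existing catalogue format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LanguageResource:
    """A citable description of a corpus (all field names are hypothetical)."""
    identifier: str            # persistent ID assigned by some registry
    title: str
    creators: Tuple[str, ...]  # the people who should receive credit
    year: int
    distributor: str

    def citation(self) -> str:
        """Format a one-line reference that credits the corpus creators."""
        names = ", ".join(self.creators)
        return (f"{names} ({self.year}). {self.title}. "
                f"{self.distributor}. ID: {self.identifier}")


corpus = LanguageResource(
    identifier="EXAMPLE-0001",  # placeholder, not a real identifier
    title="Example Conversational Speech Corpus",
    creators=("A. Researcher", "B. Collector"),
    year=2014,
    distributor="Example Archive",
)
print(corpus.citation())
```

The point of the design is that the identifier travels with the creator names, so a paper that uses the data can cite it in the same way it would cite a publication.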

*tagging speech

*tools for processing speech & multimodal data

*HTK, HTS, R, etc. – not simple to use

*little consensus on what features to encode

*manual bootstrap – much too time-consuming!

*content analytics

*social interaction

*personal idiosyncrasies

*group dynamics – multimodal data (TB/hr)

*issues of robustness / domain specificity / privacy / storage & archiving / redistribution

*speech and language!

*context analytics:

*cultural and language-specific needs

*multimodal – multimedia – multilingual

*tools for ‘less-well-supported’ languages

*e.g., U-STAR consortium for speech research – sharing tools & data & knowledge for research

*getting help

*European Language Resources Association

*COCOSDA – international coordinating committee

*IEEE SLTC, ISCA SIGs – there are places to go

*but are they ready for really BIG data? perhaps not yet . . .

*next steps

*curricula to prepare people

*what standards to rely on?

*what resources available?

*what features to extract?

*what tools to work with?

*what use to put it to?

*what info to hide?

*what to do next?

*Thank you . . .