User Benefits of Non-Linear Time Compression Liwei He and Anoop Gupta Microsoft Research

Introduction

Time compression: key to browsing AV content

We focus on informational content

Audio time compression algorithms

Linear: speed up audio uniformly

Non-linear: exploit the fine-grained structure of human speech (e.g., pauses, phonemes)

How much more do users gain from more complex algorithms?

Methodology

Conduct user listening test

One Linear TC algorithm

Two Non-linear TC algorithms

Simple: Pause-removal followed by Linear TC

Sophisticated: Adaptive TC

Compare objective and subjective measurements

Time Compression Algorithms

Linear Time Compression

Classic algorithms

Overlap Add (OLA) and Synchronized OLA (SOLA)

We use SOLA
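As an illustration of the SOLA idea named above, here is a minimal pure-Python sketch, not the authors' implementation. Frames are read from the input every `hop_in` samples and written to the output roughly every `hop_out` samples; before each frame is cross-faded in, a small search finds the shift that best aligns it with the audio already written. The function name and the values of `frame`, `hop_out`, and `search` are illustrative assumptions.

```python
import math

def sola_compress(samples, rate, frame=400, hop_out=200, search=60):
    """Time-compress a list of float samples by `rate` (> 1 plays faster).

    Simplified SOLA sketch; all parameter defaults are assumptions.
    """
    if frame - hop_out - 2 * search <= 0:
        raise ValueError("need frame - hop_out > 2 * search so frames always overlap")
    hop_in = int(round(hop_out * rate))      # input is read `rate` times faster
    out = list(samples[:frame])              # first frame is copied unchanged
    m = 1
    while m * hop_in + frame <= len(samples):
        seg = samples[m * hop_in : m * hop_in + frame]
        nominal = m * hop_out                # where the frame would go without alignment
        best_k, best_score = 0, -math.inf
        for k in range(-search, search + 1):
            start = nominal + k
            n = min(len(out) - start, frame)  # length of the overlap region
            if start < 0 or n <= 0:
                continue
            dot = sum(out[start + i] * seg[i] for i in range(n))
            e_out = sum(out[start + i] ** 2 for i in range(n))
            e_seg = sum(seg[i] ** 2 for i in range(n))
            # normalized cross-correlation between existing output and new frame
            score = dot / math.sqrt(e_out * e_seg) if e_out * e_seg > 0 else 0.0
            if score > best_score:
                best_k, best_score = k, score
        start = nominal + best_k
        n = min(len(out) - start, frame)
        for i in range(n):                   # linear cross-fade over the overlap
            w = (i + 1) / (n + 1)
            out[start + i] = (1 - w) * out[start + i] + w * seg[i]
        out.extend(seg[n:])                  # append the non-overlapping remainder
        m += 1
    return out
```

Plain OLA corresponds to skipping the search (always using `k = 0`); the synchronization step is what avoids phase cancellation in the overlap.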

Non-Linear Time Compression

Algorithm 1: Pause removal plus TC

Energy and Zero Crossing Rate analysis

Leave pauses of 150 ms or shorter untouched

Shorten pauses longer than 150 ms to 150 ms

Apply SOLA algorithm

Pause removal (PR) shortens speech by 10-25%
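The pause-removal step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' detector: a frame counts as silence when both its mean energy and its zero-crossing rate are low, and any silent run is capped at 150 ms. The 10 ms frame size, both thresholds, and the function names are assumptions. The helper `residual_linear_rate` further assumes that the stated speed-up (e.g. 1.5x) refers to the overall result, so that the linear stage only has to supply what pause removal has not already gained.

```python
def frame_features(frame):
    """Return (mean energy, zero-crossing rate) for one frame of samples."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return energy, crossings / max(1, len(frame) - 1)

def remove_pauses(samples, sample_rate, frame_ms=10, max_pause_ms=150,
                  energy_thresh=1e-4, zcr_thresh=0.25):
    """Cap every silent run at max_pause_ms; thresholds are assumed values."""
    n = sample_rate * frame_ms // 1000        # samples per frame
    max_frames = max_pause_ms // frame_ms     # silent frames kept per pause
    out, run = [], 0
    for i in range(0, len(samples) - n + 1, n):
        frame = samples[i:i + n]
        energy, zcr = frame_features(frame)
        if energy < energy_thresh and zcr < zcr_thresh:
            run += 1
            if run > max_frames:
                continue                      # drop silence beyond the cap
        else:
            run = 0
        out.extend(frame)
    out.extend(samples[(len(samples) // n) * n:])  # keep any trailing partial frame
    return out

def residual_linear_rate(target_rate, original_len, pause_removed_len):
    """Linear TC factor still needed after pause removal to hit target_rate overall."""
    return target_rate * pause_removed_len / original_len
```

For example, if pause removal shortens a clip by 20%, an overall 1.5x target leaves a factor of 1.5 x 0.8 = 1.2 for the linear stage.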

Non-Linear Time Compression (cont.)

Algorithm 2: Adaptive TC

Mimics people when talking fast

Pauses and silences are compressed the most

Stressed vowels are compressed the least

Consonants are compressed more than vowels

Consonants are compressed based on neighboring vowels
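The ordering above (pauses most, stressed vowels least, consonants more than vowels) can be illustrated with a toy rate allocator. This is only a sketch under loud assumptions: it is not the adaptive algorithm used in the study, the segment labels are taken as given, the per-class weights are hypothetical, and the dependence of consonants on neighboring vowels is not modeled. Each segment gets a local rate `1 + a * weight`, and `a` is found by bisection so that the whole clip meets the target rate.

```python
# Hypothetical relative compressibility per segment class (assumed values,
# chosen only to respect the ordering stated on the slide).
HYPOTHETICAL_WEIGHTS = {
    "pause": 4.0,
    "consonant": 2.0,
    "vowel": 1.5,
    "stressed_vowel": 1.0,
}

def adaptive_rates(segments, target_rate, weights=HYPOTHETICAL_WEIGHTS):
    """segments: list of (class_name, duration_s). Returns a local rate per segment.

    Local rate is 1 + a * weight; `a` is chosen so the compressed clip lasts
    total_duration / target_rate. Assumes target_rate >= 1 and positive weights.
    """
    total = sum(d for _, d in segments)
    goal = total / target_rate

    def compressed_length(a):
        return sum(d / (1 + a * weights[c]) for c, d in segments)

    lo, hi = 0.0, 1.0
    while compressed_length(hi) > goal:       # grow the bracket until it contains the goal
        hi *= 2
    for _ in range(60):                       # bisection on the monotone length function
        mid = (lo + hi) / 2
        if compressed_length(mid) > goal:
            lo = mid
        else:
            hi = mid
    return [1 + hi * weights[c] for c, _ in segments]
```

With these weights, a 2x overall target compresses pauses by much more than 2x and stressed vowels by less than 2x, which is the qualitative behavior the slide describes.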

System Implications

Computational complexity

Adaptive TC 10x more costly than Linear TC

Complexity in client-server implementation

Buffer management required for non-linear TC

Audio-video synchronization quality
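One way to see why non-linear TC complicates buffering and audio-video synchronization: the playback rate varies from segment to segment, so the player needs a time map from output (playback) time back to source time in order to pick the matching video frame. The sketch below is illustrative only; the segment representation and function names are assumptions, not part of the system described here.

```python
from bisect import bisect_right

def build_time_map(segments):
    """segments: list of (source_duration_s, local_rate).

    Returns two parallel lists with the output and source start time of each
    segment, plus a final entry marking the end of the clip.
    """
    out_starts, src_starts = [0.0], [0.0]
    for duration, rate in segments:
        out_starts.append(out_starts[-1] + duration / rate)
        src_starts.append(src_starts[-1] + duration)
    return out_starts, src_starts

def source_time(out_t, out_starts, src_starts):
    """Map a playback time to the source time whose video frame should be shown."""
    i = bisect_right(out_starts, out_t) - 1
    i = max(0, min(i, len(out_starts) - 2))   # clamp to a valid segment index
    span_out = out_starts[i + 1] - out_starts[i]
    frac = (out_t - out_starts[i]) / span_out if span_out > 0 else 0.0
    return src_starts[i] + frac * (src_starts[i + 1] - src_starts[i])
```

With linear TC the map is a single straight line (`source = rate * output`), so no per-segment table is needed.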

User Study Method

User Study Goals

Highest intelligible speed

Comprehension

Subjective preference

Sustainable speed

Experiment Method

24 subjects

4 tasks for each subject

3 time compression algorithms

Linear TC using SOLA (Linear)

Pause removal plus Linear TC (PR-Lin)

Adaptive TC (Adapt)

Each test takes approximately 30 minutes

Highest Intelligible Speed Task

3 clips from technical talks

Find the highest speed at which most words are understandable

Comprehension Task

3 clips at 1.5x and 3 clips at 2.5x

Clips from TOEFL listening test

Answer 4 multiple choice questions

Subjective Preference Task

3 pairs of clips at 1.5x

3 pairs of clips at 2.5x

Each pair contains the same clip compressed with 2 of the 3 TC algorithms

Indicate preference on 3-point scale

Sustainable Speed Task

3 clips, each 8 minutes long

Clips from a CD audio book

Find the maximum comfortable speed

Write a 4-5 sentence summary at the end

User Study Results

Highest Intelligible Speed Task

PR-Lin is significantly better than Adapt (p<.01)

[Chart: highest intelligible speed by algorithm (Linear, PR-Lin, Adapt)]

Comprehension Task

[Chart: comprehension scores by algorithm (Linear, PR-Lin, Adapt)]

Adapt is better than PR-Lin (p=.083) at 2.5x

Preference Task at 1.5x

Slight preference for PR-Lin (p=.093)

1.5x                 Prefer Former   Prefer None   Prefer Latter
Linear vs. PR-Lin    6               5             13
PR-Lin vs. Adapt     13              5             6
Adapt vs. Linear

Preference Task at 2.5x

PR-Lin and Adapt do significantly better than Linear

2.5x                 Prefer Former   Prefer None   Prefer Latter
Linear vs. PR-Lin    2               8             14
PR-Lin vs. Adapt     4               9             11
Adapt vs. Linear     21              3             0
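Paired-preference counts like those above are often checked with a two-sided sign test, which drops the "Prefer None" answers and asks how likely such a lopsided split would be if both options were equally preferred. The slides do not say which test the authors used, so the sketch below is an assumption and is not expected to reproduce the p-values quoted on the slides.

```python
from math import comb

def sign_test(former, latter):
    """Two-sided sign test p-value for paired preference counts (ties excluded)."""
    n = former + latter
    if n == 0:
        return 1.0
    k = min(former, latter)
    # probability of a split at least this lopsided in one direction under p = 0.5
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Counts taken from the 2.5x table: (Prefer Former, Prefer Latter)
pairs_2_5x = {
    "Linear vs. PR-Lin": (2, 14),
    "PR-Lin vs. Adapt": (4, 11),
    "Adapt vs. Linear": (21, 0),
}
p_values = {name: sign_test(a, b) for name, (a, b) in pairs_2_5x.items()}
```

Under this test the two comparisons involving Linear give very small p-values, consistent with the slide's statement that PR-Lin and Adapt do significantly better than Linear at 2.5x.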

Sustainable Speed Task

[Chart: sustainable speed by algorithm (Linear, PR-Lin, Adapt)]

Conclusions

Previous Work

Mach1 (Covell et al., ICASSP 98)

Comprehension and preference tasks

Comparing Linear and Mach1 (Adapt) at 2.6-4.2x

Comprehension scores 17% better with Mach1

95% prefer Mach1 to Linear

No data on < 2.0x

Other works (Harrigan, Omoigui, Li, Foulke)

1.2-1.7x is the sustainable listening speed

Conclusions

Trade-off among TC algorithms is task-dependent

Listening: Linear TC is sufficient

Fast Forwarding: Non-linear TC is more suitable

Adapt TC is close to the way people talk fast

The limit lies in human listening and comprehension

User Benefits of Non-Linear Time Compression, Liwei He & Anoop Gupta, Microsoft Research, September 21st, 2000