Simon Tucker NLP Presentation
www.amiproject.org
Efficient user-centred access to multimedia meeting content
Simon Tucker and Steve Whittaker
University of Sheffield
{s.tucker, s.whittaker}@shef.ac.uk
AMI Project
•Meetings are a critical way in which knowledge is created and shared within organisations
•Most of this knowledge is never recorded
•AMI provides Multimodal Access to Multimedia Records of Meetings
•16 Partners
•Follow on project AMIDA – Real Time
Sheffield AMI Work
•User Requirements
•Temporal Compression of Speech
•Reducing the amount of time required to listen to a meeting recording but still getting the important information.
•Dynamic Visual Summarization Techniques
•A number of methods for dynamically presenting summary information interactively.
•Temporal Compression of Video
•Audio motivated video compression.
Meeting browsers
•The primary means of accessing meeting records is via a browser.
•In previous work we segregated browsers into four categories according to their focus.
•The focus is either the primary means of presentation or navigation that the browser used.
•This segregation allowed us to get a good idea of the current browser space.
User Requirements
•Can make use of two different methods to collect user requirements:
•Practice-centric
•Examination of current practices.
•Collection through observation.
•Technology-centric
•Exposure to new technology.
•Collection through user opinion.
Practice-centric AMI study
•Meetings already generate a large amount of information exchange.
•Personal Notes.
•Minutes.
•Post-meeting email discussion.
•Informal meeting discussions.
•Approach taken is to record (where possible) and then analyse these records.
•Use this analysis to determine how meeting records are used and what problems are associated with them.
Study details
•We examined the meeting recording practices of two firms.
•We studied a core team over a series of meetings.
•Thus we can study the lifecycle of meeting documents.
•Meetings in both firms were task oriented rather than being about the generation of ideas.
•We obtained permission to make recordings from each meeting participant.
•We also allowed participants to request that the recordings be switched off.
•Names were removed from transcripts.
Existing Tools and Problems
Type of Record | Functions | Problems
Public Record (Minutes) | *Group Todos (actions/decisions); *Summary/Gist; Group Archive (history) | Not timely; Lacks context & completeness; Requires effort to produce
Private Record (Personal Notes) | *Personal Todos (actions/decisions) (context for actions); Briefing for non-attendees; Personal Archive | Esoteric; Detracts from ability to contribute
Analysis of State of the Art Tools
•Important to assess the state of the art.
•Assessed the efficiency of the first generation AMI meeting browser in answering typical questions about a meeting.
•Generated a number of questions about a single meeting.
•Subjects asked to answer these questions using the meeting browser.
•‘Thinkaloud’ was encouraged and we examined the accuracy of the answers.
•The questions were either about specific information (what was the total budget?) or were more general (what was Ed’s contribution to the meeting?).
Tools Analysis Results
•Inefficient for access
•Too much low-level detail
•Assumption of large display
•Users need abstraction / summarisation tools
Efficient Access to Meeting Data
•There is a clear need for efficient access to meeting data.
•Meetings contain a lot of irrelevant information (both in general and for specific participants).
•Minutes and notes capture important information but lack contextual information.
•State of the art tools lack abstraction – generally present the raw recordings, unfiltered.
•We focus on lightweight components allowing for efficient access to meeting data.
Temporal Compression of Speech
•Intended for environments which necessitate speech-only access.
•e.g. Mobile phone, travelling in car etc.
•Aim is to reduce the length of the recording but to retain the important content.
•Two techniques for reducing the length:
•Speed Up: Play the full clip back at a faster rate.
•Excision: Remove sections of the recording.
Speed Up
•Simplest approach is to directly alter the playback rate.
•Has the side effect of altering the pitch of the speakers.
•Use an overlap-and-add algorithm to speed up whilst keeping pitch constant.
•Has the problem of not reflecting how speakers naturally increase their speech rate.
•Use a variable playback rate to better match how human speakers alter their speech rate.
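The overlap-and-add idea above can be sketched as follows. This is a minimal illustration only, not the implementation used in the AMI work: it is plain overlap-and-add without the waveform alignment step that synchronised variants add, so it will produce audible artefacts on real speech. The function names, frame length and hop size are assumptions chosen for the example.

```python
import math

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def ola_speed_up(samples, rate, frame_len=400, synth_hop=200):
    """Shorten `samples` by roughly `rate` with plain overlap-and-add.

    Frames are read every synth_hop * rate samples and written every
    synth_hop samples. Each frame is copied unchanged, so the waveform
    inside a frame (and hence its pitch) is not resampled.
    frame_len and synth_hop are illustrative defaults, not source values.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    window = hann(frame_len)
    analysis_hop = synth_hop * rate
    n_frames = max(1, int((len(samples) - frame_len) / analysis_hop) + 1)
    out_len = (n_frames - 1) * synth_hop + frame_len
    out = [0.0] * out_len
    norm = [0.0] * out_len
    for k in range(n_frames):
        src = int(round(k * analysis_hop))  # read position in the input
        dst = k * synth_hop                 # write position in the output
        for i in range(frame_len):
            if src + i >= len(samples):
                break
            out[dst + i] += samples[src + i] * window[i]
            norm[dst + i] += window[i]
    # Divide by the summed window so overlapping frames keep unit gain.
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]

# Example: one second of a 220 Hz tone at 8 kHz, played back at twice the rate.
tone = [math.sin(2.0 * math.pi * 220.0 * t / 8000.0) for t in range(8000)]
fast = ola_speed_up(tone, rate=2.0)
```

A variable playback rate, as in the last bullet, could be approximated by changing `rate` from frame to frame instead of holding it fixed.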
Excision
•Simple approach is to remove non-informational parts of the recording e.g. silence.
•Limited by the amount of silence.
•Derive measures of word importance and only play back the important words; missing words are mentally replaced.
•Far from “natural” speech.
•Use larger parts of speech (utterances) and locate important utterances and play only those back.
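As a rough sketch of utterance excision, the example below keeps the highest-scoring utterances that fit within a target fraction of the original speech time and plays them back in their original order. The `Utterance` type, the `importance` score and the greedy selection rule are assumptions for illustration; the slides do not specify how importance is measured or how utterances are chosen.

```python
from dataclasses import dataclass

@dataclass
class Utterance:
    start: float       # seconds from the start of the recording
    end: float         # seconds from the start of the recording
    importance: float  # hypothetical score, higher means more important

def excise(utterances, target_ratio):
    """Keep the most important utterances that fit in target_ratio of the
    total speech time, returned in original playback order."""
    total = sum(u.end - u.start for u in utterances)
    budget = total * target_ratio
    kept, used = [], 0.0
    # Greedy selection: consider utterances from most to least important.
    for u in sorted(utterances, key=lambda u: u.importance, reverse=True):
        length = u.end - u.start
        if used + length <= budget:
            kept.append(u)
            used += length
    return sorted(kept, key=lambda u: u.start)

# Example with made-up timings and scores.
clip = [
    Utterance(0.0, 4.0, 0.2),
    Utterance(5.0, 7.0, 0.9),
    Utterance(8.0, 12.0, 0.5),
    Utterance(13.0, 15.0, 0.7),
]
summary = excise(clip, target_ratio=0.5)
```

Silence removal is the special case where every gap between utterances is dropped and all utterances are kept (`target_ratio=1.0`).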
Experimental Overview
•Initial Exploratory Experiment
•Gain an understanding of the space.
•Informally assessed a large number of techniques.
•Located promising directions for research.
•Follow up detailed study
•Examined a subset of the techniques explored.
•Used a measure of gisting ability to assess success.
•Examined short and long meeting clips.
•Also examined effect of a user interface.
Measuring Gisting Ability
•A key facet of our techniques is that they support the discovery of gist rather than facts.
•Therefore the metrics we have used previously do not adequately capture the proposed usage of these tools.
•Key components of the performance metric:
•Must be quick to assess and to score (experimenter and subject time)
•Objective measure
Measuring Gisting Ability (2)
•Our solution was to use a hybrid gold standard scheme.
•We measure the importance of utterances from the transcript and select a number of utterances from the full range of importance.
•We then ask judges to rank these utterances in order of importance.
•Subjects then listen to the meetings and perform the same ranking.
•The objective score is then the difference between the gold standard and subject rankings
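One way to turn "the difference between the gold standard and subject rankings" into a number is the sum of absolute rank differences. The slides do not say which distance was used, so the measure below is an assumption, and the utterance identifiers are hypothetical.

```python
def ranking_difference(gold, subject):
    """Sum of absolute rank differences between two rankings.

    Both arguments map an utterance id to its rank (1 = least important).
    A score of 0 means the subject reproduced the gold standard exactly;
    larger scores mean poorer agreement.
    """
    if gold.keys() != subject.keys():
        raise ValueError("both rankings must cover the same utterances")
    return sum(abs(gold[u] - subject[u]) for u in gold)

# Example with five target utterances (ids are made up).
gold_ranking = {"u1": 5, "u2": 4, "u3": 3, "u4": 2, "u5": 1}
listener_ranking = {"u1": 3, "u2": 4, "u3": 5, "u4": 1, "u5": 2}
score = ranking_difference(gold_ranking, listener_ranking)
```

A measure of this kind is quick to score and objective, which matches the two requirements listed for the performance metric.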
Measuring Gisting Ability
[Figure: process diagram with the stages Speech Recording, Transcript, Temporal Compression, Utterance Identification, Judge Target Utterance Rankings, Gold Standard Target Utterance Ranking, Listener, Listener Ranking of Target Utterances, Comparison, Comprehension Efficiency. The example transcript and rankings shown in the figure are filler text.]
Results
•Removing unimportant utterances performed better than speed up.
•Listeners understood the gist of a recording faster.
•All techniques performed better than applying no compression.
•With longer clips understanding was the same.
•Speed up required more interface interactions than excision.
[Figure: Mean Comprehension Efficiency (y-axis, 0 to 0.004) by Compression Type (x-axis: No Compression, Word Excision, Utterance Excision, Speed Up).]
Dynamic Summarization
•Using summary information to locate points of interest within a meeting transcript.
•Traditional summaries can be customized but are largely presented statically.
•Underpinned by two concepts:
•User is able to dynamically alter the summarization level.
•Alteration shown in real time.
•Applying different presentation techniques.
Development Procedure
•Evaluated using the same process as the speech work.
•An initial lightweight evaluation of a number of UI concepts intended to find promising directions of research.
•A follow up study examining the techniques in more detail with a more rigorous evaluation protocol.
Dynamic Summary Display
•Two unit levels examined:
•Words
•Utterances
•Two presentation techniques:
•Unit shading.
•Unit excision.
•Two hybrid techniques:
•Combining the four techniques into one
•An experimental fish-eye view
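The two presentation techniques can be illustrated at the word level with a small sketch. The importance scores, the function names and the choice to render more important words darker are all assumptions made for this example; the slides do not describe how the display was implemented.

```python
def excise_words(scored_words, level):
    """Drop the least important fraction `level` (0..1) of the words,
    keeping the remaining words in transcript order."""
    n_drop = int(len(scored_words) * level)
    # Indices of the words sorted from least to most important.
    order = sorted(range(len(scored_words)), key=lambda i: scored_words[i][1])
    dropped = set(order[:n_drop])
    return [w for i, (w, _) in enumerate(scored_words) if i not in dropped]

def shade_words(scored_words):
    """Pair each word with a grey level in 0..255.

    Assumed convention: 0 (black) for the most important word and
    255 (lightest) for the least important one.
    """
    if not scored_words:
        return []
    scores = [s for _, s in scored_words]
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all scores match
    return [(w, round(255 * (hi - s) / span)) for w, s in scored_words]

# Example with made-up importance scores.
words = [("the", 0.1), ("budget", 0.9), ("is", 0.2),
         ("fifty", 0.8), ("thousand", 0.7)]
excised = excise_words(words, level=0.5)
shaded = shade_words(words)
```

Re-running `excise_words` whenever the user moves a compression control is one simple way to show the alteration in real time.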
Initial results
•Shading works well.
•Operating at the word level is satisfactory.
•Fish-eye was not liked.
•The combinatorial approach did not really offer anything novel.
Follow Up Study
•Focus solely on the Word Excision and Word Shading techniques (highest rated in the previous experiment).
•Two questions (one specific, one general) about a number of meetings.
•Use the two interfaces (plus a control plain text transcript) to answer the questions (one question per meeting).
•Measure the time taken to answer, the accuracy and the number of interface actions used when answering the questions.
•Collect subjective preference data and user comments about each of the techniques.
Follow Up Study Results
•Subjects were largely accurate; interface type had no effect on accuracy.
•No effect of interface type on time taken to answer – i.e. there was no efficiency loss as a result of using the dynamic interfaces.
Preference and Process Results
•Subjects overwhelmingly preferred the Word Excision condition.
•Subjects scored the Word Excision and Plain Transcript conditions equally.
•The Word Shading condition required fewer interface actions than the Word Excision condition.
•Specifically, users spent more time changing compression levels in the Word Excision condition.
Video Compression
•The same techniques for audio can also be applied to video.
•Compress the audio recording and use this compressed version to derive an audio-video recording.
•Informal evaluation indicates a different modality for video.
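Deriving the video from the compressed audio amounts to reusing the audio's kept time intervals as a cut list for the video frames. The sketch below assumes the kept intervals are given in seconds and sorted by start time, and that the video has a constant frame rate; both are assumptions, as is the function name.

```python
import math

def frames_for_intervals(kept_intervals, fps):
    """Convert kept audio intervals (start_s, end_s) into video frame
    ranges (first_frame, end_frame_exclusive).

    Intervals must be sorted by start time. Ranges that touch or overlap
    after rounding to whole frames are merged into one.
    """
    ranges = []
    for start, end in kept_intervals:
        first = int(math.floor(start * fps))
        last = int(math.ceil(end * fps))
        if ranges and first <= ranges[-1][1]:
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], last))
        else:
            ranges.append((first, last))
    return ranges

# Example: two kept stretches of speech in a 25 frames-per-second video.
cuts = frames_for_intervals([(5.0, 7.0), (13.0, 15.0)], fps=25)
```

Each boundary between ranges is a visible jump cut, which is consistent with excised video being more disconcerting to watch than excised audio is to hear.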
Video Examples
•Type of video being used
•Word excised video
•The cuts are now much more disconcerting.
•Sped Up video
•More comfortable to watch but disconcerting at high compression levels.
•Can also do non-linear compressed video
•Speed up only the non-silent parts.
•Can also e.g. speed up through unimportant parts
Summary
•Looking at Interfaces for Browsing Meeting Recordings
•Problems with abstraction in current meeting recording technology and automatic browsing systems
•Temporal Compression of Speech
•Reducing the time required to listen to a speech recording but keeping the important information.
•Utterance Excision.