How can crowdsourcing and machine learning improve speech technology?
Joao Freitas, Daniela Braga
CSW Global, London, April 14th 2016


Page 2: How Can Crowdsourcing and Machine Learning Improve Speech Technology?

April 2016 definedcrowd

How many of you have tried speech recognition?

Page 3: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


Speech Technology is everywhere

Page 4: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


And it is starting to understand you…

Page 5: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


What it takes to get there

Large amounts of data

Deep Learning

3000+ hours speech recordings + transcription

200+ words with pronunciations

0.5M natural language variants + semantic annotation

Language and Product dependent!

Page 6: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


DefinedCrowd landscape

We serve the data needs of the AI and ML landscape.

We’re a SaaS company that collects and enriches training data for AI, combining crowdsourcing and ML.

Page 7: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


The world before DefinedCrowd

Louis, Speech Scientist

User Goal
Wants to test whether the Chinese acoustic model works for Mandarin speakers in Singapore

What does he do?
Hires:
• A few vendors
• 1 PM
• 1 Dev
• 1 Chinese LE in-house

What does he get?
50 hours of raw speech with…
• Poor quality (~20% garbage)
• Unknown sources
• Long wait

Page 8: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


The world after DefinedCrowd

Andy, Speech Scientist

User Goal
Wants to test whether the Chinese acoustic model works for Mandarin speakers in Singapore

What does he do?
Subscribes to our platform

How does he do it?
• Picks a template
• Adjusts settings and picks the crowd
• Launches the job
• Collects the data

What does he get?
50 hours of pure speech with…
• High quality
• 100% transparency
• 50% faster throughput

Page 9: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


Our platform – enterprise side

Page 10: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


Unique crowd model

US: 200+
Brazil: 200+
Taiwan: 100+
Russia: 200+
Japan: 100+
Korea: 100+
Ukraine: 100+
Spain: 100+
Portugal: 100+
France: 100+
Germany: 100+
Denmark: 50+
Sweden: 50+
Finland: 50+
Netherlands: 50+
Italy: 100+
Greece: 100+
Czech Republic: 100+
Poland: 100+
Turkey: 100+
Belgium: 50+
Australia: 100+
New Zealand: 50+
Mexico: 100+
Puerto Rico: 100+
Canada: 100+
China: 200+
Vietnam: 50+
Thailand: 50+
Malaysia: 50+
Singapore: 50+
India: 100+

30+ countries

100+ dialects

3,000 crowd members

Page 11: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


We know a lot about our crowd

Languages & Dialects

User Activity

Job Performance

School & Courses

Profile Info

Other Jobs

Page 12: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


Why is Machine Learning relevant for Crowdsourcing?

Page 13: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


How we use Machine Learning

We learn from metadata to provide recommendations to customers and crowd members

Page 14: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


How we detect spam

Raw data
• Logging system
• Behavior measures

Data Processing
• Clean data
• Transform data

Feature Extraction
• Task-related measures (e.g. average duration)
• Session duration
• Execution peaks
• Consensus score
• Real-time audits

Classification & Analysis
• Detect outliers / anomalies
• Predict task / job duration

OUTLIER
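The outlier-detection step can be illustrated with a minimal sketch. It is not the platform's actual classifier: it assumes a single feature (average task duration per crowd member) and swaps in a simple robust rule, the modified z-score based on the median absolute deviation. The function name `mad_outliers`, the threshold of 3.5, and the sample durations are all illustrative assumptions.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a single extreme value does not distort the baseline.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half of the values equal the median: flag anything that differs.
        return [i for i, v in enumerate(values) if v != med]
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


# Hypothetical feature: average task duration in seconds per crowd member.
avg_task_seconds = {
    "w1": 41.0, "w2": 38.5, "w3": 44.2,
    "w4": 3.1,  # suspiciously fast compared with everyone else
    "w5": 40.3, "w6": 39.7,
}
workers = list(avg_task_seconds)
flagged = [workers[i] for i in mad_outliers(list(avg_task_seconds.values()))]
print(flagged)
```

In practice a flag like this would probably be one signal among several (session duration, execution peaks, consensus score) rather than a verdict on its own.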

Page 15: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


Example of Results I

Page 16: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


Same results – Different perspective

Page 17: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


Another Dimension

Page 18: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


Quality in our platform

1. Combined score of Qualification Tests
2. Real-time Audits and Reviews
3. Majority Vote
4. Overall Majority
5. Worker Expertise
6. Task Subjectiveness
7. …
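As an illustration of the Majority Vote item only, here is a minimal sketch of how redundant judgments on the same task might be aggregated, and how each worker's agreement with the majority could be tracked. The slides do not describe the actual scoring, so the function names, the input layout, and the sample labels below are assumptions.

```python
from collections import Counter


def majority_vote(labels):
    """Return (winning_label, agreement) for the labels given to one task.

    agreement is the share of votes the winner received; ties go to the
    label that was seen first. Expects at least one label.
    """
    winner, votes = Counter(labels).most_common(1)[0]
    return winner, votes / len(labels)


def worker_agreement(judgments):
    """Map each worker to the share of their answers that match the majority.

    judgments has the (assumed) shape {task_id: {worker_id: label}}.
    """
    hits, totals = Counter(), Counter()
    for answers in judgments.values():
        winner, _ = majority_vote(list(answers.values()))
        for worker, label in answers.items():
            totals[worker] += 1
            hits[worker] += int(label == winner)
    return {worker: hits[worker] / totals[worker] for worker in totals}


# Hypothetical judgments: three workers label the same three tasks.
judgments = {
    "t1": {"a": "yes", "b": "yes", "c": "no"},
    "t2": {"a": "no", "b": "no", "c": "no"},
    "t3": {"a": "yes", "b": "yes", "c": "no"},
}
scores = worker_agreement(judgments)
print(majority_vote(["yes", "yes", "no"]), scores)
```

Plain majority voting treats every worker equally; items such as Worker Expertise and Task Subjectiveness in the list suggest the votes are weighted further, which this sketch does not attempt.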

Page 19: How Can Crowdsourcing and Machine Learning Improve Speech Technology?


Other predictions using Machine Learning

Best quality / budget tradeoff

Best match between job and crowd member

Expected quality

When will a job finish (even before it starts)

Quality, Time, Cost
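To make the "when will a job finish" prediction concrete, here is a deliberately naive sketch: an ordinary least-squares line fitted to past jobs, with a single assumed feature (tasks per available crowd member). The slides do not say which model or features are used, and the history values below are made up purely for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # requires at least two distinct x values
    return slope, mean_y - slope * mean_x


def predict_hours(model, tasks_per_worker):
    """Estimate the duration of a job that has not started yet."""
    slope, intercept = model
    return slope * tasks_per_worker + intercept


# Made-up history of finished jobs: (tasks per available worker, hours to finish).
history = [(10, 5.0), (20, 9.0), (30, 13.0), (40, 17.0)]
model = fit_line([x for x, _ in history], [y for _, y in history])
estimate = predict_hours(model, 25)
print(f"estimated duration: {estimate:.1f} hours")
```

A real predictor would likely draw on many more signals, such as crowd activity patterns, language and dialect, and task type, but the shape is the same: learn from finished jobs, then estimate before launch.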