Crowdsourcing OCR Correction Through Game Playing. By: Lin Tingji Jovian, National University of Singapore (NUS); Olivier Amprimo, Digital Resources & Services, National Library Board (NLB)

Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

DESCRIPTION

Presentation of a pilot in Singapore that tests human computation gaming as a way to improve OCR correction of non-born-digital content. This presentation was given at BarCamp Singapore 4 on Saturday, 21 November 2009.

Citation preview

Page 1: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Crowdsourcing OCR Correction Through Game Playing

By:

Lin Tingji Jovian, National University of Singapore (NUS)

Olivier Amprimo, Digital Resources & Services, National Library Board (NLB)

Page 6: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

http://apps.facebook.com/typeattack/ © Lin Tingji Jovian, Olivier Amprimo

Overview

1. Problems with Digital Archiving

2. Solutions

3. TypeAttack

4. How Does TypeAttack work

5. Where We Are Now

Page 7: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Problems

Page 10: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Problems in Digitizing Archives

• National Library Board (NLB) digitizes The Straits Times (SPH) articles.

• However, for older articles, the digitized content is prone to many inaccuracies.

• For example:

Source: ReCaptcha

Page 13: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Some of NLB’s OCR Result

NLB’s OCR Translation

Page 14: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Some of NLB’s OCR Result

IS NOT GOOD ENOUGH!

Page 16: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Problems in Digitizing Archives

• National Library Board (NLB) digitizes The Straits Times (SPH) articles.

• However, for older articles, the digitized content is prone to many inaccuracies.

• For example:

• In fact, the NLB needs to employ people to double-check and rectify errors.

• This leads to extra cost and inefficiency.

Page 17: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Solutions

Page 19: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Solution to Improve Digitization Process

1. Many tasks challenge even sophisticated computer programs.

2. Yet they are trivial for humans.

Page 20: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Options to Improve Digitization Process

1. Employ a large number of people, dedicated full time.

+ NLB has experience in doing this.

- Resource allocation and co-ordination cost.

Page 21: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Options to Improve Digitization Process

2. Enroll and Support Volunteers

+ National Library of Australia > http://www.nla.gov.au/ndp/get_involved/

- Singapore specifics:
• Copyrights
• Computer Literacy of Elders
• Partial Retirement

Page 22: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Options to Improve Digitization Process

3. Make it a mandatory part of popular processes

+ A Turing test to separate bots from humans online > ReCaptcha

- ReCaptcha turned out to be very proprietary
- No plans for a change, even with Google
- Word, not sentence > decontextualization = poor meaning

Page 23: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Options to Improve Digitization Process

4. Make it part of something attractive

+ Tap into popularity of games – Human Computational Games
More than 200 million hours are spent each day playing computer games in the U.S. alone.
The World Financial Crisis makes it bigger!

>> TypeAttack

Page 25: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

TypeAttack

• TypeAttack is a Human Computational Game on Facebook that helps digitize archives for the National Library Board.

• Being built on Facebook, TypeAttack can:
– Harness Facebook’s 200 million active users worldwide.
– Utilize Facebook’s viral techniques:
• Friend invites to the game.
• Publishing typing scores onto a user’s Facebook wall.
• Utilizing Facebook’s newsfeeds to expose TypeAttack to more Facebook users.
– Most importantly, perform Human Computation efficiently.

Page 29: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Step-back: Motivation behind Human Computational Games

Luis von Ahn, Carnegie Mellon University

Since people spend so much time on computer games…

Let us make use of them to perform tasks that computers have difficulty performing.

BIRTH OF Human Computational Games

1. These are games played by humans that produce useful computation as a side-effect.

2. People play not because they want to solve computational problems, but because they want to be entertained.

3. It combines human brainpower with computers to solve problems that neither could solve alone.

Page 32: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

How Does TypeAttack Work

Page 34: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

How TypeAttack works

• Flow:

Entire collection of The Straits Times for the year 1938 (10 GB worth)

XML files representing data per (newspaper) page

Page 35: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

[Figure labels: image, XML]

Page 36: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

1. TypeAttack “cuts” the different articles in a page.

2. Within an article, it cuts out short snippets of text.
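
The second cutting step can be pictured with a minimal sketch. It assumes the article is already available as a plain-text string and that a snippet is a fixed number of words. The function name `cut_snippets` and the `words_per_snippet` parameter are hypothetical; the slides do not state the snippet length, and the real system works from NLB's page images and XML files, whose layout is not described here.

```python
def cut_snippets(article_text, words_per_snippet=5):
    """Split an article into snippets of at most `words_per_snippet` words.

    Hypothetical helper: the snippet length TypeAttack uses is not stated.
    """
    if words_per_snippet < 1:
        raise ValueError("words_per_snippet must be at least 1")
    words = article_text.split()
    # Step through the word list in fixed-size chunks; the last chunk may be shorter.
    return [
        " ".join(words[i:i + words_per_snippet])
        for i in range(0, len(words), words_per_snippet)
    ]


if __name__ == "__main__":
    for snippet in cut_snippets("The quick brown fox jumps over the lazy dog today", 4):
        print(snippet)
```

Fixed-size chunks keep each typing task short; a real cutter would more likely follow line or sentence boundaries taken from the page data.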

Page 39: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

How TypeAttack works

• Flow:

Entire 1938 Straits Times + respective XML files

TypeAttack + Facebook users + Computational Algorithms

Digitized versions with over 99.0% transcription accuracy at the word level.

Page 40: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

How does TypeAttack digitize content?

TypeAttack uses 3 kinds of information:

1. Output from Facebook users.

2. Results from NLB’s OCR Translation.
– Accuracy rate at the word level.

3. Bi-gram (text prediction)
– Looks at the probability of a two-word sequence.
– E.g. given Word A, what is the probability of Word B?
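
A bi-gram estimate of this kind can be sketched as follows. The tiny training text, the function names, and the use of plain relative frequencies without smoothing are all assumptions made for illustration; the slides do not say how TypeAttack trains or applies its bi-gram model.

```python
from collections import Counter


def train_bigrams(text):
    """Count words and adjacent word pairs in a whitespace-tokenized text."""
    words = text.lower().split()
    unigrams = Counter(words[:-1])            # only words that have a successor
    bigrams = Counter(zip(words, words[1:]))  # (word A, word B) pairs
    return unigrams, bigrams


def bigram_probability(unigrams, bigrams, word_a, word_b):
    """Estimate P(word_b | word_a) as count(word_a word_b) / count(word_a)."""
    total = unigrams[word_a.lower()]
    if total == 0:
        return 0.0  # word_a was never seen with a successor
    return bigrams[(word_a.lower(), word_b.lower())] / total


if __name__ == "__main__":
    unigrams, bigrams = train_bigrams("today is nice today is hot today was cold")
    print(bigram_probability(unigrams, bigrams, "today", "is"))
```

In the toy text, "today" is followed by "is" twice and by "was" once, so the estimate for "is" given "today" is two thirds.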

Page 49: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

How does TypeAttack digitize content? (1) Output from Facebook users

Players’ output:

Today is nice.

Today iz nice.

is nize.

Result, built up word by word:

Today is nice.

Page 51: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

How does TypeAttack digitize content? (1) Output from Facebook users

• Say, for example, the paragraph to be typed is “Today is nice.”

• Based on all the players' output for this particular paragraph, TypeAttack computes a probability for each word:

“Today” = typed by 96.4% of users.
“is” = typed by 95.1% of users.
“nice.” = typed by 97.3% of users.

• Since each word in this paragraph is at least 95% probable, we determine that “Today is nice.” is the correct output.
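
The per-word agreement check can be illustrated with a minimal sketch. It assumes submissions are compared position by position after splitting on whitespace, and that a submission shorter than the others simply casts no vote at the missing positions. The function names and this naive alignment are assumptions; the slides do not describe how TypeAttack aligns differing outputs.

```python
from collections import Counter


def word_agreement(submissions):
    """Return, per word position, the most-typed word and the share of players who typed it."""
    if not submissions:
        return []
    tokenized = [s.split() for s in submissions]
    length = max(len(words) for words in tokenized)
    result = []
    for position in range(length):
        # Submissions that are too short cast no vote at this position.
        votes = Counter(words[position] for words in tokenized if position < len(words))
        word, count = votes.most_common(1)[0]
        result.append((word, count / len(tokenized)))
    return result


def accept_paragraph(submissions, threshold=0.95):
    """Return the agreed text if every word reaches the threshold, otherwise None."""
    agreement = word_agreement(submissions)
    if agreement and all(share >= threshold for _, share in agreement):
        return " ".join(word for word, _ in agreement)
    return None


if __name__ == "__main__":
    players = ["Today is nice."] * 97 + ["Today iz nice."] * 3
    print(word_agreement(players))
    print(accept_paragraph(players))
```

The share is taken over all submissions, so a word that some players skipped needs proportionally stronger agreement among the rest before it passes the 95% bar.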

Page 54: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

How does TypeAttack digitize content? (2) NLB’s OCR Translation Result

• The previous method simply compares the output between players.

• To speed things up, we also compare the players' output with the OCR Translation Result.

• Once every word in the paragraph is >95% probable, the paragraph's status is set to 'complete' and it is no longer displayed in the game.
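
The completion rule, together with a simple word-by-word comparison against the OCR text, might look like the sketch below. The `Paragraph` record, its field names, and the helper functions are hypothetical. The slides do not say how the OCR result is weighted against player output, so the sketch only lists the positions where the two differ and does not merge them into a probability.

```python
from dataclasses import dataclass, field


@dataclass
class Paragraph:
    """Hypothetical record for one snippet shown in the game."""
    ocr_text: str
    word_probabilities: list = field(default_factory=list)  # one value per word, 0.0 to 1.0
    status: str = "open"


def update_status(paragraph, threshold=0.95):
    """Mark the paragraph 'complete' once every word is more than `threshold` probable."""
    probs = paragraph.word_probabilities
    if probs and all(p > threshold for p in probs):
        paragraph.status = "complete"
    return paragraph.status


def paragraphs_to_display(paragraphs):
    """Only paragraphs that are not yet complete stay in the game."""
    return [p for p in paragraphs if p.status != "complete"]


def ocr_disagreements(ocr_text, consensus_text):
    """List (position, ocr_word, consensus_word) wherever OCR and players differ."""
    pairs = zip(ocr_text.split(), consensus_text.split())
    return [(i, o, c) for i, (o, c) in enumerate(pairs) if o != c]


if __name__ == "__main__":
    done = Paragraph("Today iz nice.", [0.964, 0.951, 0.973])
    pending = Paragraph("Tbe end", [0.99, 0.60])
    update_status(done)
    update_status(pending)
    print([p.ocr_text for p in paragraphs_to_display([done, pending])])
    print(ocr_disagreements("Today iz nice.", "Today is nice."))
```

Words where the OCR output already matches the players' consensus need no further attention, so only the disagreeing positions have to be resolved by more play.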

Page 59: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Innovativeness and Uniqueness

1. Uses Game Elements and Social Networking to channel human brainpower through computer games.

2. Utilizes probabilities that are determined from different areas (from Facebook users and OCR) to ensure that we can extract the correct text content with minimum user output.

3. Working System
Seamlessly integrates Game Elements, Computational Algorithms and Social Networking aspects to solve problems that neither humans nor computers can solve individually.

Page 60: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Where We Are Now

Page 62: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Where We Are Now

• Evaluating the project: Sustainability
– User growth
– Volume of contribution
– Average user contribution trend
– Automating the process further (snippet selection)
– Expansion beyond Facebook

• Impacts
– Product Design
– Marketing and Communication
– NLB Digitization Process
– NLB Culture

Page 64: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Where We Are Now

• Evaluating the project: Economic Rationale
– Performance (words per minute): TypeAttack (60) vs Standard (33)
– Cost of TypeAttack (operations + development) vs part-timers

• Impacts
– Finance
– IT and Digital Services

Page 66: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Where We Are Now

• Evaluating the project: Scope of Activity
– What about content with low word confidence (< 95%)?
– Only the Straits Times, 70+ years old?
– Other languages? Mandarin, Bahasa Melayu, Tamil

• Impacts
– NLB Digitization Process
– Singapore Copyright Law: 70 years > 20 years
– NLB / SPH Partnership (derogatory agreement on copyright)

Page 67: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

What You Can Do

Page 70: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

What YOU can do

• Be part of the crowd: join and play! http://apps.facebook.com/typeattack/

• Spread the word!

• Feel free to contribute!

• After that, NLB will decide whether to use the system to digitize more content that has poor OCR accuracy.

Page 71: Type Attack - Crowdsourcing OCR Correction With a Human Computation Game

Thank You