
Captchas


Page 1: Captchas

CAPTCHA

Page 2: Captchas

CAPTCHA

• CAPTCHA is an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart".

• A CAPTCHA is a challenge-response test used in computing to determine whether the user is human.

• The term was coined in 2000 by Luis von Ahn, Manuel Blum, Nicholas Hopper and John Langford of Carnegie Mellon University, who developed the first CAPTCHA.

Page 3: Captchas

CAPTCHA

• A common type of CAPTCHA requires the user to type the letters of a distorted image, sometimes with the addition of an obscured sequence of letters or digits that appears on screen.

• The user has to type this string to submit a form. This is a simple problem for humans but a very hard problem for computers, which have to use character recognition, because the displayed string is distorted in a way that makes it very hard for a computer to decode.

• Early CAPTCHAs, such as the distorted images generated by the EZ-Gimpy program, were used on Yahoo!.

Page 4: Captchas

CAPTCHA

A program that can generate and grade tests that:

1. Most humans can pass

2. Current computer programs cannot pass

Page 5: Captchas

Contd…

• The concept of a CAPTCHA is motivated by real-world problems faced by internet companies such as Yahoo! and AltaVista.

• These companies offer free email accounts, intended for use by humans.

• However, they found that many online vendors were using "bots", computer programs that would sign up for thousands of email accounts, from which they could send out masses of junk email.

Page 6: Captchas

Origin

• The first discussion of automated tests which distinguish humans from computers for the purpose of controlling access to web services appears in a 1996 manuscript by Moni Naor of the Weizmann Institute of Science, entitled "Verification of a human in the loop, or Identification via the Turing Test".

• Primitive CAPTCHAs seem to have been developed later, in 1997, at AltaVista by Andrei Broder and his colleagues to prevent bots from adding URLs to their search engine.

Page 7: Captchas

Contd…

• In order to make the images resistant to OCR (Optical Character Recognition), the team simulated situations that scanner manuals claimed resulted in bad OCR.

• In 2000, von Ahn and Blum developed and publicized the notion of a CAPTCHA, which included any program that can distinguish humans from computers.

Page 8: Captchas

Characteristics

• A CAPTCHA system is an automated means of generating new challenges which current computers are unable to solve accurately, but most humans can solve.

• CAPTCHAs are by definition fully automated, requiring little human maintenance or intervention in administering the test.

• This has obvious benefits in cost and reliability.

• By definition, the algorithm used to create the CAPTCHA must be made public, though it may be covered by a patent.

Page 9: Captchas

Accessibility

• Because CAPTCHAs rely on perception, users unable to perceive a CAPTCHA due to a disability (such as blindness) will be unable to perform the task protected by a CAPTCHA. In certain cases, failing to provide a universally accessible means of bypassing the CAPTCHA could make site owners a target of litigation.

• In order to combat this problem, many implementations of CAPTCHAs permit users to opt for an audio CAPTCHA in addition to a text-based one.

Page 10: Captchas

Contd…

• While the combination of an audio and a visual CAPTCHA cannot satisfy all users (for example, those with deafblindness), the choice of adding a CAPTCHA to an application is a balance between ease of use for legitimate users and creating enough of a challenge for abusers that abusing the application is not worthwhile.

Page 11: Captchas

Contd…

• The inconvenience caused by a CAPTCHA is sometimes higher for users with disabilities. For some applications, the potential for abuse is so high that the application author feels that a CAPTCHA is necessary. For other applications, the need for accessibility outweighs the abuse that a CAPTCHA would prevent.

Page 12: Captchas

EZ-Gimpy

• EZ-Gimpy and Gimpy, the CAPTCHAs that we have broken, are examples of word-based CAPTCHAs.

• In EZ-Gimpy, the CAPTCHA used by Yahoo!, the user is presented with an image of a single word.

• This image has been distorted, and a cluttered, textured background has been added.

• The distortion and clutter are sufficient to confuse current OCR (optical character recognition) software.

Page 13: Captchas

Contd…

• However, using our computer vision techniques we are able to correctly identify the word 92% of the time.

• Gimpy is a more difficult variant of a word-based CAPTCHA. Ten words are presented in distortion and clutter similar to EZ-Gimpy.

• The words are also overlapped, providing a CAPTCHA test that can be challenging for humans in some cases.

Page 14: Captchas

Generating CAPTCHAs: Bongo

Answer: left

Page 15: Captchas

Contd…

• Bongo displays two series of blocks, the Left and the Right.

• Blocks in the Left series differ from those in the Right, and the user must find the characteristic that sets them apart.

• Then the user is presented with a single block and is asked to determine whether this block belongs to the Right series or to the Left.

Page 16: Captchas

Sound Based CAPTCHAs: Eco

• A sound-based CAPTCHA picks a word or a sequence of numbers at random, renders the word or the numbers into a sound clip and distorts the sound clip.

• It then presents the distorted sound clip to its user and asks them to enter the contents of the sound clip.

Page 17: Captchas

Text Based CAPTCHAs

Page 18: Captchas

Character Recognition

• A number of research projects have attempted (often with success) to beat visual CAPTCHAs by creating programs that contain the following functionality:

1. Extraction of the image from the web page.

2. Removal of background clutter, for example with color filters and detection of thin lines.

3. Segmentation, i.e. splitting the image into segments containing a single letter.

4. Identifying the letter for each segment.

Page 19: Captchas

Contd…

• Steps 1, 2, and 4 are easy tasks for computers.

• The only part where humans still outperform computers is segmentation.

• If the background clutter consists of shapes similar to letter shapes, and the letters are connected by this clutter, the segmentation becomes nearly impossible with current software. Hence, an effective CAPTCHA should focus on the segmentation.
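
To illustrate why connected clutter defeats segmentation, here is a minimal sketch of one very simple segmentation approach, splitting at empty columns. It is an illustration only, not the method used by any particular attack: the image is assumed to be a small binary grid written as strings ('#' for ink, '.' for background), and the function name segment_columns is hypothetical.

```python
def segment_columns(image):
    """Split a binary image into runs of adjacent columns that contain ink.

    `image` is a list of equal-length strings, '#' for ink and '.' for
    background. Returns a list of (start, end) column ranges, end exclusive.
    """
    width = len(image[0])
    # A column "has ink" if any row has a '#' in that column.
    has_ink = [any(row[x] == "#" for row in image) for x in range(width)]

    segments = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a new segment begins
        elif not ink and start is not None:
            segments.append((start, x))    # an empty column ends the segment
            start = None
    if start is not None:
        segments.append((start, width))    # segment runs to the right edge
    return segments


# Two shapes separated by empty columns are split cleanly.
clean = ["##..##",
         "#...#.",
         "##..##"]

# A clutter line through the middle row connects the shapes,
# so the same approach sees a single blob.
cluttered = ["##..##",
             "######",
             "##..##"]

print(segment_columns(clean))
print(segment_columns(cluttered))
```

When a gap exists between the letters, the split is trivial; once clutter bridges the gap, this approach can no longer tell where one letter ends and the next begins.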

Page 20: Captchas

Graphic Based CAPTCHAs

Page 21: Captchas

Image-recognition CAPTCHAs

• Some researchers promote image recognition CAPTCHAs as a possible alternative for text-based CAPTCHAs. To date, no major website has made use of an image-based CAPTCHA. As such, the technology would be best described as in the stage of theoretical research. Image recognition CAPTCHAs face many potential problems which have not been fully studied:

• It is difficult for a small site to acquire a large dictionary of images which an attacker does not have access to. Without a means of automatically acquiring new labelled images, an image-based challenge does not meet the definition of a CAPTCHA.

Page 22: Captchas

Principles

The principles behind CAPTCHA are as follows:

• The user is presented with a garbled image on which some text is displayed. This image is generated by the server using random text.

• The user must enter the same letters in the text into a text field that is displayed on the form to protect.

• When the form is submitted, the server checks if the text entered by the user matches the initial generated text. If it does, the transaction continues. Otherwise, an error message is displayed and the user has to enter a new code.
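
The server-side flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a production implementation: rendering the text as a distorted image is left out, and the names generate_challenge, check_response and CaptchaSession, the six-character length, and the case-insensitive comparison are all choices made for this example.

```python
import hmac
import secrets
import string

# Characters used for the challenge text (an assumption for this sketch).
ALPHABET = string.ascii_uppercase + string.digits


def generate_challenge(length=6):
    """Return random text that the server would render as a garbled image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def check_response(expected, entered):
    """Compare the user's input with the generated text, ignoring case."""
    return hmac.compare_digest(
        expected.upper().encode(), entered.strip().upper().encode()
    )


class CaptchaSession:
    """Holds the expected text for one form, as kept on the server side."""

    def __init__(self):
        self.expected = generate_challenge()

    def submit(self, entered):
        if self.expected is None:
            return False                    # challenge already used
        ok = check_response(self.expected, entered)
        if ok:
            self.expected = None            # a solved code cannot be reused
        else:
            self.expected = generate_challenge()  # user must enter a new code
        return ok
```

A form handler would create one CaptchaSession per displayed form, show the image for session.expected, and continue the transaction only when session.submit(user_input) returns True.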

  

Page 23: Captchas

CAPTCHA would look like…

• The captcha would look like this:

• On the main registration form a regular captcha is presented just like before. Users that can see the image may use this test. A link informs users that there is an alternative test.

• Clicking the link leads to the audio based test form. This form provides access to an audio file and three input fields. The audio file contains three numbers that the user has to enter into the fields.

Page 24: Captchas

Applications

• Online polls

• Protecting Website Registration

• Preventing Comment Spam in Blogs

• Search Engine Bots

• Worms and Spam

• Preventing Dictionary attacks

Page 25: Captchas

Applications

• Online polls

In November 1999, http://slashdot.com released an online poll asking which was the best graduate school in computer science. As is the case with most online polls, IP addresses of voters were recorded in order to prevent single users from voting more than once. However, students at Carnegie Mellon found a way to stuff the ballots by using programs that voted for CMU thousands of times.

Page 26: Captchas

Contd…

CMU's score started growing rapidly. The next day, students at MIT wrote their own voting program and the poll became a contest between voting "bots". MIT finished with 21,156 votes, Carnegie Mellon with 21,032 and every other school with fewer than 1,000.

Page 27: Captchas

Applications

• Protecting Website Registration

Several companies offer free email services. Up until a few years ago, most of these services suffered from a specific type of attack: "bots" that would sign up for thousands of email accounts every minute. The solution to this problem was to use CAPTCHAs to ensure that only humans obtain free accounts.

Page 28: Captchas

Applications

• Preventing Comment Spam in Blogs

Most bloggers are familiar with programs that submit bogus comments, usually for the purpose of raising the search engine rank of some website. This is called comment spam. By using a CAPTCHA, only humans can enter comments on a blog. There is no need to make users sign up before they enter a comment, and no legitimate comments are ever lost!

Page 29: Captchas

Applications

• Search Engine Bots

It is sometimes desirable to keep webpages unindexed to prevent others from finding them easily. There is an HTML tag to prevent search engine bots from reading webpages. The tag only works for bots that choose to honor it, however, so a CAPTCHA is needed to actually keep bots out.

Page 30: Captchas

Applications

• Worms and Spam

CAPTCHA tests also offer a plausible solution against email worms and spam: only accept an email message if you know there is a human behind the other computer.

Page 31: Captchas

Applications

• Preventing Dictionary attacks

CAPTCHAs can also be used to prevent dictionary attacks in password systems. The idea is simple: prevent a computer from being able to iterate through the entire space of passwords by requiring it to solve a CAPTCHA after a certain number of unsuccessful logins.
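
The idea can be sketched as a small failure counter that starts demanding a CAPTCHA once a threshold is reached. This is only an illustration: the LoginGuard class, the threshold of three failures, and the in-memory dictionary are assumptions for the example, and password and CAPTCHA checking are represented by boolean arguments rather than real verification.

```python
class LoginGuard:
    """Require a solved CAPTCHA after too many failed logins per account."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = {}  # username -> count of consecutive failed logins

    def captcha_required(self, username):
        return self.failures.get(username, 0) >= self.max_failures

    def attempt(self, username, password_ok, captcha_ok=False):
        """Process one login attempt and return its outcome as a string."""
        # Past the threshold, the password is not even considered
        # unless the CAPTCHA was solved.
        if self.captcha_required(username) and not captcha_ok:
            return "captcha_required"
        if password_ok:
            self.failures.pop(username, None)  # success resets the counter
            return "ok"
        self.failures[username] = self.failures.get(username, 0) + 1
        return "denied"


guard = LoginGuard(max_failures=3)
for _ in range(3):
    guard.attempt("alice", password_ok=False)       # three wrong guesses

# Even a correct guess is now refused until a CAPTCHA is solved.
print(guard.attempt("alice", password_ok=True))
print(guard.attempt("alice", password_ok=True, captcha_ok=True))
```

A program guessing passwords can therefore make only a few attempts per account before every further guess costs it a CAPTCHA, while a legitimate user who mistypes a password is merely asked to solve one.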

Page 32: Captchas

Conclusion

• Interested in breaking a CAPTCHA?

• People have tried already!

Page 33: Captchas

THANK YOU