Digitizing Books using CAPTCHA

This morning I came across a very interesting service that prevents spam using CAPTCHAs and uses the results to digitize books.

What is a CAPTCHA?
A CAPTCHA is a program that can generate and grade tests that humans can pass but current computer programs cannot. The term CAPTCHA (for Completely Automated Public Turing test to tell Computers and Humans Apart) was coined in 2000 by Luis von Ahn, Manuel Blum, Nicholas Hopper and John Langford of Carnegie Mellon University. At the time, they developed the first CAPTCHA to be used by Yahoo.

reCAPTCHA is a free CAPTCHA service that helps to digitize books.

reCAPTCHA is a system developed at Carnegie Mellon University that uses CAPTCHAs to help digitize the text of books while protecting websites from bots attempting to access restricted areas. More specifically, each word that OCR cannot read correctly is placed on an image and used as a CAPTCHA.

But if a computer can’t read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here’s how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.
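The two-word check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not reCAPTCHA's actual code: the class name, the image ids, and the `votes_needed` threshold are all made up for the example, and the real service likely weighs answers in a more sophisticated way.

```python
from collections import Counter, defaultdict


class TwoWordCaptcha:
    """Toy model of the known-word / unknown-word scheme (hypothetical names)."""

    def __init__(self, known_answers, votes_needed=3):
        # known_answers: image id -> lowercase text that is already verified.
        self.known_answers = dict(known_answers)
        # Assumed threshold: matching votes required before a word is accepted.
        self.votes_needed = votes_needed
        self.votes = defaultdict(Counter)  # unknown image id -> guess counts
        self.resolved = {}                 # unknown image id -> accepted text

    @staticmethod
    def _normalize(text):
        return text.strip().lower()

    def submit(self, control_id, control_guess, unknown_id, unknown_guess):
        """Return True if the user passes; record their vote for the unknown word."""
        # The user is graded only on the word whose answer is already known.
        if self._normalize(control_guess) != self.known_answers[control_id]:
            return False
        # A user who read the control word correctly is trusted for the new word.
        if unknown_id not in self.resolved:
            guess = self._normalize(unknown_guess)
            self.votes[unknown_id][guess] += 1
            word, count = self.votes[unknown_id].most_common(1)[0]
            # Accept the reading once enough independent users agree on it.
            if count >= self.votes_needed:
                self.resolved[unknown_id] = word
        return True
```

For example, after creating `TwoWordCaptcha({"known-1": "morning"})`, each call to `submit("known-1", "morning", "scan-42", "upon")` passes the user and adds one vote for "upon"; once the threshold is reached, `resolved["scan-42"]` holds the digitized word.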

More info: http://recaptcha.net/learnmore.html

get it for FREE
