Max Gladwell

Social Media and Green Living


CAPTCHA For a Good Cause

April 21st, 2008


If you’ve built online profiles, commented on blogs, or used social networking to any degree, then you’re familiar with the infamous and often frustrating CAPTCHA.

Image caption: One way to make segmentation difficult is to crowd symbols together. The result can be read by humans but cannot be segmented by bots.

According to Wikipedia: A CAPTCHA (IPA: /ˈkæptʃə/) is a type of challenge-response test used in computing to determine that the response is not generated by a computer. A common type of CAPTCHA requires that the user type the letters of a distorted image, sometimes with the addition of an obscured sequence of letters or digits that appears on the screen.

Having painstakingly entered so many of them lately, we knew there had to be a way to harness the collective power of these repetitive but otherwise necessary tasks. Each one takes time, and it must happen tens of millions of times every day (we thought to ourselves). Having looked into it, we found a version that turns these internet speed bumps into a means of digitizing old books…one CAPTCHA phrase at a time. So this version will now be used for comments on Max Gladwell, and you’ll know that there is a method to the madness. Please read on…

reCAPTCHA

About 60 million CAPTCHAs are solved by humans around the world every day. In each case, roughly ten seconds of human time are being spent. Individually, that’s not a lot of time, but in aggregate these little puzzles consume more than 150,000 hours of work each day. What if we could make positive use of this human effort? reCAPTCHA does exactly that by channeling the effort spent solving CAPTCHAs online into “reading” books.
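The aggregate figure is easy to check. Here is a quick sketch of the arithmetic, using only the two numbers quoted above (60 million solves a day, about ten seconds each); the function name is ours, purely for illustration.

```python
# Figures taken from the paragraph above; both are rough estimates.
SOLVES_PER_DAY = 60_000_000   # about 60 million CAPTCHAs solved per day
SECONDS_PER_SOLVE = 10        # roughly ten seconds of human time each


def daily_hours(solves_per_day: int, seconds_per_solve: int) -> float:
    """Convert total seconds spent per day into hours."""
    return solves_per_day * seconds_per_solve / 3600


hours = daily_hours(SOLVES_PER_DAY, SECONDS_PER_SOLVE)
print(f"{hours:,.0f} hours per day")  # 166,667 hours per day
```

That works out to roughly 166,667 hours, which is consistent with the "more than 150,000 hours" quoted above.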

To archive human knowledge and to make information more accessible to the world, multiple projects are currently digitizing physical books that were written before the computer age. The book pages are being photographically scanned, and then, to make them searchable, transformed into text using “Optical Character Recognition” (OCR). The transformation into text is useful because scanning a book produces images, which are difficult to store on small devices, expensive to download, and cannot be searched. The problem is that OCR is not perfect.

Example of OCR errors

reCAPTCHA improves the process of digitizing books by sending words that cannot be read by computers to the Web in the form of CAPTCHAs for humans to decipher. More specifically, each word that cannot be read correctly by OCR is placed on an image and used as a CAPTCHA. This is possible because most OCR programs alert you when a word cannot be read correctly.

But if a computer can’t read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here’s how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.
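To make the pairing idea concrete, here is a minimal sketch of how it could work. This is not reCAPTCHA's actual code: the names (UnknownWord, submit), the vote threshold, and the agreement ratio are all our own assumptions for illustration. A guess for the unknown word is only recorded when the user gets the known word right, and a reading is accepted only once enough people agree.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional

MIN_VOTES = 3          # hypothetical: guesses to collect before deciding
MIN_AGREEMENT = 0.75   # hypothetical: share of guesses that must match


def normalize(text: str) -> str:
    """Ignore case and surrounding whitespace when comparing answers."""
    return text.strip().lower()


@dataclass
class UnknownWord:
    """A word the OCR software could not read, plus the human guesses so far."""
    image_id: str
    votes: Counter = field(default_factory=Counter)

    def accepted_reading(self) -> Optional[str]:
        """Return the agreed reading, or None if confidence is still too low."""
        total = sum(self.votes.values())
        if total < MIN_VOTES:
            return None
        guess, count = self.votes.most_common(1)[0]
        return guess if count / total >= MIN_AGREEMENT else None


def submit(control_answer: str, control_guess: str,
           unknown: UnknownWord, unknown_guess: str) -> bool:
    """Check the known word; if it is right, trust and record the other guess."""
    if normalize(control_guess) != normalize(control_answer):
        return False  # failed the known word, so the other guess is discarded
    unknown.votes[normalize(unknown_guess)] += 1
    return True


word = UnknownWord("scan-0001")
submit("morning", "morning", word, "upon")
submit("morning", "mourning", word, "open")  # control word wrong: ignored
submit("morning", "Morning", word, "upon")
submit("morning", "morning", word, "Upon ")
print(word.accepted_reading())  # upon
```

In this sketch the one user who misread the known word has no influence on the result, and the three remaining guesses agree, so "upon" is accepted.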

Currently, we are helping to digitize books from the Internet Archive.

We feel that this is just one way to harness the power of mass CAPTCHA’ing for a worthy cause. Certainly there is a model that can integrate positive messaging, ad-driven donations, and other types of awareness initiatives. These are moments when you are focused on solving a simple task in order to move forward with whatever you’re doing. Why not put that focus and effort to good use? Leave a comment below and see what we mean.



Tags: Technology
