CAPTCHA

From Computing and Software Wiki

Revision as of 06:39, 9 April 2009 by Dangelsm

CAPTCHA is an acronym for Completely Automated Public Turing Test to Tell Computers and Humans Apart. Commonly, these tests take the form of images of scrambled text that a human is able to read, but current optical character recognition software cannot decipher. The most common use of a CAPTCHA is to protect web-accessible services from being abused by "bots".

Background

The term CAPTCHA was coined in 2000 by Luis von Ahn, Manuel Blum, Nicholas J. Hopper and John Langford. Von Ahn, Blum and Hopper were at Carnegie Mellon University, while Langford was at IBM. In their paper "CAPTCHA: Using Hard AI Problems for Security" [4], they introduced the theoretical concept of a CAPTCHA and gave some examples of how one could be used. They described a CAPTCHA as "a cryptographic protocol whose underlying hardness assumption is based on an AI problem" [4]. This can be compared to public-key cryptosystems such as RSA, where the underlying hardness assumption is that factoring large numbers is hard. Further, they argued that a CAPTCHA is a win-win situation: either it remains unsolvable by computers and security is maintained, or it is cracked by a computer program and the field of artificial intelligence has advanced.

Applications

An example of reCAPTCHA, the currently recommended CAPTCHA implementation

In their paper [4], von Ahn et al. gave several examples of how a CAPTCHA could be used.

Online Polls

To trust the results of an online poll, at the very least only humans should be able to vote. Requiring a CAPTCHA before a vote is submitted helps enforce this.
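The flow of gating a vote behind a CAPTCHA can be sketched roughly as follows. This is a minimal illustration and not any real CAPTCHA library: the `Poll` class and its `issue_challenge` and `vote` methods are hypothetical names, and the "challenge" is a plain random string standing in for a distorted image. The sketch assumes the server keeps the expected answer to itself, accepts each challenge only once, and counts a vote only when the response matches.

```python
import secrets


class Poll:
    """Toy poll that only counts a vote accompanied by a solved challenge."""

    def __init__(self, options):
        self.tally = {option: 0 for option in options}
        self._pending = {}  # challenge id -> expected answer (kept server-side)

    def issue_challenge(self):
        """Create a challenge; a real site would render `answer` as a distorted image."""
        challenge_id = secrets.token_hex(8)
        answer = secrets.token_hex(3)  # stand-in for the scrambled text
        self._pending[challenge_id] = answer
        return challenge_id, answer

    def vote(self, option, challenge_id, response):
        """Count the vote only if the response matches; each challenge is single-use."""
        expected = self._pending.pop(challenge_id, None)
        if expected is None or expected != response:
            return False
        if option not in self.tally:
            return False
        self.tally[option] += 1
        return True


poll = Poll(["yes", "no"])
cid, text = poll.issue_challenge()      # a human would read `text` off the image
accepted = poll.vote("yes", cid, text)  # correct answer: the vote is counted
replayed = poll.vote("yes", cid, text)  # same challenge reused: rejected
```

Removing the challenge from `_pending` as soon as it is checked is what stops a bot from solving one challenge and replaying it for many votes.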

Free e-mail Services

Yahoo!'s free e-mail service was one of the first uses of a CAPTCHA developed by von Ahn et al. Free e-mail is just one example of an online service that is attractive to bots: many bots try to sign up for as many accounts as possible in order to send spam anonymously. Requiring a CAPTCHA during sign-up prevents bots from registering accounts en masse.

Search Engine Bots

Sometimes one does not want a particular page indexed by a search engine. Although a web page can include a "noindex" value in a robots meta tag, this is only a request and can easily be ignored by malicious indexers. If a page is only accessible via a CAPTCHA, search indexing bots cannot view the content.
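To make the advisory nature of "noindex" concrete, here is a small sketch of the check a well-behaved crawler might perform, written with Python's standard `html.parser` module. The names `RobotsMetaParser` and `allows_indexing` are hypothetical and chosen only for this example. A malicious indexer simply never runs such a check, which is why the tag alone offers no protection.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def allows_indexing(html):
    """Return False when the page asks not to be indexed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" not in parser.directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(allows_indexing(page))                          # False
print(allows_indexing("<html><head></head></html>"))  # True
```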

Preventing Dictionary Attacks

A CAPTCHA can be added to a login system alongside a traditional password to stop a bot from guessing the password by trying candidate after candidate, as in a dictionary or brute-force attack.
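The effect on a dictionary attack can be illustrated with a short simulation. Everything here is an assumption made for the sketch: the `login` and `dictionary_attack` functions, the salt, the example password, and the `captcha_solved` flag, which stands in for whatever CAPTCHA verification a real site would perform.

```python
import hashlib
import hmac

SALT = b"example-salt"  # hypothetical value for the sketch


def _digest(password):
    """Derive a password hash with PBKDF2 from the standard library."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), SALT, 10_000)


STORED = _digest("correct horse")  # hypothetical stored credential


def login(password, captcha_solved):
    """Reject the attempt outright unless the CAPTCHA was solved."""
    if not captcha_solved:
        return False  # the password is never checked, so the guess reveals nothing
    return hmac.compare_digest(_digest(password), STORED)


def dictionary_attack(wordlist, can_solve_captcha):
    """Return the first word that logs in, or None if every guess is rejected."""
    for word in wordlist:
        if login(word, captcha_solved=can_solve_captcha):
            return word
    return None


words = ["123456", "password", "correct horse", "letmein"]
bot_result = dictionary_attack(words, can_solve_captcha=False)    # None
solver_result = dictionary_attack(words, can_solve_captcha=True)  # "correct horse"
```

The second call shows the limit of the approach: once an attacker can pass the CAPTCHA, whether by software or by paying humans to solve it, the password is the only remaining defence.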

Weaknesses

A particularly poor CAPTCHA
Another poor CAPTCHA

Poorly Made CAPTCHA

A CAPTCHA can be described as poor in one of two ways. Either the test fails to be human-solvable in a reasonable amount of time, or it can be solved by a computer using current AI techniques.

To the right are two CAPTCHAs that fall into the first category. The first image shows a CAPTCHA that requires the user to solve a difficult calculus problem in order to proceed. While this may successfully thwart a bot, it also prevents many legitimate users from using the web service. Likewise, the second example is simply unreadable by humans due to poor contrast.

A CAPTCHA that has been successfully solved by a computer

The second category of poor CAPTCHAs consists of those that can be solved by a computer; such a test fails to tell computers and humans apart. To the left is an example from a program written by Casey Chesnut, which successfully posted spam to 94 blogs in 10 minutes [2].

Although these can be called examples of poor CAPTCHAs, by the very definition of a CAPTCHA they are not CAPTCHAs at all. If a test is either not solvable by humans or solvable by computers, it is no longer a test that tells computers and humans apart.

Accessibility

Many CAPTCHAs also suffer from poor accessibility. For example, each of the CAPTCHAs shown above would be unusable by a blind user, as they require the user to decipher text from a bitmap image. Some websites now provide audio CAPTCHAs as well, though these are sometimes just as difficult for humans to understand, or easier to crack with speech-recognition software [6]. For this reason, the W3C has recommended that low-volume, low-resource websites (such as blogs protecting against comment spam) replace CAPTCHAs with spam-filtering heuristics [6].
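As a rough idea of what spam-filtering heuristics for blog comments might look like, consider the sketch below. The specific signals (a hidden "honeypot" form field, the time taken to submit, and the number of links) and all thresholds are assumptions chosen for illustration; they are not taken from the W3C document, and `looks_like_spam` is a hypothetical name.

```python
import re

LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)


def looks_like_spam(comment, honeypot="", seconds_to_submit=60.0, max_links=2):
    """Flag a blog comment using a few illustrative heuristics (all thresholds assumed)."""
    if honeypot:  # a hidden form field that humans never see, so never fill in
        return True
    if seconds_to_submit < 2.0:  # submitted faster than a person could type
        return True
    if len(LINK_PATTERN.findall(comment)) > max_links:
        return True
    return False


print(looks_like_spam("Nice article, thanks!"))                                       # False
print(looks_like_spam("Buy now http://a.example http://b.example http://c.example"))  # True
print(looks_like_spam("Hello", honeypot="filled-by-bot"))                             # True
```

None of these checks asks the visitor to see or hear anything, which is why this kind of filtering avoids the accessibility problems described above.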

References

  1. Carnegie Mellon University. 2009. What is a CAPTCHA?
  2. Chesnut, Casey. 2005. Using AI to beat CAPTCHA and post comment spam.
  3. von Ahn, Luis, Ben Maurer, Colin McMillen, David Abraham and Manuel Blum. 2008. reCAPTCHA: Human-Based Character Recognition via Web Security Measures. In Science.
  4. von Ahn, Luis, Manuel Blum, Nicholas Hopper and John Langford. CAPTCHA: Using Hard AI Problems for Security. In Eurocrypt.
  5. von Ahn, Luis, Manuel Blum and John Langford. 2004. Telling Humans and Computers Apart Automatically. In Communications of the ACM.
  6. W3C. 2005. Inaccessibility of CAPTCHA.
  7. Willis, John M. 2008. Top 10 Worst Captchas.


--Dangelsm 14:07, 8 April 2009 (EDT)