I read about this in Wired, and just saw it again. First things first: a "captcha" is a picture of a word (see images on Google, or the Wikipedia entry on captcha), usually all messed up: stretched, with lines through it. It is hard for a computer to read (or to run "OCR" on), but a human can read it. It is used on websites when you are signing up for an account (like a new email address) and the people who run the website want to make sure that you are a human, because sometimes people write computer programs that sign up for fake accounts (e.g. 1000 email accounts for spamming or something).

The captcha was invented by some people from Carnegie Mellon University. They have recently started the reCaptcha program. The beauty of reCaptcha is that it doesn't make goofy images: it uses existing ones, from scanned books. So every time someone does this activity, they are actually helping to digitize the world's libraries. Cool and simple.

Wired article (which actually talks about a variety of interesting "games" for humans that help computers learn):

Geek note 1: a captcha doesn't actually have to be an image. The point is that if you can solve the captcha (the "T" stands for "Turing test") then you are an "H": a human. So, technically, a captcha can be a very wide variety of things. The distorted image is just the most common.

Geek note 2: reCaptcha uses 1 known word and 1 unknown word. The first word is there to see if you get that one right; your answer to the second is logged to help the OCR.
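For the curious, here is a rough Python sketch of the idea in Geek note 2. It is only my guess at the logic, not how reCaptcha is actually implemented: the function names, the in-memory `votes` store, and the `min_votes` threshold are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical in-memory store: unknown-word image id -> tally of human guesses.
votes = defaultdict(Counter)


def normalize(text):
    """Compare answers loosely: ignore case and surrounding whitespace."""
    return text.strip().lower()


def check_challenge(known_answer, unknown_id, typed_known, typed_unknown):
    """Return True if the user passes; if so, log their guess for the unknown word."""
    if normalize(typed_known) != normalize(known_answer):
        # Failed the control word, so the other guess is not trusted either.
        return False
    votes[unknown_id][normalize(typed_unknown)] += 1
    return True


def best_guess(unknown_id, min_votes=2):
    """Return the most common guess once enough humans agree, else None.

    min_votes is an assumed threshold, chosen only for this example.
    """
    if not votes[unknown_id]:
        return None
    word, count = votes[unknown_id].most_common(1)[0]
    return word if count >= min_votes else None
```

The known word acts as the test: only people who get it right have their reading of the unknown word counted, and once enough of them agree, that reading can be used where the OCR gave up.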

Wednesday, August 15, 2007, 12:00 AM

tagged: captcha, computerlearning, onlinegames