The origin of CAPTCHA and reCAPTCHA

The word CAPTCHA is an acronym for 'Completely Automated Public Turing test to tell Computers and Humans Apart', that is, an automated test for telling computers and people apart.

Like everything in the information age, new technologies arrive fast and quickly become so familiar that we no longer notice them. The CAPTCHA user verification system is one example.

A CAPTCHA-style test was first used in 1997, when the search engine AltaVista wanted a way to block automated submission of URLs to its index. Although submitted URLs helped AltaVista expand the set of pages it could search, bots were also flooding its servers with submissions in order to manipulate the search ranking algorithms.

Andrei Broder, Chief Scientist at AltaVista, came up with a solution: an algorithm that randomly generated images of printed text. The approach was then refined by researchers at Carnegie Mellon University in the early 2000s. This group, led by Luis von Ahn (who calls himself Big Lou), wanted a way to detect spambots pretending to be human online.

They created a program that displays distorted text that computers cannot recognize but humans can still make out. Users have to type the text into a box to gain access.
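The challenge-and-response flow described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used by AltaVista or Carnegie Mellon: the names `ALPHABET`, `new_challenge` and `check_response` are hypothetical, and the step that actually draws the distorted image is left out.

```python
import hmac
import secrets
import string

# Characters used for challenges; look-alikes such as 0/O and 1/I are left out.
# (Assumed alphabet, chosen for this sketch only.)
ALPHABET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in "0O1I"
)


def new_challenge(length: int = 6) -> str:
    """Return a random string that the server would render as a distorted image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def check_response(expected: str, typed: str) -> bool:
    """Compare the user's answer to the challenge, ignoring case and outer spaces."""
    normalized = typed.strip().upper()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), normalized.encode())


challenge = new_challenge()
# A real system would draw `challenge` as a warped, noisy image at this point.
print(check_response(challenge, challenge.lower()))  # True
```

The security of such a scheme rests entirely on the image distortion, which is why it weakens as text recognition software improves.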

Images are not the only form of CAPTCHA. There are also audio versions (often distorted to defeat speech recognition software), written questions that computers cannot understand, and PiCAPTCHA, which shows a set of images and requires the user to select them in a certain order.

CAPTCHA was very successful and became a popular tool that users accepted. But its creators overlooked one human trait: people want to get paid. CAPTCHA-solving farms appeared all over the Internet, especially in poor countries, where workers only need to answer CAPTCHA puzzles to earn money.

On these 'farms', solving CAPTCHAs is a money-making business. Meanwhile, millions of people answer CAPTCHA puzzles for free, which, according to von Ahn, is a waste of human effort.

CAPTCHA and reCAPTCHA have become verification tools used across many websites

After that, reCAPTCHA was born, and it also became very popular. It works much as before: users type the letters and numbers shown on the screen. But instead of random words, reCAPTCHA asks users to read images of letters and numbers taken from archived documents. Computers read old text quite well, but when the ink is smudged or the paper is damaged, the text becomes very hard for software to read while remaining legible to humans.
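The article does not say how answers from different users are combined into a final transcription. One plausible approach, shown here purely as an assumption, is to accept a reading only once enough users agree on it. The function name `agreed_transcription` and the thresholds `min_votes` and `min_share` are hypothetical.

```python
from collections import Counter
from typing import Optional


def agreed_transcription(
    answers: list[str], min_votes: int = 3, min_share: float = 0.6
) -> Optional[str]:
    """Return the reading most users agree on, or None if agreement is too weak."""
    # Normalize case and whitespace, and drop empty answers.
    cleaned = [a.strip().lower() for a in answers if a.strip()]
    if not cleaned:
        return None
    word, votes = Counter(cleaned).most_common(1)[0]
    # Assumed rule: enough votes in absolute terms and as a share of all answers.
    if votes >= min_votes and votes / len(cleaned) >= min_share:
        return word
    return None


print(agreed_transcription(["morning", "Morning", "moming", "morning "]))  # morning
```

With a rule like this, a single careless or malicious answer cannot decide what ends up in the digitized text.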

They started with the archive of The New York Times, then sold the technology to Google, which uses it to digitize old books. Those blurred images are real words from real pages. That means you have been doing transcription work for free for Google and The New York Times.

Von Ahn was very pleased with the new version and believed that reCAPTCHA would work forever because 'there are lots of printed documents'. But this is the Internet era, and many things we take for granted online can disappear one day. The CAPTCHA system is no exception.

CAPTCHA is not unbreakable. In 2014, an analysis by Google showed that artificial intelligence could solve even the most difficult distorted-text CAPTCHA and reCAPTCHA images with 99.8% accuracy.

Google therefore created a new system, No CAPTCHA reCAPTCHA, which relies not on the user's ability to decipher text but on their online behavior before they reach the security checkpoint. While users are on the page, the algorithm observes how they interact with the content to decide whether it is dealing with a person or a robot.

The battle between security experts and spambots will probably never end. One day No CAPTCHA reCAPTCHA may itself be bypassed and replaced by another technology. Until then, stay alert.

