Captchas are images of text distorted in such a way that a human can read it with high probability while a computer finds it difficult. They are meant to ensure that a website is used as intended by humans and is not exploited by automated programs. Examples of a captcha:
CAPTCHAs of various kinds are commonly deployed to guard account registration, comment posting, and so on. Because captchas are designed to deter software, they are often very hard to read: human accuracy can be around 93%, and it takes something like 10 seconds to read one. As can be seen, this takes quite a toll on the user experience.
There are two main approaches to defeating CAPTCHAs: using humans to recognize them, and using automated character recognition software. The one area where humans still outperform computers is versatility, i.e., humans can still read the text even with varying background clutter and letter shapes. Here we will see how to combine both approaches to create generic software that reads captchas.
We will show that with advanced neural networks (deep learning), most popular captcha software can be broken. Here we demonstrate this using captcha software that is available for public use.
To use neural networks, we need to provide examples consisting of images and their corresponding text. The computer then learns to recognise the text in a new captcha image. Machine learning is a subfield of computer science that gives computers the ability to learn without being explicitly programmed. Machine learning algorithms try to find a function given examples of input (captcha image) and output (text). Here we use deep neural networks as our machine learning algorithm. So the main step is to generate the pairs of images and text.
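To make the idea of input/output pairs concrete, here is a minimal sketch of how a captcha's text could be turned into the numeric targets a network learns from: one class index per character position. The alphabet (lowercase letters and digits), the length of 5, and the function names are assumptions for illustration, not details of the setup described here.

```python
import string

# Assumed alphabet (hypothetical): lowercase letters followed by digits.
ALPHABET = string.ascii_lowercase + string.digits
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def encode_label(text, num_chars=5):
    """Map captcha text to one class index per character position."""
    if len(text) != num_chars:
        raise ValueError(f"expected {num_chars} characters, got {len(text)}")
    return [CHAR_TO_INDEX[ch] for ch in text]


def decode_label(indices):
    """Inverse of encode_label: turn class indices back into text."""
    return "".join(ALPHABET[i] for i in indices)


if __name__ == "__main__":
    target = encode_label("ab3x9")
    print(target, decode_label(target))
```

A training example is then simply the image paired with this list of indices.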
Training data for deep learning
The use of human labor to solve CAPTCHAs effectively renders them vulnerable. The combination of cheap Internet access and the commodity nature of today’s CAPTCHAs has created a solving market. Today, there are many service providers that can solve large numbers of CAPTCHAs via on-demand services, with retail prices as low as $1 per thousand. Using such a service, we can collect a lot of training data easily and quickly. This data can then be used to create a program, as described above, that can solve an unlimited number of captchas.
We use torch to train the neural network, and we train on an Nvidia 780 Titan GPU. You can check a similar setup for the CIFAR dataset at this blog post. The main difference is the criterion: the output for an image is a sequence of characters, and we adapt the network to get good enough results. We test our system using the following captcha, available at this link.
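The exact criterion is not spelled out here, so the following is only a sketch of one common way to score a sequence of characters: the network emits one vector of scores per character position, and the loss is the sum of a softmax cross-entropy over the positions. It is written in plain Python for readability rather than in torch, and the function names are hypothetical.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over one position's class scores."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]


def sequence_nll(position_scores, target_indices):
    """Sum of per-position negative log-likelihoods for one captcha."""
    if len(position_scores) != len(target_indices):
        raise ValueError("need one target index per character position")
    total = 0.0
    for scores, target in zip(position_scores, target_indices):
        total -= log_softmax(scores)[target]
    return total


def predict(position_scores):
    """Pick the highest-scoring class independently at each position."""
    return [max(range(len(s)), key=s.__getitem__) for s in position_scores]
```

Treating each position as its own classification problem keeps the criterion simple, at the cost of assuming a fixed number of characters per captcha.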
We scrape about 10K captcha images and use an online service to have them labeled by humans. Once we have the images and the corresponding text, we train our neural network on this data.
We assume the dataset is in a folder (named dataset). We have to set the appropriate height and width of the input image and the number of characters in the captcha. The network can be changed accordingly to handle any other captcha.
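As an illustration of reading such a folder, here is a small sketch that collects (image path, text) pairs. The naming scheme, in which each file is called `<text>.png` or `<text>.jpg`, is purely an assumption made for this example and may differ from the required format; `load_pairs` is a hypothetical helper.

```python
import os


def load_pairs(dataset_dir, num_chars, extensions=(".png", ".jpg")):
    """Return (image_path, text) pairs, assuming files are named '<text>.<ext>'."""
    pairs = []
    for name in sorted(os.listdir(dataset_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in extensions:
            continue  # skip anything that is not an image file
        if len(stem) != num_chars:
            continue  # skip labels that do not match the expected length
        pairs.append((os.path.join(dataset_dir, name), stem))
    return pairs


if __name__ == "__main__":
    if os.path.isdir("dataset"):
        print(len(load_pairs("dataset", num_chars=5)), "labeled images found")
```

Filtering on the label length up front keeps malformed labels from the labeling service out of the training set.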
First create the dataset in a folder in the required format as shown. Then edit the parameters to suit your needs:
- set the validation data size
- set the batch size so that training fits in your GPU memory
- set the number of iterations to run
- set the learning rate, learning rate decay and momentum
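The parameters listed above could be gathered in one place and passed to the training loop. The sketch below shows how they interact in stochastic gradient descent with momentum. All values are hypothetical, the inverse-time decay formula is an assumption about how the decay is applied, and a toy quadratic loss stands in for the network.

```python
# Hypothetical values; tune them for your own dataset and GPU memory.
config = {
    "validation_size": 1000,
    "batch_size": 64,
    "num_iterations": 300,
    "learning_rate": 0.1,
    "learning_rate_decay": 1e-3,
    "momentum": 0.9,
}


def split_validation(samples, validation_size):
    """Hold out the first validation_size samples; train on the rest."""
    return samples[validation_size:], samples[:validation_size]


def decayed_learning_rate(base_lr, decay, step):
    """Inverse-time decay (assumed schedule): lr shrinks as steps accumulate."""
    return base_lr / (1.0 + decay * step)


def sgd_momentum_step(weight, velocity, gradient, lr, momentum):
    """One SGD update: accumulate the gradient into a velocity, then step."""
    velocity = momentum * velocity + gradient
    return weight - lr * velocity, velocity


def train_toy(cfg):
    """Minimize f(w) = (w - 3)^2 as a stand-in for the network's loss."""
    w, v = 0.0, 0.0
    for step in range(cfg["num_iterations"]):
        grad = 2.0 * (w - 3.0)
        lr = decayed_learning_rate(
            cfg["learning_rate"], cfg["learning_rate_decay"], step
        )
        w, v = sgd_momentum_step(w, v, grad, lr, cfg["momentum"])
    return w
```

In a real run the gradient would come from a mini-batch of `batch_size` captchas rather than from a closed-form expression.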
Using this setup and training for a couple of hours, we were able to get 90% accuracy on the dataset.
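How accuracy is counted matters for a figure like this. The sketch below assumes a strict measure, in which a captcha counts as solved only when every character matches, and contrasts it with a per-character measure. Both function names are hypothetical, and this is not necessarily how the number above was computed.

```python
def captcha_accuracy(predictions, targets):
    """Fraction of captchas whose predicted text matches the target exactly."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("need equally sized, non-empty lists")
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


def character_accuracy(predictions, targets):
    """Fraction of target characters predicted correctly, position by position."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("need equally sized, non-empty lists")
    matched = total = 0
    for p, t in zip(predictions, targets):
        total += len(t)
        # zip stops at the shorter string, so missing characters count as wrong
        matched += sum(a == b for a, b in zip(p, t))
    return matched / total
```

The per-character figure is always at least as high as the whole-captcha one, so it is worth stating which of the two is being reported.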
So, without much effort, by using a standard deep learning library, generating datasets automatically, and applying substantial computational power, we can break standard captcha systems in a day's work.