[Note: This is the second in a two-part series on subscription bombing and how to defuse it. Last time, we looked at the techniques used to create recent attacks. This time we look at the technique Spamhaus recommends as the best way to avoid becoming the victim of a subscription bombing: the CAPTCHA.]
As we discussed in our last blog article, the best way to prevent subscription attacks, according to spam listing companies such as Spamhaus, is to use a verification test in your email signup form. The best known of these, and the one that Spamhaus recommends by name, is the CAPTCHA. CAPTCHAs can be a pain in the neck sometimes, and when they are not easy to solve they can cause people to give up and leave your site. But newsletter signups that don’t require CAPTCHAs are just what subscription bombers look for. If you find yourself on the receiving end of one of these attacks, you’ll have a lot more work to do to recover your reputation score, and you will have to add a CAPTCHA afterward anyway. Having accepted, however unhappily, that CAPTCHAs are a necessity, we’ll look at the different CAPTCHA technologies that are available today.
The best known form of CAPTCHA is the reCAPTCHA, version 1, which consists of a small box displaying two distorted words (usually consisting of one real word and one that is gibberish). You are asked to enter the words you see, and if your answers are incorrect, you are presented with two new words and asked to try again.
ReCAPTCHA was developed by a group of computer scientists at Carnegie Mellon University who recognized that CAPTCHA technology offered a great crowd-sourced way to achieve better OCR. If the OCR software couldn’t identify a word, sometimes humans could, which meant you could feed people the words that computers couldn’t recognize. That’s why, in 2009, the reCAPTCHA technology was acquired by Google for their Books project, and why it was used by the New York Times to digitize its archives. This seemed like a good way to block fake signups, but its designers didn’t factor in either advances in OCR software or the low cost of labor in developing countries.
Capturing CAPTCHAs
Almost as soon as they appeared, people started working on ways to crack the CAPTCHA codes. One company we found in India offers workers around 90¢ an hour to solve as many CAPTCHA codes as humanly possible. Those who can’t do it quickly or who make too many mistakes are kicked off the service. This is a time-consuming way to crack CAPTCHA codes, but by offering wages far below anything most people could live on, its operators presumably make it worth the effort. Just to rub salt in the wound, anyone interested in doing this thankless work is expected to pay a fee to join.
Meanwhile, OCR software kept getting better, so it wasn’t long before someone had the bright idea of creating a bot that used OCR to identify the words in a CAPTCHA. It doesn’t always get it right. In fact, it often gets it wrong, but it doesn’t matter. Unlike a human, who is going to give up in frustration after a few tries, a bot can keep trying and trying until it gets it right. Since their advent, bots have become a major problem for word identification types of verification. To counter this, word-based CAPTCHAs became more distorted and harder to decipher for humans and bots alike. We’ve all seen the results of this battle over decipherability. We’ve all encountered CAPTCHAs so hard to identify that it takes us a few tries to get them right, and we all have better things to do with our time than enter meaningless words in an attempt to receive more email.
To solve this problem, a new kind of reCAPTCHA was created that relies on the natural differences between software and the human brain. This made it easier for humans to recognize the words, while keeping it hard for the bots to do the same. In recent variations, a reCAPTCHA might ask users to identify images instead of scrambled type, relying on human intuition to solve them. Take this example:
At the top of the CAPTCHA we are presented with an image (in this case, a cat) and asked to find all the images with matching content. This is a mixed bag. It will certainly block bots from finding a solution, but it also presents us with instructions that those of us who skew towards the Asperger’s end of the spectrum and tend to take things too literally might also find perplexing. The picture at the top is an adult gray tabby, but the pictures below are all of kittens and only two are gray tabbies. We realize most people won’t get this granular with the data, and that’s what Google is counting on. The top picture is a cat, so humans will click on all the pictures of the same animal, even when every other aspect of the picture is different.
I’m Not a Robot
Two years ago, Google introduced a version of the reCAPTCHA they call a “No CAPTCHA reCAPTCHA.” With this type of CAPTCHA, there’s no need to decipher heavily distorted words, squint to make out blurry photographs of street numbers, or identify various animals. You check the box labeled “I’m not a robot” and you’re done. The No CAPTCHA reCAPTCHA uses Google’s JavaScript API and a form, and appears, for now at least, to be an excellent choice for verification. Spamhaus likes it, and it adds the least hassle to the signup process.
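For readers who want to see what sits behind the checkbox, here is a minimal sketch of the server-side half, written in Python on the assumption that your signup handler runs there. When the form is submitted, the widget adds a token in a field named g-recaptcha-response, and your server asks Google’s siteverify endpoint whether that token is genuine. The function names, the timeout, and the decision to reject the signup when Google can’t be reached are our own illustrative choices, not something Google or Spamhaus prescribes.

```python
import json
import urllib.parse
import urllib.request

# Google's verification endpoint for reCAPTCHA tokens.
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def parse_verification(body):
    """Return True only if the JSON reply reports success."""
    try:
        data = json.loads(body)
    except ValueError:
        return False  # not JSON at all, so treat it as a failure
    return isinstance(data, dict) and data.get("success") is True


def verify_recaptcha(token, secret, remote_ip=None, timeout=5.0):
    """Ask Google whether the token submitted with the form is valid.

    token  -- value of the g-recaptcha-response form field
    secret -- the secret key issued for your site (keep it server-side)
    """
    if not token:
        return False  # the box was never ticked, so there is nothing to check
    fields = {"secret": secret, "response": token}
    if remote_ip:
        fields["remoteip"] = remote_ip  # optional parameter
    payload = urllib.parse.urlencode(fields).encode("ascii")
    try:
        with urllib.request.urlopen(VERIFY_URL, data=payload, timeout=timeout) as reply:
            return parse_verification(reply.read().decode("utf-8"))
    except OSError:
        # Assumption: reject the signup if Google cannot be reached.
        return False
```

In a signup handler you would call verify_recaptcha with the submitted token and your secret key, and only add the address to your list when it returns True.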
Gamifying the Process
A variation on the CAPTCHA that is designed to alleviate the annoyance of typing in meaningless words is the addition of gaming elements to the verification process. With this technique, you are asked to complete some simple task to verify that you are a human being. The task is always simple and resembles a children’s game in its approach. You might, for example, be asked to “put the carrots in the shopping cart.” The picture will show an image of an empty shopping cart with images of various groceries floating next to it. By clicking and dragging the image of the carrots to the image of the shopping cart, you verify that you are a human.
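To make the idea concrete, here is a rough Python sketch of the check a server might run when the visitor drops an item. It assumes the page reports which item was dragged and the coordinates where it was released. The Rect class, the challenge_passed function, and the coordinate values are all hypothetical, and a real product would add its own defenses, such as randomizing positions and keeping the expected answer on the server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle describing where the target image sits."""
    left: int
    top: int
    width: int
    height: int

    def contains(self, x, y):
        """True if the point (x, y) falls inside the rectangle."""
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def challenge_passed(expected_item, target, dragged_item, drop_x, drop_y):
    """The visitor passes only if the right item lands inside the target."""
    return dragged_item == expected_item and target.contains(drop_x, drop_y)


# Hypothetical challenge: "put the carrots in the shopping cart".
cart = Rect(left=200, top=100, width=120, height=80)
print(challenge_passed("carrots", cart, "carrots", 250, 150))
print(challenge_passed("carrots", cart, "milk", 250, 150))
```

The first call represents a visitor who dragged the carrots into the cart; the second represents one who dragged the wrong item to the same spot.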
These gamified verification techniques are effective approaches to the problem, although we haven’t seen that many instances of their use. They appear to be acceptable to Spamhaus as well. According to them, “…any mechanism that successfully keeps bots from abusing signup forms is good and absolutely necessary nowadays. Captcha is currently the best mechanism, and whatever the captcha test does (task, game, whatever) is also fine as long as bots can not easily defeat it.”
Alternatives to CAPTCHA
CAPTCHA is by no means the only way to verify a signup. Programmers continue to invent new ways to foil the bad guys. A couple of alternatives are the honeypot and the social signup. Before choosing either of these, you should note that Spamhaus prefers a CAPTCHA verification that requires the user to perform a task. That’s not to say these alternatives are ineffective at blocking bots, only that implementing them might not help you get off the Spamhaus Block List (SBL). As of right now, a CAPTCHA-type mechanism is the safest way to go.
Honeypot Verification
One of the earliest attempts to simplify the process of signing up and restrict it to real people is the use of a honeypot. The idea is simple: a form is hidden in the HTML for a page, but it isn’t visible on the page, so no human visitor to the site should ever know about it. Since bots don’t visit pages this way, but instead look at each page’s code for forms, they will see the form and attempt to fill it out, thus identifying themselves as bots and not humans. It is a wickedly clever technique for fooling the bots, although, as we’ve already discussed, bots have gotten much more sophisticated over the years and are seldom fooled by this technique anymore. It can also cause problems with browsers that have CSS turned off, and with ones such as Safari that autofill forms. It is still in use, but is often combined with a more interactive signup.
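On the server, the honeypot check itself can be very small. The Python sketch below assumes the trap is a single hidden field inside the signup form and that the submitted values arrive as a dictionary. The field name "website" and both function names are hypothetical.

```python
# Hypothetical name for the hidden trap field; real visitors never see it.
HONEYPOT_FIELD = "website"


def looks_like_bot(form):
    """A filled-in trap field suggests the form was completed by a script."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())


def accept_signup(form):
    """Accept only submissions with an email address and an empty trap."""
    email = form.get("email", "").strip()
    return bool(email) and not looks_like_bot(form)


print(accept_signup({"email": "reader@example.com", "website": ""}))
print(accept_signup({"email": "victim@example.com", "website": "http://spam.example"}))
```

Note that this check cannot tell a bot from a browser that autofilled the hidden field, which is one reason the technique is usually paired with something more interactive.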
The Social Approach
As social sites become more and more important to people’s daily lives, we’ve seen a corresponding growth in sites that require social signups. Instead of entering words or playing games, you are offered a button that says “Sign Up With Facebook.” This approach lays everything on the line, but it also stands a significantly higher chance of losing the audience. Several studies have shown that people just don’t like using their Facebook accounts for promotional purposes, still preferring email as the main source for sales announcements. We don’t recommend using this approach except for those rare cases where your Facebook profile is your main sales mechanism.
At this time, we recommend the “No CAPTCHA reCAPTCHA” for your verification purposes. It satisfies Spamhaus’s requirements, and it makes the signup process as easy as possible for your subscribers. Of course, if history is any indication (and it usually is), it’s just a matter of time before this approach is compromised, and we’ll have to find a new way to verify newsletter signups. It is important to remember that nothing in the field of email marketing remains static. There’s no set-it-and-forget-it solution. You’ll still want to keep track of your email data to see if there are any anomalies occurring.
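As one example of what watching for anomalies could look like, the Python sketch below flags any hour in which signups jump far above the typical level. The function name, the multiplier of five, and the floor of twenty signups are arbitrary assumptions that you would tune to your own list. It is a starting point rather than a full monitoring system.

```python
from statistics import median


def flag_spikes(hourly_signups, factor=5.0, min_count=20):
    """Return the indexes of hours whose signup count looks anomalous.

    An hour is flagged when its count exceeds both `factor` times the
    median hour and an absolute floor of `min_count` signups.
    """
    if not hourly_signups:
        return []
    baseline = median(hourly_signups)
    limit = max(baseline * factor, min_count)
    return [hour for hour, count in enumerate(hourly_signups) if count > limit]


# Hypothetical data: a quiet list that suddenly receives 250 signups in an hour.
print(flag_spikes([4, 6, 5, 7, 250, 5, 6]))  # prints [4]
```

The median is used for the baseline because a single extreme hour barely moves it, whereas it would pull an average sharply upward.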