Proof of concept

As we thought, AI programmers are working on the subject and designing bots to beat existing captchas.

We recently found a brilliant post by Casey Chestnut: "Using AI to beat CAPTCHA and post comment spam".

To provide a proof of concept of our framework, and a tutorial on writing a new engine, we follow Casey's advice to write a new captcha engine.

Casey's advice

  • render the characters with different colors
  • make some characters darker than the background, and some lighter
  • use gradient colors for the backgrounds and the characters
  • don't align all the characters vertically
  • don't make the answers real words, otherwise a dictionary could be used
  • use more characters and symbols
  • use uppercase and lowercase characters
  • use a different number of characters each time
  • rotate some of the characters more drastically (i.e. upside down)
  • do more overlapping of characters
  • make some pixels of a single character not touching
  • have grid lines that cross over the characters with their same color
  • consider asking natural language questions

Implementation

Let's see which pieces of advice can be implemented with existing jcaptcha components and which ones require new components.

Using existing components

don't make the answers real words, otherwise a dictionary could be used

The component responsible for generating words is the WordGenerator.
This advice is easily implemented using the RandomWordGenerator (see the SimpleListImageCaptchaEngine word generation initialisation) or the ComposeDictionaryWordGenerator (see the DefaultGimpyEngine word generation initialisation).

The ComposeDictionaryWordGenerator is the better option, since humans are reportedly able to read and recognize whole parts of words (TODO: find the study source).
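To illustrate the idea of composing answers from word parts, here is a minimal sketch in plain Java. The class name ComposedWordSketch and the "first half plus second half" rule are assumptions made for this example; this is not the jcaptcha ComposeDictionaryWordGenerator itself.

```java
import java.util.List;
import java.util.Random;

// Hypothetical helper written for this tutorial, not a jcaptcha class.
final class ComposedWordSketch {
    private final List<String> dictionary;
    private final Random random;

    // Dependencies are passed through the constructor.
    ComposedWordSketch(List<String> dictionary, Random random) {
        this.dictionary = dictionary;
        this.random = random;
    }

    /** Joins the first half of one dictionary word with the second half of another. */
    String nextWord() {
        String first = dictionary.get(random.nextInt(dictionary.size()));
        String second = dictionary.get(random.nextInt(dictionary.size()));
        String head = first.substring(0, first.length() / 2);
        String tail = second.substring(second.length() / 2);
        return head + tail;
    }
}
```

The result is usually not a dictionary entry, yet it keeps familiar chunks that a human can still read.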

use more characters and symbols

This is done using the initialisation String of the RandomWordGenerator (the set of accepted characters) or using a custom dictionary.

use uppercase and lowercase characters

Same as above.
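Both of the last two points come down to widening the set of characters the answer is drawn from. The sketch below shows this in plain Java; the class RandomWordSketch and the particular character set are assumptions for the example, not part of jcaptcha.

```java
import java.util.Random;

// Hypothetical helper written for this tutorial, not a jcaptcha class.
final class RandomWordSketch {
    // Assumed character set: upper case, lower case, digits and a few symbols.
    // Look-alike characters (I, l, O, 0, 1, o) are left out on purpose.
    static final String ACCEPTED_CHARS =
        "ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnpqrstuvwxyz23456789@#%&?";

    /** Builds a word of the given length from characters picked at random. */
    static String randomWord(String acceptedChars, int length, Random random) {
        StringBuilder word = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            word.append(acceptedChars.charAt(random.nextInt(acceptedChars.length())));
        }
        return word.toString();
    }
}
```

A call such as RandomWordSketch.randomWord(RandomWordSketch.ACCEPTED_CHARS, 6, new Random()) yields a six-character answer mixing cases and symbols.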

use a different number of characters each time

The component responsible for pasting words on a background is the TextPaster.
It provides a minimum and a maximum pasted text length, and all implementations take those attributes as constructor parameters (like all framework components, since the constructor dependency injection principle has been applied).

See DefaultGimpyEngine textPaster initialisation
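As a rough sketch of how a minimum and a maximum length passed through the constructor can drive a different length each time, consider the following plain Java class. LengthRangeSketch is a hypothetical name used for illustration; it does not reproduce any jcaptcha TextPaster.

```java
import java.util.Random;

// Hypothetical helper written for this tutorial, not a jcaptcha class.
final class LengthRangeSketch {
    private final int minLength;
    private final int maxLength;
    private final Random random;

    // Bounds and the random source are injected through the constructor.
    LengthRangeSketch(int minLength, int maxLength, Random random) {
        if (minLength < 1 || maxLength < minLength) {
            throw new IllegalArgumentException("expected 1 <= minLength <= maxLength");
        }
        this.minLength = minLength;
        this.maxLength = maxLength;
        this.random = random;
    }

    /** Returns a length between minLength and maxLength, both inclusive. */
    int nextLength() {
        return minLength + random.nextInt(maxLength - minLength + 1);
    }
}
```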

make some pixels of a single character not touching

This is done with an existing implementation of the TextPaster, the BaffleTextPaster.

See DefaultGimpyEngine BaffleTextPaster initialisation
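The effect can be pictured as punching small background-coloured holes into the rendered text, so that the pixels of one character no longer all touch. The sketch below does this directly on a BufferedImage; BaffleSketch, its parameters and the square hole shape are assumptions for the example and do not describe how the jcaptcha paster is written.

```java
import java.awt.image.BufferedImage;
import java.util.Random;

// Hypothetical helper written for this tutorial, not a jcaptcha class.
final class BaffleSketch {
    /** Overwrites `holes` random square patches with the background colour. */
    static void punchHoles(BufferedImage image, int holes, int holeSize,
                           int backgroundRgb, Random random) {
        for (int h = 0; h < holes; h++) {
            int x0 = random.nextInt(image.getWidth());
            int y0 = random.nextInt(image.getHeight());
            // Clip each patch to the image bounds.
            int xEnd = Math.min(x0 + holeSize, image.getWidth());
            int yEnd = Math.min(y0 + holeSize, image.getHeight());
            for (int y = y0; y < yEnd; y++) {
                for (int x = x0; x < xEnd; x++) {
                    image.setRGB(x, y, backgroundRgb);
                }
            }
        }
    }
}
```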

rotate some of the characters more drastically (i.e. upside down)

The component responsible for rotating fonts is the FontGenerator.
It has two existing implementations that allow twisting the font (i.e. rotating the characters once pasted, please excuse our approximate English).

Thus you may choose the TwistedRandomFontGenerator or the TwistedAndShearedRandomFontGenerator, which also shears the font.

To be truly honest, this component should be rewritten to fit the need, since the angle it uses should be more drastic:

float angle = myRandom.nextFloat() / 3;
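Assuming the angle above is expressed in radians, nextFloat() / 3 stays below one third of a radian, roughly 19 degrees. A more drastic variant could draw the angle from a symmetric range, up to a half turn in either direction. The sketch below is a hypothetical illustration using the standard AffineTransform class; RotationSketch and maxAngle are names invented here, and this is not a patch to the jcaptcha generators.

```java
import java.awt.geom.AffineTransform;
import java.util.Random;

// Hypothetical helper written for this tutorial, not a jcaptcha class.
final class RotationSketch {
    /** Angle in radians, drawn uniformly from [-maxAngle, +maxAngle). */
    static double randomAngle(Random random, double maxAngle) {
        return (random.nextDouble() * 2.0 - 1.0) * maxAngle;
    }

    /** Rotation transform for a random angle within the given bound. */
    static AffineTransform randomRotation(Random random, double maxAngle) {
        return AffineTransform.getRotateInstance(randomAngle(random, maxAngle));
    }
}
```

With maxAngle set to Math.PI some characters end up upside down. The resulting transform can be applied to a font with Font.deriveFont(AffineTransform).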

have grid lines that cross over the characters with their same color

This one is not yet implemented in any jcaptcha engine. Let's see how to do it.
As we are in the 'using existing' part, we will use the deformation facilities provided through the JHLabs imaging filter library.
Using a simple 'weave' filter, we transform a captcha produced by the DefaultGimpyEngine:

to this:
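For readers who want to see the grid-line idea in isolation, the following sketch draws one-pixel lines over an image in a chosen colour. It is a plain Java stand-in, not the JHLabs weave filter; GridLinesSketch and the spacing parameter are assumptions made for the example.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper written for this tutorial, not a jcaptcha or JHLabs class.
final class GridLinesSketch {
    /** Draws one-pixel horizontal and vertical lines every `spacing` pixels. */
    static void drawGrid(BufferedImage image, int spacing, int rgb) {
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                if (x % spacing == 0 || y % spacing == 0) {
                    image.setRGB(x, y, rgb);
                }
            }
        }
    }
}
```

Passing the same colour as the text makes the lines blend with the characters, which is the point of Casey's suggestion.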