Proof of concept
As we suspected, AI programmers are working on the subject and designing bots to beat existing captchas.
We recently found a brilliant post by Casey Chestnut: Using AI to beat CAPTCHA and post comment spam
To provide a proof of concept of our framework, and a tutorial on writing a new engine, we follow Casey's advice to write a new captcha engine.
Casey's advice
- render the characters with different colors
- make some characters darker than the background, and some lighter
- use gradient colors for the backgrounds and the characters
- don't align all the characters vertically
- don't make the answers words, so that a dictionary could be used
- use more characters and symbols
- use uppercase and lowercase characters
- use a different number of characters each time
- rotate some of the characters more drastically (i.e. upside down)
- do more overlapping of characters
- make some pixels of a single character not touching
- have grid lines that cross over the characters with their same color
- consider asking natural language questions
Implementation
Let's see which pieces of advice can be implemented with existing jcaptcha components and which ones require new components.
Using existing components
don't make the answers words, so that a dictionary could be used
The component responsible for generating words is the WordGenerator
This advice is easily implemented using the RandomWordGenerator (see the SimpleListImageCaptchaEngine WordGenerator initialisation) or the ComposeDictionaryWordGenerator (see the DefaultGimpyEngine WordGenerator initialisation).
The ComposeDictionaryWordGenerator is the best option, since humans are reportedly able to read and recognize whole parts of words (TODO: find the study source).
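The two approaches can be sketched with the JDK alone. This is only an illustration: WordSketch and its methods are hypothetical names, not jcaptcha classes, and we assume here that "composing" means gluing the first half of one dictionary word to the second half of another, which may differ from what the real ComposeDictionaryWordGenerator does.

```java
import java.util.Random;

/** Hypothetical helper, not part of jcaptcha: two ways of producing non-dictionary answers. */
final class WordSketch {
    private final Random random;

    WordSketch(Random random) {
        this.random = random;
    }

    /** Builds a word by drawing each character at random from acceptedChars. */
    String randomWord(String acceptedChars, int length) {
        StringBuilder word = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            word.append(acceptedChars.charAt(random.nextInt(acceptedChars.length())));
        }
        return word.toString();
    }

    /** Assumed composition rule: first half of one word plus second half of another. */
    String composedWord(String[] dictionary) {
        String first = dictionary[random.nextInt(dictionary.length)];
        String second = dictionary[random.nextInt(dictionary.length)];
        return first.substring(0, first.length() / 2) + second.substring(second.length() / 2);
    }
}
```

A composed answer such as "winden" (from "window" and "garden") is not in a dictionary, yet its halves remain familiar to a human reader.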
use more characters and symbols
This is done through the initialisation String of the RandomWordGenerator (the set of accepted characters), or by using a custom dictionary.
use uppercase and lowercase characters
Same as above.
use a different number of characters each time
The component responsible for pasting words on a background is the TextPaster.
It provides minimum and maximum pasted text lengths, and all implementations take those attributes as constructor parameters (as do all framework components, since the constructor dependency injection principle has been applied).
See DefaultGimpyEngine textPaster initialisation
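As an illustration of the constructor injection idea, here is a minimal sketch using only the JDK. LengthBoundsSketch is a hypothetical stand-in, not a jcaptcha TextPaster; only the handling of the length bounds is shown, and the choice of a uniform distribution is our assumption.

```java
import java.util.Random;

/** Hypothetical stand-in for a text paster; only the length handling is sketched. */
final class LengthBoundsSketch {
    private final int minLength;
    private final int maxLength;
    private final Random random;

    /** All settings and collaborators arrive through the constructor. */
    LengthBoundsSketch(int minLength, int maxLength, Random random) {
        if (minLength < 1 || maxLength < minLength) {
            throw new IllegalArgumentException("need 1 <= minLength <= maxLength");
        }
        this.minLength = minLength;
        this.maxLength = maxLength;
        this.random = random;
    }

    /** Picks a length uniformly in [minLength, maxLength], both bounds included. */
    int nextLength() {
        return minLength + random.nextInt(maxLength - minLength + 1);
    }

    /** True when a candidate word fits the configured bounds. */
    boolean accepts(String word) {
        return word.length() >= minLength && word.length() <= maxLength;
    }
}
```

Because the bounds are fixed at construction time, an engine can wire a different paster per configuration without touching the component itself.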
make some pixels of a single character not touching
This is done with an existing implementation of the TextPaster, the BaffleTextPaster.
See DefaultGimpyEngine BaffleTextPaster initialisation
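The idea of breaking characters apart can be sketched with plain Java 2D. This is not the BaffleTextPaster code: HolesSketch is a hypothetical name, and for simplicity the holes are scattered over the whole image rather than placed on each glyph.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.Random;

/** Hypothetical sketch: paints background-coloured discs over an already rendered captcha. */
final class HolesSketch {
    /** Punches `holes` discs of the given diameter at random positions, in place. */
    static void punchHoles(BufferedImage image, Color holeColor,
                           int holes, int diameter, Random random) {
        Graphics2D g = image.createGraphics();
        try {
            g.setColor(holeColor);
            for (int i = 0; i < holes; i++) {
                // Keep each disc inside the image bounds.
                int x = random.nextInt(Math.max(1, image.getWidth() - diameter));
                int y = random.nextInt(Math.max(1, image.getHeight() - diameter));
                g.fillOval(x, y, diameter, diameter);
            }
        } finally {
            g.dispose();
        }
    }
}
```

When the hole colour matches the background, strokes that cross a disc end up with pixels that no longer touch.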
rotate some of the characters more drastically (i.e. upside down)
The component responsible for rotating fonts is the FontGenerator.
It has two existing implementations that can twist the font (i.e. rotate the characters once pasted; please excuse our approximate English).
Thus you may choose the
TwistedRandomFontGenerator
or the
TwistedAndShearedRandomFontGenerator
which also shears the font.
To be truly honest, this component should be rewritten to fit the need, as the angle should be more drastic:
float angle = myRandom.nextFloat() / 3;
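For comparison, here is a sketch of what a more drastic angle could look like. DrasticTwistSketch and its methods are hypothetical and are not the jcaptcha implementation; we assume the angle is in radians, as expected by java.awt.geom.AffineTransform, and the full-turn range is simply one possible choice.

```java
import java.awt.Font;
import java.awt.geom.AffineTransform;
import java.util.Random;

/** Hypothetical sketch, not the jcaptcha implementation. */
final class DrasticTwistSketch {
    /** Behaviour quoted above: an angle in [0, 1/3) radians, about 19 degrees at most. */
    static float mildAngle(Random random) {
        return random.nextFloat() / 3;
    }

    /** More drastic variant: an angle in [-PI, PI), so upside-down characters are possible. */
    static double drasticAngle(Random random) {
        return (random.nextDouble() * 2 - 1) * Math.PI;
    }

    /** Returns a copy of the font rotated by the given angle (radians). */
    static Font twist(Font font, double angle) {
        return font.deriveFont(AffineTransform.getRotateInstance(angle));
    }
}
```

Applying twist to each character's font separately, rather than once for the whole word, would give every character its own orientation.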
have grid lines that cross over the characters with their same color
This one is not yet implemented in any jcaptcha engine. Let's see how to do it.
As we are in the 'using existing components' part, we will use the deformation facilities provided through the JHLabs imaging filter library.
Using a simple 'weave' filter, we transform a captcha produced by the DefaultGimpyEngine:
to this:
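To make the advice itself concrete, here is a sketch that draws grid lines in the text colour using only the JDK. It does not reproduce the JHLabs weave filter and is not a jcaptcha component: GridLinesSketch, addGrid and the spacing parameter are hypothetical names chosen for this illustration.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

/** Hypothetical sketch using only the JDK; it does not reproduce the JHLabs weave filter. */
final class GridLinesSketch {
    /** Returns a copy of the captcha with lines every `spacing` pixels, in the text colour. */
    static BufferedImage addGrid(BufferedImage captcha, Color textColor, int spacing) {
        if (spacing < 1) {
            throw new IllegalArgumentException("spacing must be at least 1");
        }
        BufferedImage result = new BufferedImage(
                captcha.getWidth(), captcha.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = result.createGraphics();
        try {
            g.drawImage(captcha, 0, 0, null);
            g.setColor(textColor);
            // Vertical lines.
            for (int x = spacing; x < result.getWidth(); x += spacing) {
                g.drawLine(x, 0, x, result.getHeight() - 1);
            }
            // Horizontal lines.
            for (int y = spacing; y < result.getHeight(); y += spacing) {
                g.drawLine(0, y, result.getWidth() - 1, y);
            }
        } finally {
            g.dispose();
        }
        return result;
    }
}
```

Since the lines share the colour of the characters, a bot can no longer separate text from noise by colour alone.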