CAPTCHA

Cyan · October 1, 2006, 11:32pm

Can you read the text in the picture?

Easy.
With effort.
Barely.
No, it is unintelligible.

0 voters

Due to incessant spammage on the forums, I wrote an improved CAPTCHA script using Panda3D. It might take some tweaking to make it work on a webhost so I’m releasing it under a zlib-style license.

It requires the following TrueType fonts:
Barred Out.ttf
Schizm.ttf
Scratch my back.ttf
SF Wasabi.ttf
Shattered.ttf
Spotted Fever.ttf
Staccatissmo.ttf
They are available for free on the internet and should be easy to find.

Also the environment.egg file (at least the last time I checked) has some improperly applied textures. Change

<Texture> Tex975 {
  "maps/envir-mountain2.png"
  <Scalar> format { rgba }
  <Scalar> wrapu { repeat }
  <Scalar> wrapv { repeat }
  <Scalar> minfilter { linear_mipmap_linear }
  <Scalar> magfilter { linear }
}

To

<Texture> Tex975 {
  "maps/envir-mountain2.png"
  <Scalar> format { rgba }
  <Scalar> wrapu { CLAMP }
  <Scalar> wrapv { CLAMP }
  <Scalar> minfilter { linear_mipmap_linear }
  <Scalar> magfilter { linear }
}

Same for “maps/envir-mountain1.png”, “maps/envir-groundcover1.png”, “maps/envir-reeds.png”, and “maps/envir-bamboo.png”. Things will look much better without the stray lines.
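For anyone who would rather not edit the file by hand, the change above could be scripted. This is only a rough sketch: it assumes every affected `<Texture>` entry is laid out like the one shown (quoted filename first, closing brace on its own line), and the helper names `clamp_textures` and `patch_file` are made up for illustration.

```python
import re

# Texture filenames named in the post.
TEXTURES = (
    "maps/envir-mountain1.png",
    "maps/envir-mountain2.png",
    "maps/envir-groundcover1.png",
    "maps/envir-reeds.png",
    "maps/envir-bamboo.png",
)

# One whole <Texture> entry: header, quoted filename, then everything up to
# a closing brace that sits at the start of a line.
TEXTURE_BLOCK = re.compile(r'<Texture>\s*\S+\s*\{\s*"([^"]+)".*?\n\}', re.DOTALL)
# A wrapu/wrapv scalar currently set to repeat.
WRAP_REPEAT = re.compile(r'(<Scalar>\s*wrap[uv]\s*\{\s*)repeat(\s*\})', re.IGNORECASE)


def clamp_textures(egg_text, textures=TEXTURES):
    """Return egg_text with wrapu/wrapv set to CLAMP for the listed textures only."""
    def fix(match):
        block = match.group(0)
        if match.group(1) in textures:
            block = WRAP_REPEAT.sub(r'\g<1>CLAMP\g<2>', block)
        return block
    return TEXTURE_BLOCK.sub(fix, egg_text)


def patch_file(path):
    """Rewrite the egg file at path in place (keep a backup first)."""
    with open(path) as f:
        text = f.read()
    with open(path, "w") as f:
        f.write(clamp_textures(text))

# Hypothetical usage; the actual location of environment.egg depends on the install:
# patch_file("models/environment.egg")
```

Textures not in the list are left untouched, so other tiled surfaces keep repeating.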

# Copyright (c) 2006 Cyan
#
# This software is provided 'as-is', without any express or implied warranty.
# In no event will the author(s) be held liable for any damages arising from
# the use of this software.
#
# Permission is granted to anyone to use this software for any purpose,
# including commercial applications, and to alter it and redistribute it
# freely, subject to the following restrictions:
#
#   1. The origin of this software must not be misrepresented; you must not
#      claim that you wrote the original software. If you use this software
#      in a product, an acknowledgment in the product documentation would be
#      appreciated but is not required.
#
#   2. Altered source versions must be plainly marked as such, and must not
#      be misrepresented as being the original software.
#
#   3. This notice may not be removed or altered from any source distribution.
from pandac.PandaModules import *
loadPrcFileData("", "window-type offscreen")
loadPrcFileData("", "win-size 400 200")
import sys
import random
from direct.gui.OnscreenText import OnscreenText
import direct.directbase.DirectStart
base.disableMouse()
#store captcha code from 1st command line argument.
ccode = sys.argv[1].upper()#no lowercase glyphs
length = len(ccode) #compute length of ccode
#the FONTS tuple contains tuples of the form ('font name','forbidden glyphs')
#Because not all the glyphs in a font are legible, they can be specified as
# 'forbidden glyphs' which will never be displayed. Note when trying new fonts:
#If a particular glyph is ambiguous, or unreadable to a human out of context,
# it should be forbidden.
FONTS = (('Schizm.ttf','GNU0'),('SF Wasabi.ttf','GS56910'),
    ('Scratch my back.ttf','0'),('Scratch my back.ttf','0'),#double chance ;)
    ('Spotted Fever.ttf','I0'),('Shattered.ttf','EIL10'),
    ('Staccatissmo.ttf','I0'),('Barred Out.ttf','BI'))
def ChooseGlyph(tNode,char):#chooses an appropriate glyph from among FONTS
    tNode.setText(char)#assign the character to the TextNode
    choices = list(FONTS) #thaw tuple
    while 1:
        randy = random.randint(0,len(choices)-1)#Choose from FONTS at random
        if choices[randy][1].find(char) == -1:#if absent from forbidden glyphs
            font = choices[randy][0]#then use that font
            break #success!
        choices.pop(randy)#remove forbidden font and try again.
    tNode.setFont(loader.loadFont(font))#assign the font
def SavePic():
    base.screenshot('captcha.jpg',defaultFilename = False)
    sys.exit()#end program after saving image
base.setBackgroundColor(0.25,0.6,0.6)#blue sky
dumnod = aspect2d.attachNewNode("dumnod")#dummynode to parent the TextNodes
dumnod.setScale(3)#scaling blurs the glyphs
for i in range(length):#Create image. One TextNode per character in ccode.
    node = TextNode("tn%s" % i)#new TextNode: tn#
    ChooseGlyph(node,ccode[i:i+1])#choose a glyph to represent the character
    path = dumnod.attachNewNode(node)#attach to the dummynode
    path.setScale(0.19)#scale glyph
    path.setPos((-length/2+i)*.15,0,(random.random()-1)*.1)#position it
    path.setR(random.randint(-10,15))#rotate a little
watermarkText = "Panda3D Forums  "*30#this is a bot distraction
watermark = OnscreenText(text = watermarkText,fg=(1,1,0.5,.23),wordwrap=33,
    shadow=(1,0,0,.2), pos = (-1,1), scale = 0.25)#render watermark
base.camera.setPos(0,0,5)#put the camera on a tripod
base.camera.setH(random.randint(1,360))#randomize camera angle
environ = loader.loadModel("models/environment")#the built-in panda environment
environ.reparentTo(render)
environ.setScale(0.25,0.25,0.25)
environ.setPos(-8,42,0)
taskMgr.doMethodLater(1, SavePic, 'Save Task',extraArgs = [])#do 1 sec later
run()#run

The line

loadPrcFileData("", "window-type offscreen")

causes errors in Panda-1.2.3. You must supply a string to display as the first command line argument or it won't work.
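As written, the script stops with an IndexError when no argument is given, and ChooseGlyph would raise a ValueError if a character were ever forbidden in every font (that cannot happen with the current FONTS tuple, but it could after adding fonts). A minimal sketch of a more defensive version of those two pieces, in plain Python with no Panda3D calls; the names `read_code` and `choose_font` and the script name `captcha.py` are assumptions, not part of the original:

```python
import random
import sys

# Same (font file, forbidden glyphs) pairs as the FONTS tuple in the script.
FONTS = (('Schizm.ttf', 'GNU0'), ('SF Wasabi.ttf', 'GS56910'),
         ('Scratch my back.ttf', '0'), ('Scratch my back.ttf', '0'),
         ('Spotted Fever.ttf', 'I0'), ('Shattered.ttf', 'EIL10'),
         ('Staccatissmo.ttf', 'I0'), ('Barred Out.ttf', 'BI'))


def read_code(argv):
    """Return the upper-cased CAPTCHA code, or exit with a usage message."""
    if len(argv) < 2 or not argv[1]:
        sys.exit("usage: captcha.py CODE")  # hypothetical script name
    return argv[1].upper()


def choose_font(char, fonts=FONTS, rng=random):
    """Pick a random font whose forbidden glyphs do not include char."""
    allowed = [name for name, forbidden in fonts if char not in forbidden]
    if not allowed:
        raise ValueError("no font can display %r" % char)
    return rng.choice(allowed)
```

Filtering first and then choosing gives the same odds as the retry loop in ChooseGlyph, including the double chance for the font listed twice.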

Tell me what you think!

ThomasEgi · October 2, 2006, 12:40pm

quite good, and might mislead captcha bots with the panda forum text in the background… but the C in the picture could also be an O due to the H hiding a part of it. in the case of this word it's no problem, but with generic alphanumeric code it could get tricky. with a bit of further tweaking =) really nice.

felix_kytt · October 4, 2006, 9:45pm

might not be a good idea because those spambots have been compiled by someone who might find your little script, yet again…

Cyan · October 5, 2006, 1:39am

That’s absurd. Graphics rendering and pattern recognition are two completely different things. That spammers may find the code is totally irrelevant. According to Wikipedia:

felix_kytt · October 5, 2006, 3:32am

lol, and that's why i added that yet again… in the end, you know, just-in-case! ^^

ynjh_jo · October 6, 2006, 12:11pm

Some suggestions for better distraction :

assign different color to each character
add a little transparency to it
[1&2] --> to give better appearance for overlapping characters.
use more fonts

These are some suitable fonts I know :

Cyan · October 6, 2006, 6:31pm

According to Wikipedia there are three steps spambots’ AIs use to defeat this kind of CAPTCHA:

Removal of background clutter, for example with color filters and detection of thin lines.

Segmentation, i.e. splitting the image into segments containing a single letter.

Identifying the letter for each segment.

Computers can now do step 1 better than humans can. Their problem is step 2. Neural network algorithms can be trained to defeat step 3 very quickly and reliably once step 2 is done.

ynjh_jo:

assign different color to each character

Bad idea. The whole point of my CAPTCHA is step 2: to confuse the computer’s segmentation algorithm. In other words, make it difficult to tell where one letter ends and the next begins. That’s why I have these shattered-looking fonts and I allow them to overlap. If the glyphs are different colors, then they would be too easy to segment.

add a little transparency to it

Yeah, good idea, just not too much or the humans can’t read it either. That should be easy enough to implement.

[1&2] → to give better appearance for overlapping characters.

Again, step 2 is the whole point. Humans are much better at segmentation than computers. Even if we get it wrong sometimes the phpBB allows us three attempts before timeout.

use more fonts

Yeah, good idea. I specifically designed the script to make it easy to add more fonts. It’s as simple as adding a new (‘font’,‘glyphs’) tuple to the FONTS tuple. I looked at your recommended fonts. Most would be very good additions, provided you properly forbid their ambiguous glyphs.
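To make the segmentation point concrete, here is a toy sketch of about the simplest segmentation technique there is: splitting a black-and-white image wherever a column contains no ink. Real spambots presumably use something smarter, and the tiny "images" below are invented for illustration, but it shows why overlapping glyphs are a problem for this step.

```python
def segment_columns(rows):
    """Split a binary image into runs of adjacent columns that contain ink ('#')."""
    width = max(len(r) for r in rows)
    inked = [any(c < len(r) and r[c] == '#' for r in rows) for c in range(width)]
    segments, start = [], None
    for col, has_ink in enumerate(inked):
        if has_ink and start is None:
            start = col                      # a new glyph begins here
        elif not has_ink and start is not None:
            segments.append((start, col - 1))  # an empty column ends the glyph
            start = None
    if start is not None:
        segments.append((start, width - 1))
    return segments


# Two glyphs with a clear gap between them.
separated = ["##..##",
             "#...#.",
             "##..##"]
# Two glyphs pushed together so that no column is empty between them.
overlapping = ["##.##.",
               "#.##..",
               "####.."]

print(segment_columns(separated))    # [(0, 1), (4, 5)]
print(segment_columns(overlapping))  # [(0, 4)]
```

With a gap, each glyph comes out as its own segment and can be handed to step 3. Once the glyphs overlap, the whole word comes out as a single blob.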

neighborlee · October 9, 2006, 1:33am

Cyan:

According to Wikipedia there are three steps spambots’ AIs use to defeat this kind of CAPTCHA:

Removal of background clutter, for example with color filters and detection of thin lines.

Segmentation, i.e. splitting the image into segments containing a single letter.

Identifying the letter for each segment.

Computers can now do step 1 better than humans can. Their problem is step 2. Neural network algorithms can be trained to defeat step 3 very quickly and reliably once step 2 is done.

ynjh_jo:

Some suggestions for better distraction :

assign different color to each character

Bad idea. The whole point of my CAPTCHA is step 2: to confuse the computer’s segmentation algorithm. In other words, make it difficult to tell where one letter ends and the next begins. That’s why I have these shattered-looking fonts and I allow them to overlap. If the glyphs are different colors, then they would be too easy to segment.

add a little transparency to it

Yeah, good idea, just not too much or the humans can’t read it either. That should be easy enough to implement.

[1&2] → to give better appearance for overlapping characters.

Again, step 2 is the whole point. Humans are much better at segmentation than computers. Even if we get it wrong sometimes the phpBB allows us three attempts before timeout.

use more fonts

Yeah, good idea. I specifically designed the script to make it easy to add more fonts. It’s as simple as adding a new (‘font’,‘glyphs’) tuple to the FONTS tuple. I looked at your recommended fonts. Most would be very good additions, provided you properly forbid their ambiguous glyphs.

nice I hope this works out…I must admit I had no idea the ‘H’ was even there…only until I saw topic of post was I sure I had to ‘find’ it…but then again maybe I’m being blind today ?

anyway great stuff !

cheers
neighborlee()

bigfoot29 · October 9, 2006, 1:51pm

Well, when you know what’s written it can be read easily. However, with a completely “senseless” letter/digit mix it might get quite a bit harder to get the needed result.

Edit: Geez… I need to get an Ubuntu vserver up to let that script run

Regards, Bigfoot29

Fixer · October 9, 2006, 10:30pm

I’d lose the Barred Out font; that one’s pretty difficult for my poor brain

You may also want to cut out the middle man by compiling the fonts into texture content using egg-mkfont. Then having the actual font packages would no longer be necessary.

All-in-all, I’d say it’s a good idea!

Take care,
Mark

ynjh_jo · October 19, 2006, 8:57am

I believe you already know this visual trick:
“move your head a little away from the screen and you’ll see the letters!”
How about that? Will AI break it as simply as by downsampling the image?

How about a motion picture CAPTCHA, in which the letters are kept distorted, partially covered, moving around & overlapping each other?
The simplest implementation is an animated GIF, but will it work on most browsers?

ThomasEgi · October 19, 2006, 2:36pm

animated gif should be fine with almost every GUI based browser. the few freaky people with text based ones aren’t able to read captcha anyway^^
but the “move your head away from screen” is quite bad because:
-A people with very small screens won’t be able to recognize anything (like the geeks @ cash register in your supermarket)
-B some people could die because the distance between head and screen would become too long and the radiation to keep them alive would be missing.
-C head-up-display users will complain^^
-D you can’t expect a user to actually MOVE

bigfoot29 · October 19, 2006, 4:13pm

sooo… we have a small joker here ^^

animated gifs should be possible as well… make a bunch of jpg-images and then convert them together to a gif… first thing is possible with Panda3D for the other thingie there should also be a TON of command line tools

Regards, Bigfoot29

Cyan · October 19, 2006, 5:59pm

That is already one of the step 1 techniques used, but it doesn’t help much with step 2: the segmentation.

Again, bad idea. The spambots don’t “see” the image, they “see” a file. It could be a simple matter of taking a single frame and running the algorithm on that. Or worse, run the algorithm on each frame for error correction. By animating it you are just giving the AI more to work with.
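A small sketch of the "error correction" idea, to show how little work it is on the bot's side. It assumes the bot already has some recognizer that returns one guess per frame; the guesses below are made up, and `vote` is a hypothetical helper name.

```python
from collections import Counter


def vote(frame_guesses):
    """Combine per-frame guesses by taking the most common character at each position."""
    result = []
    for chars in zip(*frame_guesses):
        result.append(Counter(chars).most_common(1)[0][0])
    return "".join(result)


# Invented per-frame guesses: two of the four frames contain one mistake each.
guesses = ["PANOA", "PANDA", "RANDA", "PANDA"]
print(vote(guesses))  # PANDA
```

Each extra frame is another vote, so mistakes that differ from frame to frame tend to cancel out.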

I think my CAPTCHA version will work fine. I suppose with time and effort someone could develop an algorithm that defeats it at least some of the time, but why bother? To spam a single forum? Let’s face it, we’re not that important! (Not yet anyway.)

bigfoot29 · October 19, 2006, 6:48pm

I guess you are right… my problem is: I lack the time for testing… With Panda3D running in a Xen virtual machine, all I need to do is write a script that calls your code and returns the image to the web browser (CGI mode). Since I doubt that we will get 1000 new users a day, I guess it will be a sufficient solution…
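Such a wrapper might look roughly like the sketch below. It is untested against a real web server; the script name `captcha.py`, the choice of interpreter, the alphabet, and all function names are assumptions. It relies only on what the posted script does: take the code as the first argument and write captcha.jpg into the current directory.

```python
import random
import string
import subprocess
import sys

# Glyphs that are easy to confuse are left out; this alphabet is an assumption.
ALPHABET = [c for c in string.ascii_uppercase + string.digits if c not in "O0I1L"]


def make_code(length=5, rng=random):
    """Return a random code made of unambiguous characters."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def render(code, script="captcha.py", python=sys.executable):
    """Run the renderer (hypothetical file name) and return the JPEG bytes."""
    subprocess.check_call([python, script, code])
    with open("captcha.jpg", "rb") as f:
        return f.read()


def cgi_response(image_bytes):
    """Build the raw CGI response: headers, a blank line, then the image."""
    header = ("Content-Type: image/jpeg\r\n"
              "Content-Length: %d\r\n\r\n" % len(image_bytes))
    return header.encode("ascii") + image_bytes

# Hypothetical use inside the CGI script:
# sys.stdout.buffer.write(cgi_response(render(make_code())))
```

The code also has to be remembered on the server side so the answer can be checked later, and since every request overwrites the same captcha.jpg, concurrent requests would need separate working directories.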

Regards, Bigfoot29