Asp Forum - Re: [ANN] flatulent-0.0.1 ascii captcha for the masses

Brian Candler

7/3/2007 2:45:00 PM

> the flatulent gem provides brain dead simple ascii art captcha for
> ruby.

Hmm, maybe I'm missing the point, but aren't these really easy for bots to
decode?

Looking at the examples at http://drawohara.tumblr.com/po...
I see that random ASCII characters have been dropped around. But these can
be removed trivially, e.g.

gsub!(/[^\/|\\_()\n]/,' ')

Only minimal damage has been done to the original characters, which are now
easy to pattern-match. It therefore seems that the randomly-strewn
characters have the effect of making it more difficult for visually-impaired
users to access the site, but cause very little impediment to bots :-(
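
To make the point concrete, here is a minimal sketch of that stripping step applied to a made-up sample. The sample art and the names NOISE and strip_noise are assumptions for illustration only; the character class is the one from the gsub! above.

```ruby
# Matches anything that is NOT one of the stroke characters / | \ _ ( )
# or a newline. Same character class as the one-liner above.
NOISE = %r{[^/|\\_()\n]}

# Return a copy of the art with every noise character blanked out.
def strip_noise(art)
  art.gsub(NOISE, ' ')
end

# Made-up sample: a glyph outline with a few stray characters dropped in.
noisy = <<'ART'
 _x_ q
| |_)
|_z_/#
ART

puts strip_noise(noisy)
```

What is left is only the glyph strokes on a regular grid, which is the part a bot would then pattern-match.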

Now, if this ASCII art were turned into a PNG I guess that would make it a
bit harder - although not much, since it's pretty trivial to OCR a clean
grid of ASCII characters back to ASCII, albeit more computationally
expensive.

Perhaps this captcha will be useful if very few sites use it, so the
spammers don't bother writing a decoder. But in that case you don't want it
used by "the masses" :-)

Regards,

Brian.

11 Answers

Chad Perrin

7/3/2007 5:26:00 PM

On Wed, Jul 04, 2007 at 02:07:07AM +0900, ara.t.howard wrote:
>
> have fun putting that together. to do it you need to render, not
> just parse, html! no, let's just say you drive firefox to render the
> html, and then clip out the image, saving it as a tiff. this is the
> result http://... gives for the above ascii captcha
>
> l_ "l 1 l_
> _ J / l_ ' / / _l
> / ?
> iM=,

Is that supposed to be readable?

--
CCD CopyWrite Chad Perrin [ http://ccd.ap... ]
Isaac Asimov: "Part of the inhumanity of the computer is that, once it is
completely programmed and working smoothly, it is completely honest."

Sammy Larbi

7/4/2007 11:53:00 PM

ara.t.howard wrote:
>
> - image has an encoded timebomb in it: attacker has only 60s for
> post. this just rules out brute force attacks.

From when does it start counting? If I've read a blog post and then
try to comment, it's likely I've already used more than 60 seconds. In
fact, probably most of the time I take more than 60 seconds to comment
by itself.

I think a good protection scheme will take into account several factors,
assign them points for failure (or passing), and once a threshold has
been reached, fail the entire thing (or pass it, if you chose that route).
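
A rough sketch of what I mean, with made-up signals and weights. None of these names or numbers come from flatulent; they are placeholders to show the shape of the idea.

```ruby
# Hypothetical checks and the points each one costs when it fails.
WEIGHTS = {
  captcha_wrong: 5,
  too_fast:      3, # form came back implausibly quickly
  expired:       2, # time limit ran out
  many_links:    2  # comment body stuffed with URLs
}.freeze

# Reject once the failed checks add up to this many points.
THRESHOLD = 5

# Sum the weights of every failed check.
def spam_score(failed_checks)
  failed_checks.sum { |check| WEIGHTS.fetch(check) }
end

def reject?(failed_checks)
  spam_score(failed_checks) >= THRESHOLD
end

p reject?([:expired])
p reject?([:expired, :too_fast])
```

With these weights a slow human who only trips the time limit still gets through, while a submission that fails two checks does not.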

Sam

John Joyce

7/5/2007 12:18:00 AM

On Jul 4, 2007, at 6:52 PM, Sammy Larbi wrote:

> ara.t.howard wrote:
>>
>> - image has an encoded timebomb in it: attacker has only 60s for
>> post. this just rules out brute force attacks.
>
> From when does it start counting? If I've read a blog post and
> then try to comment, it's likely I've already used more than 60
> seconds. In fact, probably most of the time I take more than 60
> seconds to comment by itself.
>
> I think a good protection scheme will take into account several
> factors, assign them points for failure (or passing), and once a
> threshold has been reached, fail the entire thing (or pass it, if
> you chose that route).
>
> Sam
>
>
you should do like blogger (blogspot) and others, allow writing and
after clicking on 'submit' or 'post' or whatever to submit the form
info, you then redirect to a page with the captcha and a submit.
after the captcha page is sent, begin the count. 60 seconds seems a
bit short for a whole post, but with a separate redirect to the
captcha page, it's totally reasonable. If it takes longer, redirect
again to a new captcha. After 3 or 4 failed attempts, save it in a
log, kill that cookie and require a fresh start or a harder captcha.

Don't put the count in JavaScript EVER. Client side code is totally
spoof-able.
All you need is the session data in the cookie to identify the user
and check to see if the response came quick enough.
60 seconds might not be long enough, but a browser will time out
during that long of a wait for a request's response. Still a little
longer might be appropriate from an accessibility standpoint.
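
Roughly, the server-side bookkeeping could look like the sketch below. CaptchaSession is a made-up in-memory stand-in for whatever session store you actually use, and the 60 second limit and 3 failures are just the numbers from this thread, not anything flatulent defines.

```ruby
# Hypothetical server-side record for one captcha challenge.
class CaptchaSession
  TTL          = 60 # seconds allowed between serving and answering
  MAX_FAILURES = 3  # wrong answers allowed before locking out

  attr_reader :failures

  def initialize(answer, now: Time.now)
    @answer    = answer
    @served_at = now # the count starts when the captcha page is sent
    @failures  = 0
  end

  # Returns :ok, :expired, :wrong or :locked. The clock is read on the
  # server, so nothing the client sends can stretch the deadline.
  def check(response, now: Time.now)
    return :locked  if @failures >= MAX_FAILURES
    return :expired if now - @served_at > TTL
    return :ok      if response == @answer

    @failures += 1
    @failures >= MAX_FAILURES ? :locked : :wrong
  end
end
```

On :expired you would redirect to a fresh captcha, and on :locked you would log it and require a fresh start.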

ara.t.howard

7/5/2007 2:53:00 AM

On Jul 4, 2007, at 5:52 PM, Sammy Larbi wrote:

> From when does it start counting?

from when the response is served. a timebomb is encoded in the form
with a server-side key.

to set the key

Flatulent.key = 'hostname or something'

to set the timebomb threshold

Flatulent.ttl = 120 # seconds

> If I've read a blog post and then try to comment, it's likely
> I've already used more than 60 seconds. In fact, probably most of
> the time I take more than 60 seconds to comment by itself.
>
> I think a good protection scheme will take into account several
> factors, assign them points for failure (or passing), and once a
> threshold has been reached, fail the entire thing (or pass it, if
> you chose that route).

indeed, internally we will also track ip and greylist after n
attempts.

cheers.

-a
--
we can deny everything, except that we have the possibility of being
better. simply reflect on that.
h.h. the 14th dalai lama

ara.t.howard

7/5/2007 2:56:00 AM

On Jul 4, 2007, at 6:18 PM, John Joyce wrote:

> you should do like blogger (blogspot) and others, allow writing and
> after clicking on 'submit' or 'post' or whatever to submit the form
> info, you then redirect to a page with the captcha and a submit.
> after the captcha page is sent, begin the count. 60 seconds seems a
> bit short for a whole post, but with a separate redirect to the
> captcha page, it's totally reasonable. If it takes longer, redirect
> again to a new captcha. After 3 or 4 failed attempts, save it in a
> log, kill that cookie and require a fresh start or a harder captcha.
>
> Don't put the count in JavaScript EVER. Client side code is totally
> spoof-able.
> All you need is the session data in the cookie to identify the user
> and check to see if the response came quick enough.
> 60 seconds might not be long enough, but a browser will time out
> during that long of a wait for a request's response. Still a little
> longer might be appropriate from an accessibility standpoint.

all good ideas - for now i'm just trying to get something working.
fyi all the stuff is client side, however the captcha and timebomb
have been blowfish encoded into hidden fields with a key known only
to the server. one could make guesses, but that's about all.
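
for the curious, a minimal sketch of the hidden-field idea. this is not flatulent's actual code: it swaps blowfish encryption for an hmac signature, which is enough to show why the client cannot change the expiry or learn the answer. SERVER_KEY, TTL, issue_token, valid_token? and same? are all made-up names.

```ruby
require 'openssl'

SERVER_KEY = 'hostname or something' # placeholder secret, known only to the server
TTL        = 120                     # seconds

# Sign the expiry time together with the expected answer.
def sign(expires, answer)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('SHA256'), SERVER_KEY, "#{expires}:#{answer}")
end

# Value for the hidden form field: "expiry:signature".
def issue_token(answer, now: Time.now.to_i)
  expires = now + TTL
  "#{expires}:#{sign(expires, answer)}"
end

# Compare two strings without stopping at the first differing byte.
def same?(a, b)
  return false unless a.bytesize == b.bytesize
  a.bytes.zip(b.bytes).reduce(0) { |acc, (x, y)| acc | (x ^ y) }.zero?
end

# True only if the token is unexpired, untampered, and the response matches.
def valid_token?(token, response, now: Time.now.to_i)
  expires, sig = token.split(':', 2)
  return false if sig.nil? || expires.to_i < now
  same?(sign(expires, response), sig)
end
```

note that a token like this can be replayed until it expires unless the server also remembers which ones it has already accepted.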

cheers.

-a
--
we can deny everything, except that we have the possibility of being
better. simply reflect on that.
h.h. the 14th dalai lama

Chad Perrin

7/5/2007 7:32:00 AM

On Thu, Jul 05, 2007 at 07:31:43AM +0900, ara.t.howard wrote:
> On Jul 4, 2007, at 12:47 PM, Brian Candler wrote:
> >
> >Hmm - this risks making the captcha visible by fewer and fewer
> >browsers. OK,
> >so lynx wouldn't be able to view a PNG captcha either; but you risk
> >locking
> >out a lot of mobile devices, set-top boxes and other embedded web
> >browsers
> >(which could otherwise display a PNG quite happily)
> >
> >However, perhaps ASCII-art generation (as a form of unusual and
> >disjointed
> >character set) combined with server-side rendering to a PNG would
> >get around
> >that issue, save you a lot of work in obfuscating the HTML itself,
> >and also
> >be harder to parse.
>
> true. i'm not too worried about that though.

I'd be worried about the JavaScript and CSS requirements. In fact, I
won't use a system for validating humanity that doesn't work in Lynx,
unless some other necessary functionality of the website absolutely
cannot work in Lynx (such as Flash animations). Even then, I'd probably
avoid something that won't work in Lynx, since (for example) a Lynx user
could navigate to YouTube and do a search to find a particular video,
then use youtube-dl to download it to the computer and play it using
MPlayer. No in-browser support for Flash video needed. YouTube can be a
useful website for a Lynx user -- so mine should be, too, since I don't
even provide Flash videos as the main content of any of my websites.

--
CCD CopyWrite Chad Perrin [ http://ccd.ap... ]
McCloctnick the Lucid: "The first rule of magic is simple. Don't waste your
time waving your hands and hopping when a rock or a club will do."

Chad Perrin

7/5/2007 7:35:00 AM

On Thu, Jul 05, 2007 at 09:18:01AM +0900, John Joyce wrote:
> >
> you should do like blogger (blogspot) and others, allow writing and
> after clicking on 'submit' or 'post' or whatever to submit the form
> info, you then redirect to a page with the captcha and a submit.
> after the captcha page is sent, begin the count. 60 seconds seems a
> bit short for a whole post, but with a separate redirect to the
> captcha page, it's totally reasonable. If it takes longer, redirect
> again to a new captcha. After 3 or 4 failed attempts, save it in a
> log, kill that cookie and require a fresh start or a harder captcha.

Avoid reliance on cookies. For one thing, cookies can be forged. For
another, you'll lose a lot of people with requirements for cookies.
Modern browsers tend to provide a means for selectively refusing cookies,
and a lot of people use those features.

>
> Don't put the count in JavaScript EVER. Client side code is totally
> spoof-able.
> All you need is the session data in the cookie to identify the user
> and check to see if the response came quick enough.

Session data need not be stored in a cookie. There are other ways to do
it as well -- allow for those who won't (or can't) accept cookies.

--
CCD CopyWrite Chad Perrin [ http://ccd.ap... ]
McCloctnick the Lucid: "The first rule of magic is simple. Don't waste your
time waving your hands and hopping when a rock or a club will do."

Alex Young

7/5/2007 9:13:00 AM

ara.t.howard wrote:
> - image has an encoded timebomb in it: attacker has only 60s for post.
> this just rules out brute force attacks.

Only naive ones... The brute force attack that still works is a
birthday attack. Using that they can try attacks as fast as they can
generate *new* captchas - you expect a collision every 1.2*sqrt(n)
attempts, where n is the size of your keyspace. That's probably OK for
"captcha per comment" sites, but it's dangerous for "captcha per
account" sites.
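
To put a number on it, here is a quick sketch. The keyspace of 36**4 (four characters drawn from a-z and 0-9) is only an assumption for illustration; plug in whatever the real captcha uses. The function names are made up.

```ruby
# Rule of thumb from above: roughly how many random draws from n
# equally likely values are needed before a repeat becomes likely.
def expected_attempts(n)
  1.2 * Math.sqrt(n)
end

# Probability that k random draws from n values contain at least one
# repeat, using the standard approximation 1 - exp(-k(k-1)/(2n)).
def collision_probability(k, n)
  1.0 - Math.exp(-k * (k - 1) / (2.0 * n))
end

keyspace = 36**4 # assumed keyspace, see above
k = expected_attempts(keyspace).round
percent = (collision_probability(k, keyspace) * 100).round
puts "keyspace #{keyspace}: about #{k} captchas for a #{percent}% chance of a repeat"
```

The point is that the work grows with the square root of the keyspace, not with the keyspace itself.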

--
Alex

M. Edward (Ed) Borasky

7/5/2007 2:02:00 PM

Chad Perrin wrote:
> On Thu, Jul 05, 2007 at 07:31:43AM +0900, ara.t.howard wrote:
>> On Jul 4, 2007, at 12:47 PM, Brian Candler wrote:
>>> Hmm - this risks making the captcha visible by fewer and fewer
>>> browsers. OK,
>>> so lynx wouldn't be able to view a PNG captcha either; but you risk
>>> locking
>>> out a lot of mobile devices, set-top boxes and other embedded web
>>> browsers
>>> (which could otherwise display a PNG quite happily)
>>>
>>> However, perhaps ASCII-art generation (as a form of unusual and
>>> disjointed
>>> character set) combined with server-side rendering to a PNG would
>>> get around
>>> that issue, save you a lot of work in obfuscating the HTML itself,
>>> and also
>>> be harder to parse.
>> true. i'm not too worried about that though.
>
> I'd be worried about the JavaScript and CSS requirements. In fact, I
> won't use a system for validating humanity that doesn't work in Lynx,
> unless some other necessary functionality of the website absolutely
> cannot work in Lynx (such as Flash animations). Even then, I'd probably
> avoid something that won't work in Lynx, since (for example) a Lynx user
> could navigate to YouTube and do a search to find a particular video,
> then use youtube-dl to download it to the computer and play it using
> MPlayer. No in-browser support for Flash video needed. YouTube can be a
> useful website for a Lynx user -- so mine should be, too, since I don't
> even provide Flash videos as the main content of any of my websites.
>

I was once a proud member of the "This Web Site Best Viewed With Lynx"
club. Ah, the good old days, when a dollar was worth a dime, Netscape
was more popular than Internet Explorer and nobody's cat had a web page.
;). I learned HTML by editing that web page by hand and generating code
with Perl 4 on an HP100 Pocket PC.

Well, maybe *somebody's* cat had a web page ...

ara.t.howard

7/5/2007 2:49:00 PM

On Jul 5, 2007, at 3:12 AM, Alex Young wrote:

> Only naive ones... The brute force attack that still works is a
> birthday attack. Using that they can try attacks as fast as they
> can generate *new* captchas - you expect a collision every 1.2*sqrt
> (n) attempts, where n is the size of your keyspace. That's
> probably OK for "captcha per comment" sites, but it's dangerous for
> "captcha per account" sites.

yes. although flatulent does work without sessions, it's about
three lines to save the data into the session and validate against
that as well so a cautious person would be wise to do so. i'll be
adding automatic session validation for rails - but the api should
make it super easy anywhere.

cheers.

-a
--
we can deny everything, except that we have the possibility of being
better. simply reflect on that.
h.h. the 14th dalai lama

comp.lang.ruby
