comp.lang.ruby

Wiki Spam Report

Jim Weirich

12/13/2004 6:21:00 PM

Wiki Spam Report
----------------

I thought I would take some time and report on the wiki spam situation
on RubyGarden. As I hope you have noticed, the wiki has been
remarkably spam free. This email will tell you what measures we have
taken to get to this point.

But first ...

Some Numbers
------------

Over the past 10 days, we have had:

93 updates to the wiki page, all (AFAICT) spam free.
(although I might have missed spotting some).

46 updates to the wiki tarpit. Of those, we had ...
    3 innocent updates
    2 questionable updates
    1 update by me
   40 spams
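
To put the tarpit counts in proportion, here is a quick calculation in Python. It uses only the numbers listed above; the variable and function names are made up for illustration.

```python
# Counts for the 46 tarpit updates reported above.
tarpit_updates = {"innocent": 3, "questionable": 2, "by_me": 1, "spam": 40}

total = sum(tarpit_updates.values())


def share(kind):
    """Percentage of tarpit updates of the given kind, rounded to one decimal."""
    return round(100 * tarpit_updates[kind] / total, 1)


spam_share = share("spam")
innocent_share = share("innocent")

print(f"{spam_share}% of tarpit updates were spam")
print(f"{innocent_share}% came from innocent users caught by mistake")
```

So roughly 87% of what landed in the tarpit was spam, and about 6.5% came from innocent users.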

The Mechanism
-------------

Spammers are automatically routed to a wiki tarpit. The tarpit is an
(almost) exact copy of the real RubyGarden wiki. Making changes to
the tarpit looks as if you are making changes to the real wiki. And
since spammers get their pages from the tarpit, it looks (to them)
as if they have successfully spammed our site.

However, no one else ever gets to see the spam.

By tricking the spammers into thinking they are successful, they don't
put any additional effort into bypassing our spam detection criteria.
This is important! When we explicitly denied them access to the wiki,
they went to great lengths to figure out how to get around the
restrictions. I haven't seen any of that kind of probing with the
tarpit.
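
The routing idea can be sketched roughly as follows. This is not the RubyGarden code (which is not shown in this post); it is a minimal Python illustration that assumes two in-memory page stores and a hypothetical `suspected_spammer` flag decided elsewhere. The point is that a visitor reads from and writes to the same store, so a spammer sees their own edits while everyone else sees none of them.

```python
class TarpitWiki:
    """Toy model: two page stores with the same interface (hypothetical names)."""

    def __init__(self):
        self.real = {}    # pages that ordinary visitors see
        self.tarpit = {}  # look-alike copy that suspected spammers see

    def _store(self, suspected_spammer):
        # One decision per request, used for both reading and writing.
        return self.tarpit if suspected_spammer else self.real

    def save(self, name, text, suspected_spammer):
        self._store(suspected_spammer)[name] = text

    def load(self, name, suspected_spammer):
        return self._store(suspected_spammer).get(name, "")


wiki = TarpitWiki()
wiki.save("HomePage", "Welcome to the wiki", suspected_spammer=False)
wiki.save("HomePage", "BUY CHEAP WATCHES", suspected_spammer=True)

print(wiki.load("HomePage", suspected_spammer=True))
print(wiki.load("HomePage", suspected_spammer=False))
```

The suspected spammer gets back the page they just wrote, while a regular visitor still gets the untouched page.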

Detecting Spammers
------------------

The current spammer detection logic is based on two observations:

(1) Spammers almost never use an IP address that has reverse lookup
enabled. This effectively means that it appears (to the wiki
software) that your host name looks like a numeric IP address.

(2) Spammers almost never set user preferences on the wiki.

So if both of these conditions are true, we treat the access as a spammer
and send it to the tarpit.

Now this isn't perfect, but that's OK. We also have an explicit ban
list for spammers who pass one of (1) or (2) above. And we have an
explicit allow list that overrides the automatic spammer detection.
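
The decision described above might look something like this in Python. It is only a sketch under stated assumptions: the function names and parameters are hypothetical, and checking the allow list before the ban list is my guess, since the post only says the allow list overrides the automatic detection.

```python
import ipaddress


def is_numeric_host(hostname):
    """True when the 'host name' is just an IP address (no reverse lookup)."""
    try:
        ipaddress.ip_address(hostname)
        return True
    except ValueError:
        return False


def route_to_tarpit(ip, hostname, has_preferences,
                    ban_list=frozenset(), allow_list=frozenset()):
    """Return True if this access should be sent to the tarpit."""
    if ip in allow_list:   # explicit allow list wins (ordering is an assumption)
        return False
    if ip in ban_list:     # explicit ban list catches spammers who pass (1) or (2)
        return True
    # Automatic rule: both observations (1) and (2) must hold.
    return is_numeric_host(hostname) and not has_preferences


# Example addresses come from the reserved documentation ranges.
print(route_to_tarpit("203.0.113.5", "203.0.113.5", has_preferences=False))
print(route_to_tarpit("203.0.113.5", "host.example.org", has_preferences=False))
```

Requiring both conditions keeps false positives down: a user with no reverse lookup is left alone as soon as they set preferences.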

Innocent Users
--------------

Can innocent users get trapped by the tarpit? The short answer is yes.
However, we are monitoring the tarpit and will attempt to rescue such
users.

In the past 10 days, there were at least 3 page updates that were from
innocent users. One guy (bless his heart) even removed some spam from
the tarpit for us.

When I see innocents trapped in the tarpit, I add their IP address to
the allow list and manually update the wiki with their changes (if
they are significant).

Detecting the Tarpit?
---------------------

The tarpit is deliberately designed to look like the original wiki, so
it is sometimes difficult to tell when you are trapped. Here are some
suggestions.

You are probably in the Tarpit when:

* there are a lot of recent updates made with numeric IP addresses
rather than host names.

* a lot of the pages have spam.

Neither of these suggestions is foolproof, though. I refresh the tarpit
from the real wiki occasionally (to keep it looking realistic).
Immediately after a refresh it is /very/ difficult to tell the difference.

If you think you are trapped by the tarpit, send me
(jim@weirichhouse.org) an email with your IP address and I will check
the logs. If you are trapped, we can add your IP address to the allow
list.

If you are worried about getting caught in the tarpit, just make sure you
have your user preferences set when accessing the wiki (click on the
preferences link from any wiki page).

Summary
-------

I am pretty happy with the current wiki situation. In fact, the
tarpit has been so successful that I am considering lifting the ban
on lower case http. The ban currently isn't buying us any benefits
and is rather annoying (I'll make it so both upper and lower case
work).

Thanks for your time.

--
-- Jim Weirich jim@weirichhouse.org http://onest...
-----------------------------------------------------------------
"Beware of bugs in the above code; I have only proved it correct,
not tried it." -- Donald Knuth (in a memo to Peter van Emde Boas)



10 Answers

martinus

12/13/2004 9:27:00 PM

That is a very cool idea! But I am afraid this posting is the reason
why http://www.rubygarde... currently is under attack from a lot
of spammers.

martinus

David G. Andersen

12/13/2004 9:37:00 PM

On Tue, Dec 14, 2004 at 06:27:18AM +0900, martinus scribed:
> That is a very cool idea! But I am afraid this posting is the reason
> why http://www.rubygarde... currently is under attack from a lot
> of spammers.

Are you sure you didn't slip into the tarpit? :) It looks fine
from here.

-Dave wonders if it might not be nice to have multiple
spammer / legitimate user detection heuristics, though.


Edgardo Hames

12/14/2004 3:30:00 AM

On Tue, 14 Dec 2004 03:21:02 +0900, Jim Weirich <jim@weirichhouse.org> wrote:
> Wiki Spam Report
> ----------------
>
> I thought I would take some time and report on the wiki spam situation
> on RubyGarden. As I hope you have noticed, the wiki has been
> remarkably spam free. This email will tell you what measures we have
> taken to get to this point.
>

What does spam look like on a wiki site?

Thanks,
Ed
--
Pretty women make us buy beer, ugly women make us drink beer


James Britt

12/14/2004 3:51:00 AM

Edgardo Hames wrote:
> On Tue, 14 Dec 2004 03:21:02 +0900, Jim Weirich <jim@weirichhouse.org> wrote:
>
>>Wiki Spam Report
>>----------------
>>
>>I thought I would take some time and report on the wiki spam situation
>>on RubyGarden. As I hope you have noticed, the wiki has been
>>remarkably spam free. This email will tell you what measures we have
>>taken to get to this point.
>>
>
>
> What does spam look like on a wiki site?


The junk I'm accustomed to seeing are pages devoid of any Ruby content,
but filled with links to sites apparently hawking Natural Nigerian Rolex
Enhancements.

James


Jim Weirich

12/14/2004 4:15:00 AM

On Monday 13 December 2004 10:30 pm, Edgardo Hames wrote:
> On Tue, 14 Dec 2004 03:21:02 +0900, Jim Weirich <jim@weirichhouse.org> wrote:
> > Wiki Spam Report
> > ----------------
> >
> > I thought I would take some time and report on the wiki spam situation
> > on RubyGarden. As I hope you have noticed, the wiki has been
> > remarkably spam free. This email will tell you what measures we have
> > taken to get to this point.
>
> What does spam look like on a wiki site?

You asked ... Here's a link to an old page on the RubyGems wiki. Scroll down
to the bottom of the page.

http://rubygems.rubyforge.org/wiki/wiki.pl?action=browse&id=DeveloperGuide&r...



Curt Sampson

12/14/2004 4:46:00 AM

Jim Weirich

12/14/2004 6:20:00 AM

On Monday 13 December 2004 11:46 pm, Curt Sampson wrote:
> It looks like the sandbox is not working for 221.197.18.150. If you look
> at RecentChanges, he's changed a lot of pages to add what seems to be
> mostly Chinese links to Chinese websites. I've set preferences, and I'm
> still seeing it.

Curt is looking at the RubyGems wiki hosted by RubyForge (which has little to
no spam protection ... hopefully that will change in the near future).

The tarpit is for the RubyGarden wiki.



martinus

12/14/2004 7:08:00 AM

It was fine again after about 5 minutes; it seems that the system really
works.
And no, I didn't slip into the tarpit :-)

martinus

Martin DeMello

12/14/2004 11:13:00 AM

Jim Weirich <jim@weirichhouse.org> wrote:
> 46 updates to the wiki tarpit. Of those, we had ...
> 3 innocent updates
> 2 questionable updates
> 1 update by me
> 40 spams

Very gratifying :)

martin

slonik AZ

12/14/2004 12:41:00 PM

Each act of applying a change to a wiki can be seen as a "message".
One can use an email spam filter (with some modifications, of course) as
a first line of defense. If a proposed wiki change looks like spam to
the "modified email spam filter", the user is confronted with a set of
challenges, such as reading letters from a distorted bitmap or answering
silly questions like "Who invented Ruby?".
Alternatively, the spam filter silently forwards the change to the
tarpit, and also notifies the wiki admin of the change if the
probability that the message is legitimate is higher than a certain
threshold.
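
The second variant could be sketched like this in Python. It is an illustration only: the spam probability is assumed to come from some filter that is not shown, and both threshold values are hypothetical placeholders, not numbers from this thread.

```python
def handle_edit(spam_probability, spam_threshold=0.5, review_threshold=0.2):
    """Decide where an edit goes; returns (destination, notify_admin).

    spam_probability is assumed to be a score in [0, 1] from a spam filter.
    """
    if spam_probability < spam_threshold:
        # Does not look like spam: apply the edit to the real wiki.
        return ("wiki", False)
    # Looks like spam: forward silently to the tarpit. If there is still
    # a reasonable chance the edit is legitimate, flag it for the admin.
    legit_probability = 1.0 - spam_probability
    return ("tarpit", legit_probability > review_threshold)


print(handle_edit(0.10))
print(handle_edit(0.70))
print(handle_edit(0.95))
```

Borderline edits end up in the tarpit but get a human look, while obvious spam is dropped there without bothering anyone.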

--Leo--


On Tue, 14 Dec 2004 20:17:23 +0900, Martin DeMello
<martindemello@yahoo.com> wrote:
> Jim Weirich <jim@weirichhouse.org> wrote:
> > 46 updates to the wiki tarpit. Of those, we had ...
> > 3 innocent updates
> > 2 questionable updates
> > 1 update by me
> > 40 spams
>
> Very gratifying :)
>
> martin
>
>