
comp.lang.ruby

Ruby port of Nilsimsa?

Martin Pirker

4/13/2005 1:45:00 AM

Hi...

The Nilsimsa algorithm is a digest technique for determining the
similarity of text messages (e.g. for spam detection).
Calling out to the standalone program is not very efficient, but
CPAN seems to contain a modified version with only xx lines of C
for interfacing with Perl.
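
For readers new to Nilsimsa, the "similarity" part can be sketched as follows. This is only an illustration, not code from any of the packages mentioned in this thread: it assumes digests are 256 bits written as 64 hex characters, and that the score is the number of matching bits minus 128 (so 128 means identical and values near 0 mean unrelated). The method name `digest_similarity` is made up for this example.

```ruby
# Sketch: compare two Nilsimsa-style digests given as hex strings.
# Assumptions (not taken from this thread): 256-bit digests encoded as
# 64 hex characters, score = matching bits - 128.

DIGEST_HEX_LENGTH = 64 # 256 bits / 4 bits per hex character

def digest_similarity(hex_a, hex_b)
  [hex_a, hex_b].each do |h|
    unless h.match?(/\A\h{64}\z/)
      raise ArgumentError, "expected #{DIGEST_HEX_LENGTH} hex characters"
    end
  end

  # XOR leaves a 1 bit wherever the two digests differ.
  differing_bits = (hex_a.to_i(16) ^ hex_b.to_i(16)).to_s(2).count("1")

  # (256 - differing_bits) matching bits, minus 128.
  128 - differing_bits
end
```

Called with the same digest twice this returns 128; two digests that differ in every bit give -128. Computing the digest itself is the part that needs the real algorithm and is not shown here.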

Is there any "bored" Ruby guru out there who could take a look at how
much effort it would take to port the Perl interface to Ruby?

I lack knowledge of Ruby<-->C interfacing .... :-/

Thanks!
Martin
4 Answers

Lyndon Samson

4/13/2005 7:28:00 AM


On 4/13/05, Martin Pirker <crf@sbox.tu-graz.ac.at> wrote:
> Hi...
>
> The Nilsimsa algorithm is a digest technique for determining the
> similarity of text messages (e.g. spam detection...)
> Calling out to the standalone program is not very efficient, but
> CPAN seemingly contains a modified version with only xx lines C
> for Perl interfacing.
>
> Any "bored" Ruby guru out there who can take a look how much
> effort a port from Perl to Ruby interfaces would take?
>
> I lack the knowledge of Ruby<-->C .... :-/
>

Is this your homework, young Martin? :-) Maybe you should post it to rentacoder.


> Thanks!
> Martin
>
>


--
Into RFID? www.rfidnewsupdate.com Simple, fast, news.



Martin Pirker

4/13/2005 10:27:00 AM


Lyndon Samson <lyndon.samson@gmail.com> wrote:
> Is this your homework, young Martin? :-) Maybe you should post it to rentacoder.

What is "young"? What is "homework"?

>> Any "bored" Ruby guru out there who can take a look how much
>> effort a port from Perl to Ruby interfaces would take?

To be clearer, what I want to know is:

Is it enough to write a working interface module/definition/whatever,
or is it known that C modules which run fine with Perl need additional
work, e.g. because Ruby's runtime GC requires tagging of all memory
allocations or similar reworking of the C side?

In the first case I may be able to work it out myself (guessing from
other Ruby library module examples); in the latter case I won't even
try, as that learning curve is too steep for a first attempt.


Martin

Martin Pirker

4/14/2005 4:57:00 PM


Martin Pirker <crf@sbox.tu-graz.ac.at> wrote:
> The Nilsimsa algorithm is a digest technique for determining the
> similarity of text messages (e.g. spam detection...)

100% Ruby port (try 1)
http://rubyforge.org/frs/?gr...

Martin

Martin Pirker

4/18/2005 12:39:00 PM


Martin Pirker <crf@sbox.tu-graz.ac.at> wrote:
>> The Nilsimsa algorithm is a digest technique for determining the
>> similarity of text messages (e.g. spam detection...)
>
> 100% Ruby port (try 1)
> http://rubyforge.org/frs/?gr...

In the middle of the night Steve Lewis kindly donated interfaces
to the original C core - Thank you!

You now have the option of running pure Ruby (300 times slower than C),
or, if you compile the native extension, it runs at about C speed.
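
One common way to offer both options is to try loading the compiled extension and fall back to the pure Ruby code when it is missing. The sketch below only illustrates that pattern and is not how the released package is organised; `pick_backend` and the library name `nilsimsa_native` are hypothetical.

```ruby
# Sketch: prefer a compiled extension, fall back to pure Ruby.
# Tries each candidate library name in order and returns the first one
# that loads, or :pure_ruby when none of them is available.
def pick_backend(native_candidates)
  native_candidates.each do |name|
    begin
      require name
      return name
    rescue LoadError
      next # extension not built or not installed; try the next one
    end
  end
  :pure_ruby
end

# "nilsimsa_native" is a made-up name used for illustration only.
backend = pick_backend(["nilsimsa_native"])
puts "using backend: #{backend}"
```

Since both cores are reported to give the same digests, a caller should not need to care which one was picked, apart from speed.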

Both cores produce results matching the standalone C version (here
on Linux x86), so I guess it's bug-free now :-)

Martin