comp.lang.ruby

[ANN] Classifier 1.3.0: New Summary feature and no dependency on external c-libraries

Lucas Carlson

5/5/2005 9:28:00 AM

I know that many of you have been interested in trying out some of the
new features in Classifier (http://classifier...) like LSI
(Latent Semantic Indexing) but don't have time to install the external
GSL library that was required. You will be happy to know that Ernest
Ellingson has graciously provided a native Ruby implementation of the
complicated math library. This slows down Classifier's implementation
of LSI by at least a factor of 10, but the idea is that you can now
play with LSI all you want. The best part is that as soon as you do
install GSL, Classifier will automatically take advantage of it and
speed things up considerably. You now have no excuse not to play with
Classifier, since all it takes is:

gem install classifier

-or-

http://rubyforge.org/projects/c...
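
For anyone curious how a library can pick up GSL automatically once it is installed, here is a minimal sketch of the usual optional-require pattern in Ruby. This is an illustration only, not Classifier's actual source: the module name VectorBackend and its constants are hypothetical, and the only real call is require 'gsl', which raises LoadError when the GSL bindings are missing.

```ruby
# Hypothetical sketch of an optional-dependency fallback.
# VectorBackend is an invented name; this is not Classifier's code.
module VectorBackend
  begin
    require 'gsl'     # fast path: succeeds only if the GSL bindings are installed
    NAME = :gsl
  rescue LoadError
    NAME = :native    # slow path: a pure-Ruby implementation would be loaded here
  end

  # True when the fast GSL-backed path is available.
  def self.fast?
    NAME == :gsl
  end
end

puts "Using the #{VectorBackend::NAME} backend"
```

Because the check runs at load time, installing the GSL bindings later switches the backend on the next run without any change to the calling code.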

Brand new to this release of Classifier is a string method called
#summary that takes full advantage of LSI's ability to find the most
important sentences or paragraphs out of a block of text. Here is an
example usage:

require 'classifier'
require 'open-uri'
open('http://rufy.com/pickaxe-intr...).read.gsub(/<[^>]*>/,"").summary

Produces the following summarization of
http://rubycentral.com/book/for...:

"If you don't believe me, read this book and try Ruby [...] But I was
still hoping to design a language that would work for most of the jobs
I did everyday [...] Ruby has never been a well-documented language
[...] While they were writing it, I was modifying the language itself
[...] Shortly after I was introduced to computers, I became interested
in programming languages [...] As an object-oriented fan for more than
fifteen years, it seemed to me that OO programming was very suitable
for scripting too [...] I wanted a language more powerful than Perl,
and more object-oriented than Python [...] I believed that an ideal
programming language must be attainable, and I wanted to be the
designer of it [...] Because I have always preferred writing programs
over writing documents, the Ruby manuals tend to be less thorough than
they should be [...] It is my hope that both Ruby and this book will
serve to make your programming easy and enjoyable"

I hope you enjoy Classifier!

-Lucas Carlson
http://tech...

2 Answers

Lucas Carlson

5/5/2005 9:56:00 AM


Oh goodness, I hadn't tested my summary example without GSL. If you
haven't yet installed GSL, I wouldn't recommend trying that particular
example. With GSL it takes under a second to summarize the foreword.
Without GSL it takes many minutes.

Instead try giving it a dozen or so sentences. You can limit how many
sentences end up in the summary with an optional parameter like this:

x = "This text deals with dogs. Dogs. This text involves dogs too.
Dogs! This text revolves around cats. Cats. This text also involves
cats. Cats! This text involves birds. Birds."

x.summary 2

Outputs:

"This text involves dogs too [...] This text also involves cats"
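
To give a rough feel for what an extractive summary with a sentence limit does, here is a small stand-in that can be run without Classifier or GSL. It is deliberately NOT the LSI method that #summary uses: it simply scores each sentence by how often its words occur in the whole text, so it may pick different sentences than the output above. The method name naive_summary, its default count, and the scoring rule are all assumptions made for illustration.

```ruby
# Naive stand-in for an extractive summary. This is NOT LSI and NOT
# Classifier's #summary; naive_summary is a hypothetical helper.
def naive_summary(text, count = 10, separator = " [...] ")
  # Split on sentence-ending punctuation and drop empty fragments.
  sentences = text.split(/[.!?]+/).map(&:strip).reject(&:empty?)

  # Count how often each word appears in the whole text.
  freq = Hash.new(0)
  text.downcase.scan(/\w+/) { |word| freq[word] += 1 }

  # Score a sentence as the sum of the frequencies of its words.
  scored = sentences.each_with_index.map do |sentence, index|
    score = sentence.downcase.scan(/\w+/).sum { |word| freq[word] }
    [score, index, sentence]
  end

  # Keep the `count` highest-scoring sentences, then restore document order.
  top = scored.sort_by { |score, index, _| [-score, index] }.first(count)
  top.sort_by { |_, index, _| index }.map { |_, _, sentence| sentence }.join(separator)
end

text = "This text deals with dogs. Dogs. This text involves dogs too. " \
       "Dogs! This text revolves around cats. Cats. This text also involves " \
       "cats. Cats! This text involves birds. Birds."
puts naive_summary(text, 2)
```

Summing word frequencies favors longer sentences, which is one reason a real summarizer uses something smarter such as LSI.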

George Moschovitis

5/5/2005 11:02:00 AM


utterly brilliant!

George

--
http://nitro.rub...