comp.lang.python

Re: Text mining in Python

Robert Kern

3/10/2010 7:06:00 PM

On 2010-03-10 12:58 PM, mk wrote:
> Hello everyone,
>
> I need to do the following:
>
> (0. transform words in a document into word roots)
>
> 1. analyze a set of documents to see which words are highly frequent
>
> 2. detect clusters of those highly frequent words
>
> 3. map the clusters to some "special" keywords
>
> 4. rank the documents on clusters and "top n" most frequent words
>
> 5. provide search that would rank documents according to whether search
> words were "special" cluster keywords or frequent words
>
> Is there a good open source engine out there that would be suitable
> for the task at hand? Does anybody have experience with one?

You can probably do most of this with Whoosh:

http://...
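
To make steps 0, 1 and 4 of the question concrete, here is a rough sketch using only the Python standard library. It is not Whoosh and not a recommendation of any particular method. The names `crude_stem`, `tokenize`, `top_terms` and `rank_documents` are hypothetical, and the suffix stripper is only a stand-in for a real stemmer such as Porter's. Clustering (steps 2 and 3) is left out.

```python
import re
from collections import Counter


def crude_stem(word):
    # Hypothetical stand-in for a real stemmer: strip a few common
    # English suffixes, keeping at least three characters of the word.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    # Step 0: lowercase, split into alphabetic words, reduce to rough roots.
    return [crude_stem(w) for w in re.findall(r"[a-z]+", text.lower())]


def top_terms(docs, n):
    # Step 1: count term occurrences over all documents, keep the n most frequent.
    totals = Counter()
    for text in docs.values():
        totals.update(tokenize(text))
    return [term for term, _ in totals.most_common(n)]


def rank_documents(docs, terms):
    # Step 4: score each document by how many of its tokens are in `terms`,
    # highest score first; ties are broken by document name.
    wanted = set(terms)
    scores = {
        name: sum(1 for t in tokenize(text) if t in wanted)
        for name, text in docs.items()
    }
    return sorted(scores, key=lambda name: (-scores[name], name))
```

Here `docs` is assumed to be a dict mapping a document name to its text, so `rank_documents(docs, top_terms(docs, 20))` would order the documents by how heavily they use the 20 most frequent terms. A real setup would also want stop-word removal, which a search library's analyzers typically handle.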

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco