Tim Pease
8/23/2008 7:37:00 PM
On Aug 23, 2008, at 2:44 AM, Trans wrote:
>
>
> On Aug 23, 12:51 am, Adam Akhtar <adamtempor...@gmail.com> wrote:
>> Hi, I'm learning Ruby and making a few scripts here and there. As part of
>> an ongoing script regarding eBay I came up with a challenge, though I
>> don't know if it's possible.
>>
>> Given, say, 200 items from eBay from the same category, e.g. cameras, I
>> would like to put them in groups based on similarity.
>> E.g. find all listings that are for "canon ixy 500" and put them in one
>> group.
>>
>> Also I'd like it to understand the difference between "canon ixy 500" and,
>> say, 7 listings for "canon ixy 500 camera case".
>>
>> Another problem is the unrelated words in a listing's title that the
>> script would have to ignore, e.g. "Bargain", "RARE", "Very RARE",
>> "Did I say it was RARE", etc.
>>
>> Of course I don't expect it could ever be perfect, but if it could be 60%
>> accurate I would be happy!
>>
>> Elegance is not a priority here, so if there is a "hack" which achieves
>> similar results to, say, some crazy A.I. routine which takes a year to
>> write, then that's great. Whatever gets the job done!
>>
>> One hack I came up with was to look for model numbers using regexps. It
>> works really well, but only if there is a model number. I suppose I
>> could first group listings with model numbers and then come up with some
>> routine for the remainder. So for items without model numbers, what can I
>> do?
>>
>> Also if it takes 2 hours for the script to do the job then fine. At most
>> 1000 listings will be used. Speed isn't an issue.
>>
>> What type of problem is this? Is it A.I.? Where should I start
>> researching? (I checked wiki but there wasn't enough written for me to
>> know which path to take.)
>>
>> Any help/pointers greatly appreciated.
>
> Off the top of my head, try indexing by keywords:
>
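> index = Hash.new(0)  # missing keywords start at a count of 0
> keywords.each do |keyword|
>   # Regexp.escape stops a keyword like "c++" being read as a pattern
>   index[keyword] += 1 if item.description =~ /#{Regexp.escape(keyword)}/
> end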
>
> Then write "rules" based on keyword combinations.
>
> But this is a very general question. From the sound of it, I suspect you
> need to sit down and do a good bit of reading on programming.
>
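For what it's worth, the keyword index plus "rules" idea quoted above could be sketched roughly like this. Everything here is made up for illustration: the sample titles, the keyword list, the rule labels and the `keywords_in` / `classify` helper names are assumptions, not anything from eBay or from the original script.

```ruby
# Return the keywords that appear as whole words in a listing title.
def keywords_in(title, keywords)
  words = title.downcase.scan(/[a-z0-9]+/)
  keywords.select { |k| words.include?(k) }
end

# A "rule" is a label plus the keywords that must all be present.
# Rules are tried in order and the first match wins, so the more
# specific accessory rule has to come before the camera rule.
def classify(title, keywords, rules)
  found = keywords_in(title, keywords)
  rule = rules.find { |_label, required| (required - found).empty? }
  rule ? rule.first : "ungrouped"
end

# Hypothetical sample data.
keywords = %w[canon ixy 500 case nikon coolpix]
rules = [
  ["canon ixy 500 case", %w[canon ixy 500 case]],
  ["canon ixy 500",      %w[canon ixy 500]]
]
titles = [
  "Canon IXY 500 digital camera RARE",
  "canon ixy 500 camera case",
  "Bargain! Canon IXY 500",
  "Nikon Coolpix 4300"
]

groups = titles.group_by { |t| classify(t, keywords, rules) }
groups.each { |label, members| puts "#{label}: #{members.inspect}" }
```

Noise words like "Bargain" and "RARE" drop out for free because they are simply not in the keyword list. The obvious cost is that someone has to maintain the keywords and rules by hand.
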
Use Lucene and/or Solr. They were built to do these kinds of queries.
In fact, you can even use the Ruby/Java bridge if you want to create
Lucene searches directly in Ruby.
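
If you want a feel for token-based matching before setting up a search engine, here is a minimal plain-Ruby sketch. To be clear, this is not Lucene or Solr and does not use their APIs; it swaps in a much simpler technique (Jaccard similarity over word sets with greedy grouping). The noise-word list, the 0.8 threshold and the helper names are all assumptions you would need to tune against real listings.

```ruby
require "set"

# Split a title into a set of lowercase word tokens, minus noise words.
def tokens(title, noise)
  Set.new(title.downcase.scan(/[a-z0-9]+/)) - noise
end

# Jaccard similarity: shared tokens divided by all distinct tokens (0.0..1.0).
def jaccard(a, b)
  union = (a | b).size
  union.zero? ? 0.0 : (a & b).size.to_f / union
end

# Greedy grouping: each title joins the first group whose first member is
# similar enough, otherwise it starts a new group.
def group_titles(titles, noise, threshold)
  groups = []
  titles.each do |title|
    t = tokens(title, noise)
    home = groups.find { |g| jaccard(tokens(g.first, noise), t) >= threshold }
    if home
      home << title
    else
      groups << [title]
    end
  end
  groups
end

# Hypothetical noise words and sample titles.
noise = Set.new(%w[bargain rare very new digital camera])
titles = [
  "Canon IXY 500 digital camera RARE",
  "canon ixy 500 camera case",
  "Bargain! Canon IXY 500",
  "Very RARE canon ixy 500 case"
]

groups = group_titles(titles, noise, 0.8)
groups.each { |g| puts g.inspect }
```

With this sample data the extra token "case" pulls the similarity between a camera and its case down to 0.75, which is below the assumed threshold, so the two kinds of listing end up in separate groups.
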
Blessings,
TwP