comp.lang.ruby

Re: [QUIZ] Shirt Reader (#140)

steve d

9/23/2007 12:43:00 PM

Hi all,

my solution: http://rn86.net/~stevedp/tshirt_reader...

For my solution I used a combination of the Metaphone algorithm, pronunciation matching (via CMU pronunciation dictionary [http://www.speech.cs.cmuedu/cgi-b...]), and the Levenshtein distance algorithm. The input must be words or numbers which sound out the answer word. It will give back at max 10 words that are possible matches. In most of my test words the correct match is in the 1st or 2nd place, but a few are in 5th or more.

You must first run prepare_dicts.rb which does some preparation work. You need the Text gem installed (gem install text) for the Metaphone/Levenshtein algorithm. I used ruby inline to re-implement the Levenshtein algorithm in C (versus the Text gem's pure ruby impl.) which made it run like 20x faster at least. If you don't have ruby inline installed it will fall back on the Text gem.

Here's the final list of phrases I was testing with (taken from test/test_tshirt.rb):

  %w[e scent shells] => 'essentials',
  %w[q all if i] => 'qualify',
  %w[fan task tick] => 'fantastic',
  %w[b you tea full] => 'beautiful',
  %w[fun duh mint all] => 'fundamental',
  %w[s cape] => 'escape',
  %w[pan z] => 'pansy',
  %w[n gauge] => 'engage',
  %w[cap tin] => 'captain',
  %w[g rate full] => 'grateful',
  %w[re late shun ship] => 'relationship',
  %w[con grad yeul 8] => 'congratulate',
  %w[2 burr q low sis] => 'tuberculosis',
  %w[my crows cope] => 'microscope',
  %w[add minus ray shun] => 'administration',
  %w[accent you ate it] => 'accentuated',
  %w[add van sing] => 'advancing',
  %w[car knee for us] => 'carnivorous',
  %w[soup or seed] => 'supercede',
  %w[poor 2 bell o] => 'portobello',
  %w[d pen dance] => 'dependence',
  %w[s o tear rick] => 'esoteric',
  %w[4 2 it us] => 'fortuitous',
  %w[4 2 n 8] => 'fortunate',
  %w[4 in R] => 'foreigner',
  %w[naan disk clothes your] => 'nondisclosure',
  %w[Granmda Atika Lee] => 'grammatically',
  %w[a brie vie a shun] => 'abbreviation',
  %w[pheemeeneeneetee] => 'femininity',
  %w[me c c p] => 'mississippi',
  %w[art fork] => 'aardvark',
  %w[liberty giblet] => 'flibbertigibbet',
  %w[zoo key knee] => 'zucchini',
  %w[you'll tight] => 'yuletide',
  %w[Luke I like] => 'lookalike',
  %w[mah deux mah zeal] => 'mademoiselle',
  %w[may gel omen yak] => 'megalomaniac',
  %w[half tell mall eau gist] => 'ophthalmologist',
  %w[whore tea cull your wrist] => 'horticulturist',
  %w[pant oh my m] => 'pantomime',
  %w[tear a ball] => 'terrible',
  %w[a bowl i shun] => 'abolition',
  %w[pre chair] => 'preacher',
  %w[10 s] => 'tennis',
  %w[e z] => 'easy',
  %w[1 door full] => 'wonderful',
  %w[a door] => 'adore',
  %w[hole e] => 'holy',
  %w[grand your] => 'grandeur',
  %w[4 2 5] => 'fortify',
  %w[age, it ate her] => 'agitator',
  %w[tear it or eel] => 'territorial',
  %w[s 1] => 'swan'

- steve
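
For readers who want to see the ranking idea in isolation: the post above describes scoring candidates with Levenshtein distance and returning at most 10 of them. The linked solution is truncated here, so what follows is only a minimal pure-Ruby sketch of that general technique, not steve's code. The names `levenshtein` and `closest` are made up for illustration, and the strings being compared would presumably be phonetic encodings rather than raw spellings.

```ruby
# Edit distance between two strings, keeping only two rows of the
# dynamic-programming table at a time.
def levenshtein(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1,          # deletion
               curr[j - 1] + 1,      # insertion
               prev[j - 1] + cost].min # substitution or match
    end
    prev = curr
  end
  prev.last
end

# Hypothetical helper: the `limit` candidates closest to `query`,
# nearest first (ties broken alphabetically).
def closest(query, candidates, limit = 10)
  candidates.sort_by { |c| [levenshtein(query, c), c] }.first(limit)
end

p levenshtein("kitten", "sitting")       # => 3
p closest("cat", %w[dog cart cat], 2)    # => ["cat", "cart"]
```

A pure-Ruby loop like this is the kind of code the post says was moved to C via ruby inline for speed; the sketch makes no attempt at that.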

2 Answers

Ken Bloom

9/23/2007 5:45:00 PM


On Sun, 23 Sep 2007 21:42:30 +0900, steve d wrote:

> Hi all,
>
> my solution: http://rn86.net/~stevedp/tshirt_rea...
>
> For my solution I used a combination of the Metaphone algorithm,
> pronunciation matching (via CMU pronunciation dictionary
> [http://www.speech.cs.cmuedu/cgi-b...]), and the Levenshtein
> distance algorithm. The input must be words or numbers which sound out
> the answer word. It will give back at max 10 words that are possible
> matches. In most of my test words the correct match is in the 1st or
> 2nd place, but a few are in 5th or more.

My answer's along the same lines with Metaphone, but nowhere near as good
as steve's:

require 'rubygems'
require 'text'
include Text::Metaphone


# use this to do the double_metaphone as a drop-in replacement for metaphone
def dmetaphone word
  first, second = double_metaphone word
  second || first
end


# this solution gets 3 of the test cases correct if single metaphone is used
# it gets 10 of the test cases correct if double-metaphone is used, but also
# provides a much longer list of wrong answers for everything
#
# use this alias to set the particular phonetic conversion algorithm
# (the alias keyword works at the top level, where alias_method is not available)
alias phonetic_convert dmetaphone


NUMBERS = Hash.new { |h, k| k }.merge!({
  "1" => "one", "2" => "two", "3" => "three",
  "4" => "four", "5" => "five", "6" => "six",
  "7" => "seven", "8" => "eight", "9" => "nine"
})

DICT = open('/usr/share/dict/words') do |f|
  d = Hash.new { |h, k| h[k] = [] }
  f.each_line do |word|
    word = word.chomp
    d[phonetic_convert(word).gsub(/\s/, '')] << word
  end
  d
end

def rebus words
  words = words.collect { |x| NUMBERS[x] }.join(' ')
  DICT[phonetic_convert(words).gsub(/\s/, '')]
end

#tests given by steve d <oksteve@yahoo.com>
expectations = {
  %w[e scent shells] => 'essentials',
  %w[q all if i] => 'qualify',
  %w[fan task tick] => 'fantastic',
  %w[b you tea full] => 'beautiful',
  %w[fun duh mint all] => 'fundamental',
  %w[s cape] => 'escape',
  %w[pan z] => 'pansy',
  %w[n gauge] => 'engage',
  %w[cap tin] => 'captain',
  %w[g rate full] => 'grateful',
  %w[re late shun ship] => 'relationship',
  %w[con grad yeul 8] => 'congratulate',
  %w[con grad yule 8 shins] => 'congratulations', # from Phrogz
  %w[2 burr q low sis] => 'tuberculosis',
}

expectations.each do |words, target|
  result = rebus(words)
  if result.include?(target)
    printf "%s correctly gave %s.\n", words.inspect, target
  else
    printf "%s incorrect. Expected %s.\n", words.inspect, target
  end
  printf "Metaphone of words: %s Metaphone of target: %s\n",
         phonetic_convert(words.collect { |x| NUMBERS[x] }.join(' ')),
         phonetic_convert(target)
  printf "Matching words %s\n", result.inspect
end
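
A side note on the NUMBERS table in the script above: it is built with a default block that returns the key itself, so any token that is not a listed digit passes through unchanged. The short sketch below shows that behaviour on its own. `DIGIT_WORDS` and `spell_out` are illustrative names, not part of the script, and the table is trimmed to two entries.

```ruby
# A hash whose default block echoes the key back: listed digits are
# spelled out, every other token is returned as-is (and not stored).
DIGIT_WORDS = Hash.new { |_hash, key| key }.merge!("2" => "two", "8" => "eight")

# Hypothetical helper mirroring the first line of `rebus`.
def spell_out(tokens)
  tokens.map { |t| DIGIT_WORDS[t] }.join(' ')
end

p spell_out(%w[2 burr q low sis])   # => "two burr q low sis"
p DIGIT_WORDS["10"]                 # => "10"
```

Because the script's table only lists "1" through "9", a token such as "10" (as in the `%w[10 s]` test phrase) reaches the phonetic encoder as the literal string "10".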


--
Ken Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu...

benjohn

9/25/2007 7:46:00 PM


I'm wondering if this quiz would work in the opposite direction
reasonably well? Given a word (or a sentence), create a series of
words that will sound it out. If you stick to using nouns as the
sound words, you can pretty reliably get an image for the word using
Google's picture search.
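
One rough way to try the reverse direction is to encode the target word phonetically and then split that encoding into pieces that each match a word from a vocabulary of sound words. The sketch below is only an illustration under loud assumptions: `crude_key` is a deliberately crude stand-in for Metaphone (it maps c to k, drops vowels, and collapses repeated letters), the vocabulary is a tiny hand-picked list, and only exact matches are found, with no fuzzy scoring.

```ruby
# Crude stand-in for a phonetic encoder (NOT Metaphone): c -> k,
# vowels dropped, repeated letters collapsed.
def crude_key(word)
  word.downcase.delete('^a-z').tr('c', 'k').delete('aeiouy').squeeze
end

# Split `key` into consecutive chunks that each appear in `index`.
# Returns an array of words, or nil if no exact split exists.
def split_key(key, index)
  return [] if key.empty?
  (1..key.length).each do |len|
    words = index[key[0, len]]
    next unless words
    rest = split_key(key[len..-1], index)
    return [words.first] + rest if rest
  end
  nil
end

# Hypothetical entry point: words from `vocab` that sound out `target`.
def sound_out(target, vocab)
  index = {}
  vocab.each do |w|
    k = crude_key(w)
    (index[k] ||= []) << w unless k.empty?
  end
  split_key(crude_key(target), index)
end

VOCAB = %w[cap tin car pet]
p sound_out("captain", VOCAB)   # => ["cap", "tin"]
p sound_out("carpet", VOCAB)    # => ["car", "pet"]
p sound_out("zebra", VOCAB)     # => nil
```

Restricting the vocabulary to picturable nouns, as suggested above, would just mean passing a different word list.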

Cheers,
B