F. Senault
3/13/2009 8:16:00 AM
On March 12 at 21:36, Mmcolli00 Mom wrote:
> Here is my snippet.
>
> require "fileutils"
>
> Dir["C:/Respecs/*.html"].each do |htmlfile|
>   readhtml = File.read(htmlfile)
>   if readhtml.include?("seconds") == true
>     htmlbase = File.basename(htmlfile)
>     puts htmlbase
>   end
> end
For the easy way, try readlines and grep:
>> h = File.readlines('f1.txt')
=> ["<h1>hhhhhh</h1>\n", "<h2>20 seconds</h2>\n", "<p>Blah.</p>\n", "\n"]
>> h.grep(/seconds/)
=> ["<h2>20 seconds</h2>\n"]
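To tie that back to your loop, here is a rough sketch that runs the same readlines/grep step over every file matched by the glob. The method name matching_lines and the shape of the hash it returns are my own choices for illustration; the path and the pattern come from your snippet.

```ruby
# Hypothetical helper (the name is not from any library): returns a hash of
# { basename => [matching lines] } for every file matched by +glob+ that
# contains at least one line matching +pattern+.
def matching_lines(glob, pattern)
  results = {}
  Dir[glob].sort.each do |path|
    # Same technique as above: read the file line by line, keep the matches.
    hits = File.readlines(path).grep(pattern)
    results[File.basename(path)] = hits unless hits.empty?
  end
  results
end

if __FILE__ == $PROGRAM_NAME
  # Path and pattern taken from the original snippet.
  matching_lines("C:/Respecs/*.html", /seconds/).each do |name, lines|
    puts name
    lines.each { |line| puts "  #{line.strip}" }
  end
end
```

Files without a match are simply left out of the hash, so you get the file names and the matching lines in one pass.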
For a more sophisticated (and time-consuming) approach, try an HTML
parser like Hpricot:
>> require "hpricot"
=> true
>> doc = Hpricot(File.read('f1.txt'))
=> #<Hpricot::Doc {elem <h1> "hhhhhh" </h1>} "\n" {elem <h2> "20 seconds" </h2>} "\n" {elem <p> "Blah." </p>} "\n\n">
>> doc.children.select { |e| e.inner_html =~ /seconds/ }
=> [{elem h2 "20 seconds" h2}]
HTH.
Fred
--
Everyone is bad in their own way. Finding out each person's unique way
of being bad is most of the fun of getting to know them.
(Lee Wilson)