comp.lang.ruby
Re: Using hpricot to get tables
Dan Diebolt
7/1/2008 9:04:00 PM
[Note: parts of this message were removed to make it a legal post.]
>I would like to access each table individually
doc.search returns an array even if there is only one match. The construct you are using iterates through this array:
doc.search(strPath) do |div|
end
If you capture the search results in a variable named "divs", you can index it like an array (because it is one):
divs=doc.search(strPath)
If you want to immediately start iterating you can do this:
doc.search(strPath).each_with_index do |div,idiv|
puts idiv if idiv==2
end
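As a minimal sketch of both approaches, the snippet below uses a plain Ruby array of stand-in objects instead of a live hpricot result, so it runs without the gem or a network connection. The FakeDiv struct and its sample text are hypothetical; the only assumption is that each element responds to inner_text, as the elements returned by doc.search do.

```ruby
# Hypothetical stand-in for the elements doc.search would return;
# only the inner_text method is assumed here.
FakeDiv = Struct.new(:inner_text)

divs = [
  FakeDiv.new("first table"),
  FakeDiv.new("second table"),
  FakeDiv.new("third table")
]

# Index directly into the captured results.
third = divs[2].inner_text

# Or iterate and pick out an element by its position.
picked = nil
divs.each_with_index do |div, idiv|
  picked = div.inner_text if idiv == 2
end

puts third
puts picked
```

Both routes reach the same element; capturing the results first just makes the position explicit.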
I work with hpricot a lot, and I find it more productive not to use all the fancy Ruby idioms to shorten the code: these pages are fragile to parse, and the code breaks as soon as someone changes the page content.
See code below
==============
require 'hpricot'
require 'open-uri'
strLink = "http://www.sportsline.com/mlb/gamecenter/boxscore/MLB_20080331_ARI..."
strPath ="//div[@class='SLTables1']/div"
doc = Hpricot(open(strLink))
divs=doc.search(strPath)
puts "#{divs[0].inner_text.slice(0..70)}\n\n"
puts "#{divs[1].inner_text.slice(0..70)}\n\n"
puts "#{divs[2].inner_text.slice(0..70)}\n\n"
puts "#{divs[3].inner_text.slice(0..70)}\n\n"
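Because the page layout can change, a fixed index such as divs[3] may come back nil when fewer blocks match, and calling inner_text on it would then raise NoMethodError. A cautious variant might look like the sketch below. It again uses hypothetical stand-in objects (StubDiv) instead of a real hpricot document, and preview is an illustrative helper name, not part of hpricot.

```ruby
# Hypothetical stand-in for an element returned by doc.search.
StubDiv = Struct.new(:inner_text)

# Return the first `length` characters of the element at `index`,
# or nil when the search produced fewer elements than expected.
# A length of 71 matches slice(0..70) in the code above.
def preview(divs, index, length = 71)
  div = divs[index]
  return nil if div.nil?
  div.inner_text.slice(0, length)
end

# Only two elements here, to simulate a page whose layout changed.
divs = [StubDiv.new("first table text"), StubDiv.new("second table text")]

4.times do |i|
  text = preview(divs, i)
  if text
    puts "#{text}\n\n"
  else
    puts "no element at index #{i}"
  end
end
```

With a real document, divs would come from doc.search(strPath) and the helper could be used unchanged.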
1 Answer
lrlebron@gmail.com
7/1/2008 9:39:00 PM
This works. Will be very useful for future projects.
I ended up using the XPath for each table, which also worked.
Thanks,
Luis