comp.lang.ruby

Mechanize select list help...?

Andy Pipes

10/6/2008 9:02:00 PM

Hi.

I'm using the excellent WWW::Mechanize to screen scrape a site for UK
frost dates (don't ask ;)

There are a lot of issues with the HTML not being grand, so I thought
that was where my code was going wrong, but I'd be really grateful if
somebody could give me a steer on this, as I've been trying for hours
and the documentation only gets me halfway :)

All I want to do is select each of the 100 or so towns in the select
list, follow the link via the submit button, and scrape the first and
last frost dates from the resulting page.

Here's the code:

require 'rubygems'
require 'mechanize'

agent = WWW::Mechanize.new
page = agent.get('http://www.gardenaction.co.uk/main/weather...')

town_results = page.form_with(:action => 'create_cookie.asp') do |e|
  e.fields.name('Town').options.each do |s|
    s.select
  end
end.submit

p town_results.search("/<p align=\"left\">HOME TOWN:(.*)<Form
Method=Post Action=\"create_cookie.asp\">/")

I think I'm actually getting the original page back rather than the
results page (which should be
http://www.gardenaction.co.uk/main/weather1-r...)

Can anyone give me some advice here? It should be obvious I'm new to
Ruby and OO so am fully expecting to have gone wrong here with instance
variables or the like :)

thanks in advance.

andy
--
Posted via http://www.ruby-....

2 Answers

Mark Thomas

10/6/2008 9:50:00 PM

I don't think it's the Ruby; you need to think it through a bit more.
How many times will you need to submit the form? Once per town,
correct? Therefore, the submit and the parse should be inside the loop.

Try this for starters:

require 'rubygems'
require 'mechanize'

agent = WWW::Mechanize.new
page = agent.get('http://www.gardenaction.co.uk/main/weather...')

form = page.form_with(:action => 'create_cookie.asp')
# Submit the form once per town, inside the loop
form.fields.name('Town').options.each do |town|
  form['Town'] = town.value # set the field to the option's value string
  page2 = form.submit
  puts page2.body
  exit # remove when you're ready to process them all
end
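
To see why the submit has to move inside the loop, here is a minimal sketch that needs neither Mechanize nor a network connection. FakeForm is a hypothetical stand-in written only for this illustration (it is not part of Mechanize), and the town names are made up. It just records which value was in the 'Town' field each time submit is called.

```ruby
# Hypothetical stand-in for a form object; NOT the Mechanize API.
# It only remembers which 'Town' value was present at each submit.
class FakeForm
  attr_reader :submissions

  def initialize
    @fields = {}
    @submissions = []
  end

  # Mimics setting a field by name, e.g. form['Town'] = 'Bath'
  def []=(name, value)
    @fields[name] = value
  end

  # Records the current 'Town' value and returns a fake page body.
  def submit
    @submissions << @fields['Town']
    "results for #{@fields['Town']}"
  end
end

towns = %w[Aberdeen Bath Cardiff] # made-up sample values

# Shape of the original post: choose every town, then submit once.
once = FakeForm.new
towns.each { |town| once['Town'] = town }
once.submit

# Shape of the suggested fix: set the field and submit inside the loop.
each_time = FakeForm.new
pages = towns.map do |town|
  each_time['Town'] = town
  each_time.submit
end

p once.submissions      # a single submission, carrying only the last town
p each_time.submissions # one submission per town
puts pages.length       # one result page per town
```

With the stub, the first pattern produces one submission and the second produces three. A real form may treat repeated selections differently, but the number of submits is governed by where the submit call sits.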

Andy Pipes

10/7/2008 6:17:00 PM

Thanks for the help, Mark. You're right, I needed to think it through a
bit more. Plus, I was using the select method unnecessarily.

Now I've got to find a way to grab the stuff I need from the resulting
pages... on to the docs again.

cheers, andy
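
For the extraction step, one possible approach (a sketch, not something confirmed in this thread) is to run a real Regexp against the page body string instead of passing a regex-looking string to search, which expects a CSS or XPath expression. The HTML below is made up to resemble the markers quoted in the original post. The actual results page is not shown here, so the pattern would need adjusting.

```ruby
# Made-up HTML standing in for the body of one results page (page2.body).
# The real markup on the site is an assumption here.
body = <<~HTML
  <p align="left">HOME TOWN: Exampleton</p>
  <p>Last frost: early May</p>
  <Form Method=Post Action="create_cookie.asp">
HTML

# /m lets . match across newlines; /i copes with the mixed-case tags.
# The non-greedy (.*?) stops at the first form tag after HOME TOWN:.
pattern = /HOME TOWN:(.*?)<form\s+method=post/mi

match = body.match(pattern)
section = match ? match[1] : nil

# Replace leftover tags with spaces, then collapse runs of whitespace.
text = section.to_s.gsub(/<[^>]+>/, ' ').gsub(/\s+/, ' ').strip
puts text
```

If the pattern does not match, section is nil and text ends up as an empty string, so the loop over towns can skip that page rather than raise.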
