comp.lang.ruby

mechanize newbie

Colin Summers

6/13/2007 9:54:00 PM

Okay, Ruby in general newbie, but I did the whole shovell project for
RoR, so I felt I was getting somewhere...

I am fooling around trying to make a spider (scraper?) to pull content
off the Forum I read all the time so that I can read it offline.

It seemed like mechanize is exactly what I want. But I try this:

require 'rubygems'; require 'mechanize'

agent = WWW::Mechanize.new
page = agent.get('http://dapo.org/forums/inde...')

pp page

puts "\n\n trying to login... \n\n"

# Fill out the login form
form = page.forms.first
form.vb_login_username = "username"
form.vb_login_md5password = "password"
form.do = "login"
form.s = ""

page = agent.submit(form)

pp page

# pull down a thread
page = agent.get('http://dapo.org/forums/archive/index.php?t-2293...')

pp page

And it doesn't log in (that last get returns a blank page). Any clues?

Thanks,
--Colin

2 Answers

jfry

6/13/2007 10:35:00 PM


Hi Colin, I can't tell you how to do it in mechanize, but I can say
that what you are trying to do is super easy in Watir: http://openqa...

Watir (Web Application Testing In Ruby) is primarily used for driving
browser-based test automation, but it has a wonderful API that makes
what you describe very easy. Originally the only choice of browser to
drive was IE, but now the FireWatir and SafariWatir projects are
getting strong as well.

Best of luck, whatever solution you go with,
Jeff


Colin Summers

6/14/2007 7:56:00 PM


Nathan,

You are correct. I finally figured that part out (with some help from
someone who wrote the same sort of thing in .NET).

Thanks,
--Colin
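
The thread does not record what the fix actually was. One plausible reading, offered only as an assumption, is that a field named vb_login_md5password expects the MD5 hex digest of the password rather than the plain text, since forum login pages of this kind usually compute that hash in browser JavaScript, which mechanize does not run. A minimal sketch of that idea follows; the helper name vb_login_fields is hypothetical, and the field names are simply the ones from the script above.

```ruby
require 'digest/md5'

# Hypothetical helper (not part of mechanize): builds the values for the
# login form fields used in the original script.
# Assumption: the forum wants the MD5 hex digest in vb_login_md5password.
def vb_login_fields(username, password)
  {
    'vb_login_username'    => username,
    'vb_login_md5password' => Digest::MD5.hexdigest(password),
    'do'                   => 'login',
    's'                    => ''
  }
end

fields = vb_login_fields('username', 'password')
puts fields['vb_login_md5password']
```

With mechanize, each pair could then be copied onto the form before agent.submit(form), for example with form[name] = value if the installed version supports that accessor. Whether the forum also needs other hidden fields filled in is not something this thread confirms.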