comp.lang.ruby

help with mechanize

Jeremy Woertink

8/6/2008 4:53:00 AM

I'm using mechanize to log into this form. The redirects aren't going
where I would expect them to, though. I don't think I'm being logged in
properly, yet when I try it through the web browser I get logged in
normally.

I'm looking for a better way to do this, or any ideas anyone has.

@agent = WWW::Mechanize.new { |agent| agent.user_agent_alias = 'Windows Mozilla' }

@agent.get('http://smallbusiness.yahoo.com/ecomm...') do |page|

  @login_page = page.links.text("Small Business").click
  temp_page = @login_page.form_with(:name => 'login_form') do |form|
    form['login'] = @login
    form['passwd'] = @password
  end.submit

  if temp_page.uri.to_s.include?("login")
    puts "Not logged in."
    puts "An error occurred."
    exit
  else
    puts "Logged in"
  end
end

The only thing I could think of is that if the login fails, it returns me
back to a login page. This always says "Not Logged in" even though I
know the @login and @password are correct.

On another note, are there any good sites with REALLY good docs on
mechanize and everything it can do? The main docs page seems to just
show the methods, but not really what they do or how to use them.

Thanks,
~Jeremy
--
Posted via http://www.ruby-....

7 Answers

Aaron Patterson

8/6/2008 6:31:00 PM

Hi Jeremy,

On Wed, Aug 06, 2008 at 01:52:57PM +0900, Jeremy Woertink wrote:
> I'm using mechanize to log into this form. The redirects aren't going
> where I would expect them to, though. I don't think I'm being logged in
> properly, yet when I try it through the web browser I get logged in
> normally.
>
> I'm looking for a better way to do this, or any ideas anyone has.

I tried out this script, and it looks like Yahoo sends a meta refresh
after you log in. Mechanize does not follow meta refreshes by default,
so you need to set that option.

Change this line:

> @agent = WWW::Mechanize.new { |agent| agent.user_agent_alias = 'Windows Mozilla' }

To this:

@agent = WWW::Mechanize.new { |agent|
  agent.user_agent_alias = 'Windows Mozilla'
  agent.follow_meta_refresh = true
}

How did I know to do this? I examined the behavior in Firefox and made
Mechanize do the same thing. I typically recommend that people install
LiveHTTPHeaders in Firefox and examine the requests and responses.
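
For readers who have not met a meta refresh before: it is an HTML tag in the
response body, not an HTTP redirect, which is why a client has to opt in to
following it. The sketch below only illustrates what such a tag looks like and
how its target could be pulled out with the standard library. It is not how
Mechanize implements follow_meta_refresh; the method name, the sample page,
and the example.com URLs are made up for illustration, and the regexes assume
simple, well-formed attributes.

```ruby
require 'uri'

# Returns the absolute URL a <meta http-equiv="refresh"> tag points to,
# or nil when the page has no such tag (or the tag carries no URL).
# Hypothetical helper for illustration only; assumes the content value
# is quoted and contains no nested quotes.
def meta_refresh_target(html, base_url)
  tag = html[/<meta\b[^>]*http-equiv\s*=\s*["']?refresh["']?[^>]*>/i]
  return nil unless tag

  content = tag[/content\s*=\s*["']([^"']*)["']/i, 1]
  return nil unless content

  # content looks like "0; url=/some/path": a delay, then the target.
  _delay, target = content.split(';', 2)
  return nil unless target

  relative = target.strip.sub(/\Aurl\s*=\s*/i, '')
  URI.join(base_url, relative).to_s
end

# Made-up response body standing in for a post-login page.
page = '<html><head><meta http-equiv="refresh" content="0; url=/store/home">' \
       '</head><body>Redirecting</body></html>'

puts meta_refresh_target(page, 'https://example.com/login')
```

A client that ignores this tag stays on the intermediate page, which would
match the symptom of still seeing a login-related URL after submitting.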

> The only thing I could think of is if the login fails, it returns me
> back to a login page. This always says "Not Logged in" even though I
> know the @login and @password are correct.

Detecting whether or not you are logged in depends on the site you are
interacting with. Looking for a 'Not Logged in' string may be
appropriate in this case.

> On another note, are there any good sites with REALLY good docs on
> mechanize and everything it can do? The main docs page seems to just
> show the methods, but not really what they do or how to use them.

I've tried to document the library as much as possible. "Good docs" is
very subjective. I'm doing my best. :-)

That said, check EXAMPLES.txt, GUIDE.txt, and also the RDoc for each of
the main classes (Mechanize::Page, Mechanize::Form).

Hope that helps.

--
Aaron Patterson
http://tenderlovem...

Jeremy Woertink

8/6/2008 6:58:00 PM

DUDE!!! Before I even try this out, I'm going to give you mad kudos! You
rock!

I will check it out and see what I come up with. I appreciate it.

As for another question, I'm trying to parse the HTML on one page
after I get logged in. I looked at the docs for Hpricot, but I don't
understand exactly how it works.

i.e.

doc.search("/html/body//p")

Why does the "p" need 2 slashes in front?

It works when I do this, but if I only have 1 slash, then it doesn't
work.



Thanks,
~Jeremy
--
Posted via http://www.ruby-....

Phlip

8/7/2008 2:15:00 AM

Jeremy Woertink wrote:

> doc.search("/html/body//p")
>
> why does the "p" need 2 slashes in front?
>
> It works when I do this, but if I only have 1 slash, then it doesn't
> work..

'/html/body/p' only matches <p> elements that are immediate children of
<body>. Hence a <p> at body/div/p, for example, won't match. '//' matches
descendants at any depth.
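
The difference can be checked on a tiny document. This is a minimal sketch
that swaps in REXML (shipped with standard Ruby installs) instead of Hpricot
so that it runs without extra gems; Hpricot's XPath support may differ in
details. The sample markup and the helper names are made up for illustration.

```ruby
require 'rexml/document'

# Made-up page: one <p> directly under <body>, one nested inside a <div>.
def sample_page
  <<~XML
    <html>
      <body>
        <p>top-level paragraph</p>
        <div><p>nested paragraph</p></div>
      </body>
    </html>
  XML
end

# Returns the text of every element the XPath expression selects.
def paragraph_texts(markup, xpath)
  doc = REXML::Document.new(markup)
  REXML::XPath.match(doc, xpath).map(&:text)
end

# Single slash: only <p> elements that are children of <body>.
puts paragraph_texts(sample_page, '/html/body/p').inspect

# Double slash: <p> elements anywhere below <body>, including the nested one.
puts paragraph_texts(sample_page, '/html/body//p').inspect
```

On a real page most paragraphs sit inside wrappers such as <div> or <td>,
which is the likely reason the single-slash form returned nothing.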

Here are tutorials on XPath for unit tests. Hpricot will also support many of
their techniques for functional tests:

http://www.oreillynet.com/onlamp/blog/2007/08/xpath_checker_and_assert_...
http://www.oreillynet.com/onlamp/blog/2007/08/assert_hpri...

--
Phlip

Jeremy Woertink

8/11/2008 11:35:00 PM

I have another issue. The fix you gave me worked, but now I'm getting
stuck on this form.

I can get logged in just fine, but sometimes I'm asked to put in an
additional password. When this form comes up, I find the form, but when
I call .submit or .click_button, it just sits there. I let it sit there
while I went to lunch and came back an hour later and nothing had
happened. I turned the logging on, and I'm not seeing anything helpful.
I have to press Ctrl+C to make it stop, and when I do, I get this
error.

c:/ruby/lib/ruby/1.8/net/protocol.rb:133:in `sysread': Interrupt
from c:/ruby/lib/ruby/1.8/net/protocol.rb:133:in `rbuf_fill'
from c:/ruby/lib/ruby/1.8/timeout.rb:56:in `timeout'
from c:/ruby/lib/ruby/1.8/timeout.rb:76:in `timeout'
from c:/ruby/lib/ruby/1.8/net/protocol.rb:132:in `rbuf_fill'
from c:/ruby/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
from c:/ruby/lib/ruby/1.8/net/protocol.rb:126:in `readline'
from c:/ruby/lib/ruby/1.8/net/http.rb:2029:in `read_status_line'
from c:/ruby/lib/ruby/1.8/net/http.rb:2018:in `read_new'
... 8 levels...
from scrape.rb:87:in `each'
from scrape.rb:87
from c:/ruby/lib/ruby/gems/1.8/gems/mechanize-0.7.7/lib/www/mechanize.rb:217:in `get'
from scrape.rb:58
...
87 @store_manager.forms.each do |form|
88 form['passwd'] = @security_key
89 form.click_button
90 end
...

I can go through a normal browser and follow the steps normally and it
works fine.

Any ideas?

Thanks,
~Jeremy
--
Posted via http://www.ruby-....

Jeremy Woertink

8/12/2008 5:26:00 PM

I have also tried this code, and it does the same thing:

@store_index = @store_manager.form_with(:name => 'a') do |form|
  form['passwd'] = @security_key
end.submit

It just stops. Is there a way I can see what it is doing? I'm not sure
how to fix this problem. Could it have anything to do with it being an
https:// URL?
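
The backtrace shows the script waiting inside Net::HTTP for the server's
status line (read_status_line), so one way to stop an indefinite wait and to
see the traffic is to set explicit timeouts and enable wire-level debug
output. The sketch below uses plain Net::HTTP against a local stand-in server
that accepts a connection and never answers. It is not a diagnosis of the
Yahoo form; the helper names are made up, and the thread does not establish
the actual cause. Mechanize has its own open_timeout and read_timeout
accessors; check the RDoc for the version you run.

```ruby
require 'net/http'
require 'socket'

# Issues a GET with explicit timeouts and reports what happened.
# Hypothetical helper for illustration; returns :responded or :timed_out.
def fetch_with_timeout(host, port, path, seconds)
  http = Net::HTTP.new(host, port, nil)        # nil: do not use a proxy
  http.open_timeout = seconds                  # limit on connecting
  http.read_timeout = seconds                  # limit on each read
  http.max_retries = 0 if http.respond_to?(:max_retries=) # no silent retry
  http.set_debug_output($stderr)               # print the wire traffic
  http.get(path)
  :responded
rescue Net::OpenTimeout, Net::ReadTimeout
  :timed_out
end

# Starts a local server that accepts one connection and never replies,
# yields its port, then shuts it down. Stand-in for a stalled remote host.
def with_silent_server
  server = TCPServer.new('127.0.0.1', 0)
  holder = Thread.new do
    client = server.accept
    sleep # keep the connection open without sending anything
    client.close
  end
  yield server.addr[1]
ensure
  holder.kill if holder
  server.close if server
end

result = with_silent_server do |port|
  fetch_with_timeout('127.0.0.1', port, '/', 1)
end
puts result
```

With a timeout in place the script fails fast with an exception that can be
rescued and logged, instead of blocking until it is interrupted.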

Thanks,
~Jeremy

--
Posted via http://www.ruby-....
