Asp Forum - Re: mechanize timeout errors

Berger, Daniel

8/8/2006 7:04:00 PM

> -----Original Message-----
> From: akanksha [mailto:akanksha.baid@gmail.com]
> Sent: Tuesday, August 08, 2006 12:45 PM
> To: ruby-talk ML
> Subject: mechanize timeout errors
>
>
> I am using Mechanize for scraping some URLs.
> begin
>   page = agent.get(url)
> rescue
>   puts "oops!!"
> end
>
> catches invalid URLs etc., but how do I handle timeout
> errors? In particular, this is the error I get:
>
> request-header: accept => */*
> request-header: user-agent => WWW-Mechanize/0.5.1 (http://rubyforge.org/projects/...)
> /usr/local/lib/ruby/1.8/timeout.rb:54:in `rbuf_fill': execution expired (Timeout::Error)
>         from /usr/local/lib/ruby/1.8/timeout.rb:56:in `timeout'
>         from /usr/local/lib/ruby/1.8/timeout.rb:76:in `timeout'

begin
  page = agent.get(url)
rescue Timeout::Error
  puts "Timeout!"
  raise
rescue
  puts "Some other error!"
  raise
end

If you want control over the timeout value I think you'll need to
re-wrap the call to agent.get in your own timeout block:

require 'timeout'

begin
  Timeout.timeout(5) { agent.get(url) }
  ...
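
For a complete, runnable picture of that wrapper, here is a minimal sketch. It assumes nothing about Mechanize itself: `slow_fetch` is a hypothetical stand-in for `agent.get(url)` that simply sleeps, and `fetch_with_limit` is an illustrative helper name, not part of any library.

```ruby
require 'timeout'

# Hypothetical stand-in for agent.get(url): takes about one second to "respond".
def slow_fetch
  sleep 1
  :page
end

# Run the fetch under a deadline; return :timed_out instead of raising.
def fetch_with_limit(seconds)
  Timeout.timeout(seconds) { slow_fetch }
rescue Timeout::Error
  :timed_out
end

puts fetch_with_limit(0.1)  # deadline is shorter than the fetch
puts fetch_with_limit(2)    # deadline is longer than the fetch
```

Naming Timeout::Error explicitly in the rescue clause is what makes this work regardless of where that class sits in the exception hierarchy.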

If you're wondering why your rescue didn't handle the exception, it's
because a bare rescue only catches StandardError and its subclasses, and
in Ruby 1.8 Timeout::Error is a subclass of Interrupt, not StandardError.
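
The effect can be reproduced without any network access. The sketch below uses a hypothetical class, `OutsideStandardError`, as a stand-in for an exception that lives outside the StandardError tree (as Timeout::Error did in Ruby 1.8); `bare_rescue_catches?` is likewise just an illustrative helper. Later Ruby versions moved Timeout::Error under StandardError, so the last line prints whatever your interpreter reports.

```ruby
require 'timeout'

# Hypothetical stand-in for an exception that is not a StandardError.
class OutsideStandardError < Exception; end

# Returns true if a bare `rescue` catches the given exception class.
def bare_rescue_catches?(error_class)
  begin
    raise error_class, "boom"
  rescue            # bare rescue: equivalent to `rescue StandardError`
    return true
  end
rescue Exception    # outer net so the demo itself never crashes
  false
end

puts bare_rescue_catches?(RuntimeError)
puts bare_rescue_catches?(OutsideStandardError)

# Where does Timeout::Error sit on this interpreter?
puts Timeout::Error.ancestors.include?(StandardError)
```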

Regards,

Dan

6 Answers

Alex Young

8/8/2006 7:27:00 PM

Berger, Daniel wrote:
> If you want control over the timeout value I think you'll need to
> re-wrap the call to agent.get in your own timeout block:
Not so: WWW::Mechanize#read_timeout= is your friend.

--
Alex

akanksha.baid

8/8/2006 7:49:00 PM

Could you please elaborate a little on that, or point me to an example?

Alex Young

8/8/2006 8:22:00 PM

akanksha wrote:
> Could you please elaborate a little on that, or point me to an example?
Sure:

irb(main):001:0> require 'mechanize'
=> true
irb(main):002:0> agent = WWW::Mechanize.new;
irb(main):003:0> agent.read_timeout = 0.1 # set a 0.1sec timeout
=> 0.1
irb(main):004:0> begin
irb(main):005:1* agent.get("http://www.ruby-doc...)
irb(main):006:1> rescue Timeout::Error
irb(main):007:1> puts "Timeout!"
irb(main):008:1> end
Timeout!
=> nil
irb(main):009:0>
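
The same behaviour can be tried offline. The backtrace earlier in the thread runs through net/protocol.rb, so this sketch assumes that Mechanize's read_timeout ends up on the underlying Net::HTTP connection; it talks to Net::HTTP directly rather than to Mechanize. The silent local server is a test fixture invented for the illustration, not something from the original post.

```ruby
require 'net/http'
require 'socket'
require 'timeout'

# Fixture: a local server that accepts connections but never sends a response.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
held = []
acceptor = Thread.new { loop { held << server.accept } }

# Third argument nil disables any proxy picked up from the environment.
http = Net::HTTP.new('127.0.0.1', port, nil)
http.open_timeout = 1
http.read_timeout = 0.2   # give up if no data arrives within 0.2 seconds

outcome =
  begin
    http.get('/')
    :completed
  rescue Timeout::Error   # also matches subclasses raised by newer Net::HTTP
    :timed_out
  end

# Tear the fixture down.
acceptor.kill
held.each(&:close)
server.close

puts outcome
```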

--
Alex

akanksha.baid

8/8/2006 9:46:00 PM

So I did what you said:

agent.read_timeout = 10
begin
  page = agent.get(url)
rescue Timeout::Error
  puts "Timeout error"
rescue
  puts "Normal error"
end

but I am still getting the following error. Any idea what might be
causing it?

GET: http...
request-header: accept => */*
request-header: user-agent => WWW-Mechanize/0.5.1
(http://rubyforge.org/projects/...)
header: cache-control : private
header: connection : close, close
header: expires : Thu, 30 Sep 1999 01:29:07 GMT
header: content-type : text/html
header: x-powered-by : ASP.NET
header: date : Tue, 08 Aug 2006 21:43:28 GMT
header: x-server : CF0471
header: server : Microsoft-IIS/6.0
header: page-completion-status : Normal, Normal
header: pragma : no-cache
status: 200
GET: http...
request-header: accept => */*
request-header: user-agent => WWW-Mechanize/0.5.1
(http://rubyforge.org/projects/...)
request-header: referer => http...
/usr/local/lib/ruby/1.8/timeout.rb:54:in `rbuf_fill': execution expired (Timeout::Error)
        from /usr/local/lib/ruby/1.8/timeout.rb:56:in `timeout'
        from /usr/local/lib/ruby/1.8/timeout.rb:76:in `timeout'
        from /usr/local/lib/ruby/1.8/net/protocol.rb:132:in `rbuf_fill'
        from /usr/local/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
        from /usr/local/lib/ruby/1.8/net/protocol.rb:126:in `readline'
        from /usr/local/lib/ruby/1.8/net/http.rb:1988:in `read_status_line'
        from /usr/local/lib/ruby/1.8/net/http.rb:1977:in `read_new'
        from /usr/local/lib/ruby/1.8/net/http.rb:1046:in `request'
         ... 8 levels...

hemant kumar

8/8/2006 10:20:00 PM

Timeout errors can be caught like this:

require 'timeout'

begin
  Timeout.timeout(5) do
    do_something()
  end
rescue Timeout::Error
  puts "Hey timeout error"
end

hemant kumar

8/8/2006 10:23:00 PM

Also, you may like to read this:

http://blog.segment7.net/articles/2006/04/11/care-and-feeding-of-t...eout

by our own Eric Hodel.
