comp.lang.ruby

How to check if a webpage exists

Davide Benini

9/10/2008 9:39:00 PM

This is probably trivial, but I have been googling for almost two hours
without finding a viable solution.
Basically, I have a Rails app which uses Hpricot to parse web pages.
There's this line

page = Hpricot( open(url))

If the url is wrong, or the server is down, obviously I get an
exception.
First I tried my luck with

if page = Hpricot( open(url))
blah blah
end

but this did not work.

Then I started googling like crazy for a method to check if a webpage is
loadable. I only found this thread

http://markmail.org/message/iurqf4...

I tried the suggested code, it does not work.
Now, I am pretty sure there's a straightforward way of checking whether
a webpage is loadable.
Can you help me?
Thanks in advance,
Davide
--
Posted via http://www.ruby-....
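
A plain `if` cannot catch the failure described above, because `open` raises before the assignment ever produces a value to test. A minimal sketch of the usual alternative is to wrap the call in a rescue and return `nil` on failure. `load_page` and its `fetcher` argument are hypothetical names, not from the post; the stub lambdas stand in for `Hpricot(open(url))` so the snippet runs without a network connection.

```ruby
# Hypothetical helper: returns the fetched page, or nil when fetching raises.
# `fetcher` is any callable taking a URL; in the original post it would be
# something like ->(u) { Hpricot(open(u)) }.
def load_page(url, fetcher)
  fetcher.call(url)
rescue StandardError => e
  warn "could not load #{url}: #{e.class}: #{e.message}"
  nil
end

# Stubs standing in for a reachable and an unreachable server (assumptions).
working = ->(url) { "<html>#{url}</html>" }
broken  = ->(_url) { raise IOError, "server is down" }

# Now the `if page = ...` idiom works, because failure yields nil
# instead of an exception.
if (page = load_page("http://example.com/", working))
  puts "loaded: #{page}"
end

if (page = load_page("http://example.com/", broken))
  puts "loaded: #{page}"
else
  puts "not loadable"
end
```

Rescuing `StandardError` is deliberately broad here; in real code it may be preferable to list only the exception classes the fetch is expected to raise.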

21 Answers

Todd Benson

9/10/2008 10:33:00 PM

Not really an answer, but should point you in the right direction.
Also, I can't test with Hpricot right now due to gem install issues.
But, using open-uri...

require 'open-uri'; begin; open('http://www.ww...) {} rescue '404 error'; end

Todd

Axel Etzold

9/10/2008 10:40:00 PM

Dear Davide,

I was just about to suggest the same thing. It works on my Ubuntu machine.

Best regards,

Axel

Davide Benini

9/11/2008 7:41:00 AM

>> require 'open-uri'; begin; open('http://www.ww...) {} rescue '404
>> error'; end

Thanks folks,
your suggestion works, but I am not able to integrate it with my
existing code; I have some problems understanding how rescue interacts
with code blocks.
Basically, I have a chunk of code that must be executed ONLY IF there is
no exception; if there is an exception, I need to execute another chunk
of code.
I tried a couple of syntaxes:

require 'open-uri';
begin
open('http://www.ww...) {
# code to execute if everything's ok
} rescue '404 error'
# code to execute in case of error
end

I also tried

require 'open-uri';
begin
open('http://www.ww...) {}
rescue '404 error'
# code to execute in case of error
else
# code to execute if everything's ok
end

Also

require 'open-uri';
begin
open('http://www.ww...)
puts "ok"
rescue '404 error'
puts "error"
end

None of this works.
Which is the proper syntax?
Davide
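
For reference, a minimal sketch of how these forms differ. A `rescue` clause on its own line expects exception classes, not a string, so `rescue '404 error'` there raises a `TypeError` once an exception actually has to be matched. The trailing form `expr rescue value` (as in Todd's one-liner) is a different construct: it evaluates to `value` when `expr` raises a `StandardError`. The `check` method and the stub lambdas below are hypothetical and replace the real `open` call so the snippet runs offline.

```ruby
# Hypothetical helper showing begin / rescue / else.
def check(fetcher)
  begin
    fetcher.call            # the call that may raise, e.g. open(url)
  rescue StandardError => e
    "error: #{e.message}"   # runs only if the call raised
  else
    "ok"                    # runs only if the call did NOT raise
  end
end

good = -> { :page }
bad  = -> { raise IOError, "404 Not Found" }

puts check(good)
puts check(bad)

# Modifier form: evaluates to the right-hand value if the left side raises.
result = (bad.call rescue "404 error")
puts result

# A string is not an exception class: this clause itself raises TypeError
# as soon as an exception has to be matched against it.
begin
  begin
    bad.call
  rescue '404 error'
    puts "never reached"
  end
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```

The `else` branch is the place for code that must run only when nothing was raised; putting that code after `end` instead would run it in both cases.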


Peña, Botp

9/11/2008 8:01:00 AM

From: Davide Benini [mailto:nutsmuggler@hotmail.com]
# ...
# None of this works.
# Which is the proper syntax?

compare,

> require 'open-uri'
=> false
> begin
*   open "http://www.google.com", :proxy=>true
>   p "i'm ok"   #<-- ok codes here
> rescue
>   p "sorry can't do"  #<-- not ok codes here
> end
"i'm ok"
=> nil

> begin
*   open "http://this.does.not.exist.com", :proxy=>true
>   p "i'm ok"
> rescue => e
>   p "sorry can't do"
>   p "error is: #{e}"
> end
"sorry can't do"
"error is: 503 Service Unavailable"
=> nil

Davide Benini

9/11/2008 8:16:00 AM

Thanks for your super-fast answer :)
Yet none of this works on my system; to be doubly sure, I copied and
pasted.
In a script, I get "i'm ok" even when a page does not exist; in IRB, I
always get "sorry can't do".
Any suggestions?
Davide

> compare,
>
>> require 'open-uri'
> => false
>> begin
> * open "http://www.google..., :proxy=>true
>> p "i'm ok" #<-- ok codes here
>> rescue
>> p "sorry can't do" #<-- not ok codes here
>> end
> "i'm ok"
> => nil
>
>> begin
> * open "http://this.does.not.exist..., :proxy=>true
>> p "i'm ok"
>> rescue => e
>> p "sorry can't do"
>> p "error is: #{e}"
>> end
> "sorry can't do"
> "error is: 503 Service Unavailable"
> => nil


Axel Etzold

9/11/2008 8:40:00 AM

Dear Davide,

Hmm... the suggested code works on my system (Ubuntu 8.04 / ruby 1.8.7.p-22), both in scripts
and in irb.
How do you enter the code in irb?
Do you type

begin (enter)
line 1 (enter)
rescue (enter)
line 2 (enter)
end (enter)?

I tried this instead:

begin ; line 1 ; rescue ; line 2 ; end (enter)

With this, irb behaved correctly.

Best regards,

Axel



Davide Benini

9/11/2008 9:08:00 AM

Hi Axel,
I work on Mac OS X Leopard. Ruby works all right; I also have a number of
Rails websites running locally, with no problems so far.
Could you help me with the proper "script" syntax? I am sure the
rationale behind the mechanism is correct, but I ultimately need to
integrate this script into a Rails application, so I need to have it
working in a common .rb file. As I said, I tried this:

require 'open-uri'
begin
open "www.does.not.exist.sdadasdas.com", :proxy=>true
p "i'm ok" #<-- ok codes here
rescue => e
p "sorry can't do"
p "error is: #{e}"
end

Simple as it seems, it does not work, I always end up with "i'm ok". I'm
sure it's some stupid syntactic glitch...
Any suggestion?
Davide

Peña, Botp

9/11/2008 9:20:00 AM

From: Davide Benini [mailto:nutsmuggler@hotmail.com]
# I work on Mac OS X Leopard. Ruby works all right, I have also
# a number of
# rails websites running locally, no problems so far.
# Could you help me with the proper "script" syntax; I am sure the
# rationale behind the mechanism is correct, but I ultimately need to
# integrate this script in a rails application, so I need to have it
# working in a common .rb file. As I said, I tried this

try

# require 'open-uri'
#  begin
#    open "www.does.not.exist.sdadasdas.com", :proxy=>true

replace the above line with

    puts open("www.does.not.exist.sdadasdas.com",:proxy=>true).read

post the output again

#    p "i'm ok"   #<-- ok codes here
#  rescue => e
#   p "sorry can't do"
#   p "error is: #{e}"
# end

Peña, Botp

9/11/2008 9:26:00 AM

From: Davide Benini [mailto:nutsmuggler@hotmail.com]
#....
#    open "www.does.not.exist.sdadasdas.com", :proxy=>true

fwiw, mine does not work if i do not qualify the url, ie should be

   open "http://www.does.not.exist.sdadasdas.com", :proxy=>true

note the http://

but your case is weird, since it always works regardless :)

Davide Benini

9/11/2008 9:37:00 AM

> try
>
> # require 'open-uri'
> # begin
> # open "www.does.not.exist.sdadasdas.com", :proxy=>true
>
> replace the above line with
>
> puts open("www.does.not.exist.sdadasdas.com",:proxy=>true).read
>
> post the output again
>
> # p "i'm ok" #<-- ok codes here
> # rescue => e
> # p "sorry can't do"
> # p "error is: #{e}"
> # end

The output is

"sorry can't do"
"error is: can't convert Hash into String"

So now there is a type conflict...
Davide
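
The `can't convert Hash into String` message is consistent with the unqualified URL: as far as I know, without a scheme such as `http://` open-uri passes the string on to the ordinary file-opening `open`, which does not accept the `:proxy` options hash. A minimal sketch of a guard that rejects scheme-less strings before any fetch is attempted; `http_url?` is a hypothetical helper name, not part of open-uri.

```ruby
require 'uri'

# Hypothetical helper: true only for absolute http:// or https:// URLs.
def http_url?(string)
  URI.parse(string).is_a?(URI::HTTP)   # URI::HTTPS is a subclass of URI::HTTP
rescue URI::InvalidURIError
  false
end

puts http_url?("www.does.not.exist.sdadasdas.com")          # no scheme
puts http_url?("http://www.does.not.exist.sdadasdas.com")   # qualified
```

This only checks the shape of the string; whether the host resolves or the page exists still has to be found out by attempting the request and rescuing the resulting exception.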