
comp.lang.ruby

https html parsing

Arun Kumar

3/26/2009 7:14:00 AM

Hi,

Can anybody please explain how to fetch and parse HTML from an
'https' site using 'net/http'? I have read that basic authentication
can be used for this, but I would like to know whether there is a way
other than basic authentication to get HTML content from an https
site. I don't want to use any external libraries: only 'net/http', and
no parsing libraries such as 'Hpricot'. Can anybody please help me? I
would be really grateful.

N. B.
/usr/lib/ruby/1.8/net/http.rb:560:in `initialize': getaddrinfo: Name or
service not known (SocketError)
from /usr/lib/ruby/1.8/net/http.rb:560:in `open'
from /usr/lib/ruby/1.8/net/http.rb:560:in `connect'
from /usr/lib/ruby/1.8/timeout.rb:53:in `timeout'
from /usr/lib/ruby/1.8/timeout.rb:93:in `timeout'
from /usr/lib/ruby/1.8/net/http.rb:560:in `connect'
from /usr/lib/ruby/1.8/net/http.rb:553:in `do_start'
from /usr/lib/ruby/1.8/net/http.rb:542:in `start'
from example.rb:5
This is the error I get now when I use 'net/http'.

Regards
Arun Kumar
--
Posted via http://www.ruby-....

5 Answers

Brian Candler

3/27/2009 2:51:00 PM


Arun Kumar wrote:
> /usr/lib/ruby/1.8/net/http.rb:560:in `initialize': getaddrinfo: Name or
> service not known (SocketError)
...
> from example.rb:5
> This is the error which i get now when i use 'net/http'.

This error isn't useful unless you also post your example.rb.

Note that in order to talk to an https server, you will need net/https,
not net/http.

In some distributions (e.g. Ubuntu) you won't have net/https.rb until
you install another package. For Ubuntu, apt-get install libopenssl-ruby
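
As a minimal sketch of the point above: once net/https is available, the only change from plain net/http is switching SSL on before the connection is opened. The helper name build_https_client and the URL https://example.com/ are placeholders for illustration, not anything from this thread.

```ruby
require 'net/https'   # pulls in net/http plus OpenSSL support
require 'uri'

# Hypothetical helper: builds (but does not open) a connection object
# for the given URL and enables SSL when the scheme is https.
def build_https_client(url_string)
  url  = URI.parse(url_string)
  http = Net::HTTP.new(url.host, url.port)  # pass the host name only, not the full URL
  http.use_ssl = (url.scheme == 'https')
  http
end

# Placeholder URL; no network traffic happens until a request is made.
client = build_https_client('https://example.com/')
puts client.use_ssl?
puts client.port
```

Note that Net::HTTP.new expects a bare host name; handing it a whole URL string is one common way to end up with a getaddrinfo failure, though without the original example.rb that is only a guess.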

Arun Kumar

3/28/2009 4:13:00 AM


Brian Candler wrote:
> Arun Kumar wrote:
>> /usr/lib/ruby/1.8/net/http.rb:560:in `initialize': getaddrinfo: Name or
>> service not known (SocketError)
> ...
>> from example.rb:5
>> This is the error which i get now when i use 'net/http'.
>
> This error isn't useful unless you also post your example.rb
>
> Note that in order to talk to a https server, you will need net/https
> not net/http.
>
> In some distributions (e.g. Ubuntu) you won't have net/https.rb until
> you install another package. For Ubuntu, apt-get install libopenssl-ruby

Hi,
Thanks for your reply.

require 'net/https'
require 'uri'

uri = "https://www.cia...
url = URI.parse(uri)
# Keep the User-Agent on one line so the header value contains no newline.
req = Net::HTTP::Get.new('/', "User-Agent" => "Mozilla/4.0 (compatible MSIE 5.5; Windows NT 5.0)")
https = Net::HTTP.new(url.host, url.port)
https.use_ssl = true
res = https.start {|http|
  http.request(req)
}
puts res.body

This is the entire code of example.rb. I now use https, but I get a
warning like this when I try to access https sites:

warning: peer certificate won't be verified in this SSL session

and I'm not able to access sites where the URL redirects.

Please help me.

Regards
Arun Kumar

Eric Hodel

3/28/2009 4:34:00 AM


On Mar 27, 2009, at 21:13, Arun Kumar wrote:
> Brian Candler wrote:
>> Arun Kumar wrote:
>>> /usr/lib/ruby/1.8/net/http.rb:560:in `initialize': getaddrinfo:
>>> Name or
>>> service not known (SocketError)
>> ...
>>> from example.rb:5
>>> This is the error which i get now when i use 'net/http'.
>>
>> This error isn't useful unless you also post your example.rb
>>
>> Note that in order to talk to a https server, you will need net/https
>> not net/http.
>>
>> In some distributions (e.g. Ubuntu) you won't have net/https.rb until
>> you install another package. For Ubuntu, apt-get install libopenssl-
>> ruby
>
> Hi,
> Thanks for ur reply.
>
> require 'net/https'
> require 'uri'
>
> uri = "https://www.cia...
> url = URI.parse(uri)
> req = Net::HTTP::Get.new('/', "User-Agent"=>"Mozilla/4.0 (compatible
> MSIE 5.5; Windows NT 5.0)")
> https = Net::HTTP.new(url.host, url.port)
> https.use_ssl = true
> res = https.start {|http|
> http.request(req)
> }
> puts res.body
>
> This is my entire code of example.rb. Now i used https. I'm now
> getting
> a warning like this when i try to access https sites :
>
> warning: peer certificate won't be verified in this SSL session
>
> and i'm not able to access sites where the url is redirecting.

This is explained in `ri Net::HTTP` under "Following Redirection"
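
For readers without ri at hand, here is a rough sketch in the spirit of that documentation section, adapted for https. The names fetch and resolve_redirect and the limit of 5 are choices made for this illustration, and any URL passed in is up to the caller.

```ruby
require 'net/https'
require 'uri'

# Hypothetical helper: a Location header may be relative, so resolve it
# against the URL that produced the redirect.
def resolve_redirect(current_url, location)
  URI.join(current_url, location).to_s
end

# Fetch uri_str, following at most `limit` redirects.
def fetch(uri_str, limit = 5)
  raise ArgumentError, 'too many HTTP redirects' if limit == 0

  url  = URI.parse(uri_str)
  http = Net::HTTP.new(url.host, url.port)
  http.use_ssl = (url.scheme == 'https')
  response = http.start { |conn| conn.request(Net::HTTP::Get.new(url.request_uri)) }

  case response
  when Net::HTTPRedirection
    # 3xx answer: try again at the new location with one fewer attempt left.
    fetch(resolve_redirect(uri_str, response['location']), limit - 1)
  else
    response
  end
end
```

Calling `fetch('https://example.com/').body` (placeholder URL) would then return the body of whatever page the redirect chain ends on. The limit guards against redirect loops.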

Arun Kumar

3/28/2009 8:56:00 AM


Eric Hodel wrote:
> On Mar 27, 2009, at 21:13, Arun Kumar wrote:
>> ...
>>
>> This is my entire code of example.rb. Now i used https. I'm now
>> getting
>> a warning like this when i try to access https sites :
>>
>> warning: peer certificate won't be verified in this SSL session
>>
>> and i'm not able to access sites where the url is redirecting.
>
> This is explained in `ri Net::HTTP` under "Following Redirection"

Hi,
I solved the redirect problem, but the https problem still remains.
Please help me.

Regards
Arun Kumar

Brian Candler

3/28/2009 5:51:00 PM


Arun Kumar wrote:
> I solved the redirecting problem. But still the https problem remains.

The warning is just telling you that it doesn't have any local CA root
certificates with which to verify the certificate presented by the peer.

If you want to enable certificate verification, and set the path to the
directory containing your CA root certificates, there is example code
here:

http://svn.ruby-lang.org/cgi-bin/viewvc.cgi/branches/ruby_1_8/sample/openssl/wget.rb?v...

There is more documentation at the top of net/https.rb, which will be
installed on your system somewhere, e.g. /usr/lib/ruby/1.8/net/https.rb

You may already have root CA certificates installed. On my Ubuntu Hardy
box they are in /etc/ssl/certs/
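
A minimal sketch of what enabling verification might look like, assuming the CA certificates really do live in /etc/ssl/certs on your machine (adjust the path otherwise). The helper name verified_client and the URL are placeholders invented for this example.

```ruby
require 'net/https'
require 'openssl'
require 'uri'

# Assumption: CA root certificates are in this directory, as on the
# Ubuntu box mentioned above. Other systems may use a different path.
CA_DIR = '/etc/ssl/certs'

# Hypothetical helper: builds an https connection object that will
# verify the server certificate once the connection is started.
def verified_client(url_string, ca_dir = CA_DIR)
  url  = URI.parse(url_string)
  http = Net::HTTP.new(url.host, url.port)
  http.use_ssl     = true
  http.verify_mode = OpenSSL::SSL::VERIFY_PEER      # reject unverifiable certificates
  http.ca_path     = ca_dir if File.directory?(ca_dir)
  http
end

client = verified_client('https://example.com/')    # placeholder URL
puts client.verify_mode == OpenSSL::SSL::VERIFY_PEER
```

With verify_mode set explicitly, the warning should no longer appear, and a request to a server whose certificate cannot be verified should raise an OpenSSL::SSL::SSLError instead of silently succeeding.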