comp.lang.ruby

Re: Bug in URI.parse?

Felix Windt

8/29/2007 9:20:00 PM

> -----Original Message-----
> From: Andrew Beers [mailto:beers@tableausoftware.com]
> Sent: Wednesday, August 29, 2007 1:49 PM
> To: ruby-talk ML
> Subject: Re: Bug in URI.parse?
>
> Ok lots of good responses, thanks! A few comments:
>
> Felix: while URI.parse() is behaving according to the two cited RFCs, I
> think it is missing an important use case. In "http://3beers-wrk",
> "3beers-wrk" isn't a domain name, is it? It is an unqualified host name


That's fair - the RFC does mention that single unqualified hostnames should
work. I don't have enough time at work right now to look for an RFC covering
those (I'm not even sure there is one) or to check what naming standards it
defines, but that might be worth investigating.

> (I assume we'd pick the host name up from context). Now, the RFC also
> suggests that host names must follow these rules (starting with a letter,
> etc.), and furthermore, all components of a domain name just follow this
> convention, which suggests that the regexp in common.rb is also
> incorrect. :)

I think it does act correctly for qualified domain names, which is
important.

>
> Also, the solution of "rename the host" is a non-solution when dealing
> with customers, who are using an otherwise perfectly acceptable hostname
> (I haven't found a tool yet that will balk at a hostname beginning with
> a number)

That's true :o)

>
> Now, I'm not sure if the RFCs have been replaced by newer versions -
> that would take some digging.

I'm relatively certain they have not.

>
> So, John, I'd say that this is a bug in URI.parse, since it follows
> neither the published RFCs nor the practical implementation of them
> today (as Coey points out). And if it follows neither, it's really not
> a very good general purpose function in the Ruby library and so should
> be fixed.
>
> Andrew

Together with:

> -----Original Message-----
> From: RubyTalk@gmail.com [mailto:rubytalk@gmail.com]
> Sent: Wednesday, August 29, 2007 1:17 PM
> To: ruby-talk ML
> Subject: Re: Bug in URI.parse?
>
> If it is a bug change toplabel in common.rb to this
>
> TOPLABEL = "(?:[#{ALNUM}](?:[-#{ALNUM}]*[#{ALNUM}])?)"
>
> Thanks to my friendly MySQL admin .
>
> Stephen Becker IV


If it is a bug (maybe you should file it on the core mailing list and
enquire?), here's a better fix:

$ ruby -v
ruby 1.8.5 (2006-08-25) [i486-linux]
diff for uri/common.rb (patched file first, so the "<" line is the new
pattern and the ">" line is the stock one):
56c56
< HOSTNAME = "(?:(?:#{DOMLABEL}\\.)+#{TOPLABEL}\\.?)|(?:#{DOMLABEL}?)"
---
> HOSTNAME = "(?:#{DOMLABEL}\\.)*#{TOPLABEL}\\.?"


If it's a qualified domain name, enforce things as they were. If there is
just a single label with no dots (no sub-domains or domains in front of a top
level domain), accept it as a single, unqualified hostname under the
sub-domain naming standards (so it can start with a number).
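
For anyone who wants to try the two patterns without touching the library,
here is a standalone sketch. The module name HostnameSketch and the sample
hosts www.example.com and 2.example.com are made up for illustration, and the
ALPHA/ALNUM/DOMLABEL/TOPLABEL definitions are my reading of the 1.8
uri/common.rb, so treat them as assumptions rather than a copy of the file.
Only HOSTNAME differs between the two variants.

```ruby
# Standalone sketch: does NOT touch the URI library.
# Character classes and labels are assumed to mirror Ruby 1.8's uri/common.rb.
module HostnameSketch
  ALPHA = "a-zA-Z"
  ALNUM = "#{ALPHA}\\d"

  # domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
  DOMLABEL = "(?:[#{ALNUM}](?:[-#{ALNUM}]*[#{ALNUM}])?)"
  # toplabel = alpha | alpha *( alphanum | "-" ) alphanum
  TOPLABEL = "(?:[#{ALPHA}](?:[-#{ALNUM}]*[#{ALNUM}])?)"

  # Stock pattern: every hostname must end in a label starting with a letter.
  STOCK_HOSTNAME = "(?:#{DOMLABEL}\\.)*#{TOPLABEL}\\.?"
  # Patched pattern: qualified names as before, OR one bare domain label.
  PATCHED_HOSTNAME = "(?:(?:#{DOMLABEL}\\.)+#{TOPLABEL}\\.?)|(?:#{DOMLABEL}?)"

  # Anchor both so the whole string has to be a hostname.
  STOCK_RE   = /\A(?:#{STOCK_HOSTNAME})\z/
  PATCHED_RE = /\A(?:#{PATCHED_HOSTNAME})\z/
end

%w[www.example.com 2.example.com 2test 2test.4bad].each do |host|
  stock   = HostnameSketch::STOCK_RE.match?(host)
  patched = HostnameSketch::PATCHED_RE.match?(host)
  puts format("%-16s stock=%-5s patched=%s", host, stock, patched)
end
```

The only host that changes is 2test: the stock pattern rejects it because the
last label has to start with a letter, while the patched pattern accepts it
through the single-label alternative. 2test.4bad is rejected by both.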

With that change:

irb(main):001:0> require 'uri'
=> true
irb(main):002:0> URI.parse('http://www.exampl...)
=> #<URI::HTTP:0xfdbdf1726 URL:http://www.examp...
irb(main):003:0> URI.parse('http://2.exampl...)
=> #<URI::HTTP:0xfdbdf03ee URL:http://2.examp...
irb(main):004:0> URI.parse('http://2test')
=> #<URI::HTTP:0xfdbdef250 URL:http://2test>
irb(main):005:0> URI.parse('http://2test...)
URI::InvalidURIError: the scheme http does not accept registry part:
2test.4bad (or bad hostname?)
from /usr/lib/ruby/1.8/uri/generic.rb:194:in `initialize'
from /usr/lib/ruby/1.8/uri/http.rb:46:in `initialize'
from /usr/lib/ruby/1.8/uri/common.rb:484:in `new'
from /usr/lib/ruby/1.8/uri/common.rb:484:in `parse'
from (irb):5
from :0
irb(main):006:0>

Which should make everyone happy.


Unfortunately, you will have to edit your uri/common.rb file directly for
that. Since these are declared as constants, you _can_ override them by
redeclaring all the modules involved (you'll have to redeclare several
patterns and regular expressions), but you will trigger warnings that way.
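
To show why redeclaring one constant is not enough, here is a toy sketch.
ToyPatterns is a hypothetical stand-in with simplified patterns, not the real
URI module. The point is only that the pattern strings are interpolated once,
when the constant is defined, so everything built on top of a changed
constant has to be rebuilt as well.

```ruby
# Hypothetical stand-in for the pattern constants in uri/common.rb.
# The patterns are deliberately simplified and are not the real ones.
module ToyPatterns
  DOMLABEL    = "[a-z0-9]+"
  TOPLABEL    = "[a-z][a-z0-9]*"
  HOSTNAME    = "(?:#{DOMLABEL}\\.)*#{TOPLABEL}" # interpolated once, here
  HOSTNAME_RE = /\A(?:#{HOSTNAME})\z/
end

# Reopen the module and redeclare a single constant. Ruby warns on stderr
# ("already initialized constant") but carries on.
module ToyPatterns
  TOPLABEL = "[a-z0-9]+"
end

# HOSTNAME_RE still holds the old TOPLABEL text, so nothing has changed yet.
stale = ToyPatterns::HOSTNAME_RE.match?("2test")

# Rebuild every constant that depended on TOPLABEL (more warnings).
module ToyPatterns
  HOSTNAME    = "(?:#{DOMLABEL}\\.)*#{TOPLABEL}"
  HOSTNAME_RE = /\A(?:#{HOSTNAME})\z/
end

fresh = ToyPatterns::HOSTNAME_RE.match?("2test")

puts "before rebuilding dependents: #{stale}"
puts "after rebuilding dependents:  #{fresh}"
```

In the real library the chain of dependent patterns is a lot longer than in
this toy, which is why editing the file is the less painful route.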

Hope that helps,

Felix