comp.lang.ruby

How do I parse a string to find a URL?

Jayson Williams

9/17/2007 5:52:00 PM

Is there a command in Ruby that will accept a string and spit out a
URL that is contained in the string? I think I remember reading about
something that would do this, but I can't recall.

7 Answers

Jano Svitok

9/17/2007 6:42:00 PM

On 9/17/07, Jayson Williams <williams.jayson@gmail.com> wrote:
> Is there a command in Ruby that will accept a string, and spit out a
> URL that is contained in the string? I think I remember reading about
> something that would do this, but I can't recall.

URI::extract

http://ruby-doc.org/core/classes/URI.ht...
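
For anyone reading along, here is a minimal sketch of how URI.extract can be called. The sample text and the example.com addresses are made up for illustration. Recent Ruby versions document URI.extract as obsolete, so treat this as a sketch rather than a recommendation.

```ruby
require "uri"

# Hypothetical input text, for illustration only.
text = "Docs are at http://www.example.com/page and ftp://ftp.example.com/file.txt"

# With no second argument, every URI-like token is returned.
p URI.extract(text)
# ["http://www.example.com/page", "ftp://ftp.example.com/file.txt"]

# Passing a list of schemes restricts the matches to those schemes.
p URI.extract(text, %w[http https])
# ["http://www.example.com/page"]
```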

Jayson Williams

9/17/2007 7:21:00 PM

Outstanding!
Thanks


Daniel DeLorme

9/17/2007 11:46:00 PM

Jano Svitok wrote:
> On 9/17/07, Jayson Williams <williams.jayson@gmail.com> wrote:
>> Is there a command in Ruby that will accept a string, and spit out a
>> URL that is contained in the string? I think I remember reading about
>> something that would do this, but I can't recall.
>
> URI::extract
>
> http://ruby-doc.org/core/classes/URI.ht...

Wow, I didn't know about that, very nice. But it has a few weaknesses:
>> URI.extract("behold: www.abc.com and http://www.xyz....)
=> ["behold:", "http://www.xyz....]
(notice the period at the end of xyz.com)

Daniel
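
One possible workaround for the two results above, as a rough sketch: restrict the match to http/https so that "behold:" is skipped, then trim trailing sentence punctuation. The helper name extract_web_urls and the example.com input are hypothetical, and scheme-less hosts such as www.abc.com are still not picked up.

```ruby
require "uri"

# Hypothetical helper: keep only http/https matches and trim trailing
# sentence punctuation that URI.extract may treat as part of the URL.
def extract_web_urls(text)
  URI.extract(text, %w[http https]).map do |url|
    url.sub(/[.,;:!?]+\z/, "")
  end
end

p extract_web_urls("behold: http://www.example.com.")
# ["http://www.example.com"]
```

The trade-off is that a URL that legitimately ends in one of those punctuation characters will be shortened as well.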

flazzarino

9/18/2007 1:32:00 AM

On Sep 17, 7:46 pm, Daniel DeLorme <dan...@dan42.com> wrote:
> Jano Svitok wrote:
> > On 9/17/07, Jayson Williams <williams.jay...@gmail.com> wrote:
> >> Is there a command in Ruby that will accept a string, and spit out a
> >> URL that is contained in the string? I think I remember reading about
> >> something that would do this, but I can't recall.
>
> > URI::extract
>
> >http://ruby-doc.org/core/classes/URI.ht...
>
> Wow, I didn't know about that, very nice. But it has a few weaknesses:
> >> URI.extract("behold: www.abc.com and http://www.xyz....)
> => ["behold:", "http://www.xyz....]
> (notice the period at the end of xyz.com)
>
> Daniel

Not a weakness: in that string, 'behold:' is a valid URI, since it has a
scheme followed by the scheme delimiter (":"). "www.abc.com" is not an
unambiguous URI because no scheme is present.

http://en.wikipedia.org/wiki/Uniform_Resource_...
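
To illustrate the point about schemes, here is a small sketch using URI.parse. The "http://www.abc.com" input is an assumption added for contrast; only "www.abc.com" comes from the example being discussed.

```ruby
require "uri"

with_scheme    = URI.parse("http://www.abc.com")
without_scheme = URI.parse("www.abc.com")

p with_scheme.scheme      # "http"
p with_scheme.host        # "www.abc.com"

# Without a scheme the string is parsed as a relative reference,
# so the whole thing ends up in the path and there is no host.
p without_scheme.scheme   # nil
p without_scheme.host     # nil
p without_scheme.path     # "www.abc.com"
```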

flazzarino

9/18/2007 1:32:00 AM

Also, the trailing period is a legal character in a URI.

Daniel DeLorme

9/18/2007 2:06:00 AM

franco wrote:
> not a weakness, in that string 'behold:' is a valid uri, it has a
> scheme with a scheme delimiter (":"). "www.abc.com" is not an
> unambiguous uri, no scheme present.

Is it a valid uri if nothing is present after the scheme? Anyway, I know
that the results are technically valid but they are less than useful if
you want, say, to extract and "linkify" urls that users might have
written inside a message. (which is what I assumed the OP wanted but I
might have been mistaken)

Daniel
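
For the "linkify" use case, something along these lines might do as a starting point. It is only a sketch: the linkify helper is hypothetical, it only handles http/https, it assumes the message is already safe to embed in HTML, and URI.regexp is documented as obsolete in recent Ruby versions.

```ruby
require "uri"

# Hypothetical helper: wrap each http/https URL in an anchor tag.
# Assumes the input text is already safe to embed in HTML.
def linkify(text)
  pattern = URI.regexp(%w[http https])
  text.gsub(pattern) do |match|
    # Keep trailing sentence punctuation outside the link.
    url = match.sub(/[.,;:!?]+\z/, "")
    trailing = match[url.length..-1]
    %(<a href="#{url}">#{url}</a>#{trailing})
  end
end

puts linkify("Visit http://www.example.com.")
# Visit <a href="http://www.example.com">http://www.example.com</a>.
```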


flazzarino

9/18/2007 1:29:00 PM

On Sep 17, 10:06 pm, Daniel DeLorme <dan...@dan42.com> wrote:
> franco wrote:
> > not a weakness, in that string 'behold:' is a valid uri, it has a
> > scheme with a scheme delimiter (":"). "www.abc.com" is not an
> > unambiguous uri, no scheme present.
>
> Is it a valid uri if nothing is present after the scheme? Anyway, I know
> that the results are technically valid but they are less than useful if
> you want, say, to extract and "linkify" urls that users might have
> written inside a message. (which is what I assumed the OP wanted but I
> might have been mistaken)
>
> Daniel

You could just select the ones that have a scheme-specific part? Or
screen-scrape with the XPath //a/@href to get the href of every anchor (link).
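
The first suggestion could be sketched like this: drop any match that is nothing but a scheme and a colon. The constant SCHEME_ONLY, the helper name, and the example.com input are all made up for illustration.

```ruby
require "uri"

# Matches strings that consist solely of a scheme and its ":" delimiter,
# i.e. candidates with an empty scheme-specific part such as "behold:".
SCHEME_ONLY = /\A[A-Za-z][A-Za-z0-9+.\-]*:\z/

# Hypothetical helper: keep only matches that have something after the scheme.
def uris_with_scheme_specific_part(text)
  URI.extract(text).reject { |candidate| candidate =~ SCHEME_ONLY }
end

p uris_with_scheme_specific_part("behold: http://www.example.com/page")
# ["http://www.example.com/page"]
```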