comp.lang.ruby

Re: Is there a Ruby library that does HTML entity parsing?

John Lam

4/28/2005 3:06:00 PM

Steve, I've been using the URI lib quite a bit, but it doesn't parse entities. One feature that would be nice to add is a method that computes the URL of the directory containing a document, given the document's complete URL.

For example, consider:

http://www.fo...som...

This document clearly lives in

http://www.fo...

This is required if you want to resolve document-relative URL links as in:

<a href="otherpage.htm">...

which would send you to:

http://www.fo...otherpage.htm

I've hacked this thing in for my own code, which was my first foray into ranges etc:

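require 'cgi'

# Resolve a document-relative link against the URL of the containing document.
def getDocumentRelativeUri(documentUri, uri_str)
  doc = documentUri.to_s
  # Keep everything up to and including the last '/', then append the link.
  CGI.unescapeHTML(doc[0..doc.rindex('/')] + uri_str)
end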
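
For comparison, here is a minimal sketch of the same idea built on URI.join from the standard library, which also copes with links like "../up.htm". The helper name resolve_link and the example.com URLs are made up for illustration and are not from the code above.

```ruby
require 'cgi'
require 'uri'

# Hypothetical helper: decode HTML entities in the href, then resolve it
# against the URL of the document that contains the link.
def resolve_link(document_url, href)
  URI.join(document_url.to_s, CGI.unescapeHTML(href)).to_s
end

# Example URLs are placeholders, not taken from the thread.
puts resolve_link('http://example.com/docs/page.htm', 'otherpage.htm?foo=abc&amp;bar=1')
puts resolve_link('http://example.com/docs/page.htm', '../up.htm')
```

Decoding only the href, rather than the joined string, leaves the document URL untouched, which seems safer if that URL is already in decoded form.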

-John
http://www.iu...


________________________________

From: Steve Kellock [mailto:skellock@gmail.com]
Sent: Thu 4/28/2005 10:30 AM
To: ruby-talk ML
Subject: Re: Is there a Ruby library that does HTML entity parsing?



John,

If you haven't yet, check out the URI lib that ships with Ruby. It's
pretty hot $%@*. Not sure if it handles QSVs though. Maybe you could
tell me? :)

Steve

On 4/28/05, John Lam <jlam@iunknown.com> wrote:
> Simple stuff like
>
> mypage.htm?foo=abc&amp;bar=1 => mypage.htm?foo=abc&bar=1
>
> and its reverse?
>
> Thanks
> -John
> http://www.iu...
>
>
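
The conversion asked about above, and its reverse, can be sketched with the cgi library that ships with Ruby. This is only an illustration: as far as I know CGI.unescapeHTML handles a small set of named entities plus numeric references, so it is not a full HTML entity parser.

```ruby
require 'cgi'

encoded = 'mypage.htm?foo=abc&amp;bar=1'

# &amp; becomes a plain ampersand
decoded = CGI.unescapeHTML(encoded)
puts decoded

# and the reverse direction
puts CGI.escapeHTML(decoded)
```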






1 Answer

Ilmari Heikkinen

4/28/2005 3:41:00 PM


On 28.4.2005, at 18:05, John Lam wrote:

> Steve, I've been using the URI lib quite a bit, but it doesn't parse
> entities. One feature that would be nice to add is one that calculates
> the URL for the directory that contains a document, given its complete
> URL.
>
> For example, consider:
>
> http://www.fo...som...
>
> This document clearly lives in
>
> http://www.fo...
>

This can also be done using File.split / dirname / basename:

File.dirname "http://foo.com/bar/stuff....
#=> "http://foo.com...

File.basename "http://foo.com/bar/stuff....
#=> "stuff.html"

File.split "http://foo.com/bar/stuff....
#=> ["http://foo.com..., "stuff.html"]

File.join( File.dirname("http://foo.com/bar/doc....),
"relative_link.html" )
# => "http://foo.com/bar/relative_link....

Though that probably breaks on Windows since it has backslashes for
directory separators.
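
A runnable version of the File-based approach, next to URI.join for comparison. The example.com URLs are placeholders I made up, since the URLs above are truncated. The second case is an assumption about where plain string splitting could go wrong: a query string that itself contains a slash.

```ruby
require 'uri'

url = 'http://example.com/bar/stuff.html'   # placeholder URL

dir  = File.dirname(url)                    # treats the URL as a path string
base = File.basename(url)
via_file = File.join(dir, 'relative_link.html')
puts dir, base, via_file

# A slash inside the query string confuses File.dirname,
# while URI.join resolves against the path component only.
tricky   = 'http://example.com/bar/stuff.html?next=/a/b'
file_dir = File.dirname(tricky)
via_uri  = URI.join(tricky, 'relative_link.html').to_s
puts file_dir, via_uri
```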

Cheers,
Ilmari Heikkinen