comp.lang.ruby

Gathering Links

joey__

1/19/2006 11:10:00 PM

Hello,
I am looking for some help with a regular expression. I would like a regexp
that matches HTML links. I have tried, but I can't seem to get
anything working. I would appreciate any help.

Thanks
Joey

--
Posted via http://www.ruby-....


1 Answer

Eero Saynatkari

1/19/2006 11:56:00 PM

joey__ wrote:
> Hello,
> I am looking for some help with a regular expression. I would like a regexp
> that matches HTML links. I have tried, but I can't seem to get
> anything working. I would appreciate any help.

You might just want to run the HTML through htmltidy
to generate an XML document and parse that, or else use
the htree library for the same purpose. Either would
probably be the more robust solution.
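
To illustrate the parse-it-as-XML route, here is a minimal sketch. It assumes the HTML has already been cleaned into well-formed XML (for example by a tidy-style tool) and uses REXML, which ships with Ruby, rather than htree. The method name links_from_xml and the sample markup are made up for the example.

```ruby
require "rexml/document"

# Hypothetical helper: returns [href, text] pairs for every <a> element.
# Assumes `xml` is well-formed; REXML raises a parse error otherwise.
def links_from_xml(xml)
  doc = REXML::Document.new(xml)
  links = []
  # "//a" is an XPath expression selecting every <a> element in the document.
  doc.elements.each("//a") do |a|
    links << [a.attributes["href"], a.text]
  end
  links
end

# Made-up sample input standing in for the cleaned-up document.
sample = <<~XML
  <html>
    <body>
      <p>See <a href="http://example.com/">the example site</a> and
      <a href="/docs">the docs</a>.</p>
    </body>
  </html>
XML

links_from_xml(sample).each { |href, text| puts "#{href} -> #{text}" }
```

Note that XML is case-sensitive, so this only finds lowercase <a> elements; a cleanup step would normally have lowercased the tag names already.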

On the other hand, if you want to use regexps,
something like this should work (untested).

First you have to match the start of the opening tag
(there might be some whitespace):

/<\s*a

Next, gather any attributes in the opening tag:

([^>]*)>

The link text comes next:

(.*?)

The text section is ended by the closing anchor
tag (no other tags are appropriate):

<\s*\/\s*a[^>]*>

Finally, we want to match case-insensitively
(A vs. a) and across line breaks (in Ruby, the m
flag lets . match newlines as well):

/im

So, $1 will be the attributes and $2 the link text.
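
Put together, the pieces above might look like the following sketch. Two things here are additions that are not in the steps above and should be read as assumptions: the \b after the a (so that tags such as <abbr> do not match), and the hypothetical HREF_RE and gather_links helpers, which pull the href value out of the captured attribute string. It will still be fooled by a > inside an attribute value, and any tags nested inside the link text are returned as-is.

```ruby
# The opening tag, attributes, link text and closing tag joined into one
# pattern. The \b is an addition so that <abbr>, <area>, etc. are skipped.
LINK_RE = /<\s*a\b([^>]*)>(.*?)<\s*\/\s*a[^>]*>/im

# Hypothetical helper pattern: a quoted href value inside the attributes.
# Group 1 is the quote character, group 2 the URL.
HREF_RE = /href\s*=\s*(["'])(.*?)\1/im

# Hypothetical helper: returns one hash per link found in the HTML string.
def gather_links(html)
  # With two capture groups, scan yields [attributes, link_text] pairs,
  # the same values that $1 and $2 would hold after a single match.
  html.scan(LINK_RE).map do |attrs, text|
    {
      href: attrs[HREF_RE, 2],                # nil if there is no quoted href
      text: text.strip.gsub(/\s+/, " ")       # collapse line breaks in the text
    }
  end
end

# Made-up sample input.
html = <<~HTML
  <p>Visit <a href="http://example.com/">Example</a> or
  <A HREF='/docs' class="nav">the
  docs</A>. <abbr title="x">not a link</abbr></p>
HTML

gather_links(html).each { |link| puts "#{link[:href]} -> #{link[:text]}" }
```

String#scan is used instead of a single match so that every link in the document is collected, not just the first one.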

> Thanks
> Joey


E
