
comp.lang.ruby

regexp strip html

pere.noel

3/26/2006 2:33:00 PM

i have a regexp able to strip html :

/<[^>]*>/

however, between <script> and </script> all the text is preserved, then
i've tried :

def stripHTML
  # self.gsub(/<\S[^><]*>/, '')
  self.gsub(/\A.*<body [^>]*>(.*)<\/body>\s*\Z/, '\1').gsub(/<[^>]*>/, '')
end

without success : the various javascript functions are still kept.

what's my error here ?

--
une bévue
2 Answers

Paul Battley

3/27/2006 12:43:00 PM


On 26/03/06, Une bévue <pere.noel@laponie.com.invalid> wrote:
> i have a regexp able to strip html :
>
> /<[^>]*>/
>
> however, between <script> and </script> all the text is preserved, then...
> what's my error here ?

Look at it this way: you have '<script>Javascript</script>'. You remove everything between angle brackets. You still have 'Javascript', because that's not actually inside <...>.

The simplest solution is probably to do something like this before stripping out the remaining tags:

gsub(/<script.*?<\/script>/im, '')

Paul.
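
A minimal sketch of the two-pass approach described above, assuming the input is a plain String. The helper name strip_html and the sample markup are made up for illustration and do not come from the thread. The /m flag lets . match newlines, so script bodies that span several lines are removed as well. A regexp like this is not a full HTML parser and can still be fooled by unusual markup.

```ruby
# Hypothetical helper (name not from the thread): remove <script> elements
# together with their contents, then strip whatever tags remain.
def strip_html(html)
  # Pass 1: lazy match from "<script" to the first "</script>";
  # /i ignores case, /m makes "." match newlines too.
  without_scripts = html.gsub(/<script\b.*?<\/script\s*>/im, '')
  # Pass 2: the original tag-stripping regexp from the question.
  without_scripts.gsub(/<[^>]*>/, '')
end

page = "<p>Hello </p><script type=\"text/javascript\">\nalert('hi');\n</script><b>world</b>"
puts strip_html(page)  # prints "Hello world"
```

Running only the second pass on the same input would leave the JavaScript source in the output, which is the behaviour reported in the question.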

pere.noel

3/27/2006 1:24:00 PM


Paul Battley <pbattley@gmail.com> wrote:

> The simplest solution is probably to do something like this before
> stripping out the remaining tags:
>
> gsub(/<script.*?<\/script>/im, '')

yes, sounds clever ))

--
une bévue