Todd Benson
3/1/2008 7:00:00 AM
On Fri, Feb 29, 2008 at 10:55 PM, William James <w_a_x_man@yahoo.com> wrote:
> On Feb 29, 9:34 pm, Gregory Seidman <gsslist+r...@anthropohedron.net>
> wrote:
>
> > On Sat, Mar 01, 2008 at 12:22:12PM +0900, Tom Arra wrote:
> > > So I am new to Ruby scripting so I am not sure if this is possible or
> > > not. I want to make a script that will load a webpage and then search
> > > through the HTML of that page until it hits a certain tag. Once it hits
> > > that tag it need to grab all of the text between the tag and the
> > > appropriate end tag. Is something like this possible?
> >
> > > Example
> > > <html>
> > > <body>
> > > <h3>test</h3>
> > > </body>
> > > </html>
> >
> > > I want the script to return "test"
> >
> > You want the Hpricot gem.
>
> No, he doesn't.
Same question, different people, same strict requirements. It sounds
a little like homework. In that case, I suppose some of the regexp
solutions provided will work (for this small use case).
I still think Florian said it best, though. Unless you can keep a
"stack" of the tags you are currently inside, you won't be able to
correctly pull apart the components of a nested language structure.
I haven't looked into the theory, but I can attest to the pain in the
arse I've had trying to scrape with regular expressions.
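
To make the "stack" point concrete, here is a rough sketch, not a
recommendation over a real parser. The helper name tag_contents and the
sample strings are made up for illustration. It still uses a regexp to
find the tags, but pairs each close tag with its open tag using a stack.
It assumes well-formed input and does not handle self-closing tags,
comments, or CDATA.

```ruby
# Hypothetical helper (not from any library): return the text between
# every <tag> ... </tag> pair, pairing tags with a stack so that nested
# tags of the same name come out right.
def tag_contents(html, tag)
  results = []
  stack = []  # offsets where the content of each still-open tag begins
  html.scan(/<(\/?)#{Regexp.escape(tag)}\b[^>]*>/i) do
    m = Regexp.last_match
    if m[1].empty?
      stack.push(m.end(0))                  # open tag: remember content start
    elsif (start = stack.pop)
      results << html[start...m.begin(0)]   # close tag: pair with latest open
    end
  end
  results
end

page = "<html>\n<body>\n<h3>test</h3>\n</body>\n</html>"
p tag_contents(page, "h3")      # => ["test"]

nested = "<div>outer <div>inner</div> tail</div>"
# A lone non-greedy regexp stops at the first </div> it sees:
p nested[/<div>(.*?)<\/div>/m, 1]   # => "outer <div>inner"
# With the stack, each close tag is matched to the right open tag:
p tag_contents(nested, "div")   # => ["inner", "outer <div>inner</div> tail"]
```

For the original h3 example both approaches give "test"; they only
diverge once the same tag can nest inside itself.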
Todd