comp.lang.ruby

Return first line of parsing

Haze Noc

8/17/2007 1:53:00 PM

mysite.each {|line|
  if line =~ /<p><a href="(.+)"><b>(.+)<\/b>/
    puts "#{$2} found at: #{$1}"
  end
}

OK guys, let's say the website has 50+ lines and I only want to return
the first one. Any ideas?
--
Posted via http://www.ruby-....

4 Answers

Tim Pease

8/17/2007 1:59:00 PM


On 8/17/07, Haze Noc <h4z3@the-c0re.org> wrote:
> mysite.each {|line|
> if line =~ /<p><a href="(.+)"><b>(.+)<\/b>/
> puts "#{$2} found at: #{$1}"
> end
> }
>
> Ok guys, Lets say the website has 50+ lines.. and i only want to return
> the first one, any ideas?

%r/^(.*)$/.match(mysite)[1]
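
To make the behavior of that one-liner concrete, here is a minimal sketch. It assumes mysite is a String holding the whole page; the sample HTML below is made up for illustration and is not from the thread. In Ruby regexes, ^ and $ anchor at line boundaries and . does not match a newline, so the capture is simply the first line of the text, whether or not that line contains a link.

```ruby
# Hypothetical sample page; the thread never shows what mysite really holds.
mysite = <<HTML
<html>
<p><a href="http://example.com/a"><b>First</b></a></p>
<p><a href="http://example.com/b"><b>Second</b></a></p>
HTML

# ^ and $ anchor at line boundaries in Ruby, and . does not match "\n",
# so group 1 is whatever sits on the first line of the string.
first_line = %r/^(.*)$/.match(mysite)[1]

puts first_line
```

With this sample the first line is the <html> tag, not the first link, which is the distinction raised later in the thread.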

John Joyce

8/17/2007 10:00:00 PM



On Aug 17, 2007, at 8:59 AM, Tim Pease wrote:

> On 8/17/07, Haze Noc <h4z3@the-c0re.org> wrote:
>> mysite.each {|line|
>> if line =~ /<p><a href="(.+)"><b>(.+)<\/b>/
>> puts "#{$2} found at: #{$1}"
>> end
>> }
>>
>> Ok guys, Lets say the website has 50+ lines.. and i only want to
>> return
>> the first one, any ideas?
>
> %r/^(.*)$/.match(mysite)[1]
>
Careful.
What if the site's whitespace has been stripped (no CR or LF at all)?
Or what if the HTML/XHTML is screwy (old HTML without closed elements,
or just poorly formed or badly nested)?
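
One way to sidestep the line-break problem, sketched here under assumptions: match against the whole string instead of going line by line, and tighten the pattern so it cannot run past a single link. The sample page and the variable names (page, link_pattern) are hypothetical. On a page with no newlines, the greedy .+ from the original pattern can span several links, so this sketch swaps in negated character classes. It does nothing for badly formed HTML; for that, a real HTML parser is likely the safer route.

```ruby
# Hypothetical page with every line break stripped (not from the thread).
page = '<p><a href="http://example.com/a"><b>First</b></a></p>' \
       '<p><a href="http://example.com/b"><b>Second</b></a></p>'

# [^"]+ stops at the closing quote and [^<]+ stops at the next tag,
# unlike the greedy .+ in the original pattern.
link_pattern = /<p><a href="([^"]+)"><b>([^<]+)<\/b>/

# String#match returns only the first match in the whole string,
# so it does not matter where (or whether) the newlines are.
first = page.match(link_pattern)
puts "#{first[2]} found at: #{first[1]}" if first
```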

Konrad Meyer

8/17/2007 11:08:00 PM


yermej

8/17/2007 11:22:00 PM


On Aug 17, 8:52 am, Haze Noc <h...@the-c0re.org> wrote:
> mysite.each {|line|
> if line =~ /<p><a href="(.+)"><b>(.+)<\/b>/
> puts "#{$2} found at: #{$1}"
> end
>
> }
>
> Ok guys, Lets say the website has 50+ lines.. and i only want to return
> the first one, any ideas?
> --
> Posted via http://www.ruby-....

If you want to use essentially the same block as above, but just take
the first matching line:

mysite.each {|line|
  if line =~ /<p><a href="(.+)"><b>(.+)<\/b>/
    puts "#{$2} found at: #{$1}"
    break
  end
}

Tim's solution would give you the first line of the actual HTML file
and, as John mentions, that could be the entire web page if there are
no CR/LF characters in the file.

Jeremy
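
A variation on the break approach, as a rough sketch: wrap the loop in a method and return as soon as a line matches, so the caller gets the values back instead of only printing them. The method name first_link and the sample page are hypothetical. It uses each_line rather than each because String#each was removed in Ruby 1.9, while each_line also exists in 1.8; this assumes the page is held in a String.

```ruby
# Returns [text, url] for the first line that matches, or nil if none does.
# first_link is a made-up helper name, not something from the thread.
def first_link(html)
  html.each_line do |line|
    # Same pattern as in the thread.
    if line =~ /<p><a href="(.+)"><b>(.+)<\/b>/
      return [$2, $1] # leaving the method also stops the iteration
    end
  end
  nil
end

# Hypothetical sample page for illustration.
page = <<HTML
<html>
<p><a href="http://example.com/a"><b>First</b></a></p>
<p><a href="http://example.com/b"><b>Second</b></a></p>
HTML

text, url = first_link(page)
puts "#{text} found at: #{url}" if text
```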