comp.lang.ruby

Re: [Q] specify start position of Regexp matching

Yukihiro Matsumoto

11/25/2007 11:24:00 PM

Hi,

In message "Re: [Q] specify start position of Regexp matching"
on Mon, 26 Nov 2007 00:20:25 +0900, makoto kuwata <kwa@kuwata-lab.com> writes:

|Is it possible to specify start position of Regexp matching?
|
| str = "foo bar baz"
| m = /ba/.match(str)
| p m.begin(0) #=> 4
| m = /ba/.match(str, 5) # is it possible?
| p m.begin(0) #=> 8 (if possible)

str.index(/ba/, 5) ?

matz.

8 Answers

Makoto Kuwata

11/25/2007 11:46:00 PM

Yukihiro Matsumoto <m...@ruby-lang.org> wrote:
>
> str.index(/ba/, 5) ?
>

No, String#index returns a Fixnum (the position), but I want a MatchData.

Regexp#match(string, start=0) in Ruby 1.9 is exactly the solution I want.
Is there any plan to implement it in Ruby 1.8?

--
makoto kuwata

Makoto Kuwata

11/25/2007 11:49:00 PM

makoto kuwata <k...@kuwata-lab.com> wrote:
> > str.index(/ba/, 5) ?
>
> No, String#index returns Fixnum (position), but I want MatchData.
>

I found that I can get the MatchData from Regexp.last_match()
after calling String#index().
I still think Regexp#match(string, start=0) is the natural way,
but String#index(regexp, start) is a good solution.

Thank you, Matz.

--
makoto kuwata
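
The approach described in the two posts above can be sketched as follows. This is an illustrative example, not code from the thread: String#index with a Regexp sets the last-match data as a side effect, so Regexp.last_match returns the MatchData afterwards. The final lines assume a Ruby that accepts the two-argument Regexp#match(str, pos) form, which the thread says is available in 1.9.

```ruby
str = "foo bar baz"

# String#index with a Regexp also sets the last-match data ($~).
pos = str.index(/ba/, 5)
m = Regexp.last_match

puts pos         # 8
puts m.begin(0)  # 8 (offsets are relative to the whole string)
puts m[0]        # ba

# Two-argument Regexp#match (assumes Ruby 1.9 or later).
m2 = /ba/.match(str, 5)
puts m2.begin(0) # 8
```

Note that Regexp.last_match has to be read right after the index call, because any later match in the same scope overwrites it.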

MonkeeSage

11/26/2007 5:18:00 AM

What's the difference between 1.9 Regexp#match(string, start=n) and
1.8 Regexp#match(string[n..-1])? You have to create a sub-string with
the 1.8 version, but according to Robert Klemme (above) it's just
creating a pointer into the original string if you're not changing the
substring or original string. Besides, even if you did get a copy,
it's anonymous and should be garbage collected soon. If I understand
everything correctly, the 1.9 version would just basically be a
convenience feature over the 1.8 way?

$ irb19
irb(main):001:0> RUBY_VERSION
=> "1.9.0"
irb(main):002:0> m = /oo/.match("foo", start=1)
=> #<MatchData "oo">
irb(main):003:0> m[0]
=> "oo"

$ irb
irb(main):001:0> RUBY_VERSION
=> "1.8.6"
irb(main):002:0> m = /oo/.match("foo"[1..-1])
=> #<MatchData:0xb78777a8>
irb(main):003:0> m[0]
=> "oo"

Regards,
Jordan
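
As a side note on the question above, one observable difference between the two forms is sketched below. It assumes a Ruby with the two-argument Regexp#match (1.9 or later) and is an illustration only: with a substring, the MatchData offsets and pre_match are relative to the substring, and \A anchors at the substring's start rather than at the start of the original string.

```ruby
str = "foo bar baz"

m_sub = /ba/.match(str[5..-1])  # match against the substring "ar baz"
m_pos = /ba/.match(str, 5)      # match against str, searching from index 5

puts m_sub.begin(0)            # 3
puts m_pos.begin(0)            # 8
puts m_sub.pre_match.inspect   # "ar "
puts m_pos.pre_match.inspect   # "foo bar "

# \A matches only at the very start of the string being searched.
a_sub = /\Aar/.match(str[5..-1])  # matches: the substring starts with "ar"
a_pos = /\Aar/.match(str, 5)      # nil: index 5 is not the start of str
puts a_sub.inspect
puts a_pos.inspect
```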

7stud --

11/26/2007 6:08:00 AM

Jordan Callicoat wrote:
> You have to create a sub-string with
> the 1.8 version, but according to Robert Klemme (above) it's just
> creating a pointer into the original string if you're not changing the
> substring or original string.

I'm having a hard time confirming that:

str = "hello"
sub_str = str[1, 2]

puts str.object_id
-->76750

puts sub_str.object_id
-->76740

puts sub_str.class
-->String
--
Posted via http://www.ruby-....

MonkeeSage

11/26/2007 6:45:00 AM

On Nov 26, 12:07 am, 7stud -- <bbxx789_0...@yahoo.com> wrote:

> Jordan Callicoat wrote:
> > You have to create a sub-string with
> > the 1.8 version, but according to Robert Klemme (above) it's just
> > creating a pointer into the original string if you're not changing the
> > substring or original string.
>
> I'm having a hard time confirming that:

I'm not sure how to confirm it, other than just looking at the source,
and since I'm very poor at C programming, it probably wouldn't help
for me to try that. I'm sure Robert can demonstrate. But I will say
that I'm not surprised that they have different object_ids, because they
are different objects. Copy-on-write is just a back-end
optimization where two objects that point to the same
data look like unique copies in the front-end, but no data is actually
moved in the back-end until it has to be (i.e., when one of the
objects is changed).

Regards,
Jordan

MonkeeSage

11/26/2007 7:07:00 AM

On Nov 26, 12:45 am, MonkeeSage <MonkeeS...@gmail.com> wrote:
> I'm not sure how to confirm it, other than just looking at the source,
> and since I'm very poor at C programming, it probably wouldn't help
> for me to try that.

Well, I did anyhow...

http://svn.ruby-lang.org/repos/ruby/branches/ruby_...
http://svn.ruby-lang.org/repos/ruby/branches/ruby_1_...

And I think the functions of interest are str_new3 and str_new4
(called from rb_str_substr). Specifically, the assignment of
RSTRING(str2)->aux.shared. But like I said, I'm not great with C, so I
could be mistaken.

Regards,
Jordan

MonkeeSage

11/26/2007 9:01:00 AM

Here's a test to show that my reading of the source, and Robert's
assertion, is correct (there is probably a better way to do this...):

#!/usr/bin/env ruby

# Disable GC to get a fair reading of the actual allocation cost.
GC.disable

# Free system memory in megabytes, read from the "Mem:" row of free(1).
def free_megs
  (`free`.split("\n")[1].split(' ')[3].to_i / 1024).to_s
end

puts "Free megabytes " + free_megs
# Make a one-megabyte string.
s1 = "a" * 1048576
# Make 100 substrings of it, keeping a reference to each one.
subs = (1..100).map { s1[0..-1] }

puts subs.last.length.to_s
puts "Free megabytes " + free_megs

Output:

Free megabytes 588
1048576
Free megabytes 587

Only one meg is used, which is the length of the original string. So,
by inductive inference, the substrings are only pointers back to the
original string rather than copies of the data.

Regards,
Jordan
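
A more portable way to probe the same thing, without shelling out to free, might look like the sketch below. It assumes the objspace standard library and its ObjectSpace.memsize_of method are available, which was not part of the original test. The reported sizes are implementation-specific hints, so no expected numbers are given here.

```ruby
require 'objspace'

# One-megabyte string and a full-length substring of it.
big = "a" * 1_048_576
sub = big[0..-1]

# memsize_of reports the memory attributed to each object by the interpreter.
# If the substring shares the original buffer, its reported size should be
# much smaller than the original's; treat the numbers as a hint only.
big_size = ObjectSpace.memsize_of(big)
sub_size = ObjectSpace.memsize_of(sub)

puts "original:  #{big_size} bytes"
puts "substring: #{sub_size} bytes"
```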

George

11/27/2007 5:08:00 AM

On Nov 26, 2007 5:07 PM, 7stud -- <bbxx789_05ss@yahoo.com> wrote:
> Jordan Callicoat wrote:
> > You have to create a sub-string with
> > the 1.8 version, but according to Robert Klemme (above) it's just
> > creating a pointer into the original string if you're not changing the
> > substring or original string.
>
> I'm having a hard time confirming that:
>
> str = "hello"
> sub_str = str[1, 2]
>
> puts str.object_id
> -->76750
>
> puts sub_str.object_id
> -->76740
>
> puts sub_str.class
> -->String

A new Ruby object is created, but the string buffer that it points to
is only copied on write.
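
What this means at the Ruby level can be sketched as below. The example only shows the visible semantics (two distinct objects, and a write to one never affects the other). Whether the buffer is actually shared underneath is an interpreter detail that these calls do not reveal.

```ruby
str = "hello, world! " * 4
sub = str[7, 40]
original = str.dup

# The substring is a separate Ruby object from the start.
puts sub.equal?(str)   # false

# Writing to the substring leaves the original untouched; with
# copy-on-write, this is the point where a private copy would be made.
sub << "?"
sub[0] = "W"
puts str == original   # true
```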