Asp Forum - subsequences - comp.lang.ruby

Eli Bendersky

4/17/2006 6:47:00 AM

Hello all,

How would you code, in the most Rubyish way possible: (1) given an
array, decide whether it is strictly increasing, and (2) given two
arrays, decide whether the first is a subsequence of the second.

Here are my trivial implementations:

def increasing_sequence?(seq)
seq.each_with_index do |x, i|
return false if i > 0 and seq[i - 1] >= x
end

return true
end

def is_subsequence_of?(seq, other_seq)
i = 0
seq.each do |x|
while other_seq[i] != x
i += 1
return false if (i >= other_seq.length)
end
end

return true
end

I'm wondering if (1) is there a more Rubyish way to implement this ?
and (2) is there a more efficient way to implement this in pure Ruby ?

Thanks

16 Answers

David Sletten

4/17/2006 9:34:00 AM

Eli Bendersky wrote:

> Hello all,
>
> How would you code, in the most Rubyish way possible: (1) given an
> array, decide whether it is strictly increasing, and (2) given two
> arrays, decide whether the first is a subsequence of the second.
>
> Here are my trivial implementations:
>
> def increasing_sequence?(seq)
> seq.each_with_index do |x, i|
> return false if i > 0 and seq[i - 1] >= x
> end
>
> return true
> end
>
>
> def is_subsequence_of?(seq, other_seq)
> i = 0
> seq.each do |x|
> while other_seq[i] != x
> i += 1
> return false if (i >= other_seq.length)
> end
> end
>
> return true
> end
>

This doesn't quite work:
irb(main):625:0> is_subsequence_of?([4, 4, 6], [1, 2, 3, 4, 5, 4, 5, 6,
7, 8])
=> true
Basically it optimistically just keeps checking whether any remaining
elements of the first sequence are interspersed among the other
sequence. I don't think this is what you intended.

> I'm wondering if (1) is there a more Rubyish way to implement this ?
> and (2) is there a more efficient way to implement this in pure Ruby ?

As far as is_subsequence_of? goes (I call mine search, which is its name
in Lisp) you need 2 loops, similar to what you have, but with a twist.
The outer loop allows you to backtrack in case of a failure after a
partial match. In Java or C you would numerically index both arrays, but
we don't have to do that in Ruby. However, it's more efficient to remove
that termination test that you have in your inner loop. Instead we can
specify the termination condition at the beginning. But this requires us
to keep track of our index manually in the outer loop. Also, it's
helpful to return the index of the match as a true value. So we get:
def search(a1, a2)
diff = a2.length - a1.length
if diff < 0 then return false
else
0.upto(diff) do |i|
catch :outer do
a1.each_with_index do |elt, j|
throw :outer unless elt == a2[i+j]
end
return i
end
end
return false
end
end

I came up with a more Rubyesque increasing_sequence? (called monotone?
since it works in either direction):
def monotone?(a, test=lambda {|x, y| x < y})
a.inject {|x, y| if test.call(x, y) then y else break end}
end

The default behavior checks for increasing sequences:
irb(main):675:0> monotone?([1, 2, 6])
=> 6
irb(main):676:0> monotone?([1, 2, 6, 4])
=> false

But we can check for decreasing sequences too:
irb(main):677:0> monotone?([1, 2, 6], lambda {|x, y| x > y})
=> false
irb(main):678:0> monotone?([3, 2, 1], lambda {|x, y| x > y})
=> 1

There's probably a simpler way to pass in Comparable.> above, but I
don't know what it is yet. :)

Aloha,
David Sletten

Eli Bendersky

4/17/2006 9:37:00 AM

> This doesn't quite work:
> irb(main):625:0> is_subsequence_of?([4, 4, 6], [1, 2, 3, 4, 5, 4, 5, 6,
> 7, 8])
> => true
> Basically it optimistically just keeps checking whether any remaining
> elements of the first sequence are interspersed among the other
> sequence. I don't think this is what you intended.

Perhaps I didn't explain myself clearly enough. Subsequence is not
necessarily consecutive, so [4, 4, 6] *is* indeed a subsequence of [1,
2, 3, >4, 5, >4, 5, >6, 7, 8])

I referred to non-consecutive subsequences

Eli

Robert Klemme

4/17/2006 9:39:00 AM

Robert Klemme

4/17/2006 10:22:00 AM

David Sletten wrote:
> I came up with a more Rubyesque increasing_sequence? (called monotone?
> since it works in either direction):
> def monotone?(a, test=lambda {|x, y| x < y})
> a.inject {|x, y| if test.call(x, y) then y else break end}
> end
>
> The default behavior checks for increasing sequences:
> irb(main):675:0> monotone?([1, 2, 6])
> => 6
> irb(main):676:0> monotone?([1, 2, 6, 4])
> => false
>
> But we can check for decreasing sequences too:
> irb(main):677:0> monotone?([1, 2, 6], lambda {|x, y| x > y})
> => false
> irb(main):678:0> monotone?([3, 2, 1], lambda {|x, y| x > y})
> => 1
>
> There's probably a simpler way to pass in Comparable.> above, but I
> don't know what it is yet. :)

If doing a general solution I'd work with a block instead of a lambda
because it's syntactically easier for the caller and feels more natural
in Ruby land. Also you can check more constraints than just monotony
with this so there should be a more general name (but which?).

module Enumerable
def pairs_comply?
inject {|a,b| yield a,b or return false; b}
true
end
end

# test whether the sequence alternates between
# increasing and decreasing
arr=[1,2,0,10,4,5,-20,-3]
up=false
puts arr.pairs_comply? {|a,b|
up = !up
up ? b>a : b<a
}

:-)

Kind regards

robert

Robert Klemme

4/17/2006 10:33:00 AM

Eli Bendersky wrote:
> def is_subsequence_of?(seq, other_seq)
> i = 0
> seq.each do |x|
> while other_seq[i] != x
> i += 1
> return false if (i >= other_seq.length)
> end
> end
>
> return true
> end
>
> I'm wondering if (1) is there a more Rubyish way to implement this ?
> and (2) is there a more efficient way to implement this in pure Ruby ?

I'd get rid of one of the loops:

module Enumerable
def includes_nested?(arr)
i = 0

each do |x|
if x == arr[i]
i += 1
return true if i == arr.size
end
end

false
end
end

Main advantage: only the small sequence must be indexable which allows
for efficient tests against large sequences (e.g. files).

Kind regards

robert

Eli Bendersky

4/17/2006 10:38:00 AM

Robert Klemme wrote:
> Eli Bendersky wrote:
> > Hello all,
> >
> > How would you code, in the most Rubyish way possible: (1) given an
> > array, decide whether it is strictly increasing, and (2) given two
> > arrays, decide whether the first is a subsequence of the second.
> >
> > Here are my trivial implementations:
> >
> > def increasing_sequence?(seq)
> > seq.each_with_index do |x, i|
> > return false if i > 0 and seq[i - 1] >= x
> > end
> >
> > return true
> > end
>
> This cries out for inject:
>
> module Enumerable
> def increasing?
> inject {|a,b| return false if a >= b; b}
> true
> end
> end

Nice ! The inject version is faster by almost 30% as well, but a simple
'loop' version was even faster.

require 'pp'
require 'benchmark'

def increasing_sequence?(seq)
seq.each_with_index do |x, i|
return false if i > 0 and seq[i - 1] >= x
end

return true
end

def increasing_sequence_inj?(seq)
seq.inject {|mem, x| return false if mem >= x; x}
true
end

def increasing_sequence_loop?(seq)
for i in 1 ... seq.length
return false if seq[i - 1] >= seq[i]
end
true
end

seq1 = [1, 2, 4, 8, 16, 32, 64]
seq2 = [1, 2, 4, 8, 16, 15, 21]

ntimes = 100_000
Benchmark.bmbm(10) do |x|
x.report("my") do
ntimes.times do
increasing_sequence?(seq1)
increasing_sequence?(seq2)
end
end

x.report("inj") do
ntimes.times do
increasing_sequence_inj?(seq1)
increasing_sequence_inj?(seq2)
end
end

x.report("loop") do
ntimes.times do
increasing_sequence_loop?(seq1)
increasing_sequence_loop?(seq2)
end
end
end

Rehearsal ---------------------------------------------
my 5.640000 0.078000 5.718000 ( 5.766000)
inj 4.063000 0.032000 4.095000 ( 4.406000)
loop 3.828000 0.047000 3.875000 ( 3.937000)
----------------------------------- total: 13.688000sec

user system total real
my 5.610000 0.046000 5.656000 ( 5.765000)
inj 3.985000 0.016000 4.001000 ( 4.062000)
loop 3.860000 0.047000 3.907000 ( 3.953000)

Is there anything faster than the loop ?

Eli

Robert Klemme

4/17/2006 11:30:00 AM

Eli Bendersky wrote:
> Robert Klemme wrote:
>> Eli Bendersky wrote:
>>> Hello all,
>>>
>>> How would you code, in the most Rubyish way possible: (1) given an
>>> array, decide whether it is strictly increasing, and (2) given two
>>> arrays, decide whether the first is a subsequence of the second.
>>>
>>> Here are my trivial implementations:
>>>
>>> def increasing_sequence?(seq)
>>> seq.each_with_index do |x, i|
>>> return false if i > 0 and seq[i - 1] >= x
>>> end
>>>
>>> return true
>>> end
>> This cries out for inject:
>>
>> module Enumerable
>> def increasing?
>> inject {|a,b| return false if a >= b; b}
>> true
>> end
>> end
>
> Nice ! The inject version is faster by almost 30% as well, but a simple
> 'loop' version was even faster.
>
> require 'pp'
> require 'benchmark'
>
> def increasing_sequence?(seq)
> seq.each_with_index do |x, i|
> return false if i > 0 and seq[i - 1] >= x
> end
>
> return true
> end
>
> def increasing_sequence_inj?(seq)
> seq.inject {|mem, x| return false if mem >= x; x}
> true
> end
>
> def increasing_sequence_loop?(seq)
> for i in 1 ... seq.length
> return false if seq[i - 1] >= seq[i]
> end
> true
> end
>
> seq1 = [1, 2, 4, 8, 16, 32, 64]
> seq2 = [1, 2, 4, 8, 16, 15, 21]
>
> ntimes = 100_000
> Benchmark.bmbm(10) do |x|
> x.report("my") do
> ntimes.times do
> increasing_sequence?(seq1)
> increasing_sequence?(seq2)
> end
> end
>
> x.report("inj") do
> ntimes.times do
> increasing_sequence_inj?(seq1)
> increasing_sequence_inj?(seq2)
> end
> end
>
> x.report("loop") do
> ntimes.times do
> increasing_sequence_loop?(seq1)
> increasing_sequence_loop?(seq2)
> end
> end
> end
>
>
> Rehearsal ---------------------------------------------
> my 5.640000 0.078000 5.718000 ( 5.766000)
> inj 4.063000 0.032000 4.095000 ( 4.406000)
> loop 3.828000 0.047000 3.875000 ( 3.937000)
> ----------------------------------- total: 13.688000sec
>
> user system total real
> my 5.610000 0.046000 5.656000 ( 5.765000)
> inj 3.985000 0.016000 4.001000 ( 4.062000)
> loop 3.860000 0.047000 3.907000 ( 3.953000)
>
>
> Is there anything faster than the loop ?

You can write a C extension. Seriously, I'd stick with the inject
variant because it's a) most elegant, b) is not much slower than the
loop and c) works for *arbitrary* Enumerables while both of your
versions work only for arrays.

Btw, how do benchmark results change if you do this?

seq1 = (1..1_000_000).to_a

IOW, is any of these algorithms better for short sequences vs. long
sequences?

Kind regards

robert

Daniel Schierbeck

4/17/2006 12:21:00 PM

Eli Bendersky wrote:
> def increasing_sequence?(seq)
> seq.each_with_index do |x, i|
> return false if i > 0 and seq[i - 1] >= x
> end
> return true
> end

module Enumerable
def increasing?
inject{|a, b| if a > b then return false else b end}
return true
end
end

[1, 2, 3].increasing? => true
[3, 2, 1].increasing? => false

Cheers,
Daniel

Eli Bendersky

4/17/2006 3:37:00 PM

> > Is there anything faster than the loop ?
>
> You can write a C extension. Seriously, I'd stick with the inject
> variant because it's a) most elegant, b) is not much slower than the
> loop and c) works for *arbitrary* Enumerables while both of your
> versions work only for arrays.

True, it is more elegant. But I'm not sure that (c) is a good reason,
since I don't think sequences will be represented by anything other
than arrays.

>
> Btw, how do benchmark results change if you do this?
>
> seq1 = (1..1_000_000).to_a
>

The results surprised me. Running with ntimes = 10 on the 1 mln long
string, I get:

Rehearsal ---------------------------------------------
my 45.297000 0.313000 45.610000 ( 47.015000)
inj 35.672000 0.406000 36.078000 ( 37.000000)
loop 16.625000 0.031000 16.656000 ( 16.922000)
----------------------------------- total: 98.344000sec

user system total real
my 45.172000 0.438000 45.610000 ( 46.485000)
inj 35.875000 0.266000 36.141000 ( 40.234000)
loop 16.672000 0.093000 16.765000 ( 17.406000)

A significant advantage to the loop ! What could make inject slow down
so much for longer arrays ?

Eli

Robert Klemme

4/17/2006 11:01:00 PM

Eli Bendersky wrote:
>>> Is there anything faster than the loop ?
>> You can write a C extension. Seriously, I'd stick with the inject
>> variant because it's a) most elegant, b) is not much slower than the
>> loop and c) works for *arbitrary* Enumerables while both of your
>> versions work only for arrays.
>
> True, it is more elegant. But I'm not sure that (c) is a good reason,
> since I don't think sequences will be represented by anything other
> than arrays.

You can still have the general implementation in module Enumerable and
have a specialized implementation in Array. *You* might use only
Arrays but the problem is not restricted to arrays.

>> Btw, how do benchmark results change if you do this?
>>
>> seq1 = (1..1_000_000).to_a
>>
>
> The results surprised me. Running with ntimes = 10 on the 1 mln long
> string, I get:
>
> Rehearsal ---------------------------------------------
> my 45.297000 0.313000 45.610000 ( 47.015000)
> inj 35.672000 0.406000 36.078000 ( 37.000000)
> loop 16.625000 0.031000 16.656000 ( 16.922000)
> ----------------------------------- total: 98.344000sec
>
> user system total real
> my 45.172000 0.438000 45.610000 ( 46.485000)
> inj 35.875000 0.266000 36.141000 ( 40.234000)
> loop 16.672000 0.093000 16.765000 ( 17.406000)
>
>
> A significant advantage to the loop ! What could make inject slow down
> so much for longer arrays ?

Block calls maybe. A block call is like a method invocation which does
not happen for the loop if I remember the code correctly.

Cheers

robert

comp.lang.ruby

subsequences

Eli Bendersky

David Sletten

Eli Bendersky

Robert Klemme

Robert Klemme

Robert Klemme

Eli Bendersky

Robert Klemme

Daniel Schierbeck

Eli Bendersky

Robert Klemme

x Login to ForumsZone