Asp Forum - Ordered contrast for String or Array

Trans

9/2/2006 5:07:00 PM

I have two strings: "aabc" and "aacc". I want to get an "ordered
contrast" to see the difference in their spellings. E.g.

"aabc".ordered_contrast( "aacc" ) => "  cc"

In the example a space represents a matching character, although I suppose
there may be a better alternative. In any case, one could also imagine
a non-ordered contrast:

"aabc".contrast( "aacc" ) => "  c "

And conversely one could ask for the intersect.

"aabc".intersect( "aacc" ) => "aa c"
"aabc".ordered_intersect( "aacc" ) => "aa  "

These could be extended to Array as well, and String could just use
split(//) with those.

So the question is: Is there an efficient way to calculate these?

Thanks,
T.

11 Answers

James Gray

9/2/2006 6:08:00 PM

On Sep 2, 2006, at 12:10 PM, Trans wrote:

> I have two strings: "aabc" and "aacc". I want to get an "ordered
> contrast" to see the difference in their spellings. E.g.
>
> "aabc".ordered_contrast( "aacc" ) => "  cc"

I assume that's an error. The last two letters match.

Also, how are you determining which letter to show for the contrast?

James Edward Gray II

William Crawford

9/2/2006 6:30:00 PM

Trans wrote:
> I have two strings: "aabc" and "aacc". I want to get an "ordered
> contrast" to see the difference in their spellings. E.g.
>
> "aabc".ordered_contrast( "aacc" ) => "  cc"
>
> In the example a space represents a matching character, although I suppose
> there may be a better alternative. In any case, one could also imagine
> a non-ordered contrast:
>
> "aabc".contrast( "aacc" ) => "  c "
>

From this, the 'ordered contrast' means that the first different letter,
and all following letters, are shown? And unordered means only the
different letters are shown?

Google refused to provide a definition or examples.

--
Posted via http://www.ruby-....

Trans

9/2/2006 7:12:00 PM

William Crawford wrote:
> Trans wrote:
> > I have two strings: "aabc" and "aacc". I want to get an "ordered
> > contrast" to see the difference in their spellings. E.g.
> >
> > "aabc".ordered_contrast( "aacc" ) => "  cc"
> >
> > In the example a space represents a matching character, although I suppose
> > there may be a better alternative. In any case, one could also imagine
> > a non-ordered contrast:
> >
> > "aabc".contrast( "aacc" ) => "  c "
> >
>
> From this, the 'ordered contrast' means that the first different letter,
> and all following letters, are shown? And unordered means only the
> different letters are shown?

That's right.

> Google refused to provide a definition or examples.

Yes, they aren't technical terms, just me trying my best to describe
them.

T.
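
For what it's worth, the "first different letter and everything after it" reading can be sketched by finding the index where the two strings first diverge. This is only a rough illustration that assumes both strings have the same length; the method names first_divergence and ordered_contrast_sketch are made up for this example and are not from the thread.

```ruby
# Hypothetical helper: index of the first position where a and b differ,
# or nil when no position differs. Assumes a and b have equal length.
def first_divergence(a, b)
  a.chars.zip(b.chars).index { |x, y| x != y }
end

# Hypothetical sketch: blank out everything before the first divergence
# and show the rest of b unchanged.
def ordered_contrast_sketch(a, b, blank = " ")
  i = first_divergence(a, b)
  return blank * b.length if i.nil?   # identical strings: nothing to show
  blank * i + b[i..-1]
end

p ordered_contrast_sketch("aabc", "aacc")   # "  cc"
p ordered_contrast_sketch("aabc", "aabc")   # "    "
```

The unordered form cannot use this shortcut, since it has to compare every position independently.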

Trans

9/2/2006 7:22:00 PM

James Edward Gray II wrote:
> On Sep 2, 2006, at 12:10 PM, Trans wrote:
>
> > I have two strings: "aabc" and "aacc". I want to get an "ordered
> > contrast" to see the difference in their spellings. E.g.
> >
> > "aabc".ordered_contrast( "aacc" ) => "  cc"
>
> I assume that's an error. The last two letters match.

It's correct. By "ordered" I mean by sort order (i.e. alphabetic), so
"aabc" and "aacc" diverge at the third letter. Regular, unordered
"contrast" doesn't care about that and would blank the last letter too.

> Also, how are you determining which letter to show for the contrast?

If they are != or == and in the same position. Hmmm.. maybe the Array
form would be a better example:

[ "a", "a", "b", "c" ].contrast => [ nil, nil, "b", nil ]
[ "a", "a", "b", "c" ].ordered_contrast => [ nil, nil, "b","c" ]

[ "a", "a", "b", "c" ].intersect => [ "a", "a", nil, "c" ]
[ "a", "a", "b", "c" ].ordered_intersect => [ "a", "a", nil, nil ]

Also, maybe the terms 'negative' and 'positive' would have been better
than 'contrast' and 'intersect'.

T.

Trans

9/2/2006 7:34:00 PM

Trans wrote:

> [ "a", "a", "b", "c" ].contrast => [ nil, nil, "b", nil ]
> [ "a", "a", "b", "c" ].ordered_contrast => [ nil, nil, "b","c" ]
>
> [ "a", "a", "b", "c" ].intersect => [ "a", "a", nil, "c" ]
> [ "a", "a", "b", "c" ].ordered_intersect => [ "a", "a", nil, nil ]

Oops... I forgot the comparison array and screwed it up. Let me try
that again:

[ "a", "a", "b", "c" ].contrast([ "a", "a", "c", "c" ]) => [ nil, nil, "c", nil ]
[ "a", "a", "b", "c" ].ordered_contrast([ "a", "a", "c", "c" ]) => [ nil, nil, "c", "c" ]

In these, the difference being shown is the parameter's. If we
swap the receiver and the parameter:

[ "a", "a", "c", "c" ].contrast([ "a", "a", "b", "c" ]) => [ nil, nil, "b", nil ]
[ "a", "a", "c", "c" ].ordered_contrast([ "a", "a", "b", "c" ]) => [ nil, nil, "b", "c" ]

And of course the inverse:

[ "a", "a", "b", "c" ].intersect([ "a", "a", "c", "c" ]) => [ "a", "a", nil, "c" ]
[ "a", "a", "b", "c" ].ordered_intersect([ "a", "a", "c", "c" ]) => [ "a", "a", nil, nil ]

Also interesting:

[ "a", "a", "b", "c" ].contrast([ "a", "a", "c", "c" ]) => [ nil, nil, "c", nil ]
[ nil, nil, "c", nil ].contrast([ "a", "a", "c", "c" ]) => [ "a", "a", nil, "c" ]

which is the intersection.

T.
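
The "contrast applied twice gives the intersection" observation can be checked with a small sketch. The two top-level helpers below are hypothetical stand-ins written only for this illustration (they take both arrays as arguments instead of extending Array), and the identity assumes the comparison array contains no nil elements.

```ruby
# Hypothetical helper: keep b's element where it differs from a's,
# nil where the two agree at that position.
def contrast(a, b)
  a.zip(b).map { |x, y| x == y ? nil : y }
end

# Hypothetical helper: keep the element where both agree, nil otherwise.
def intersect(a, b)
  a.zip(b).map { |x, y| x == y ? y : nil }
end

left  = %w[a a b c]
right = %w[a a c c]

diff = contrast(left, right)
p diff                                              # [nil, nil, "c", nil]
p contrast(diff, right)                             # ["a", "a", nil, "c"]
p contrast(diff, right) == intersect(left, right)   # true
```

Where the originals agreed, the first pass leaves nil, which then differs from the parameter and lets it through; where they differed, the first pass copies the parameter, which then matches itself and is blanked.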

Rick DeNatale

9/3/2006 5:37:00 PM

Here's one way to do it. There are probably better ways.

rick@frodo:~/rubyscripts$ cat enum_contrast.rb
module Enumerable

  def contrast(enum, eql_val=nil)
    result = []
    self.zip(enum) { |a, b| result << (a.eql?(b) ? eql_val : b) }
    result
  end

  def ordered_contrast(enum, eql_val=nil)
    result = []
    diff = false
    self.zip(enum) do |a, b|
      diff = diff || !a.eql?(b)
      result << (diff ? b : eql_val)
    end
    result
  end

  def intersect(enum, diff_val=nil)
    result = []
    self.zip(enum) { |a, b| result << (a.eql?(b) ? b : diff_val) }
    result
  end

  def ordered_intersect(enum, diff_val=nil)
    result = []
    diff = false
    self.zip(enum) do |a, b|
      diff = diff || !a.eql?(b)
      result << (diff ? diff_val : b)
    end
    result
  end
end

class String

  def to_chars_array
    unpack('a'*length)
  end

  def contrast(str)
    to_chars_array.contrast(str.to_chars_array, ' ').join
  end

  def ordered_contrast(str)
    to_chars_array.ordered_contrast(str.to_chars_array, ' ').join
  end

  def intersect(str)
    to_chars_array.intersect(str.to_chars_array, ' ').join
  end

  def ordered_intersect(str)
    to_chars_array.ordered_intersect(str.to_chars_array, ' ').join
  end
end

rick@frodo:~/rubyscripts$ cat test_enum_contrast.rb
require 'enum_contrast.rb'
require 'test/unit'

class TestSubranges < Test::Unit::TestCase

  def test_array_contrast
    assert_equal([ nil, nil, "c", nil ],
                 [ "a", "a", "b", "c" ].contrast([ "a", "a", "c", "c" ]))

    assert_equal([ nil, nil, "b", nil ],
                 [ "a", "a", "c", "c" ].contrast([ "a", "a", "b", "c" ]))

    assert_equal([ nil, nil, "c", nil ],
                 [ "a", "a", "b", "c" ].contrast([ "a", "a", "c", "c" ]))

    assert_equal([ "a", "a", nil, "c" ],
                 [ nil, nil, "c", nil ].contrast([ "a", "a", "c", "c" ]))
  end

  def test_array_ordered_contrast
    assert_equal([ nil, nil, "c", "c" ],
                 [ "a", "a", "b", "c" ].ordered_contrast([ "a", "a", "c", "c" ]))

    assert_equal([ nil, nil, "b", "c" ],
                 [ "a", "a", "c", "c" ].ordered_contrast([ "a", "a", "b", "c" ]))
  end

  def test_array_intersect
    assert_equal([ "a", "a", nil, "c" ],
                 [ "a", "a", "b", "c" ].intersect([ "a", "a", "c", "c" ]))
  end

  def test_array_ordered_intersect
    assert_equal([ "a", "a", nil, nil ],
                 [ "a", "a", "b", "c" ].ordered_intersect([ "a", "a", "c", "c" ]))
  end

  def test_string_ordered_contrast
    assert_equal("  cc",
                 "aabc".ordered_contrast("aacc"))
  end

  def test_string_contrast
    assert_equal("  c ",
                 "aabc".contrast("aacc"))
  end

  def test_string_intersect
    assert_equal("aa c",
                 "aabc".intersect("aacc"))
  end

  def test_string_ordered_intersect
    assert_equal("aa  ",
                 "aabc".ordered_intersect("aacc"))
  end
end
rick@frodo:~/rubyscripts$ ruby test_enum_contrast.rb
Loaded suite test_enum_contrast
Started
........
Finished in 0.010052 seconds.

8 tests, 12 assertions, 0 failures, 0 errors
rick@frodo:~/rubyscripts$
--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denh...

Trans

9/3/2006 9:04:00 PM

Rick DeNatale wrote:
> Here's one way to do it. There are probably better ways.

Nice! Good use of zip, and using #unpack for the String rendition makes
a lot of sense and is probably the fastest way. Thanks for these good
solutions. Beats the heck out of what I had.

I'm going to add these to Facets, albeit I'm going to give some thought
to possibly better names. You'll get the credit of course and be added to
the list of Authors/Contributors if that's okay with you.

Thanks!
T.

P.S. Your blog seems to be down. Something about:
SQLite3::CantOpenException in ArticlesController#index

Rick DeNatale

9/4/2006 2:29:00 PM

On 9/4/06, Trans <transfire@gmail.com> wrote:
>
> Rick DeNatale wrote:
> > Here's one way to do it. There are probably better ways.
>
> Nice! Good use of zip, and using #unpack for the String rendition makes
> a lot of sense and is probably the fastest way. Thanks for these good
> solutions. Beats the heck out of what I had.

Well, I was surprised when I couldn't seem to find a standard String
method for splitting a String into an array of single character
strings.

Thinking about it again, another way is

string.scan /./

I haven't benchmarked the two though so I don't know which is faster.
I guess that's a task for the to-do list.
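
As a side note for readers on current Ruby versions, here is a small sketch comparing a few ways to split a string into single-character strings. String#chars is a standard method in current Ruby and is included on the assumption that a modern interpreter is in use; the sample strings are made up for illustration.

```ruby
str = "aabc"

p str.chars                        # ["a", "a", "b", "c"]
p str.scan(/./)                    # ["a", "a", "b", "c"]
p str.split(//)                    # ["a", "a", "b", "c"]
p str.unpack('a' * str.length)     # ["a", "a", "b", "c"]

# The approaches are not interchangeable for every input:
# /./ does not match a newline, so scan silently drops it.
multi = "a\nb"
p multi.chars                      # ["a", "\n", "b"]
p multi.scan(/./)                  # ["a", "b"]
```

Note also that the 'a' directive of unpack works on bytes, so it only lines up with characters for single-byte text.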

>
> I'm going to add these to Facets, albeit I'm going to give some thought
> to possibly better names. You'll get the credit of course and be added to
> the list of Authors/Contributors if that's okay with you.

That's fine. I don't have any advice on the names, except that
intersect is definitely a bad name, since it sounds too much like a
set operation which would have slightly different semantics.
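
To make the naming concern concrete, here is a short sketch contrasting Ruby's built-in set-style intersection (Array#&) with the position-by-position comparison discussed in this thread. The positional line is an inline illustration, not the Enumerable#intersect method posted earlier.

```ruby
a = %w[a a b c]
b = %w[a a c c]

# Set-style intersection: unique elements present in both arrays,
# regardless of where they appear.
p a & b                                        # ["a", "c"]

# Positional comparison: keep an element only where both arrays
# hold the same value at the same index.
p a.zip(b).map { |x, y| x == y ? x : nil }     # ["a", "a", nil, "c"]
```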

> Thanks!
> T.
>
> P.S. Your blog seems to be down. Something about:
> SQLite3::CantOpenException in ArticlesController#index

Thanks for pointing that out. This is the second time in a week that
Typo has gotten me. Although this time it was just a matter of
restarting it. Maybe it's time to consider migrating to Mephisto!?!

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denh...

Rick DeNatale

9/4/2006 2:43:00 PM

On 9/4/06, Rick DeNatale <rick.denatale@gmail.com> wrote:

> Well, I was surprised when I couldn't seem to find a standard String
> method for splitting a String into an array of single character
> strings.
>
> Thinking about it again, another way is
>
> string.scan /./
>
> I haven't benchmarked the two though so I don't know which is faster.
> I guess that's a task for the to-do list.

It looks like unpack is a clear winner:
rick@frodo:~/rubyscripts$ cat benchstringsplit.rb
require 'benchmark'
include Benchmark

class String
  def to_chars_array_with_unpack
    unpack('a'*length)
  end

  def to_chars_array_with_scan
    scan /./
  end
end

iterations = 100
str = "abcdefghijklmnopqrstuvwxyz" * 5
bmbm do |x|
  5.times do
    x.report("unpack #{str.length} character string") do
      iterations.times do
        str.to_chars_array_with_unpack
      end
    end

    x.report("scan #{str.length} character string") do
      iterations.times do
        str.to_chars_array_with_scan
      end
    end
    str += str
  end
end

rick@frodo:~/rubyscripts$ ruby benchstringsplit.rb
Rehearsal ----------------------------------------------------------------
unpack 130 character string 1.130000 0.010000 1.140000 ( 1.139545)
scan 130 character string 2.590000 0.000000 2.590000 ( 3.678513)
unpack 260 character string 1.100000 0.000000 1.100000 ( 1.672301)
scan 260 character string 2.170000 0.000000 2.170000 ( 2.850112)
unpack 520 character string 1.000000 0.000000 1.000000 ( 1.808740)
scan 520 character string 2.230000 0.010000 2.240000 ( 3.562964)
unpack 1040 character string 1.080000 0.000000 1.080000 ( 2.122081)
scan 1040 character string 2.260000 0.000000 2.260000 ( 3.962422)
unpack 2080 character string 1.020000 0.000000 1.020000 ( 1.518132)
scan 2080 character string 2.120000 0.000000 2.120000 ( 3.091133)
------------------------------------------------------ total: 16.720000sec

user system total real
unpack 130 character string 0.920000 0.000000 0.920000 ( 1.348387)
scan 130 character string 2.200000 0.000000 2.200000 ( 5.322029)
unpack 260 character string 0.990000 0.000000 0.990000 ( 1.363863)
scan 260 character string 2.210000 0.020000 2.230000 ( 3.426876)
unpack 520 character string 1.010000 0.000000 1.010000 ( 1.761541)
scan 520 character string 2.140000 0.000000 2.140000 ( 2.251753)
unpack 1040 character string 1.010000 0.000000 1.010000 ( 1.136075)
scan 1040 character string 2.220000 0.000000 2.220000 ( 2.373706)
unpack 2080 character string 1.000000 0.000000 1.000000 ( 1.083459)
scan 2080 character string 2.130000 0.000000 2.130000 ( 2.199584)

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denh...

Rick DeNatale

9/4/2006 2:53:00 PM

On 9/4/06, Rick DeNatale <rick.denatale@gmail.com> wrote:

> It looks like unpack is a clear winner:

On second thought, it looks like the problem with scan was the
construction of the regex. Using scan without pre-compiling the regex
is about twice as slow as unpack, but using scan with a pre-compiled
regexp looks like it's about 100 times faster!

rick@frodo:~/rubyscripts$ cat benchstringsplit.rb
require 'benchmark'
include Benchmark

class String
  To_chars_regex = Regexp.new('/./')

  def to_chars_array_with_unpack
    unpack('a'*length)
  end

  def to_chars_array_with_scan
    scan /./
  end

  def to_chars_array_with_scan_precomp
    scan To_chars_regex
  end
end

iterations = 100
str = "abcdefghijklmnopqrstuvwxyz" * 5
bmbm do |x|
  5.times do
    x.report("unpack #{str.length} character string") do
      iterations.times do
        str.to_chars_array_with_unpack
      end
    end

    x.report("scan #{str.length} character string") do
      iterations.times do
        str.to_chars_array_with_scan
      end
    end

    x.report("scan-precomp #{str.length} character string") do
      iterations.times do
        str.to_chars_array_with_scan_precomp
      end
    end
    str += str
  end
end

rick@frodo:~/rubyscripts$ ruby benchstringsplit.rb
Rehearsal ----------------------------------------------------------------------
unpack 130 character string 0.960000 0.010000 0.970000 ( 0.984373)
scan 130 character string 2.150000 0.000000 2.150000 ( 2.178162)
scan-precomp 130 character string 0.010000 0.000000 0.010000 ( 0.012862)
unpack 260 character string 0.910000 0.000000 0.910000 ( 0.910658)
scan 260 character string 2.040000 0.000000 2.040000 ( 2.100734)
scan-precomp 260 character string 0.010000 0.000000 0.010000 ( 0.010890)
unpack 520 character string 0.940000 0.000000 0.940000 ( 0.942446)
scan 520 character string 1.990000 0.000000 1.990000 ( 2.020499)
scan-precomp 520 character string 0.010000 0.000000 0.010000 ( 0.010869)
unpack 1040 character string 0.980000 0.010000 0.990000 ( 0.995709)
scan 1040 character string 2.140000 0.000000 2.140000 ( 2.160120)
scan-precomp 1040 character string 0.010000 0.000000 0.010000 ( 0.013315)
unpack 2080 character string 1.130000 0.000000 1.130000 ( 1.214512)
scan 2080 character string 2.110000 0.000000 2.110000 ( 2.132072)
scan-precomp 2080 character string 0.010000 0.000000 0.010000 ( 0.011119)
------------------------------------------------------------ total: 15.420000sec

user system total real
unpack 130 character string 1.270000 0.000000 1.270000 ( 1.338689)
scan 130 character string 2.530000 0.000000 2.530000 ( 2.710398)
scan-precomp 130 character string 0.010000 0.000000 0.010000 ( 0.011328)
unpack 260 character string 1.350000 0.000000 1.350000 ( 1.445329)
scan 260 character string 2.420000 0.000000 2.420000 ( 2.532545)
scan-precomp 260 character string 0.000000 0.000000 0.000000 ( 0.010712)
unpack 520 character string 1.080000 0.010000 1.090000 ( 1.086219)
scan 520 character string 2.120000 0.000000 2.120000 ( 2.128990)
scan-precomp 520 character string 0.010000 0.000000 0.010000 ( 0.010815)
unpack 1040 character string 1.080000 0.000000 1.080000 ( 1.078558)
scan 1040 character string 2.120000 0.000000 2.120000 ( 2.129707)
scan-precomp 1040 character string 0.010000 0.000000 0.010000 ( 0.010945)
unpack 2080 character string 1.210000 0.000000 1.210000 ( 1.267488)
scan 2080 character string 2.460000 0.000000 2.460000 ( 2.627165)
scan-precomp 2080 character string 0.010000 0.000000 0.010000 ( 0.012961)
--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denh...
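
One caution when reading the scan-precomp numbers above: Regexp.new takes the pattern source as a plain string, so Regexp.new('/./') treats the slashes as literal characters rather than delimiters. The sketch below checks what that pattern matches on the benchmark's input. The variable names are made up for illustration, and nothing here re-measures timings.

```ruby
literal  = /./                  # any single character except newline
mistaken = Regexp.new('/./')    # a slash, any character, then a slash
fixed    = Regexp.new('.')      # same pattern as the literal

str = "abcdefghijklmnopqrstuvwxyz" * 5

p str.scan(mistaken).length                # 0, the string contains no slashes
p str.scan(fixed).length                   # 130
p str.scan(fixed) == str.scan(literal)     # true
p "a/b/c".scan(mistaken)                   # ["/b/"]
```

Since the mistaken pattern finds nothing in a string of letters, the precompiled variant returns an empty array, which would explain a large part of the apparent speedup. A fair comparison would presumably need the constant defined as /./ or Regexp.new('.') and the benchmark run again.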

comp.lang.ruby
