
comp.lang.ruby

Hash#collate

Phrogz

12/19/2007 12:02:00 AM

I wanted a method like Hash#update, but that preserved the values from
both the original and argument Hash. A little searching failed to find
it. (I did find that someone somewhere wrote a Hash#collate that's in
my ri docs, but who knows where it came from. Its description appears
not to do at all what I wanted, anyhow.)

So, I wrote my own. Comments welcome. Efficiency patches particularly
welcome. Under a different name, perhaps Trans might consider it for
inclusion in Facets.

class Hash
  # Merge the values of this hash with those from another, setting all values
  # to be arrays representing the values from both hashes.
  #   { :a=>1, :b=>2 }.collate :a=>3, :b=>4, :c=>5
  #   #=> { :a=>[1,3], :b=>[2,4], :c=>[5] }
  #
  # The 'uniq' option allows you to ensure all values are unique:
  #   { :a=>1, :b=>2 }.collate( { :a=>1, :b=>3 }, :uniq=>true )
  #   #=> { :a=>[1], :b=>[2,3] }
  #
  # By default, array values in either side are merged:
  #   foo = { :a=>[1,2], :b=>[3] }
  #   bar = { :a=>[4,5], :c=>[6,7] }
  #   foo.collate( bar )
  #   #=> { :a=>[1,2,4,5], :b=>[3], :c=>[6,7] }
  #
  # Use the 'preserve_arrays' option to prevent them from being merged:
  #   foo = { :a=>[1,2], :b=>[3] }
  #   bar = { :a=>[4,5], :c=>[6,7] }
  #   foo.collate( bar, :preserve_arrays=>true )
  #   #=> { :a=>[[1,2],[4,5]], :b=>[[3]], :c=>[[6,7]] }
  #
  # Note that, as shown above, preserving arrays will cause array values
  # to be wrapped up in another array.
  def collate( other_hash, options={} )
    dup.collate!( other_hash, options )
  end

  # The same as #collate, but modifies the receiver in place.
  def collate!( other_hash, options={} )
    # Prepare, ensuring every existing value is already an Array
    each{ |key, value|
      if value.is_a?( Array ) && !options[ :preserve_arrays ]
        # Copy the array: #dup in #collate is shallow, so #concat below
        # would otherwise mutate an array shared with the original hash
        self[key] = value.dup
      else
        self[key] = [ value ]
      end
    }

    # Collate with values from other_hash
    other_hash.each{ |key, value|
      if self[ key ]
        if value.is_a?( Array ) && !options[ :preserve_arrays ]
          self[ key ].concat( value )
        else
          self[ key ] << value
        end
      elsif value.is_a?( Array ) && !options[ :preserve_arrays ]
        # Copy so that later collations never mutate other_hash's array
        self[ key ] = value.dup
      else
        self[ key ] = [ value ]
      end
    }

    each{ |key, value| value.uniq! } if options[ :uniq ]

    self
  end
end

if __FILE__ == $0
  require 'test/unit'
  class TestHashCollation < Test::Unit::TestCase
    def setup
      $a = { :a=>1, :b=>2, :z=>26, :all=>%w|a b z|, :stuff1=>%w|foo bar|, :whee=>%w|a b| }
      $b = { :a=>1, :b=>4, :c=>9, :all=>%w|a b c|, :stuff2=>%w|jim jam|, :whee=>%w|a b| }
      $c = { :a=>1, :b=>8, :c=>27 }
    end
    def test1_defaults
      collated = $a.collate( $b )
      assert_equal( 8, collated.keys.length, "There are 8 unique keys" )
      assert_equal( [1,1], collated[ :a ] )
      assert_equal( [2,4], collated[ :b ] )
      assert_equal( [9], collated[ :c ] )
      assert_equal( [26], collated[ :z ] )
      assert_equal( %w|a b z a b c|, collated[ :all ], "Arrays are merged by default." )
      assert_equal( %w|foo bar|, collated[ :stuff1 ] )
      assert_equal( %w|jim jam|, collated[ :stuff2 ] )
      assert_equal( %w|a b a b|, collated[ :whee ] )
    end
    def test2_uniq
      collated = $a.collate( $b, :uniq=>true )
      assert_equal( 8, collated.keys.length, "There are 8 unique keys" )
      assert_equal( [1], collated[ :a ] )
      assert_equal( [2,4], collated[ :b ] )
      assert_equal( [9], collated[ :c ] )
      assert_equal( [26], collated[ :z ] )
      assert_equal( %w|a b z c|, collated[ :all ], "Arrays are merged by default." )
      assert_equal( %w|foo bar|, collated[ :stuff1 ] )
      assert_equal( %w|jim jam|, collated[ :stuff2 ] )
      assert_equal( %w|a b|, collated[ :whee ] )
    end
    def test3_preserve_arrays
      collated = $a.collate( $b, :preserve_arrays=>true )
      assert_equal( 8, collated.keys.length, "There are 8 unique keys" )
      assert_equal( [1,1], collated[ :a ] )
      assert_equal( [2,4], collated[ :b ] )
      assert_equal( [9], collated[ :c ] )
      assert_equal( [26], collated[ :z ] )
      assert_equal( [ %w|a b z|, %w|a b c|], collated[ :all ], "Two arrays are not merged." )
      assert_equal( [%w|foo bar|], collated[ :stuff1 ],
                    "Arrays unique to one side are wrapped" )
      assert_equal( [%w|jim jam|], collated[ :stuff2 ],
                    "Arrays unique to one side are wrapped" )
      assert_equal( [%w|a b|, %w|a b|], collated[ :whee ] )
    end
    def test4_preserve_and_uniq
      collated = $a.collate( $b, :preserve_arrays=>true, :uniq=>true )
      assert_equal( 8, collated.keys.length, "There are 8 unique keys" )
      assert_equal( [1], collated[ :a ] )
      assert_equal( [2,4], collated[ :b ] )
      assert_equal( [9], collated[ :c ] )
      assert_equal( [26], collated[ :z ] )
      assert_equal( [ %w|a b z|, %w|a b c|], collated[ :all ], "Two arrays are not merged." )
      assert_equal( [%w|foo bar|], collated[ :stuff1 ],
                    "Arrays unique to one side are wrapped" )
      assert_equal( [%w|jim jam|], collated[ :stuff2 ],
                    "Arrays unique to one side are wrapped" )
      assert_equal( [%w|a b|], collated[ :whee ],
                    "Preserve arrays + uniq == duplicate arrays are removed" )
    end
    def test5_multi_collate
      collated = $a.collate( $b ).collate( $c )
      assert_equal( [1,1,1], collated[ :a ] )
      assert_equal( [2,4,8], collated[ :b ] )
      assert_equal( [9,27], collated[ :c ] )
    end
    def test6_multi_collate_with_preserve
      collated = $a.collate( $b, :preserve_arrays=>1 ).collate( $c )
      assert_equal( [1,1,1], collated[ :a ] )
      assert_equal( [2,4,8], collated[ :b ] )
      assert_equal( [9,27], collated[ :c ] )

      collated = $a.collate( $b ).collate( $c, :preserve_arrays=>1 )
      assert_equal( [[1,1],1], collated[ :a ] )
      assert_equal( [[2,4],8], collated[ :b ] )
      assert_equal( [[9],27], collated[ :c ] )

      collated = $a.collate( $b, :preserve_arrays=>1 ).collate( $c, :preserve_arrays=>1 )
      assert_equal( [[1,1],1], collated[ :a ] )
      assert_equal( [[2,4],8], collated[ :b ] )
      assert_equal( [[9],27], collated[ :c ] )
    end
  end
end

8 Answers

Phrogz

12/19/2007 12:05:00 AM

On Dec 18, 5:01 pm, Phrogz <phr...@mac.com> wrote:
> I wanted a method like Hash#update, but that preserved the values from
> both the original and argument Hash. A little searching failed to find
> it. (I did find that someone somewhere wrote a Hash#collate that's in
> my ri docs, but who knows where it came from. Its description appears
> not to do at all what I wanted, anyhow.)
>
> So, I wrote my own. Comments welcome. Efficiency patches particularly
> welcome. Under a different name, perhaps Trans might consider it for
> inclusion in Facets.

<snip stupidly-wrapped code>

Please find properly-formatted code @ http://pastie.caboo...
Sorry for the extra noise.

Joel VanderWerf

12/19/2007 12:17:00 AM

Phrogz wrote:
...
> # { :a=>1, :b=>2 }.collate :a=>3, :b=>4, :c=>5
> # #=> { :a=>[1,3], :b=>[2,4], :c=>[5] }

Do these two give the same result? Does it matter?

{ :a=>1, :b=>2 }.collate :a=>3, :b=>4, :c=>5
{ :a=>1, :b=>2, :c=>5 }.collate :a=>3, :b=>4

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Phrogz

12/19/2007 1:03:00 AM

On Dec 18, 5:16 pm, Joel VanderWerf <vj...@path.berkeley.edu> wrote:
> Phrogz wrote:
> > # { :a=>1, :b=>2 }.collate :a=>3, :b=>4, :c=>5
> > # #=> { :a=>[1,3], :b=>[2,4], :c=>[5] }
>
> Do these two give the same result? Does it matter?
>
> { :a=>1, :b=>2 }.collate :a=>3, :b=>4, :c=>5
> { :a=>1, :b=>2, :c=>5 }.collate :a=>3, :b=>4

They don't. In my particular use case today, I only used the result as
a set, so a proper Set might have been more appropriate. But I don't
know; I think that preserving the order is probably useful, at least
when not using the #uniq option. (I'm thinking perhaps of a case where
you're specifying a series of fallback results for a variety of
options.)

Totally up for grabs, though, if there's a faster, more elegant
solution that doesn't preserve the order.

Trans

12/19/2007 1:29:00 AM


On Dec 18, 7:05 pm, Phrogz <phr...@mac.com> wrote:
> I wanted a method like Hash#update, but that preserved the values from
> both the original and argument Hash. A little searching failed to find
> it. (I did find that someone somewhere wrote a Hash#collate that's in
> my ri docs, but who knows where it came from. Its description appears
> not to do at all what I wanted, anyhow.)

That's from Facets, probably. But the latest version of Facets renamed
it to #mash, for "map hash", which is more descriptive of what it
does. (#collate remains an alias for the time being).

I like your definition --actually I'm surprised I haven't worked this
functionality into Facets yet. I guess I thought #weave took care of
it, but that's slightly different b/c it only combines arrays if the
value is already an array. So I'm going to add this to Facets. A
couple thoughts though...

The options don't feel quite right. Maybe it would be more versatile to
define #uniq on Hash? So then

{ :a=>1, :b=>2 }.collate( { :a=>1, :b=>3 } ).uniq
#=> { :a=>[1], :b=>[2,3] }
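
A minimal sketch of that post-processing step, written as a standalone helper rather than a patch to Hash. The name `uniq_values` is hypothetical (it is not a Facets method), and the sketch assumes every value is already an Array, as it is after a collate, and that `Hash#transform_values` is available (Ruby 2.4 or later):

```ruby
# Hypothetical helper: de-duplicate each array value of an already-collated
# hash. Returns a new hash; the argument is left untouched.
def uniq_values(hash)
  hash.transform_values { |values| values.uniq }
end

collated = { :a => [1, 1], :b => [2, 3] }
uniq_values(collated) #=> { :a=>[1], :b=>[2,3] }
```

Keeping this separate from the collation itself means the de-duplication can be applied once, after any number of collations.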

As for preserving the arrays, I'm not sure. Is that really all that
useful? Well, if it is it seems like a better definition for Hash#zip.

T.

Phrogz

12/19/2007 2:49:00 AM

On Dec 18, 6:29 pm, Trans <transf...@gmail.com> wrote:
> On Dec 18, 7:05 pm, Phrogz <phr...@mac.com> wrote:
>
> > I wanted a method like Hash#update, but that preserved the values from
> > both the original and argument Hash. A little searching failed to find
> > it. (I did find that someone somewhere wrote a Hash#collate that's in
> > my ri docs, but who knows where it came from. Its description appears
> > not to do at all what I wanted, anyhow.)
>
> That's from Facets, probably. But the latest version of Facets renamed
> it to #mash, for "map hash", which is more descriptive of what it
> does. (#collate remains an alias for the time being).
>
> I like your definition --actually I'm surprised I haven't worked this
> functionality into Facets yet. I guess I thought #weave took care of
> it, but that's slightly different b/c it only combines arrays if the
> value is already an array. So I'm going to add this to Facets. A
> couple thoughts though...
>
> The options don't feel quite right. Maybe it would be more versatile to
> define #uniq on Hash? So then
>
> { :a=>1, :b=>2 }.collate( { :a=>1, :b=>3 } ).uniq
> #=> { :a=>[1], :b=>[2,3] }

That's an excellent point. I needed this functionality today and so I
included it in the script; however, since it's a simple one-line (as
seen in the implementation) post-process step, perhaps it's
appropriate to keep it out of this method.


> As for preserving the arrays, I'm not sure. Is that really all that
> useful? Well, if it is it seems like a better definition for Hash#zip.

The reason I made the arrays not be preserved by default is to enable
chained collation of 3 or more hashes. (test5_multi_collate in the unit
tests.) I was actually collating hundreds today. However, I put in the
'preserve arrays' because it seemed almost arbitrary to treat them
differently from every other type of value. I don't personally have a
use case that needs it now, but I know from experience (like #flatten
versus #flatten_once) how sometimes arrays of arrays can suddenly crop
up and need to be preserved.
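
For the many-hashes case, one possible alternative to chaining is a single pass that builds the result directly. This is only a sketch: `collate_all` is a hypothetical name, and it mirrors just the default behavior (array values are merged, no options):

```ruby
# Hypothetical helper: collate any number of hashes in one pass.
# Array values are merged, mirroring the default (non-preserving) behavior.
def collate_all(*hashes)
  hashes.each_with_object({}) do |hash, result|
    hash.each do |key, value|
      bucket = (result[key] ||= [])
      if value.is_a?(Array)
        bucket.concat(value)  # appends the elements; the source array is not modified
      else
        bucket << value
      end
    end
  end
end

a = { :a => 1, :b => 2 }
b = { :a => 1, :b => 4, :c => 9 }
c = { :a => 1, :b => 8, :c => 27 }
collate_all(a, b, c) #=> { :a=>[1,1,1], :b=>[2,4,8], :c=>[9,27] }
```

Because every bucket is a fresh array, no intermediate hashes are created and none of the inputs are modified.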

I would dearly love to get rid of the options hash altogether,
though. :)

Phrogz

12/19/2007 3:50:00 AM

On Dec 18, 7:49 pm, Phrogz <phr...@mac.com> wrote:
> On Dec 18, 6:29 pm, Trans <transf...@gmail.com> wrote:
>
>
>
> > On Dec 18, 7:05 pm, Phrogz <phr...@mac.com> wrote:
>
> > > I wanted a method like Hash#update, but that preserved the values from
> > > both the original and argument Hash. A little searching failed to find
> > > it. (I did find that someone somewhere wrote a Hash#collate that's in
> > > my ri docs, but who knows where it came from. Its description appears
> > > not to do at all what I wanted, anyhow.)
>
> > That's from Facets, probably. But the latest version of Facets renamed
> > it to #mash, for "map hash", which is more descriptive of what it
> > does. (#collate remains an alias for the time being).
>
> > I like your definition --actually I'm surprised I haven't worked this
> > functionality into Facets yet. I guess I thought #weave took care of
> > it, but that's slightly different b/c it only combines arrays if the
> > value is already an array. So I'm going to add this to Facets. A
> > couple thoughts though...
>
> > The options don't feel quite right. Maybe it would be more versatile to
> > define #uniq on Hash? So then
>
> > { :a=>1, :b=>2 }.collate( { :a=>1, :b=>3 } ).uniq
> > #=> { :a=>[1], :b=>[2,3] }
>
> That's an excellent point. I needed this functionality today and so I
> included it in the script; however, since it's a simple one-line (as
> seen in the implementation) post-process step, perhaps it's
> appropriate to keep it out of this method.
>
> > As for preserving the arrays, I'm not sure. Is that really all that
> > useful? Well, if it is it seems like a better definition for Hash#zip.
>
> The reason I made the arrays not be preserved by default is to enable
> chained collation of 3 or more hashes. (test5_multi_collate in the unit
> tests.) I was actually collating hundreds today. However, I put in the
> 'preserve arrays' because it seemed almost arbitrary to treat them
> differently from every other type of value. I don't personally have a
> use case that needs it now, but I know from experience (like #flatten
> versus #flatten_once) how sometimes arrays of arrays can suddenly crop
> up and need to be preserved.
>
> I would dearly love to get rid of the options hash altogether,
> though. :)

One alternative would be to drop the idea of preserving collation
order altogether, and instead accumulate the results as a Set.
Although the method would still need to branch on value type (since
set1 << set2 isn't the same as set1.merge set2), it seems far less
likely that someone would have a Hash whose values were Sets and
wanted to maintain each set as a distinct 'value' during collation.
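
A rough sketch of that Set-based idea follows. The name `collate_sets` is hypothetical, and treating Arrays like Sets (adding their elements individually) is an assumption, not something settled in this thread:

```ruby
require 'set'

# Hypothetical helper: collate two hashes, accumulating values as Sets.
# Collection values (Set or Array) contribute their elements; anything
# else is added as a single element.
def collate_sets(first, second)
  result = {}
  [first, second].each do |source|
    source.each do |key, value|
      bucket = (result[key] ||= Set.new)
      if value.is_a?(Set) || value.is_a?(Array)
        bucket.merge(value)  # adds each element of the collection
      else
        bucket << value      # adds the value itself as one element
      end
    end
  end
  result
end

collate_sets({ :a => 1, :b => 2 }, { :a => 1, :b => 3, :c => [4, 4] })
#=> { :a=>Set[1], :b=>Set[2, 3], :c=>Set[4] }
```

The branch on value type is needed because `Set[1] << Set[2]` yields a two-element set whose second member is itself a Set, while `Set[1].merge(Set[2])` yields `Set[1, 2]`.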

Nobuyoshi Nakada

12/19/2007 2:33:00 PM

Hi,

At Wed, 19 Dec 2007 09:05:11 +0900,
Phrogz wrote in [ruby-talk:284104]:
> I wanted a method like Hash#update, but that preserved the values from
> both the original and argument Hash. A little searching failed to find
> it. (I did find that someone somewhere wrote a Hash#collate that's in
> my ri docs, but who knows where it came from. Its description appears
> not to do at all what I wanted, anyhow.)

{:a=>1, :b=>2 }.update(:a=>3, :b=>4, :c=>5) {|key, *values| values}
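
One detail worth spelling out: the block given to `Hash#update` (and `Hash#merge`) only runs for keys present in both hashes, so a key found on one side only keeps its bare value. The sketch below uses the non-mutating `merge` to show this, plus one way to get arrays for every key. `collate_via_merge` is a hypothetical name, and `Hash#transform_values` assumes Ruby 2.4 or later:

```ruby
# Hypothetical helper: wrap every value in an array first, so that keys
# unique to one side also end up as arrays, then concatenate on conflicts.
def collate_via_merge(left, right)
  wrap = lambda { |hash| hash.transform_values { |value| [value] } }
  wrap.call(left).merge(wrap.call(right)) do |_key, old_values, new_values|
    old_values + new_values
  end
end

left  = { :a => 1, :b => 2 }
right = { :a => 3, :b => 4, :c => 5 }

# The block is never called for :c, so its value stays unwrapped.
left.merge(right) { |key, *values| values } #=> { :a=>[1,3], :b=>[2,4], :c=>5 }

collate_via_merge(left, right)              #=> { :a=>[1,3], :b=>[2,4], :c=>[5] }
```

Note that wrapping every value means array values are kept as nested arrays, which matches the 'preserve_arrays' behavior rather than the default.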

--
Nobu Nakada

Trans

12/19/2007 3:42:00 PM


On Dec 19, 9:32 am, Nobuyoshi Nakada <n...@ruby-lang.org> wrote:
> Hi,
>
> At Wed, 19 Dec 2007 09:05:11 +0900,
> Phrogz wrote in [ruby-talk:284104]:
>
> > I wanted a method like Hash#update, but that preserved the values from
> > both the original and argument Hash. A little searching failed to find
> > it. (I did find that someone somewhere wrote a Hash#collate that's in
> > my ri docs, but who knows where it came from. Its description appears
> > not to do at all what I wanted, anyhow.)
>
> {:a=>1, :b=>2 }.update(:a=>3, :b=>4, :c=>5) {|key, *values| values}

Woh! Little known is this karate!

You can even do:

{:a=>1, :b=>2 }.update(:a=>[1,3], :b=>4, :c=>5) {|key, *values| values.flatten.uniq}
=> {:a=>[1, 3], :b=>[2, 4], :c=>5}

T.