Asp Forum - Array::uniq { block } ?

Belorion

1/26/2005 3:34:00 PM

I have an array of arrays. I want to be able to do a uniq operation
on my array, and have it accept a block so I can specify what to
examine for uniqueness. For example, in IRB

a = [ [1,2], [3,4], [1,3] ]
a.sort { |x,y| x[0] <=> y[0] }

returns => [ [1,2], [1,3], [3,4] ]

However, it does not appear uniq accepts blocks:

a = [ [1,2], [3,4], [1,3] ]
a.uniq { |x,y| x[0] == y[0] }

returns => [ [1,2], [3,4], [1,3] ]
and a.uniq! with the above block returns nil.

Am I stuck writing my own deep_uniq? Granted, it would be hard, but
certainly not as clean as the above.

25 Answers

djberg96

1/26/2005 4:14:00 PM

Belorion wrote:
> I have an array of arrays. I want to be able to do a uniq operation
> on my array, and have it accept a block so I can specify what to
> examine for uniqueness. For example, in IRB
>
> a = [ [1,2], [3,4], [1,3] ]
> a.sort { |x,y| x[0] <=> y[0] }
>
> returns => [ [1,2], [1,3], [3,4] ]
>
> However, it does not appear uniq accepts blocks:
>
> a = [ [1,2], [3,4], [1,3] ]
> a.uniq { |x,y| x[0] == y[0] }
>
> returns => [ [1,2], [3,4], [1,3] ]
> and a.uniq! with the above block returns nil.
>
> Am I stuck writing my own deep_uniq? Granted, it would be hard, but
> certainly not as clean as the above.

Not that this is a bad idea, but why do I get this feeling that we're
eventually going to find an excuse to allow blocks to virtually every
method in the core library?

Where does it all end?!

Dan

Robert Klemme

1/26/2005 5:56:00 PM

"Belorion" <belorion@gmail.com> schrieb im Newsbeitrag
news:a48d774d05012607334cd93e1e@mail.gmail.com...
>I have an array of arrays. I want to be able to do a uniq operation
> on my array, and have it accept a block so I can specify what to
> examine for uniqueness. For example, in IRB
>
> a = [ [1,2], [3,4], [1,3] ]
> a.sort { |x,y| x[0] <=> y[0] }
>
> returns => [ [1,2], [1,3], [3,4] ]
>
> However, it does not appear uniq accepts blocks:
>
> a = [ [1,2], [3,4], [1,3] ]
> a.uniq { |x,y| x[0] == y[0] }
>
> returns => [ [1,2], [3,4], [1,3] ]
> and a.uniq! with the above block returns nil.
>
> Am I stuck writing my own deep_uniq? Granted, it would be hard, but
> certainly not as clean as the above.

The block is accepted but apparently not used:

>> %w{a b c d a d}.uniq {|*x| p x}
=> ["a", "b", "c", "d"]

Could be a bug in the std lib. (Btw, I'm on 1.8.1 here)

How about using a Hash in the meantime?

>> h = a.inject({}){|h,(k,v)| h[k] ||= [k,v];h}
=> {1=>[1, 2], 3=>[3, 4]}
>> h.values
=> [[1, 2], [3, 4]]

Or

>> h = a.inject({}){|h,pair| h[pair[0]] ||= pair;h}
=> {1=>[1, 2], 3=>[3, 4]}
>> h.values
=> [[1, 2], [3, 4]]

Kind regards

robert

Robert Klemme

1/26/2005 6:02:00 PM

"Robert Klemme" <bob.news@gmx.net> schrieb im Newsbeitrag
news:35q3t9F4jlog7U1@individual.net...
>
> "Belorion" <belorion@gmail.com> schrieb im Newsbeitrag
> news:a48d774d05012607334cd93e1e@mail.gmail.com...
>>I have an array of arrays. I want to be able to do a uniq operation
>> on my array, and have it accept a block so I can specify what to
>> examine for uniqueness. For example, in IRB
>>
>> a = [ [1,2], [3,4], [1,3] ]
>> a.sort { |x,y| x[0] <=> y[0] }
>>
>> returns => [ [1,2], [1,3], [3,4] ]
>>
>> However, it does not appear uniq accepts blocks:
>>
>> a = [ [1,2], [3,4], [1,3] ]
>> a.uniq { |x,y| x[0] == y[0] }
>>
>> returns => [ [1,2], [3,4], [1,3] ]
>> and a.uniq! with the above block returns nil.
>>
>> Am I stuck writing my own deep_uniq? Granted, it would be hard, but
>> certainly not as clean as the above.
>
> The block is accepted but apparently not used:
>
>>> %w{a b c d a d}.uniq {|*x| p x}
> => ["a", "b", "c", "d"]
>
> Could be a bug in the std lib. (Btw, I'm on 1.8.1 here)

Delete that: you can provide a block for *every* method - it's simply
ignored. So no bug, but unique obviously is not built to use it. Stupid
me...

robert

>
> How about using a Hash in the meantime?
>
>>> h = a.inject({}){|h,(k,v)| h[k] ||= [k,v];h}
> => {1=>[1, 2], 3=>[3, 4]}
>>> h.values
> => [[1, 2], [3, 4]]
>
> Or
>
>>> h = a.inject({}){|h,pair| h[pair[0]] ||= pair;h}
> => {1=>[1, 2], 3=>[3, 4]}
>>> h.values
> => [[1, 2], [3, 4]]
>
> Kind regards
>
> robert
>

Its Me

1/26/2005 6:55:00 PM

I agree, #uniq is certainly not the only candidate. I think it would be very
useful to compile a list of all such methods that could use a block to
override some default internal method (#== in the case of uniq), and
consider these for 2.0.

"Daniel Berger" <djberg96@hotmail.com> wrote in message
news:1106756045.223279.94600@f14g2000cwb.googlegroups.com...
> Belorion wrote:
> > I have an array of arrays. I want to be able to do a uniq operation
> > on my array, and have it accept a block so I can specify what to
> > examine for uniqueness. For example, in IRB
> >
> > a = [ [1,2], [3,4], [1,3] ]
> > a.sort { |x,y| x[0] <=> y[0] }
> >
> > returns => [ [1,2], [1,3], [3,4] ]
> >
> > However, it does not appear uniq accepts blocks:
> >
> > a = [ [1,2], [3,4], [1,3] ]
> > a.uniq { |x,y| x[0] == y[0] }
> >
> > returns => [ [1,2], [3,4], [1,3] ]
> > and a.uniq! with the above block returns nil.
> >
> > Am I stuck writing my own deep_uniq? Granted, it would be hard, but
> > certainly not as clean as the above.
>
> Not that this is a bad idea, but why do I get this feeling that we're
> eventually going to find an excuse to allow blocks to virtually every
> method in the core library?
>
> Where does it all end?!
>
> Dan
>

ts

1/27/2005 10:34:00 AM

>>>>> "i" == itsme213 <itsme213@hotmail.com> writes:

i> I agree, #uniq is certainly not the only candidate. I think it would be very
i> useful to compile a list of all such methods that could use a block to
i> override some default internal method (#== in the case of uniq), and
i> consider these for 2.0.

#uniq don't use #==, it use #hash and #eql?

Guy Decoux

Robert Klemme

1/27/2005 11:03:00 AM

"ts" <decoux@moulon.inra.fr> schrieb im Newsbeitrag
news:200501271033.j0RAXMu05608@moulon.inra.fr...
> >>>>> "i" == itsme213 <itsme213@hotmail.com> writes:
>
> i> I agree, #uniq is certainly not the only candidate. I think it would
be very
> i> useful to compile a list of all such methods that could use a block
to
> i> override some default internal method (#== in the case of uniq), and
> i> consider these for 2.0.
>
> #uniq don't use #==, it use #hash and #eql?

Hm...

class Dummy
def eql?(x) puts "eql?"; super end
def equal?(x) puts "equal?"; super end
def ==(x) puts "=="; super end
def hash() puts "hash"; super end
def id() puts "id"; super end
end

>> h={Dummy.new => 1, Dummy.new =>2}
hash
hash
=> {#<Dummy:0x10187428>=>2, #<Dummy:0x10187440>=>1}
>> h.keys.concat(h.keys).uniq
hash
hash
hash
hash
hash
hash
hash
hash
=> [#<Dummy:0x10187428>, #<Dummy:0x10187440>]

I don't see eql? called here - maybe it's an internal optimization that
short circuits identity. Or did I miss something?

Kind regards

robert

Martin DeMello

1/27/2005 11:09:00 AM

ts <decoux@moulon.inra.fr> wrote:
> >>>>> "i" == itsme213 <itsme213@hotmail.com> writes:
>
> i> I agree, #uniq is certainly not the only candidate. I think it would be very
> i> useful to compile a list of all such methods that could use a block to
> i> override some default internal method (#== in the case of uniq), and
> i> consider these for 2.0.
>
> #uniq don't use #==, it use #hash and #eql?

Which raises the problem that it works globally, unlike the unix uniq
which only uniqs consecutive elements. Passing in a block would lead to
a performance hit since you'd need to compare every pair of elements
without the benefit of a hash function. A uniq_by would probably cover
most of the use cases anyway.

martin

ts

1/27/2005 11:12:00 AM

>>>>> "R" == Robert Klemme <bob.news@gmx.net> writes:

R> I don't see eql? called here - maybe it's an internal optimization that
R> short circuits identity. Or did I miss something?

If 2 object have different hash value, it don't need to call eql?

Now if they have the same hash value :
* if eql? return true, this is the "same" object (i.e. it will use the
same key)
* otherwise there is collision

Object#hash return the id of object

Guy Decoux

Robert Klemme

1/27/2005 12:24:00 PM

"ts" <decoux@moulon.inra.fr> schrieb im Newsbeitrag
news:200501271111.j0RBBJu07211@moulon.inra.fr...
> >>>>> "R" == Robert Klemme <bob.news@gmx.net> writes:
>
> R> I don't see eql? called here - maybe it's an internal optimization
that
> R> short circuits identity. Or did I miss something?
>
> If 2 object have different hash value, it don't need to call eql?

Ah, yes of course. That's it! Thanks for reminding me.

> Now if they have the same hash value :
> * if eql? return true, this is the "same" object (i.e. it will use the
> same key)
> * otherwise there is collision

Of all what you said I don't understand your last point: Why is there a
collision?

>> class Foo; def hash() 0 end end
=> nil
>> h = {Foo.new => 1, Foo.new => 2}
=> {#<Foo:0x101a9808>=>2, #<Foo:0x101a9898>=>1}
>> h.keys.uniq
=> [#<Foo:0x101a9808>, #<Foo:0x101a9898>]

And indeed a redefined Dummy clarifies this:

class Dummy
def eql?(x) puts "eql?"; super end
def equal?(x) puts "equal?"; super end
def ==(x) puts "=="; super end
def hash() puts "hash"; 0 end
def id() puts "id"; super end
end

>> h={Dummy.new => 1, Dummy.new => 2}
hash
hash
eql?
=> {#<Dummy:0x1018da28>=>2, #<Dummy:0x1018da88>=>1}
>> h.keys.uniq
hash
hash
eql?
=> [#<Dummy:0x1018da28>, #<Dummy:0x1018da88>]

Again a reminder that it's always a good idea to override #hash, #== and
#eql? in one go for classes that should be used as hash keys or in sets.

> Object#hash return the id of object

Yeah, but that might be overridden in an arbitrary way. Normally one
can't rely on that #hash tells anything about the identity of an instance
other than probably that instances with different hash are not identical
and not equivalent.

Kind regards

robert

ts

1/27/2005 12:40:00 PM

>>>>> "R" == Robert Klemme <bob.news@gmx.net> writes:

R> Of all what you said I don't understand your last point: Why is there a
R> collision?

Array#uniq just store the element of the array as the key of a hash.

For an hash, 2 values are stored in differents keys if they have a
different hash value, or if eql? return false.

In hash "terminology", there is collision when 2 elements have the same hash
value but are different (eql? return false) (see st.c)

Guy Decoux

comp.lang.ruby

Array::uniq { block } ?

Belorion

djberg96

Robert Klemme

Robert Klemme

Its Me

ts

Robert Klemme

Martin DeMello

ts

Robert Klemme

ts

x Login to ForumsZone