[lnkForumImage]
TotalShareware - Download Free Software

Confronta i prezzi di migliaia di prodotti.
Asp Forum
 Home | Login | Register | Search 


 

Forums >

comp.lang.ruby

Re: Set intersection with a specific criteria: a better idiom?

Joel VanderWerf

5/31/2007 5:21:00 PM

Eric Torreborre wrote:
> Hi,
>
> I have been trying to intersect 2 sets of objects that where I want the "intersected" objects to share a common property.
>
> This is what I came up with:
>
> Here, I want to get all the Errors having the same file name:
>
> def intersection_set
> Set.new(event_errors).&(Set.new(socket_channel_errors)) {|x| file_name(x)}
> end
>
> To allow that I extended the Set class with:
>
> def &(enum, &property)
> n = self.class.new
> if block_given?
> mapped = self.map{|e| property.call(e)}
> enum.each { |o| n.add(o) if mapped.include?(property.call(o)) }
> else
> enum.each { |o| n.add(o) if include?(o) }
> end
> n
> end
>
> I have met this requirement quite some times while scripting and I may miss something here.
> Do you know a more idiomatic, Ruby-way of doing that?

Hm, it's like Schwartzian transform: intersect_by instead of sort_by.

Your implementation looks good, but might be more efficient with:

mapped = self.class.new(map{|e| property.call(e)})

which would make mapped a set, for faster lookups.

Also, using yield will probably be more efficient than instantiating a
proc and calling it:

def &(enum)
n = self.class.new
if block_given?
mapped = self.class.new(map{|e| yield e})
enum.each { |o| n.add(o) if mapped.include?(yield o) }
else
enum.each { |o| n.add(o) if include?(o) }
end
n
end

It would be more elegant to use select than to iterate through enum
using #each and #add to the set, but Set#select returns an Array. At the
cost of using two iterations instead of once, you could do this:


require 'set'

class Set
def intersect_by(other)
props = self.class.new(other.map {|t| yield t})
self.class.new(select {|t| props.include? t.prop})
end

# Actually the following is closer to the origial:
# def intersect_by(other)
# props = self.class.new(map {|t| yield t})
# self.class.new(other.select {|t| props.include? t.prop})
# end

alias old_intersect &

def &(other)
if block_given?
intersect_by(other) {|t| yield t}
else
old_intersect(other) {|t| yield t}
end
end
end

class Test
attr_accessor :prop
def initialize prop
@prop = prop
end
end

s1 = Set.new([
Test.new(1),
Test.new(2),
Test.new(3)
])

s2 = Set.new([
Test.new(2),
Test.new(3),
Test.new(4)
])

p(s1.intersect_by(s2) {|t| t.prop})
p(Set[1,2,3]&Set[2,3,4])

__END__

#<Set: {#<Test:0xb79f73e4 @prop=2>, #<Test:0xb79f73d0 @prop=3>}>
#<Set: {2, 3}>

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

1 Answer

Robert Klemme

6/1/2007 11:22:00 AM

0

On 31.05.2007 19:21, Joel VanderWerf wrote:
> Eric Torreborre wrote:
>> Hi,
>>
>> I have been trying to intersect 2 sets of objects that where I want
>> the "intersected" objects to share a common property.
>>
>> This is what I came up with:
>>
>> Here, I want to get all the Errors having the same file name:
>>
>> def intersection_set
>> Set.new(event_errors).&(Set.new(socket_channel_errors)) {|x|
>> file_name(x)}
>> end
>>
>> To allow that I extended the Set class with:
>>
>> def &(enum, &property)
>> n = self.class.new
>> if block_given?
>> mapped = self.map{|e| property.call(e)}
>> enum.each { |o| n.add(o) if mapped.include?(property.call(o)) }
>> else
>> enum.each { |o| n.add(o) if include?(o) }
>> end
>> n
>> end
>>
>> I have met this requirement quite some times while scripting and I may
>> miss something here.
>> Do you know a more idiomatic, Ruby-way of doing that?

Basically you want to do the intersection based on a specific criterion.
My first choice in such a scenario would be a Hash based solution,
i.e. you create an index based on what you consider the key field. See
attachment for a sample implementation.

Regarding your question whether to include your code in std lib or
facets lib: I vote "no". Here's why: a set intersection will leave you
with a single uniform set of items, namely all items present in all sets
that take part in the intersection. Your solution is not directly an
intersection but it creates a copy of one set with all elements from the
copy removed that do not fit a certain criterion. More specifically,
you kind of reversed the logic because the resulting set has only
elements from the /parameter/ (i.e. the right hand side of the operator)
and not from the left hand side. So while for intersection order of
arguments does not matter, it does for substraction:

irb(main):001:0> require 'set'
=> true
irb(main):002:0> s1 = (1..10).to_set
=> #<Set: {5, 6, 1, 7, 2, 8, 3, 9, 4, 10}>
irb(main):003:0> s2 = (5..15).to_set
=> #<Set: {5, 11, 6, 12, 7, 13, 8, 14, 9, 15, 10}>
irb(main):004:0> s1 - s2
=> #<Set: {1, 2, 3, 4}>
irb(main):005:0> s2 - s1
=> #<Set: {11, 12, 13, 14, 15}>

I would rewrite your original code like this:

require 'set'
....
# use a set for efficiency reasons
keys = event_errors.map {|x| file_name x}.to_set
# or, more efficient:
keys = event_errors.inject(Set.new) {|s,x| s << file_name(x)}
# create the result
result = socket_channel_errors.select {|x| keys.include? file_name(x)}

Since this is basically a two liner I don't see any necessity to change
std libs.

Kind regards

robert