comp.lang.ruby

Mode method for Array

Glenn

9/30/2008 9:57:00 PM

[Note: parts of this message were removed to make it a legal post.]

Hi,

I'd like to write a get_mode method for the Array class. The method would return an array of the most frequently occurring element or elements.

So [3, 1, 1, 55, 55].get_mode would return [1, 55].

I have a way to do this but I don't know if it's the best way. I was wondering if anyone had any suggestions?

Thanks!

20 Answers

Eustaquio 'TaQ' Rangel

9/30/2008 10:15:00 PM

> I'd like to write a get_mode method for the Array class. The method would return an array of the most frequently occurring element or elements.
> So [3, 1, 1, 55, 55].get_mode would return [1, 55].
> I have a way to do this but I don't know if it's the best way. I was wondering if anyone had any suggestions?

What is your way? That would give us some idea of the criteria you are using
to find the most frequent elements. Something like

irb(main):001:0> [3,1,1,55,55].inject(Hash.new(0)){|memo,item| memo[item] += 1; memo}.sort_by {|e| e[1]}.reverse
=> [[55, 2], [1, 2], [3, 1]]

returns each element with its count, ordered by frequency.
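
As a rough sketch of how that frequency count could be turned into the mode itself: keep every element whose count equals the highest count. The name `modes_from_counts` is made up for illustration and is not from any library.

```ruby
# Hypothetical helper: count occurrences, then keep every element
# whose count equals the maximum count.
def modes_from_counts(items)
  counts = items.inject(Hash.new(0)) { |memo, item| memo[item] += 1; memo }
  top = counts.values.max # nil when items is empty
  counts.select { |_, n| n == top }.map { |item, _| item }
end

p modes_from_counts([3, 1, 1, 55, 55])
p modes_from_counts([])
```

An empty array simply yields an empty result here, because nothing matches a `nil` maximum.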

Trans

9/30/2008 11:34:00 PM


On Sep 30, 5:56 pm, Glenn <glenn_r...@yahoo.com> wrote:
> Hi,
>
> I'd like to write a get_mode method for the Array class. The method would return an array of the most frequently occurring element or elements.
>
> So [3, 1, 1, 55, 55].get_mode would return [1, 55].
>
> I have a way to do this but I don't know if it's the best way. I was wondering if anyone had any suggestions?

Facets has:

module Enumerable

  # In Statistics mode is the value that occurs most
  # frequently in a given set of data.

  def mode
    count = Hash.new(0)
    each {|x| count[x] += 1 }
    count.sort_by{|k,v| v}.last[0]
  end

end

Hmm... but that ignores ties: it returns only one of the tied values. I'll have to consider how to fix it.
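
To make the tie problem concrete, here is a small sketch that runs the same logic on the original poster's array. It is written as a standalone method with the made-up name `single_mode` so that it does not touch Enumerable.

```ruby
# Same steps as the method above, as a standalone (hypothetical) helper.
def single_mode(enum)
  count = Hash.new(0)
  enum.each { |x| count[x] += 1 }
  count.sort_by { |_, v| v }.last[0]
end

result = single_mode([3, 1, 1, 55, 55])
p result # a single value, 1 or 55, never both
```

Which of the two tied values comes back depends on how sort_by happens to order equal counts, so it should not be relied on.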

T.

Erik Veenstra

10/1/2008 1:12:00 AM

Using Enumerable#cluster_by (already defined in Facets):

module Enumerable
  def mode
    cluster_by do |element|
      element
    end.cluster_by do |cluster|
      cluster.length
    end.last.ergo do |clusters|
      clusters.transpose.first
    end # || []
  end
end

gegroet,
Erik V.

Erik Veenstra

10/1/2008 2:12:00 AM


There's one more problem with your code: [].mode doesn't work.
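
Assuming this refers to the `sort_by{...}.last[0]` version quoted earlier in the thread, the failure can be reproduced step by step. This is only a sketch of those steps on an empty array, not the library code itself.

```ruby
# Replay the counting steps on an empty array.
count = Hash.new(0)
[].each { |x| count[x] += 1 }

last_pair = count.sort_by { |_, v| v }.last
p last_pair # nil, because there is nothing to sort

begin
  last_pair[0]
rescue NoMethodError => e
  puts "fails with #{e.class}"
end
```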

gegroet,
Erik V.

Brian Candler

10/1/2008 7:35:00 AM

Shame that the standard Hash#invert doesn't handle duplicate values
well. My suggestion:

class Hash
  def ninvert
    inject({}) { |h,(k,v)| (h[v] ||= []) << k; h }
  end
end

class Array
  def get_mode
    (inject(Hash.new(0)) { |h,e| h[e] += 1; h }.ninvert.max ||
      [[]]).last
  end
end

p [3, 1, 1, 55, 55].get_mode
p [3, 1, 1, 55].get_mode
p [:foo, 3, "bar", :foo, 4, "bar"].get_mode
p [].get_mode

(With Ruby 1.8, if there are multiple mode values you get them in an
arbitrary order; I think with 1.9 you'd get them in the order first seen
in the original array.)
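
For anyone wondering what goes wrong with the standard invert, here is a small sketch on a hand-written count hash (the `counts` and `grouped` names are just for illustration). It avoids reopening Hash and inlines the grouping step instead.

```ruby
counts = { 3 => 1, 1 => 2, 55 => 2 }

# Hash#invert can keep only one key per value, so one of the tied
# elements (1 or 55) is silently lost.
p counts.invert

# Grouping keys by value keeps all of them.
grouped = counts.inject({}) { |h, (k, v)| (h[v] ||= []) << k; h }
p grouped

# Hash#max compares [count, elements] pairs, and the counts are unique
# keys, so the pair with the highest count wins.
p grouped.max.last
```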
--
Posted via http://www.ruby-....

Erik Veenstra

10/1/2008 11:30:00 AM

And since we all love speed, we tend to avoid inject. (For
those who don't know: inject and inject! are really, really
slow. I mean, _really_ slow...)

Speed is also the reason for "pre-defining" variables used in
iterations. (6% faster!!)

For low level methods like these, speed is much more important
than readability. And the inject versions of the methods below
aren't even more readable than the faster implementations.

So, we'll go for the fast ones:

module Enumerable
  def mode
    empty? ? [] : frequencies.group_by_value.max.last
  end

  def frequencies
    x = nil
    res = Hash.new(0)
    each{|x| res[x] += 1}
    res
  end
end

class Hash
  def group_by_value
    k = v = nil
    res = {}
    each{|k, v| (res[v] ||= []) << k}
    res
  end
end
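
The speed difference is easy to measure for yourself. The following is only a sketch: the data set, the run count and the `seconds_for` helper are arbitrary choices made for illustration, and the numbers will vary a lot between Ruby versions and machines.

```ruby
data = Array.new(100_000) { |i| i % 100 }

with_inject = lambda do
  data.inject(Hash.new(0)) { |h, e| h[e] += 1; h }
end

with_each = lambda do
  res = Hash.new(0)
  data.each { |e| res[e] += 1 }
  res
end

# Wall-clock seconds for ten runs of the given callable.
def seconds_for(callable)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  10.times { callable.call }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

puts "inject: #{seconds_for(with_inject)}"
puts "each:   #{seconds_for(with_each)}"
```

Both versions build the same hash, so any difference is purely in how the iteration is driven.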

gegroet,
Erik V. - http://www.erikve...

Robert Klemme

10/1/2008 11:43:00 AM

2008/10/1 Erik Veenstra <erikveen@gmail.com>:
> Speed is also the reason for "pre-defining" variables used in
> iterations. (6% faster!!)

Premature optimization IMHO.

Here's another nice and short one:

irb(main):001:0> module Enumerable
irb(main):002:1> def mode
irb(main):003:2> max = 0
irb(main):004:2> c = Hash.new 0
irb(main):005:2> each {|x| cc = c[x] += 1; max = cc if cc > max}
irb(main):006:2> c.select {|k,v| v == max}.map {|k,v| k}
irb(main):007:2> end
irb(main):008:1> end
=> nil
irb(main):009:0> [3, 1, 1, 55, 55].mode
=> [55, 1]
irb(main):010:0> [].mode
=> []
irb(main):011:0>
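
For readers on a much newer Ruby than this thread: assuming Ruby 2.7 or later, where Enumerable#tally is available, the counting step no longer has to be written by hand. A minimal sketch, with `tally_modes` as a made-up helper name:

```ruby
# Hypothetical helper; requires Enumerable#tally (Ruby 2.7+).
def tally_modes(enum)
  counts = enum.tally     # element => number of occurrences
  top = counts.values.max # nil for an empty collection
  counts.select { |_, n| n == top }.keys
end

p tally_modes([3, 1, 1, 55, 55])
p tally_modes([])
```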

Cheers

robert

--
remember.guy do |as, often| as.you_can - without end

David A. Black

10/1/2008 11:47:00 AM

Hi --

On Wed, 1 Oct 2008, Erik Veenstra wrote:

> And since we all love speed, we tend to avoid inject. (For
> those who don't know: inject and inject! are really, really
> slow. I mean, _really_ slow...)

What's inject! ?


David

--
Rails training from David A. Black and Ruby Power and Light:
Intro to Ruby on Rails January 12-15 Fort Lauderdale, FL
Advancing with Rails January 19-22 Fort Lauderdale, FL *
* Co-taught with Patrick Ewing!
See http://www.r... for details and updates!

David A. Black

10/1/2008 11:52:00 AM

Hi --

On Wed, 1 Oct 2008, Robert Klemme wrote:

> 2008/10/1 Erik Veenstra <erikveen@gmail.com>:
>> Speed is also the reason for "pre-defining" variables used in
>> iterations. (6% faster!!)
>
> Premature optimization IMHO.

As much as I like inject, I have to say I've always felt that the ones
that look like this:

inject({}) {|h,item| do_something; h }

are kind of unidiomatic. Evan Phoenix was saying recently on IRC (I
hope I'm remembering/quoting correctly) that his rule of thumb was
that inject was for cases where the accumulator was not the same
object every time, and that where a single object is having elements
added to it, an each iteration from the source collection was better.
I tend to agree, though I'm not able to come up with a very technical
rationale.
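
To illustrate the distinction, a small sketch with made-up variable names: in the first case each block call returns a fresh accumulator, in the second the same hash is filled throughout.

```ruby
numbers = [3, 1, 1, 55, 55]

# The accumulator is a different object each time round: inject fits.
total = numbers.inject(0) { |sum, n| sum + n }

# One object collects everything: a plain each reads more directly.
counts = Hash.new(0)
numbers.each { |n| counts[n] += 1 }

p total
p counts
```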

What say you, oh inject king?


David


Brian Candler

10/1/2008 12:57:00 PM

David A. Black wrote:
> What say you, oh inject king?

I don't know who that is, but I'll add my 2c anyway:

> As much as I like inject, I have to say I've always felt that the ones
> that look like this:
>
> inject({}) {|h,item| do_something; h }
>
> are kind of unidiomatic.

I agree; 'inject' is ideally for when you're creating a new data
structure each iteration rather than modifying an existing one. You
could do

inject({}) {|h,item| h.merge(something => otherthing)}

but that creates lots of waste.

I only used it as a convenient holder for the target object. Maybe
there's a more ruby-ish pattern where the target is the same each time
round, although I don't know what you'd call it:

module Enumerable
  def into(obj)
    each { |e| yield obj, e }
    obj
  end
end

src = {:foo=>1, :bar=>1, :baz=>2}
p src.into({}) { |tgt,(k,v)| (tgt[v] ||= []) << k }
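
For comparison, Ruby 1.9 and later ship Enumerable#each_with_object, which follows the same idea with the block arguments the other way round (element first, target second). A sketch of the same inversion using it, with `inverted` as an illustrative name:

```ruby
src = { :foo => 1, :bar => 1, :baz => 2 }

# each_with_object yields (element, target) and returns the target.
inverted = src.each_with_object({}) { |(k, v), tgt| (tgt[v] ||= []) << k }
p inverted
```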

There was also a previous suggestion of generalising map so that it
would build into an arbitrary object, not just an array.

module Enumerable
  def map2(target = [])
    each { |e| target << (yield e) }
    target
  end
end

p [1,2,3].map2 { |e| e * 2 }

class Hash
  def <<(x)
    self[x[0]] = x[1]
  end
end

p [1,2,3].map2({}) { |e| [e, e * 2] }

That would allow any target which implements :<<, so map to $stdout
would be fine.
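
As a sketch of the IO case, here the target is a StringIO rather than $stdout so that the result can be inspected afterwards. map2 is repeated to keep the example self-contained, and `out` is just an illustrative name.

```ruby
require 'stringio'

module Enumerable
  # Same map2 as above: append each block result to the target.
  def map2(target = [])
    each { |e| target << (yield e) }
    target
  end
end

out = StringIO.new
[1, 2, 3].map2(out) { |e| "#{e * 2}\n" }
print out.string
```

Passing $stdout instead of `out` would write the same lines straight to the terminal.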

It's not so useful here, since we'd need a :<< method suitable for hash
inversion. And I suppose for completeness, you'd need a wrapper class
analogous to Enumerator to map :<< to an arbitrary method name...