Asp Forum - histogram of histograms

Charles L. Snyder

2/7/2007 8:54:00 PM

Hi

I have several text files that look like this:

Brazil, 10
Brazil, 13
Brazil, 9
Bulgaria, 1
Canada, 48
Canada, 52
Canada, 38
Canada, 55
Canada, 59
Chile, 1
Chile, 1
Chile, 2
China, 7
China, 18
China, 19
China, 22
China, 25

I need to iterate through the above file(s) and get the data
summarized in the form:

Canada, 252
China, 91
Chile, 4
Brazil, 32
Bulgaria, 1

I know how to go from a single column list with multiple repeated
values to a 'histogram' type list, ie:

my_hash = countries.inject(Hash.new { 0 }) { |counts, key| counts[key] += 1; counts }
my_hash = my_hash.sort { |a,b| a[1] <=> b[1] }

but I'm unable to figure out how to get the 2-column csv values into a
total by country as shown above.
(I do have another file "countries.txt" which is a unique list of
countries.)

Thanks in advance!

CLS

7 Answers

Martin DeMello

2/7/2007 9:13:00 PM

On 2/8/07, Charles L. Snyder <clsnyder@gmail.com> wrote:
>
> I need to iterate through the above file(s) and get the data
> summarized in the form:
>
> Canada, 252
> China, 91
> Chile, 4
> Brazil, 32
> Bulgaria, 1

#------------------------------------------------------------------
countries = <<HERE
Brazil, 10
Brazil, 13
Brazil, 9
Bulgaria, 1
Canada, 48
Canada, 52
Canada, 38
Canada, 55
Canada, 59
Chile, 1
Chile, 1
Chile, 2
China, 7
China, 18
China, 19
China, 22
China, 25
HERE

totals = Hash.new {|h, k| h[k] = 0}

countries.each_line {|line|
  country, n = line.split(/,\s*/)
  totals[country] += n.to_i
}

totals.keys.sort_by {|i| -totals[i]}.each {|c|
  puts "#{c}, #{totals[c]}"
}

#------------------------------------------------------------------

martin

Robert Klemme

2/7/2007 9:18:00 PM

On 07.02.2007 21:53, Charles L. Snyder wrote:
> I have several text files that look like this:
>
> Brazil, 10
> Brazil, 13
> Brazil, 9
> Bulgaria, 1
> Canada, 48
> Canada, 52
> Canada, 38
> Canada, 55
> Canada, 59
> Chile, 1
> Chile, 1
> Chile, 2
> China, 7
> China, 18
> China, 19
> China, 22
> China, 25
>
> I need to iterate through the above file(s) and get the data
> summarized in the form:
>
> Canada, 252
> China, 91
> Chile, 4
> Brazil, 32
> Bulgaria, 1

I would do that in stream mode, i.e. summarize while reading instead of
reading everything first and summarizing afterwards (see attached). This
is more efficient, especially since these files look like they could be
large.
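
As a rough illustration of the stream approach applied to real files rather than inline data, here is a minimal sketch. It assumes the file names are passed on the command line; the method name country_totals is made up for this example and is not from the thread.

```ruby
# Sum the second column per country, reading one line at a time.
# `paths` is any list of file names (hypothetical; e.g. taken from ARGV).
def country_totals(paths)
  totals = Hash.new(0)
  paths.each do |path|
    # File.foreach yields each line without loading the whole file.
    File.foreach(path) do |line|
      country, value = line.chomp.split(/,\s*/)
      next unless country && value   # skip blank or malformed lines
      totals[country] += value.to_i
    end
  end
  totals
end

if __FILE__ == $0
  # Print "Country, total" with the largest total first.
  country_totals(ARGV).sort_by { |_, total| -total }.each do |country, total|
    puts "#{country}, #{total}"
  end
end
```

Because every file feeds the same hash, the totals are combined across all of the files named on the command line.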

> I know how to go from a single column list with multiple repeated
> values to a 'histogram' type list, ie:
>
> my_hash = countries.inject(Hash.new { 0 }) { |counts, key| counts[key] += 1; counts }

I don't know why you do this. Do you also need the number of occurrences?

> my_hash = my_hash.sort { |a,b| a[1] <=> b[1] }
>
> but I'm unable to figure out how to get the 2-column csv values into a
> total by country as shown above.
> (I do have another file "countries.txt" which is a unique list of
> countries.)

You don't need the second file unless you want to report zero counts for
countries not present.
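
If zero counts are wanted, one possible way is to seed the report from the unique list and then overlay the summed totals. This is only a sketch: with_zero_counts and the sample values are hypothetical, and in practice all_countries would come from countries.txt.

```ruby
# Build a report that contains every country from the master list,
# using 0 for countries that never appeared in the data files.
def with_zero_counts(totals, all_countries)
  zeros = Hash[all_countries.map { |name| [name, 0] }]
  zeros.merge(totals)   # totals win wherever a country was counted
end

totals = { "Canada" => 252, "Chile" => 4 }        # sample sums
all_countries = ["Canada", "Chile", "Denmark"]    # stands in for countries.txt,
                                                  # e.g. File.readlines("countries.txt").map(&:chomp)

with_zero_counts(totals, all_countries).sort_by { |_, n| -n }.each do |country, n|
  puts "#{country}, #{n}"
end
```

Countries that appear in the data but not in the list are still kept by the merge.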

Kind regards

robert

counts = Hash.new 0
DATA.each do |line|
  line.chomp!
  # parentheses avoid the "ambiguous first argument" warning for the regexp
  country, val = line.split(/,\s*/)
  counts[country] += val.to_i if country && val
end
counts.sort_by {|cn,co| -co}.each do |country, count|
  print country, " ", count, "\n"
end
__END__
Brazil, 10
Brazil, 13
Brazil, 9
Bulgaria, 1
Canada, 48
Canada, 52
Canada, 38
Canada, 55
Canada, 59
Chile, 1
Chile, 1
Chile, 2
China, 7
China, 18
China, 19
China, 22
China, 25

dblack

2/7/2007 9:31:00 PM

William James

2/7/2007 10:06:00 PM

On Feb 7, 2:53 pm, "Charles L. Snyder" <clsny...@gmail.com> wrote:
> Hi
>
> I have several text files that look like this:
>
> Brazil, 10
> Brazil, 13
> Brazil, 9
> Bulgaria, 1
> Canada, 48
> Canada, 52
> Canada, 38
> Canada, 55
> Canada, 59
> Chile, 1
> Chile, 1
> Chile, 2
> China, 7
> China, 18
> China, 19
> China, 22
> China, 25
>
> I need to iterate through the above file(s) and get the data
> summarized in the form:
>
> Canada, 252
> China, 91
> Chile, 4
> Brazil, 32
> Bulgaria, 1
>
> I know how to go from a single column list with multiple repeated
> values to a 'histogram' type list, ie:
>
> my_hash = countries.inject(Hash.new { 0 }) { |counts, key| counts[key] += 1; counts }
> my_hash = my_hash.sort { |a,b| a[1] <=> b[1] }
>
> but I'm unable to figure out how to get the 2-column csv values into a
> total by country as shown above.
> (I do have another file "countries.txt" which is a unique list of
> countries.)
>
> Thanks in advance!
>
> CLS

hash = Hash.new(0)
"Brazil, 10
Brazil, 13
Brazil, 9
Bulgaria, 1
Canada, 48
Canada, 52
Canada, 38
Canada, 55
Canada, 59
Chile, 1
Chile, 1
Chile, 2
China, 7
China, 18
China, 19
China, 22
China, 25".each_line{|s| s.split(',').inject{|k,v| hash[k] += v.to_i }}
# String#each was removed in Ruby 1.9; each_line works on old and new versions.
p hash
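
The `p hash` above prints the raw hash. To get the "Country, total" lines the original post asks for, the pairs can be sorted and formatted, roughly as sketched below. The hash literal just repeats the totals that the sample data sums to.

```ruby
# Totals as computed from the sample data in this thread.
hash = { "Brazil" => 32, "Bulgaria" => 1, "Canada" => 252, "Chile" => 4, "China" => 91 }

# Sort the pairs by descending total, then format each one as "Country, total".
lines = hash.sort_by { |_country, total| -total }
            .map { |country, total| "#{country}, #{total}" }
puts lines
```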


comp.lang.ruby
