comp.lang.ruby

Output unique values in CSV columns to a text file

Drew Olson

12/18/2006 10:05:00 PM

What I want to do is read in a CSV file and produce an output which
lists the unique values from each column in the following format:

Column: ColHeader1
UniqueVal1
UniqueVal2
Column: ColHeader2
UniqueVal1
UniqueVal2
...

What I'm currently getting is output that looks as follows:

Column: ColHeader1

ColHeader1UniqueVal1
ColHeader1UniqueVal2
Column: ColHeader2

ColHeader2UniqueVal1
ColHeader2UniqueVal2
...

For some reason, it is appending the column header to each value and
also printing a blank row to start each column. My code is below. Any
help is much appreciated. Essentially, I read the CSV into a hash where
each key is a column header and each value is an array of the values from
that column. I then run .uniq! on each array in the hash and print the
results to a file.

require 'rubygems'
require 'faster_csv'

infile = "xyz.csv"

uniques = {}

FCSV.open(infile, :headers => true).each do |row|
  row.each_with_index do |element, j|
    uniques[row.headers[j]] ||= []
    uniques[row.headers[j]] << element
  end
end

uniques.each do |key, element|
  element.uniq!
end

File.open("unique_output.txt", "w+") do |out|
  uniques.each_key do |key|
    out.write "Column: #{key}\n"
    uniques[key].each do |element|
      out.write " #{element}\n"
    end
  end
end

--
Posted via http://www.ruby-....
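
The hash-of-arrays approach described in the question can be sketched as follows. This is a minimal sketch, not the poster's code: it assumes Ruby's standard `csv` library (the successor to the `faster_csv` gem used in the thread), and the helper names `unique_values_by_column` and `format_uniques` are hypothetical. The point it illustrates is that iterating a row yields [header, field] pairs rather than bare fields. On the Ruby 1.8 interpreters current in 2006, interpolating such a pair into a string concatenated its two members, which would likely explain the header being glued to each value.

```ruby
require "csv"

# Collect the unique values of each column, keyed by header.
# CSV::Row#each yields [header, field] pairs, so both parts are
# destructured here instead of treating the pair as one element.
def unique_values_by_column(csv_text)
  uniques = Hash.new { |hash, key| hash[key] = [] }
  CSV.parse(csv_text, headers: true).each do |row|
    row.each do |header, field|
      uniques[header] << field
    end
  end
  uniques.each_value(&:uniq!)
  uniques
end

# Render the result in the "Column: header" layout from the question.
def format_uniques(uniques)
  uniques.map { |header, values|
    (["Column: #{header}"] + values).join("\n")
  }.join("\n") + "\n"
end

# Hypothetical sample data, for illustration only.
sample = "nums,letters\n1,a\n2,b\n2,b\n3,c\n"
print format_uniques(unique_values_by_column(sample))
```

Building the report as a string keeps the formatting separate from the file handling; the string can then be written with `File.write("unique_output.txt", ...)`.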

2 Answers

James Gray

12/18/2006 10:21:00 PM

On Dec 18, 2006, at 4:04 PM, Drew Olson wrote:

> What I want to do is read in a CSV file and produce an output which
> lists the unique values from each column in the following format:
>
> Column: ColHeader1
> UniqueVal1
> UniqueVal2
> Column: ColHeader2
> UniqueVal1
> UniqueVal2
> ...

Well, if it all fits in memory it's super easy using FCSV's Tables:

#!/usr/bin/env ruby -w

require "rubygems"
require "faster_csv"

table = FCSV.parse(DATA.read, :headers => true)
table.by_col!.each do |header, col|
  puts "#{header}:"
  puts " #{col.uniq.join(', ')}"
end

__END__
nums,letters
1,a
2,b
2,b
3,c
3,c
3,c

James Edward Gray II
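
The table-based answer above prints each column's values on one comma-joined line. A variation that emits the one-value-per-line layout asked for in the question might look like the sketch below. It assumes Ruby's standard `csv` library rather than the `faster_csv` gem, and `write_unique_columns` is a hypothetical helper name. Like the original, it reads the whole file into memory.

```ruby
require "csv"

# Write each column's unique values to `io`, one value per line,
# under a "Column: header" heading.
def write_unique_columns(csv_text, io)
  table = CSV.parse(csv_text, headers: true)
  # In column mode, each yields [header, column_values] pairs.
  table.by_col!.each do |header, column|
    io.puts "Column: #{header}"
    column.uniq.each { |value| io.puts value }
  end
end

# Hypothetical sample data, for illustration only.
write_unique_columns("nums,letters\n1,a\n2,b\n2,b\n3,c\n", $stdout)
```

Because the helper takes any IO object, writing to the file named in the question is a matter of passing an open file, for example `File.open("unique_output.txt", "w") { |out| write_unique_columns(File.read("xyz.csv"), out) }`.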

William James

12/19/2006 12:45:00 AM


data = DATA.readlines.map{|s| s.chomp.split(",")}
header = data.shift.map{|s| "Column: " + s}

data = data.transpose.map{|ary| ary.uniq.map{|s| " " + s} }

puts header.zip(data)


__END__
It's,so,simple!
a,b,c
a,b,c
d,e,f
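
The transpose approach above splits each line on commas, so a quoted field that itself contains a comma would be broken apart. A sketch of the same idea using a real CSV parser follows. It assumes Ruby's standard `csv` library, and `unique_columns_report` is a hypothetical helper name. As with the original, `transpose` requires every row to have the same number of fields and raises an IndexError otherwise.

```ruby
require "csv"

# Build the report by transposing rows into columns.
# CSV.parse handles quoted fields, unlike String#split(",").
def unique_columns_report(csv_text)
  header, *rows = CSV.parse(csv_text)
  labels = header.map { |name| "Column: #{name}" }
  columns = rows.transpose.map(&:uniq)
  labels.zip(columns).flatten.join("\n") + "\n"
end

# Hypothetical sample data with a comma inside a quoted field.
print unique_columns_report(%(name,note\n"Smith, J",x\n"Smith, J",y\n))
```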