comp.lang.ruby

Seeking the Ruby way

Todd Breiholz

2/2/2006 9:58:00 PM

I'm just getting my feet wet with Ruby and would like some advice on how you
"old-timers" would write the following script using Ruby idioms.

The intent of the script is to parse a CSV file that contains 2 fields per
row, sorted on the second field. There may be multiple rows for field 2. I
want to get a list of all of the unique values of field2 that has more than
1 value for the 1st 6 characters of field 1.

Here's what I did:

require 'csv'

last_account_id = ''
last_adv_id = ''
parent_co_ids = []
cntr = 0
first = true
CSV::Reader.parse(File.open('e:\\tmp\\20060201\\bsa.csv', 'r')) do |row|
  if row[1] == last_account_id
    parent_co_ids << last_adv_id[0, 6] unless
      parent_co_ids.include?(last_adv_id[0, 6])
  else
    if !first
      parent_co_ids << last_adv_id[0, 6] unless
        parent_co_ids.include?(last_adv_id[0, 6])
      if parent_co_ids.size > 1
        puts "#{last_account_id} - (#{parent_co_ids.join(',')})"
        cntr = cntr + 1
      end
      parent_co_ids.clear
    else
      first = false
    end
  end
  last_account_id = row[1]
  last_adv_id = row[0]
end
puts "Found #{cntr} accounts with multiple parent companies"

Thanks in advance!

Todd Breiholz
12 Answers

Ara.T.Howard

2/2/2006 10:22:00 PM

Jacob Fugal

2/2/2006 11:10:00 PM

On 2/2/06, ara.t.howard@noaa.gov <ara.t.howard@noaa.gov> wrote:
> require "csv"
> require "yaml"
>
> path = ARGV.shift
> sum = Hash::new{|h,k| h[k] = 0}
> count = lambda{|row| sum[row.last.to_s[0,6]] += 1}
> CSV::open(path,"r"){|row| count[row]}
> y sum.delete_if{|k,v| v == 1}

I'm curious why you decided to make `count` its own lambda when:

1) It's only ever used once
2) The block that uses it has only one statement, namely the call to `count`
3) count and the block to CSV::open have the same signature

I think at a minimum, given 2) and 3), I'd just replace the block to
CSV::open with count itself:

count = lambda{|row| sum[row.last.to_s[0,6]] += 1}
CSV::open(path,"r", &count)

Then, since count isn't used anywhere else, I'd join those together:

CSV::open(path,"r"){|row| sum[row.last.to_s[0,6]] += 1}

After those transformations:

galadriel:~ lukfugl$ cat a.rb
require "csv"
require "yaml"

path = ARGV.shift
sum = Hash::new{|h,k| h[k] = 0}
CSV::open(path,"r"){|row| sum[row.last.to_s[0,6]] += 1}
y sum.delete_if{|k,v| v == 1}

galadriel:~ lukfugl$ cat in.csv
0,aaaaaa___
1,aaaaaa___
2,aaabbb___
3,aaabbb___
4,aaabbb___
5,aaaccc___

galadriel:~ lukfugl$ ruby a.rb in.csv
---
aaaaaa: 2
aaabbb: 3

Just seems a little clearer to me over having an extra one-time use lambda.

Jacob Fugal
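
For readers on a current Ruby, here is a minimal sketch of the `&count` step described above. It is an adaptation under assumptions, not the code from the thread: the rows of in.csv are inlined and parsed with `CSV.parse` so that no file is needed, and the result is kept in a variable named `repeated` purely for illustration.

```ruby
require "csv"

# Sample rows copied from in.csv in the post above; inlined here (an
# assumption for the sketch) so the example needs no file on disk.
data = <<~ROWS
  0,aaaaaa___
  1,aaaaaa___
  2,aaabbb___
  3,aaabbb___
  4,aaabbb___
  5,aaaccc___
ROWS

sum   = Hash.new { |h, k| h[k] = 0 }
count = lambda { |row| sum[row.last.to_s[0, 6]] += 1 }

# `&count` hands the lambda to `each` as its block, so no wrapper block
# such as { |row| count[row] } is needed.
CSV.parse(data).each(&count)

# Keep only the prefixes that were seen more than once.
repeated = sum.reject { |_prefix, n| n == 1 }
p repeated
```

When reading from a file, `CSV.foreach(path, &count)` should do the same job in current Ruby. The `CSV::open(path, "r") { |row| ... }` form used in the thread is the API of its time; in current versions `CSV.open` with a block yields the CSV object rather than each row.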


Joel VanderWerf

2/2/2006 11:34:00 PM

Jacob Fugal wrote:

> sum = Hash::new{|h,k| h[k] = 0}

And for some reason, I tend to write

sum = Hash.new(0)

when dealing with an immediate value. (But maybe it's a better practice
to use Ara's form, so that if you ever replace 0 with, say, a matrix,
you don't reuse the same object for each key in the hash.)

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407
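
The difference between the two forms can be sketched as follows. This is an illustration only; it uses an Array where the post mentions a matrix, and the variable names are made up for the example.

```ruby
# Shared default: every missing key returns the very same Array object.
shared = Hash.new([])
shared[:a] << 1
shared[:b] << 2
shared_a    = shared[:a]   # [1, 2], because :a and :b share one array
shared_size = shared.size  # 0, nothing was ever stored under a key

# Block default: each missing key gets, and stores, its own Array.
fresh = Hash.new { |h, k| h[k] = [] }
fresh[:a] << 1
fresh[:b] << 2

# An immediate value such as 0 is safe as a shared default, since += rebinds
# the key to a new Integer instead of mutating the default object.
counts = Hash.new(0)
%w[x y x].each { |word| counts[word] += 1 }

p shared_a, shared_size, fresh, counts
```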


Ara.T.Howard

2/2/2006 11:56:00 PM

Jacob Fugal

2/3/2006 12:18:00 AM

On 2/2/06, ara.t.howard@noaa.gov <ara.t.howard@noaa.gov> wrote:
> On Fri, 3 Feb 2006, Jacob Fugal wrote:
> > I'm curious why you decided to make `count` its own lambda when:
> >
> > 1) It's only ever used once
> > 2) The block that uses it has only one statement, namely the call to `count`
> > 3) count and the block to CSV::open have the same signature
>
> it's for abstraction only.

<snip>

> basically i find
>
> {{{{}}}}
>
> tough to read sometimes and factor out things using lambda. it's rare that it
> actually ends up being the only thing left as in this case - but here you
> are quite right that it can be compacted.

Yeah, I agree. I often use similar abstraction techniques for
readability. My brain just has the tendency to refactor code inwards
as well as outwards when an abstraction seems extraneous.

> > CSV::open(path,"r"){|row| sum[row.last.to_s[0,6]] += 1}
>
> but i disagree here. people, esp nubies will look at that and say - what?
> whereas reading
>
> count = lambda{|row| sum[row.last.to_s[0,6]] += 1}
>
> ... count[row] ...
>
> is pretty clear. i often use variables as comments to others and myself.

Again, agreed. In this case though I don't think the abstraction of
naming sum[...] += 1 as count is a necessary one. If I were to
refactor part of the complex expression

sum[row.last.to_s[0,6]] += 1

to improve readability, it would be the index:

identifier_prefix = lambda{ |row| row.last.to_s[0,6] }
... sum[identifier_prefix[row]] += 1 ...

> what does this do:
>
> password = "#{ sifname }_#{ eval( ((0...256).to_a.map{|c| c.chr}.sort_by{rand}.select{|c| c =~ %r/[[:print:]]/})[0,4].join.inspect ) }"
>
> hard to say huh?

Ick, yes, I'd definitely split that into chunks. :)

> how about this?
>
> four_random_printable_chars = eval( ((0...256).to_a.map{|c| c.chr}.sort_by{rand}.select{|c| c =~ %r/[[:print:]]/})[0,4].join.inspect )
> password = "#{ sifname }_#{ four_random_printable_chars }"
>
> ugly (yes i'm hacking like crazy today) but at least anyone reading it (most
> importantly me) knows what i'm trying to do if not how!

If you say so... ;)

Jacob Fugal
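
The quoted one-liner is easier to follow once its pieces are named. The sketch below is one possible rewrite under assumptions, not Ara's code: it limits itself to printable ASCII (codes 32 to 126) instead of whatever `[[:print:]]` matches across all 256 byte values, uses `Array#sample` in place of `sort_by{rand}` followed by `[0,4]`, and drops the `eval( ... .inspect )` round trip. `sifname` is not defined anywhere in the thread, so a placeholder value stands in for it.

```ruby
# Printable ASCII only (space through tilde); an assumption, since the
# original filtered all 256 byte values with [[:print:]].
PRINTABLE = (32..126).map(&:chr).freeze

# Pick four distinct characters at random, like taking the first four
# elements of a shuffled list.
def four_random_printable_chars
  PRINTABLE.sample(4).join
end

sifname  = "example" # hypothetical placeholder; not defined in the thread
password = "#{sifname}_#{four_random_printable_chars}"
puts password
```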


William James

2/3/2006 2:35:00 AM

Todd Breiholz wrote:
> I'm just getting my feet wet with Ruby and would like some advice on how you
> "old-timers" would write the following script using Ruby idioms.
>
> The intent of the script is to parse a CSV file that contains 2 fields per
> row, sorted on the second field. There may be multiple rows for field 2. I
> want to get a list of all of the unique values of field2 that has more than
> 1 value for the 1st 6 characters of field 1.

--- input data -----
123456ab,900
123456cd,900
123456ef,909
012345gh,909
--- end of input -----

--- Using a hash of arrays:

require 'csv'

h = Hash.new{ [] }
CSV::Reader.parse(File.open( ARGV.first )) { |row|
h[row.last] |= [ row.first[0,6] ] }
p h.delete_if{|k,v| v.size == 1 }

--- output -----
{"909"=>["123456", "012345"]}
--- end of output -----


--- Using a hash of hashes:

require 'csv'

h = Hash.new{|h,k| h[k] = {} }
CSV::Reader.parse(File.open( ARGV.first )) { |row|
h[row.last][ row.first[0,6] ] = 8 }
p h.delete_if{|k,v| v.size == 1 }

--- output -----
{"909"=>{"012345"=>8, "123456"=>8}}
--- end of output -----

Robert Klemme

2/3/2006 10:16:00 AM

William James wrote:
> Todd Breiholz wrote:
>> I'm just getting my feet wet with Ruby and would like some advice on
>> how you "old-timers" would write the following script using Ruby
>> idioms.
>>
>> The intent of the script is to parse a CSV file that contains 2
>> fields per row, sorted on the second field. There may be multiple
>> rows for field 2. I want to get a list of all of the unique values
>> of field2 that has more than 1 value for the 1st 6 characters of
>> field 1.
>
> --- input data -----
> 123456ab,900
> 123456cd,900
> 123456ef,909
> 012345gh,909
> --- end of input -----
>
> --- Using a hash of arrays:
>
> require 'csv'
>
> h = Hash.new{ [] }

I wonder how this works since the Hash never stores these arrays.

> CSV::Reader.parse(File.open( ARGV.first )) { |row|
> h[row.last] |= [ row.first[0,6] ] }
> p h.delete_if{|k,v| v.size == 1 }
>
> --- output -----
> {"909"=>["123456", "012345"]}
> --- end of output -----

Is this really the output of the script above?

robert
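
On the question of how the hash-of-arrays version can work: a plain read such as `h[key]` does leave the hash empty, but `h[key] |= [x]` is shorthand for `h[key] = h[key] | [x]`, and that assignment stores the result. A small sketch (variable names are illustrative) shows both halves, and a hand trace of the sample input appears to give the quoted output.

```ruby
h = Hash.new { [] } # the block returns a new Array but stores nothing

probe = h["900"]                  # [] from the default block
stored_after_read = h.key?("900") # false: reading did not store anything

# h[k] |= [x] expands to h[k] = h[k] | [x]. The assignment is what stores
# the array, and Array#| (set union) drops duplicates.
h["909"] |= ["123456"]
h["909"] |= ["012345"]
h["909"] |= ["123456"] # already present, so the array is unchanged

p h
```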

Robert Klemme

2/3/2006 10:22:00 AM

Todd Breiholz wrote:
> I'm just getting my feet wet with Ruby and would like some advice on
> how you "old-timers" would write the following script using Ruby
> idioms.
>
> The intent of the script is to parse a CSV file that contains 2
> fields per row, sorted on the second field. There may be multiple
> rows for field 2. I want to get a list of all of the unique values of
> field2 that has more than 1 value for the 1st 6 characters of field 1.

There are two possible interpretations of what you state here:

1. You want all values of field 2 that occur more than once.

2. You want all values of field 2 that have more than one distinct field 1
value.

Implementations:

ad 1.

require 'csv'

h = Hash.new(0)
CSV::Reader.parse(ARGF) {|row| h[row[1]] += 1}
h.each {|k,v| puts k if v > 1}


ad 2.

require 'csv'
require 'set'

h = Hash.new {|h,k| h[k] = Set.new}
CSV::Reader.parse(ARGF) {|row| h[row[1]] << row[0]}
h.each {|k,v| puts k if v.size > 1}

Note: CSV::Reader can use ARGF which makes it easy to read from stdin as
well as multiple files.

Kind regards

robert
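
A sketch of the second interpretation for a current Ruby follows. It rests on some assumptions: the sample rows from William James's post are inlined and parsed with `CSV.parse` instead of `ARGF`, the names `prefixes_by_account` and `multi` are invented for the example, and only the first six characters of field 1 are compared, as the original question asks, rather than the whole field.

```ruby
require "csv"
require "set"

# Sample rows from earlier in the thread, inlined so no file is needed.
data = <<~ROWS
  123456ab,900
  123456cd,900
  123456ef,909
  012345gh,909
ROWS

# Each field 2 value maps to the set of distinct six-character prefixes
# seen in field 1.
prefixes_by_account = Hash.new { |h, k| h[k] = Set.new }
CSV.parse(data).each do |row|
  prefixes_by_account[row[1]] << row[0][0, 6]
end

# Keep the field 2 values that have more than one distinct prefix.
multi = prefixes_by_account.select { |_account, prefixes| prefixes.size > 1 }.keys
p multi
```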

Robert Klemme

2/3/2006 10:35:00 AM

Robert Klemme wrote:
> Todd Breiholz wrote:
>> I'm just getting my feet wet with Ruby and would like some advice on
>> how you "old-timers" would write the following script using Ruby
>> idioms.
>>
>> The intent of the script is to parse a CSV file that contains 2
>> fields per row, sorted on the second field. There may be multiple
>> rows for field 2. I want to get a list of all of the unique values of
>> field2 that has more than 1 value for the 1st 6 characters of field
>> 1.
>
> There are two possible interpretations of what you state here:
>
> 1. You want all values of field 2 that occur more than once.

Just remembered that the file is sorted. Then this implementation of case
1 is even more efficient, as it does not store values in memory and works
on arbitrarily large files:

require 'csv'

last = nil
CSV::Reader.parse(ARGF) do |row|
  last, k = row[1], last
  puts k if last == k
end

Kind regards

robert
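
The sorted-input idea can also be combined with the distinct-prefix reading of the question. The sketch below is an assumption-laden variant, not code from the thread: it uses `Enumerable#chunk` from current Ruby to group consecutive rows with the same field 2, inlines the earlier sample rows, and collects results in a hypothetical `report` array. Two details it tries to address: the loop above prints a value once for every repeated row, so a run of three rows prints it twice, and the script in the original question appears never to report the final account, because it only flushes when field 2 changes.

```ruby
require "csv"

# Sample rows from earlier in the thread, already sorted on field 2.
data = <<~ROWS
  123456ab,900
  123456cd,900
  123456ef,909
  012345gh,909
ROWS

# CSV#each without a block returns an Enumerator, so rows are pulled one at
# a time; chunk then groups consecutive rows that share field 2.
rows = CSV.new(data).each

report = []
rows.chunk { |row| row[1] }.each do |account, group|
  prefixes = group.map { |row| row[0][0, 6] }.uniq
  report << [account, prefixes] if prefixes.size > 1
end

report.each { |account, prefixes| puts "#{account} - (#{prefixes.join(',')})" }
puts "Found #{report.size} accounts with multiple parent companies"
```

Only one group of rows is held at a time, so memory use depends on the largest run of equal field 2 values rather than on the size of the file.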
