Phil Rhoades
1/12/2008 4:54:00 AM
Brian,
On Sat, 2008-01-12 at 13:20 +0900, Brian Adkins wrote:
> On Jan 10, 5:21 pm, Phil Rhoades <p...@pricom.com.au> wrote:
> > On Fri, 2008-01-11 at 06:47 +0900, Phil Rhoades wrote:
> > > On Fri, 2008-01-11 at 04:19 +0900, Brian Adkins wrote:
> > > > def block_extractor file, fields,
> > > >     splitter = lambda {|line| line.split }
> > > >   while !file.eof
> > > >     result = []
> > > >     fields.each do |field|
> > > >       line = file.readline
> > > >       result << splitter.call(line.chomp)[field-1] if field
> > > >     end
> > > >     yield result
> > > >   end
> > > > end
> >
> > > > File.open("data.txt", "r") do |file|
> > > >   block_extractor(file, [1,2,5,10,12]) do |fields|
> > > >     puts fields.join(' ')
> > > >   end
> > > > end
> >
> > > Thanks! - now I just need to work out how that actually works and then
> > > work out how I can modify it to use command line parameters.
> >
> > I apologise for replying to my own post but I have had a look at this
> > and read up about Procs and Lambdas and I can sorta see what you are
> > doing but would you be so kind as to elaborate on the code a bit? - I
> > think other people would find it useful as well . .
>
> I'd be glad to. What question do you have?
I'll come back to that after having another look at your code but see
below . .
> > Also, to generalise the code further, how would you select two or more
> > fields from each line?
>
> Well, this is very much a toy/example program, so I wouldn't build on
> it too much. There is a cost and a benefit to generalization, so it
> might be worthwhile to spend some time thinking about how general you
> need the function to be.
>
> A simple way to select two or more fields from each line would be to
> change from:
>
> [a, b, ...]
>
> to:
>
> [ [a1, a2, ...], [b1, b2, ...], ... ]
>
> or possibly use a hash with the key being a block-relative line
> number, and the value being a list of field numbers. Or you may want
> an external specification of the fields to extract - kind of like HTML
> templating in reverse.
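As a rough sketch of the nested-list idea quoted above (this is not Brian's code; the name multi_field_extractor and the sample data are made up for illustration), each entry of the spec could be a list of 1-based field numbers for the corresponding line of the block:

```ruby
require 'stringio'

# Hypothetical variant of block_extractor: field_spec holds one list of
# 1-based field numbers per line of the block (an empty list skips a line).
def multi_field_extractor(io, field_spec, splitter = lambda { |line| line.split })
  until io.eof?
    result = []
    field_spec.each do |wanted|
      break if io.eof?                      # tolerate a short final block
      parts = splitter.call(io.readline.chomp)
      wanted.each { |n| result << parts[n - 1] }
    end
    yield result
  end
end

# Two-line blocks: fields 1 and 3 from the first line, field 2 from the second.
data = StringIO.new("a b c\nd e f\ng h i\nj k l\n")
multi_field_extractor(data, [[1, 3], [2]]) { |fields| puts fields.join(' ') }
# prints "a c e" and then "g i k"
```

A hash keyed by block-relative line number would work in much the same way; the list form just keeps the spec in line order.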
While I was waiting I thought I would go ahead and produce something
that would do exactly what I wanted and then get some feedback on it. I
wanted to be able to run a program with parameters, e.g.:
multi_line_cvs.rb filename.txt #lines_in_block #fields_in_line arraycell1 arraycell2 arraycell3 . .
like:
/t070.rb infile.txt 5 12 0,0 1,1 2,4 3,9 4,11
So I have produced this:
#!/usr/bin/ruby
filename = ARGV.shift
lib = ARGV.shift.to_i # No. of lines in block
fil = ARGV.shift.to_i # Max. no. of fields to read in line
infile = File::open( filename, 'r' )
count = 0
array = Array.new( lib ) { Array.new( fil ) }
infile.each { |line|
  fields = line.chomp.split( "\t" ) # split each line only once
  for field in 0..( fil-1 )
    array[ count ][ field ] = fields[ field ].to_s # a missing field becomes ''
  end
  count += 1
  if count == lib
    output = ''
    ARGV.each { |cell|
      row, col = cell.split( ',' )
      output << array[ row.to_i ][ col.to_i ]
      output << "\t"
    }
    output.chop! # chop! removes the trailing tab in place; plain chop only returns a copy
    puts output
    count = 0
    array = Array.new( lib ) { Array.new( fil ) }
  end
}
infile.close
and this does just what I want: the output is correct on the
example above, i.e. "1 12 25 j o"
It obviously needs error handling and there are probably other
suggestions people can make to improve/replace it . .
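On the error-handling point, here is a rough sketch of one way the same idea could be restructured with each_slice and some argument checking. The helper names parse_cells and extract_cells are hypothetical, and it assumes tab-separated input and 0-based row,col cells as in the example above:

```ruby
# Hypothetical helper: turn "row,col" strings into [row, col] integer pairs,
# rejecting anything that is not two non-negative integers.
def parse_cells(args)
  args.map do |cell|
    unless cell =~ /\A\d+,\d+\z/
      raise ArgumentError, "bad cell #{cell.inspect}, expected row,col"
    end
    cell.split(',').map { |n| n.to_i }
  end
end

# Yields one array of selected values per complete block of lines_in_block
# lines. Cells that do not exist in a block come back as empty strings.
def extract_cells(lines, lines_in_block, cells)
  lines.each_slice(lines_in_block) do |block|
    next if block.size < lines_in_block     # ignore a trailing partial block
    rows = block.map { |line| line.chomp.split("\t") }
    yield cells.map { |row, col| (rows[row] || [])[col].to_s }
  end
end

# Made-up sample data: two blocks of two lines each.
lines = ["a\tb\tc\n", "d\te\tf\n", "g\th\ti\n", "j\tk\tl\n"]
extract_cells(lines, 2, parse_cells(%w[0,0 1,2])) do |values|
  puts values.join("\t")
end
# prints "a<TAB>f" and then "g<TAB>l"
```

To run it against a file, the lines argument could be File.foreach(filename), with the block size and cells taken from ARGV as before. The fields-per-line count is not needed in this form, because each line is split once and indexed directly.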
The original question was whether something that would do this already
existed as a gem or library but it appears not . .
Regards,
Phil.
--
Philip Rhoades
Pricom Pty Limited (ACN 003 252 275 ABN 91 003 252 275)
GPO Box 3411
Sydney NSW 2001
Australia
Fax: +61:(0)2-8221-9599
E-mail: phil@pricom.com.au