comp.lang.ruby

changing the format of a text file

Bary Buz

2/25/2009 11:30:00 AM

Hello everyone,

I am new to Ruby and I'm having some problems trying to reformat a text
file.

Basically, I have a large log file, around 200 MB, in the
following format:
----------------------------------------------------------
1000000 name
Status :A
Basetype :2
Version :1.0
|
|
(more
fields)
|
Name :/file/name/etc
1000001 name
Status :B
Basetype :2
Version :a20
|
|
Name :/file/name/etc
1000002 name
Status :C
|

... and so on


So each 200 MB file contains a lot of entries.

What I want to do is open the file, read the data into an array,
reformat the text, and save it into another file with the following
output:

id, Status, Basetype, .... , Name
1000000, A, 2, ..... , /file/name/etc
1000001, B, 2, ..... , /file/name/etc

I tried to write a script in Ruby to do that task, but I don't get any
output so far.

def getfile(file_name)
entry = []
IO.foreach(file_name) do |fl|
if fl.include? 'name'
entry.push fl.scan(/\d+/)[0]
elsif fl.strip =~ /\A\d/
end
end
entry
end

def writefile(file, *linedata)
linedata.each do |line|
file << line.join(", ") +\n"
end
end

def readfile(file, outputfile)
out = File.new(outputfile, "w+")
info = []

wline = ['id', 'Status', 'Basetype', .... 'Name']

IO.foreach(file) { |line|

if line =~ //
wline[0]= line.scan(/\d+/)
elsif line =~ /Status/
wline[1]= line.split(":")[1].scan(/[a-zA-Z]+/).join("")
elsif line =~ /Basetype/
wline[2]= line.split(":")[1].scan(/\d+/).join("")
|
|
|
wline all fields
|
writefile(out, wline)
end
out.close
end

readfile('filename', 'outputfile')


This is what I've done so far. Can someone tell me what's wrong? I don't
get any output at all.

Thanks in advance

2 Answers

James Coglan

2/25/2009 11:42:00 AM

2009/2/25 Bary Buz <sxetikos@hotmail.co.uk>

> Hello everyone,
>
> i am new to ruby and im having some problems trying to reformat a text
> file.
>
> Basically, i have a large log file which is around 200mb in the
> following format:
> ----------------------------------------------------------
> 1000000 name
> Status :A
> Basetype :2
> Version :1.0
> |
> |
> (more
> fields)
> |
> Name :/file/name/etc
> 1000001 name
> Status :B
> Basetype :2
> Version :a20
> |
> |
> Name :/file/name/etc
> 1000002 name
> Status :C
> |
>
> ... and so on
>
>
> so for each 200mb file there are lot of entries.
>
> What i want to do is to open the file, read the data into an array,
> reformat the text and save it into another file with the following
> output:
>
> id, Status, Basetype, .... , Name
> 1000000, A, 2, ..... , /file/name/etc
> 1000001, B, 2, ..... , /file/name/etc



I would strongly recommend looking at Treetop (http://treetop.ruby...).
It's a parser generator that produces tree structures from text files using
a grammar that you specify. If you know regular expressions, it shouldn't be
too big a leap to use Treetop's grammar language.

For this particular task it may be overkill, but certainly worth looking at.

James Gray

2/25/2009 2:16:00 PM

On Feb 25, 2009, at 5:30 AM, Bary Buz wrote:

> Hello everyone,

Hello and welcome.

> i am new to ruby and im having some problems trying to reformat a text
> file.
>
> Basically, i have a large log file which is around 200mb in the
> following format:
> ----------------------------------------------------------
> 1000000 name
> Status :A
> Basetype :2
> Version :1.0

> id, Status, Basetype, .... , Name
> 1000000, A, 2, ..... , /file/name/etc
> 1000001, B, 2, ..... , /file/name/etc

Do you just read the log file replacing variables holding Status,
Basetype, Version, and Name then spit out a new entry each time you
run across a number?

> i tried to write a script in ruby to do that task but i dont get any
> output so far.

I'll try to give some feedback...

> def getfile(file_name)
> entry = []
> IO.foreach(file_name) do |fl|
> if fl.include? 'name'
> entry.push fl.scan(/\d+/)[0]
> elsif fl.strip =~ /\A\d/
> end
> end
> entry
> end

I don't see this method used anywhere in the code.

> def writefile(file, *linedata)
> linedata.each do |line|
> file << line.join(", ") +\n"

You are missing a quote there. It should be:

... + "\n"
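
With the quote restored, the method should behave as intended. Here is a quick sketch to check it. StringIO is used only as a stand-in for the real output file, and the sample values are taken from the example output in the question:

```ruby
require "stringio"

# The method from the question with the missing quote restored.
def writefile(file, *linedata)
  linedata.each do |line|
    file << line.join(", ") + "\n"
  end
end

out = StringIO.new                      # stand-in for the real output file
writefile(out, ["1000000", "A", "2"])   # one row, passed as an array
print out.string
```

Because of the splat in the parameter list, each argument after the file is treated as one row.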

>
> end
> end
>
> def readfile(file, outputfile)
> out = File.new(outputfile, "w+")
> info = []
>
> wline = ['id', 'Status', 'Basetype', .... 'Name']
>
> IO.foreach(file) { |line|
>
> if line =~ //

Don't do that. It doesn't do what you think it does. :)

What are you looking for here? A line that starts with a digit? If
so, use this:

  if line =~ /\A\s*(\d+)/
    # the digit is in the $1 variable here...
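
To make the difference concrete, here is a small sketch using sample lines from the log excerpt in the question. An empty pattern matches every string at position 0, so `line =~ //` is always truthy and the first branch of the if wins every time. The anchored pattern only matches lines that begin with digits:

```ruby
# Sample lines in the format shown in the question.
lines = ["1000000 name\n", "Status :A\n", "Basetype :2\n"]

# The empty pattern matches every line at offset 0, so it is always truthy.
empty_matches = lines.map { |line| line =~ // }

# The anchored pattern matches only lines that start with digits.
# String#[] with a capture index returns the captured text (what $1
# would hold after =~), or nil when the line does not match.
ids = lines.map { |line| line[/\A\s*(\d+)/, 1] }

p empty_matches
p ids
```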

> wline[0]= line.scan(/\d+/)
> elsif line =~ /Status/
> wline[1]= line.split(":")[1].scan(/[a-zA-Z]+/).join("")

The above two lines can be simplified to:

  elsif line =~ /\A\s*Status\s*:\s*([a-zA-Z]+)/
    wline[1] = $1

The other assignments could be handled in a similar way.
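
One way the "similar" handling could look, as a sketch only: a single pattern that captures both the field name and its value, so each field does not need its own elsif. The helper name `parse_field` is made up for this illustration, and it assumes every field line has the form `Field :value` with no whitespace inside the value.

```ruby
# Hypothetical helper: returns [field, value] for a "Field :value" line,
# or nil when the line does not look like a field line.
def parse_field(line)
  match = line.match(/\A\s*([a-zA-Z]+)\s*:\s*(\S+)/)
  match && [match[1], match[2]]
end

p parse_field("Status :A\n")
p parse_field("Name :/file/name/etc\n")
p parse_field("1000000 name\n")   # an id line, not a field line
```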

> elsif line =~ /Basetype/
> wline[2]= line.split(":")[1].scan(/\d+/).join("")
> |
> |
> |
> wline all fields
> |
> writefile(out, wline)
> end
> out.close
> end
>
> readfile('filename', 'outputfile')
>
>
> this is what ive done so far, can someone tell me whats wrong and i dont
> get any output at all..

It's not real easy for me to tell why you don't see output. It looks
like outputs might only happen in that last elsif. If that's the
case, you won't see output unless the code makes it there. I'm
guessing it's not. Maybe because of the line =~ // condition, which
is problematic.

I believe the code below does something like what you want. I hope it
can be adapted to your needs.

James Edward Gray II

#!/usr/bin/env ruby -wKU

fields         = ["id"]
fields_written = false
entry          = { }

DATA.each do |line|
  case line
  when /\A\s*(\d+)/
    unless entry.empty?
      unless fields_written
        puts fields.join(", ")
        fields_written = true
      end
      puts fields.map { |f| entry[f] }.join(", ")
      entry.clear
    end
    entry["id"] = $1
  when /\A\s*([a-zA-Z]+)\s*:\s*(\S+)/
    fields << $1 unless fields.include? $1
    entry[$1] = $2
  end
end

__END__
1000000 name
Status :A
Basetype :2
Version :1.0
Name :/file/name/etc
1000001 name
Status :B
Basetype :2
Version :a20
Name :/file/name/etc
1000002 name
Status :C
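
Two things to keep in mind when adapting the script above. As written, it prints an entry only when the next id line arrives, so the final entry (1000002 in the sample data) is never printed. It also reads from DATA rather than from a file. The following is a sketch of one possible adaptation, not part of the original reply. It streams the input with File.foreach, so the 200 MB file is never held in memory at once, and it flushes the last entry after the loop. The method name `convert_log` and the fixed `FIELDS` list are assumptions for illustration; the real log has more fields than the excerpt shows.

```ruby
# Assumed column order; extend this with the fields the log really contains.
FIELDS = %w[id Status Basetype Version Name].freeze

# Hypothetical helper: converts the multi-line log format into one
# comma-separated row per entry, reading the input one line at a time.
def convert_log(input_path, output_path, fields = FIELDS)
  File.open(output_path, "w") do |out|
    out.puts fields.join(", ")          # header row
    entry = {}
    # Writes the current entry as one row; does nothing if it is empty.
    write_entry = lambda do
      out.puts fields.map { |f| entry[f] }.join(", ") unless entry.empty?
    end

    File.foreach(input_path) do |line|
      case line
      when /\A\s*(\d+)/                      # an id line starts a new entry
        id = $1
        write_entry.call                     # emit the previous entry first
        entry = { "id" => id }
      when /\A\s*([a-zA-Z]+)\s*:\s*(\S+)/    # a "Field :value" line
        entry[$1] = $2
      end
    end
    write_entry.call                         # flush the final entry
  end
end
```

It would be called along the lines of `convert_log('filename', 'outputfile')`. Fields missing from an entry come out as empty columns, and a field name that is not in the list is silently dropped.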