Steven Hansen
9/15/2006 2:47:00 PM
I suck at regex too, but I tried this as an exercise and came up with
the code below. It's less concise than the previous solutions, but it
works as far as I can tell:
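Row = Struct.new(:col1, :col2, :col3, :col4)
rows = []
# flag, [number], classification (letters/digits), free-text description
regex = /([A-Z])\s(\[[0-9]+\])\s([A-Z0-9]+)\s(.+)/
File.open("file.txt") do |file|
  while (line = file.gets)
    m = line.match(regex)
    next unless m  # skip blank or non-matching lines instead of failing on nil
    rows << Row.new(m[1], m[2], m[3], m[4])
  end
end
puts rows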
#output =>
#<struct Row col1="R", col2="[01]", col3="R1", col4="The system shall support \"emergency call processing\"">
#<struct Row col1="R", col2="[02]", col3="R1", col4="The system shall support \"local call processing\"">
#<struct Row col1="R", col2="[08]", col3="F", col4="The system shall provide a command-line user interface">
#<struct Row col1="R", col2="[723]", col3="F", col4="The system shall provide 6 10/100/1000 Ethernet interfaces">
#<struct Row col1="R", col2="[11]", col3="F", col4="The system shall support VoIP networks">
#<struct Row col1="R", col2="[398]", col3="R1", col4="The system shall contain 2 control boards">
#<struct Row col1="O", col2="[327]", col3="I", col4="The system should support hotswapping of all internal boards">
#<struct Row col1="R", col2="[19]", col3="I", col4="The system shall be able to detect transmission errors">
#<struct Row col1="R", col2="[631]", col3="F", col4="The system shall continue processing data as long as a call is active.">
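
For what it's worth, here is a minimal sketch of a regex-free
alternative, assuming the first three columns never contain whitespace.
String#split with a limit of 4 stops splitting after the third gap, so
the description keeps its internal spaces. The Requirement struct, its
field names, and the parse_row helper are names I made up for
illustration:

```ruby
# Hypothetical names for illustration; not part of the script above.
Requirement = Struct.new(:kind, :number, :classification, :text)

# Split on runs of whitespace into at most 4 pieces, so the free-text
# description (the 4th piece) is left intact.
def parse_row(line)
  fields = line.strip.split(/\s+/, 4)
  return nil unless fields.size == 4  # blank or short lines yield nil
  Requirement.new(*fields)
end

row = parse_row('R [723] F The system shall provide 6 10/100/1000 Ethernet interfaces')
puts row.number  # prints [723]
puts row.text    # prints The system shall provide 6 10/100/1000 Ethernet interfaces
```

Unlike the regex, this does not check what each column looks like, so
any line with at least four whitespace-separated words is accepted.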
-Steven
James Calivar wrote:
> Hello,
>
> I'm trying to split a formatted text file into four separate columns.
> The data is comprised of lines of text that are bundled into four
> distinct columns, corresponding to a "Required versus Optional"
> variable, a requirement number, a requirement classification (R1=Rev 1,
> F=Future, I=Internal), and a textual description of the requirement.
>
> My raw data looks like this in the input text file:
>
> R [01] R1 The system shall support "emergency call processing"
> R [02] R1 The system shall support "local call processing"
> R [08] F The system shall provide a command-line user interface
> R [723] F The system shall provide 6 10/100/1000 Ethernet interfaces
> R [11] F The system shall support VoIP networks
> R [398] R1 The system shall contain 2 control boards
> O [327] I The system should support hotswapping of all internal boards
> R [19] I The system shall be able to detect transmission errors
> R [631] F The system shall continue processing data as long as a call is active.
>
> I've set up a loop to process each line in the input file, and what I'd
> like to get is four separate variables containing on a line-by-line
> basis the data corresponding to the four distinct columns. The problem
> is my regexp experience is next to nothing, and I can't figure out how
> to extract the data I want since my fourth column contains whitespace
> (I'd have used that as my column separator otherwise).
>
> Here's my loop:
>
> File.open(textfile, "r") do |input_file|
>   while line = input_file.gets
>     output_file << line
>   end
> end
>
> What can I replace the simple copy statement (output_file << line) with
> in order to get what I want?
>
> Thanks in advance, I hope this question makes some sense.
>
> James
>
>