Asp Forum - Re: awk regexp search

Gregory Seidman

7/1/2007 1:21:00 PM

On Sun, Jul 01, 2007 at 10:13:16PM +0900, baptiste Auguié wrote:
> Hi,
>
> The last bit of a bash program is still resisting me. Here is the
> code I used before:
>
> awk '/scattering efficiency/{print $4}' ../OUTPUTFILES/Output.dat
>
> How would you do that in Ruby? I just need to locate this regexp in
> the file, and get the following value in the same line. I've tried
> something like,
>
> output_file=IO.readlines('../OUTPUTFILES/Output.dat').to_s
> myarray = output_file.each_line(/scattering efficiency/){|elt| print
> elt.to_s, '\t'}
>
> but it clearly doesn't work.

First off, don't be too quick to abandon awk. I took a dozen lines of Ruby
someone wrote that did almost what I wanted and ported/replaced it with
three lines of sh/awk. For what awk can do, it is excellent.

To set up the equivalent of the awk code above, you want something like
this (note: untested):

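ARGF.each { |line|
  case line
  when /scattering efficiency/
    # awk's $4 is index 3 here: awk fields are 1-based, Ruby arrays 0-based.
    # A bare split skips leading whitespace, as awk's default splitting does.
    puts line.split[3]
  end
}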

Note that this basic framework does not support awk's /regex1/,/regex2/
notation that captures lines between (and including) the lines matching
those regular expressions.
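
That said, Ruby's flip-flop operator gets close to the range form. Here is a minimal sketch, assuming the lines are already in an array and Ruby 2.4 or later (for match?). The helper name lines_in_range and the start/stop patterns are made up for illustration:

```ruby
# Hypothetical helper: returns the lines from each line matching start_re
# through the next line matching stop_re, inclusive, like awk's
# /regex1/,/regex2/ pattern.
def lines_in_range(lines, start_re, stop_re)
  lines.select do |line|
    # A two-dot range used as a condition is a flip-flop: it turns on when
    # the left side is true and off after the right side is true. With two
    # dots the right side is also checked on the line that turned it on,
    # so a line matching both patterns forms a one-line range, as in awk.
    true if (start_re.match?(line))..(stop_re.match?(line))
  end
end

sample = ["skip", "start1 a", "b", "stop1 c", "skip again"]
puts lines_in_range(sample, /start1/, /stop1/)
```

With the sample above this should print the three lines from "start1 a" through "stop1 c".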

> Best regards,
> baptiste
--Greg

3 Answers

Przemyslaw Daniel

7/12/2007 10:41:00 AM

> Note that this basic framework does not support awk's /regex1/,/regex2/
> notation that captures lines between (and including) the lines matching
> those regular expressions.
>
>> Best regards,
>> baptiste
> --Greg

How about something like this simple implementation?

# AWK in Ruby

module AWK
  class ClassAwk
    def initialize(filename = "")
      @NR = 0;        # record number
      @NF = 0;        # number of fields
      @FS = /\s+/;    # field separator
      @line = "";     # line matched
      @fields = [];   # fields of matched line
      @trace = [];    # regexp trace

      if (filename == "")
        @file = ARGF;
      else
        @file = File.open(filename, "r").close;;
      end
    end

    # NOTE: every rule has to be on a separate line
    # to get a unique rule id
    def rule(regexp1, regexp2 = nil)
      msg = "regexp parameter must be Regexp";
      raise ArgumentError, msg unless regexp1.kind_of?(Regexp);

      if regexp2 == nil
        if @line =~ regexp1
          @fields = @line.split(@FS)
          yield
        end
      else
        raise ArgumentError, msg unless regexp2.kind_of?(Regexp);
        rule_id = /.+:([0-9]+)/.match(caller.first.to_s)[1].to_i;

        @trace[rule_id] = true if @line =~ regexp1;
        if @trace[rule_id]
          @fields = @line.split(@FS)
          yield
        end
        @trace[rule_id] = false if @line =~ regexp2;
      end
    end

    def analyze()
      @NR = 0;
      ARGF.each { |@line|
        @line = @line.chop;
        @NR += 1;
        yield
      }
    end

    # get a particular field (0 is the whole line)
    def getField(index)
      output = "";

      if (index == 0)
        output = @line;
      else
        if index - 1 < @fields.length
          return @fields[index - 1];
        end
      end
    end

    # get NR (record number)
    def getNR
      return @NR;
    end

    # get number of fields
    def getNF
      return @fields.length;
    end
  end
end

And an example of how to use it:

require "awk.rb"

awk = AWK::ClassAwk.new();

awk.analyze() {
  awk.rule(/start1/, /stop1/) {
    print "1, NR:", awk.getNR(),", ";
    print "NF: ", awk.getNF(),", ", awk.getField(0), "\n";
  };
  awk.rule(/start2/, /stop2/) {
    print "2, NR:", awk.getNR(),", ";
    print "NF: ", awk.getNF(),", ", awk.getField(0), "\n";
  };
  awk.rule(/start1/) {
    print awk.getField(0);
  };
}
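
One detail to watch in the field handling: splitting on /\s+/ is not quite awk's default. If a line starts with whitespace, the regexp split yields an empty first field, whereas a bare String#split skips leading whitespace the way awk does. Here is a small sketch of the difference. awk_field is a hypothetical helper, not part of the class above, and the sample line is invented:

```ruby
# Hypothetical helper mimicking awk's $n: n == 0 is the whole record,
# n >= 1 is the n-th whitespace-separated field (nil if there is none).
def awk_field(line, n)
  return line.chomp if n == 0
  line.split[n - 1]
end

line = "  scattering efficiency = 0.42\n"  # invented sample record
p line.split(/\s+/)      # regexp split keeps an empty leading field
p line.split             # bare split drops it, as awk does
puts awk_field(line, 4)  # the equivalent of awk's $4
```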

--
Posted via http://www.ruby-....

Przemyslaw Daniel

7/12/2007 11:25:00 AM

A few mistakes.

Instead of
> @file = File.open(filename, "r").close;;
it should be
@file = File.open(filename, "r");
and
> ARGF.each { |@line|
should be replaced with
@file.each { |@line|

:-)
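
The first fix matters because IO#close returns nil, so the original line left @file set to nil. If the file should also be closed once reading is done, the block form of File.open takes care of that. Here is a rough sketch, separate from the class above. each_record and the temporary file are illustrative assumptions:

```ruby
require "tempfile"

# Hypothetical helper: yields each line of the file at path with the
# trailing newline removed.
def each_record(path)
  # The block form of File.open closes the file when the block exits,
  # even if an exception is raised inside it.
  File.open(path, "r") do |file|
    file.each_line { |line| yield line.chomp }
  end
end

# Usage with a throwaway file holding two invented records.
tmp = Tempfile.new("awk_demo")
tmp.write("start1 a\nstop1 b\n")
tmp.close
each_record(tmp.path) { |record| puts record }
tmp.unlink
```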


Przemyslaw Daniel

7/16/2007 3:21:00 PM

A small update to make it easier to use:

# AWK implementation

module AWK
  class ClassAwk
    def initialize(filename = "")
      @NR = 0;        # record number
      @NF = 0;        # number of fields
      @FS = /\s+/;    # field separator
      @f = [];        # fields of matched line, f[0] is the whole line
      @trace = [];    # regexp trace

      # input file
      @file = filename == "" ? ARGF : File.open(filename, "r");
    end

    # NOTE: every rule has to be on a separate line
    # to get a unique rule id
    def rule(regexp1, regexp2 = nil)
      msg = "regexp parameter must be Regexp";
      raise ArgumentError, msg unless regexp1.kind_of?(Regexp);

      if regexp2 == nil
        yield if @f[0] =~ regexp1;
      else
        raise ArgumentError, msg unless regexp2.kind_of?(Regexp);
        rule_id = /.+:([0-9]+)/.match(caller.first.to_s)[1].to_i;

        @trace[rule_id] = true if @f[0] =~ regexp1;
        yield if @trace[rule_id]
        @trace[rule_id] = false if @f[0] =~ regexp2;
      end
    end

    def analyze()
      @NR = 0;
      @file.each { |line|
        @NR += 1;
        @f = line.split(@FS)
        @NF = @f.length
        # chomp strips only a trailing newline; chop would also eat the
        # last character of a final line that has no newline
        @f.unshift(line.chomp);
        yield
      }
    end

    attr_reader :NR, :NF, :f;
  end
end

Example:

require "awk.rb"

awk = AWK::ClassAwk.new();

awk.analyze() {
  awk.rule(/start1/, /stop1/) {
    print "1, NR:#{awk.NR}, NF:#{awk.NF}, #{awk.f[0]}\n";
  };
  awk.rule(/start2/, /stop2/) {
    print "2, NR:#{awk.NR}, NF:#{awk.NF}, #{awk.f[0]}\n";
  };
  awk.rule(/start1/) {
    print "3, NR:#{awk.NR}, NF:#{awk.NF}, #{awk.f[0]}\n";
  };
}
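
On the "separate line" note in the code: the rule id is the line number of the call site, parsed out of caller.first, so two range rules written on one source line would share their start/stop state. On Ruby 2.0 or later, caller_locations gives the same number without the regexp. Here is a sketch under that assumption. call_site_line is a made-up name:

```ruby
# Hypothetical helper: line number of the code that called this method.
def call_site_line
  # caller_locations(1, 1) returns a one-element array describing the
  # immediate caller; lineno is its line number in the source file.
  caller_locations(1, 1).first.lineno
end

first  = call_site_line
second = call_site_line                    # next line, so a different id
pair   = [call_site_line, call_site_line]  # same line, so the same id

puts second - first
puts pair.uniq.length
```

Both puts calls should print 1: consecutive lines give consecutive ids, and two calls on one line collapse to a single id.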


comp.lang.ruby
