comp.lang.ruby

Help: Efficient regular expression

Divya Badrinath

7/10/2007 8:21:00 PM

string = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash"

I need to fetch 14051 and /bin/bash from the string.

Can someone help me write an efficient regular expression for that?

I am a beginner. I wrote:
string =~ /(\d+)\s+(\d+)\s+\d+\s+\d+:\d+\s+.*\s+\d+:\d+:\d+\s+(.*)\s/

I know this is not an efficient way of doing it.

Please help.

--
Posted via http://www.ruby-....

24 Answers

Divya Badrinath

7/10/2007 8:26:00 PM

Divya Badrinath wrote:
> string = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash"
>
> i need to fetch 14051 and /bin/bash from the string

I mean I need the 2nd column and the last column.
>
> can someone help me to write an efficient regular expression for that.
>
> i am a beginner, i wrote
> string =~ /(\d+)\s+(\d+)\s+\d+\s+\d+:\d+\s+.*\s+\d+:\d+:\d+\s+(.*)\s/
>
> i know this is not the efficient way of doing it.
>
> Please help.



James Gray

7/10/2007 8:32:00 PM

On Jul 10, 2007, at 3:25 PM, Divya Badrinath wrote:

> Divya Badrinath wrote:
>> string = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash"
>>
>> i need to fetch 14051 and /bin/bash from the string
>
> i mean i need the 2nd column and the last column.

cols = string.split
sec, last = cols.values_at(1, -1)

Hope that helps.

James Edward Gray II
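
A small, self-contained sketch of the split approach. The second sample line (a command that carries an argument) and the column count of 8 are assumptions about `ps -ef`-style output; neither is stated in the thread.

```ruby
string = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash"

# Plain split on whitespace: take the 2nd and the last field.
pid, cmd = string.split.values_at(1, -1)
puts pid  # 14051
puts cmd  # /bin/bash

# Assumption: the line has 8 columns and the command may carry arguments.
# A limit of 8 keeps everything after the 7th field together in one string.
with_args = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash -l"
pid2, cmd2 = with_args.split(' ', 8).values_at(1, -1)
puts cmd2  # /bin/bash -l
```

Without the limit, the last field of the second line would be just "-l", so the limit only matters when the command column can contain spaces.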

Florian Aßmann

7/10/2007 8:34:00 PM

Hi Divya, use

string[/\s(\d+)/, 1]

See String#[].

Regards
Florian

Florian Aßmann

7/10/2007 8:38:00 PM

Florian Aßmann wrote:
> Hi Divya, use
>
> string[/\s(\d+)/, 1]
>
> see String.[]
>
> Regards
> Florian
>
>
pid = string[/\s(\d+)/, 1]
cmd = string[/\s(\S+)$/, 1] # this line was missing from my previous post
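
For readers who have not used String#[] with a regular expression before, here is a minimal sketch of how the two forms behave on the sample line. The `xyz` pattern is only a made-up example of a non-matching regex.

```ruby
string = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash"

# str[regexp] returns the whole matched text (note the leading space here).
whole = string[/\s\d+/]
p whole    # " 14051"

# str[regexp, n] returns only the n-th capture group.
pid = string[/\s(\d+)/, 1]
cmd = string[/\s(\S+)$/, 1]
p pid      # "14051"
p cmd      # "/bin/bash"

# When nothing matches, the result is nil rather than an exception.
missing = string[/xyz(\d+)/, 1]
p missing  # nil
```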


Divya Badrinath

7/10/2007 8:39:00 PM

Florian Aßmann wrote:
> Hi Divya, use
>
> string[/\s(\d+)/, 1]
>
> see String.[]
>
> Regards
> Florian

string = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash"
With this,
string =~ /(\d+)\s+(\d+)\s+\d+\s+\d+:\d+\s+.*\s+\d+:\d+:\d+\s+(.*)\s/

$1 gives me 14051
and
$3 gives me /bin/bash

What I am trying to do is get $1 and $3 into a hash.
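
One possible way to get the two values into a hash, sketched under assumptions: the thread does not say what the hash should look like, so the key names `:pid` and `:cmd`, the PID-keyed layout, and the second `ps` line are all made up for illustration.

```ruby
string = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash"

# Hypothetical key names; pick whatever suits the calling code.
pid, cmd = string.split.values_at(1, -1)
process = { :pid => pid, :cmd => cmd }
puts process[:pid]  # 14051
puts process[:cmd]  # /bin/bash

# Alternative layout: one hash keyed by PID for several lines of output.
lines = [
  string,
  "root 14060 14051 0 08:40 pts/2 00:00:00 ps -ef"  # made-up second line
]
by_pid = {}
lines.each do |line|
  fields = line.split(' ', 8)  # assumption: 8 columns, command is the last
  by_pid[fields[1]] = fields[-1]
end
puts by_pid["14060"]  # ps -ef
```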


Kyle Schmitt

7/10/2007 8:45:00 PM

I love regex, so it hurts me to say it, but there are other ways of solving this ;)

for instance:

string = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash"
number = string.split[1]
program = string.split.last


now regexes!

string = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash"
number = string[/[0-9]+/]
program = string[/[a-z\/]+$/]

You know you can get values out of an array with the [] operator.
Well, you can get substrings out of strings the same way, and it works
with regexes!

string[/[0-9]+/] will return the first match of 1 or more digits.

Here's the magic: use [ ] inside a regular expression to create your
own character classes. Individual characters in there are included in
the class, and ranges may be included using the -, so a-z is
abcdefghijklmnopqrstuvwxyz.
The + afterwards means 1 or more times.
What if you want exactly 5 consecutive digits? Use the {}:
string[/[0-9]{5}/]
Ranges also work here, written with a comma:
string[/[0-9]{3,5}/] would match a run of 3, 4 or 5 digits

and
string[/[a-z\/]+$/] will match a run of lowercase letters and forward
slashes at the end of the line. The $ is a special char that represents
the end of a line, and since / is the delimiter of the regex literal,
it needed to be escaped with a \.

BUT it could be even easier.
The [] classes can be negated!
/[^a]*/ would match a run of characters that contains no a
/[^ ]*/ would match a run of characters that contains no space... so
string[/[^ ]+$/] would be a good way to get the last bit.
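
The patterns described above can be tried directly on the sample line. This is just a quick sketch; the variable names are made up, and the quantifier range is written with a comma, `{3,5}`.

```ruby
string = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash"

first_digits  = string[/[0-9]+/]      # first run of one or more digits
five_digits   = string[/[0-9]{5}/]    # exactly five consecutive digits
three_to_five = string[/[0-9]{3,5}/]  # a run of three to five digits
tail_class    = string[/[a-z\/]+$/]   # lowercase letters and slashes at the end
tail_negated  = string[/[^ ]+$/]      # everything after the last space

p first_digits   # "14051"
p five_digits    # "14051"
p three_to_five  # "14051"
p tail_class     # "/bin/bash"
p tail_negated   # "/bin/bash"
```

Note that the first three rely on the user name containing no digits; a login such as "user1" would be matched first.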

Florian Aßmann

7/10/2007 8:48:00 PM

Divya Badrinath wrote:
> string = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash"
>
> i need to fetch 14051 and /bin/bash from the string
>
> can someone help me to write an efficient regular expression for that.
>
> i am a beginner, i wrote
> string =~ /(\d+)\s+(\d+)\s+\d+\s+\d+:\d+\s+.*\s+\d+:\d+:\d+\s+(.*)\s/
>
> i know this is not the efficient way of doing it.
>
> Please help.
>
talking about efficient, I was just curious...

#!/usr/bin/env ruby -w
#
# Created by Florian Aßmann on 2007-07-10.
# Copyright (c) 2007. All rights reserved.

string = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash"
require 'profiler'

puts <<-'EOS' # quoted tag so the backslashes print as written

pid = string[/\s(\d+)/, 1]
cmd = string[/\s(\S+)$/, 1]

EOS
Profiler__::start_profile

10000.times do
  pid = string[/\s(\d+)/, 1]
  cmd = string[/\s(\S+)$/, 1]
end

Profiler__::stop_profile
Profiler__::print_profile STDOUT

puts <<-EOS

cols = string.split
sec, last = cols.values_at(1, -1)

EOS
Profiler__::start_profile

10000.times do
  cols = string.split
  sec, last = cols.values_at(1, -1)
end

Profiler__::stop_profile
Profiler__::print_profile STDOUT

puts <<-EOS

number = string.split[1]
program = string.split.last

EOS
Profiler__::start_profile

10000.times do
  number = string.split[1]
  program = string.split.last
end

Profiler__::stop_profile
Profiler__::print_profile STDOUT

*grin*

Florian
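
For anyone who wants to repeat the comparison where `require 'profiler'` is not available (it may be missing on current Ruby installs), here is a rough sketch using the Benchmark module instead. It is a swapped-in technique, not what was run in the post above: the labels and the `variants` and `results` names are made up, and it measures wall-clock and CPU time rather than per-method call counts.

```ruby
require 'benchmark'

string = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash"
n = 10_000  # same iteration count as the profiling run above

# Each lambda returns [pid, cmd] so the variants can be checked against each other.
variants = {
  "two regexes"     => lambda { [string[/\s(\d+)/, 1], string[/\s(\S+)$/, 1]] },
  "split.values_at" => lambda { string.split.values_at(1, -1) },
  "split twice"     => lambda { [string.split[1], string.split.last] }
}

results = {}
Benchmark.bm(16) do |bm|
  variants.each do |label, code|
    bm.report(label) { n.times { results[label] = code.call } }
  end
end

# All three approaches should agree on the extracted values.
p results.values.uniq
```

Timings depend heavily on the Ruby version and the machine, so treat any ranking as specific to your own setup.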


Kyle Schmitt

7/10/2007 9:11:00 PM

Ooooh, fun.
So, are you going to announce the winner? ;)

Robert Dober

7/10/2007 9:21:00 PM

On 7/10/07, James Edward Gray II <james@grayproductions.net> wrote:
> On Jul 10, 2007, at 3:25 PM, Divya Badrinath wrote:
>
> > Divya Badrinath wrote:
> >> string = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash"
> >>
> >> i need to fetch 14051 and /bin/bash from the string
> >
> > i mean i need the 2nd column and the last column.
>
> cols = string.split
> sec, last = cols.values_at(1, -1)
Very interesting, James. I seem to be rather extreme and would write

sec, last = string.split.values_at(1, -1)
which might be a tad too long for one line in your style; however, Ruby
supports this marvelous syntax :)

sec, last = string.split.
values_at(1, -1)

Robert
>
> Hope that helps.
>
> James Edward Gray II
>
>


--
I always knew that one day Smalltalk would replace Java.
I just didn't know it would be called Ruby
-- Kent Beck

Florian Aßmann

7/10/2007 9:21:00 PM

OK, it was hard to beat Edward, but at least building the simplest
regular expression to do something like String#split seems to be faster:

#!/usr/bin/env ruby -w
#
# Created by Florian Aßmann on 2007-07-10.
# Copyright (c) 2007. All rights reserved.

string = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash"
require 'profiler'

puts <<-'EOS' # quoted tag so the backslashes print as written

pid_rx = /\s(\d+)/
cmd_rx = /\s(\S+)$/
pid, cmd = string[pid_rx, 1], string[cmd_rx, 1]

EOS
Profiler__::start_profile

pid_rx = /\s(\d+)/
cmd_rx = /\s(\S+)$/
100000.times do
  pid, cmd = string[pid_rx, 1], string[cmd_rx, 1]
end

Profiler__::stop_profile
Profiler__::print_profile STDOUT

puts <<-EOS

pid, cmd = string.split.values_at(1, -1)

EOS
Profiler__::start_profile

100000.times do
  pid, cmd = string.split.values_at(1, -1)
end

Profiler__::stop_profile
Profiler__::print_profile STDOUT

puts <<-'EOS' # quoted tag so the backslashes print as written

rx = Regexp.new('\S+\s(\d+).*\s(\S+$)')
pid, cmd = rx.match(string).values_at( 1, -1 )

EOS
Profiler__::start_profile

rx = Regexp.new('\S+\s(\d+).*\s(\S+$)')
100000.times do
  pid, cmd = rx.match(string).values_at( 1, -1 )
end

Profiler__::stop_profile
Profiler__::print_profile STDOUT

puts <<-'EOS' # quoted tag so the backslashes print as written

rx = Regexp.new('(\S+)')
pid, cmd = rx.match(string).values_at( 1, -1 )

EOS
Profiler__::start_profile

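rx = Regexp.new('(\S+)')
# Note: match stops at the first token, so values_at(1, -1) yields
# ["root", "root"] here rather than the pid and the command.
100000.times do
  pid, cmd = rx.match(string).values_at( 1, -1 )
end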

Profiler__::stop_profile
Profiler__::print_profile STDOUT


Sincerely
Florian
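
A quick check of the last variant, as a sketch: `match` returns only the first hit, so `/(\S+)/` on its own does not behave like String#split. Using `scan` in its place is a substitution made here for illustration, not something that was timed above.

```ruby
string = "root 14051 14033 3 08:39 pts/2 00:00:00 /bin/bash"
rx = /(\S+)/

# match stops at the first hit, so group 1 and the last group are both "root".
from_match = rx.match(string).values_at(1, -1)
p from_match  # ["root", "root"]

# scan collects every token, which mirrors what split returns for this line.
tokens = string.scan(/\S+/)
pid, cmd = tokens.values_at(1, -1)
p [pid, cmd]  # ["14051", "/bin/bash"]
```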