comp.lang.ruby

Small regexp question

francisrammeloo@hotmail.com

3/10/2006 10:58:00 AM

Hi all,

I am writing some refactoring code for a C++ project.

I need to change:

class MyClass
{
...
}

to:

class IMP_EXP MyClass
{
...
}

The pattern I used to find a class definition line is:

line =~ /^\s*class\s+(\w+)/

But I want to exclude forward class declarations ( class MyClass; )

So I changed my pattern to:

line =~ /^\s*class\s+(\w+)\s*[^;]/   --> don't match if line ends with ";"

But it doesn't work... Why?

I ended up using: if line !~ /;/ and line =~ /^\s*class\s+(\w+)/

Hints?

Any help will be appreciated,
Best regards,

Francis

9 Answers

William James

3/10/2006 11:09:00 AM


/^\s*class\s+(\w+)\s*$/

Antonin AMAND

3/10/2006 11:15:00 AM

francisrammeloo@hotmail.com wrote:
> So I changed my pattern to:
>
> line =~ /^\s*class\s+(\w+)\s*[^;]/ --> don't match if line ends
> with ";"
>
> But it doesn't work... Why?

puts "ok" if "class MyClass;".match(/^\s*class\s+(\w+)\s*[^;]/)

=> "ok"

it works for me.

The problem may be somewhere else.

Antonin.



Xavier Noria

3/10/2006 11:35:00 AM

On Mar 10, 2006, at 11:58, francisrammeloo@hotmail.com wrote:

> The pattern I used to find a class definition line is:
>
> line =~ /^\s*class\s+(\w+)/
>
> But I want to exclude forward class declarations ( class MyClass; )
>
> So I changed my pattern to:
>
> line =~ /^\s*class\s+(\w+)\s*[^;]/ --> don't match if line ends
> with ";"
>
> But it doesn't work... Why?

I don't know exactly in what sense it does not work, but negations in
regexps are tricky.

A regexp engine *always* tries to match. If in a first attempt \w+
matches the whole class name and then the rest does not match, then
the regexp engine backtracks and happens to find a "shorter class
name" whose remaining characters are not semicolons, so it still
matches.

class Foo;    (\w+ -> "Foo", fails, backtrack)
         ^
class Foo;    (\w+ -> "Fo", no whitespace, "o" is not a semicolon, matched)
        ^

A solution is to add an anchor for end of string. Another one is to
prevent \w+ from backtracking, that is known as "atomic grouping":

(?>\w+) # grab word characters and do not backtrack

In addition, the idiomatic way to say "and at this point I don't want
this to happen" is to use a negative look-ahead assertion. All in all
we get this:

/^\s*class\s+(?>\w+)(?!\s*;)/

-- fxn
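
A minimal Ruby sketch of the backtracking behaviour described above, comparing the original pattern with the look-ahead and atomic-group variants. The constant names and the class_name helper are hypothetical. One assumption to note: a capture group is wrapped around the atomic group here, since the pattern in the post does not capture the class name.

```ruby
# Original pattern: \w+ may give back characters when [^;] fails.
NAIVE = /^\s*class\s+(\w+)\s*[^;]/

# Negative look-ahead without the atomic group: \w+ can still backtrack.
LOOKAHEAD_ONLY = /^\s*class\s+(\w+)(?!\s*;)/

# Atomic group plus negative look-ahead, with an added capture group
# (assumption: the capture is not part of the pattern in the post).
ATOMIC = /^\s*class\s+((?>\w+))(?!\s*;)/

# Hypothetical helper: returns the captured class name, or nil on no match.
def class_name(line, pattern)
  match = pattern.match(line)
  match && match[1]
end

p class_name("class Foo;", NAIVE)           # "Fo"  (backtracked by one character)
p class_name("class Foo;", LOOKAHEAD_ONLY)  # "Fo"  (same problem)
p class_name("class Foo;", ATOMIC)          # nil   (forward declaration rejected)
p class_name("class Foo ;", ATOMIC)         # nil
p class_name("class Foo {", ATOMIC)         # "Foo"
```

The middle case suggests why the look-ahead alone is not enough: without the atomic group the engine shortens the class name until the look-ahead is satisfied.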



Robert Klemme

3/10/2006 11:38:00 AM

francisrammeloo@hotmail.com wrote:
> So I changed my pattern to:
>
> line =~ /^\s*class\s+(\w+)\s*[^;]/ --> don't match if line ends
> with ";"
>
> But it doesn't work... Why?

Because the match simply stops before the ";".

>> line = 'class Foo;'
=> "class Foo;"
>> line[/^\s*class\s+(\w+)\s*[^;]/]
=> "class Foo"

If you want to make sure there is no ";" between the class name and the
end of the line you need to anchor the RX at the end:

>> line = 'class Foo;'
=> "class Foo;"
>> line[/^\s*class\s+(\w+)[^;]*$/]
=> nil
>> line = 'class Foo'
=> "class Foo"
>> line[/^\s*class\s+(\w+)[^;]*$/]
=> "class Foo"

Kind regards

robert

benjohn

3/10/2006 11:42:00 AM

>> I need to change:
>>
>> class MyClass
>> {
>> ...
>> }
>>
>> to:
>>
>> class IMP_EXP MyClass
>> {
>> ...
>> }

Don't you want to be looking _for_

class xxxx {

Where you can have at least one white space between class and xxxx, and
any amount of white space between xxxx and { (I think none is allowable
too)? Any white space includes new lines too, as the following are all
valid class declarations:

class
AClass
{

class AClass {

class
AClass {

and I think even...
class
AClass{

? :) Or do you want to get the job done, rather than getting a perfect
solution? :)

I was playing with regexp yesterday, and wanted to have a pattern match
over multiple lines, but couldn't see how that is done (a friend wanted
a simple way of stripping out C comments, and they can span multiple
lines, of course). Could someone give me a hint on that?

Cheers,
Benjohn
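
A rough Ruby sketch of the multi-line matching asked about above, assuming the whole file is read into one string instead of being processed line by line. In Ruby, \s already matches newlines, and the /m flag makes . match newlines as well. The names CLASS_OPEN and strip_c_comments are hypothetical, and the comment stripper is deliberately naive: it ignores // comments and would mangle a "/*" that appears inside a string literal.

```ruby
# \s matches newlines, so this finds "class Name {" even when the pieces
# are spread over several lines. It does not handle base-class lists
# such as "class Foo : public Bar {".
CLASS_OPEN = /\bclass\s+(\w+)\s*\{/

# Hypothetical helper: removes /* ... */ comments, including multi-line ones.
# The /m flag lets "." match newlines; ".*?" stops at the first "*/".
def strip_c_comments(source)
  source.gsub(%r{/\*.*?\*/}m, "")
end

source = "class\nAClass{\n};\nclass Other;\nclass  B\n{\n};\n"
p source.scan(CLASS_OPEN).flatten                        # ["AClass", "B"]
p strip_c_comments("int a; /* one\n two */ int b;")      # "int a;  int b;"
```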



Ross Bamford

3/10/2006 11:50:00 AM

On Fri, 2006-03-10 at 19:58 +0900, francisrammeloo@hotmail.com wrote:
> So I changed my pattern to:
>
> line =~ /^\s*class\s+(\w+)\s*[^;]/ --> don't match if line ends
> with ";"
>
> But it doesn't work... Why?

Your regexp is trying to match:

+ zero or more spaces
+ the word 'class'
+ one or more spaces
+ one or more word characters (captured)
+ zero or more spaces
+ any single character except ';'

By the time you get to that ';' there likely won't be any input left, so
no character to be something except ';'. You could do it with lookahead,
but it's probably easier to do:

"class MyClass;" =~ /class\s+(\w+)[^;]*$/
# => nil

"class MyClass" =~ /class\s+(\w+)[^;]*$/
# => 0

"class MyClass {" =~ /class\s+(\w+)[^;]*$/
# => 0

"class MyClass { /* etc */ }" =~ /class\s+(\w+)[^;]*$/
# => 0

There are probably still things this will miss though. For example,
strange class names could well result in a failure to match...

--
Ross Bamford - rosco@roscopeco.REMOVE.co.uk
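
As a sketch only, the end-anchored idea above could be applied to the rewrite the original post asks for (inserting IMP_EXP after "class"). The helper name add_export_macro and the constant CLASS_DEF are hypothetical, and the guard against lines that already carry IMP_EXP is an added assumption, not something discussed in the thread.

```ruby
# Matches "class Name" at the start of a line, but only when no ";" follows
# anywhere up to the end of the line. The (?!IMP_EXP\b) guard (an assumption)
# skips lines that were already rewritten.
CLASS_DEF = /^(\s*class\s+)(?!IMP_EXP\b)(\w+)(?=[^;]*$)/

# Hypothetical helper: returns the line with IMP_EXP inserted, or the line
# unchanged if it is not a class definition line.
def add_export_macro(line)
  line.sub(CLASS_DEF) { "#{$1}IMP_EXP #{$2}" }
end

puts add_export_macro("class MyClass")          # class IMP_EXP MyClass
puts add_export_macro("  class MyClass {")      #   class IMP_EXP MyClass {
puts add_export_macro("class MyClass;")         # class MyClass;
puts add_export_macro("class IMP_EXP MyClass")  # class IMP_EXP MyClass
```

Like the patterns above, this skips any line that contains a ";" after the class name, so a one-line definition such as "class MyClass { int x; };" would be left untouched.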



Pistos Christou

3/10/2006 3:09:00 PM

francisrammeloo@hotmail.com wrote:
> But I want to exclude forward class declarations ( class MyClass; )
>
> So I changed my pattern to:
>
> line =~ /^\s*class\s+(\w+)\s*[^;]/ --> don't match if line ends
> with ";"
>
> But it doesn't work... Why?
>
> I ended up using: if line !~ /;/ and line =~ /^\s*class\s+(\w+)/

How does it not match? Can you show several lines of an irb session
demonstrating matches and non-matches?

Pistos


francisrammeloo@hotmail.com

3/10/2006 4:01:00 PM

irb(main):001:0> line = "class MyClass;"
=> "class MyClass;"
irb(main):002:0> line =~ /^\s*class\s+(\w+)\s*[^;]/
=> 0
irb(main):003:0> puts $1
MyClas
=> nil

Logan Capaldo

3/10/2006 5:34:00 PM


On Mar 10, 2006, at 10:08 AM, Pistos Christou wrote:

>> /^\s*class\s+(\w+)\s*[^;]/

This would match:
"class A ;" for instance

\s* can match the empty string which is followed by a space which is
not a semi-colon, so hey it matches!
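
The case above can be checked with a short Ruby snippet; this is just an illustration, and the constant names are hypothetical. The end-anchored variant suggested earlier in the thread is included for comparison.

```ruby
ORIGINAL     = /^\s*class\s+(\w+)\s*[^;]/
END_ANCHORED = /^\s*class\s+(\w+)[^;]*$/

m = ORIGINAL.match("class A ;")
p m[0]                              # "class A "  (\s* matched nothing, [^;] took the space)
p m[1]                              # "A"
p END_ANCHORED.match("class A ;")   # nil
```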