Asp Forum - Problem with regular expression

Joan Miller

3/7/2010 10:32:00 AM

I would like to convert the first string to upper case. But this regular
expression is not matching the first string between quotes.

re.sub("'(?P<id>\w+)': [^{]", "\g<id>FOO", str)

# string to non-matching
'foo': {

# strings to matching
'bar': 'bar2'
'bar': None
'bar': 0
'bar': True

So, e.g., from the first string I would like to get:
'BAR': 'bar2'

Any help? please
Thanks in advance

4 Answers

News123

3/7/2010 12:52:00 PM

Hi Joan,

Joan Miller wrote:
> I would like to convert the first string to upper case. But this regular
> expression is not matching the first string between quotes.
>
> re.sub("'(?P<id>\w+)': [^{]", "\g<id>FOO", str)
>
> # string to non-matching
> 'foo': {
>
> # strings to matching
> 'bar': 'bar2'
> 'bar': None
> 'bar': 0
> 'bar': True
>
> So, e.g., from the first string I would like to get:
> 'BAR': 'bar2'
>
I'm a little slow today and don't exactly understand your question.

Could you perhaps add some examples of input lines and what you would
like to extract?

example:
input = "first word to Upper"
output = "FIRST word to Upper"

bye

N

>
> Any help? please
> Thanks in advance

Steve Holden

3/7/2010 1:03:00 PM

Joan Miller wrote:
> I would like to convert the first string to upper case. But this regular
> expression is not matching the first string between quotes.
>
> re.sub("'(?P<id>\w+)': [^{]", "\g<id>FOO", str)
>
> # string to non-matching
> 'foo': {
>
> # strings to matching
> 'bar': 'bar2'
> 'bar': None
> 'bar': 0
> 'bar': True
>
> So, e.g., from the first string I would like to get:
> 'BAR': 'bar2'
>
>
> Any help? please
> Thanks in advance

Well, your pattern is identifying the right bits, but re.sub() replaces
everything it matches:

>>> import re
>>> strings = """\
... 'bar': 'bar2'
... 'bar': None
... 'bar': 0
... 'bar': True""".split("\n")
>>> for s in strings:
...     print(re.sub("'(?P<id>\w+)': [^{]", "\g<id>FOO", s))
...
barFOObar2'
barFOOone
barFOO
barFOOrue
>>>

What you need to fix is the replacement. Take a look at the
documentation for re.sub: you will need to provide a function to apply
the upper-case transformation, and the example there should show you how.
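One way that advice might look in practice is sketched below. It is only an illustration, not necessarily what the documentation example does: the extra group named "rest" and the helper names upper_key and convert are assumptions made here so that the text the pattern consumes after the key can be put back unchanged.

```python
import re

# Assumption: a second named group, "rest", captures what the original
# pattern consumed after the key (colon, space, first character of the value).
PATTERN = re.compile(r"'(?P<id>\w+)'(?P<rest>: [^{])")

def upper_key(match):
    """Rebuild the matched text with only the key upper-cased."""
    return "'" + match.group("id").upper() + "'" + match.group("rest")

def convert(line):
    """Apply the function-based replacement to one line."""
    return PATTERN.sub(upper_key, line)

lines = ["'foo': {", "'bar': 'bar2'", "'bar': None", "'bar': 0", "'bar': True"]
for line in lines:
    print(convert(line))
```

Because re.sub() calls the function once per match and uses its return value as the replacement, nothing outside the key changes, and the 'foo': { line is left alone because the pattern never matches it.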

regards
Steve

Tim Chase

3/7/2010 1:36:00 PM

Joan Miller wrote:
> I would like to convert the first string to upper case. But this regular
> expression is not matching the first string between quotes.
>
> re.sub("'(?P<id>\w+)': [^{]", "\g<id>FOO", str)

Well, my first thought is that you're not using raw strings, so
you're not using the regexps and replacements you think you are.

r"'(?P<id>\w+)': [^{]"

will match the lines of interest. The replacement will eat the
opening & closing single-quote, colon, and first character.
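A small sketch of why raw strings matter for regexps in general. In the pattern above, \w and \g are not escapes that Python's string parser recognises, so they happen to pass through unchanged (newer Python versions warn about them), but an escape such as \b does not. The variable names here are made up for the illustration.

```python
import re

plain = "\bbar"   # "\b" is a single backspace character in a normal string
raw = r"\bbar"    # backslash + "b": the regex word-boundary assertion

# The two strings are not even the same length.
print(len(plain), len(raw))

# The non-raw pattern looks for a literal backspace and finds nothing.
print(re.search(plain, "'bar': 0"))

# The raw pattern matches "bar" at a word boundary.
print(re.search(raw, "'bar': 0").group(0))
```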

> # string to non-matching
> 'foo': {
>
> # strings to matching
> 'bar': 'bar2'
> 'bar': None
> 'bar': 0
> 'bar': True
>
> So, e.g., from the first string I would like to get:
> 'BAR': 'bar2'

I think you'd have to use a function/lambda to do the
case-conversion:

re.sub(
    r"'(?P<id>\w+)(?=': [^{])",
    lambda m: "'" + m.group('id').upper(),
    string_of_interest
)

Or you could just forgo regexps and use regular string functions
like split(), startswith(), and upper()
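A rough sketch of the no-regexp route, assuming each line starts with the quoted key and uses ": " as the separator. The helper name upper_key is made up here, and partition() is used in place of split() simply because it always returns three parts.

```python
def upper_key(line):
    """Upper-case a leading quoted key unless the value starts with '{'."""
    key, sep, value = line.partition(": ")
    # No separator at all, or a value that opens a brace: leave the line alone.
    if not sep or value.startswith("{"):
        return line
    # Only touch keys that are wrapped in single quotes.
    if not (key.startswith("'") and key.endswith("'")):
        return line
    return key.upper() + sep + value

for line in ["'foo': {", "'bar': 'bar2'", "'bar': None", "'bar': 0"]:
    print(upper_key(line))
```

Unlike the regexp, this version does not check that the key consists only of \w characters, so it is a looser match.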

-tkc

Paul McGuire

3/7/2010 2:11:00 PM

On Mar 7, 4:32 am, Joan Miller <pelok...@gmail.com> wrote:
> I would like to convert the first string to upper case. But this regular
> expression is not matching the first string between quotes.
>
Is using pyparsing overkill? Probably. But your time is valuable,
and pyparsing lets you knock this out in less time than it probably
took to write your original post.

Use pyparsing's pre-defined expression sglQuotedString to match your
entry key in quotes:

key = sglQuotedString

Add a parse action to convert to uppercase:

key.setParseAction(lambda tokens:tokens[0].upper())

Now define the rest of your entry value (be sure to add the negative
lookahead so we *don't* match your foo entry):

entry = key + ":" + ~Literal("{")

If I put your original test cases into a single string named 'data', I
can now use transformString to convert all of your keys that don't
point to '{'ed values:

print(entry.transformString(data))

Giving me:

# string to non-matching
'foo': {

# strings to matching
'BAR': 'bar2'
'BAR': None
'BAR': 0
'BAR': True

Here's the whole script:

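from pyparsing import sglQuotedString, Literal

# A few of the test cases from the original post
data = "'foo': {\n'bar': 'bar2'\n'bar': None\n"

key = sglQuotedString
key.setParseAction(lambda tokens: tokens[0].upper())
# The negative lookahead skips keys whose value starts with "{"
entry = key + ":" + ~Literal("{")

print(entry.transformString(data))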

And I'll bet that if you come back to this code in 3 months, you'll
still be able to figure out what it does!

-- Paul

comp.lang.python
