
comp.lang.python

split parameter line with quotes

teddyber

1/11/2008 6:51:00 PM

Hello,

First, I'm a newbie to Python (but I searched the Internet, I swear).
I'm looking for some way to split a string into a list of
'key=value' pairs. The code should be able to handle this particular
example string:

qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess

I know I can do that with a regexp (I'm currently trying to learn
those), but if there's some other way...

thanks
11 Answers

Nanjundi

1/11/2008 7:28:00 PM

On Jan 11, 1:50 pm, teddyber <teddy...@gmail.com> wrote:
> Hello,
>
> first i'm a newbie to python (but i searched the Internet i swear).
> i'm looking for some way to split up a string into a list of pairs
> 'key=value'. This code should be able to handle this particular
> example string :
>
> qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,
> 3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess
>
> i know i can do that with some regexp (i'm currently trying to learn
> that) but if there's some other way...
>
> thanks

This is unconventional, and using eval is not safe either.
>>> s = 'qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",maxbuf=1024,charset="utf-8",algorithm="md5-sess"'
>>> d = eval(' dict(%s)' % s)
>>> d.items()
[('algorithm', 'md5-sess'), ('maxbuf', 1024), ('charset', 'utf-8'),
('cipher', 'rc4-40,rc4-56,rc4,des,3des'), ('qop', 'auth,auth-int,auth-conf')]
>>> for k,v in d.iteritems(): print k, '=', v
...
algorithm = md5-sess
maxbuf = 1024
charset = utf-8
cipher = rc4-40,rc4-56,rc4,des,3des
qop = auth,auth-int,auth-conf

For safe eval, take a look at http://aspn.activestate.com/ASPN/Cookbook/Python/Rec...

-N
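
As a rough illustration of the "safe eval" idea, here is a minimal Python 3 sketch that parses the same dict(...) text with the standard ast module instead of executing it. The helper name parse_quoted_params is made up for this example, and it shares the limitation of the eval approach: every non-numeric value has to be quoted.

```python
import ast

def parse_quoted_params(s):
    """Parse 'k1="v1",k2=2' without calling eval (hypothetical helper)."""
    # Parse the text as the expression dict(k1="v1", k2=2); nothing is executed.
    call = ast.parse('dict(%s)' % s, mode='eval').body
    if not isinstance(call, ast.Call) or call.args:
        raise ValueError('expected only key=value pairs')
    result = {}
    for kw in call.keywords:
        if kw.arg is None:  # reject **mapping arguments
            raise ValueError('expected only key=value pairs')
        # literal_eval accepts only literals (strings, numbers, ...),
        # so names and function calls are rejected with ValueError.
        result[kw.arg] = ast.literal_eval(kw.value)
    return result

print(parse_quoted_params('qop="auth,auth-int,auth-conf",maxbuf=1024,charset="utf-8"'))
```

An unquoted value such as charset=utf-8 is read as the expression utf - 8 and raises ValueError, so this does not handle the mixed quoted/unquoted input from the original question.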

Joshua J. Kugler

1/11/2008 7:37:00 PM

teddyber wrote:
> first i'm a newbie to python (but i searched the Internet i swear).
> i'm looking for some way to split up a string into a list of pairs
> 'key=value'. This code should be able to handle this particular
> example string :
>
> qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,
> 3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess
>
> i know i can do that with some regexp (i'm currently trying to learn
> that) but if there's some other way...

Take a look at the shlex module. You might be able to fiddle with the shlex
object and convince it to split on the commas. But, to be honest, the
string above would be a lot easier to parse if the dividing commas were
spaces instead.

j
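
One possible way to "convince" shlex to split on commas, sketched for Python 3: treat the comma as the only whitespace character and turn on whitespace_split, so that everything else (including "=" and "-") stays inside a token. The function name split_params is hypothetical, and the sketch assumes that keys never contain "=" and that values use double quotes only.

```python
import shlex

def split_params(line):
    """Split 'k1="a,b",k2=c' into a dict using shlex (hypothetical helper)."""
    lexer = shlex.shlex(line, posix=True)  # posix mode strips the quotes
    lexer.whitespace = ','                 # commas are the only separators
    lexer.whitespace_split = True          # split on separators only
    lexer.commenters = ''                  # do not treat '#' as a comment
    # Each token looks like key=value; split on the first '=' only.
    return dict(token.split('=', 1) for token in lexer)

line = ('qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",'
        'maxbuf=1024,charset=utf-8,algorithm=md5-sess')
for key, value in split_params(line).items():
    print(key, '=', value)
```

All values come back as strings, so maxbuf would still need an explicit int() conversion.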


teddyber

1/11/2008 7:47:00 PM

On 11 jan, 20:28, Nanjundi <nanju...@gmail.com> wrote:
> On Jan 11, 1:50 pm, teddyber <teddy...@gmail.com> wrote:
>
> > Hello,
>
> > first i'm a newbie to python (but i searched the Internet i swear).
> > i'm looking for some way to split up a string into a list of pairs
> > 'key=value'. This code should be able to handle this particular
> > example string :
>
> > qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,
> > 3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess
>
> > i know i can do that with some regexp (i'm currently trying to learn
> > that) but if there's some other way...
>
> > thanks
>
> This is unconventional and using eval is not SAFE too.
> >>> s = 'qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",maxbuf=1024,charset="utf-8",algorithm="md5-sess"'
> >>> d = eval(' dict(%s)' % s)
> >>> d.items()

Thanks for that. The problem is that I don't have charset="utf-8" but
charset=utf-8. Sometimes the value is quoted, sometimes not!

> [('algorithm', 'md5-sess'), ('maxbuf', 1024), ('charset', 'utf-8'),
> ('cipher', 'rc4-40,rc4-56,rc4,des,3des'), ('qop', 'auth,auth-int,auth-conf')]
> >>> for k,v in d.iteritems(): print k, '=', v
> ...
> algorithm = md5-sess
> maxbuf = 1024
> charset = utf-8
> cipher = rc4-40,rc4-56,rc4,des,3des
> qop = auth,auth-int,auth-conf
>
> For safe eval, take a look at http://aspn.activestate.com/ASPN/Cookbook/Python/Rec...
>
> -N

Paul McGuire

1/11/2008 8:20:00 PM

On Jan 11, 12:50 pm, teddyber <teddy...@gmail.com> wrote:
> Hello,
>
> first i'm a newbie to python (but i searched the Internet i swear).
> i'm looking for some way to split up a string into a list of pairs
> 'key=value'. This code should be able to handle this particular
> example string :
>
> qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,
> 3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess
>
> i know i can do that with some regexp (i'm currently trying to learn
> that) but if there's some other way...
>
> thanks

Those quoted strings sure are pesky when you try to split along
commas. Here is a solution using pyparsing - note the argument field
access methods at the bottom. Also, the parse action attached to
integer will do conversion of the string to an int at parse time.

More info on pyparsing at http://pyparsing.wiki....

-- Paul

from pyparsing import Word, nums, alphas, quotedString, delimitedList, Literal, CharsNotIn, Dict, Group, removeQuotes

arg = '''qop="auth,auth-int,auth-conf",
cipher="rc4-40,rc4-56,rc4,des,3des",
maxbuf=1024,charset=utf-8,algorithm=md5-sess'''

# format is: delimited list of key=value groups, where value is
# a quoted string, an integer, or a non-quoted string up to the
# next ',' character
key = Word(alphas)
EQ = Literal("=").suppress()
integer = Word(nums).setParseAction(lambda t:int(t[0]))
quotedString.setParseAction(removeQuotes)
other = CharsNotIn(",")
val = quotedString | integer | other

# parse each key=val arg into its own group
argList = delimitedList( Group(key + EQ + val) )
args = argList.parseString(arg)

# print the parsed results
print args.asList()
print

# add dict-like retrieval capabilities, by wrapping in a Dict
# expression
argList = Dict(delimitedList( Group(key + EQ + val) ))
args = argList.parseString(arg)

# print the modified results, using dump() (shows dict entries too)
print args.dump()

# access the values by key name
print "Keys =", args.keys()
print "cipher =", args["cipher"]

# or can access them like attributes of an object
print "maxbuf =", args.maxbuf


Prints:

[['qop', 'auth,auth-int,auth-conf'], ['cipher', 'rc4-40,rc4-56,rc4,des,3des'],
['maxbuf', 1024], ['charset', 'utf-8'], ['algorithm', 'md5-sess']]

[['qop', 'auth,auth-int,auth-conf'], ['cipher', 'rc4-40,rc4-56,rc4,des,3des'],
['maxbuf', 1024], ['charset', 'utf-8'], ['algorithm', 'md5-sess']]
- algorithm: md5-sess
- charset: utf-8
- cipher: rc4-40,rc4-56,rc4,des,3des
- maxbuf: 1024
- qop: auth,auth-int,auth-conf
Keys = ['maxbuf', 'cipher', 'charset', 'algorithm', 'qop']
cipher = rc4-40,rc4-56,rc4,des,3des
maxbuf = 1024

Russ P.

1/11/2008 8:53:00 PM

On Jan 11, 10:50 am, teddyber <teddy...@gmail.com> wrote:
> Hello,
>
> first i'm a newbie to python (but i searched the Internet i swear).
> i'm looking for some way to split up a string into a list of pairs
> 'key=value'. This code should be able to handle this particular
> example string :
>
> qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,
> 3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess
>
> i know i can do that with some regexp (i'm currently trying to learn
> that) but if there's some other way...
>
> thanks

The problem is that you are using commas for delimiters at two
different levels.

I would start by replacing the commas between quotation marks with
some other delimiter, such as spaces or semicolons. To do that, step
through each character and keep a count of quotation marks. While the
count is odd, replace each comma with the selected alternative
delimiter. While the count is even, leave the comma. [An alternative
would be to replace the commas outside the quotation marks.]

Once that is done, the problem is straightforward. Split the string on
commas (using string.split(",")). Then split each item in the list by
"=". Use the [0] element for the key, and use the [1] element for the
value (first stripping off the quotation marks if necessary). If you
need to further split each of the values, just split on whatever
delimiter you chose to replace the commas.
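
The two steps described above can be sketched in Python 3 roughly as follows. The function name parse_params and the choice of ";" as the replacement delimiter are assumptions made for this example; it also assumes that quotation marks are balanced and that no value already contains ";".

```python
def parse_params(line, inner_sep=';'):
    """Quote-counting split as described above (hypothetical helper)."""
    chars = []
    quotes_seen = 0
    for ch in line:
        if ch == '"':
            quotes_seen += 1
        elif ch == ',' and quotes_seen % 2 == 1:
            # Odd count: we are inside quotes, so swap the comma.
            ch = inner_sep
        chars.append(ch)

    result = {}
    # Only top-level commas are left, so a plain split is now safe.
    for item in ''.join(chars).split(','):
        key, value = item.split('=', 1)
        # Strip the quotes, then split the value on the inner delimiter.
        result[key] = value.strip('"').split(inner_sep)
    return result

line = ('qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",'
        'maxbuf=1024,charset=utf-8,algorithm=md5-sess')
for key, values in parse_params(line).items():
    print(key, values)
```

Every value comes back as a list of strings, so single-valued keys such as maxbuf yield a one-element list.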

Russ P.

1/11/2008 9:03:00 PM

On Jan 11, 12:53 pm, "Russ P." <Russ.Paie...@gmail.com> wrote:
> On Jan 11, 10:50 am, teddyber <teddy...@gmail.com> wrote:
>
> > Hello,
>
> > first i'm a newbie to python (but i searched the Internet i swear).
> > i'm looking for some way to split up a string into a list of pairs
> > 'key=value'. This code should be able to handle this particular
> > example string :
>
> > qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,
> > 3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess
>
> > i know i can do that with some regexp (i'm currently trying to learn
> > that) but if there's some other way...
>
> > thanks
>
> The problem is that you are using commas for delimiters at two
> different levels.
>
> I would start by replacing the commas between quotation marks with
> some other delimiter, such as spaces or semicolons. To do that, step
> through each character and keep a count of quotation marks. While the
> count is odd, replace each comma with the selected alternative
> delimiter. While the count is even, leave the comma. [An alternative
> would be to replace the commas outside the quotation marks.]
>
> Once that is done, the problem is straightforward. Split the string on
> commas (using string.split(",")). Then split each item in the list by
> "=". Use the [0] element for the key, and use the [1] element for the
> value (first stripping off the quotation marks if necessary). If you
> need to further split each of the values, just split on whatever
> delimiter you chose to replace the commas.


One more point. Whoever chose the structure of the string you are
parsing didn't do a very good job. If you know that person, you should
tell him or her to use different delimiters at the different levels.
Use commas for one level, and spaces or semicolons for the other
level. Then you won't have to "correct" the string before you parse
it.

Reedick, Andrew

1/11/2008 9:21:00 PM


> -----Original Message-----
> From: python-list-bounces+jr9445=att.com@python.org [mailto:python-
> list-bounces+jr9445=att.com@python.org] On Behalf Of teddyber
> Sent: Friday, January 11, 2008 1:51 PM
> To: python-list@python.org
> Subject: split parameter line with quotes
>
> Hello,
>
> first i'm a newbie to python (but i searched the Internet i swear).
> i'm looking for some way to split up a string into a list of pairs
> 'key=value'. This code should be able to handle this particular
> example string :
>
> qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,
> 3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess
>
> i know i can do that with some regexp (i'm currently trying to learn
> that) but if there's some other way...
>

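import re

s = 'qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess'
print(s)

# each match is a (key, value, separator) tuple; the value is either a
# quoted string or a run of characters that contains no quotes
matches = re.findall(r'(.*?)=(".*?"|[^"]*?)(,|$)', s)
print(matches)

for m in matches:
    # strip the surrounding quotes from quoted values
    print(m[0], "=", m[1].strip('"'))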


Output:
qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess

[
('qop', '"auth,auth-int,auth-conf"', ','),
('cipher', '"rc4-40,rc4-56,rc4,des,3des"', ','),
('maxbuf', '1024', ','),
('charset', 'utf-8', ','),
('algorithm', 'md5-sess', '')
]

qop = auth,auth-int,auth-conf
cipher = rc4-40,rc4-56,rc4,des,3des
maxbuf = 1024
charset = utf-8
algorithm = md5-sess






teddyber

1/11/2008 10:31:00 PM

Wow, this shlex module is perfect! Thanks for pointing it out!

On 11 jan, 20:36, Joshua Kugler <jkug...@bigfoot.com> wrote:
> teddyber wrote:
> > first i'm a newbie to python (but i searched the Internet i swear).
> > i'm looking for some way to split up a string into a list of pairs
> > 'key=value'. This code should be able to handle this particular
> > example string :
>
> > qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,
> > 3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess
>
> > i know i can do that with some regexp (i'm currently trying to learn
> > that) but if there's some other way...
>
> Take a look at the shlex module. You might be able to fiddle with the shlex
> object and convince it to split on the commas. But, to be honest, that
> above would be a lot easier to parse if the dividing commas were spaces
> instead.
>
> j

teddyber

1/11/2008 10:32:00 PM

I know this is some kind of bad design, but the problem is that I
receive this string from a Jabber server and I cannot do anything to
change it. I should still try to verify whether that's a correct
implementation of the Jabber protocol...

On 11 jan, 22:02, "Russ P." <Russ.Paie...@gmail.com> wrote:
> On Jan 11, 12:53 pm, "Russ P." <Russ.Paie...@gmail.com> wrote:
>
>
>
> > On Jan 11, 10:50 am, teddyber <teddy...@gmail.com> wrote:
>
> > > Hello,
>
> > > first i'm a newbie to python (but i searched the Internet i swear).
> > > i'm looking for some way to split up a string into a list of pairs
> > > 'key=value'. This code should be able to handle this particular
> > > example string :
>
> > > qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,
> > > 3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess
>
> > > i know i can do that with some regexp (i'm currently trying to learn
> > > that) but if there's some other way...
>
> > > thanks
>
> > The problem is that you are using commas for delimiters at two
> > different levels.
>
> > I would start by replacing the commas between quotation marks with
> > some other delimiter, such as spaces or semicolons. To do that, step
> > through each character and keep a count of quotation marks. While the
> > count is odd, replace each comma with the selected alternative
> > delimiter. While the count is even, leave the comma. [An alternative
> > would be to replace the commas outside the quotation marks.]
>
> > Once that is done, the problem is straightforward. Split the string on
> > commas (using string.split(",")). Then split each item in the list by
> > "=". Use the [0] element for the key, and use the [1] element for the
> > value (first stripping off the quotation marks if necessary). If you
> > need to further split each of the values, just split on whatever
> > delimiter you chose to replace the commas.
>
> One more point. Whoever chose the structure of the string you are
> parsing didn't do a very good job. If you know that person, you should
> tell him or her to use different delimiters at the different levels.
> Use commas for one level, and spaces or semicolons for the other
> level. Then you won't have to "correct" the string before you parse
> it.

teddyber

1/11/2008 10:36:00 PM

here's the solution i have for the moment :

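import shlex

# data holds the string received from the server
t = shlex.shlex(data)
# keep characters such as '-' and '.' inside a single token
t.wordchars = t.wordchars + "/+.-"
r = ''
while 1:
    token = t.get_token()
    if not token:
        break
    # rebuild the line, turning top-level commas into spaces
    if not token == ',': r = r + token
    else: r = r + ' '
self.DEBUG(r, 'ok')
for pair in r.split(' '):
    key, value = pair.split('=', 1)
    print(key + ':' + value)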

I know this is still not perfect, but I've come a long way from very
bad PHP habits! :o)
And thanks for your help!

On 11 jan, 23:30, teddyber <teddy...@gmail.com> wrote:
> wow! that's perfect this shlex module! thanks for pointing this!
>
> On 11 jan, 20:36, Joshua Kugler <jkug...@bigfoot.com> wrote:
>
> > teddyber wrote:
> > > first i'm a newbie to python (but i searched the Internet i swear).
> > > i'm looking for some way to split up a string into a list of pairs
> > > 'key=value'. This code should be able to handle this particular
> > > example string :
>
> > > qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,
> > > 3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess
>
> > > i know i can do that with some regexp (i'm currently trying to learn
> > > that) but if there's some other way...
>
> > Take a look at the shlex module. You might be able to fiddle with the shlex
> > object and convince it to split on the commas. But, to be honest, that
> > above would be a lot easier to parse if the dividing commas were spaces
> > instead.
>
> > j