comp.lang.python
python regex: misbehaviour with "\r" (0x0D) as Newline character in Unicode Mode
Arian Sanusi
1/27/2008 11:31:00 AM
Hi,
According to Unicode, "\n", "\r" and "\r\n" (0x000A, 0x000D and
0x000D+0x000A) should all be treated as newline characters.
At least this is how I understand it:
(http://en.wikipedia.org/wiki/Newli...)
Apparently the re module does not care and, on Unix, treats only \n
as a newline character:
>>> import re
>>> a = re.compile(u"^a", re.U | re.M)
>>> a.search(u"bc\ra")
>>> a.search(u"bc\na")
<_sre.SRE_Match object at 0xb5908fa8>
The same thing happens for $:
>>> b = re.compile(u"c$",re.U|re.M)
>>> b.search(u"bc\r\n")
>>> b.search(u"abc")
<_sre.SRE_Match object at 0xb5908f70>
>>> b.search(u"bc\nde")
<_sre.SRE_Match object at 0xb5908fa8>
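In case it helps anyone with the same problem, here is a minimal sketch of a workaround that does not depend on how re.M defines a line boundary. The names LINE_START and LINE_END are my own and not part of the re module; they spell out the three newline forms with lookarounds. It is written for Python 3 and has not been checked against 2.4/2.5.

```python
import re

# Hypothetical helper fragments (not part of the re module).
# LINE_START: start of string, or just after "\n", or just after a
# lone "\r" (a "\r" that is not the first half of "\r\n").
LINE_START = r"(?:^|(?<=\n)|(?<=\r)(?!\n))"
# LINE_END: just before "\r" or "\n", or at the very end of the string.
LINE_END = r"(?=\r|\n|\Z)"

# No re.M here: "^" only anchors at the start of the string, the
# lookarounds handle every other line boundary.
start_a = re.compile(LINE_START + "a")
end_c = re.compile("c" + LINE_END)

print(start_a.search("bc\ra"))    # a match object instead of None
print(end_c.search("bc\r\n"))     # a match object instead of None
```

The lookarounds are zero-width, so the match spans are the same as they would be with plain ^ and $.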
Is this a known bug in the re module? I couldn't find any issues in the
bug tracker.
Or is this just a user error that you guys can help me with?
arian
P.S.: This appears in both Python 2.4 and 2.5.
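Another option, if changing the input is acceptable, would be to normalize the newlines first and then use ^ and $ with re.M as usual. This is only a sketch: normalize_newlines is a made-up helper name, and the code is written for Python 3.

```python
import re

def normalize_newlines(text):
    """Rewrite "\r\n" and lone "\r" as "\n" (hypothetical helper)."""
    return re.sub(r"\r\n?", "\n", text)

pattern = re.compile(r"^a", re.M)

# After normalization the "\r" has become "\n", so "^" matches before "a".
print(pattern.search(normalize_newlines("bc\ra")))
```

Note that match offsets then refer to the normalized string, which is shorter than the original wherever "\r\n" occurred.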