comp.lang.python

Re: Time out a regular expression in Python 2.6.4?

Steve Holden

2/15/2010 3:59:00 PM

python@bdurham.com wrote:
> Is there any way to time out a regular expression in Python 2.6.4?
>
> Motivation: Our application allows users to enter regular expressions
> as validation criteria. If a user enters a pathological regular
> expression, we would like to time out the evaluation of this expression
> after a short period of time.
>
Python itself does not contain any mechanism to terminate an operation
if it takes too much time.

One approach would be to run the regex in a subprocess, and apply
process limits to terminate that subprocess if it ran too long.
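
A minimal sketch of the subprocess idea, written for a current Python 3 rather than 2.6.4. It swaps the "process limits" mentioned above for the wall-clock `timeout` argument of `subprocess.run`, which kills the child when the limit is reached. The helper name `search_with_timeout`, the child script, and the exit-code convention are assumptions made for illustration.

```python
import subprocess
import sys

# Child script: exit code 0 = match, 1 = no match, 2 = invalid pattern.
_CHILD = r"""
import re, sys
pattern, text = sys.argv[1], sys.argv[2]
try:
    matched = re.search(pattern, text) is not None
except re.error:
    sys.exit(2)
sys.exit(0 if matched else 1)
"""

def search_with_timeout(pattern, text, timeout=1.0):
    """Return True or False for match / no match, or None on timeout."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", _CHILD, pattern, text],
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return None
    if result.returncode == 2:
        raise ValueError("invalid regular expression: %r" % pattern)
    return result.returncode == 0
```

Passing the pattern and text on the command line keeps the sketch short, but it will not cope with very long inputs or embedded NUL characters; a pipe would be more robust. Starting an interpreter per validation is also slow, so the timeout has to allow for startup time.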

This group being what it is, you are likely to receive other, better
suggestions too.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
PyCon is coming! Atlanta, Feb 2010 http://us....
Holden Web LLC http://www.hold...
UPCOMING EVENTS: http://holdenweb.event...

Jonathan Gardner

2/15/2010 11:55:00 PM

On Feb 15, 7:59 am, Steve Holden <st...@holdenweb.com> wrote:
> pyt...@bdurham.com wrote:
> > Is there any way to time out a regular expression in Python 2.6.4?
>
> > Motivation: Our application allows users to enter regular expressions
> > as validation criteria. If a user enters a pathological regular
> > expression, we would like to timeout the evaluation of this expression
> > after a short period of time.
>
> Python itself does not contain any mechanism to terminate an operation
> if it takes too much time.
>
> One approach would be to run the regex in a subprocess, and apply
> process limits to terminate that subprocess if it ran too long.
>
> This group being what it is you are likely to receive other, better
> suggestions too.
>

I'm not sure how exactly the re module is implemented, but since I
assume a great chunk is in C code, you may get away with a single
process and multiple threads. One thread will watch the process, or
have a timer event set to go off at a certain point. The other will
actually run the regex and get killed by the timer process if it
doesn't finish in time.

Steve Holden

2/16/2010 1:06:00 AM

Jonathan Gardner wrote:
> On Feb 15, 7:59 am, Steve Holden <st...@holdenweb.com> wrote:
>> pyt...@bdurham.com wrote:
>>> Is there any way to time out a regular expression in Python 2.6.4?
>>> Motivation: Our application allows users to enter regular expressions
>>> as validation criteria. If a user enters a pathological regular
>>> expression, we would like to timeout the evaluation of this expression
>>> after a short period of time.
>> Python itself does not contain any mechanism to terminate an operation
>> if it takes too much time.
>>
>> One approach would be to run the regex in a subprocess, and apply
>> process limits to terminate that subprocess if it ran too long.
>>
>> This group being what it is you are likely to receive other, better
>> suggestions too.
>>
>
> I'm not sure how exactly the re module is implemented, but since I
> assume a great chunk is in C code, you may get away with a single
> process and multiple threads. One thread will watch the process, or
> have a timer event set to go off at a certain point. The other will
> actually run the regex and get killed by the timer process if it
> doesn't finish in time.

That would be a great idea if it were possible to kill a thread from
outside. Unfortunately it's not, so the best you can do is set a flag
and have it queried periodically. This is not practical during re matching.
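
To illustrate the flag approach, here is a small sketch using `threading.Event`. The function and variable names are made up for the example. The worker can be stopped only because its loop checks the flag between small units of work.

```python
import threading
import time

def run_cooperative_worker(run_for=0.1):
    """Start a worker, ask it to stop after run_for seconds, report the result."""
    stop_requested = threading.Event()
    iterations = []

    def worker():
        # The flag is checked between small units of work.
        while not stop_requested.is_set():
            iterations.append(1)
            time.sleep(0.01)  # stand-in for one unit of real work

    thread = threading.Thread(target=worker)
    thread.start()
    time.sleep(run_for)
    stop_requested.set()       # a request to stop, not a kill
    thread.join(timeout=1.0)
    return len(iterations), thread.is_alive()
```

A single call into the re module offers no such check point from Python code: the worker cannot look at the flag until the call returns, which is why the pattern does not help here.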

regards
Steve

MRAB

2/16/2010 1:38:00 AM

Steve Holden wrote:
> Jonathan Gardner wrote:
>> On Feb 15, 7:59 am, Steve Holden <st...@holdenweb.com> wrote:
>>> pyt...@bdurham.com wrote:
>>>> Is there any way to time out a regular expression in Python
>>>> 2.6.4? Motivation: Our application allows users to enter
>>>> regular expressions as validation criteria. If a user enters a
>>>> pathological regular expression, we would like to timeout the
>>>> evaluation of this expression after a short period of time.
>>> Python itself does not contain any mechanism to terminate an
>>> operation if it takes too much time.
>>>
>>> One approach would be to run the regex in a subprocess, and apply
>>> process limits to terminate that subprocess if it ran too long.
>>>
>>> This group being what it is you are likely to receive other,
>>> better suggestions too.
>>>
>> I'm not sure how exactly the re module is implemented, but since I
>> assume a great chunk is in C code, you may get away with a single
>> process and multiple threads. One thread will watch the process, or
>> have a timer event set to go off at a certain point. The other
>> will actually run the regex and get killed by the timer process if
>> it doesn't finish in time.
>
> That would be a great idea if it were possible to kill a thread from
> outside. Unfortunately it's not, so the best you can do is set a flag
> and have it queried periodically. This is not practical during re
> matching.
>
The code for matching in the re module is written in C, and it doesn't
release the GIL because it calls the Python API, and you need to have
the GIL when doing that (unless you can guarantee that the specific call
is safe, that is!).

This means that other threads can't run during matching.

In order to be able to cancel the matching, the re module would have to
release the GIL when possible and have some kind of cancel() method
(belonging to which class?).

A simpler option would be to add a timeout argument. The matching code
already checks periodically for Ctrl-C, so perhaps the time check could
be done at the same point.
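
Because the matcher checks for pending signals, one possible workaround without changing the re module is to let an alarm signal raise an exception. This is only a sketch under several assumptions: it is Unix-only, it works only in the main thread, and it relies on the interpreter in use actually polling for signals during matching. `RegexTimeout` and `search_with_alarm` are names invented for the example, and the sketch is written for Python 3.

```python
import re
import signal

class RegexTimeout(Exception):
    """Raised by the alarm handler when the time limit expires."""

def _on_alarm(signum, frame):
    raise RegexTimeout()

def search_with_alarm(pattern, text, seconds=1.0):
    # Install the handler and arm a one-shot real-time timer.
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return re.search(pattern, text)
    finally:
        # Disarm the timer first, then restore the previous handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)
```

A caller would wrap `search_with_alarm` in `try`/`except RegexTimeout` and treat the exception as a failed validation. Whether a pathological pattern is really interrupted depends on how often the matcher polls for signals, so this should be tested against the interpreter version in use.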