comp.lang.ruby
[ann] regexp-engine 0.4
Simon Strandgaard
11/25/2003 4:46:00 PM
download:
http://rubyforge.org/download.php/219/regexp-engine-...
homepage:
http://raa.ruby-lang.org/list.rhtml?n...
Try it out; tell me your opinion.
--
Simon Strandgaard
Changes
=======
Non-greedy matching has been implemented. You can now do
/a(.*?)a/.match("0a1a2a3").to_a #=> ["a1a", "1"]
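To see what non-greedy matching changes, the sketch below runs the same subject through a greedy and a lazy pattern. It uses Ruby's built-in Regexp rather than regexp-engine itself, since this post does not show the engine's own API; the helper name match_array is made up for illustration.

```ruby
# Runs on Ruby's built-in Regexp, not on regexp-engine (an assumption
# made for illustration). match_array is a hypothetical helper name.
def match_array(pattern, subject)
  m = pattern.match(subject)
  m ? m.to_a : nil
end

subject = "0a1a2a3"

# Greedy: .* takes as much as it can, then backtracks to the last "a".
p match_array(/a(.*)a/, subject)   #=> ["a1a2a", "1a2"]

# Lazy: .*? takes as little as it can, stopping at the next "a".
p match_array(/a(.*?)a/, subject)  #=> ["a1a", "1"]
```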
The engine now uses iterators internally; this paves the way
for i18n, so that the engine can operate on Unicode, JIS, etc.
Status
======
The data structure has stabilized and the fundamental operations
are working quite well (they were difficult to implement).
The engine uses iterators, which should make it easy to operate on many
different kinds of input streams (Unicode, UTF-8), but right
now the iterator only works on ASCII.
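As a rough illustration of why iterators decouple matching from the input encoding, here is a minimal sketch. It assumes Ruby's own Enumerator as the iterator (the engine's real iterator classes are not shown in this post), and dot_run_length is a hypothetical name. The matching code only asks the iterator for the next symbol, so the same code runs over bytes or over Unicode codepoints.

```ruby
# Hypothetical sketch: Ruby's Enumerator stands in for the engine's
# iterator, and dot_run_length is an illustrative name only.

# Counts how many symbols in a row "." (anything except newline) would
# accept. It never touches the string, only the iterator.
def dot_run_length(symbols)
  newline = 10
  count = 0
  loop do                       # loop ends quietly on StopIteration
    break if symbols.next == newline
    count += 1
  end
  count
end

text = "h\u00e9llo\nworld"

# One symbol per byte: fine for ASCII, but the e-acute counts twice.
p dot_run_length(text.each_byte)       #=> 6

# One symbol per codepoint: the e-acute is a single symbol.
p dot_run_length(text.each_codepoint)  #=> 5
```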
Performance is not impressive.
What is left is all the easy stuff (character classes, Unicode, optimization).
* features of the scanner so far:
  a|b|c            alternation
  * + ? {n,m}      repeat(min..max)  greedy/lazy
  ( ... )          grouping -> register; nested repeat also works
  .                match anything except newline
  \1 .. \9         backreferences
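The scanner features listed above can be tried out with a short sketch. Ruby's built-in Regexp stands in for regexp-engine here, on the assumption that the engine aims to behave the same way for this syntax; show is a hypothetical helper name.

```ruby
# Ruby's built-in Regexp stands in for regexp-engine here (assumption).
# show is a hypothetical helper returning the match plus its registers.
def show(pattern, subject)
  m = pattern.match(subject)
  m ? m.to_a : nil
end

p show(/a|b|c/, "xbz")            #=> ["b"]             alternation
p show(/(ab){2,3}/, "abababab")   #=> ["ababab", "ab"]  greedy repeat
p show(/(ab){2,3}?/, "abababab")  #=> ["abab", "ab"]    lazy repeat
p show(/.+/, "ab\ncd")            #=> ["ab"]            . stops at newline
p show(/(a+)b\1/, "aaabaa")       #=> ["aabaa", "aa"]   backreference
```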
* features of the parser so far:
  a|b|c            alternation
  *      *?        repeat(0..infinity)  greedy/lazy
  +      +?        repeat(1..infinity)  greedy/lazy
  {n,}   {n,}?     repeat(n..infinity)  greedy/lazy
  ?      ??        repeat(0..1)         greedy/lazy
  {n,m}  {n,m}?    repeat(n..m)         greedy/lazy
  {n}    {n}?      repeat(n..n)         greedy/lazy (does lazy make sense here?)
  ( ... )          group -> register
  .                match anything except newline
  \1 .. \9         backreferences
  \                escape
Special case: illegal ranges are treated as if they were just
ordinary literals.
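The quantifier table above maps each token to a repeat(min..max) range plus a greedy or lazy mode. A minimal sketch of that mapping follows. The function name parse_quantifier and its return shape are assumptions for illustration, not the engine's real parser, and treating both malformed braces and min > max as "illegal" is also an assumption.

```ruby
# Hypothetical sketch, not the engine's real parser.
# Returns [min, max, :greedy or :lazy], or nil when the token is not a
# legal quantifier and should be treated as ordinary literal text.
def parse_quantifier(token)
  # A trailing "?" after another quantifier marks it lazy ("?" alone is 0..1).
  lazy = token.length > 1 && token.end_with?("?")
  core = lazy ? token[0...-1] : token

  range =
    case core
    when "*" then [0, Float::INFINITY]
    when "+" then [1, Float::INFINITY]
    when "?" then [0, 1]
    when /\A\{(\d+)\}\z/       then [$1.to_i, $1.to_i]
    when /\A\{(\d+),\}\z/      then [$1.to_i, Float::INFINITY]
    when /\A\{(\d+),(\d+)\}\z/ then [$1.to_i, $2.to_i]
    end

  # Assumption: malformed braces and min > max both count as illegal.
  return nil if range.nil? || range[0] > range[1]
  range + [lazy ? :lazy : :greedy]
end

p parse_quantifier("*")       #=> [0, Infinity, :greedy]
p parse_quantifier("{2,5}?")  #=> [2, 5, :lazy]
p parse_quantifier("{5,2}")   #=> nil
```

Returning nil for an illegal range lets the caller fall back to emitting the characters as literals, which matches the special case described above.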
License
=======
Ruby's license.
About
=====
AEditor needs a regexp engine. You are probably thinking: why not
rely on an existing engine (for instance Ruby's regexp engine)?
Existing engines are not flexible enough. The iterator pattern
provides the needed flexibility. Thus it should not matter
whether the engine operates on UCS-4, UTF-8 or ASCII.
The goal is to build an engine that is fully compatible with Ruby's
regexp syntax and that can work with iterators.
Eventually I want to extend the regexp syntax with some editor features.
For instance: the point where the cursor should be placed,
matching text that is legal Ruby code, executing a regexp within a
rectangular selection, etc. I am open to other suggestions.
Eventually I want to re-implement it in C++ to gain performance.