Ben Crowell
6/11/2007 5:03:00 PM
I have a ruby script that uses some oniguruma features. It works on
this setup:
FreeBSD
32-bit intel
ruby 1.8.5 (2006-08-25) [i386-freebsd4]
ruby compiled from the FreeBSD port, with oniguruma support
It does not work on this setup:
Linux
x64
ruby 1.9.0 (2007-05-07 patchlevel 0) [x86_64-linux]
I suspect that the bug may be a problem with code in oniguruma that's
not 64-bit clean. Unfortunately, it's been difficult for me to trim
this down to a minimal example that demonstrates the bug. What seems
to happen is that oniguruma evaluates a certain regex repeatedly, but
at some point (possibly after hundreds of evaluations), it overwrites
the first 8 bytes of the regex source with null bytes. Here's what the
source code looks like:
tex.split(/\\(?:begin|end){#{x}}/).each { |m|
Here's the error:
../translate_to_html.rb:479:in `block in handle_tables': unmatched
close parenthesis: /\000\000\000\000\000\000\000\000in|end){tabular}/
(RegexpError)
Notice how the source code quoted in the error message is not the same
as the actual source code. I'm imagining C code something like this.
(Pardon me if my C syntax is incorrect -- I'm rusty.)
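typedef struct {
  char *a;
  char *b;
  char s[];
} regex_t;
regex_t *p;
p = malloc(...);
// incorrectly assumes 4-byte pointers, i.e. that s[] starts at offset 8
strcpy(((char *) p)+8,string);
// with 8-byte pointers, b itself sits at offset 8, so this zeroes
// the first 8 characters of the string that was just copied
p->b = NULL;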
Of course the real C code inside the oniguruma implementation would have
to be a lot more complex than this, or else the error would be easier
to reproduce, and would not occur seemingly randomly, after hundreds of
evaluations. I'm guessing that the error is related to the interpolated
string #{x} in my regex: since I'm not using the /o flag, Ruby builds a
new regex object every time that line is evaluated.
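To make the point about regex recreation concrete, here is a minimal sketch in Ruby. The helper names environment_pattern and environment_pattern_once are hypothetical (they are not in my script), and the braces are escaped here only for clarity. It shows that a regex literal containing #{x} yields a new Regexp object on each evaluation, whereas the /o flag interpolates only once and then reuses the same object.

```ruby
# Hypothetical helper mirroring the regex from the post.
def environment_pattern(x)
  /\\(?:begin|end)\{#{x}\}/
end

# Same pattern with the /o flag: interpolation is done only the first time
# this literal is evaluated, and the resulting Regexp is reused afterwards.
def environment_pattern_once(x)
  /\\(?:begin|end)\{#{x}\}/o
end

a = environment_pattern("tabular")
b = environment_pattern("tabular")
puts a == b       # true: same source and options
puts a.equal?(b)  # false: two distinct Regexp objects

c = environment_pattern_once("tabular")
d = environment_pattern_once("figure")
puts c.equal?(d)  # true: the literal was compiled once and cached
puts d.source     # still contains "tabular", not "figure"
```

So /o would avoid the repeated compilation, but only when x never changes, which is not the case in my script.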
I would be willing to put more effort into trying to make a short,
reproducible example of the bug, if people on this group thought it
would be helpful. However, I've already put ~8 hours into trying to
make a short test case, and just haven't had any luck. I thought that
maybe if I posted here, the folks who work on the oniguruma code might
look at my post and say, "Oh, I can imagine how such a bug would occur.
I'll review the relevant part of the code."
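For anyone who wants to try this on a 64-bit machine, the kind of stress loop I have been experimenting with can be sketched as follows. This is only an assumption about what might trigger the problem, and it is not known to reproduce the bug; the names ENVIRONMENTS, split_on_environment and stress are hypothetical and do not come from my script. It rebuilds the interpolated regex many times with changing values of x and checks that the regex source has not picked up null bytes.

```ruby
# Hypothetical list of values for x; the real script takes them from its input.
ENVIRONMENTS = %w[tabular figure equation].freeze

# Rebuild the interpolated regex, check its source for null bytes, then split.
def split_on_environment(tex, x)
  pattern = /\\(?:begin|end)\{#{x}\}/
  if pattern.source.include?("\0")
    raise "regex source corrupted: #{pattern.inspect}"
  end
  tex.split(pattern)
end

# Evaluate the regex `iterations` times, cycling through the values of x.
def stress(iterations)
  tex = "a\\begin{tabular}b\\end{tabular}c"
  iterations.times do |i|
    x = ENVIRONMENTS[i % ENVIRONMENTS.size]
    parts = split_on_environment(tex, x)
    expected = (x == "tabular") ? %w[a b c] : [tex]
    unless parts == expected
      raise "unexpected split at iteration #{i}: #{parts.inspect}"
    end
  end
  iterations
end

stress(10_000)
```

On a healthy build this should finish silently; a RegexpError or one of the raises above would point at the corruption.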