Logan Capaldo
9/15/2006 2:44:00 AM
On Fri, Sep 15, 2006 at 11:32:33AM +0900, Francis Cianfrocca wrote:
> On 9/14/06, Tom Copeland <tom@infoether.com> wrote:
> >
> > I'm not sure if it's impossible to parse out C-style comments using a
> > regular expression, but the various JavaCC grammars I've seen all use
> > lexical states to do it instead. Another complication is trigraphs (*),
> > although I think those are unrecognized by default in most C
> > preprocessors.
>
>
> One more point. Someone upthread gave an example similar to this:
>
> /* printf ("*/"); */
Pretty sure this would end up being a syntax error.
> Considered strictly as a lexical construction, I think this is regular.
> However, I have a funny feeling that this:
>
> /* printf ("/*......*/"); */
This too.
gcc agrees with me, at least:
% cat comments.c
#include <stdio.h>

int main(int argc, char **argv) {
    /* printf("*/"); */
    /* printf("/*.......*/"); */
    return 0;
}
% gcc -c comments.c
comments.c: In function 'main':
comments.c:4: error: missing terminating " character
comments.c:5: error: missing terminating " character
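
To spell out why both lines fail, here is a rough sketch of the rule as I understand it: a comment runs from /* to the first */ and nothing in between is interpreted. The helper name comment_length is made up for illustration; this is not how gcc is actually implemented, just the same rule in a few lines.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper, for illustration only.
// Returns the number of characters in the comment that starts at s,
// or 0 if s does not start with a comment opener or the comment is
// never closed. The comment ends at the FIRST closing delimiter;
// quotes and nested openers inside it are ignored.
size_t comment_length(const char *s)
{
    if (s[0] != '/' || s[1] != '*')
        return 0;

    // Search for the closer starting after the opener, so that the
    // '*' of the opener cannot double as the '*' of the closer.
    const char *end = strstr(s + 2, "*/");

    return end ? (size_t)(end - s) + 2 : 0;
}
```

Run on either of the two lines above, whatever follows the returned length is `"); */`, which begins with a quote that is never closed. That would account for the "missing terminating" error on each line.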
>
> is actually context-free. Does anyone know for sure?
As for whether or not it's context-free, I don't know, but I think you
overestimate how hard C tries. /* */ comments are not nestable, for
instance: the first */ closes the comment.
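
For contrast, here is a sketch of what a scanner would have to do if comments did nest. This is a hypothetical variant, not C's actual behaviour, and nested_comment_length is a name I made up. The depth counter is the interesting part: matching arbitrarily deep balanced delimiters is the textbook case that a regular expression cannot handle, and C avoids it by not nesting.

```c
#include <stddef.h>

// Hypothetical helper: treats comment delimiters as nestable, which
// standard C does NOT do. Returns the length of the balanced comment
// starting at s, or 0 if s does not start with an opener or the
// nesting is never closed.
size_t nested_comment_length(const char *s)
{
    size_t i = 0;
    int depth = 0;

    if (s[0] != '/' || s[1] != '*')
        return 0;

    while (s[i] != '\0') {
        if (s[i] == '/' && s[i + 1] == '*') {
            depth++;            // one more level to close
            i += 2;
        } else if (s[i] == '*' && s[i + 1] == '/') {
            depth--;            // closed one level
            i += 2;
            if (depth == 0)
                return i;       // outermost comment is complete
        } else {
            i++;
        }
    }
    return 0;                   // ran out of input while still nested
}
```

On the input `/* a /* b */ c */ d` this returns 17, taking everything up to the second closer, whereas the real C rule stops at the first closer after 12 characters and leaves ` c */ d` behind as code.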