Markus
10/8/2004 4:53:00 AM
On Thu, 2004-10-07 at 21:19, Yukihiro Matsumoto wrote:
> Hi,
>
> In message "Re: quality of error messages"
> on Fri, 8 Oct 2004 12:58:26 +0900, Jamis Buck <jgb3@email.byu.edu> writes:
>
> |> We can. But how we check for missing/broken def/end pairs, more than
> |> just syntax error?
>
> |I believe what is being asked for is more than just a "syntax error"
> |message. If the error could be more specific, like "missing 'end' on
> |line x", it would greatly increase the usefulness of the -c option.
>
> I know what he wants. I am not refusing his idea. The point is I'm
> not yet sure how to detect missing pairs.
>
Having spent 12 of the last 48 hours or so hacking away on Ruby's
parse.y, I think I've got a pretty clear idea what the problem is.
Unless (as some have suggested) you add a second source of information
(such as indentation or an explicit statement of intent such as 'enddef'
or 'method_delimiter') it simply isn't possible in general to tell which
end is missing. Consider:
    ((1+2)+3+4/5
There is clearly a ')' missing, but should it be:
    ((1)+2)+3+4/5    which equals 6.8
or
    ((1+2))+3+4/5    which likewise equals 6.8
or
    ((1+2)+3)+4/5    which is also 6.8
or
    ((1+2)+3+4)/5    which is 2
or
    ((1+2)+3+4/5)    which is 6.8 again
Without an external source of information, it is impossible to decide
this. In a simple Ruby program, there might be a reasonably small
number of possibilities, but those are the times it's easy to spot "by
hand." In a more complex (say, over 50 lines or so) program it would be
more work to weed through the warnings than to find it by other means.
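
To make the ambiguity concrete, here is a rough sketch that lists every
place a single ')' could go in the expression above. The helper names
(balanced?, repairs, value) are made up for illustration, and it assumes
plain integer operands joined by binary operators. The values are computed
with Rational literals, since ordinary Ruby integer division would turn
4/5 into 0.

```ruby
# Sketch only: enumerate the ways to repair an expression missing one ')'.
# Assumes non-negative integer operands and binary operators.

# True if parentheses never close more than was opened and all are closed.
def balanced?(text)
  depth = 0
  text.each_char do |ch|
    depth += 1 if ch == "("
    depth -= 1 if ch == ")"
    return false if depth < 0
  end
  depth.zero?
end

# Every distinct expression obtained by inserting exactly one ')'.
def repairs(broken)
  (0..broken.length).map do |pos|
    before = pos.zero? ? "" : broken[pos - 1]
    after  = broken[pos].to_s
    # A ')' has to end an operand: it follows a digit or another ')',
    # and it may not split a number or sit directly in front of '('.
    next unless before.match?(/[\d)]/)
    next if after.match?(/[\d(]/)
    candidate = broken.dup.insert(pos, ")")
    candidate if balanced?(candidate)
  end.compact.uniq
end

# Evaluate with exact arithmetic: 4r/5r is the Rational 4/5, not 0.
def value(expression)
  eval(expression.gsub(/\d+/) { |n| "#{n}r" })
end

repairs("((1+2)+3+4/5").each do |fixed|
  puts "#{fixed}  =>  #{value(fixed).to_f}"
end
```

Every candidate it prints is well formed, and nothing in the text itself
says which one was meant.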
-- Markus
P.S. There may be heuristics to get a reasonable "hint" by making some
assumptions; e.g., warn if there is a line less indented than the first
line of an outstanding (open) construct, excluding here-docs, %_{
constructs, etc., if (and only if) there is a missing end at eof. This
could (I think) be implemented fairly easily by
  * caching the location and indentation of each class, def, etc.
    on a stack
  * popping from the stack on end
  * noting, when the first token is lexed from a line, whether it was
    less indented than the most recent outstanding def/class, etc.,
    and if so recording the fact in a global
  * including the information in the global (if any) when generating
    the missing end message
But this is only a heuristic, based on the observation that even people
who don't like salient structure tend to use it to some extent. It
would not solve the problem in general, and perhaps not even in a
typical case, for anyone but me and the Python expatriates.
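
For what it's worth, the stack-and-indentation steps in the P.S. might
look roughly like the following. This is a line-based toy, not a patch to
parse.y: the names (Opener, missing_end_hint) are hypothetical, it only
recognizes a construct when its keyword starts the line (or a trailing
"do"), and it ignores here-docs, string literals, modifier forms,
one-line "def ...; end", and tab width.

```ruby
# Sketch only: a line-based approximation of the indentation heuristic.
Opener = Struct.new(:keyword, :line, :indent)

OPENER_RE   = /\A(class|module|def|if|unless|while|until|case|begin)\b/
DO_BLOCK_RE = /\bdo(\s*\|[^|]*\|)?\s*\z/

def missing_end_hint(source)
  stack = []  # outstanding constructs, innermost last
  hint  = nil # first suspicious dedent (the "global" in the description)

  source.each_line.with_index(1) do |raw, lineno|
    text = raw.strip
    next if text.empty? || text.start_with?("#")
    indent = raw[/\A[ \t]*/].size

    # Is this line less indented than the innermost open construct?
    top = stack.last
    if hint.nil? && top && indent < top.indent
      hint = { line: lineno, opener: top.keyword, opener_line: top.line }
    end

    if text.match?(/\Aend\b/)
      stack.pop
    elsif (m = OPENER_RE.match(text))
      stack.push(Opener.new(m[1], lineno, indent))
    elsif DO_BLOCK_RE.match?(text)
      stack.push(Opener.new("do", lineno, indent))
    end
  end

  # Report only if an end is actually missing at EOF.
  return nil if stack.empty?
  { outstanding: stack.map { |o| [o.keyword, o.line] }, hint: hint }
end

sample = <<~RUBY
  class Foo
    def bar
      if x
        y
    end
  end
RUBY

result = missing_end_hint(sample)
p result
```

Here the parser would only know that "class" on line 1 is never closed;
the hint points at line 5, where the indentation drops below the "if"
opened on line 3, which is the more useful place to start looking.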