Gregory Seidman
1/12/2006 3:55:00 PM
On Thu, Jan 12, 2006 at 04:04:58PM +0900, Matthew Moss wrote:
[...]
} Most solutions fell into one of a few types. I'll cover each a bit and point
} out anything in particular that attracted my attention.
}
} The first class of solutions massaged the input expression a bit, then used
} Ruby's eval method to do the work.
[...]
} The second class of solutions involved using a parsing library or parser
} generator tool.
[...]
} And now we're at the third class of solutions: the handcrafted parsers.
} About nine people handcrafted parsers, mostly recursive descent which is
} rather easy to code.
[...]
I want to go through my handcrafted parser solution, which may have gotten
missed in the shuffle. This is especially likely since I also submitted one
using the eval method. In fact, my handcrafted parser used eval, but only
for the +, -, /, and * operators.
The important bit is that all of the nodes in my syntax tree respond to
to_i with the computed result of its subexpression. This was largely to
make numbers behave the same way that my syntactic elements did, and vice
versa. My parsing method looks like this (with a fail check to make sure
nothing went wrong after each line, mainly for debugging purposes) to
create the syntax tree:
1) token_array = tokenize(expression)
uses String#scan and checks for garbage
2) token_array = validate_and_cook(token_array)
replaces some tokens with symbols, deals with dX => 1dX, d00 =>
d100, d% => d100, makes sure open parens have close parens and
the operators and operands are in grammatically appropriate places
3) parse_tokens(token_array)
a) token_array = parse_parens(token_array)
finds parenthesized expressions and runs the subarray through
parse_tokens; nested parens are handled recursively. Note that
there is no need for error checking, since the expression has
already been validated.
b) token_array = parse_ops(token_array, /^d$/) if token_array.length > 1
finds all d operators and creates a DieOperator object with its
left and right operands. Since all the operators in this grammar
are left-associative, parse_ops is left-associative. The regex is
used to determine if the token is the operator(s) being handled.
c) token_array = parse_ops(token_array, /^[*\/]$/) if token_array.length > 1
same as b, but for * and / operators. In this case, ArithOperator
objects are created. The parse_ops method uses an OpsClass hash
that matches the operator (d, *, /, +, or -) to the appropriate
class to call new on.
d) token_array = parse_ops(token_array, /^[-+]$/) if token_array.length > 1
same as c, but for + and - operators. Still using ArithOperator
objects.
The ArithOperator class is what uses eval. It is given the operator as a
string, and its to_i method calls eval "#{@left.to_i}#{@op}#{@right.to_i}"
to produce its value.
The points I'm trying to make are:
1) the grammar was so simple that there was no need for a proper
recursive-descent parser
2) duck typing is way cool and let me treat numbers, DieOperators, and
ArithOperators identically
3) a little eval let me use one class instead of four for ArithOperator
4) doing all that validation ahead of time let me replace my syntax tree
with using eval for the whole expression without giving up my nice,
clear failure messages
} Thanks to everyone for the great submissions and lively discussion on the
} mailing list (and the correction to the quiz that I missed!). I just hope
} when we all sit down to play that I don't see any of you with all characters
} stats of 18.
This one was loads of fun, if not as challenging as the Numeric Maze (which
was my first). It's a pity I'll be out of town this weekend and won't be
able to do the next one.
--Greg