Asp Forum - "yield" and "old-way iteration"

Bjarke Walling

11/28/2006 10:19:00 PM

Hi,

I am new to Ruby, but I find the language easy to learn and use.
However, I have some code that I think could be written smarter.

I am writing a lexer and parser for a small language I have created.
The first part, splitting some input into tokens, was easy to write using
"yield" (20 lines or so). I was actually a little overwhelmed by how easy
it was. The next part is to examine these tokens and parse them into
language structures. I want to create a class with "current" and "next"
methods to get the current token and fetch the next (advance the
pointer). I have solved it by using the first lexer yielding tokens and
collecting them in an array. Afterwards I can fetch tokens from the
array. But could it be done in a smarter way?

My code is like this:

class FirstParse
  def initialize
    # ...
  end

  # Yields each token to the caller's block.
  def each
    # ... yield tokens ...
  end
end

class SecondParse
  def initialize
    @tokens = []
    @index = 0
    first_parse = FirstParse.new
    # Collect every token the lexer yields.
    first_parse.each { |token| @tokens << token }
  end

  # The token at the current position.
  def current
    @tokens[@index]
  end

  # Advance the pointer and return the new current token.
  def next
    @index += 1
    current
  end

  def read_structure1
    # ... read structures ...
  end

  def read_structure2
    # ... read structures ...
  end
end

Am I being too "Java'ish", or what do you think? It is not a big problem
since the code works, but do I really need to load the tokens into an
array first?

- Bjarke Walling

4 Answers

dblack

11/29/2006 12:41:00 AM

Vidar Hokstad

11/29/2006 12:50:00 AM

Bjarke Walling wrote:
> I am writing a lexer and parser for a small language I have created.
> The first part splitting some input into tokens was easy to write using
> "yield" (20 lines or so). I was actually a little overwhelmed how easy
> it was. The next part is to examine these tokens and parse them into
> language structures. I want to create a class with "current" and "next"
> methods to get the current token and fetch the next (advance the
> pointer). I have solved it by using the first lexer yielding tokens and
> collecting them in an array. Afterwards I can fetch tokens from the
> array. But could it be done in a smarter way?

You're not providing much context. I am assuming that you want the
current/next approach because your parser will pull tokens, presumably
because you're using recursive descent or another top-down parsing
method.

If that's what you are doing, and you want to stick with that (as
opposed to switching to a bottom-up parser), then you're dealing with a
classic "inversion of control" problem.

I don't really think having the first lexer yield tokens buys you
much over just making the parser call methods in the lexer to tokenize
and return the tokens as a normal method call. In other words, if
you're using a top-down parsing method, you really want to consider
making your parser pull tokens from the lexer, instead of having the
lexer push tokens to the parser, which is what you are doing when you
use yield.

However, if you want to stick to using yield, you can use "Generator"
(see http://ruby-doc.org/core/classes/Gene...) to invert the
control and let you "pull" tokens from your yield'ing lexer without
having to go via an array.
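
A sketch of that inversion follows. Note the swap: "Generator" shipped with Ruby 1.8, and this example instead uses the built-in Enumerator (external iteration with next and peek), which covers the same ground on current Ruby versions. YieldingLexer and TokenStream are hypothetical names, and splitting on whitespace is only a placeholder for the real tokenizer.

```ruby
# Stand-in for the lexer in the question: it pushes tokens with yield.
class YieldingLexer
  def initialize(input)
    @input = input
  end

  def each
    # Without a block, hand back an Enumerator over this same method.
    return to_enum(:each) unless block_given?
    @input.split.each { |token| yield token }   # placeholder tokenizing
  end
end

# Wraps the yielding lexer so the parser can pull tokens one at a time.
class TokenStream
  def initialize(lexer)
    @tokens = lexer.each   # an Enumerator; no array of tokens is built
  end

  # The current token without consuming it, or nil at end of input.
  def current
    @tokens.peek
  rescue StopIteration
    nil
  end

  # Consumes the current token and returns the new current one (or nil).
  def next
    @tokens.next
    current
  rescue StopIteration
    nil
  end
end

stream = TokenStream.new(YieldingLexer.new("x = 1"))
stream.current  # => "x"
stream.next     # => "="
```

The lexer keeps its yield-based each unchanged; only the wrapper decides when the next token is requested.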

Vidar

Eric Hodel

11/29/2006 1:17:00 AM

On Nov 28, 2006, at 14:20, Bjarke Walling wrote:
> I am writing a lexer and parser for a small language I have created.
> The first part splitting some input into tokens was easy to write using
> "yield" (20 lines or so). I was actually a little overwhelmed how easy
> it was. The next part is to examine these tokens and parse them into
> language structures. I want to create a class with "current" and "next"
> methods to get the current token and fetch the next (advance the
> pointer). I have solved it by using the first lexer yielding tokens and
> collecting them in an array. Afterwards I can fetch tokens from the
> array. But could it be done in a smarter way?
>
> My code is like this:
>
> class FirstParse

include Enumerable

> def initialize
> ...
> end
> def each
> ... yield tokens ...
> end
> end
>
> class SecondParse
> def initialize
def initialize(tokens)
> @index = 0
@tokens = tokens
> end
> def current
> @tokens[@index]
> end
> def next
> @index += 1
> self.current
> end
> def read_structure1
> ... read structures ...
> end
> def read_structure2
> ... read structures ...
> end
> end

parser = SecondParse.new FirstParse.new.to_a

Passing in an Array of tokens makes it easier to test, too.
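
As a rough illustration of the testing point, here is a sketch that feeds a hand-written token array straight to the parser, with no lexer involved. The class body is a minimal assumed version of SecondParse using the constructor shown above, and the token values are made up for the example.

```ruby
# Minimal assumed version of SecondParse that takes its tokens as an argument.
class SecondParse
  def initialize(tokens)
    @tokens = tokens
    @index = 0
  end

  # The token at the current position (nil past the end).
  def current
    @tokens[@index]
  end

  # Advance the pointer and return the new current token.
  def next
    @index += 1
    current
  end
end

# Made-up tokens; a real test would use whatever the lexer actually emits.
parser = SecondParse.new(%w[let x = 1])
raise "unexpected first token" unless parser.current == "let"
raise "unexpected second token" unless parser.next == "x"
```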

--
Eric Hodel - drbrain@segment7.net - http://blog.se...

I LIT YOUR GEM ON FIRE!

Bjarke Walling

11/29/2006 1:12:00 PM

Thank you for all your replies!

The real problem is that I am not that skilled in writing a parser, I
think.

It might be the "inversion of control" problem I experience. The
problem is that my FirstParse class pushes tokens using "yield", and in
my SecondParse I want to pull tokens, decide upon them, and pull the
next when I'm ready for it. I don't know how to write SecondParse
another way without the code becoming too complex, but I have a book
on grammars and languages. I read the parts on regular expressions and
on Turing machines, but skipped the part on grammars. I have to read at
least some of it :-)

But for now I solve it by loading the tokens directly into an array in
the first parse.

- Bjarke Walling

comp.lang.ruby