vikkous
4/23/2005 9:36:00 PM
A lexer, or tokenizer (they mean the same thing) divides an input
source language into words. It also removes comments and finds the
boundaries of strings. Once this is done, it's much easier to correctly
process the language in a pre-processor or parser. Here's an example.
Given this ruby code:
8+(9 *5)
a correct lexing is something like:
["8","+","(","9","*","5",")"]
(For lexing purposes, punctuation and operators count as strings as
well.)
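To make the idea concrete, here's a toy sketch of what a lexer does (this is just an illustration, nothing like RubyLexer's real implementation): match numbers, operators, and punctuation, and throw away the whitespace.

```ruby
# Toy lexer sketch: split input into number, operator, and punctuation
# tokens, discarding whitespace. Real lexers handle far more (strings,
# comments, identifiers, ...), but the shape of the job is the same.
def toy_lex(src)
  src.scan(%r{\d+|[+\-*/()]|\s+}).reject { |tok| tok =~ /\A\s+\z/ }
end

toy_lex("8+(9 *5)")  # => ["8", "+", "(", "9", "*", "5", ")"]
```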
The output of RubyLexer is actually more complicated than that... for
one thing, there are tokens for whitespace as well. For another, the
individual tokens are not Strings, but Tokens (or subclasses of it, to
be precise), a class defined in RubyLexer. Tokens do respond to to_s in
the expected way, however. (Initially, I did want to have RubyLexer
just return Strings, but it turned out I needed to distinguish
different token types, and the best way to do that is with the type
system.)
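The idea looks roughly like this (the class names here are made up for illustration; they're not RubyLexer's actual API): each token type gets its own subclass, so you can tell tokens apart by class while still getting the text back via to_s.

```ruby
# Hypothetical sketch of distinguishing token types via the type system.
class ToyToken
  def initialize(text)
    @text = text
  end

  # Every token renders back to its source text.
  def to_s
    @text
  end
end

# Subclasses carry the type information; no extra "kind" field needed.
class ToyNumberToken < ToyToken; end
class ToyWhitespaceToken < ToyToken; end

tok = ToyNumberToken.new("8")
tok.to_s              # => "8"
tok.is_a?(ToyToken)   # => true
tok.is_a?(ToyWhitespaceToken)  # => false
```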
ParseTree is a parser, not a lexer. Parsing is the next step in a
compiler pipeline; it determines what order to evaluate the operations
in an expression and solves the difficult problems of precedence and
associativity. (Another way to think of parsers is as the bit that
figures out where the implicit parentheses are inserted into the source
code.) I think that the tool corresponding to RubyLexer is Ripper, but
I don't really know, so don't blame me if I'm wrong.
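Here's a toy sketch of that "inserting the implicit parentheses" view (my own illustration, not how ParseTree or Ripper work): a tiny precedence-climbing parser over an already-lexed token list, which just writes the parentheses out explicitly.

```ruby
# Toy precedence climbing: * and / bind tighter than + and -.
PREC = { "+" => 1, "-" => 1, "*" => 2, "/" => 2 }

# A primary is a number, or a parenthesized subexpression.
def parse_primary(tokens)
  tok = tokens.shift
  if tok == "("
    inner = parse_expr(tokens, 0)
    tokens.shift  # consume the ")"
    inner
  else
    tok
  end
end

# Consume operators at or above min_prec, grouping tighter ones first.
def parse_expr(tokens, min_prec)
  lhs = parse_primary(tokens)
  while (op = tokens.first) && PREC[op] && PREC[op] >= min_prec
    tokens.shift
    rhs = parse_expr(tokens, PREC[op] + 1)
    lhs = "(#{lhs} #{op} #{rhs})"
  end
  lhs
end

parse_expr(["8", "+", "9", "*", "5"], 0)  # => "(8 + (9 * 5))"
```

Note how the parser, not the lexer, is what decides that the 9 and 5 group together before the 8 gets added in.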
I have lots of plans, of course, but being only one little programmer
with lots of big ideas, who knows if I'll ever get to them...