Asp Forum - Project suggestion: Ruby code indenter

Gavin Sinclair

10/10/2003 4:13:00 PM

From the thread "Extension Language for a Text Editor":

>> So basically, if you like a modal editor or not, go for editor of form
>> of your choice. I've seriously considered switching to vim simply
>> because of the ruby support. (A few things have kept me from doing
>> this, but they were merely technical issues.)

> You're welcome to come aboard ;-). I'm the maintainer of the Vim indent
> script, together with Gavin Sinclair. It is, in my opinion, better than
> the one that comes with Emacs, even though I guess Matz wrote that one
> ;-).

I was meaning to mention this anyway, but now I can't resist. I think
a great project for someone to work on - someone who really really
wants to work on a project but isn't sure what :) - is a Ruby code
indenter.

Input:
Ruby code

Output:
Properly indented Ruby code, perhaps accounting for user preferences

Motivation:
Ruby is a hard language to programmatically indent, for reasons that
will become obvious if this thread goes anywhere. Attempts to
provide support for this in Vim and Emacs are progressing, but are
hampered by languages which are not really suited to the task
(please prove me wrong).

If a general-purpose program were provided, it would offer a
solution to any editor and for standalone use, as well as inspiring
greater agility in the existing editor plugins. It would not render
such plugins obsolete, rather provide a backup for the tasks they do
not easily do (indent entire file, accounting for prefs, comments,
here-docs, etc.).

Comments:
A Ruby implementation could take advantage of irb code, just like
RDoc does. Understanding Ruby code, as opposed to reading a text
stream, makes indentation much easier.

There's no way I have time to work on this; just throwing it out
there in case it catches someone's fancy.

BTW...

>> Because [Emacs is] general, people have written lots of stuff, some
>> of which is quite silly (tetris, web browser, etc.),

...on the rare occasions I play Tetris, it's as a Vim plugin :)
Search www.vim.org if you're interested.

Gavin

5 Answers

Nikolai Weibull

10/10/2003 5:25:00 PM

* Gavin Sinclair <gsinclair@soyabean.com.au> [Oct, 10 2003 18:20]:
> I was meaning to mention this anyway, but now I can't resist. I think
> a great project for someone to work on - someone who really really
> wants to work on a project but isn't sure what :) - is a Ruby code
> indenter.
>
> Input:
> Ruby code
>
> Output:
> Properly indented Ruby code, perhaps accounting for user preferences
>
> Motivation:
> Ruby is a hard language to programmatically indent, for reasons that
> will become obvious if this thread goes anywhere. Attempts to
> provide support for this in Vim and Emacs are progressing, but are
> hampered by languages which are not really suited to the task
> (please prove me wrong).
I'd love to prove you wrong. I have, however, like you, discovered that
it is a bitch to indent Ruby programmatically. Its syntax is simply too
general. So many tokens are overloaded that it's hard to get every case
right while staying compatible with the other cases. For every case you
fix, you have to check that it doesn't affect any of the other ones.

Anyway, it would be an interesting project. If I'm any judge, Perl 6
would make this very much easier to do. However, it should be generally
possible in any language. I'd assume Ruby would fit the task quite well,
actually. The hard part is, of course, keeping track of all the cases.
However, it is quite well specified what may exist where, and in many
ways it is also easier to manage than a language such as C. The
coding standards of Ruby are also quite well defined, and almost
everyone seems to stick to them rather passionately, which makes
things easier. I can't promise that I'll take a look at this personally,
since I'll be rather busy with other things in the near future. I will,
however, try to improve the Vim indenter to the best of my ability.

By the way, if you read this and you use Vim, please check out the
Vim/Ruby project at
http://rubyforge.org/projects...
and try out all the latest features. Much work has been done since the
6.2 release, and it needs a good test-run.
>
> If a general-purpose program were provided, it would offer a
> solution to any editor and for standalone use, as well as inspiring
> greater agility in the existing editor plugins. It would not render
> such plugins obsolete, rather provide a backup for the tasks they do
> not easily do (indent entire file, accounting for prefs, comments,
> here-docs, etc.).
Like indent(1), you mean? I rarely run indent, but if I were ever to
alter other people's code, I'd probably run it through indent(1) before
running it through Vim's indenter.
>
> Comments:
> A Ruby implementation could take advantage of irb code, just like
> RDoc does. Understanding Ruby code, as opposed to reading a text
> stream, makes indentation much easier.
>
> There's no way I have time to work on this; just throwing it out
> there in case it catches someone's fancy.
>
> >> Because [Emacs is] general, people have written lots of stuff, some
> >> of which is quite silly (tetris, web browser, etc.),
>
> ...on the rare occasions I play Tetris, it's as a Vim plugin :)
> Search www.vim.org if you're interested.
I like the one that comes with Zsh better :-D,
nikolai

--
::: name: Nikolai Weibull :: aliases: pcp / lone-star / aka :::
::: born: Chicago, IL USA :: loc atm: Gothenburg, Sweden :::
::: page: www.pcppopper.org :: fun atm: gf,lps,ruby,lisp,war3 :::
main(){printf(&linux["\021%six\012\0"],(linux)["have"]+"fun"-97);}

Gavin Sinclair

10/10/2003 11:41:00 PM

On Saturday, October 11, 2003, 3:25:17 AM, Nikolai wrote:

> By the way, if you read this and you use Vim, please check out the
> Vim/Ruby project at
> http://rubyforge.org/projects...
> and try out all the latest features. Much work has been done since the
> 6.2 release, and it needs a good test-run.

I'll just note for those interested that "the latest features" are
only available via CVS at the moment. A "devel" release will be out
shortly.

>> If a general-purpose program were provided, it would offer a
>> solution to any editor and for standalone use, as well as inspiring
>> greater agility in the existing editor plugins. It would not render
>> such plugins obsolete, rather provide a backup for the tasks they do
>> not easily do (indent entire file, accounting for prefs, comments,
>> here-docs, etc.).

> like indent(1) you mean? I rarely run indent, but if I was ever to
> alter other people's code, I'd probably run it through indent(1) before
> running it through Vim's.

Precisely like indent. Say it were called 'rindent', then from within
Vim (or any editor; that's the point) you can run

:%!rindent

and have it done nicely. Obviously you're still going to use your
editor's indenting features as you type and want to correct small
blocks.
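
As a rough illustration of the filter idea, here is a minimal sketch.
The name 'rindent', the method signature, and the three patterns are all
hypothetical; this is not an existing tool. It is deliberately naive and
only looks at how each line starts or ends:

```ruby
# Hypothetical sketch of an 'rindent'-style filter; not an existing tool.
# Naive on purpose: it only inspects the start and end of each line.
OPENERS = /\A(def|class|module|if|unless|while|until|case|begin|for)\b|\bdo(\s*\|[^|]*\|)?\s*\z/
MIDDLES = /\A(else|elsif|when|rescue|ensure)\b/
CLOSER  = /\Aend\b/

def rindent(source, width = 2)
  depth = 0
  out = source.each_line.map do |line|
    text = line.strip
    # A leading 'end' closes the current block before the line is printed.
    depth -= 1 if text =~ CLOSER && depth > 0
    # 'else', 'when', etc. are printed one level out from the block body.
    shown = text =~ MIDDLES ? [depth - 1, 0].max : depth
    # A line that opens a block and also ends with 'end' is a one-liner.
    depth += 1 if text =~ OPENERS && text !~ /\bend\z/
    text.empty? ? "" : (" " * (shown * width)) + text
  end
  out.join("\n") + "\n"
end
```

Adding a last line such as `print rindent($stdin.read)` would make it
usable as a stdin-to-stdout filter. Being line-based, it gets cases like
`x = if cond` wrong (the `if` is not at the start of the line, so the
matching `end` is miscounted), which is exactly the kind of thing that
makes the real problem hard.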

Also, Nikolai, I thought this would be perfect for you, as you have
already done it in VimL :-* and are gearing up to do it in pcpEdit in
Ruby ;)

Gavin

Nikolai Weibull

10/11/2003 12:06:00 AM

* Gavin Sinclair <gsinclair@soyabean.com.au> [Oct, 11 2003 01:50]:
>
[me asking if it would be like indent(1)]
>
> Precisely like indent. Say it were called 'rindent', then from within
> Vim (or any editor; that's the point) you can run
>
> :%!rindent
>
> and have it done nicely. Obviously you're still going to use your
> editor's indenting features as you type and want to correct small
> blocks.
>
OK. The good thing with Ruby, over C, for this kind of thing is that
most people seem to keep to a rather similar way of 'type-setting' their
programs. We could perhaps use this to our advantage somehow.
>
> Also, Nikolai, I thought this would be perfect for you, as you have
> already done it in VimL :-* and are gearing up to do it in pcpEdit in
> Ruby ;)
>
Haha, OK. I'll see what I can do. I've always wondered if it would be
possible to do this kind of thing with yacc/racc or something similar.
pcpEdit, heh. That will not be the official name ;-). I'm thinking of
'ned', for Nikolai EDitor, or simply the name Ned (as in Flanders) in
tribute to editors such as Sam, Wily, and family. Other, more
silly/stupid names were scamacs (emacs spelled backwards prepended to
emacs, with e's removed) and scam-e (emacs spelled backwards). And
also, I haven't decided on Ruby yet, but yes, it will probably be Ruby
actually. I think it can work rather well.
nikolai


Aaron Son

10/11/2003 1:20:00 AM

On 2003-10-11, Nikolai Weibull <ruby-talk@pcppopper.org> wrote:
> * Gavin Sinclair <gsinclair@soyabean.com.au> [Oct, 11 2003 01:50]:
>>
> [me asking if it would be like indent(1)]
>>
>> Precisely like indent. Say it were called 'rindent', then from
>> within Vim (or any editor; that's the point) you can run
>>
>> :%!rindent
>>
>> and have it done nicely. Obviously you're still going to use your
>> editor's indenting features as you type and want to correct small
>> blocks.
>>
> OK. The good thing with Ruby, over C, for this kind of thing is that
> most people seem to keep to a rather similar way of 'type-setting'
> their programs. We could perhaps use this to our advantage somehow.
>>
>> Also, Nikolai, I thought this would be perfect for you, as you have
>> already done it in VimL :-* and are gearing up to do it in pcpEdit in
>> Ruby ;)
>>
> Haha, OK. I'll see what I can do. I've always wondered if it would
> be possible to do this kind of thing with a yacc/racc or such similar.
> pcpEdit heh. That will not be the official name ;-). I'm thinking of
> 'ned', for Nikolai EDitor, or simply the name Ned (as in Flanders) in
> tribute of editors such as Sam, Wily, and family. Other, more
> silly/stupid names were scamacs (emacs spelled backwards prepended to
> emacs, with e's removed) and scam-e (emacs spelled backwards). And
> also, I haven't decided on Ruby yet, but yes, it will probably be Ruby
> actually. I think it can work rather well.

You mention yacc/racc, and I was curious about your opinions on an
indent-like program and the best way to approach it.

Because an editor indents in real time, it works with only a subset of
the file, which makes the calculation inherently error-prone. The more
accuracy we want from the indentation heuristics, the more complex the
scripts responsible for that indentation become.

One way to approach the problem when writing an external program that
re-indents a file would be to parse the file into a kind of verbose
abstract syntax tree and then write the tree back out using
straightforward rules regarding indentation and white space. This has
clear advantages and disadvantages, as well as consequences which I'm
probably overlooking. The major advantage that I see is that the
resulting file could be almost perfect, given a parser for the complete
grammar. One of the disadvantages would be that things like same-line
comments would probably get converted to full-line comments, or vice
versa, more often than desirable.

I'm also not sure about the relative performance of the two methods. On
one hand, parsing the entire file into a syntax tree is processor
intensive and requires memory for the tree (although the file could be
parsed incrementally, I suppose, writing out nodes as soon as they're
"closed", meaning they would no longer affect the indentation of
elements to come). On the other hand, matching against a large set of
regular expressions requires running the buffer of text through
multiple regexes, etc.
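
A possible middle ground between the two methods is to run the
heuristics over lexer tokens instead of raw text. The following is only
a sketch under assumptions: it relies on the Ripper lexer that newer
Ruby versions ship in the standard library, and the method name and the
two keyword lists are hypothetical choices made for illustration.

```ruby
require 'ripper'

# Hypothetical keyword lists, chosen for illustration only.
ALWAYS_OPEN   = %w[def class module case begin do for].freeze
OPEN_IF_FIRST = %w[if unless while until].freeze

# Returns a hash mapping line number => net change in block depth,
# computed from keyword tokens rather than from the raw text.
def depth_changes(source)
  changes = Hash.new(0)
  seen_code = {}
  Ripper.lex(source).each do |(line, _col), type, text|
    # Whitespace, comments and newlines never count as code.
    next if %i[on_sp on_comment on_nl on_ignored_nl].include?(type)
    first = !seen_code[line]
    seen_code[line] = true
    next unless type == :on_kw
    # Heuristic: 'if' and friends open a block only when they are the
    # first token on the line; otherwise treat them as modifiers.
    if ALWAYS_OPEN.include?(text) || (first && OPEN_IF_FIRST.include?(text))
      changes[line] += 1
    elsif text == "end"
      changes[line] -= 1
    end
  end
  changes
end
```

Because here-doc bodies and comments arrive as their own token types, a
line of here-doc text that happens to start with "if" is not mistaken
for a block opener. It is still a heuristic, though: `x = if cond` is
missed because the `if` is not the first token on its line.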

Personally, I think grammars and parsers are pretty fun/neat, so writing
an indent-like program using them would probably be more interesting
than writing one using a sequence of regular expressions similar to
writing a syntax file. What's the normal way of doing this (i.e. how
are indent and astyle implemented) and what do you think would be the
best? Any advantages or disadvantages of the methods that I'm not
seeing?

--Aaron

Nikolai Weibull

10/11/2003 10:53:00 AM

* Aaron Son <aaronson@uiuc.edu> [Oct, 11 2003 03:30]:
>
> You mention yacc/racc, and I was curious as to your opinions on the
> subject of an indent like program and the best way to approach it.
>
> Due to the limitations of editors and the like, real-time indentation
> calculation is inherently error prone because we're working with a
> subset of the file and the more accuracy we want in the heuristics of the
> indentation, the more complex our scripts which are responsible for said
> indentation become.
>
Yes, this is the main problem we face. With limited context, we can
only get limited accuracy.
>
> One way to approach the problem when writing an external program which
> is responsible for re-indenting a file would be to parse the file into a
> kind of verbose abstract syntax tree and then write the tree back out
> using straight forward rules regarding indentation and white space.
>
Yes, this is precisely the idea I had for it. I don't know if it's
possible to get right though.
>
> This has straight-forward advantages and disadvantages, as well as
> consequences which I'm probably overlooking. The major advantage that I
> see is that the resulting file could be almost perfect given that we had
> a parser for the complete grammar. One of the disadvantages would be
> that things like same-line comments would probably get converted to
> full-line comments or vice-versa more often than desirable.
>
Yes, this may be a problem. The more information about the file you
store in the 'verbose abstract syntax tree' though, the more you can
keep the old structure as well.
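
A minimal sketch of recording that kind of information, assuming a Ruby
that ships the Ripper lexer in its standard library (it postdates this
thread). `classify_comments` is a hypothetical helper name; it only
notes, for each comment, whether it shares its line with code, so that a
printer could put it back in the same position:

```ruby
require 'ripper'

# For every comment, report [line, :same_line or :full_line, text].
def classify_comments(source)
  code_lines = {}
  comments = []
  Ripper.lex(source).each do |(line, col), type, text|
    case type
    when :on_comment
      comments << [line, col, text.chomp]
    when :on_sp, :on_nl, :on_ignored_nl
      next # whitespace and newlines do not make a line a code line
    else
      code_lines[line] = true
    end
  end
  comments.map do |line, _col, text|
    [line, code_lines[line] ? :same_line : :full_line, text]
  end
end
```

For example, `classify_comments("x = 1 # trailing\n# full\ny = 2\n")`
marks the first comment as `:same_line` and the second as `:full_line`.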
>
> I'm also not sure about the relative performances of the two
> methods...on one hand parsing the entire file into a syntax tree is
> processor intensive and requires memory space for the tree (although
> the file could be parsed incrementally I suppose, writing out the
> nodes that we're currently at as long as they're "closed", meaning
> they would no longer affect the indentation of elements to come),
> whereas parsing regarding a large set of regular expressions requires
> running the buffer of text through multiple regexes, etc.
>
This is probably not a problem. Source files are generally not very
large.
>
> Personally, I think grammars and parsers are pretty fun/neat, so writing
> an indent-like program using them would probably be more interesting
> than writing one using a sequence of regular expressions similar to
> writing a syntax file. What's the normal way of doing this (i.e. how
> are indent and astyle implemented) and what do you think would be the
> best? Any advantages or disadvantages of the methods that I'm not
> seeing?
>
indent(1) works by lexing the C file and basically applying heuristic
rules to it. I don't know about astyle. The main advantage is that it
works rather well ;-). The main disadvantage is that it is only
heuristic. It's not necessarily correct (or, as the indent(1) manual
states, "it is not guaranteed that running indent on the same file will
generate the same output every time"),
nikolai


comp.lang.ruby
