Asp Forum - Ruby 1.9 compatibility and performance

DJ Cole

5/15/2008 6:17:00 PM

I have pretty much committed to Ruby for several projects - and two
things make me nervous:
1) Ruby seems to be changing fundamental things - like Strings - in 1.9.
The changes are incompatible and seem fundamental (e.g. enumeration),
which seems peculiar given the late date in the evolution of Ruby. Why
make these changes now?
(At least, trying to mix a handful of needed new modules from the top of
the tree with 1.8 obviously doesn't work - the core classes have diverged
in a fundamental way.)
2) Performance - at least, a few modules, like the default REXML parser.
This thing took about 10 minutes to parse a simple 2 MB file just once.
It's unusable. I had to switch to libxml. Is this a Ruby artifact (e.g.
fundamentals like regular expressions just aren't up to snuff) or just a
bad module?

Is there a complete roadmap (I have had trouble finding one) that can
help me decide if it's too early to use Ruby?
--
Posted via http://www.ruby-....
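
For readers unfamiliar with the String changes raised in point 1, here is a minimal sketch of two 1.9 differences that commonly break 1.8 code, written so that it behaves the same on both versions. It uses only core String methods; the helper names (first_char, lines_of, enumerable_string?) and the sample text are made up for illustration.

```ruby
# 1.8: "abc"[0] returned the Integer 97; 1.9 returns the String "a".
# The two-argument form returns a one-character String on both versions.
def first_char(str)
  str[0, 1]
end

# 1.8: String mixed in Enumerable and String#each walked lines.
# 1.9: String is no longer Enumerable and String#each is gone,
# so name the unit explicitly with each_line (available on both).
def lines_of(str)
  result = []
  str.each_line { |line| result << line.chomp }
  result
end

# Feature-test rather than compare version strings when code must run on both.
def enumerable_string?
  "".is_a?(Enumerable)
end

sample = "first line\nsecond line\n"
puts first_char(sample)
puts lines_of(sample).inspect
puts enumerable_string?
```

Code written in this style should not care which of the two String models it runs under, which is one way to keep a 1.8 project ready for a later move.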

7 Answers

Phillip Gawlowski

5/15/2008 8:13:00 PM

DJ Cole wrote:
| I have pretty much committed to Ruby for several projects - and two
| things make me nervous:
| 1) Ruby seems to be changing fundamental things - like Strings - in 1.9.
| The changes are incompatible and seem fundamental (e.g. enumeration),
| which seems peculiar given the late date in the evolution of Ruby. Why
| make these changes now?

Ruby 1.9 is a testbed of sorts for the eventual Ruby 2.0, and not
considered production ready in any way. For that, stick to the Ruby
1.8.6 branch (1.8.7 backported some possibly breaking changes, search
the archives of the list for the 1.8.7 preview announcements).

| 2) Performance - at least, a few modules, like the default REXML parser.
| This thing took about 10 minutes to parse a simple 2 MB file just once.
| It's unusable. I had to switch to libxml. Is this a Ruby artifact (e.g.
| fundamentals like regular expressions just aren't up to snuff) or just a
| bad module?

Probably a Ruby artifact. There is a libxml interface for Ruby, and I
guess there is something similar for other heavy lifting type of stuff.

However, I've heard that narray is rather fast, while still written in
pure Ruby, so it is possible to write computationally heavy code in Ruby
that performs well. At least, so I've heard, as I haven't used narray.

In general, Ruby isn't a speed demon, unless you talk about time spent
developing an app or script. ;)

--
Phillip Gawlowski
Twitter: twitter.com/cynicalryan
Blog: http://justarubyist.bl...

10 years old is a good age to get stuck at.

Rick DeNatale

5/15/2008 8:32:00 PM

On Thu, May 15, 2008 at 4:13 PM, Phillip Gawlowski
<cmdjackryan@googlemail.com> wrote:

> Ruby 1.9 is a testbed of sorts for the eventual Ruby 2.0, and not
> considered production ready in any way.
Actually, not quite true.

It used to be the case that an x.y version number for Ruby where y was
odd was considered experimental, and an even minor version number
indicated a production version.

But late last year Matz announced that he was afraid of running out of
digits, and that 1.9 WOULD eventually be considered production ready,
and that a teeny version number >= 1 would indicate this.

The version released last Christmas was, I believe, initially intended
to be Ruby 1.9.1, but late in the game it was decided that it wasn't
quite ready, so it became 1.9.0.

As far as I know it's still the plan to come out with a stable 1.9.1
at some point, at which time the new 2.0 stream would start in
parallel.

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denh...

John

5/16/2008 6:34:00 AM

I think you should see the Google talk with Matz that touches on this
subject:

http://www.youtube.com/watch?v=o...

Ryan Davis

5/18/2008 4:06:00 AM

On May 15, 2008, at 11:16 , DJ Cole wrote:

> 2) Performance - at least, a few modules, like the default REXML
> parser.
> This thing took about 10 minutes to parse a simple 2 MB file just
> once.
> It's unusable. I had to switch to libxml. Is this a Ruby artifact
> (e.g.
> fundamentals like regular expressions just aren't up to snuff) or
> just a
> bad module?

It is certainly slow and bloated if you do a full parse. If you switch
to streaming it can be quite fast. Some of my students insisted on
using it to parse my iTunes XML db (7.9 MB) and actually got fairly
good/usable times (not good enough to beat my regex version--but good
enough to use).
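
As a rough illustration of the streaming approach mentioned above, the sketch below uses REXML's stream listener interface instead of building a full document tree. The KeyCollector class, the collect_keys helper, and the sample XML are hypothetical; this is not the code the students wrote, only one way such a parse might look.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: counts elements and collects the text of every <key>.
# No document tree is built; REXML calls these hooks as it reads the input.
class KeyCollector
  include REXML::StreamListener

  attr_reader :element_count, :keys

  def initialize
    @element_count = 0
    @keys = []
    @in_key = false
  end

  def tag_start(name, _attrs)
    @element_count += 1
    @in_key = (name == "key")
  end

  def text(data)
    @keys << data if @in_key
  end

  def tag_end(_name)
    @in_key = false
  end
end

# Made-up sample in the shape of a plist fragment.
SAMPLE_XML = <<XML
<dict>
  <key>Name</key><string>Some Song</string>
  <key>Play Count</key><integer>12</integer>
</dict>
XML

def collect_keys(source)
  listener = KeyCollector.new
  REXML::Document.parse_stream(source, listener)
  listener
end

result = collect_keys(SAMPLE_XML)
puts result.keys.inspect
puts result.element_count
```

For a real file, an open File object can be passed as the source in place of the string, so the whole document never has to sit in memory as a tree.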

James Gray

5/19/2008 12:03:00 PM

On May 17, 2008, at 11:05 PM, Ryan Davis wrote:

> On May 15, 2008, at 11:16 , DJ Cole wrote:
>
>> 2) Performance - at least, a few modules, like the default REXML
>> parser.
>> This thing took about 10 minutes to parse a simple 2 MB file just
>> once.
>> It's unusable. I had to switch to libxml. Is this a Ruby artifact
>> (e.g.
>> fundamentals like regular expressions just aren't up to snuff) or
>> just a
>> bad module?
>
> It is certainly slow and bloated if you do a full parse. If you
> switch to streaming it can be quite fast. Some of my students
> insisted on using it to parse my iTunes XML db (7.9 MB) and actually
> got fairly good/usable times (not good enough to beat my regex
> version--but good enough to use).

Interesting. Can you talk a little about why you prefer using regexen
to a parser in this instance?

I'm not trying to question your decision. I'm just curious about your
reasoning here.

James Edward Gray II

Ryan Davis

5/19/2008 7:21:00 PM

On May 19, 2008, at 05:02 , James Gray wrote:

> Interesting. Can you talk a little about why you prefer using
> regexen to a parser in this instance?
>
> I'm not trying to question your decision. I'm just curious about
> your reasoning here.

1) damn fast.
2) the xml from apple's plist format is so regular that a generic xml
parser is way overkill.
3) more readable
4) damn fast.
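
To make the regex argument concrete, here is a minimal sketch of pulling key/value pairs out of plist-style XML with a single pattern. It is an assumption about what such code could look like, not the actual version referred to above; the PLIST_PAIR pattern, the plist_pairs helper, and the sample track are made up, and only `<string>` and `<integer>` values are handled.

```ruby
# Matches <key>K</key> followed by a <string> or <integer> value element.
# \2 requires the closing tag to match the opening value tag.
PLIST_PAIR = %r{<key>([^<]+)</key>\s*<(string|integer)>([^<]*)</\2>}

# Returns [key, value] pairs, converting <integer> values to Integer.
def plist_pairs(xml)
  xml.scan(PLIST_PAIR).map do |key, type, value|
    [key, type == "integer" ? value.to_i : value]
  end
end

# Made-up sample in the shape of one iTunes library track entry.
SAMPLE_TRACK = <<XML
<dict>
  <key>Track ID</key><integer>1042</integer>
  <key>Name</key><string>Some Song</string>
  <key>Artist</key><string>Some Band</string>
</dict>
XML

plist_pairs(SAMPLE_TRACK).each { |key, value| puts "#{key}: #{value.inspect}" }
```

This only works because the input is machine-generated and regular. It does not decode entities such as `&amp;`, and it ignores dates, booleans, and nested dictionaries.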

James Gray

5/19/2008 7:34:00 PM

On May 19, 2008, at 2:21 PM, Ryan Davis wrote:

> On May 19, 2008, at 05:02 , James Gray wrote:
>
>> Interesting. Can you talk a little about why you prefer using
>> regexen to a parser in this instance?
>>
>> I'm not trying to question your decision. I'm just curious about
>> your reasoning here.
>
> 1) damn fast.
> 2) the xml from apple's plist format is so regular that a generic
> xml parser is way overkill.
> 3) more readable
> 4) damn fast.

Yeah, I think those are great points.

I would add that XPath is practically useless on a plist file.

I'm not sure many people would allow the readable argument for a regex
approach, but I dig it. At a recent local conference, I called out a
two-character regular expression from my place in the audience, to
help a presenter verify what he wanted. 30 programmers then proceeded
to debate my expression for the next few minutes. It was sad.

Obviously, if you can't count on the content, the parser is probably
the right choice. But if you can, why not cheat? I like it.

James Edward Gray II

comp.lang.ruby