Asp Forum - Two Advanced Ruby Performance Questions

Sunny Hirai

11/26/2006 9:11:00 PM

First, I am a Ruby newbie but am an experienced developer of highly
scalable applications.

I like the Ruby community because it is very friendly and helpful;
however, I wanted to give my background because the answer to this
question (despite being a newbie to Ruby) won't be your typical 80/20
optimize when you need to, watch your database first, or bandwidth is
the limiting factor type question.

Our current application runs on highly optimized ColdFusion (sub-100ms
page response times) and has about a dozen application servers attached
to it, with probably about a dozen more supporting servers. We have dual
load balancers, dual firewalls, RAIDed dbs on multiple servers, and a
100 Mbps connection (likely to be upgraded). We anticipate this new app
to run on up to 100 application servers eventually, but this is
obviously dependent on app server and code performance.

I know to optimize when it's important but I am concerned about the
overhead of a framework I want to place on top of Ruby. The Rails
framework, unfortunately, is too limiting for the application we are
planning (at least the VC parts of MVC), and we prefer the control and
performance understanding of having built most of the framework
ourselves anyway.

Thank you in advance for listening. If I can get the right answers to
these questions, we would like to launch possibly one of the more highly
scaled Ruby web applications. I find Ruby a highly desirable
language to use but I find very little documentation or discussion on
how things work underneath.

QUESTION 1

Is there a way (or does it do this already) for the Classes of the
application to be cached such that it doesn't add performance overhead?
Because it is such a dynamic language, my understanding is that the
classes themselves are created at run-time and DO add overhead before
any object instantiation occurs. My guess is that this would still
happen under YARV too?

In other words, can I create a scope in the web application (say a scope
that lasts the lifetime of the web server) where I can store class
definitions and/or object instances themselves and then use them to
create instances in the page request scope?

Why I want to do this is so I can define many classes without having to
worry about the overhead of having them defined at runtime for every
page request. In this way, if I don't use the classes in a page request,
they won't add any extra to the execution time.

When I write Javascript, I know that there is overhead so I have to keep
my libraries short and sweet. In ColdFusion, I have created a framework
that stores shared classes and object instances in what ColdFusion calls
an "application" scope so that classes and objects are setup only once
at application startup (my framework is a little more complex than this
but you probably get the point). Because of this, I have many libraries
and they are wide and deep. This is very helpful because I can create
many helper libraries without worrying about performance overhead.

I want to know if this is possible in Ruby.

Overall, I'm not really sure what kind of persistence and
non-persistence there is between page requests when Ruby is attached to
a web server.

Any thoughts or pointers to resources would be helpful.

QUESTION 2

I've found a lot of documentation on ERB but a lot less on eRuby. All
the documentation I have found on eRuby has it executing from the
command line or through a web server plugin, usually through Apache. Can
eRuby be called from inside Ruby to do parsing? I ask this because eRuby
seems like it would execute faster seeing it is built using C.

Using ERB is straightforward but I'd love to get the performance
benefits of using eRuby if I could; however, my framework would likely
require making calls from inside Ruby and not ONLY from .rhtml files
directly.

I'm guessing we can use ERB to generate the Ruby code, save the
generated code to a file, and then execute the generated file. This
would improve performance since the parsing step only happens once;
however, I'd still like to know if eRuby can be used this way.

FINAL COMMENTS

Sorry for the monster large post. This is incredibly important for us
and will help us decide if we want to switch to Ruby for our new
application. We have a large amount of good code in ColdFusion but as an
agile company, I can see the benefits of Ruby down the line, especially
after a couple of years. Mostly, I love the clean syntax and the overall
design of the language.

Thanks for your input and I hope (beg) that somebody can help answer
these questions.

--
Posted via http://www.ruby-....

23 Answers

M. Edward (Ed) Borasky

11/26/2006 9:48:00 PM

Sunny Hirai wrote:
> First, I am a Ruby newbie but am an experienced developer of highly
> scalable applications.
>
> I like the Ruby community because it is very friendly and helpful;
> however, I wanted to give my background because the answer to this
> question (despite being a newbie to Ruby) won't be your typical 80/20
> optimize when you need to, watch your database first, or bandwidth is
> the limiting factor type question.
>
> Our current application runs on highly optimized ColdFusion (sub-100ms
> page response times) and has about a dozen application servers attached
> to it, with probably about a dozen more supporting servers. We have dual
> load balancers, dual firewalls, RAIDed dbs on multiple servers, and a
> 100 Mbps connection (likely to be upgraded). We anticipate this new app
> to run on up to 100 application servers eventually, but this is
> obviously dependent on app server and code performance.
>
> I know to optimize when it's important but I am concerned about the
> overhead of a framework I want to place on top of Ruby. The RAILS
> framework, unfortunately, is too limiting for the application we are
> planning (at least the VC parts of MVC) and prefer the control and
> performance understanding of having built most of the framework
> ourselves anyways.
>
Have you looked at Nitro or IOWA? There are some extensive descriptions
of them in Hal Fulton's second edition of "The Ruby Way". Nitro in
particular seems to be quite flexible and may do what you want where
Rails can't.

[snip]

> FINAL COMMENTS
>
> Sorry for the monster large post. This is incredibly important for us
> and will help us decide if we want to switch to Ruby for our new
> application. We have a large amount of good code in ColdFusion but as an
> agile company, I can see the benefits of Ruby down the line, especially
> after a couple of years. Mostly, I love the clean syntax and the overall
> design of the language.
>
I don't know much about ColdFusion, but it hardly seems to me like it's
going away any time soon. I would think that, aside from the "agility"
of Ruby, the main motivator for switching from ColdFusion to Ruby would
be to reduce software license costs. :) In any event, despite your
concerns about Ruby's performance, your decision needs to be made on
economic grounds and not necessarily on technical ones. After all,
you've demonstrated a willingness to throw hardware at your existing
scalable ColdFusion applications. I would be more concerned about what
would happen when a team highly experienced with ColdFusion and making
scalable applications with it suddenly are "asked" to jump head first
into Ruby.

--
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.blo...

If God had meant for carrots to be eaten cooked, He would have given rabbits fire.

Edwin Fine

11/26/2006 11:24:00 PM

This post may be stating the obvious, but here goes anyway... I hope I
am not preaching to the choir.

First of all, the most important part of getting high performance is a
performance-oriented software and hardware architecture. Second of all,
at the code level, the selection of appropriate algorithms is crucial.
Finally comes the low-level code tuning.

Given an algorithm in Ruby, and the same algorithm in C, the algorithm
will perform better in C if the Ruby code consists mostly of primitives
(e.g. a += 1, loops with many iterations, conditionals, object creation
and destruction, and so on). If the Ruby code is really just calling
high-level underlying C library code, then there will not be that much
difference.

That being said, my experience with getting serious performance out of
dynamic languages such as Ruby has been to create extensions in C that
do the heavy lifting, and to write code in the dynamic language that
consists mainly of calls to the extensions. The wrong thing to do is to
write algorithms that make heavy use of fine-grained methods in Ruby. As
with any interpreter, the overhead of interpreting the code and calling
the method can be a significant proportion of the entire operation.

I believe that a very common performance mistake is to design using the
wrong level of granularity. This happened a lot with distributed
computing (DCOM, CORBA, Web Services) when people would create a
fine-grained remote method without considering the overhead of executing
it. I formulated a heuristic stating that if the method being called did
not take at least 10-100 times longer than the time needed to actually
call it (e.g. marshaling, network overhead, etc), it was not
coarse-grained enough and should not be a remote method.

I think this rule of thumb (with a reasonable choice of constant
multiplier, maybe 1000x) could apply to dynamic (or interpreted)
languages too. To design a high-performance interpreted application,
partition the application appropriately between native code and
interpreted code. (The trick is deciding what should be native and what
should be Ruby).
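
As a rough illustration of this heuristic, the sketch below uses the
standard Benchmark module to compare the cost of making a call with the
cost of the work behind it. The helper name coarse_enough?, the two
stand-in methods, and the default factor of 100 are assumptions made for
the example, not anything from a real project.

```ruby
require 'benchmark'

# Hypothetical helper: true when the work done by a call costs at least
# `factor` times as much as the overhead of making the call.
def coarse_enough?(work_seconds, overhead_seconds, factor = 100)
  work_seconds >= overhead_seconds * factor
end

# Does nothing, so timing it measures little more than call overhead.
def empty_call
end

# Stands in for real work performed behind a single call.
def heavy_call
  (1..10_000).reduce(:+)
end

n = 1_000
overhead = Benchmark.realtime { n.times { empty_call } } / n
work     = Benchmark.realtime { n.times { heavy_call } } / n

puts format('overhead per call: %.9f s', overhead)
puts format('work per call:     %.9f s', work)
puts "coarse enough at 100x? #{coarse_enough?(work, overhead)}"
```

The measured numbers vary by machine and Ruby version, so the ratio is
only a guide to where a boundary might sensibly sit.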

In doing so, of course, you lose some (maybe a lot) of the portability
of the application, and probably maintainability, but that often happens
when you are aiming for extreme performance anyway.

This point of view may not sit well with Rubyists who want 100% pure
Ruby solutions, but as with anything in life, there are always
trade-offs. I take a pragmatic point of view, and do what is needed
based on priorities. If I can write something in the pure language, I
do. If that doesn't make the grade, I either use another technology or
write an extension.

--
Posted via http://www.ruby-....

Vidar Hokstad

11/26/2006 11:44:00 PM

Edwin Fine wrote:
> Given an algorithm in Ruby, and the same algorithm in C, the algorithm
> will perform better in C if the Ruby code consists mostly of primitives
> (e.g. a += 1, loops with many iterations, conditionals, object creation
> and destruction, and so on). If the Ruby code is really just calling
> high-level underlying C library code, then there will not be that much
> difference.

True, but the original poster indicated he's looking at a database
backed web application. In that kind of environment, in all likelihood
most of the execution time will be dictated by database speed and
network performance.

Of course you can easily kill performance for that kind of app by
doing the wrong things too, but the odds that he'll need to resort to C
to get the performance he needs for something like that are minimal,
and in any case with the ease of writing C extensions for Ruby it would
be a severe case of premature optimization.

The original poster is looking in the right direction by looking for
options that minimize the per request overhead (i.e. avoiding parsing
the source and creating the object hierarchy per request). FastCGI for
instance would be a good starting point.

Vidar

M. Edward (Ed) Borasky

11/27/2006 2:05:00 AM

Edwin Fine wrote:
> This post may be stating the obvious, but here goes anyway... I hope I
> am not preaching to the choir.
>
> First of all, the most important part of getting high performance is a
> performance-oriented software and hardware architecture. Second of all,
> at the code level, the selection of appropriate algorithms is crucial.
> Finally comes the low-level code tuning.
>
I'm not sure this is at all "stating the obvious". Still, the OP hasn't
formally decided to switch from ColdFusion to Ruby, and is digging for
low-level details on Ruby in general and web applications in particular.
What this tells me is that 1 and 2 are already taken care of. As I noted
in my post, I think the economic considerations are more important than
the low-level details. How is a team with presumably many person-years
of accumulated experience building scalable ColdFusion applications
going to react when being asked to learn a whole new language and
framework? Is the reduced cost of an open source platform over a
commercial one enough of a motivation to put the team through that? I
don't think you're preaching to the choir. I think what I'm saying is,
"If it ain't broke, don't fix it!" :)

> In doing so, of course, you lose some (maybe a lot) of the portability
> of the application, and probably maintainability, but that often happens
> when you are aiming for extreme performance anyway.
>
In many cases portability is not a requirement. More fundamental
requirements are total cost of ownership and usability of the
application. Performance figures into both total cost of ownership and
application usability.

--
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.blo...

If God had meant for carrots to be eaten cooked, He would have given rabbits fire.

Edwin Fine

11/27/2006 2:32:00 AM

I guess I gave the right answer to the wrong question :)

I was really looking at the OP's words below when I posted:

>I know to optimize when it's important but I am concerned about the
>overhead of a framework I want to place on top of Ruby.
...
>eRuby seems like it would execute faster seeing it is built using C.

Vidar Hokstad wrote:
>True, but the original poster indicated he's looking at a database
>backed web application. In that kind of environment, in all likelihood
>most of the execution time will be dictated by database speed and
>network performance.

Very likely, but you never know. It all depends on the application. I am
working on a "database-backed web application" that happens to be an
OLAP/data mart application, and in this, response time is dictated by
*everything*, including CPU.

I do agree with what you guys posted, especially that TCO and usability
should be the ultimate decision factors, especially the thought that
reskilling experienced ColdFusion developers in Ruby could be costly.

Still, I would also be interested in knowing what facilities exist in
Ruby app servers that are similar to the Java app server world (like app
scope, session scope, page scope etc.) I don't have any experience in
Ruby app servers. An earlier poster mentioned Nitro - maybe I should
read up on that.

I also remember reading a bit about "continuation" servers, which if I
recall correctly should address some of the OP's performance concerns.

--
Posted via http://www.ruby-....

Sunny Hirai

11/27/2006 5:59:00 AM

Hi All and thanks for the responses.

Before I go on, I thought I should note that Vidar Hokstad is
understanding my particular problem the best.

Thank you for all the great responses but I would like to mention that I
have a good grasp of how we will probably need to scale our application
and the potential pitfalls. As I mentioned, we have already scaled one
application to a good scale. I am looking for specific answers to
fill in the holes in my knowledge of Ruby. I am a Ruby newbie, but I
have read three books on Ruby and a couple of them multiple times. I'm
pretty much aware of most of what I want to know about the other
parameters that you have asked me to consider.

However, in no particular order, I'd like to address these issues to
show you that I'm not looking for the "typical" answers as I mentioned
in the OP.

1. This is a NEW application so many won't be switching. I will write
the majority of it to start. It IS expensive for me to switch, but I'm
thinking about what each language will cost/save us 2-3 years down the
road. Furthermore, I wrote all of the original code in the current
application and will write all the core code for the new one. In other
words, I will bear the brunt of the complexity, and I'm currently
determining whether it is worth it. None of our developers except one
actually knew
ColdFusion from the start and were all taught ColdFusion on the job. All
except two developers (not counting me) were hired less than a year ago
in a 6 person team which means they also had to learn both a language
and a framework. This is because it is difficult to find great
ColdFusion developers. Rather, we hired smart people and taught them a
new language. We will keep many developers with the old app and some
will come over to the new app team. We are hiring two new people who
will likely not know ColdFusion or Ruby anyways. I think ColdFusion is
easier to learn but Ruby has a better syntax and I personally like its
design philosophy better (though I like that of ColdFusion too). I am
actually counting on, in the long term, a stable version of a VM, but am
willing to wait a year to get it and just pack power against it in the
shorter term.

2. As a database backed app, dropping to C is not needed for most of our
application and having it split into two languages is costly from a
development point of view so I'd like to keep the main app in one
language only; however, there are performance oriented portions like a
Photoshop like image processor that I've written in C# and .Net as a web
service. It is Mono compatible but Mono seems to choke randomly. We used
C# because of its strong underlying graphic framework, its Java like
memory protection and still the ability to drop to UNSAFE mode where the
custom filters, merging and effects code need to execute very quickly.
We also have portions of interoperability in Java. Most likely we will
also have a Lucene search engine in Java. All of these operate (or will
operate) as web services.

3. I address the overhead of web services operations constantly. Even
when working within the SAME language, I benchmark overhead and test the
cost of each operation. For example, ColdFusion has a native WDDX (XML
like) conversion format that we used to use to fold multiple fields into
a single field for caching in a database. After testing, I wrote a
custom encoding/decoding format that executes about 1-2 orders of
magnitude faster if I remember correctly. I also weigh the cost of
calling methods. In fact, one of the major reasons for wanting a switch
to Ruby is that the object instantiation cost in ColdFusion is very
high. It takes about 1ms per instantiation. This means that if I need to
instantiate an object for say each row in a 50 row query for output, I
have added a 50ms overhead to our project (in which we are aiming at
under 100ms for total page execution times). And to cover the next
response, YES, I know I shouldn't be doing this in an environment where
object instantiation is expensive so I don't do this. Instead, I have
three ways to instantiate an object which I've created in the framework,
two ways which fake it in a manner that is 2 orders of magnitude faster;
however, the syntax is ugly. It also adds to development time greatly
because I need to decide every time which model I need to use and
sometimes, the best model changes over time. I could use the fastest
method (with the ugliest code) but it just introduces a layer of
complexity and potential for bugs to the code which goes against my
instincts. I constantly weigh app performance against developer
performance. What most likely happens, however, is that I don't use OOP
at all and inline the code, but this results in poor reusability.

Okay, now, that I've (hopefully) convinced you I'm attacking the problem
at an atypical level, back to the regular program. ;)

I found some of the information I wanted not in the eRuby or ERB pages
but in the mod_ruby documentation.

It suggests that ONE instance of Ruby executes to handle all the
threads; however, it doesn't go into too much detail about how this is
handled. So, for example, if I call

require 'somelibrary'

It is actually only included into the code once.
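
The load-once behaviour of require can be checked in plain Ruby, outside
any web server. The sketch below is only an illustration: the file name
somelibrary.rb is borrowed from the line above, and the $load_count
global is a hypothetical counter added to show how often the file body
actually runs. It contrasts require with Kernel#load, which re-executes
the file every time.

```ruby
require 'tmpdir'

$load_count = 0            # hypothetical counter, bumped by the library file
require_results = []

Dir.mktmpdir do |dir|
  path = File.join(dir, 'somelibrary.rb')
  File.write(path, "$load_count += 1\n")

  require_results << require(path)   # first call executes the file
  require_results << require(path)   # second call is skipped
  runs_after_require = $load_count
  puts "require returned #{require_results.inspect}, runs=#{runs_after_require}"

  load(path)                         # load always re-executes the file
  load(path)
  puts "after two loads, runs=#{$load_count}"
end
```

Within one interpreter process, anything the required file defines stays
defined until that process exits.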

However, it doesn't say anything about where scopes begin and end. The
general feeling I'm getting is that there is very little documentation
on the guts of Ruby and I'd like to learn about them without having to
read the source code which I probably wouldn't understand anyways.

For example, I'd like to know how multiple threads are handled. It
appears that objects in the "global" scope are shared but objects in
normal scopes are not. I found this in the FAQ for mod_ruby here:

http://wiki.modruby.net/en/?FAQ#Why+are+changes+to+my+library+not+reflected+in+the...

http://wiki.modruby.net/en/?FAQ#How+do+I+keep+an+object+instance+between+invocations+of+a+page%2C+for+example%2C+a+persistent+database...

But it is REALLY unclear how this all works. So for example, if I extend
the original Array object, this does NOT persist between requests. But
if I add the extension in a "require"d file, it then does persist but
does NOT reload. I find this contradictory. It probably has something to
do with this statement in the FAQ:

---
You can't override classes in your mod_ruby scripts directly. (Instead,
a new class will be defined.) Because mod_ruby scripts are loaded by
Kernel#load(filename, true).

If you have to override existing classes, please do it in a library,
then require it from your mod_ruby scripts.
---

But I'm not really sure how Kernel#load works underneath either.
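
One way to see what the FAQ describes is to try the wrap argument of
Kernel#load directly. The following is a minimal sketch under the
assumption that a wrapped load runs the script inside an anonymous
module, as the Ruby documentation for load states; the script name
page_script.rb and the method name shout are made up for the example.

```ruby
require 'tmpdir'

wrapped_result = nil
plain_result = nil

Dir.mktmpdir do |dir|
  script = File.join(dir, 'page_script.rb')
  File.write(script, <<~RUBY)
    class Array
      def shout
        map { |x| x.to_s.upcase }
      end
    end
  RUBY

  # Wrapped load: `class Array` defines a new class inside the anonymous
  # wrapper module, so the real Array is left untouched.
  load(script, true)
  wrapped_result = [].respond_to?(:shout)
  puts "after wrapped load:   #{wrapped_result}"

  # Plain load: `class Array` reopens the real, process-wide Array.
  load(script)
  plain_result = [].respond_to?(:shout)
  puts "after unwrapped load: #{plain_result}"
end
```

This would be consistent with the FAQ's advice: an extension placed in a
required library is evaluated at the top level, so it reopens the real
class and stays in effect for the life of the interpreter.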

I'd also like to know how to deal with locking global variables for
transactional use in a multi-threaded environment (e.g. web server) or
if this is even possible.

I still don't know whether eRuby can be called from within Ruby or if
it has to be called from the command line or through some sort of
adapter.

I feel like Ruby needs a "High Performance Ruby" book. There is one for
MySQL and that is the only reason I had the confidence to make the
decision to switch out of MS SQL Server. Knowing what I'm up against
would help tremendously.

Thanks for your feedback. If anybody knows anything more about the guts
of mod_ruby and/or Ruby, please let me know.

All the best,

Sunny Hirai
CEO, MeZine Inc.

--
Posted via http://www.ruby-....

M. Edward (Ed) Borasky

11/27/2006 6:57:00 AM

Sunny Hirai wrote:
> Hi All and thanks for the responses.
>
[snip]
> I am
> actually counting on, in the long term, a stable version of a VM, but am
> willing to wait a year to get it and just pack power against it in the
> shorter term.
>
There is one stable implementation at the moment, Ruby 1.8.5 "stable
snapshot". It's not a VM -- it's pure C code, plus some Ruby once enough
Ruby is built during the installation. A year out, there will be jRuby,
built on the Java Virtual Machine and running 1.8.x syntax/semantics,
probably YARV with different syntax and semantics ("Ruby 1.9.1"), and
possibly something running 1.8.x on the CLR. If you get started building
now, my guess is you'll be most likely going down the jRuby path if you
want a VM.

[snip]
> Okay, now, that I've (hopefully) convinced you I'm attacking the problem
> at an atypical level, back to the regular program. ;)
>
I wouldn't call it atypical ... but you are definitely "challenging" the
language. :)
> I found some of the information I wanted in not the eRuby or ERB pages
> but in mod_ruby.
>
> It suggests that ONE instance of Ruby executes to handle all the
> threads; however, it doesn't go into too much detail about how this is
> handled. So, for example, if I call
>
> require 'somelibrary'
>
> It is actually only included into the code once.
>
> However, it doesn't say anything about where scopes begin and end. The
> general feeling I'm getting is that there is very little documentation
> on the guts of Ruby and I'd like to learn about them without having to
> read the source code which I probably wouldn't understand anyways.
>
The best description of scoping in the Ruby language I've seen is in
David A. Black's "Ruby for Rails". I've never used "mod_ruby", so I
can't be of much help with it. For that matter, I stay as far away from
Apache as I possibly can -- life is too short to know *everything* and
so I've chosen to learn about Markov processes instead of Apache config
files. :)

As far as the "guts of Ruby" and reading the source code, it's actually
quite readable once you've been through the Pickaxe book, understand the
layout of objects, classes and pointers and know the material on
interfacing C with Ruby. Ruby has a lot fewer ugly hacks than, say,
Perl, or a modern finite element structural analysis code.
> I feel like Ruby needs a "High Performance Ruby" book. There is one for
> MySQL and that is the only reason I had the confidence to make the
> decision to switch out of MS SQL Server. Knowing what I'm up against
> would help tremendously.
>
I think the jRuby team is writing a high-performance *Ruby* -- perhaps
that would be better than a book.

One last comment -- from what the folks on this list tell me,
programming in Ruby is supposed to be more fun than programming in other
languages. Try not to get so bogged down in understanding how it all
works that it stops being fun.

--
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.blo...

If God had meant for carrots to be eaten cooked, He would have given rabbits fire.

Jano Svitok

11/27/2006 8:19:00 AM

On 11/27/06, Sunny Hirai <sunny@citymax.com> wrote:
> Hi All and thanks for the responses.

> 3. I address the overhead of web services operations constantly. Even
> when working within the SAME language, I benchmark overhead and test the
> cost of each operation. For example, ColdFusion has a native WDDX (XML
> like) conversion format that we used to use to fold multiple fields into
> a single field for caching in a database. After testing, I wrote a
> custom encoding/decoding format that executes about 1-2 orders of
> magnitude faster if I remember correctly. I also weigh the cost of
> calling methods. In fact, one of the major reasons for wanting a switch

It is said that method call overhead in Ruby is pretty high, partially
due to the possibility of being able to override an existing method
later.

> I found some of the information I wanted in not the eRuby or ERB pages
> but in mod_ruby.

Both ERB and eRuby will compile your template into a string that will
get 'eval'ed. The difference is that eRuby uses print statements and its
output is therefore hard to capture.
Have a look at erubis -- ERB implementation in C (I haven't tried it myself).
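
The compile-then-eval step can be seen with the ERB class from the
standard library (not the C eRuby, whose Ruby-level API is not shown
here). This is a minimal sketch: the template text and the MessageView
class are invented for the example. ERB#src returns the generated Ruby
source, and ERB#def_method installs it as an ordinary method so the
template is parsed only once.

```ruby
require 'erb'

template = 'Hello, <%= name %>! You have <%= count %> new messages.'

# Parse and compile the template once; `src` is the generated Ruby source,
# which builds its output in a local string variable.
compiled = ERB.new(template)
puts compiled.src

# Hypothetical view class used only for this illustration.
class MessageView
  attr_reader :name, :count

  def initialize(name, count)
    @name = name
    @count = count
  end
end

# Turn the compiled template into a regular method on the class, so later
# calls run the generated code without re-parsing the template text.
compiled.def_method(MessageView, 'render')

puts MessageView.new('Sunny', 3).render
puts MessageView.new('Jano', 0).render
```

In a long-running process the def_method call would happen once at
startup, and each request would only pay for the method call.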

> It suggests that ONE instance of Ruby executes to handle all the
> threads; however, it doesn't go into too much detail about how this is
> handled. So, for example, if I call

The usual way to deploy a Ruby application is either using
mod_fcgi(d)/fast_cgi or using Mongrel or WEBrick as a container and
proxying to them. (Proxying to Mongrel seems to be the easiest one.)

Mongrel is able to run in multiple threads; Rails is not. The problem is
somewhere in the metaprogramming magic (though I don't know precisely
where). This limitation is worked around by running multiple instances
of Mongrel with multiple Ruby interpreters. These interpreters share
nothing by default among them (except the db). You can have them share
sessions, etc. by using the db or shared memory for it.

Within one interpreter, classes are persistent. I.e. the life cycle
is: the interpreter starts, initializes, and serves requests in a
loop. The classes you define stay there until interpreter is stopped.
Instances/objects are created on the run as needed. They are not
recycled. However it may be possible to create a factory method that
will cache created instances and reuse them. That will possibly make
your code more complicated and error prone...
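
A caching factory of the kind mentioned above might look roughly like
the following. This is only a sketch under stated assumptions: Registry
and PriceFormatter are hypothetical names, and the Mutex is there
because a container such as Mongrel may serve requests from several
threads inside one interpreter.

```ruby
# Hypothetical registry: classes live for the whole process, so a
# class-level cache can hand the same instance to every request.
class Registry
  @instances = {}
  @lock = Mutex.new

  class << self
    # Returns the cached instance for `key`, building it once via the block.
    def fetch(key)
      @lock.synchronize do
        @instances[key] ||= yield
      end
    end

    def size
      @lock.synchronize { @instances.size }
    end
  end
end

# Stands in for a helper object that is expensive to build.
class PriceFormatter
  def render(cents)
    format('$%.2f', cents / 100.0)
  end
end

# Simulate several concurrent "requests" asking for the same helper.
formatters = 8.times.map do
  Thread.new { Registry.fetch(:price_formatter) { PriceFormatter.new } }
end.map(&:value)

puts "distinct instances: #{formatters.uniq.size}"
puts formatters.first.render(1999)
```

Shared instances of this kind need to be stateless or otherwise safe to
use from several requests at once, which is where the extra complexity
comes in.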

> I still don't know whether eRuby can be called from within Ruby or if
> it has to be called from the command line or through some sort of
> adapter.

You can call it from ruby as well. There's not a lot of documentation,
but it's certainly possible. Ask me when you'll need it and I'll send
an example to you.
(OT: I sent a doc patch to Shugo Maeda, but got no response.)
However eRuby will not work as it is with rails due to the print statement.

> I feel like Ruby needs a "High Performance Ruby" book. There is one for
> MySQL and that is the only reason I had the confidence to make the
> decision to switch out of MS SQL Server. Knowing what I'm up against
> would help tremendously.
>
> Thanks for your feedback. If anybody knows anything more about the guts
> of mod_ruby and/or Ruby, please let me know.
>
> All the best,
>
> Sunny Hirai
> CEO, MeZine Inc.

Sunny Hirai

11/27/2006 10:16:00 AM

To M. Edward

Thanks for the info. In terms of VM, basically I'm looking for something
that is significantly faster than Ruby is right now. Ultimately it would
have been nice to start clean on Ruby 2.0 semantics and its upcoming VM
but I don't have much confidence on timelines here, especially to a
stable build. I'd prefer jRuby because it would allow us to hook into
Java more easily, which is what a lot of reference implementations for
integration are written in; however, I'm worried about differences
between it and the reference implementation.

Thanks also for the referral to "Ruby for Rails." I have read the book
once but I wasn't thinking of scoping when I did. I will read it again.
The information on scoping will probably be very helpful.

I have read the Pickaxe book (a couple of times now) but am not much of
a C programmer though I have learned it in the past and I know much of
its semantics are similar to Java/C# style without garbage collection,
native OOP, etc.

I do have to disagree a little with your "one last comment" however. I
find it necessary to learn everything I need to know about a language to
scale. I find it uncomfortable when I don't know what is happening under
the hood because things can take me by surprise.

I agree that not knowing what's going on under the hood can be a "good
thing" if your application doesn't need to scale largely and, quite
frankly, for about 99% of apps, you really don't need to worry that much
about performance. But it is absolutely essential in our applications. A
wrong choice early on or a lack of knowledge could mean we run into
possibly unsurmountable problems later.

As an example, I know that MS SQL server keeps statistics on all of its
tables and makes optimization decisions on which indexes to use based on
those statistics. One night, our application slowed to a complete crawl.
It stopped serving pages and yet we couldn't recall any change we made
to the code that would cause it. We traced it to the DB and what
happened was that one of our partners added a huge number of products to
our db. This in itself wasn't a problem as our indexes are designed to
scale to a large number of products; however, SQL Server incorrectly
started choosing the wrong index and performance went down by something
like 100x - 1000x. Obviously, it was using bad logic to decide which
index to use; however, if I didn't know that the optimizer used table
stats to make decisions on which index to use, we would have likely been
stuck looking in the wrong area. The change in table size changed the
stats and the index used changed. As it was, we rewrote the query such
that we provided more hinting to the database and then MS SQL Server
started using the correct index again.

I like knowing this type of stuff so I know what happened when things go
wrong and to prevent it from happening in the first place.

Jan Svitok,

Thanks for the incredible information. I feel like you've got an
understanding of how Ruby works underneath and have some interesting
approaches to boot. Just the names of the useful projects have helped
immensely.

> It is said that method call overhead in ruby is pretty high, partially
> due to the possibility of being able override an existing method
> later.

Thanks for the warning. Actually, I think I misspoke a little. I should
have said the overhead of specific methods. Although I have timed method
calls in ColdFusion and the call time differs depending on where the
methods are called from (e.g. methods in objects take longer to call
than local methods), I haven't found the overhead to be a problem. Like
Ruby, method calls in ColdFusion are done through a lookup and they can
be modified at runtime so I expect similar call times. Object
instantiation, however, was crippling and certain specific methods took
too long to execute and were rewritten (like the WDDX call).

> Both ERB and eRuby will compile your template into a string that will
> get 'eval'ed. The difference is that eRuby uses print statements and its
> output is therefore hard to capture.
> Have a look at erubis -- ERB implementation in C (I haven't tried it
> myself).

Thank you. I will take a look at erubis.

Thanks also for the information on Mongrel. This project sounds
interesting. I am disappointed to learn that Rails is not thread safe. I
am still hoping that ActiveRecord will work well in a multi-threaded
environment however. The approach of multiple instances of mongrel and
multiple ruby interpreters is a good workaround. That said, I think I'd
have to rewrite ActiveRecord anyways as it relies on config files to set
datasources and such. Our application will probably need to set
datasources at run time so that we can split a table across multiple db
servers and let it know, at run-time, which server the data resides on.

Also, your information on persistence is useful; however, I'm still
unclear about a few things.

I can't wrap my head around when something becomes bound to the "global"
scope and when it is bound to a "request" scope. I'm defining "global"
to mean from the application start to its end and "request" scope to
mean the life of one request.

For example, if I "require" a file with a class, that class will now
become part of the global scope. But what if I define a method in the
"require"d file as well? Does that method become part of the global
scope?

If a "require"d file always becomes part of the global scope, is there
any way to create a class that IS NOT part of the global scope?

Also, if a single request, say, modifies the "class" at runtime by adding
methods to it, does this change persist into all the other requests or
does the change only persist for the one request? What if the class is
modified in non-"require"d code?

I understand if this is too many questions for you. Just wanted to say
thanks either way for the information provided. It's nice to have lots
of useful experts online.

Sunny Hirai
CEO, MeZine Inc.

--
Posted via http://www.ruby-....

Frederick Cheung

11/27/2006 12:50:00 PM

As far as threadedness goes, I believe activerecord can be used thread
safely (backgroundrb used to do that before they switched to
process-based workers instead of thread-based workers).

If you are using something like fastcgi or mongrel, then each instance
of those has its own ruby interpreter: changes made in one of those
don't change what's happening in another process.

So if I add a method to a class, change a class variable etc... that
will only affect the mongrel/fastcgi process that that statement
executed in. Subsequent requests to the same mongrel will see that
change, but those that get handled by a different mongrel won't see the
change.
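
The per-process isolation described here can be imitated without a web
server by forking a child process. This is a sketch only: Greeter and
hello are made-up names, and fork is assumed to be available (it is not
on Windows). The child stands in for one mongrel/fastcgi process and the
parent for another.

```ruby
class Greeter; end

reader, writer = IO.pipe

pid = fork do
  reader.close
  # Runtime change made inside the child process only.
  Greeter.class_eval do
    def hello
      'hi'
    end
  end
  writer.puts Greeter.new.respond_to?(:hello)
  writer.close
end

writer.close
child_sees = reader.read.strip    # what the child reported about itself
reader.close
Process.wait(pid)

parent_sees = Greeter.new.respond_to?(:hello)
puts "child sees hello:  #{child_sees}"
puts "parent sees hello: #{parent_sees}"
```

Each process carries its own copy of the class, so a change made while
handling one request is visible only to later requests served by that
same process.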

Fred

--
Posted via http://www.ruby-....

comp.lang.ruby
