Robert Klemme
1/23/2006 12:21:00 PM
Tuwewe wrote:
> Hi,
>
> complete newbie here. I love Ruby and would like to use Ruby in my new
> project, which deals heavily with large sized file manipulation and
> string processing.
>
> Can you give me advice on how to write decent (with an emphasis on
> performance) string processing scripts.
>
> Things I am interested in are best practices for concating strings,
> searching for sub-string, regex operations, reading and writing files,
> etc.
>
> Also links to Ruby performance related sites are greatly appreciated.
>
> Thanks in advance.
Some general remarks:
- keep the number of created objects as small as possible
- hold on to objects only as long as they are needed, so the garbage
collector can reclaim them
- if possible, freeze strings that you use as hash keys (this avoids the
overhead of Hash creating a new frozen copy of the string to use as key)
- use a_string << other_string rather than a_string += other_string - or
use StringIO.
- where possible, use in-place replacements (sub! and gsub! instead of sub
and gsub)
- when processing large files, process them in streaming mode if possible
instead of slurping them into memory as a whole
- Start with default IO methods and rely on their buffering and line end
parsing before switching to more complex scenarios (sysread, syswrite).
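
A rough sketch of how several of these points might look in practice. All
method names, the LEVELS constant and the log format are made up for
illustration; adapt them to your own data.

```ruby
require 'stringio'

# Appending with << mutates the receiver; += would build a new String
# on every iteration.
def join_words(words)
  buffer = String.new
  words.each { |w| buffer << w << ' ' }
  buffer.chomp!(' ') # in place; returns nil if nothing was removed
  buffer
end

# Same idea with StringIO as the accumulator.
def join_with_stringio(words)
  io = StringIO.new
  words.each { |w| io << w }
  io.string
end

# In-place replacement. gsub! and strip! return nil when nothing changed,
# so return the string itself instead of chaining on their results.
def normalize!(line)
  line.gsub!(/\s+/, ' ')
  line.strip!
  line
end

# Hypothetical set of keys, frozen once up front and reused as hash keys.
LEVELS = %w[INFO WARN ERROR].map(&:freeze).freeze

# Streaming: File.foreach hands over one line at a time, so the whole
# file never has to sit in memory.
def count_levels(path)
  counts = Hash.new(0)
  File.foreach(path) do |line|
    LEVELS.each { |lvl| counts[lvl] += 1 if line.include?(lvl) }
  end
  counts
end
```

Note that normalize! modifies the string passed in, which is the whole
point, but it also means callers must not rely on the original content
afterwards.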
To give more precise info we would need to know more about your
application case. And when it comes to optimization you'll have to
measure your app anyway. Tools that can help there are "ruby -r profile"
and module Benchmark.
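
As an illustration of measuring with Benchmark, here is a minimal sketch
that compares += with << for building a string. The iteration count N and
the two helper methods are arbitrary choices for the example; the numbers
you get will depend on your Ruby version and machine, so treat it as a
template rather than a result.

```ruby
require 'benchmark'

N = 20_000 # arbitrary iteration count, chosen only for the example

# Builds the string with +=, which creates a new String on each step.
def concat_plus(n)
  s = String.new
  n.times { s += 'x' }
  s
end

# Builds the string with <<, which appends to the same String.
def concat_append(n)
  s = String.new
  n.times { s << 'x' }
  s
end

# Benchmark.bm prints user, system and real time for each labelled block.
Benchmark.bm(8) do |bm|
  bm.report('+=') { concat_plus(N) }
  bm.report('<<') { concat_append(N) }
end
```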
Kind regards
robert