Paul Battley
8/18/2006 2:43:00 PM
On 18/08/06, L7 <jesse.r.brown@gmail.com> wrote:
> In the first when, (/not found/, /^[gBbTi]/) is there any benefit in
> setting up the statement in either of the following ways?
Ruby doesn't combine the separate expressions into one: it sends
===(subject) to each argument in turn and stops at the first one that
matches. They don't have to be regular expressions, after all. Putting
the more frequent match first is indeed quicker, as expected - but
that's an optimisation that can only be made with foreknowledge of the
data set.
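To make the dispatch visible, here is a minimal sketch. LoggingMatcher and probe are hypothetical names invented for this illustration (they are not part of Ruby or of the original question); the class just wraps a pattern and records every === call it receives:

```ruby
# Hypothetical helper: wraps a pattern and logs each === call.
class LoggingMatcher
  def initialize(name, pattern, log)
    @name = name
    @pattern = pattern
    @log = log
  end

  # case/when calls this with the subject of the case expression.
  def ===(subject)
    @log << @name
    @pattern === subject
  end
end

# Hypothetical helper: runs one case statement and returns
# [whether a branch matched, the order in which === was called].
def probe(subject)
  log = []
  not_found = LoggingMatcher.new(:not_found, /not found/, log)
  prefix = LoggingMatcher.new(:prefix, /^[gBbTi]/, log)
  matched = case subject
            when not_found, prefix then true
            else false
            end
  [matched, log]
end

CALLS_FOR_GOAT = probe("goat")           # both matchers are consulted
CALLS_FOR_NOT_FOUND = probe("not found") # only the first is consulted
CALLS_FOR_START = probe("Start")         # both consulted, neither matches

p CALLS_FOR_GOAT
p CALLS_FOR_NOT_FOUND
p CALLS_FOR_START
```

Because the wrapped objects are not Regexps themselves, this also shows that anything responding to === can appear after when.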
Giving each alternative its own argument (/^g/, /^B/, and so on) is
significantly slower due to the Ruby method call overhead, since every
extra argument is another === call. Knowing this, however, we can
derive a further optimisation by combining the two regular expressions
into one: /^[gBbTi]|not found/
Paul.
PS: I did a bit of quick unscientific profiling to check. Here's the code:
data = [
  "goat",
  "Badger",
  "bear",
  "Tiger",
  "ibis",
  "not found",
  "Start",
  "Data goes here"
] * 10000
# 1. Two arguments, /not found/ tested first
t0 = Time.now
data.each do |line|
  case line
  when /not found/, /^[gBbTi]/
    next
  end
end
p Time.now - t0 # 0.236658
# 2. Two arguments, the more frequent match tested first
t0 = Time.now
data.each do |line|
  case line
  when /^[gBbTi]/, /not found/
    next
  end
end
p Time.now - t0 # 0.176375
# 3. Both patterns combined into a single regular expression
t0 = Time.now
data.each do |line|
  case line
  when /^[gBbTi]|not found/
    next
  end
end
p Time.now - t0 # 0.145403
# 4. Every alternative as its own argument (one === call each)
t0 = Time.now
data.each do |line|
  case line
  when /^g/, /^B/, /^b/, /^T/, /^i/, /not found/
    next
  end
end
p Time.now - t0 # 0.299182
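For anyone who wants to repeat the measurement, the same four variants can be sketched with the standard library's Benchmark module instead of subtracting Time.now values by hand. The labels and constant names are assumptions of this sketch. Wrapping each variant in a lambda adds a call per line, so absolute timings will not be comparable with the figures above, though the ordering can still be compared.

```ruby
require 'benchmark'

DATA = ["goat", "Badger", "bear", "Tiger", "ibis",
        "not found", "Start", "Data goes here"] * 10_000

# Hypothetical labels for the four when clauses tested in the post.
VARIANTS = {
  "not found first" => lambda { |line|
    case line
    when /not found/, /^[gBbTi]/ then true
    end
  },
  "frequent first" => lambda { |line|
    case line
    when /^[gBbTi]/, /not found/ then true
    end
  },
  "combined" => lambda { |line|
    case line
    when /^[gBbTi]|not found/ then true
    end
  },
  "fully separated" => lambda { |line|
    case line
    when /^g/, /^B/, /^b/, /^T/, /^i/, /not found/ then true
    end
  }
}

# Benchmark.realtime returns the elapsed wall-clock time in seconds.
TIMINGS = VARIANTS.map do |label, matcher|
  seconds = Benchmark.realtime { DATA.each { |line| matcher.call(line) } }
  [label, seconds]
end

TIMINGS.each { |label, seconds| puts format("%-16s %.6f", label, seconds) }
```

A single pass like this is still unscientific; running it several times and comparing the spread gives a better idea of whether the differences are real on a given machine.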