Asp Forum - Struct is slow

Wayne Magor

10/16/2007 4:45:00 PM

I have a script in which I was using a 2-element array where a struct
would be used in another language, so I decided to give the Ruby class
Struct a try.

My script went from taking 2 seconds to taking 27 seconds from this
simple change! The use of this 2-element array is only one small part
of the script so it was very surprising to me that it could even cause
the script to take twice as long no matter how inefficient it might be.
To take more than 10 times longer for the entire script to execute was a
shocker.

I will probably shy away from the use of the Struct class after this
brief experience and use arrays with constants for the indexes. I'm
wondering if this is an erroneous conclusion and perhaps I am missing
something.

What are the advantages/disadvantages of using the Struct class in Ruby?
--
Posted via http://www.ruby-....

25 Answers

Robert Dober

10/16/2007 4:56:00 PM

On 10/16/07, Wayne Magor <wemagor2@gmail.com> wrote:
> I have a script in which I was using a 2-element array where a struct
> would be used in another language, so I decided to give the Ruby class
> Struct a try.
>
> My script went from taking 2 seconds to taking 27 seconds from this
> simple change! The use of this 2-element array is only one small part
> of the script so it was very surprising to me that it could even cause
> the script to take twice as long no matter how inefficient it might be.
> To take more than 10 times longer for the entire script to execute was a
> shocker.
Hmm, some code to look at would be nice; this does seem difficult to believe.

Robert

--
what do I think about Ruby?
http://ruby-smalltalk.blo...

Alex Fenton

10/16/2007 5:05:00 PM

Wayne Magor wrote:
> I have a script in which I was using a 2-element array where a struct
> would be used in another language, so I decided to give the Ruby class
> Struct a try.
>
> My script went from taking 2 seconds to taking 27 seconds from this
> simple change!

This doesn't sound likely, even if your script did nothing else but use
the Struct class. I'd look elsewhere for the source of slowness (try
running your script with -rprofile).

A quick benchmark suggests that Struct is somewhat slower to instantiate
(+100%), marginally slower to set values in (+33%), and the same speed
to fetch values from as an Array:

BENCHMARK:

require 'benchmark'

GC.disable
Foo = Struct.new(:foo, :bar)

repetitions = 500_000

puts 'struct-init', Benchmark::measure {
  repetitions.times { f = Foo.new('x', 666) }
}

puts 'array-init', Benchmark::measure {
  repetitions.times { f = ['x', 666] }
}

puts 'struct-get', Benchmark::measure {
  f = Foo.new('x', 666)
  repetitions.times { g = f.foo }
}

puts 'array-get', Benchmark::measure {
  f = ['x', 666]
  repetitions.times { g = f[0] }
}

puts 'struct-set', Benchmark::measure {
  f = Foo.new('x', 666)
  repetitions.times { f.foo = 'y' }
}

puts 'array-set', Benchmark::measure {
  f = ['x', 666]
  repetitions.times { f[0] = 'y' }
}

__END__

RESULTS (ruby 1.8.4 (2005-12-24) [i386-mswin32])

struct-init
1.359000 0.031000 1.390000 ( 1.390000)
array-init
0.563000 0.016000 0.579000 ( 0.579000)
struct-get
0.281000 0.000000 0.281000 ( 0.281000)
array-get
0.297000 0.000000 0.297000 ( 0.297000)
struct-set
0.422000 0.000000 0.422000 ( 0.422000)
array-set
0.281000 0.000000 0.281000 ( 0.281000)

alex

Wayne Magor

10/16/2007 6:48:00 PM

I understand your skepticism. I totally agree it DOES seem impossible,
yet that is what I am seeing.

I thought maybe I had done something wrong, so I recreated my experiment
(I had already deleted the file). I was running it from Eclipse, so I
opened a DOS window and ran it directly and got the same result.

The script is too large to post, but what it does is read C header files
and generate an array of these "structs", each of which has two strings:
the type and the name of a typedef. It then goes through the list,
reducing the entries to somewhat "primary" types for another script to
process. Here are some examples of the output:

float UBT_Feet_Type
uint_8 UBT_Turn_Directions_Type
struct NCBYTERD_PositionConvRecType
uint_32 NCTPLYGN_PolygonIterationsType
struct[4] NCTPLYGN_QuadralateralPolygonVerticesType
struct NCTPLYGN_QuadralateralClassType
struct[8] NCTPLYGN_QuadralateralPolygonSetType
uint_8 NCPRMYTP_DataStatusType
double NCPRMYTP_DegreesType
enum NCPRMYTP_PeriodsTypeEnum

One thing I just noticed is that the script with the Struct class
doesn't work correctly. It doesn't make the types "primary" and it
doesn't generate as many lines. Obviously, there's something I don't
understand about how to use Struct.

Here is the difference between the 2 scripts. The array version is the
right-hand file.

FILE COMPARISON
Produced: 10/16/2007 1:27:12 PM

Mode: Just Differences

Left file:  C:\workspace\CreateSfct\h_file_types_diff.txt
Right file: C:\workspace\CreateSfct\h_file_types (before struct mod).rb

L19   $typedef_rec = Struct.new("Typedef", :typedef_type, :typedef_name)
R19   TYPEDEF_TYPE = 0
      TYPEDEF_NAME = 1
------------------------------------------------------------------------
L74   typedef_type = type_arr[j][:typedef_type]
      typedef_name = type_arr[j][:typedef_name]
R75   typedef_type = type_arr[j][TYPEDEF_TYPE]
      typedef_name = type_arr[j][TYPEDEF_NAME]
------------------------------------------------------------------------
L88   type_arr[j][:typedef_type].sub!(/\w+/, primary_t)
R89   type_arr[j][TYPEDEF_TYPE].sub!(/\w+/, primary_t)
------------------------------------------------------------------------
L91   if type_arr[j][:typedef_type] =~ /\[([A-Z]\w*)\]/ then
        type_arr[j][:typedef_type].sub!($1,define_lookup($1,define_hash))
R92   if type_arr[j][TYPEDEF_TYPE] =~ /\[([A-Z]\w*)\]/ then
        type_arr[j][TYPEDEF_TYPE].sub!($1,define_lookup($1,define_hash))
------------------------------------------------------------------------
L186  return $typedef_rec.new("enum", enum_name)
R187  return ["enum", enum_name]
------------------------------------------------------------------------
L202  return $typedef_rec.new(typedef_type, typedef_name)
R203  return [typedef_type, typedef_name]
------------------------------------------------------------------------
L546  type_arr.each do |x|
        print(x[:typedef_type], "\t\t", x[:typedef_name], "\n")
R547  type_arr.each do |typedef_type, typedef_name|
        print(typedef_type, "\t\t", typedef_name, "\n")
------------------------------------------------------------------------
--
Posted via http://www.ruby-....

MenTaLguY

10/16/2007 7:29:00 PM

If type_arr[j] is a struct, then try type_arr[j].typedef_type
rather than type_arr[j][:typedef_type], and see if that makes
a difference.

-mental
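
For reference, a Struct member can be read in three ways: through the
generated accessor, by symbol, or by integer position. The following is
only a minimal sketch of that; the Typedef constant is made up for
illustration and is not taken from Wayne's script, and the sample values
are borrowed from the output he posted.

```ruby
# Hypothetical record type mirroring the two fields discussed in the thread.
Typedef = Struct.new(:typedef_type, :typedef_name)

# .dup keeps the string mutable even when string literals are frozen.
rec = Typedef.new("uint_8".dup, "NCPRMYTP_DataStatusType")

by_accessor = rec.typedef_type     # generated reader method
by_symbol   = rec[:typedef_type]   # lookup by member name
by_index    = rec[0]               # lookup by position, as with an Array

# All three expressions return the same String object, so an in-place
# sub! through any one of them is visible through the others.
rec[:typedef_type].sub!(/\w+/, "unsigned char")
after_sub = rec.typedef_type
```

Because the three forms reach the same object, switching between them
would not be expected to change what a script computes.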

Wayne Magor

10/16/2007 8:00:00 PM

Mental Guy wrote:
> If type_arr[j] is a struct, then try type_arr[j].typedef_type
> rather than type_arr[j][:typedef_type], and see if that makes
> a difference.

Well, I tried it and, as expected, the syntax change made no difference.
The one with Structs still takes 27 seconds and doesn't work correctly.
It's not clear to me why it should behave differently from just using a
2-element array (shrug).

--
Posted via http://www.ruby-....

Robert Klemme

10/16/2007 9:33:00 PM

On 16.10.2007 22:00, Wayne Magor wrote:
> Mental Guy wrote:
>> If type_arr[j] is a struct, then try type_arr[j].typedef_type
>> rather than type_arr[j][:typedef_type], and see if that makes
>> a difference.
>
> Well, I tried it and, as expected, the syntax change made no difference.
> The one with Struct's still takes 27 seconds and doesn't work correctly.
> It's not clear to me why it should be different than just using a
> 2-element array (shrug).

Did you actually also use the same block call semantics?

L546 type_arr.each do |x|
print(x[:typedef_type], "\t\t", x[:typedef_name], "\n")
R547 type_arr.each do |typedef_type, typedef_name|
print(typedef_type, "\t\t", typedef_name, "\n")

Make the latter

R547 type_arr.each do |x|
print(x.typedef_type, "\t\t", x.typedef_name, "\n")

Where x is your struct type.

robert
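
The block-parameter difference in that comparison can be sketched as
follows. With two block parameters, each unpacks an Array element into
both of them, but a Struct element is not unpacked (Struct provides to_a
but not to_ary), so the first parameter receives the whole struct and
the second is nil. Pair is a hypothetical name; the sample row comes
from the output posted earlier.

```ruby
# Hypothetical stand-in for the struct used in the script.
Pair = Struct.new(:typedef_type, :typedef_name)

arrays  = [["float", "UBT_Feet_Type"]]
structs = [Pair.new("float", "UBT_Feet_Type")]

# Array elements are destructured across the two block parameters.
array_seen = []
arrays.each { |t, n| array_seen << [t, n] }

# Struct elements are not: t is the whole struct and n is nil.
struct_seen = []
structs.each { |t, n| struct_seen << [t.class, n] }

# With a struct, take one parameter and call the accessors explicitly.
fixed = structs.map { |x| [x.typedef_type, x.typedef_name] }
```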

Robert Klemme

10/17/2007 5:59:00 AM

On 16.10.2007 19:05, Alex Fenton wrote:
> Wayne Magor wrote:
>> I have a script in which I was using a 2-element array where a struct
>> would be used in another language, so I decided to give the Ruby class
>> Struct a try.
>>
>> My script went from taking 2 seconds to taking 27 seconds from this
>> simple change!
>
> This doesn't sound likely, even if your script did nothing else but use
> the Struct class. I'd look elsewhere for the source of slowness (try
> running your script with -rprofile).
>
> A quick benchmark suggests that Struct is somewhat slower to instantiate
> (+100%), marginally slower to set values in (+33%), and the same speed
> to fetch values from as an Array:
>
> BENCHMARK:
>
> require 'benchmark'
>
> GC.disable
> Foo = Struct.new(:foo, :bar)
>
> repetitions = 500_000
>
> puts 'struct-init', Benchmark::measure {
> repetitions.times { f = Foo.new('x', 666) }
> }
>
> puts 'array-init', Benchmark::measure {
> repetitions.times { f = ['x', 666] }
> }
>
> puts 'struct-get', Benchmark::measure {
> f = Foo.new('x', 666)
> repetitions.times { g = f.foo }
> }
>
> puts 'array-get', Benchmark::measure {
> f = ['x', 666]
> repetitions.times { g = f[0] }
> }
>
> puts 'struct-set', Benchmark::measure {
> f = Foo.new('x', 666)
> repetitions.times { f.foo = 'y' }
> }
>
> puts 'array-set', Benchmark::measure {
> f = ['x', 666]
> repetitions.times { f[0] = 'y' }
> }
>
>
> __END__
>
> RESULTS (ruby 1.8.4 (2005-12-24) [i386-mswin32])
>
> struct-init
> 1.359000 0.031000 1.390000 ( 1.390000)
> array-init
> 0.563000 0.016000 0.579000 ( 0.579000)
> struct-get
> 0.281000 0.000000 0.281000 ( 0.281000)
> array-get
> 0.297000 0.000000 0.297000 ( 0.297000)
> struct-set
> 0.422000 0.000000 0.422000 ( 0.422000)
> array-set
> 0.281000 0.000000 0.281000 ( 0.281000)

I'm not sure whether you are familiar with Benchmark.bmbm, which does a
rehearsal run first. Personally, I'd rather not switch off GC, since in
realistic situations GC time belongs in the mix. But the results are
rather similar:

Robert@Babelfish2 /cygdrive/c/TEMP
$ ruby array-struct.rb
Rehearsal --------------------------------------------------
struct init       3.812000   0.000000   3.812000 (  3.923000)
array init        1.672000   0.000000   1.672000 (  1.709000)
struct get        0.437000   0.000000   0.437000 (  0.440000)
array get         0.485000   0.000000   0.485000 (  0.485000)
struct set        0.718000   0.000000   0.718000 (  0.716000)
array set         0.500000   0.000000   0.500000 (  0.510000)
----------------------------------------- total: 7.624000sec

                      user     system      total        real
struct init       3.891000   0.000000   3.891000 (  3.984000)
array init        1.640000   0.000000   1.640000 (  1.690000)
struct get        0.438000   0.000000   0.438000 (  0.450000)
array get         0.469000   0.000000   0.469000 (  0.469000)
struct set        0.703000   0.000000   0.703000 (  0.715000)
array set         0.484000   0.000000   0.484000 (  0.504000)

Robert@Babelfish2 /cygdrive/c/TEMP
$ cat array-struct.rb
require 'benchmark'

REP = 500_000

Foo = Struct.new :foo, :bar

data = "foo"
c1 = Foo.new data, data
c2 = [data, data]

Benchmark.bmbm 15 do |x|
  x.report 'struct init' do
    REP.times { Foo.new data, data }
  end

  x.report 'array init' do
    REP.times { [data, data] }
  end

  x.report 'struct get' do
    REP.times { c1.bar }
  end

  x.report 'array get' do
    REP.times { c2[1] }
  end

  x.report 'struct set' do
    REP.times { c1.bar = data }
  end

  x.report 'array set' do
    REP.times { c2[1] = data }
  end
end

Kind regards

robert

Sylvain Joyeux

10/17/2007 6:13:00 AM

> I'm not sure whether you are familiar with Benchmark#bmbm which does a
> rehearsal - personally I rather not switch off GC since in realistic
> situations GC time belongs into the mix. But results are rather
> similar:
Sure. GC does belong to the mix. Now, if GC is enabled you are not able to
compare anything. Let's assume that GC runs during 'array init', you will
say 'hey, struct init is faster'. Now, if GC runs during 'struct init' the
result may change ...

Keeping GC is meaningful when benchmarking a whole application. In
microbenchmarks like these, it is simply noise.
--
Sylvain Joyeux

Joel VanderWerf

10/17/2007 6:29:00 AM

Sylvain Joyeux wrote:
>> I'm not sure whether you are familiar with Benchmark#bmbm which does a
>> rehearsal - personally I rather not switch off GC since in realistic
>> situations GC time belongs into the mix. But results are rather
>> similar:
> Sure. GC does belong to the mix. Now, if GC is enabled you are not able to
> compare anything. Let's assume that GC runs during 'array init', you will
> say 'hey, struct init is faster'. Now, if GC runs during 'struct init' the
> result may change ...
>
> Keeping GC is meaningful when benchmarking a whole application. In
> microbenchmarks like these, it is simply noise.

For fairness, you could do it this way (assuming REP is large enough):

...
Benchmark.bmbm 15 do |x|
  GC.start
  x.report 'struct init' do
    REP.times { Foo.new data, data }
    GC.start
  end

  GC.start
  x.report 'array init' do
    REP.times { [data, data] }
    GC.start
  end
...

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Robert Klemme

10/17/2007 9:48:00 AM

2007/10/17, Joel VanderWerf <vjoel@path.berkeley.edu>:
> Sylvain Joyeux wrote:
> >> I'm not sure whether you are familiar with Benchmark#bmbm which does a
> >> rehearsal - personally I rather not switch off GC since in realistic
> >> situations GC time belongs into the mix. But results are rather
> >> similar:
> > Sure. GC does belong to the mix. Now, if GC is enabled you are not able to
> > compare anything. Let's assume that GC runs during 'array init', you will
> > say 'hey, struct init is faster'. Now, if GC runs during 'struct init' the
> > result may change ...

But you have a *lot* of invocations, and it's highly unlikely that GC
runs during every init. On the other hand, if you allocate a lot of
memory from the OS, that may slow things down. And this will happen
without GC.

Also, IMHO the cost of GC overhead is part of the runtime performance.
If you compare two approaches and one allocates a lot more memory than
the other, then GC times belong in the measurement, because in a real
application that approach will also lead to increased GC overhead.

> > Keeping GC is meaningful when benchmarking a whole application. In
> > microbenchmarks like these, it is simply noise.
>
> For fairness, you could do it this way (assuming REP is large enough):
>
> ...
> Benchmark.bmbm 15 do |x|
> GC.start
> x.report 'struct init' do
> REP.times { Foo.new data, data }
> GC.start
> end
>
> GC.start
> x.report 'array init' do
> REP.times { [data, data] }
> GC.start
> end
> ...

GC.start is not guaranteed to actually run GC AFAIK. I'd probably rather do

Benchmark.bmbm do
  GC.disable
  # tests
  GC.enable
  GC.start
end

There is a chance that GC.start will clean up all the memory allocated
during the first run.

Kind regards

robert
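
One way to look at both sides of the GC question at once is to time the
same block with the collector switched off and then switched on, and to
count how many collections happened in the second run. This is only a
sketch under stated assumptions: it needs Ruby 1.9 or later for
GC.count, the measure_both helper and the REP value are hypothetical,
and none of it is taken from the scripts posted in this thread.

```ruby
require 'benchmark'

Foo  = Struct.new(:foo, :bar)
REP  = 100_000          # hypothetical repetition count
data = "foo"

# Hypothetical helper: runs the given block twice and returns
# [seconds with GC disabled, seconds with GC enabled, GC runs in 2nd pass].
def measure_both
  GC.start                                  # start from a clean heap
  GC.disable
  without_gc = Benchmark.realtime { yield }
  GC.enable
  GC.start                                  # collect garbage from the first pass
  runs_before = GC.count
  with_gc = Benchmark.realtime { yield }
  [without_gc, with_gc, GC.count - runs_before]
end

results = {
  'struct init' => measure_both { REP.times { Foo.new(data, data) } },
  'array init'  => measure_both { REP.times { [data, data] } },
}

results.each do |label, (off, on, gc_runs)|
  printf("%-12s gc off %.3fs  gc on %.3fs  (%d GC runs)\n",
         label, off, on, gc_runs)
end
```

If the two timings for a label are close and the GC run count is small,
the collector is probably not what separates the two approaches.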

comp.lang.ruby
