dga
10/26/2006 6:58:00 AM
Tweaking a bit and benchmarking with 100 rows of 10,000 elements each
user system total real
Create Matrix 3.940000 0.090000 4.030000 ( 4.037665)
Add Up Matrix 0.710000 0.000000 0.710000 ( 0.711897)
Create GSL::Matrix 3.400000 0.050000 3.450000 ( 3.496827)
Add GSL::Matrix 0.080000 0.030000 0.110000 ( 0.111982)
Matrix Add GSL::Matrix 0.020000 0.000000 0.020000 ( 0.017661)
With the "Matrix Add" done as:
(@m * colones.transpose).to_v.sum
(to avoid @m.transpose, which creates a copy of all of @m).
Have fun. Note that there are faster ways to initialize a GSL array
from an external source, if you end up concerned about that source of
overhead. (Or an NArray).
-Dave
dga wrote:
> Someone already mentioned the NArray way; if you treat the entire
> thing as a set of matrix operations, you can make it insanely fast in
> either NArray or GSL. But don't treat it as "N" arrays with "M"
> entries -- treat it as a matrix, and then all of your operations can be
> done in the external (very fast) code:
>
> m = GSL::Matrix.new([1, 0, 0, 0, 0], [1, 1, 0, 0, 0], [0, 0, 0, 1, 0])
>
> You can then either cheat, and "flatten" the matrix into a vector and
> find the sum:
>
> puts m.to_v.sum
>
> (which works since they're ones)
>
> Or pretend you're in Matlab and do it as a set of matrix operations:
>
> colones = GSL::Matrix.new(1, m.size[1])
> colones[0].set_all(1)
>
> rowones = GSL::Matrix.new(1, m.size[0])
> rowones[0].set_all(1)
>
> puts ((colones * m.transpose) * rowones.transpose)[0,0]
>
> -Dave
>
>
> poopdeville@gmail.com wrote:
> > Hi everybody,
> >
> > I'm writing a fairly open-ended question. I'm hoping for suggestions,
> > opinions, advice. Suppose I have n arrays, each of which has m
> > entries. m is a fairly large integer, on the order of 10,000. Each
> > entry is either 1 or 0.
> >
> > The first task I need to accomplish is figuring out how many times a 1
> > occurs in the ith entry in an array. So for concreteness, if I had
> > arrays:
> >
> > first = [1,0,0,0,0]
> > second = [1,1,0,0,0]
> > third = [0,0,0,1,0]
> >
> > I would end up with
> > count = [2,1,0,1,0]
> >
> > I'm just trying to give the general flavor of what I'm working on. I
> > know I can use some simple each_with_index loops to increment
> > count[index] (something along the lines of:)
> >
> > count = Array.new(m,0)
> > [first, second, third].each do |array|
> > array.each_with_index do |item, index|
> > count[index] += item
> > end
> > end
> >
> > There are going to be m * 3n * (two constant multipliers for the
> > looping) object allocations and method calls. m is fairly large, and I
> > have other, similar, tasks to accomplish with this data. The faster I
> > can process this data, the more data I can process in a given amount of
> > time, and the more accurate the analysis will be.
> >
> > Is there a slick way to do this with unpacking and packing? Or some
> > other way to do this with strings? Any modules or libraries I should
> > look into? I'm fairly new to Ruby, though not to scientific
> > computation. I realize I won't be able to do this any faster than with
> > m * n * (a constant multiplier), but I need that constant multiplier to
> > be as small as possible. Any advice would be appreciated.
> >
> > Thanks!
> > 'cid 'ooh