Lionel Bouton
4/28/2008 5:17:00 PM
Pascal Bourguignon wrote:
> [...] to transpose a matrix of dimension (c r) do
> if (c=1) and (r=1)
> then return the matrix itself
> else
> return the matrix made of:
> | transpose of submatrix [0..(c/2)[,[0..(r/2)[ transpose of submatrix [0..(c/2)[,[(r/2)..r[ |
> | transpose of submatrix [(c/2)..c[,[0..(r/2)[ transpose of submatrix [(c/2)..c[,[(r/2)..r[ |
>
> that is, you cut the matrix in four, transpose each part
> independently, and combine the four parts, exchanging the parts on the
> anti-diagonal.
>
> With this algorithm disk accesses should be more localized, so you may
> get better speed on very big files with very big lines.
>
Here's an algorithm that should minimize the amount of seeks.
Given that the file doesn't fit in memory, you'll have to fight against
the disk cache as it won't be effective (caching data that won't ever be
reused) so you want to load as much useful data in memory at once and
only useful data.
If your program can allocate an amount <M> of memory, compute the number
<n> of element that fit in <M> / matrix_line_count. You probably want to
test that with actual Ruby code as it's difficult to guess correctly.
Be creative on how you store data in memory to gain space, you shoud try
to store it pre-transposed in memory and already compacted, but it might
be difficult/slow to write (although disk seeks are slow enough that you
might be able to benefit from relatively heavy computations).
On each line, read the <n> first elements (effectively reading the <n>
first columns), then write the <n> first lines in the transposed matrix.
Continue <n> by <n> columns.
With this method you'll always read/write sequential data on disk when
it's possible with as large data chunks as possible without hitting the
swap.
You can test the opposite method (reading <n> lines, writing <n>
columns, it may be faster on your hardware but my gutt feeling is that
you should avoid seeking on writes).
Lionel