Asp Forum - multi-dimensional arrays, idioms...

Seebs

7/13/2011 6:50:00 PM

So, for a really long time, I've done this:

int **allocate_multidim_array(int n, int m) {
int **a;
int i;

a = malloc(n * sizeof(int *));
for (i = 0; i < n; ++i) {
a[i] = malloc(m * sizeof(int));
}
}

void free_multidim_array(int **a, int n) {
int i;
for (i = 0; i < n; ++i)
free(a[i]);
free(a);
}

(Not really as functions, normally, just presenting them this way so
you can see the logic clearly.)

Recently, I had cause to do this, and it occurred to me I
could do something else:

int **allocate_multidim_array(int n, int m) {
int **a;
int *data;
int i;

a = malloc(n * sizeof(int *));
data = malloc(n * m * sizeof(int));
for (i = 0; i < n; ++i) {
a[i] = data + (i * m);
}
}

void free_multidim_array(int **a, int n) {
free(a[0]);
free(a);
}

So far as I can tell, this is just plain cleaner, simpler, and so on.

Am I missing something? Is the former idiom actually widespread, or did
I just invent it wrong? Is there a benefit to it I haven't considered,
other than sparse arrays or things larger than you can allocate as a single
object?

-s
--
Copyright 2011, all wrongs reversed. Peter Seebach / usenet-nospam@seebs.net
http://www.seeb... <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/...(Scientology) <-- get educated!
I am not speaking for my employer, although they do rent some of my opinions.

6 Answers

rouben

7/13/2011 10:02:00 PM

In article <slrnj1rqdf.16d0.usenet-nospam@guild.seebs.net>,
Seebs <usenet-nospam@seebs.net> wrote:
>So, for a really long time, I've done this:
>
> int **allocate_multidim_array(int n, int m) {
> int **a;
> int i;
>
> a = malloc(n * sizeof(int *));
> for (i = 0; i < n; ++i) {
> a[i] = malloc(m * sizeof(int));
> }
> }
>
> void free_multidim_array(int **a, int n) {
> int i;
> for (i = 0; i < n; ++i)
> free(a[i]);
> free(a);
> }
>
>(Not really as functions, normally, just presenting them this way so
>you can see the logic clearly.)
>
>Recently, I had cause to do this, and it occurred to me I
>could do something else:
>
> int **allocate_multidim_array(int n, int m) {
> int **a;
> int *data;
> int i;
>
> a = malloc(n * sizeof(int *));
> data = malloc(n * m * sizeof(int));
> for (i = 0; i < n; ++i) {
> a[i] = data + (i * m);
> }
> }
>
> void free_multidim_array(int **a, int n) {
> free(a[0]);
> free(a);
> }
>
>So far as I can tell, this is just plain cleaner, simpler, and so on.
>
>Am I missing something? Is the former idiom actually widespread, or did
>I just invent it wrong? Is there a benefit to it I haven't considered,
>other than sparse arrays or things larger than you can allocate as a single
>object?

I have seen both methods used, and I have met people with
strongly held biases for one or the other method.

Here are my comments on your code fragments.

1. As you have noted, the second method may suffer from
the problem of memory being too large to allocate as a
single object.

2. If you can't spot an advantage of one method over the other
in your specific application, then probably the choice of the
method is immaterial for you.

3. One situation where the first method has a great advantage
over the second is in matrix algebra where row interchanges
are quite common. To swap rows p and q in your first method,
you will do:

int *tmp;
tmp = a[p];
a[p] = a[q];
a[q] = tmp;

which amounts to a total of 3 assignments (an O(1) operation).
In the second method, you will have to swap the individual
elements of the rows, which amounts to 3*m swaps, where m is
the length of the matrix's rows (an O(m) operation). This can
make a big difference if m is large.

4. Aside: Your choice of m and n for the array dimensions goes
contrary to the common convention. I suggest that you use
m for number of rows and n for number of columns, unless you
want to obfuscate your code.

5. The functions free_multidim_array() can be made more user-friendly.
Specifying n as an argument with free_multidim_array() is confusing
because the user needs to remember whether n is the number of rows or
the number of columns. A better way is to set a sentinel in the array
of pointers. For instance, in you method #1, change:

a = malloc(n * sizeof(int *));

to

a = malloc((n+1) * sizeof(int *));

and set a[n] = NULL. Then you won't need to pass n to
free_multidim_array(). The implementation is left to
the reader as an exercise.

6. Aside: The line a = malloc(...) above is better written as

a = malloc((n+1) * sizeof *a);

The fact that the type of `a' does not enter in this form of
the statement, is the key to the construction of the generic
macros described below.

7. The type of array elements (`int' in your case) is
hard-coded in your construction. What if you want to make an
mxn array with elements of type `double' or `struct mystruct'?
Do you have to duplicate your code for each instance? No!
With a preprocessor slight-of-hand you may generalize the
construction to multidimensional arrays of arbitrary type,
as in this illustration:

unsigned char **a;
int m = 10, n = 12;

MAKE_2ARRAY(a, m, n);
/* use array as a[i][j]; e.g., a[i][j] = 3 */
...
FREE_2ARRAY(a);

The macros MAKE_2ARRAY() and FREE_2ARRAY() construct and free
arrays of any type. E.g.,

struct point {
double x;
double y;
} **b;
int m = 10, n = 12;

MAKE_2ARRAY(b, m, n);
/* use array as b[i][j]; e.g., b[i][j].x = 1.2 */
...
FREE_2ARRAY(b);

I wrote about these macros in this newsgroup back in 2002. See:

http://tinyurl.c...

That message gives a link to the file `array.h' that defines the
macros. Here is that link again:

http://home.comcast.net/~rouben.rostami...

--
Rouben Rostamian

Ben Bacarisse

7/14/2011 12:30:00 AM

Seebs <usenet-nospam@seebs.net> writes:

> So, for a really long time, I've done this:
>
> int **allocate_multidim_array(int n, int m) {
> int **a;
> int i;
>
> a = malloc(n * sizeof(int *));
> for (i = 0; i < n; ++i) {
> a[i] = malloc(m * sizeof(int));
> }
> }
>
> void free_multidim_array(int **a, int n) {
> int i;
> for (i = 0; i < n; ++i)
> free(a[i]);
> free(a);
> }
>
> (Not really as functions, normally, just presenting them this way so
> you can see the logic clearly.)
>
> Recently, I had cause to do this, and it occurred to me I
> could do something else:
>
> int **allocate_multidim_array(int n, int m) {
> int **a;
> int *data;
> int i;
>
> a = malloc(n * sizeof(int *));
> data = malloc(n * m * sizeof(int));
> for (i = 0; i < n; ++i) {
> a[i] = data + (i * m);
> }
> }
>
> void free_multidim_array(int **a, int n) {
> free(a[0]);
> free(a);
> }
>
> So far as I can tell, this is just plain cleaner, simpler, and so on.
>
> Am I missing something? Is the former idiom actually widespread, or did
> I just invent it wrong?

I've certainly used it and I probably saw it in someone else's code.
It's certainly in the C FAQ.

> Is there a benefit to it I haven't considered,
> other than sparse arrays or things larger than you can allocate as a single
> object?

I like it because it has simpler cleanup when checking for malloc
failures.

--
Ben.

Seebs

7/14/2011 12:34:00 AM

On 2011-07-13, rouben@shady.(none) (Rouben Rostamian) <rouben@shady> wrote:
> 3. One situation where the first method has a great advantage
> over the second is in matrix algebra where row interchanges
> are quite common. To swap rows p and q in your first method,
> you will do:
>
> int *tmp;
> tmp = a[p];
> a[p] = a[q];
> a[q] = tmp;
>
> which amounts to a total of 3 assignments (an O(1) operation).
> In the second method, you will have to swap the individual
> elements of the rows, which amounts to 3*m swaps, where m is
> the length of the matrix's rows (an O(m) operation). This can
> make a big difference if m is large.

Hmm. Will you? You'd have to stash the data pointer separately,
but it seems to me that once you've stashed it, you can swap the pointers
all you want. The only special case was row 0, and you can solve that
with O(1) storage.

> 4. Aside: Your choice of m and n for the array dimensions goes
> contrary to the common convention. I suggest that you use
> m for number of rows and n for number of columns, unless you
> want to obfuscate your code.

Whoops.

> and set a[n] = NULL. Then you won't need to pass n to
> free_multidim_array(). The implementation is left to
> the reader as an exercise.

Neat thought!

> 6. Aside: The line a = malloc(...) above is better written as
>
> a = malloc((n+1) * sizeof *a);
>
> The fact that the type of `a' does not enter in this form of
> the statement, is the key to the construction of the generic
> macros described below.

Yeah, that's what I usually actually write, but I was trying to
use a "simpler idiom.

> 7. The type of array elements (`int' in your case) is
> hard-coded in your construction. What if you want to make an
> mxn array with elements of type `double' or `struct mystruct'?

Well, again, these aren't really functions. Good point for a general
interface, but I've never felt the need for an actual interface for
this.

.... Of course, now that I know someone's written one maybe I'll start
using it. :)

-s
--
Copyright 2011, all wrongs reversed. Peter Seebach / usenet-nospam@seebs.net
http://www.seeb... <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/...(Scientology) <-- get educated!
I am not speaking for my employer, although they do rent some of my opinions.

Hallvard B Furuseth

7/14/2011 10:42:00 AM

Seebs writes:
> a = malloc(n * sizeof(int *));
> data = malloc(n * m * sizeof(int));

You can lose one more malloc() and free(), if you care:

size_t a_size = n * sizeof(int *), pad;
if (sizeof(int * ) % sizeof(int)) {
pad = (pad = a_size % sizeof(int)) ? sizeof(int) - pad : 0;
a_size += pad;
}
a = malloc(a_size + n * m * sizeof(int));
data = (int *) ((char *) a + a_size);
<set up a[0..n-1] as before>;

<use a>;

free(a);

pad can use _Alignof if available instead of sizeof.

--
Hallvard

luserXtrog

7/15/2011 4:48:00 AM

On Jul 14, 5:42 am, Hallvard B Furuseth <h.b.furus...@usit.uio.no>
wrote:
> Seebs writes:
> > a = malloc(n * sizeof(int *));
> > data = malloc(n * m * sizeof(int));
>
> You can lose one more malloc() and free(), if you care:
>
> size_t a_size = n * sizeof(int *), pad;
> if (sizeof(int * ) % sizeof(int)) {
> pad = (pad = a_size % sizeof(int)) ? sizeof(int) - pad : 0;
> a_size += pad;
> }
> a = malloc(a_size + n * m * sizeof(int));
> data = (int *) ((char *) a + a_size);
> <set up a[0..n-1] as before>;
>
> <use a>;
>
> free(a);
>
> pad can use _Alignof if available instead of sizeof.

Yay! That removes the special case of row 0.

BruceS

7/28/2011 3:02:00 PM

On Wed, 13 Jul 2011 18:50:01 +0000, Seebs wrote:

> So, for a really long time, I've done this:
>
> int **allocate_multidim_array(int n, int m) {
> int **a;
> int i;
>
> a = malloc(n * sizeof(int *));
> for (i = 0; i < n; ++i) {
> a[i] = malloc(m * sizeof(int));
> }
> }
>
> void free_multidim_array(int **a, int n) {
> int i;
> for (i = 0; i < n; ++i)
> free(a[i]);
> free(a);
> }
>
> (Not really as functions, normally, just presenting them this way so you
> can see the logic clearly.)
>
> Recently, I had cause to do this, and it occurred to me I could do
> something else:
>
> int **allocate_multidim_array(int n, int m) {
> int **a;
> int *data;
> int i;
>
> a = malloc(n * sizeof(int *));
> data = malloc(n * m * sizeof(int));
> for (i = 0; i < n; ++i) {
> a[i] = data + (i * m);
> }
> }
>
> void free_multidim_array(int **a, int n) {
> free(a[0]);
> free(a);
> }
>
> So far as I can tell, this is just plain cleaner, simpler, and so on.
>
> Am I missing something? Is the former idiom actually widespread, or did
> I just invent it wrong? Is there a benefit to it I haven't considered,
> other than sparse arrays or things larger than you can allocate as a
> single object?
>
> -s

I've just been catching up, doing my usual looking for certain posters,
when I came across this. Can it be that I'm the only one who noticed the
missing returns? I know it's inconsequential, as Seebs said he isn't
really using functions, but still...

comp.lang.c

multi-dimensional arrays, idioms...

Seebs

rouben

Ben Bacarisse

Seebs

Hallvard B Furuseth

luserXtrog

BruceS

x Login to ForumsZone