comp.lang.c

Efficient use of malloc

Steve Keller

4/20/2011 7:21:00 PM

I have lots and lots of malloc() and free() calls for memory of
varying size, ranging from 100 to 500 bytes. How efficient will
malloc() be concerning execution time and fragmentation if I pass the
required sizes directly to malloc() as opposed to rounding them up?
Say

while (1) {
    if (rand() % 2) {
        p = find_some_random_item();
        free(p);
    } else {
        size = 100 + rand() % 400;
        p = malloc(size);
        store_item(p);
    }
}

Will I help malloc() to improve performance if I call
malloc(ROUND(size)) instead with e.g.

// Round up to multiple of 64
#define ROUND(n) (((n) + 63) & ~63)

or even rounding up to the next power of two?
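
For the power-of-two case I'd use something like this (untested
sketch):

/* Round n up to the next power of two (assumes n > 0 and that the
   result fits in a size_t). */
static size_t round_pow2(size_t n)
{
    size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}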


Keith Thompson

4/20/2011 7:54:00 PM


Steve Keller <keller@no.invalid> writes:
> I have lots and lots of malloc() and free() calls for memory of
> varying size, ranging from 100 to 500 bytes. How efficient will
> malloc() be concerning execution time and fragmentation if I pass the
> required sizes directly to malloc() as opposed to rounding them up?
> Say
>
> while (1) {
>     if (rand() % 2) {
>         p = find_some_random_item();
>         free(p);
>     } else {
>         size = 100 + rand() % 400;
>         p = malloc(size);
>         store_item(p);
>     }
> }
>
> Will I help malloc() to improve performance if I call
> malloc(ROUND(size)) instead with e.g.
>
> // Round up to multiple of 64
> #define ROUND(n) (((n) + 63) & ~63)
>
> or even rounding up to the next power of two?

There is no definitive answer to your question. The language
defines the behavior of malloc; it (deliberately) says nothing
about its performance.

If that kind of rounding would make malloc more efficient, it most
likely already does it internally.

Note that malloc has to work properly for a wide range of sizes;
if all your allocations are from 100 to 500 bytes, and particularly
if you have only a few discrete sizes, you *might* benefit from
pre-allocating a large chunk of memory and carving fixed-size
objects out of it.
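
A minimal sketch of that approach (untested, names made up; freeing
individual slots would need a free-list on top):

#include <stdlib.h>

#define SLOT_SIZE  512      /* big enough for the 100..500-byte range */
#define SLOT_COUNT 10000

static char  *pool;         /* one big allocation */
static size_t used;         /* slots handed out so far */

/* Carve the next fixed-size slot out of the pool. */
static void *slot_alloc(void)
{
    if (pool == NULL)
        pool = malloc((size_t)SLOT_SIZE * SLOT_COUNT);
    if (pool == NULL || used == SLOT_COUNT)
        return NULL;        /* pool exhausted: fall back to malloc() */
    return pool + SLOT_SIZE * used++;
}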

Some systems provide ways to give hints to the memory allocation
system.
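glibc, for one, has mallopt(); a glibc-specific example (other systems
have different knobs):

#include <malloc.h>     /* glibc-specific header */

/* Hint: serve blocks smaller than 256 KB from the heap rather than
   mmap(). Purely a tuning knob, not a guarantee. */
int tune_allocator(void)
{
    return mallopt(M_MMAP_THRESHOLD, 256 * 1024);
}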

But the only sure way to know how it's going to perform is to measure
it. You might very well find that whatever fancy scheme you use has
only an insignificant benefit over straightforward malloc calls.
Or you might find that some particular trick saves the day.
But keep in mind that whatever you do to improve performance,
it might not help (or it could even hurt) on some other system.

Consider carefully whether it's even worth the effort. Is your
application's performance noticeably slow? If not, don't worry about
it; if so, profile it to find out whether malloc is the bottleneck. If
you shave 10 milliseconds off the run time of an application that spends
10 minutes reading and writing data, you haven't won anything.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.ne...
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Bartc

4/20/2011 7:57:00 PM




"Steve Keller" <keller@no.invalid> wrote in message
news:ionbou$2qkk$1@adenine.netfront.net...
> I have lots and lots of malloc() and free() calls for memory of
> varying size, ranging from 100 to 500 bytes. How efficient will
> malloc() be concerning execution time and fragmentation if I pass the
> required sizes directly to malloc() as opposed to rounding them up?
> Say
>
> while (1) {
>     if (rand() % 2) {
>         p = find_some_random_item();
>         free(p);
>     } else {
>         size = 100 + rand() % 400;
>         p = malloc(size);
>         store_item(p);
>     }
> }
>
> Will I help malloc() to improve performance if I call
> malloc(ROUND(size)) instead with e.g.
>
> // Round up to multiple of 64
> #define ROUND(n) (((n) + 63) & ~63)
>
> or even rounding up to the next power of two?

It's simple enough to test and see how much of a difference it will make,
but it will vary from system to system.

If malloc/free overheads are an issue, and you have plenty of spare memory,
try allocating only 500-byte blocks. When a block is freed, link it into a
free-list instead of using free().

When a new block is needed, first try to take the top one from the free
list; otherwise call malloc(). This can also work with a small number of
distinct sizes (100 to 500 in steps of 100, for example), so that less
memory is wasted, but it is more fiddly.

This can be very fast; the downside is that after a big bunch of free()
calls, the memory is still tied up in the free-list and can't be used
for other purposes.
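
A rough sketch of the free-list idea (untested, names made up):

#include <stdlib.h>

#define BLOCK_SIZE 500          /* covers every request size */

static void *free_list;         /* freed blocks, linked through their
                                   first few bytes */

static void *block_alloc(void)
{
    if (free_list != NULL) {
        void *p = free_list;
        free_list = *(void **)p;    /* pop the top block */
        return p;
    }
    return malloc(BLOCK_SIZE);      /* list empty: get a fresh block */
}

static void block_release(void *p)
{
    *(void **)p = free_list;        /* push onto the free list */
    free_list = p;
}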

You can get more sophisticated with this, but you usually end up
reimplementing a version of malloc and free!

So first do tests to find out how much of a problem this really is.

--
Bartc

China Blue Veins

4/20/2011 8:03:00 PM


In article <ionbou$2qkk$1@adenine.netfront.net>,
Steve Keller <keller@no.invalid> wrote:

> I have lots and lots of malloc() and free() calls for memory of
> varying size, ranging from 100 to 500 bytes. How efficient will
> malloc() be concerning execution time and fragmentation if I pass the
> required sizes directly to malloc() as opposed to rounding them up?

Implementation dependent. Some allocators, for example, round up to a power of
two and segregate allocations so that each page holds blocks of a single size.
If you're stuck with a poor allocator, you would do better to download a good
one than to try to patch around it or design your own.

--
Damn the living - It's a lovely life. I'm whoever you want me to be.
Silver silverware - Where is the love? At least I can stay in character.
Oval swimming pool - Where is the love? Annoying Usenet one post at a time.
Damn the living - It's a lovely life. I am in the Nile.

Keith Thompson

4/20/2011 8:39:00 PM


"BartC" <bc@freeuk.com> writes:
> "Steve Keller" <keller@no.invalid> wrote in message
> news:ionbou$2qkk$1@adenine.netfront.net...
>> I have lots and lots of malloc() and free() calls for memory of
>> varying size, ranging from 100 to 500 bytes. How efficient will
>> malloc() be concerning execution time and fragmentation if I pass the
>> required sizes directly to malloc() as opposed to rounding them up?
>> Say
>>
>> while (1) {
>>     if (rand() % 2) {
>>         p = find_some_random_item();
>>         free(p);
>>     } else {
>>         size = 100 + rand() % 400;
>>         p = malloc(size);
>>         store_item(p);
>>     }
>> }
>>
>> Will I help malloc() to improve performance if I call
>> malloc(ROUND(size)) instead with e.g.
>>
>> // Round up to multiple of 64
>> #define ROUND(n) (((n) + 63) & ~63)
>>
>> or even rounding up to the next power of two?
>
> It's simple enough to test and see how much of a difference it will
> make, but it will vary from system to system.
>
> If malloc/free overheads are an issue, and you have plenty of spare
> memory, try allocating only 500-byte blocks. When a block is freed,
> link it into a free-list instead of using free().

[...]

Something I forgot to mention: usage patterns might make a
substantial difference. malloc and free have to support the general
case: allocating and deallocating arbitrary chunks of memory in
arbitrary orders. If your application allocates a lot of objects,
uses them, and then deallocates everything in one fell swoop, you
might benefit from a simpler strategy. For example, you might
malloc N-byte chunks at a time, carve out sub-chunks as needed,
and then free all the big chunks when you're done.
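
A sketch of that strategy (untested, hypothetical names):

#include <stdlib.h>

#define CHUNK_PAYLOAD 65536     /* the "N bytes at a time" */

struct chunk {
    struct chunk *next;
    size_t        used;
    char          mem[CHUNK_PAYLOAD];
};

static struct chunk *arena;     /* list of big chunks */

static void *arena_alloc(size_t n)
{
    void *p;
    n = (n + 15) & ~(size_t)15;             /* keep sub-chunks aligned */
    if (n > CHUNK_PAYLOAD)
        return NULL;                        /* too big for this scheme */
    if (arena == NULL || arena->used + n > CHUNK_PAYLOAD) {
        struct chunk *c = malloc(sizeof *c);
        if (c == NULL)
            return NULL;
        c->next = arena;
        c->used = 0;
        arena = c;
    }
    p = arena->mem + arena->used;
    arena->used += n;
    return p;
}

/* Free every big chunk in one fell swoop. */
static void arena_free_all(void)
{
    while (arena != NULL) {
        struct chunk *next = arena->next;
        free(arena);
        arena = next;
    }
}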

But again, *measure the performance*. If straightforward malloc and
free calls are fast enough, it's probably not worth worrying about.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.ne...
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

William Ahern

4/20/2011 9:14:00 PM


BartC <bc@freeuk.com> wrote:
<snip>
> If malloc/free overheads are an issue, and you have plenty of spare memory,
> try allocating only 500-byte blocks. When a block is freed, link it into a
> free-list instead of using free().
>
> When a new block is needed, first try to take the top one from the free
> list; otherwise call malloc(). This can also work with a small number of
> distinct sizes (100 to 500 in steps of 100, for example), so that less
> memory is wasted, but it is more fiddly.

This is how the BSD allocator worked, or at least the second most recent
implementation in OpenBSD.

Even glibc's allocator does this for small chunks (perhaps too small).

Eric Sosman

4/21/2011 2:41:00 AM


On 4/20/2011 3:21 PM, Steve Keller wrote:
> I have lots and lots of malloc() and free() calls for memory of
> varying size, ranging from 100 to 500 bytes. How efficient will
> malloc() be concerning execution time and fragmentation if I pass the
> required sizes directly to malloc() as opposed to rounding them up?

For execution time, you'll need to measure. Be aware that you
will probably get different results on different platforms.

For fragmentation, rounding up might reduce external
fragmentation (unusable space between allocated blocks), but it is
absolutely 100% guaranteed to increase internal fragmentation (space
allocated within blocks but not used).

> Will I help malloc() to improve performance if I call
> malloc(ROUND(size)) instead with e.g.
>
> // Round up to multiple of 64
> #define ROUND(n) (((n) + 63) & ~63)
>
> or even rounding up to the next power of two?

Measure. It's worth noting, though, that many allocators add a
few bytes of overhead to each requested block, and that some allocators
already work in power-of-two sizes; round your request up to a power of
two yourself, and you can consume twice as much memory as if you hadn't.
For example, suppose you need 33KB and round it up to 64KB; the
allocator tacks on 8 bytes of overhead for a total of 64KB+8, then
rounds that total up to 128KB. Had you simply asked for 33KB to
start with, you'd have used only half as much address space.

Advice: (1) As a default policy, just ask for what you need and
leave it to the allocator to figure out how best to satisfy the need.
(2) If you determine BY MEASUREMENT that the allocator can't handle
your request stream satisfactorily, then and only then should you try
to outsmart it, and you must MEASURE to ensure that your attempts to
outsmart don't in fact outstupid.

--
Eric Sosman
esosman@ieee-dot-org.invalid