Asp Forum - My latest magnum opus!

pete

9/1/2011 11:24:00 PM

/* BEGIN block_swap_test.c */
/*
** block_swap swaps two contiguous subobjects.
** (base) is the address of the lower addressed subobject.
** (lo_size) is the size of the lower addressed subobject.
** (hi_size) is the size of the higher addressed subobject.
** If either (lo_size) or (hi_size) or both
** compare equal to zero, then no swapping takes place.
*/
#include <stdio.h>

#define STRING "0123456789A"

void block_swap(void *base, size_t lo_size, size_t hi_size);

int
main(void)
{
char string[] = STRING;
size_t right = sizeof string;
size_t left = 0;

puts("/* BEGIN block_swap_test.c output */\n");
do {
--right;
block_swap(string, left, right);
puts(string);
block_swap(string, right, left);
puts(string);
putchar('\n');
} while (++left != sizeof string);
puts("/* END block_swap_test.c output */");
return 0;
}

void
block_swap(void *base, size_t lo_size, size_t hi_size)
{
size_t mod;
unsigned char *end, *middle, swap;

mod = lo_size != 0 && hi_size != 0;
while (mod != 0) {
end = base;
middle = end + lo_size;
if (hi_size > lo_size) {
mod = hi_size % lo_size;
base = end + hi_size - mod;
do {
swap = *end;
*end++ = *middle;
*middle++ = swap;
} while (--hi_size != mod);
} else {
mod = lo_size % hi_size;
end = middle + hi_size;
do {
swap = *--middle;
*middle = *--end;
*end = swap;
} while (--lo_size != mod);
}
}
}

/* END block_swap_test.c */

/* BEGIN block_swap_test.c output */

0123456789A
0123456789A

123456789A0
0123456789A

23456789A01
0123456789A

3456789A012
0123456789A

456789A0123
0123456789A

56789A01234
0123456789A

6789A012345
0123456789A

789A0123456
0123456789A

89A01234567
0123456789A

9A012345678
0123456789A

A0123456789
0123456789A

0123456789A
0123456789A

/* END block_swap_test.c output */

--
pete

16 Answers

Eric Sosman

9/2/2011 1:14:00 AM

On 9/1/2011 7:24 PM, pete wrote:
> /* BEGIN block_swap_test.c */
> [...]

Jon Bentley tackled this problem in one of his "Programming
Pearls" columns in CACM, years and years ago. IIRC, he reported
that he had seen manymanymany attempts to solve it "efficiently"
with clever analysis of which block was the larger, and by how much,
and so on and so forth. He also reported that every such solution
he'd seen was either bug-ridden or actually slower than

Given an array A...FR...Z, to arrive at R...ZA...F:
Reverse the A...F sequence byte for byte
Reverse the R...Z sequence byte for byte
Reverse the entire array byte for byte

I have not examined your code closely -- in truth, I have not
even examined it cursorily. The very fact that it's longer than
Bentley's solution tells me it's almost certainly inferior; Bentley's
is certainly easier to analyze for correctness.

--
Eric Sosman
esosman@ieee-dot-org.invalid

pete

9/2/2011 2:30:00 AM

Eric Sosman wrote:
>
> On 9/1/2011 7:24 PM, pete wrote:
> > /* BEGIN block_swap_test.c */
> > [...]
>
> Jon Bentley tackled this problem in one of his "Programming
> Pearls" columns in CACM, years and years ago. IIRC, he reported
> that he had seen manymanymany attempts to solve it "efficiently"
> with clever analysis of which block was the larger, and by how much,
> and so on and so forth. He also reported that every such solution
> he'd seen was either bug-ridden or actually slower than
>
> Given an array A...FR...Z, to arrive at R...ZA...F:
> Reverse the A...F sequence byte for byte
> Reverse the R...Z sequence byte for byte
> Reverse the entire array byte for byte
>
> I have not examined your code closely -- in truth, I have not
> even examined it cursorily. The very fact that it's longer than
> Bentley's solution tells me it's almost certainly inferior; Bentley's
> is certainly easier to analyze for correctness.

I'll look into it.
Thank you.

--
pete

pete

9/2/2011 2:58:00 AM

pete wrote:
>
> Eric Sosman wrote:
> >
> > On 9/1/2011 7:24 PM, pete wrote:
> > > /* BEGIN block_swap_test.c */
> > > [...]
> >
> > Jon Bentley tackled this problem in one of his "Programming
> > Pearls" columns in CACM, years and years ago. IIRC, he reported
> > that he had seen manymanymany attempts to solve it "efficiently"
> > with clever analysis of which block was the larger, and by how much,
> > and so on and so forth. He also reported that every such solution
> > he'd seen was either bug-ridden or actually slower than
> >
> > Given an array A...FR...Z, to arrive at R...ZA...F:
> > Reverse the A...F sequence byte for byte
> > Reverse the R...Z sequence byte for byte
> > Reverse the entire array byte for byte
> >
> > I have not examined your code closely -- in truth, I have not
> > even examined it cursorily. The very fact that it's longer than
> > Bentley's solution tells me it's almost certainly inferior; Bentley's
> > is certainly easier to analyze for correctness.
>
> I'll look into it.
> Thank you.

Well, I've written it.
Now I have to time it.

I'm optimistic though.
The solution that you gave me,
always does
(lo_size - 1) / 2 + (hi_size - 1) / 2 + (lo_size + hi_size - 1) / 2
swaps.
That's the worst case for the function that I wrote.

/* BEGIN block_swap_test.c */
/*
** block_swap swaps two contiguous subobjects.
** (base) is the address of the lower addressed subobject.
** (lo_size) is the size of the lower addressed subobject.
** (hi_size) is the size of the higher addressed subobject.
** If either (lo_size) or (hi_size) or both
** compare equal to zero, then no swapping takes place.
*/
#include <stdio.h>

#define STRING "0123456789A"

void block_swap(void *base, size_t lo_size, size_t hi_size);
void block_swap2(void *base, size_t lo_size, size_t hi_size);

int
main(void)
{
char string[] = STRING;
size_t right = sizeof string;
size_t left = 0;

puts("/* BEGIN block_swap_test.c output */\n");
do {
--right;
block_swap(string, left, right);
puts(string);
block_swap2(string, right, left);
puts(string);
putchar('\n');
} while (++left != sizeof string);
puts("/* END block_swap_test.c output */");
return 0;
}

void
block_swap(void *base, size_t lo_size, size_t hi_size)
{
size_t mod;
unsigned char *end, *middle, swap;

mod = lo_size != 0 && hi_size != 0;
while (mod != 0) {
end = base;
middle = end + lo_size;
if (hi_size > lo_size) {
mod = hi_size % lo_size;
base = end + hi_size - mod;
do {
swap = *end;
*end++ = *middle;
*middle++ = swap;
} while (--hi_size != mod);
} else {
mod = lo_size % hi_size;
end = middle + hi_size;
do {
swap = *--middle;
*middle = *--end;
*end = swap;
} while (--lo_size != mod);
}
}
}

void
block_swap2(void *base, size_t lo_size, size_t hi_size)
{
unsigned char *first, *middle, *after, swap;

if (lo_size == 0 || hi_size == 0) {
return;
}
first = base;
middle = first + lo_size;
while (--middle > first) {
swap = *middle;
*middle = *first;
*first++ = swap;
}
first = base;
middle = first + lo_size;
after = middle + hi_size;
while (--after > middle) {
swap = *after;
*after = *middle;
*middle++ = swap;
}
first = base;
after = first + lo_size + hi_size;
while (--after > first) {
swap = *after;
*after = *first;
*first++ = swap;
}
}

/* END block_swap_test.c */

--
pete

pete

9/2/2011 4:41:00 AM

pete wrote:
>
> pete wrote:
> >
> > Eric Sosman wrote:
> > >
> > > On 9/1/2011 7:24 PM, pete wrote:
> > > > /* BEGIN block_swap_test.c */
> > > > [...]
> > >
> > > Jon Bentley tackled this problem in one of his "Programming
> > > Pearls" columns in CACM, years and years ago. IIRC, he reported
> > > that he had seen manymanymany attempts to solve it "efficiently"
> > > with clever analysis of which block was the larger, and by how much,
> > > and so on and so forth. He also reported that every such solution
> > > he'd seen was either bug-ridden or actually slower than
> > >
> > > Given an array A...FR...Z, to arrive at R...ZA...F:
> > > Reverse the A...F sequence byte for byte
> > > Reverse the R...Z sequence byte for byte
> > > Reverse the entire array byte for byte
> > >
> > > I have not examined your code closely -- in truth, I have not
> > > even examined it cursorily. The very fact that it's longer than
> > > Bentley's solution tells me it's almost certainly inferior; Bentley's
> > > is certainly easier to analyze for correctness.
> >
> > I'll look into it.
> > Thank you.
>
> Well, I've written it.
> Now I have to time it.
>
> I'm optimistic though.
> The solution that you gave me,
> always does
> (lo_size - 1) / 2 + (hi_size - 1) / 2 + (lo_size + hi_size - 1) / 2
> swaps.
> That's the worst case for the function that I wrote.

I Win!
by a slight amount, on the implementation that I'm using,
in a test that I contrived, with a slightly microoptimised
version of the function that I originally posted (block_swap),
against the first version that popped into my head
of the algorithm that you described (block_swap2).

/* BEGIN block_swap_time.c output */

No Discrepancies!

block_swap time is 5.390000
block_swap2 time is 5.547000

The array has 15000 long unsigned elements.

/* END block_swap_time.c output */

I made an (original) array of 15000 long unsigned
ans populated it with a prng and the I made a (copy) of it.
I tested block_swap against block_swap2 by using
block_swap to swap all the possible combinations of
block sizes in the (copy), and by each time using block_swap2
to undo the swap. Then I checked to make sure
that the copy was restored to be the same as the original,
and it was.

Then I tested each function and timed them.
The advantage that block_swap has over block_swap2
is that there is a best case for block_swap,
which is when the blocks being swapped are of equal sizes,
then it only does about half as many swaps
((lo_size + hi_size) / 2)
as block_swap2 does.
block_swap also has intermediate cases.

/* BEGIN block_swap_time.c */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define SIZE 15000
#define str(s) # s
#define xstr(s) str(s)

void block_swap(void *base, size_t lo_size, size_t hi_size);
void block_swap2(void *base, size_t lo_size, size_t hi_size);

int
main(void)
{
clock_t start, stop;
double time;
long unsigned lu_seed = 123456789LU;
long unsigned *original;
long unsigned *copy;
size_t right;
size_t left;

original = malloc(SIZE * sizeof *original);
copy = malloc(SIZE * sizeof *copy);
if (original == NULL || copy == NULL) {
puts("\noriginal == NULL || copy == NULL\n");
exit(EXIT_FAILURE);
}

for (right = 0; right != SIZE; ++right) {
original[right] = lu_seed;
lu_seed = lu_seed * 69069 + 362437;
}
memcpy(copy, original, SIZE);

left = 0;
right = SIZE;
puts("/* BEGIN block_swap_time.c output */\n");
do {
block_swap(copy, left, right);
block_swap2(copy, right, left);
if (memcmp(copy, original, SIZE) != 0) {
free(copy);
free(original);
puts("Discrepancy!\n");
exit(EXIT_SUCCESS);
}
++left;
} while (--right != 0);
puts("No Discrepancies!\n");

left = 0;
right = SIZE;
stop = clock();
do {
start = clock();
} while (start == stop);
do {
block_swap(copy, left, right);
block_swap(copy, right, left);
++left;
} while (--right != 0);
stop = clock();
time = (double)(stop - start) / (double)CLOCKS_PER_SEC;
printf("block_swap time is %f\n", time);

left = 0;
right = SIZE;
stop = clock();
do {
start = clock();
} while (start == stop);
do {
block_swap2(copy, left, right);
block_swap2(copy, right, left);
++left;
} while (--right != 0);
stop = clock();
time = (double)(stop - start) / (double)CLOCKS_PER_SEC;
printf("block_swap2 time is %f\n\n", time);

puts("The array has " xstr(SIZE)
" long unsigned elements.\n");
if (memcmp(copy, original, SIZE) != 0) {
puts("\nThere's a problem!\n");
}
free(copy);
free(original);
puts("/* END block_swap_time.c output */");
return 0;
}

void
block_swap(void *base, size_t lo_size, size_t hi_size)
{
size_t mod;
unsigned char *end, *middle, swap;

mod = lo_size != 0 && hi_size != 0;
while (mod != 0) {
end = base;
middle = end + lo_size;
if (hi_size > lo_size) {
mod = hi_size % lo_size;
hi_size -= mod;
base = end + hi_size;
do {
swap = *end;
*end++ = *middle;
*middle++ = swap;
} while (--hi_size != 0);
hi_size = mod;
} else {
mod = lo_size % hi_size;
lo_size -= mod;
end = middle + hi_size;
do {
swap = *--middle;
*middle = *--end;
*end = swap;
} while (--lo_size != 0);
lo_size = mod;
}
}
}

void
block_swap2(void *base, size_t lo_size, size_t hi_size)
{
unsigned char *first, *middle, *after, swap;

if (lo_size == 0 || hi_size == 0) {
return;
}
first = base;
middle = first + lo_size;
while (--middle > first) {
swap = *middle;
*middle = *first;
*first++ = swap;
}
first = base;
middle = first + lo_size;
after = middle + hi_size;
while (--after > middle) {
swap = *after;
*after = *middle;
*middle++ = swap;
}
first = base;
after = first + lo_size + hi_size;
while (--after > first) {
swap = *after;
*after = *first;
*first++ = swap;
}
}

/* END block_swap_time.c */

--
pete

pete

9/2/2011 4:54:00 AM

cri

9/2/2011 5:39:00 AM

On Thu, 01 Sep 2011 21:13:32 -0400, Eric Sosman
<esosman@ieee-dot-org.invalid> wrote:

>On 9/1/2011 7:24 PM, pete wrote:
>> /* BEGIN block_swap_test.c */
>> [...]
>
> Jon Bentley tackled this problem in one of his "Programming
>Pearls" columns in CACM, years and years ago. IIRC, he reported
>that he had seen manymanymany attempts to solve it "efficiently"
>with clever analysis of which block was the larger, and by how much,
>and so on and so forth. He also reported that every such solution
>he'd seen was either bug-ridden or actually slower than
>
> Given an array A...FR...Z, to arrive at R...ZA...F:
> Reverse the A...F sequence byte for byte
> Reverse the R...Z sequence byte for byte
> Reverse the entire array byte for byte
>
> I have not examined your code closely -- in truth, I have not
>even examined it cursorily. The very fact that it's longer than
>Bentley's solution tells me it's almost certainly inferior; Bentley's
>is certainly easier to analyze for correctness.

Back in 2004 I did an analysis. Let n be the number of bytes. Then
the reversal trick requires 3n data moves and 3n/u loop tests where u
is the loop unrolling factor. It turns out that if you treat the swap
and a permutation and process each cycle separately you can reduce the
number of data moves and the number of loop tests each to n+d where d
is the gcd of the lengths of the two segments. In terms of primitive
operations that is the best that can be done.

It was a fun exercise, but I never went to the trouble of coding it up
and timing it. I assume it would be slower than the reversal code.
The URL is http://home.tiac.net/~cri/2004/fr... if anyone wants
to give it a try.

pete

9/2/2011 6:59:00 AM

Richard Harter wrote:
>
> On Thu, 01 Sep 2011 21:13:32 -0400, Eric Sosman
> <esosman@ieee-dot-org.invalid> wrote:
>
> >On 9/1/2011 7:24 PM, pete wrote:
> >> /* BEGIN block_swap_test.c */
> >> [...]
> >
> > Jon Bentley tackled this problem in one of his "Programming
> >Pearls" columns in CACM, years and years ago. IIRC, he reported
> >that he had seen manymanymany attempts to solve it "efficiently"
> >with clever analysis of which block was the larger, and by how much,
> >and so on and so forth. He also reported that every such solution
> >he'd seen was either bug-ridden or actually slower than
> >
> > Given an array A...FR...Z, to arrive at R...ZA...F:
> > Reverse the A...F sequence byte for byte
> > Reverse the R...Z sequence byte for byte
> > Reverse the entire array byte for byte
> >
> > I have not examined your code closely -- in truth, I have not
> >even examined it cursorily. The very fact that it's longer than
> >Bentley's solution tells me it's almost certainly inferior; Bentley's
> >is certainly easier to analyze for correctness.
>
> Back in 2004 I did an analysis. Let n be the number of bytes. Then
> the reversal trick requires 3n data moves and 3n/u loop tests where u
> is the loop unrolling factor. It turns out that if you treat the swap
> and a permutation and process each cycle separately you can reduce the
> number of data moves and the number of loop tests each to n+d where d
> is the gcd of the lengths of the two segments. In terms of primitive
> operations that is the best that can be done.
>
> It was a fun exercise, but I never went to the trouble of coding it up
> and timing it. I assume it would be slower than the reversal code.
> The URL is http://home.tiac.net/~cri/2004/fr... if anyone wants
> to give it a try.

The way that I did it, was to move the bigger half
to its final location. If the bigsize % the smallsize == 0,
then there was no problem, but if not,
then the smaller half would be out of order
after the bigger half was moved.
So, instead I moved the bigger half minus (bigsize % smallsize) bytes.
That results in most of the bigger half being in its final place,
with the remnants of the bigger half becoming the new smaller half.
That gets repeated until eventually (bigsize % smallsize == 0)
and then it's done.

--
pete

Eric Sosman

9/2/2011 12:22:00 PM

On 9/2/2011 1:39 AM, Richard Harter wrote:
> On Thu, 01 Sep 2011 21:13:32 -0400, Eric Sosman
> <esosman@ieee-dot-org.invalid> wrote:
> [...]
>> Given an array A...FR...Z, to arrive at R...ZA...F:
>> Reverse the A...F sequence byte for byte
>> Reverse the R...Z sequence byte for byte
>> Reverse the entire array byte for byte
>>
>> I have not examined your code closely -- in truth, I have not
>> even examined it cursorily. The very fact that it's longer than
>> Bentley's solution tells me it's almost certainly inferior; Bentley's
>> is certainly easier to analyze for correctness.
>
> Back in 2004 I did an analysis. Let n be the number of bytes. Then
> the reversal trick requires 3n data moves and 3n/u loop tests where u
> is the loop unrolling factor.

Where do you get "3?" Looks like "2" from here. (Consider what
happens to one particular byte, Q: One of the partial passes moves
it from its original location to a temporary home, the other partial
pass never even looks at it, and the full pass moves it to its final
spot. When all is done, Q has moved twice, not three times.)

--
Eric Sosman
esosman@ieee-dot-org.invalid

cri

9/2/2011 1:39:00 PM

On Fri, 02 Sep 2011 08:21:35 -0400, Eric Sosman
<esosman@ieee-dot-org.invalid> wrote:

>On 9/2/2011 1:39 AM, Richard Harter wrote:
>> On Thu, 01 Sep 2011 21:13:32 -0400, Eric Sosman
>> <esosman@ieee-dot-org.invalid> wrote:
>> [...]
>>> Given an array A...FR...Z, to arrive at R...ZA...F:
>>> Reverse the A...F sequence byte for byte
>>> Reverse the R...Z sequence byte for byte
>>> Reverse the entire array byte for byte
>>>
>>> I have not examined your code closely -- in truth, I have not
>>> even examined it cursorily. The very fact that it's longer than
>>> Bentley's solution tells me it's almost certainly inferior; Bentley's
>>> is certainly easier to analyze for correctness.
>>
>> Back in 2004 I did an analysis. Let n be the number of bytes. Then
>> the reversal trick requires 3n data moves and 3n/u loop tests where u
>> is the loop unrolling factor.
>
> Where do you get "3?" Looks like "2" from here. (Consider what
>happens to one particular byte, Q: One of the partial passes moves
>it from its original location to a temporary home, the other partial
>pass never even looks at it, and the full pass moves it to its final
>spot. When all is done, Q has moved twice, not three times.)

In a reversal you have to swap bytes which requires moving one of them
to a temporary location. Reversing "ab" is

move a from old loc to temp loc
move b to a's old loc
move a from temp to b's old loc

Andrey Tarasevich

9/2/2011 4:21:00 PM

On 9/1/2011 6:13 PM, Eric Sosman wrote:
> On 9/1/2011 7:24 PM, pete wrote:
>> /* BEGIN block_swap_test.c */
>> [...]
>
> Jon Bentley tackled this problem in one of his "Programming
> Pearls" columns in CACM, years and years ago. IIRC, he reported
> that he had seen manymanymany attempts to solve it "efficiently"
> with clever analysis of which block was the larger, and by how much,
> and so on and so forth.

The primary point he makes in PP about this problem is that it happens
to be one of those problems that demonstrate pretty well how
hardware-specific efficiency considerations can trump any abstract
algorithmic efficiency considerations. The most efficient algorithm in
terms of number of moves (the Juggling algorithm) performs the poorest
among the three algorithms considered in the book (the recursive
approach used by the OP is also included). This is, of course, not
unusual for a problem dealing with relocation of continuous memory blocks.

--
Best regards,
Andrey Tarasevich

comp.lang.c

My latest magnum opus!

pete

Eric Sosman

pete

pete

pete

pete

cri

pete

Eric Sosman

cri

Andrey Tarasevich

x Login to ForumsZone