Asp Forum - any tricks to golf this code further?

luserXtrog

8/3/2011 6:30:00 AM

I've just wasted a week trying to solve this problem on code-golf:
http://codegolf.stackexchange.com/questions/284/write-an-interpreter-for-the-untyped-lambd...

I think I've squeezed it dry, but I've got to know if anybody has any
*really* sneaky tricks to make it shorter.

Here is the last version before reducing function names and replacing
repeated sequences with macros. It copies the argument string into
scratch memory, and then manipulates (mutilates?) the expression using
cursors and a stripped down suite of McCarthy primitives.

The winning solution was 320 chars of Python. But C lacks a library
call for string replacement. Even with my cleverest hackery, I can't
get the block below 778 chars. Any comments
or howls of horror are appreciated.

#include<stdio.h>
#include<string.h>
char*m,*n;atom(x){return x?(islower(m[x])?m[x]:0):0;}
eq(x,y){return x&&y&&atom(x)==atom(y);}
cell(x){char*t=n,d=0;if(!x||!m[x])return 0;if(m[x]==' ')++x;
if(atom(x)){*n++=m[x];*n++=0;return(n-m)-2;}
if(m[x]=='\\'){memcpy(n,m+x,4);n+=4;*n++=0;return(n-m)-5;}
do{d=m[x]?(m[x]=='('?d+1:(m[x]==')'?d-1:d)):0;*n++=m[x++];}
while(d);*n++=0;return t-m;}
car(x){return x?cell(x+1):0;}
cdr(x){return car(x)?cell(x+strlen(m+car(x))+1):0;}
cons(x,y){char*t=n;return x?(sprintf(n,y?"(%s %s)":"(%s)",m+x,m
+y),n+=strlen(n)+1,t-m):0;}
subst(x,y,z){if(!x||!y||!z)return 0;return atom(z)?(eq(z,y)?x:z):
cons(m[z+1]=='\\'?car(z):subst(x,y,car(z)),cdr(z)?
subst(x,y,cdr(z)):0);}
eval(x){int a;return atom(x)?x:atom(a=car(x))?x:m[a]=='\\'?
cons(a,eval(cdr(x))):
m[car(a)]=='\\'?eval(subst(eval(cdr(x)),cell(a+3),cdr(a))):
eval( cons(eval(a),cdr(x)?eval(cdr(x)):0));}
try(char*s){char*t=strcpy(n,s);n+=strlen(n)+1;int x=t-m;
printf("input: %s\n", s);printf("eval => %s\n", m+eval(x));n=m+1;}

char*samp[]={
"a","a","b","b",
"((\\ a. a) (b))", "(b)",
"((\\ x. x) (\\ y. (\\ z. z)))", "(\\ y. (\\ z. z))",
"(\\ x. ((\\ y. y) x))", "(\\ x. x)",
"(((\\ x. (\\ y. x)) (\\ a. a)) (\\ b. b))", "(\\ a. a)",
"((\\ x. (\\ y. y)) (\\ a. a))", "(\\ y. y)",
"(((\\ x. (\\ y. y)) (\\ a. a)) (\\ b. b))", "(\\ b. b)",
"((\\x. (x x)) (\\x. (x x)))", "undef",
NULL};
#include<unistd.h>

unsigned sz;
#include<signal.h>
void fix(x){signal(SIGSEGV,fix);brk(m+(sz*=2));}
main(){
char**t;
signal(SIGSEGV,fix);
m=n=sbrk(sz=10*getpagesize());
*n++=0;
for(t=samp;*t;t+=2){
try(*t);
printf("s.b. => %s\n\n", t[1]);
}
return 0;
}

9 Answers

io_x

8/5/2011 6:06:00 AM

"luser- -droog" <mijoryx@yahoo.com> ha scritto nel messaggio
news:27f55e9c-638f-4083-9069-

> #include<stdio.h>
> #include<string.h>
> char*m,*n;atom(x){return x?(islower(m[x])?m[x]:0):0;}
> eq(x,y){return x&&y&&atom(x)==atom(y);}
> cell(x){char*t=n,d=0;if(!x||!m[x])return 0;if(m[x]==' ')++x;
> if(atom(x)){*n++=m[x];*n++=0;return(n-m)-2;}
> if(m[x]=='\\'){memcpy(n,m+x,4);n+=4;*n++=0;return(n-m)-5;}
> do{d=m[x]?(m[x]=='('?d+1:(m[x]==')'?d-1:d)):0;*n++=m[x++];}
> while(d);*n++=0;return t-m;}
> car(x){return x?cell(x+1):0;}

if competition is for write less chars...
they have to count the letter different from ' ' and \n
so one can indent something

Bartc

8/5/2011 11:13:00 AM

luser- -droog" <mijoryx@yahoo.com> wrote in message
news:27f55e9c-638f-4083-9069-db6336be3e12@k8g2000yqk.googlegroups.com...

> The winning solution was 320 chars of Python. But C lacks a library
> call for string replacement. Even with my cleverest hackery, I can't
> get the block below 778 chars. Any comments
> or howls of horror are appreciated.

778? The code you posted is nearer to 2000 characters.

Since io_x has posted, I'm surprised he hasn't suggested his usual
techniques to reduce code size, such as:

#define R return

and replacing 'return' with 'R' everywhere.

The reduction will be slight however.

--
Bartc

luserXtrog

8/10/2011 11:27:00 PM

On Aug 5, 6:13 am, "BartC" <b...@freeuk.com> wrote:
> luser- -droog" <mijo...@yahoo.com> wrote in message
>
> news:27f55e9c-638f-4083-9069-db6336be3e12@k8g2000yqk.googlegroups.com...
>
> > The winning solution was 320 chars of Python. But C lacks a library
> > call for string replacement. Even with my cleverest hackery, I can't
> > get the block below 778 chars. Any comments
> > or howls of horror are appreciated.
>
> 778? The code you posted is nearer to 2000 characters.
>
> Since io_x has posted, I'm surprised he hasn't suggested his usual
> techniques to reduce code size, such as:
>
> #define R return
>
> and replacing 'return' with 'R' everywhere.
>
> The reduction will be slight however.

The 778 is *after* reducing all identifiers to single characters and
using that R macro as well. But these are basically mechanical
transformations and reduce the semantic value of the identifiers. By
following the link, the 778 (reduced since) version can be seen.

What I was interested in here was reducing the size of the "logic";
as this has much more potential for radical gains.

luserXtrog

8/10/2011 11:35:00 PM

On Aug 5, 1:06 am, "io_x" <a...@b.c.invalid> wrote:
> "luser- -droog" <mijo...@yahoo.com> ha scritto nel messaggio
> news:27f55e9c-638f-4083-9069-
>
> > #include<stdio.h>
> > #include<string.h>
> > char*m,*n;atom(x){return x?(islower(m[x])?m[x]:0):0;}
> > eq(x,y){return x&&y&&atom(x)==atom(y);}
> > cell(x){char*t=n,d=0;if(!x||!m[x])return 0;if(m[x]==' ')++x;
> > if(atom(x)){*n++=m[x];*n++=0;return(n-m)-2;}
> > if(m[x]=='\\'){memcpy(n,m+x,4);n+=4;*n++=0;return(n-m)-5;}
> > do{d=m[x]?(m[x]=='('?d+1:(m[x]==')'?d-1:d)):0;*n++=m[x++];}
> > while(d);*n++=0;return t-m;}
> > car(x){return x?cell(x+1):0;}
>
> if competition is for write less chars...
> they have to count the letter different from ' ' and \n
> so one can indent something

The rules for counting are defined for each question. For this one the
count was source-set characters. If one chose to use utf8 to use the
real lambda symbol, it would count as one character despite being a
multi-byte sequence. But space and newline are bona-fide characters in
any charset. So they count. Part of the challenge is to remove "all"
unnecessary characters.

My latest version is down to 644 (omitting main and the test cases).

#include<stdio.h>
#include<string.h>
#define X m[x]
#define R return
char*n,*m;int u,w,d;C(x,y){w=n-m;n+=sprintf(n,y?"(%s %s)":"(%s)",&X,m
+y)+1;R w;}T(x){R X>>5==3;}
L(x){R X==92;}O(x,j){w=n-m;memcpy(n,&X,j);n+=j;*n++=0;R w;}E(x){X==' '?
++x:0;R
X==41?0:L(x)?O(x,4):P(x);}P(x){d=0,w=x;do{X==40?d++:X==41?d--:0;++x;}
while(d>0);R
O(w,x-w);}D(x){u=E(x+1);R u?E(x+1+strlen(m+u)):0;}V(x){int a=E(x
+1),b=D(x);R
T(x)|T(a)?x:L(a)?C(a,V(b)):L(E(a+1))?V(S(V(b),E(a+3),D(a))):V(C(V(a),b?
V(b):0));}S(w,y,x){R
T(x)?(X==m[y]?w:x):C(L(w+1)?E(x+1):S(w,y,E(x+1)),D(x)?S(w,y,D(x)):0);}
Y(char*s){n+=strlen(s=strcpy(n,s))+1;printf("%s\n%s\n\n",s,m+V(s-
m));n=m+1;}

char*s[]={
"((\\ a. a) (b))",
"((\\ x. x) (\\ y. (\\ z. z)))",
"(\\ x. ((\\ y. y) x))",
"(((\\ x. (\\ y. x)) (\\ a. a)) (\\ b. b))",
"((\\ x. (\\ y. y)) (\\ a. a))",
"(((\\ x. (\\ y. y)) (\\ a. a)) (\\ b. b))",
"((\\x. (x x)) (\\x. (x x)))",0};
#include<unistd.h>
main(){char**k;n=m=sbrk(4096);*n++=0;for(k=s;*k;k++)Y(*k);R 0;}

io_x

8/11/2011 6:34:00 AM

"luser- -droog" <mijoryx@yahoo.com> ha scritto nel messaggio
news:7ea8f68e-6ead-4427-a99c-f15b4959915d@s7g2000yqk.googlegroups.com...
On Aug 5, 1:06 am, "io_x" <a...@b.c.invalid> wrote:
> "luser- -droog" <mijo...@yahoo.com> ha scritto nel messaggio
> news:27f55e9c-638f-4083-9069-
>
> > #include<stdio.h>
> > #include<string.h>
> > char*m,*n;atom(x){return x?(islower(m[x])?m[x]:0):0;}
> > eq(x,y){return x&&y&&atom(x)==atom(y);}
> > cell(x){char*t=n,d=0;if(!x||!m[x])return 0;if(m[x]==' ')++x;
> > if(atom(x)){*n++=m[x];*n++=0;return(n-m)-2;}
> > if(m[x]=='\\'){memcpy(n,m+x,4);n+=4;*n++=0;return(n-m)-5;}
> > do{d=m[x]?(m[x]=='('?d+1:(m[x]==')'?d-1:d)):0;*n++=m[x++];}
> > while(d);*n++=0;return t-m;}
> > car(x){return x?cell(x+1):0;}
>
> if competition is for write less chars...
> they have to count the letter different from ' ' and \n
> so one can indent something

The rules for counting are defined for each question. For this one the
count was source-set characters. If one chose to use utf8 to use the
real lambda symbol, it would count as one character despite being a
multi-byte sequence. But space and newline are bona-fide characters in
any charset. So they count. Part of the challenge is to remove "all"
unnecessary characters.

#they are wrong on blank spaces and '\n'
#because there is not one reason to count blank spaces and '\n'
#infact if there is a list of code of name-size
# a1.c 1000, a2.c 2000, a3.c 3000 etc
#that not count the spaces
#it is enought to delete all them (spaces and '\n') for obtain
#the same list conserver the order of the list

My latest version is down to 644 (omitting main and the test cases).

luserXtrog

8/11/2011 11:34:00 PM

On Aug 11, 1:34 am, "io_x" <a...@b.c.invalid> wrote:
> "luser- -droog" <mijo...@yahoo.com> ha scritto nel messaggionews:7ea8f68e-6ead-4427-a99c-f15b4959915d@s7g2000yqk.googlegroups.com...
> On Aug 5, 1:06 am, "io_x" <a...@b.c.invalid> wrote:
>
>
>
> > "luser- -droog" <mijo...@yahoo.com> ha scritto nel messaggio
> > news:27f55e9c-638f-4083-9069-
>
> > > #include<stdio.h>
> > > #include<string.h>
> > > char*m,*n;atom(x){return x?(islower(m[x])?m[x]:0):0;}
> > > eq(x,y){return x&&y&&atom(x)==atom(y);}
> > > cell(x){char*t=n,d=0;if(!x||!m[x])return 0;if(m[x]==' ')++x;
> > > if(atom(x)){*n++=m[x];*n++=0;return(n-m)-2;}
> > > if(m[x]=='\\'){memcpy(n,m+x,4);n+=4;*n++=0;return(n-m)-5;}
> > > do{d=m[x]?(m[x]=='('?d+1:(m[x]==')'?d-1:d)):0;*n++=m[x++];}
> > > while(d);*n++=0;return t-m;}
> > > car(x){return x?cell(x+1):0;}
>
> > if competition is for write less chars...
> > they have to count the letter different from ' ' and \n
> > so one can indent something
>
> The rules for counting are defined for each question. For this one the
> count was source-set characters. If one chose to use utf8 to use the
> real lambda symbol, it would count as one character despite being a
> multi-byte sequence. But space and newline are bona-fide characters in
> any charset. So they count. Part of the challenge is to remove "all"
> unnecessary characters.
>
> #they are wrong on blank spaces and '\n'
> #because there is not one reason to count blank spaces and '\n'

It's up to the whim of the questoner!

> #infact if there is a list of code of name-size
> # a1.c 1000, a2.c 2000, a3.c 3000 etc
> #that not count the spaces
> #it is enought to delete all them (spaces and '\n') for obtain
> #the same list conserver the order of the list
>
> My latest version is down to 644 (omitting main and the test cases).

Take a closer look at this golfed code (now down to 642!). The only
whitespace
in there is *necessary* to separate the tokens. I suspect we're
arguing *the
same thing* back and forth at each other. I think we actually agree in
substance
and are disputing vacuous details.

#include<stdio.h>
#include<string.h>
#define X m[x]
#define R return
char*n,*m;Y(char*s){n+=strlen(s=strcpy(n,s))+1;printf("%s\n%s\n\n",s,m
+V(s-m));n=m+1;}int
u,w,d;C(x,y){w=n-m;n+=sprintf(n,y?"(%s %s)":"(%s)",&X,m+y)+1;R w;}T(x)
{R
X>>5==3;}L(x){R X==92;}O(x,j){w=n-m;memcpy(n,&X,j);n+=j;*n++=0;R
w;}E(x){X==' '?++x:0;R
X==41?0:L(x)?O(x,4):P(x);}P(x){d=0,w=x;do{X==40?d++:X==41?d--:0;++x;}
while(d>0);R
O(w,x-w);}D(x){u=E(x+1);R u?E(x+1+strlen(m+u)):0;}V(x){int a=E(x
+1),b=D(x);R
T(x)|T(a)?x:L(a)?C(a,V(b)):L(E(a+1))?V(S(V(b),E(a+3),D(a))):V(C(V(a),b?
V(b):0));}S(w,y,x){R
T(x)?(X==m[y]?w:x):C(L(w+1)?E(x+1):S(w,y,E(x+1)),D(x)?S(w,y,D(x)):0);}

Andrey Tarasevich

9/2/2011 6:03:00 PM

On 8/2/2011 11:30 PM, luser- -droog wrote:
> ...

Instead of `return(n-m)-2` you can simply do `return n-m-2` even if `n`
and `m` are pointers.

luserXtrog

9/8/2011 6:24:00 AM

On Sep 2, 1:02 pm, Andrey Tarasevich <andreytarasev...@hotmail.com>
wrote:
> On 8/2/2011 11:30 PM, luser- -droog wrote:
>
> > ...
>
> Instead of `return(n-m)-2` you can simply do `return n-m-2` even if `n`
> and `m` are pointers.

Yeah. I've got a bit of a blind spot about associativity.
I was ultimately able to factor all of those copying clauses to use a
helper function:

O(x,j){w=n-m;memcpy(n,&X,j);n+=j;*n++=0;R w;}

so return(n-m)-2 and return(n-m)-5 no longer occur, because the value
is stashed in w before n is modified.

Thanks, though. That would certainly squeeze 1 (or 2) vital
characters.

I've been thinking it should be possible to rewrite the whole thing as
a while-switch statement, implementing recursion with an explicit
stack and
continue. Every function would be an expression. That could factor out
all
the 'return's entirely.

Moshbear dot Net

9/11/2011 5:50:00 AM

On Aug 3, 2:30 am, luser- -droog <mijo...@yahoo.com> wrote:
> I've just wasted a week trying to solve this problem on code-golf:http://codegolf.stackexchange.com/questions/284/write-an-in......
>
> I think I've squeezed it dry, but I've got to know if anybody has any
> *really* sneaky tricks to make it shorter.
>
> Here is the last version before reducing function names and replacing
> repeated sequences with macros. It copies the argument string into
> scratch memory, and then manipulates (mutilates?) the expression using
> cursors and a stripped down suite of McCarthy primitives.
>
> The winning solution was 320 chars of Python. But C lacks a library
> call for string replacement. Even with my cleverest hackery, I can't
> get the block below 778 chars. Any comments
> or howls of horror are appreciated.
>

Read the ISO spec three times, front to back. You will find ways to
further golf your code. It helped me a lot. Then again, I sold K&R
because I had no use for it anymore.

comp.lang.c

any tricks to golf this code further?

luserXtrog

io_x

Bartc

luserXtrog

luserXtrog

io_x

luserXtrog

Andrey Tarasevich

luserXtrog

Moshbear dot Net

x Login to ForumsZone