Asp Forum - tagbody/go vs labels/tail-call

Jim Newton

12/3/2015 2:11:00 PM

I have a CL function which takes a state machine (which is a particular data structure, not important to this particular conversation) and dumps lisp code which implements the state machine.

The code currently creates code in terms of TAGBODY and GO, which in my initial opinion
is very clear and clean and hopefully very easy to compile.

When I showed the result to to non-lispers, they both had the reaction that goto is inefficient.
I'd like to get the opinion of lispers and perhaps lisp compiler experts.

Which of the following pieces of machine-generated code is likely to be "better".
Each is just a syntactic translation of the other. And both examples illustrate a state
machine with two states named 13 and 14.

;; GOTO based code
(LAMBDA (SEQ)
(BLOCK CHECK
(LABELS ((END ()
(NULL SEQ))
(NEXT ()
(IF (END)
(RETURN-FROM CHECK NIL)
(POP SEQ))))
(TAGBODY
(GO 13)
13
(WHEN (END) (RETURN-FROM CHECK T))
(TYPECASE (NEXT)
(SYMBOL (GO 14))
(NUMBER (GO 13)))
(RETURN-FROM CHECK NIL)
14
(TYPECASE (NEXT)
(NUMBER (GO 13)))
(RETURN-FROM CHECK NIL)))))

;; LABELS tail-recursion based code
(LAMBDA (SEQ)
(BLOCK CHECK
(LABELS ((END ()
(NULL SEQ))
(NEXT ()
(IF (END)
(RETURN-FROM CHECK NIL)
(POP SEQ)))
(L13 ()
(WHEN (END)
(RETURN-FROM CHECK T))
(TYPECASE (NEXT)
(SYMBOL (L14))
(NUMBER (L13))
(t (RETURN-FROM CHECK NIL))))
(L14 ()
(TYPECASE (NEXT)
(NUMBER (L13))
(t (RETURN-FROM CHECK NIL)))))
(L13))))

8 Answers

Kaz Kylheku

12/3/2015 5:45:00 PM

On 2015-12-03, Jim Newton <jimka.issy@gmail.com> wrote:
> Which of the following pieces of machine-generated code is likely to be
> "better". Each is just a syntactic translation of the other. And both
> examples illustrate a state machine with two states named 13 and 14.

Relevant: you can have labels syntax, *and* compile to tagobyd/go, with macros:

http://www.kylheku.com/cgit/lisp-snippets/tree/tail-recu...

Helmut Eller

12/4/2015 9:33:00 AM

On Thu, Dec 03 2015, Jim Newton wrote:

> Which of the following pieces of machine-generated code is likely to
> be "better". Each is just a syntactic translation of the other. And
> both examples illustrate a state machine with two states named 13 and
> 14.

Compilers can convert state machines written with LABELS to machine
level gotos without (tail-)calls. That transformation is sometimes
called "contification". Implementing GO efficiently is usually a bit
easier for compiler writers. So with a good compiler, both versions
will end up with the same machine code. SBCL/CMUCL will likely do that;
CCL/Allegro probably not.

I think the ANSI Spec doesn't say which version has do be more
efficient, but in practice GO is more efficient in some implementations.
The reverse is not the case for any existing implementation, I think.
So, GO has some portability advantage.

(I wonder a bit why the ANSI Spec doesn't require contification; it's
not that hard to do and contrary to tail-calls, it's useful even on
targets that don't support tail-calls, like the JVM.)

> ;; GOTO based code
[...]
> ;; LABELS tail-recursion based code
[...]

Both versions look OK to me, but the GO based version will be efficient
in more implementations.

Not directly related to the question: the (RETURN-FROM CHECK NIL) inside
NEXT is likely a bit inefficient because calling NEXT will usually
allocate an new stack frame and the RETURN-FROM needs to undo that. An
inline declaration or turning NEXT into a macro would make it more
efficient. (Compilers usually don't inline NEXT automatically because
it's called in more than one place.)

Helmut

Jim Newton

12/4/2015 11:00:00 AM

w.r.t. NEXT and END, I didn't post it here but I thought about that later.
In my most recent version of this code generator, I actually inline NEXT and END
myself so I don't have to rely on the compiler to do it for me. But that produces
code which is of course more difficult for the human to read.

>
> Not directly related to the question: the (RETURN-FROM CHECK NIL) inside
> NEXT is likely a bit inefficient because calling NEXT will usually
> allocate an new stack frame and the RETURN-FROM needs to undo that. An
> inline declaration or turning NEXT into a macro would make it more
> efficient. (Compilers usually don't inline NEXT automatically because
> it's called in more than one place.)
>
> Helmut

Jim Newton

12/4/2015 11:07:00 AM

>
> It will be implementation dependant; you can use:
>
> (declaim (optimize (speed 3) (space 3) (safety 0) (debug 0)))
> (disassemble (compile nil ...
>

It does not really produce assembler code I can judge the efficiency of.

; disassembly for (LAMBDA (SEQ))
; Size: 452 bytes. Origin: #x12B516D943 (segment 1 of 2)
; 943: 498B442460 MOV RAX, [R12+96] ; thread.binding-stack-pointer
; no-arg-parsing entry point
; 948: 488945E8 MOV [RBP-24], RAX
; 94C: 498B442448 MOV RAX, [R12+72] ; thread.current-catch-block
; 951: 488945E0 MOV [RBP-32], RAX
; 955: 488965F0 MOV [RBP-16], RSP
; 959: 488D4DB8 LEA RCX, [RBP-72]
; 95D: 498B442450 MOV RAX, [R12+80] ; thread.current-unwind-protect-block
; 962: 488901 MOV [RCX], RAX
; 965: 48896908 MOV [RCX+8], RBP
; 969: 488D05D2000000 LEA RAX, [RIP+210] ; = L8
; 970: 48894110 MOV [RCX+16], RAX
; 974: 48894DB0 MOV [RBP-80], RCX
; 978: 0F1F840000000000 NOP
; 980: L0: 488BCD MOV RCX, RBP
; 983: 488D4424F0 LEA RAX, [RSP-16]
; 988: 4883EC70 SUB RSP, 112
; 98C: 488BD1 MOV RDX, RCX
; 98F: 488908 MOV [RAX], RCX
; 992: 488BE8 MOV RBP, RAX
; 995: E86A010000 CALL L15
; 99A: 4881FA17001020 CMP RDX, 537919511
; 9A1: 0F8594000000 JNE L7
; 9A7: 488BD5 MOV RDX, RBP
; 9AA: 488D4424F0 LEA RAX, [RSP-16]
; 9AF: 4883EC70 SUB RSP, 112
; 9B3: 488B75B0 MOV RSI, [RBP-80]
; 9B7: 488BCA MOV RCX, RDX
; 9BA: 488910 MOV [RAX], RDX
; 9BD: 488BE8 MOV RBP, RAX
; 9C0: E8D6000000 CALL L13
; 9C5: 4881FA17001020 CMP RDX, 537919511
; 9CC: 740D JEQ L1
; 9CE: 8D42F1 LEA EAX, [RDX-15]
; 9D1: A80F TEST AL, 15
; 9D3: 7545 JNE L4
; 9D5: 807AF145 CMP BYTE PTR [RDX-15], 69
; 9D9: 753F JNE L4
; 9DB: L1: 488BD5 MOV RDX, RBP
; 9DE: 488D4424F0 LEA RAX, [RSP-16]
; 9E3: 4883EC70 SUB RSP, 112
; 9E7: 488B75B0 MOV RSI, [RBP-80]
; 9EB: 488BCA MOV RCX, RDX
; 9EE: 488910 MOV [RAX], RDX
; 9F1: 488BE8 MOV RBP, RAX
; 9F4: E8A2000000 CALL L13
; 9F9: 8D42F1 LEA EAX, [RDX-15]
; 9FC: A801 TEST AL, 1
; 9FE: 750E JNE L2
; A00: 3C0A CMP AL, 10
; A02: 740A JEQ L2
; A04: A80F TEST AL, 15
; A06: 750B JNE L3
; A08: 807AF129 CMP BYTE PTR [RDX-15], 41
; A0C: 7705 JNBE L3
; A0E: L2: E96DFFFFFF JMP L0
; A13: L3: BE17001020 MOV ESI, 537919511
; A18: EB78 JMP L12
; A1A: L4: 8D42F1 LEA EAX, [RDX-15]
; A1D: A801 TEST AL, 1
; A1F: 7515 JNE L6
; A21: 3C0A CMP AL, 10
; A23: 7411 JEQ L6
; A25: A80F TEST AL, 15
; A27: 7506 JNE L5
; A29: 807AF129 CMP BYTE PTR [RDX-15], 41
; A2D: 7607 JBE L6
; A2F: L5: BE17001020 MOV ESI, 537919511
; A34: EB5C JMP L12
; A36: L6: E945FFFFFF JMP L0
; A3B: L7: BE4F001020 MOV ESI, 537919567 ; T
; A40: EB50 JMP L12
; A42: L8: BA17001020 MOV EDX, 537919511
; A47: E304 JRCXZ L9
; A49: 488B53F8 MOV RDX, [RBX-8]
; A4D: L9: 488B65F0 MOV RSP, [RBP-16]
; A51: 488BF2 MOV RSI, RDX
; A54: 488B45E0 MOV RAX, [RBP-32]
; A58: 4989442448 MOV [R12+72], RAX ; thread.current-catch-block
; A5D: 488B7DE8 MOV RDI, [RBP-24]
; A61: 498B4C2460 MOV RCX, [R12+96] ; thread.binding-stack-pointer
; A66: 4839CF CMP RDI, RCX
; A69: 7427 JEQ L12
; A6B: 31DB XOR EBX, EBX
; A6D: L10: 4883E910 SUB RCX, 16
; A71: 488B4108 MOV RAX, [RCX+8]
; A75: 4885C0 TEST RAX, RAX
; A78: 740B JEQ L11
; A7A: 488B11 MOV RDX, [RCX]
; A7D: 49891404 MOV [R12+RAX], RDX
; A81: 48895908 MOV [RCX+8], RBX
; A85: L11: 488919 MOV [RCX], RBX
; A88: 4839CF CMP RDI, RCX
; A8B: 75E0 JNE L10
; A8D: 49894C2460 MOV [R12+96], RCX ; thread.binding-stack-pointer
; A92: L12: 488BD6 MOV RDX, RSI
; A95: 488BE5 MOV RSP, RBP
; A98: F8 CLC
; A99: 5D POP RBP
; A9A: C3 RET
; A9B: L13: 8F4508 POP QWORD PTR [RBP+8]
; Origin #x12B516DA9E (segment 2 of 2)
; A9E: 48894DA8 MOV [RBP-88], RCX ; no-arg-parsing entry point
; AA2: 488975A0 MOV [RBP-96], RSI
; AA6: 488BDD MOV RBX, RBP
; AA9: 488D4424F0 LEA RAX, [RSP-16]
; AAE: 4883EC70 SUB RSP, 112
; AB2: 488BD1 MOV RDX, RCX
; AB5: 488918 MOV [RAX], RBX
; AB8: 488BE8 MOV RBP, RAX
; ABB: E844000000 CALL L15
; AC0: 488B75A0 MOV RSI, [RBP-96]
; AC4: 488B4DA8 MOV RCX, [RBP-88]
; AC8: 4881FA17001020 CMP RDX, 537919511
; ACF: 7519 JNE L14
; AD1: 488B41F8 MOV RAX, [RCX-8]
; AD5: 488B50F9 MOV RDX, [RAX-7]
; AD9: 488B41F8 MOV RAX, [RCX-8]
; ADD: 488B4001 MOV RAX, [RAX+1]
; AE1: 488941F8 MOV [RCX-8], RAX
; AE5: 488BE5 MOV RSP, RBP
; AE8: 5D POP RBP
; AE9: C3 RET
; AEA: L14: BA17001020 MOV EDX, 537919511
; AEF: 488BDC MOV RBX, RSP
; AF2: 52 PUSH RDX
; AF3: B902000000 MOV ECX, 2
; AF8: 488BC6 MOV RAX, RSI
; AFB: 41BB40010020 MOV R11D, 536871232 ; UNWIND
; B01: 41FFE3 JMP R11
; B04: L15: 8F4508 POP QWORD PTR [RBP+8]
NIL

Christian Jullien

12/22/2015 5:30:00 AM

May lispers (for good or religious reasons) hate TAGBODY/GO. I admit it is not very readable and should be avoided if other solutions exist. However, it is said that TAGBODY/GO is the very low level control structure that other special forms / macros may use for their implementation (e.g. loop). A lisp complier is especially happy to compile this special form as it generally matches low level instructions.

See for example https://en.wikipedia.org/wiki/OpenLis...

On Friday, December 4, 2015 at 10:32:35 AM UTC+1, Helmut Eller wrote:
> On Thu, Dec 03 2015, Jim Newton wrote:
>
> > Which of the following pieces of machine-generated code is likely to
> > be "better". Each is just a syntactic translation of the other. And
> > both examples illustrate a state machine with two states named 13 and
> > 14.
>
> Compilers can convert state machines written with LABELS to machine
> level gotos without (tail-)calls. That transformation is sometimes
> called "contification". Implementing GO efficiently is usually a bit
> easier for compiler writers. So with a good compiler, both versions
> will end up with the same machine code. SBCL/CMUCL will likely do that;
> CCL/Allegro probably not.
>
> I think the ANSI Spec doesn't say which version has do be more
> efficient, but in practice GO is more efficient in some implementations.
> The reverse is not the case for any existing implementation, I think.
> So, GO has some portability advantage.
>
> (I wonder a bit why the ANSI Spec doesn't require contification; it's
> not that hard to do and contrary to tail-calls, it's useful even on
> targets that don't support tail-calls, like the JVM.)
>
> > ;; GOTO based code
> [...]
> > ;; LABELS tail-recursion based code
> [...]
>
> Both versions look OK to me, but the GO based version will be efficient
> in more implementations.
>
> Not directly related to the question: the (RETURN-FROM CHECK NIL) inside
> NEXT is likely a bit inefficient because calling NEXT will usually
> allocate an new stack frame and the RETURN-FROM needs to undo that. An
> inline declaration or turning NEXT into a macro would make it more
> efficient. (Compilers usually don't inline NEXT automatically because
> it's called in more than one place.)
>
> Helmut

Pascal J. Bourguignon

12/22/2015 1:11:00 PM

Christian Jullien <christian.jullien@novasparks.com> writes:

> May lispers (for good or religious reasons) hate TAGBODY/GO. I admit
> it is not very readable and should be avoided if other solutions
> exist. However, it is said that TAGBODY/GO is the very low level
> control structure that other special forms / macros may use for their
> implementation (e.g. loop). A lisp complier is especially happy to
> compile this special form as it generally matches low level
> instructions.
>
> See for example https://en.wikipedia.org/wiki/OpenLis...

Indeed. If your compiler targetted a lambda-machine, then lambda and
closures would be the low-level structures. If your compiler targetted
a GPU, then (assuming the compiler can estabalish the absence of cross
side-effects), map, mapcar, etc, would be the low-level structures.
Etc.

What's "primitive" or low-level in lisp (or any other programming
language) is actually a function of the target machine.

--
__Pascal Bourguignon__ http://www.informat...
â??The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.â? -- Carl Bass CEO Autodesk

Pascal Costanza

12/22/2015 1:34:00 PM

On 22/12/2015 06:30, Christian Jullien wrote:
> May lispers (for good or religious reasons) hate TAGBODY/GO. I admit it is not very readable and should be avoided if other solutions exist.

There are situations where TAGBODY/GO is more readable than other
solutions, and then TAGBODY/GO should be preferred.

Pascal

--
My website: http:/...
Common Lisp Document Repository: http://cdr.eu...
Closer to MOP & ContextL: http://common-lisp.net/proje...
The views expressed are my own, and not those of my employer.

Kaz Kylheku

12/22/2015 10:25:00 PM

On 2015-12-22, Christian Jullien <christian.jullien@novasparks.com> wrote:
> May lispers (for good or religious reasons) hate TAGBODY/GO. I admit
> it is not very readable and should be avoided if other solutions
> exist. However, it is said that TAGBODY/GO is the very low level
> control structure that other special forms / macros may use for their
> implementation (e.g. loop). A lisp complier is especially happy to
> compile this special form as it generally matches low level
> instructions.

Common Lisp's tagbody and go is a quite high level control construct.

Firstly, a tagbody form establishes a dynamic exit point around the
evaluation of some forms. When a form invokes (go <symbol>), it searches
for an exit point which is the closest tagbody (or tagbody-like
construct such as prog) in which <symbol> is a label. The evaluation of
that form is then abandoned to transfer control to that exit point. The
target exit point will then transfer control to the selected label.
When the original form is being abandoned, all unwinding takes place.
The foregoing intricacies mean that it is not a naive goto.

Secondly, it's possible to GO out of a downward lexical closure to a
tagbody outside of that lexical closure which is still dynamically live.
That is to say, a form inside tagbody captures a closure, and passes it
to a function. That function then invokes the closure, and the closure
performs a GO. In this situation, all the frames in the whole
activation chain are properly bandoned. It is a true nonlocal
control transfer. The exit point is selected lexically, but the
semantics is dynamic.

Thirdly, all the labels are on one level: they are the direct arguments
of the tagbody. Lisp's GO cannot jump into the middle of some form.
Essentially, the tagbody serves as a dispatching *prompt*. GO abandons
up to that prompt, and then selects another prompt-level form.
Contrast that with C in which a goto has function-wide scope and can
jump into a block such that the variable initializations are skipped.
And so can C's switch statement, which is just goto in disguise:

switch (val) {
int x = 0;
case 42:
if (bar) {
case 43:
x++;
}
}

In summary: low level, my ass.

Appendix:

We can simulate TAGBODY/GO like this:

(let ((next-form t))
(loop while next-form
do (block tagbody
(case next-form
(t form1 (set next-form 'a))
(a form2 (set next-form 'b))
(b form2 (set next-form nil))))))

Suppose we want form1 to do (GO B). Then we just replace form1 with:

(return-from tagbody (set next-form 'b))

Of course, we choose some unique symbol instead of tagbody, and insert
that.

It may be that Lisp compilers treat TAGBODY efficiently. Obviously, not
all cases are simple, though. Analysis is required to see which GO
forms are trivial control transfers.

comp.lang.lisp

tagbody/go vs labels/tail-call

Jim Newton

Kaz Kylheku

Helmut Eller

Jim Newton

Jim Newton

Christian Jullien

Pascal J. Bourguignon

Pascal Costanza

Kaz Kylheku

x Login to ForumsZone