This page needs to be proofread.
Guy L. Steele Jr. 6 LAMBDA: The Ultimate Declarative

After the POP instruction, the stack looks like:

..., <return address for BAR>, result from G>, Y

That is, the top item of the stack has replaced the one two below it.

After the MOVEM (move to memory) instruction:

..., <return address for BAR>, <result from G>, <result from H>

which is exactly the correct setup for calling F.

Let us not here go into the issue of how such clever code might be generated, but merely recognize the fact that it gets the stack into the necessary condition for calling F.)

Suppose that the saving of a return address and the setting up of arguments were commutative operations.

(This is not true of the LSUBR calling convention, because both operations use the stack; but it is true of the SUBR convention, where the arguments are "spread" [McCarthy 62] [Moon 74] in registers, and the turn address on the stack.)

Then we may permute the code as follows (from the original example):

BAR: <set up arguments for G in registers>

PUSH [BAR1] ;save return address for (G X)

GOTO G ;call function G

BAR1: <save result of G>

<set up arguments for H in registers>

PUSH [BAR2] ;save return address for (H Y)

GOTO H ;call function H

BAR2: <set up arguments for F in registers>

GOTO F ;call function F

As it happens, the PDP-10 provides an instruction, PUSHJ, defined as follows:

L1:

PUSH [L1]

GOTO G

is the same as

L1:

PUSHJ G

except that the PUSHJ takes less code.

Thus we may write the code as:

BAR: <set up arguments for G in registers>

PUSHJ G ;save return address, call G

<save result of G>

<set up arguments for H in registers>

PUSHJ H ;save return address, call H

<set up arguments for F in registers>

GOTO F ;call function F

This is why PUSHJ (and similar instructions on other machines, whether they save the turn adress on a stack, in a register, or in a memory location) works on a subroutine call, and, by extension, why up to now many people have thought of pushing the return address at function call time rather than at form evaluation time.

The use of GOTO to call a function "tail-recursively" (known around MIT as the "JRST hack", from the PDP-10 instruction for GOTO, though the hack itself dates back to the PDP-1) is in fact not just a hack, but rather the most uniform method for invoking functions.

PUSHJ is not a function calling primitive per se, therefore, but rather than optimization of this general approach.