Rabbit: A Compiler for Scheme/Chapter 8

←

Rabbit: A Compiler for Scheme
by Guy L. Steele, Jr.

Chapter 8

→

1177439Rabbit: A Compiler for Scheme — Chapter 8Guy L. Steele, Jr.

[Page 54]

44 8. Compilation Strategy The overall approach RABBIT takes to the compilation of SCHEME code may be summarized as follows:

(I) Alpha-conversion (renaming of variables) *and macro-expansion (expansion of syntactic rewrite rules).

(Z) Preliminary analysis (variable references, 'trivial' expressions, and side effects). n (3) Optimization (meta-evaluation).

(4) Conversion to continuation-passing style.

(5) Environment and closure analysis.

(6) Code generation.

During (1) a data structure is built which is structurally a copy of the user program but in which all variables have been renamed and in which at each 'node' of the program tree are additional slots for extra information. These slots are filled in during (Z). In (3) the topology of the structure may be modified to reflect transformations made to the program; routines from (2) may be called to update the information slots. In (4) a new data structure is contructed from the old one, radically different in structure, but nevertheless also treelike in form. During (5) information is added to slots in the second structure. In (6) this information is used to produce the final code.

A. Alpha-conversion and macro-expansion
B. Preliminary analysis
C. Optimization
D. Conversion to Continuation-Passing Style
E. Environment and closure analysis
F. Code generation

[Page 55]

45 A. Alpha-conversion and macro-expansion In this phase a copy of the user program is made. The user program is conceptually a tree structure; each node is one of several kinds of construct (constant, variable, LAMBDA-expression, IF-expression, combination, etc.). Some kinds of nodes have subnodes; for example, a LAMBDA-expression node has a subnode representing the body, and a combination node has a subnode for each argument. The copying is performed in the obvious way by a recursive tree-walk.

In the process all bound variables are renamed. Each bound variable is assigned a new generated name at the point of binding, and each node for a reference to a bound variable contains this generated name, not the original name. From this point on all variables are dealt with in terms of their new names. (This is possible because, as a consequence of lexical scoping, we can identify all references to each bound variable.) These new names are represented as atomic symbols, and the property lists of these symbols will later be used to store information about the variables.

As each subform of the user program is examined, a check is made for a macro call, which is a list whose car is an atomic symbol with one of several macro-defining properties. when such a call is encountered, the macro call is expanded, and the tree-walk is resumed on the code returned by the expansion process. [Page 56]

46 B . Preliminary analysis The preliminary analysis ("phase l") is in three passes, each involving a tree-walk of the node structure, filling in information slots at each node. (Two passes would have sufficed, but for reasons of clarity and modularity there is one pass for each type of analysis.) The first pass (ENV-ANALYZE) analyzes variable references. For each node we determine the set of all gag (bound) variables referenced at or below that node. For example, for a variable-reference node this set is empty (for a global variable), or the singleton set of the variable itself (for a local variable); for a LAMBDA-expression, it is the set for its body minus the variables bound by that LAMBDA-expression; for an IF-expression, it is the union of the sets for the predicate, consequent, and alternative; and so on. We also compute for each node the set of bound variables which appear in an ASET' at or below the node.

(This set will be a subset of the first set, but no nontrivial use of this property is used in this pass.) Finally, for each variable we store several properties on its property list, including a list of all nodes which reference the variable (for "reading") and a list of of all ASET' nodes which modify the variable. These lists are built incrementally, with an entry added as each reference is encountered during the tree walk. (This exemplifies the general strategy for passing data around; any information which cannot be passed conveniently up and down the tree, but which must jump laterally between branches, is accumulated on the property lists of the variables. It may appear to be "lucky" that all such information has to do with variables, but this is actually an extremely deep property of our notation. The entire point of using identifiers is to relate textually separated constructions. we depend on alpha-conversion to give all variables distinct names (by 'names' we really mean

[Page 57]

47 "compile-time data structures") so that the information for variables which the user happened to give the same name will not be confused.) The second pass (TRIV-ANALYZE) locates "trivial" portions of the program.

(Cf. [wand and Friedman].) Constants and variables are trivial; an IF-expressions is trivial iff the predicate, consequent, and alternative are all trivial; an ASET' is trivial iff its body is trivial; a combination is trivial iff the function is either a global variable which is the name of a MacLISP primitive, or a LAMBDA-expression whose body is trivial, and the arguments are all trivial. LAMBDA-expressions, LABELS-forms (which contain LAMBDA-expressions), and CATCH-forms are never trivial. The idea is that a trivial expression is one that MacLISP could evaluate itself, without benefit of SCHEME control structures. (No denigration of Mac1.ISP's ability is intended by this terminology!) Note particularly the two special cases of combinations distinguished here (in which the function position contains either the name of a MacLISP primitive or a LAMBDA-expression); they are very important, and shall be referred to respectively as TRIVFN-combinations and LAMBDA-combinations.

The third pass (EFFS-ANALYZE) analyzes the possible side-effects caused by each node, and the side-effects which could affect it. It actually produces two sets of analyses, one liberal and one conservative. Where there is any uncertainty as to what side-effects may be involved, it assumes none in one case and all possible in the other. The liberal estimation is used only to issue error messages to the user about possible conflicts which might result as a consequence of the freedom to evaluate arguments to combinations in any order.

The user is given the benefit of doubt, and warned only of a 'provable' conflict.

(Actually, the "proof" is a little sloppy, and can err in both directions, but in practice it has issued no false alarms and a number of helpful warnings.) The conservative estimation is used by the optimizer, which will move expressions

[Page 58]

48 only if it can prove that there will be no conflict.

Side effects are grouped into classes: ASET, RPLACA and RPLACD (which are considered distinct), FILE (input/output operations), and CONS. These are not intended to be exhaustive; there is also an internal notation for "any side-effect whatever". The use of classes enables the analysis to realize, for example, that RPLACA. cannot affect the value of a variable per se. There is a moderately large body of data in RABBIT about the side-effects of MacLISP primitive functions. For example, CAR, CDR, CAAR, CADR, and so on are known not to have side-effects, and to be respectively affected only by RPLACA, RPLACD, RPLACA, RPLACA or RPLACD, and so on. Similarly, RABBIT knows that ASET' affects the values of variables, but cannot affect the outcome of a CAR operation. (It may affect the value of the expression (CAR X), but only because a variable reference is a subnode of the combination. The effects, or affectability, of a combination are the union of the effects, or affectibility, of all arguments plus those of the function.) The CONS side-effect is a special case. This side-effect cannot affect anything, and two instances of it may be performed in the "wrong" order, but performing a single instance twice will produce distinct (as determined by EQ) and therefore incorrect results. In particular, closures of LAMBDA-expressions involve the CONS side-effect. (The definition of SCHEME says nothing about whether EQ is a valid operation on closures, but in general it is not a good idea to produce unnecessary multiple copies.) On the other hand, LAMBDA-expressions occurring in function position of a LAMBDA-combination do not incur the CONS side-effect. The CONS side-effect is given special treatment in the optimizer. (Note Side-Effect Classifications) [Page 59]

49 C . Optimization Once the preliminary analysis is done, the optimization phase performs certain transformations on the code. The result is an equivalent program which will (probably) compile into more efficient code. This new program is itself structurally a valid SCHEME program; that is. all transformations are contained within the language. The transformations are thus similar to those performed on macro calls, consisting of a syntactic rewriting of an expression, except that the situations where such transformations are applicable are more easily recognized in the case of macro calls. It should be clear that the optimizer and the macro-functions are conceptually at the same level in that they may be written in the same metalanguage that operates on representations of SCHEME programs. (Note Non-deterministic Optimization) The simplest transformation is that a combination whose function position contains a LAMBDA-expression and which has no arguments can be replaced by the body of the LAMBDA-expression:

((LAMBDA () body)) => body Another is that, in the case of a LAMBDA-combination, if some parameter of the LAMBDA-expression is not referenced and the corresponding argument can be proved to have no side-effects (with an exception discussed below), then the parameter and argument can be eliminated:

((LAMBDA (xl x2 x3) body) al aZ a3) => ((LAMBDA (xl x3) body) al a3) if x2 is unreferenced in body and a2 has no side-effects Repeated applications of this rule can lead to the preceding case.

A third rule is that, in a LAMBDA-combination, an argument can be

[Page 60]

50 substituted for one or more occurrences of a parameter in the body of the LAMBDA-expression. (This rule is related to the view of LAMBDA as a renaming operator discussed in [Declarative], and together with the two preceding rules make up the rule of beta-conversion.) Such a substitution is permissible only if (a) either the parameter is referred to only once or the argument has no side effects, and (b) the substitution will not alter the order in which expressions are evaluated in such aa way as to allow possible side-effects to produce different results.

Before performing the substitution it is necessary to show that side-effects will not interfere in this manner. This issue is discussed in [Allen and Cocke], [Geschke], and [wu1f], and characterized more accurately in [Standish]. There is also some difficulty if the parameter appears in an ASET'. Presently RABBIT does not attempt any form of substitution for such a parameter; (ASET' is so seldom used in SCHEME programs that this restriction makes very little difference.) This third rule creates an exception to the second. If an argument with a side effect is referred to once, and is substituted for the reference, then the second rule must be invoked to eliminate the original occurrence of the argument, so that the side effect will not occur twice. This requires a little collusion between the two rules.

Even if such a substitution is permissible, it is not always desirable; time/space tradeoffs are involved. The current heuristic is that a substitution is desirable if (1) the parameter is referred to only once; or (2) the argument to be substituted in is a constant or variable; or (3) the argument is a LAMBDA-expression whose body is (3a) a constant, or (3b) a variable reference, or (3c) a combination which has no more arguments than the LAMBDA-expression requires and for which the arguments are all constants or variables. This heuristic was designed to be as conservative as possible while handling most cases which arise from typical macro-expansions.

[Page 61]

51 The case where the expression substituted for a variable is a LAMBDA-expression constitutes an instance of procedure integration [Allen and Cocke].

The more general kind of procedure integration proposed in [Declarative], which would involve block compilation of several user functions, and possibly also user declarations or data type analysis, has not been implemented yet.

It would be possible to substitute a LAMBDA-expression for a variable reference in the case of a variable bound by a LABELS. This might be useful in the case of a LABELS produced by a simpleminded PROG macro, which produced a labelled function for each statement of the FROG; in such a case most labelled functions would be referred to only once. We have not implemented this yet in RABBIT. (Note Loop Unrolling) Currently there is not any attempt to perform the inverse of beta-conversion. This process would be that of locating common subexpressions of some single large expression, making that large expression the body of a LAMBDA-expression of one parameter, replacing all occurrences of the common subexpression by a reference to the parameter, and replacing the large expression by a combination whose function position contained the LAMBDA-expression and whose argument was a copy of the common subexpression. More generally, several common subexpressions could be isolated at once and made into several parameters of the LAMBDA-expression. For example, consider:

(LAMBDA (A B C) (LIST (/ (+ (- B) (SQRT (- (" B Z) (e 4 A C)))) (* 2 AH (/ (- (- B) (SORT (~ (" B 2) (* 4 A C)))) (* 2 A)))) within the large expression (LIST ...) we might detect the common subexpressions (- B), (SQRT . ..), and (* Z A). Thus we would invent three parameters 01. Q2, Q3 and transform the expression into:

[Page 62]

(LAMBDA (A B C) 52 ((LAMBDA (QI Q2 Q3) (LIST (/ (+ Q1 QZ) Q3) (/ (- Q1 02) Q3))) (- B) (SORT (- (' B 2) (* 4 A C))) (* 2 A))) (There would be no problem of conflicting names as there is for macro rules, because we are operating on code for which all variables have already been renamed; Q1, QZ, and Q3 can be chosen as the next variables in sequence.) the renaming This approach doesn't always work if side-effects are present; the abstracted (!) common subexpression may be evaluated too soon , or the wrong number of times. This can be solved by wrapping (LAMBDA () e) around the common subexpression, and replacing references by' a combination instead of a simple variable reference. For example:

(IF (HAIRYP X) (BLOCK (PRINT '

(PRINT (PRINT (BLOCK (PRINT '

(PRINT (PRINT We could not transform it ((LAMBDA (Ql) (Here is some hairzl) X) '1End of hair.|)) 1This one is baldzl) X) '1End of baldness.|))) into this:

(IF (HAIRYP X) (BLOCK (BLOCK (PRINT X)) because x would be printed (PRINT Q1 (PRINT (PRINT Q1 (PRINT before '1Here is some hair:|) '1End of hair.|)) '1This one is baldzl) '1End of baldness.|)))) the appropriate leading message. Instead, we

[Page 63]

53 transform it into:

((LAMBDA (QI) (IF (HAIRYP X) (BLOCK (PRINT '1Here is some hairzl) (01) (PRINT '1End of hair.|)) (BLOCK (PRINT '1This one is bald:|) (Q1) (PRINT '1End of baldness,|)))) (LAMBDA () (PRINT X))) This is similar to the call-by-name trick used in writing macro rules.

A more general transformation would detect nearly common subexpressions as follows:

((LAMBDA (Ql) (IF (HAIRYP X) (QI '1Here is some hairzl '1End of hair.|) (QI '1This one is bald:| '1End of baldness.|))) (LAMBDA (Q2 Q3) (BLOCK (PRINT OZ) (PRINT X) (PRINT Q3)))) In this way we can express the notion of subroutinization.

(Note Subroutinization} we point out these possibilities despite the fact that they have not been implemented in RABBIT because the problem of isolating co mon subexpressions seems not to have been expressed in quite this way in the literature on compilation strategies. We might speculate that this is because most compilers which use complex optimization strategies have been for ALGOL-like languages which do not treat functions as full-fledged data objects, or even permit the writing of "anonymous" functions in functions calls as LISP does.

RABBIT does perform folding on constant expressions [Allen and Cocke]; that is, any combination whose function is a side-effect-less HacLISP primitive

[Page 64]

54 and whose arguments are all constants is replaced by the result of applying the primitive to the arguments. There is presently no attempt to do the same thing for side-effect-less SCHEME functions, although this is conceptually no problem.

Finally, there are two transformations on IF expressions. One is simply that an IF expression with a constant predicate is simplified to its consequent or alternative (resulting in elimination of dead code). The other was adapted from [Standish], which does not have this precise transformation listed, but gives a more general rule. In its original form this transformation is:

(IF (IF a b c) d e) => (IF a (IF b d e) (IF C d e)) One problem with this is that the code for d and e is duplicated. This can be avoided by the use of LAMBDA-expressions:

((LAMBDA (Ol QZ) (IF a (IF b (01) (QZ)) (IF C (01) (QZ)))) (LAMBDA () d) (LAMBDA () e)) As before, there is no problem of name conflicts with Q1 and QZ. While this code may appear unnecessarily complex, the calls to the functions Q1 and Q2 will typically, as shown above, be compiled as simple GOTO's. As an example, consider the expression:

(IF (AND PRED1 PREDZ) (PRINT 'WIN) (ERROR 'LOSE)) Expansion of the AND macro will result in:

(IF ((LANBDA (V R) (IF V (R) 'NIL)) PRED1 (LAMBDA () PRED2)) (PRINT 'WIN) (ERROR 'LOSE))

[Page 65]

55 (For expository clarity we will not bother to rename all the variables, inasmuch as they are already distinct.) Because V and R have only one reference apiece (and there are no possible interfering side-effects), the corresponding arguments can be substituted for them.

(IF ((LAMBDA (V R) (IF PRED1 ((LAMBDA () PRED2)) 'NIL)) PRED1 (LAMBDA () PREDZ)) (PRINT 'WIN) (ERROR 'LOSE)) Now, since V and R have no referents at all, they and the corresponding arguments can be eliminated, since the arguments have no side-effects.

(IF ((LAMBDA () (IF PRED1 ((LAMBDA () PREDZ)) 'NIL))) (PRINT 'WIN) (ERROR 'LOSE)) Next, the combination ((LAMBDA () ...)) is eliminated in two places:

(IF (IF PREDI PREDZ 'NIL) (PRINT 'WIN) (ERROR 'LOSE)) Now, the transformation on the nested IF's:

((LAMBDA (QI Q2) (IF PRED1 (IF PRED2 (QI) (QZ)) (IF 'NIL (01) (Q2)))) (LAMBDA () (PRINT 'WIN)) (LAMBDA () (ERROR 'LOSE))) Now one IF has a constant predicate and can be simplified:

[Page 66]

56 ((LAMBDA (Q1 Q2) (IF PRED1 (IF PRED2 (QI) (QZ)) (Q2))) (LAMBDA () (PRINT 'wIN)) (LAMBDA () (ERROR 'LOSE))) The variable Q1 has only one referent, and so we substitute in, eliminate the variable and argument, and collapse a ((LAMBDA () ..)):

((LAMBDA (Q2) (IF PRED1 (IF PRED2 (PRINT 'WIN) (Q2)) (Q2))) (LAMBDA () (ERROR 'LOSE))) Recalling that (QZ) is, in effect, a GOTO branching to the common piece of code, and that by virtue of later analysis no actual closure will be created for either LAMBDA-expression, this result is quite reasonable. (Note Evaluation for Control) [Page 66, continued]

D. Conversion to Continuation-Passing Style This phase is the real meat of the compilation process. It is of interest primarily in that it transforms a program written in SCHEME into an equivalent program (the continuation-passing-style version, or CPS version), written in a language isomorphic to a subset of SCHEME with the property that interpreting it requires no control stack or other unbounded temporary storage and no decisions as to the order of evaluation of (nontrivial) subexpressions, The importance of these properties cannot be overemphasized. The fact that it is essentially a subset of SCHEME implies that its semantics are as clean, elegant, and well-understood as those of the original language. It is easy to build an

[Page 67]

57 interpreter for this subset, given the existence of a SCHEME interpreter, which can execute the transformed program directly at this level. This cannot be said for other traditional intermediate compilation forms; building an interpreter for triples [Gries], for example, would be a tremendous undertaking. The continuation-passing version expresses all temporary intermediate results explicitly as variables appearing in the program text, and all temporary control structure in the form of LAMBDA-expressions (that is, closures). It is explicit in directing the order of operations; there is no nontrivial freedom at any point in the evaluation process. V As a result, once the CPS version of a program has been generated, the remainder of the compilation process is fairly easy. There is a reasonably direct correspondence between constructs in the CPS language and "machine-language" operations (if one assumes CONS to be a "machine-language" primitive for augmenting environment structure, which we do). The later passes are complicated only by the desire to handle certain special cases in an optimal manner, most particularly the case of a function call whose function position contains a variable which can be determined to refer to a known LAMBDA-expression. This analysis must be done after the CPS conversion because it applies to continuations as well as LAMBDA-expressions written by the user or generated by macros.

The CPS language differs from SCHEME in only two respects. First, each primitive function is different, in that it returns no value; instead, it accepts an additional argument, the continuation, which must be a "function" of one argument, and by definition invokes the continuation tail-recursively, giving it as an argument the computed "value" of the primitive function. We extend this by convention to non-primitive functions, and so Q functions are considered to take a continuation as one of its arguments (by convention the first this

[Page 68]

58 differs from the convention used in the examples in [SCHEME], [Imperative], and [Declarative]). Continuations, however, do not themselves take continuations as arguments. A Second, no combination may have a nontrivial argument. In strict continuation-passing style (as described in note {Evalorder) of [Imperative]), this implies that no combination can have another combination as an argument, or an IF-expression with a nontrivial consequent or alternative, etc. Ne relax this to allow as arguments any trivial form in the sense described above for the preliminary triviality analysis. we note that, in principle, trivial expressions require no unbounded space on the part of the SCHEME interpreter to evaluate, and that the compiler need not worry about control and environment issues for trivial expressions. (Trivial expressions do require unbounded space on the part of the MacLISP run-time system, because the point of the triviality analysis is that trivial expressions can be handled by MacLISP! The question of what should be considered trivial is actually a function of the characteristics of our target machine. we note that, at the least, variables, constants, and LAMBDA-expressions should be considered trivial. That the preliminary triviality analysis does QE consider LAMBDA-expressions trivial is a trick so that all closures will be processed by the CPS-conversion process, and the fact that we call it a triviality analysis is a white lie. See, however, [wand and Friedman].) The effect of the restriction on combinations is startling. On the one hand, they do not so constrain the language as to be useless; on the other hand, they require a radically different approach to the expression of algorithms. It is easy to see that no control stack is necessary to evaluate such code, for, as mentioned in [SCHEME], control stack is used only to keep track of intermediate values and return addresses, and these arise only in the case of combinations

[Page 69]

59 with nontrivial arguments, and conditionals with nontrivial predicates.

An algorithm for converting SCHEME programs to continuation-passing style was given in Appendix A of [Declarative]. {Note Old CPS Algorithm) The one used in RABBIT is almost identical, except that for the convenience of the code-generation phase a distinction is made between ordinary LAMBDA-expressions and continuations, and between combinations used to invoke "functions" and those used to invoke continuations. These sets can in fact be consistently distinguished, and it affords a certain amount of error-checking; for example, a LAMBDA-expression should never appear in the "function" position of a continuation-invoking combination. Another fine point is that ASET' can never be applied to a variable bound by a continuation. Except for such differences arising from their uses, the two sets of constructs are treated more or less identically in later phases. An additional difference between the algorithm in [Declarative] and the one in RABBIT is that trivial subforms are treated as single nodes in the CPS version; these nodes have pointers to the non-CPS versions of the relevant code, which are largely ignored by later processing until the final code is to be generated.

It must be emphasized that there is not necessarily a unique CPS version for a given SCHEME program; there is as much freedom as there is in the original program to reorder the evaluation of subexpressions. In the course of the conversion process decisions must be made as to what order to use in evaluating arguments to combinations. The current decision procedure is fairly simpleminded, consisting mostly of not making copies of constants and the values of variables. The point here, as earlier, is not so much that RABBIT has a much better algorithm than other compilers as that it has a far cleaner way of expressing the result. (For a complex decision procedure for argument ordering, see [wulf].) (Note Non-deterministic CPS Conversion) [Page 70]

60 E. Environment and closure analysis This phase consists of four passes over the CPS version of the program.

As with the earlier preliminary analysis, each pass determines one related set of information and attaches this information to nodes of the program tree and to property lists.

The first pass (CENV-ANALYZE) analyzes variable references for the CPS version in a manner similar to that of the first pass of the preliminary analysis. The results of this previous analysis are used here in the case of trivial expressions; with this exception the analysis is redone completely, because additional variables are introduced by the CPS conversion. (None of these new variables can appear in an ASET', however, and so the analysis of written variables need not be done over.) In addition, for each variable reference which does not occur in the function position of a combination, we mark that variable with a non-nil VARIABLE-REF? property, used later to determine whether closures need to be created for known functions.

The second pass (BIND-ANALYZE) determines for each LAMBDA-expression whether a closure will be needed for it at run time. There are three possibilities:

(1) If the function denoted by the LAMBDA-expression is bound to some variable, and that variable is referenced other than in function position, then the closure is being treated as data, and must be a full (standard CBETA format) closure. If the function itself occurs in non-function position other than in a LAMBDA-combination, it must be fully closed.

(Z) If the closure is bound to some variable, and that variable is referenced

[Page 71]

61 only in function position, but some of these references occur within other partially or fully closed functions, then this function must be partially closed. By this we mean that the environment for the closure must be "consed up", but no pointer to the code need be added on as for a full closure. This function will always be called from places that know the name of the function and so can just perform a GO to the code, but those such places which are within closures must have a complete copy of the necessary environment.

(3) In other cases (functions bound to variables referenced only in function position and never within a closed function, or functions occurring in function position of LAMBDA-combinations), the function need not be closed.

This is because the environment can always be fully recovered from the environment at the point of call.

In order to determine this information, it is necessary to determine, for each node, the set of variables referred to from within closed functions at or below that node. Thus this process and the process of determining which functions to close are highly interdependent, and so must be accomplished in a single pass.

The second pass also generates a name for each LAMBDA-expression (to be used as tags in the output code, as discussed in the examples earlier), and for non-closed functions determines which variables will be assigned to "registers" or "memory locations". For these non-closed functions it may determine that certain variables need not be assigned locations at all (they are never referenced, or are bound to other non-closed functions the latter circumstance is important when a variable is known to denote a certain function, but the optimizer was too conservative to perform beta-substitution for fear of duplicating code and thus wasting space). Finally, for each variable which is (logically, at run time not necessarily actually) bound to a known function (and

[Page 72]

62 which never appears in an ASET'), a property KNOWN-FUNCTION is put on its property list whose value is the node of the CPS version of that function. This property is used later in generating code for combinations in whose function positions such variables appear.

The third pass (DEPTH-ANALYZE) examines each LAMBDA-expression and determines the precise registers or memory locations through which arguments are to be passed to each. Closed functions take their arguments in the standard registers described earlier; non-closed functions may take their arguments in any desired places; (Partially closed functions could also, but there is little advantage to this.) The allocation strategy in RABBIT for non-closed functions is presently merely stack-like; the deeper the nesting of a function, the higher in the ordering of "registers" and "memory locations" are the locations assigned.

(See e.g. [Johnsson] for a detailed analysis of the register allocation problem.) ' The fourth pass (CLOSE-ANALYZE) determines the precise format of the environment to be constructed for each closure. That is, while the third pass handles cases for which stack-allocation of environments will suffice, the fourth pass deals with heap-allocated environment structures. Recall that the format of an environment can be completely arbitrary, since the only code which can possibly refer to an environment is the function for a closure of which the environment was created. Therefore the compiler which compiles that function has a free hand in determining the structure of the environment. For the sake of simplicity, RABBIT chooses to generate code which represents environments simply as a list of variable values. Several environment lists may share a common tail.

The environment for a closure need not contain any variables not needed by the closed function, but it may if this will allow the sharing of a single structure among several closures. (There is a problem with variables modified by ASET' which is discussed in the next paragraph.)

[Page 73]

63 For each LAMBDA-expression which must be closed, three sets of variables are computed: (1) the variables which will already be in the "consed" environment structure at the time the closure is to be created; (2) additional variables which must be added ("consed on") to the existing structure to create the closure (because at that point they are spread out in "registers") {Note Heap-Allocated Contours); (3) variables which must be added to the environment immediately after entering the function because they must eventually be added in for closures later and they are referred to in ASET' constructs. The third set arises from a requirement that ASET' constructs must have a consistent effect, and confusion can arise if a variable's value can be in more than one place. If the value were allowed to be both in a "register" and in an environment structure, or in several different environment structures, then altering the value in one place would not affect the other places. To assure consistency, this third set is computed, and such variables must at run time be placed in an environment structure to be shared by all others which refer to such variables.

For every LABELS statement a set of variables is computed which is the set of variables to be added to the existing environment on entry to the LABELS body, in order to share this new structure among all the closures to be created for the LABELS functions. [Page 74]

64 F. Code generation Given the foregoing analysis, the generation of code is straightforward. and largely consists of using the information already gathered to detect special cases. The special cases of interest relate almost entirely to function calls and closures (indeed, there is little else in the language for RABBlT's purposes! ) _ RABBIT has provision for 'block compiling' a number of functions into a single module. This permits an optimization in which one function can transfer control directly to another without going through the 'UUO handler'. Even if several g functions are not compiled into a single module, this is still of advantage, because a single user function can produce a large number of output functions, as a consequence of the code-generation techniques.

A module consists of a single HacLlSP function whose body is a single FROG. This FROG has no local variables, but does have a number of tags, one for each function in the module. On entry to the module, the register HI-INV!! will contain the "environment" for the function to be executed. As noted above, the format of this is arbitrary. For functions compiled by RABBIT, this Is a list whose car is a tag within the FROG and whose cdr Is the 'real envlronment'.

(Note Code Pointers) At the beginning of the FROG there is always the code (GO (PROGZ NIL (CAR HENVH) (sire ueuvn (CDR ~~Euvn)))) the effect of which is to put the 'real environment' in "ENV" and than perform a computed GO to the appropriate tag. (This is the only circumstance in which the HacLISP PROGZ and computed GO constructs are used by RABBIT-compiled code.

Either could be eliminated at the expense of more bookkeeping, the former by

[Page 75]

65 using a temporary intermediate variable, the latter by using a giant COND with non-computed GO statements (which is effectively how the HacLISP compiler compiles a computed GO anyway). As always, such trivial issues are left to the MacLISP compiler when they do not bear on the issues of interest in compiling SCHEME code.) For small functions, often the 'main entry point' is the only closed function, and it would be possible to eliminate the computed GO, but RABBIT always outputs one, because is is cheap and provides a useful error check.

Once the computed GO has been performed, the code following the tag is responsible for performing its bit of computation and then exiting. It may exit by setting the "FUN" register to another function, setting up appropriate argument registers, and then doing (RETURN NIL) to exit the module and enter the UUO handler; or it may exit by directly transferring control to another function within the module by performing a GO to the appropriate tag, after setting up the arguments and HENVH. In the latter case the arguments may actually be passed through "memory locations" rather than the standard "registers'. (Conceptually. in this optimized case the environment needed for the function being called is being passed, not in "ENV", but spread out in those registers and locations lower than those being used to pass the arguments.) Starting with the CPS version of one or more user functions. the generation of the code for a module proceeds iteratively. Code for each function is generated in turn, producing one segment of code and a tag; this tag and code will become part of the body of the module. In processing a function, other functions may be encountered; in general, each such function is added to the list of outstanding functions for the module, and is replaced by code to generate a closure for that function. When all functions have been processed, the outer structure of the module is created.

Many situations are treated specially. For example,

[Page 76]

66 ((LAMBDA ...) ...) does not cause the LAMBDA-expression to be added to the list of outstanding functions; rather, a MacLISP PROGN is constructed consisting of the argument setup followed by the code for the body of the LAMBDA-expression. A more subtle case is (FOO (LAMBDA ...) ...) where FOO is the name of a MacLISP primitive and the LAMBDA-expression is the continuation. In this case a PROGN is constructed consisting of calling the MacLISP primitive on the other arguments, putting this value into the appropriate location, and then executing the body of the LAMBDA-expression. (It should be noted that all these special cases must be anticipated by the analysis preceding the code generation phase.) In the case of ((LAMBDA ...) ...), we must also handle the argument setup a little carefully, because parameters which are never referred to or which represent known non-closed functions need not actually be passed. However, the corresponding argument for the first case must nevertheless be evaluated because it lnay have a side-effect. A good example is the result of' expanding BLOCK (neglecting the effects of optimization): there is a (continuation-passing style) combination of the form:

((LAMBDA (C A B) (B C)) cont X (LAMBDA (C) y)) The argument x need not be passed, but presumably has a side effect and so must be evaluated. The second LAMBDA-expression need not be closed, and so requires neither evaluation nor passing. The output code uses a PROGN to evaluate the arguments which are potentially for effect. In this way the end result of a

[Page 77]

67 BLOCK construct actually turns out to be a HacLISP PROGN. (The routine LAMBDACATE in the Appendix is responsible for this analysis.) (Note Evaluation for Effect) Another case of interest is a combination whose function position contains a variable with a KNOWN-FUNCTION property. The value of this property is the node for the CPS version of the function, which provides information about pox ° in code generation strategies. We can decide which arguments needn't be passed as for the ((LAMBDA ...) ...) case, and can also arrange to call the function with a direct (MacLISP) GO to the appropriate tag within the module.

The setup of the environment depends on whether the function is non-closed or partially closed; in the latter case the partial closure is the environment, and in the former the environment can be recovered from the current one (and may even be the same).

A certain amount of "peephole optimization' [HcKeeman] is also performed, primarily to make it easier for people to inspect the code produced, since the MacLISP compiler will handle them anyway. Examples of these are avoiding the generation of SETQ of a variable to the value of that same variable; reduction of car-cdr chains to single functions, such as (CAR (CDR (CDR x))) to (CADDR x); removal of nested PROGN's such as (PROGN a (PROGN b c) d) => (PROGN a b c d) and the like; and simplification of nested COND's, such as (COND (a b) (COND (a b) (T (COND (c d) => (c d) ...))) ...) One of the effects of this last peephole optimization is that many times, when the user writes a COND in a piece of SCHEME code, that COND is expanded into IF

[Page 78]

68 constructs, and then re-contracted by the peephole optimization 'into equivalent COND! (This fact is of no practical consequence, but looks cute.)