[Page 127]

117 Appendix we present here the complete working source code for RABBIT, written in SCHEME. (The listing of the code was produced by the "@" listing generator, written by Richard M. Stallman, Guy L. Steele Jr., and other contributors.) The code is presented on successive odd-numbered pages. Commentary on the code is on the facing even-numbered page. An index appears at the end of the listing, indicating where each function is defined.

It should be emphasized that RABBIT was not written with efficiency as a particular goal. Rather, the uppermost goals were clarity, ease of debugging, and adaptability to changing algorithms during the development process Much information is generated, never used by the compilation process, and then thrown away, simply so that if some malfunction should occur it would be easier to conduct a postmortem analysis. Information that is used for compilation is often retained longer than necessary. The overall approach is to create a big data structure and then, step by step, fill in slots, never throwing anything away, even though it may no longer be needed.

The algorithms could be increased in speed, particularly the optimizer, which often recomputes information needlessly. Determining whether or not the recomputation was necessary would have cluttered up the algorithms, however, making them harder to read and to modify, and so this was omitted. Similarly, certain improvements could dramatically decrease the space used. The larger functions in RABBIT can just barely be compiled with a memory size of 256K words on a PDP-10. However, it was deemed worthwhile to keep the extra information available for as long a time as possible.

The implementation of RABBIT has taken perhaps three man-months. This includes throwing away the original optimizer and rewriting it completely, and accomodating certain changes to the SCHEME language as they occurred. RABBIT was operational, without the optimizer, after about one man-month's work. The dissertation was written after the first version of the optimizer was demonstrated to work. The remaining time was spent analyzing the faults of the first optimizer, writing the second version, accomodating language changes, making performance measurements, and testing RABBIT on programs other than RABBIT itself.

[Page 128]

The main modules of RABBIT are organized something like this:

COMFILE, TRANSDUCE, PROCESS-FORM (Bookkeeping and file handling) COMPILE (Compile a function definition) ALPHATIZE (Convert input, rename variables) MACRO-EXPAND (Expand macro forms) META-EVALUATE (Source-to-source optimizations) PASS1-ANALYZE (Preliminary code analysis) ENV-ANALYZE (Environment analysis) TRIV-ANALYZE (Triviality analysis) EFFS-ANALYZE (Side effects analysis) META-IF-FUDGE META-COMBINATION-TRIVFN META-COMBINATION-LAMBDA SUBST-CANDIDATE META-SUBSTITUTE CONVERT CENV-ANALYZE BIND-ANALYZE DEPTH-ANALYZE CLOSE-ANALYZE CONPILATE-ONE-FUNCTION CONPILATE COMP-BODY ANALYZE TRIV-ANALYZE (Transform nested IF expressions) (Constants folding) (Beta-conversion) (Substitution feasibility) (Substitution, subsumption) (Convert to continuation-passing style) (Environment analysis) (Bindings analysis) (Register allocation) (Environment structure design) (Generate code, producing one module) (Generate code for one subroutine) (Compile procedure body) (Generate value-producing code) (Generate 'trivial' code)

[Page 129]

A1:0UUX;RABBIT 568 GLS 12:14:50 Monday, May 15, 1978 FO+1D.ZH.39M.4S.

Created 23:29:15 Sunday, May 14, 1978 FO+13H.53M.Z9S. -RRRRRRRR RRRRRRRR RRRRRRRR RR RR RR RR RR RR RR RR RR RR RR RR RRRRRRRR RRRRRRRR RRRRRRRR RR RR RR RR RR RR RR RR RR RR RR RR RR RR RR RR RR RR AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AA AA AA AA AA AA AA AA AA AA AA AA BBBBBBBB BBBBBBBB BBBBBBBB BB BB BB BB BB BB BB BB BB BB BB BB BBBBBBBB BBBBBBBB BBBBBBBB BB BB BB BB B8 B8 BB BB BB BB BB BB BBBBBBBB BBBBBBBB BBBBBBBB BBBBBBBB 111111 TTTTTTTTTT BBBBBBBB 111111 TTTTTTTTTT BBBBBBBB 111111 TTTTTTTTTT BB BB 11 TT BB BB 11 TT BB BB 11 TT BB BB 11 TT BB BB IT TT BB BB 11 TT BBBBBBBB I1 TT BBBBBB66 11 TT BBBBBBBB 11 TT BB BB 11 TT B8 BB 11 TT BB BB 11 TT BB BB 11 TT BB BB I1 TT BB BB 11 TT BBBBBBBB 111111 TT BBBBBBBB 111111 TT BBBBBBBB 111111 TT Al:0UUX;RABBIT 568 GLS 12:14:50 Monday, May 15, 1978 FO+1D.ZM.39H.4S.

Created 23:29:15 Sunday. May 14. 1978 F0+13H.53M.Z9S.

5555555555 5555555555 5555555555 55 55 55 55 55 55 55555555 55555555 55555555 55 55 55 55 55 55 55 55 55 555555 555555 555555 666666 666666 666666 66 66 66 66 66 66 66 66 66 66666666 66666666 66666666 66 66 66 66 66 66 66 66 66 66 66 66 666666 666666 666666 swnuw Settings: L[LISP] z A N asv now x Fonts: F[FONTS;ZZFG KST,,] 888888 888888 888888 88 B8 88 88 88 88 88 88 88 88 88 88 888888 888888 888868 88 88 88 88 88 88 88 88 B8 88 88 86 888888 888888 888888

[Page 130]

120 The DECLARE forms are for the benefit of the MacLISP compiler, which will process the result of compiling this file (i.e. RABBIT compiling itself). The first few forms are concerned with switch settings, allocation of memory within the MacLISP compiler, and loading of auxiliary functions which must be available at compile time.

The large block of SPECIAL declarations contains the name of every SCHEME function in the file. This is necessary because the run-time representation of a global variable is as a MacLISP SPECIAL variable. The compiled function objects will reside in MacLISP value cells, and SCHEME functions refer to each other through these cells.

The second set of SPECIAL declarations (variables whose names begin and end with a "*") specify variables used globally by RABBIT. These fall into three categories: variables containing properties of the SCHEME interpreter which are parameters for the compiler (e.g. **ARGUMENT-REGISTERS**); switches, primarily for debugging purposes, used to control certain compiler operations (e.g.

  • FUDGE*); and own variables for certain functions, used to generate objects or gather statistics (e.g. *GENTEMPNUM* and *DEPROGNIFY-COUNT*).

The PROCLAIM forms are to RABBIT as DECLARE forms are to the MacLISP compiler. These provide declarations to the incarnation of RABBIT which is compiling the file. The subforms of a PROCLAIM form are executed by RABBIT when it encounters the form in a file being compiled. (We will see later how this is done.)

[Page 131]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 062 063 064 065 ;;; RABBIT coMP1Len -*LISP*- gggglx seo__g§;3q41g__g3gg_; (DECLARE (DECLARE (DECLARE (DECLARE (DECLARE (rAsLoAu (ouux) SCHMAC)) (MACROS 1) (uevlo T)) (ALLOC '(LIST (300000 <n£ruN olsPLAc£ (x Y) (SPECIAL (DECLARE (SPECIAL EMPTY TRIVFN ADJOIN UNION asoooo .z) rlxuun soooo SYMBOL 24ooo))) Y)) GENTEMP GENFLUSH GEN-GLOBAL-NAME PRINT-WARNING ADDPROP DELPROP SETPROP INTERSECT REMOVE SETDIFF PAIRLIS COMPILE PASS1-ANALYZE TEST-COMPILE NODIFY ALPHATIZE ALPHA-ATOM ALPHA-LAMBDA ALPHA-IF ALPHA-ASET ALPHA-CATCH ALPHA-LABELS ALPHA-LABELS-DEFN ALPHA-BLOCK MACRO-EXPAND ALPHA-COMBINATION ENV-ANALYZE TRIV-ANALYZE TRIV-ANALYZE-FN-P EFFS-ANALYZE EFFS-UNION EFFS-ANALYZE-IF EFFS-ANALYZE-COMBINATION CHECK-COMBINATION-PEFFS ERASE-NODES META-EVALUATE META-IF-FUDGE META-COMBINATION-TRIVFN META-COMBINATION-LAMBDA SUBST-CANDIDATE REANALYZE1 EFFS-INTERSECT EFFECTLESS EFFECTLESS-EXCEPT-CONS PASSABLE META-SUBSTITUTE COPY-CODE COPY-NODES CNODIFY CONVERT MAKE-RETURN CONVERT-LAMBDA-FM CONVERT-IF CONVERT-ASET CONVERT-CATCH CONVERT-LABELS CONVERT-COMBINATION CENV-ANALYZE CENV-TRIV-ANALYZE CENV-CCOMBINATION-ANALYZE BIND-ANALYZE REFD-VARS BIND-ANALYZE-CLAMBDA BIND-ANALYZE-CONTINUATION BIND-ANALYZE-CIF BIND-ANALYZE-CASET BIND-ANALYZE-CLABELS BIND-ANALYZE-RETURN BIND-ANALYZE-CCOMBINATION BIND-CCOMBINATION-ANALYZE DEPTH-ANALYZE FILTER-CLOSEREFS CLOSE-ANALYZE COMPILATE DEPROGNIFY1 TEMPLOC ENVCARCDR REGSLIST SET-UP-ASETVARS COMP-BODY PRODUCE-IF PRODUCE-ASET PRODUCE-LABELS PRODUCE-LAMBDA-COMBINATION PRODUCE-TRIVFN-COMBINATION PRODUCE-TRIVFN-COMBINATION-CONTINUATION PRODUCE-TRIVFN-COMBINATION-CVARIABLE PRODUCE-COMBINATION PRODUCE-COMBINATION-VARIABLE ADJUST-KNOVNFN-CENV PRODUCE-CONTINUATION-RETURN PRODUCE-RETURN PRODUCE-RETURN-1 LAMBDACATE PSETOIFY PSETOIFY-METHOD-2 PSETOIFY-METHOD-3 PSETO-ARGS PSETO-ARGS-ENV PSETO-TEMPS MAPANALYZE ANALYZE ANALYZE-CLAMBDA ANALYZE-CONTINUATION ANALYZE-CIF ANALYZE-CLABELS ANALYZE-CCOMBINATION ANALYZE-RETURN LOOKUPICATE CONS-CLOSEREFS OUTPUT-ASET CONDICATE DECARCDRATE TRIVIALIZE TRIV-LAMBDACATE COMPILATE-ONE-FUNCTION COMPILATE-LOOP USED-TEMPLOCS REMARK-ON MAP-USER-NAMES COMFILE TRANSDUCE PROCESS-FORM PROCESS-DEFINE-FORM PROCESS-DEFINITION CLEANUP SEXPRFY CSEXPRFY CHECK-NUMBER~0F-ARGS DUMPIT STATS RESET-STATS INIT»RABBIT)) *EMPTY* *GENTEMPNUM* *GENTEMPLIST* *GLOBAL-GEN-PREFIX* *ERROR-COUNT* *ERROR-LIST* *TEST* *TESTING* *OPTIMIZE* *REANALYZE* *SUBSTITUTE* *FUDGE* *NEW-FUDGE* *SINGLE-SUBST* *LAMBDA-SUBST* *FLUSH-ARGS* *STAT-VARS* *DEAD-COUNT* *FUDGE-COUNT* *FOLD-COUNT* *FLUSH-COUNT* *CONVERT-COUNT* *SUBST-COUNT* *DEPROGNIFY-COUNT* *LAMBDA-BODY-SUBST* *LAMBDA-BODY-SUBST-TRY-COUNT* *LAMBDA-BODY-SUBST-SUCCESS-COUNT!

  • CHECK-PEFFS* **CONT+ARG-REGS** **ENV+CONT+ARG-REGS** *iARGUMENT-REGISTERS** **NUMBER-OF-ARG-REGS** *BUFFER-RANDOM-FORMS* ODISPLACE-SN*)) (PROCLAIM (*EXPR PRINT-SHORT) (SET' *BUFFER-RANDOM-FORMS* NIL) (ALLOC '(LIST (240000 340000 1000) FIXNUM (30000 40000 1000) SYMBOL (14000 24000 NIL) HUNK4 (20000 53000 NIL) HUNK8 (20000 50000 NIL) HUNKIS (20000 60000 NIL)))) (SET' *STAT-VARS* '(*DEAD-COUNT* *FUDGE-COUNT* *FOLD-COUNT* *FLUSH-COUNT* *CONVERT-COUNT* *SUBST-COUNT* *DEPROGNIFY-COUNT* *LAMBDA-BODY-SUBST-TRY-COUNT* *LAMBDA-BODY-SUBST-SUCCESS-COUNT*)) (ALLOC '(LIST (240000 340000 1000) FIXNUM (30000 40000 1000) SYMBOL (14000 24000 NIL) HUNK4 (20000 50000 NIL) HUNK8 (20000 50000 NIL) HUNKI6 (20000 70000 NIL))) (APPLY 'GCTVA '(T)) SGC USELESS ATOMS (CAN'T SAY (EVAL' (GCTVA T)) BECAUSE OF NCOMPLR) (REPLACE) ;UNDO ANY DISPLACED HACROS (SET' *DISPLACE-SV* NIL) ;DON'T LET HACROS SELF-OISPLACE (GRINDEF) SLOAD THE GRINDER (PRETTY-PRINTER) (DECLARE (/PDEFINE DEFINE ISCHEHE FUNCTION1)) ;DECLARATIONS FOR LISTING PROGRAM (DECLARE (/UDEFINE DEFMAC IMACLISP MACRO1)) (n£cLAne (DECLARE (/BDEFINE SCHMAC IPDP-10 SCHEME MACRO1)) (/PDEFINE MACRO ISCHEME MACRO1))

[Page 132]

122 The variable *EMPTY* is initialized to a unique object (a list cell whose car is *EMPTY* this is so that no other object can be EQ to it, but it can be easily recognized when printed) which is used to initialize components of structures. (we will see later how such structures are defined.) We do not use, say, NIL to represent an empty component because NIL might be a meaningful value for that component. The predicate EMPTY is true of the unique object.

TRIVFN is a predicate which is true of 'trivial' functions. A function is trivial if it is a MacLISP primitive (an EXPR, SUBR, or LSUBR), or has been declared to be primitive via a *EXPR or *LEXPR proclamation.

' (INCREMENT FOO) expands into the code (ASET' FOO (+ FOO l)).

CATENATE is a utility macro which may be thought of as a function. Given any number of S-expressions it produces an atomic symbol whose print name is the concatenation of the print names of the S-expressions. Usually the S-expressions will be atomic symbols or numbers.

(CATENATE 'FOO ' 43) => FOO-43 GENTEMP is used to generate a new unique symbol, given a specified prefix. The global variable *GENTEMPNUM* starts at zero and increases monotonicially. Each call to GENTEMP catenates the prefix, a hyphen, and a new value of *GENTEMPNUML Because the numeric suffixes of the generated symbols increase with time, one can determine in which order symbols were generated. We also will use different prefixes for different purposes, so that one can tell which part of the compiler generated a given symbol. This information can be invaluable for debugging purposes; from the names of the symbols appearing in a data structure, one can determine how that structure was created and in what order. (The generated symbols are themselves used primarily as simple markers, or as simple structures (property lists). The use of the print names amounts to tagging each marker or structure with a type and a creation timestamp. A LISP-like language encourages the inclusion of such information.) (GENTEMP 'NODE) => NODE-2534 A list of all generated symbols is maintained in *GENTEMPLISTL GENFLUSH can be called to excise all generated symbols from the MacLISP obarray; this is periodically necessary when compiling a large file so that unneeded symbols may be garbage-collected. The symbols are initially interned on the obarray in the first place for ease of debugging (one can refer to them by name from a debugging breakpoint). GEN-GLOBAL-NAME is used to generate a symbol to be used as a run-time name by the compiled code. The prefix for such names is initially "7" for testing purposes, but is initialized by the file transducer as a function of the name of the file being compiled. This allows separately compiled files to be loaded together without fear of naming conflicts.

[Page 133]

001 002 003 004 005 006 007 008 009 010 011 012 013 D14 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 BADBIT sqg 05/15/78__§§gg"Z (COND ((NOT (BOUNDP '*EMPTYi)) _ (SET' *EMPTY* (LIST '*EMPTY¢)))) (DEFINE EMPTY (LAMBDA (x) (eo x *EMPTY¢))) (DEFINE TRIVFN (LAMBDA (SYM) (GETL SYM '(EXPR SUBR LSUBR ¢EXPR ¢LEXPR)))) (DEFMAC INCREMENT (X) "(ASET' ,X (+ ,X 1))) (DEFMAC CATENATE ARGS "(IMPLODE (APPEND @(MAPCAR '(LAMBDA (X) (COND ((OR (ATOM X) (NOT (EO (CAR X) 'QUOTE))) '(EXPLODEN ,X)) (T '(OUOTE ,(EXPLODEN (CADR X)))))) ARGS)))) (COND ((NOT (BOUNDP '*GENTEMPNUM¢)) (SET' *GENTEMPNUM* 0))) (COND ((NOT (BOUNDP '*GENTEMPLI$T*)) (SET' *GENTEMPLIST* NIL))) (DEFINE semenr '

(LAMBDA (x) (BLOCK (xucnsneur ~s£N1eMPuuM») (LET ((svM (cArenAre x '1-l *GENTEMPNUM*))) user' ¢GENTEMPLIST* (cons sm ¢GENT£MPLlSTi)) SYM)))) (DEFINE GENFLUSH (LAMBDA () (BLOCK (AMAPC REMOB *GENTEMPLIST*) (ASET° *GENTEMPLIST* NIL)))) (DEFINE GEN-GLOBAL-NAME (LAMBDA () (GENTEMP *GLOBAL-GEN-PREFIX*))) (SETF *GLOBAL-GEN-PREFIX* '1?|)

[Page 134]

124 WARN is a macro used to print a notice concerning an incorrect program being compiled. It generates a call to PRINT-WARNING, which maintains a count and a list of the error messages, and prints the message, along with any associated useful quantities.

(WARN IFOO is greater than BARI FOO BAR) would print (assuming the values of FOO and BAR were 43 and 15) ;Warning: FOO is greater than BAR ; 43 ; 15 WARN is used only to report errors in the program being compiled. The MacLISP ERROR function is used to signal internal inconsistencies in the compiler.

ASK is a macro which prints a message and then waits for a reply.

Typically NIL means "no", and anything else means "yes".

SX and CSX are debugging aids which print intermediate data structures internal to the compiler in a readable form. They make use of SPRINTER (part of the MacLISP GRIND pretty-printing package) and of SEXPRFY and CSEXPRFY, which are defined below. The EQCASE macro provides a simple dispatching control structure. The first form evaluates to an item, and the clause whose keyword matches the item is executed. If no clause matches, an error occurs. For example:

(EQCASE TRAFFIC-LIGHT (RED (PRINT 'STOP)) (GREEN (PRINT 'G0)) (YELLOW (PRINT 'ACCELERATE) (CRASH))) expands into the code:

(COND ((EQ TRAFFIC-LIGHT 'RED) (PRINT 'STOP)) ((EQ TRAFFIC-LIGHT 'GREEN) (PRINT 'GO)) ((EQ TRAFFIC-LIGHT 'YELLOW) (PRINT 'ACCELERATE) (CRASH)) (T (ERROR '|L0sing EQCASEI TRAFFIC-LIGHT 'FAIL-ACT)))

[Page 135]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 (UEFMAC (DEFINE (DEFUN (SCHMAC (DEFMAC (DEFMAC (UEFMAC F5§9lI_§§§__Q§!l§LZEE,F!9=,§ WARN (MSG _ STUFF) '

"(PRINT-WARNING ',MSG (LIST !STUFF))) PRINT-VARNING (LAMBDA (MSG STUFF) (BLOCK (INCREHENT *ERROR-COUNTI) (ASET' *ERROR-LIST* (CONS (CONS H56 STUFF) *ERROR-LlST¢)) (TYO 7 (SYHEVAL 'TYO)) ;BELL (TERPRI (sYMsvAL ~1Yo)) (PRINC '1;Vurning: I (SYMEVAL 'TYO)) (TYO 7 (SYHEVAL 'TYO)) ;BELL (PRINC MSG (SYNEVAL 'TYO)) (AHAPC PRINT-SHORT STUFF)))) PRINT-SHORT (X) ((LAMBDA (PRINLEVEL PRINLENGTN TERPRI) (TERPRI (SYHEVAL 'TYO)) (PRINC '1; I (SYMEVAL 'TYO)) (PRINI X (SYMEVAL 'TYO))) 3 8 T)) ASK (MSG) "(BLOCK (TERPRI) (PRINC ',MSG) (TYO 40) (READ))) SX (X) "(SPRlNT£R (SEXPRFY ,X NlL))) CSX (X) "(SPRINTER (CSEXPRFY ,X))) GDEBUGGING AID ZDEBUGGING AID EQCASE (OBJ _ CASES) ~(cono e(nAPcAR '(LAnaoA (CASE) (OR (ATOM (CAR CASE)) (ERROR '1Losing EOCASE clnuse1)) '((E0 ,OBJ ',(CAR CAS£)) !(COR CASE))) CASES) (T (ERROR '1L0§ing EOCASEI ,OBJ 'FAIL-ACT))))

[Page 136]

126 The next group of macros implement typed data structures with named components. ACCESSFN, CLOBBER, and HUNKFN allow definition of very general structure access functions. Their precise operation is not directly relevant to this exposition; suffice it to say that they are subsidiary to the DEFTYPE macro on the next page.

DEFTYPE defines structure "data types' with named components. These structures are implemented as MacLISP hunks. (A hunk is essentially a kind of list cell with more than two pointer components; it may be thought. of as a short, fixed-length vector. Hunks are accessed with the function (CXR n hunk), which returns the nth component of the hunk. (RPLACX n hunk newval) analogously alters the nth component. CXR and RPLACX are thus similar to CAR/CDR and RPLACA/RPLACD.) Slot 0 of each hunk is reserved for a "property list"; this feature is not used in RABBIT. Slot 1 always contains an atomic symbol which is the name of the type. Thus every structure explicitly bears its type. The form (HUNKFN TYPE 1) creates a function (actually a macro) called TYPE which when applied to a hunk will fetch slot 1. Slots 2 upward of a hunk are used to contain named components. A structure does not contain the component names. (However, the symbol which is the name of the type does have a list of the component names on its property list. This is useful for debugging purposes. There is, for example, a package which pretty-prints structured data types, showing the components explicitly as name-value pairs, which uses this information.)

[Page 137]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 (DECLARE (/EDEFINE ACCESSFN [ACCESS MACRO1)) (DEFMAC ACCESSFN (NAME UVARS FETCH _ PUT) ((LAMBDA (VARS CNAME) (DO ((A VARS (CDR A)) (B 'fzf "(CUR ,8)) (C NIL (CONS "(CAR ,B) C))) ((NULL A) "(PROGN 'COMPILE (DEFMAC .NAME *Z* ((LAMBDA ,(NREV£RS£ (CDR (REVERS ,FETCN) !(REV£RSE (CDR C)))) (DEFMAC ,CNAME *Z* ((LAMBDA .VARS ,(COND (PUT (CAR PUT)) B5§9lT_@_25!lM5 P€9€l E VARS))) (T "(CLOBBER ,,FETCH ,THE-NEW-VALUE)))) !(REV£RS£ C))))))) (COND (PUT UVARS) (T (APPEND UVARS '(THE-NEW-VALUE)))) (CATENATE 'ICLOBBER-I NAME))) (DEFHAC CLOBBER (X Y) "(,(CAT£NATE 'ICLOBBER-I (CAR X)) !(COR X) .Y)) (DECLARE (/PDEFINE HUNKFN IHUNK ACCESS MACRO1)) (DEFHAC HUNKFN (NAME SLOT) "(ACCESSFN ,NAME (THE-HUNK NEW-VALUE) "(CXR ,,SLOT .THE-HUNK) "(RPLACX ,,SLOT ,THE-HUNK ,NEW-VALUE)))

[Page 138]

128 Consider for example the form (DEFTYPE LAMBDA (UVARS VARS BODY)) This defines a structured data type called LAMBDA with three named components UVARS, VARS, and BODY. It also defines a series of macros for manipulating this data type.

For access, the macros LAMBDA\UVARS, LAMBDA\VARS, and LAMBDA\BODY are defined. These each take a single argument, a data structure of type VARIABLE, and return the appropriate component. (The TYPE function can also be applied to the data object, and will return LAMBDA.) For construction, a macro CONS-LAMBDA is defined. For example, the form:

(CONS-LAMBDA (UVARS = LIST1) (VARS = L-ISTZ)) would construct a LAMBDA structure with the TYPE, UVARS, VARS, and BODY slots initialized respectively to LAMBDA, the value of LISTI, the value of LISTZ, and the "empty object" (recall the EMPTY predicate above). Any component names (possibly none!) may be initialized in a CONS-xxx form, and any components not mentioned will be initialized to the empty object. (The "=" signs are purely syntactic sugar for mnemonic value. They can be omitted.) For alteration of components, a macro ALTER-LAMBDA is defined. For example, the form (ALTER-LAMBDA FOO (UVARS := LIST1) (BODY := (LIST A B))) would alter the UVARS and BODY components of the value of FOO (which should be a LAMBDA structure - this is not checked) to be respectively the values of LISTI and (LIST A B). Any nonzero number of components may be modified by a single ALTER-xxx form. (The ":=" signs are purely syntactic sugar also.) A great advantage of using these structure definitions is that it is very easy to add or delete components during the development of the program. In particular, when a new component is added to a type, it is not necessary to find all instances of creations of objects of that type; they will simply automatically initialize the new slot to the empty object. Only parts of the program which are relevant to the use of the new component need be changed.

[Page 139]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 O19 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 005 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 062 B'\B!|_§C¢H! Qwl _F39 €,_ 51 (DECLARE (/SDEFINE UEFTYPE IDATA TvP£1)) SLI SLOT 0 IS ALWAYS THE PROPERTY LIST, AND SLOT 1 THE HUNK TYPE.

(HUNKFN TYPE 1) (oerMAc u£rrYP£ (NAME sLoTs SUPP) "(PRoaN 'coMP1L£ (DEFMAC ,(CAT£NATE °Icons-l NAME) xvus (vnosu (no ((x nuns (CDR x))) ((NULL x)) (on ,(COND ((cnR SLOTS) "(H£Mo (CAAR x) °.sLoTs)) (1 ~(£o (CAAR x) °,(cAR sLovs)))) (ERROR ',(cAreuA1e '|Inv|Iid Keyword Argument to cons-l nnnz) (CAR K) 'FAIL-ACT))) '(HUNK °,~,nAn£ o(no ((s -,sLors (con s)) (X NIL (CONS ((LAMBDA (KVD) ' (COND (KVD (CAR (LAST KVD))) (1 'f£nPrv»))) (ASSO (CAR S) KVDS)) X))) ((NULL S) (NREVERSE X))) NlL))) (UEFMAC ,(CATENATE '1ALTER-l NAME) (OBJ _ KVDS) (PROGN (DO ((K KVDS (CDR K))) ((NULL K)) (OR ,(COND ((CDR SLOTS) "(MEMO (CAAR K) ',SLOTS)) (T "(E0 (CAAR K) ',(CAR SLOTS)))) (ERROR ',(CAT£NAT£ 'llnvalid Keyword Argument to ALTER-I NAME) (CAR K) '

'FAIL-Ac1))) (00 ((I (+ (LENGTH KHDS) 1) (- I l)) (VARS NIL (CONS (GENSYM) VAR$))) ((= I 0) '((LAMBDA ,VARS ,(BLOCKIFY (MAPCAR '(LAMBDA (K V) _ '(CLOBBER (,(CATENATE °,NAME ' N (CAR K)) (,(CAR vAns))) (.v))) KVDS (CDR VARS)))) (LAMBDA () ,O8J) !(MAPCAR '(LAMBDA (K) '(LAMBUA () ,(CAR (LAST K)))) KVDS)))))) !(DO ((S SLOTS (CDR S)) (N Z (+ N 1)) _ (X NIL (CONS "(HUNKFN ,(CATENATE NAME 'N (CAR S)) .NI X))) ((NULL S) (NREVERSE X))) (DEFPROP ,NAME ,SLOTS COMPONENT-NAMES) (DEFPROP ,NAME ,SUPP SUPPRESSED-COMPONENT-NAMES) '(TYPE ,NAME DEFINED)))

[Page 140]

130 On this page are two groups of utility functions. One group manipulates property lists, and the other manipulates sets of objects represented as lists.

For (ADDPROP SYM VAL PROP), the PROP property of the symbol SYM should be a list of things. The object VAL is added to this list if it is not already a member of the list.

DELPROP performs the inverse of ADDPROP; it removes an object from a list found as the property of a symbol.

(SETPROP SYM VAL PROP) puts the property-value pair PROP,VAL on the property list of SYM; but if SYM already has a PROP property, it is an error unless the new value is the same as (EQ to) the existing one. That is, a redundant SETPROP is permitted, but not a conflicting one.

(ADJOIN ITEM SET) produces a new set SET U (ITEM).

UNION produces the union of two sets.

INTERSECT produces the intersection of two sets.

(REMOVE ITEM SET) produces a new set SET - (ITEM).

(SETDIFF SETI SETZ) produces the set SETI - SETZ.

All of the set operations are accomplished non-destructively; that is, the given arguments are not modified. Examples:

(ADJOIN 'A '(A B C)) => (A B C) (ADJOIN 'A '(B C D)) => (A B C D) (UNION '(A B C) '(B D F)) => (D F A B C) (INTERSECT '(A B C) '(B D F)) =) (B) (REMOVE 'B '(A B C)) => (A C) (SETDIFF '(A B C) '(B D F)) => (A C)

[Page 141]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 062 063 064 RA§§!I"§§§__2§Ll§£Z§_LRQFEWR ;;; ADD TO A PROPERTY WHICH IS A LIST OF THINGS (DEFINE Anovnop (LAMBDA (SYM VAL PROP) (LET ((L (GET svn PROP))) (IF (NOT (MEMO VAL L)) (PUTPROP svn (cons VAL L) PaoP))))) 332 INVERSE OF ADDPROP (DEFINE DELPROP (LAMBDA (SYM VAL PROP) (PUTPROP SYM (DELO VAL (GET SYM PROP)) PROP))) 333 LIKE PUTPROP, BUT INSIST ON NOT CHANGING A VALUE ALREADY THERE (DEFINE SETPROP -(LAMBDA (SYM VAL PROP) (LET ((L (GETL SYM (LIST PROP)))) (IF (AND L (NOT (EO VAL (CADR L)))) (ERROR '1Attempt to redefine e unique property!

A (LIST 'SETPROP SYM VAL PROP) 'FAIL-ACT) (PUTPROP SYM VAL PROP))))) ;;; OPERATIONS ON SETS, REPRESENTED AS LISTS (DEFINE ADJOIN .

(LAMBDA (X S) (IF (MEMO X S) S (CONS X S)))) (DEFINE union (LAMBDA (x v) (oo ((z v (con z)) (v x (ADJOIN (CAR z) v))) ((NULL 2) V)))) (DEFINE INTERSECT (LAMBDA (X Y) (IF (NULL X) NIL (IF (MEMO (CAR X) Y) (CONS (CAR X) (INTERSECT (CDR X) Y)) (INTERSECT (CDR X) Y))))) (DEFINE nenove (LAMBDA (x s) (IF (NULL s) s (IF (eo x (CAR s)) (con s) ((LAMBDA (v) (IF (eo v (con s)) s (CONS (CAR s) v))) (REMOVE x (con s))))))) (uerxne sernzrr (LAMBDA (x v) (oo ((z x (con z)) (w NIL (IF (nano (CAR z) v) w (CONS (CAR z) w)))) ((NULL 2) V))))

[Page 142]

132 The PAIRLIS function is similar to, but not identical to, the function of the same name in the LISP 1.5 Manual. The difference is that the pairs of the association list produced are 2-lists rather than single conses. This was done purely so that structures produced by PAIRLIS would be more readable when printed; the ease of debugging was considered worth the additional CONS and access time.

(PAIRLIS '(A B C) '(X Y Z) '((F P) (G Q))) => ((C Z) (B Y) (A X) (F P) (G Q)) The COMPILE function is the main top-level function of the compiler. It is responsible for invoking each phase of the compiler in order. NAME is the name of a function (an atomic symbol), and LAMBDA-EXP the corresponding lambda-expression; these are easily extracted, for example, from a SCHEME DEFINE-form.

SEE-CRUD is NIL for normal processing, or T for debugging purposes. OPTIMIZE is a switch controlling whether the optimization phase should be invoked; it can be T, NIL, or MAYBE (meaning to ask the (human) debugger).

The overall flow within COMPILE is as follows: check number of arguments; apply ALPHATIZE to the lambda-expression to produce the pass 1 data structure; optionally optimize this data structure; perform pass 1 analysis; convert the pass 1 data structure to a pass Z (continuation-passing style) data structure; perform pass 2 analysis; generate code. The value of COMPILE is the MacLISP code produced by the code generator.

PASS1-ANALYZE is a separate function so that it can be used by the optimizer to reanalyze newly created subexpressions.

CL is a debugging utility. (CL FOO) causes the function FOO (which should be defined in the running SCHEME into which the compiler has been loaded) to be compiled. Various debugging facilities, such as SEE-CRUD, are enabled.

This is done by using TEST-COMPILE.

[Page 143]

001 002 O03 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 O33 034 035 036 037 038 039 040 041 042 043 044 O45 046 047 048 049 O50 051 052 053 054 055 056 057 058 059 060 061 (DEFINE PAIRLIS (LAMBDA (L1 LZ L) (oo ((v L1 (con (u L2 (CDR V)) U)) RFB?II_§§§_»Q§£l§L?Q_"?°99LZ (E L (CONS (LIST (CAR V) (CAR U)) E))) ((NULL V) E (DEFINE COMPILE )))) (LAMBDA (NAME LAMBDA-EXP SEE-CRUD OPTIMIZE) (BLOCK (CHECK-NUMBER-OF-ARGS NAME (DEFINE PASS]ANALYZE (LAMBDA (NODE Reno opt) (LENGTH (CADR LAMBDA-EXP)) T) (LET ((ALPHA-VERSION (ALPHATIZE LAMBDA-EXP NIL))) (IF (AND SEE-CRUD (ASK 1See alpha-conversion71)) (SX ALPHA-VERSION)) (LET ((OPT (IF (EO OPTIMIZE 'MAYBE) (ASK |0pt1mize?|) OPTIMIZE))) (LET ((META-VERSION (IF opt '

(META-EVALUATE ALPHA-VERSION) (PASS1ANALYZE ALPHA-VERSION NIL NIL)))) (OR (AND (NULL (NODE\REFS META~VERSION)) (NULL (NODE\ASETS META-VERSION))) (ERROR 'IENVANALYZE lost - COMPILEI NAME 'FAIL-ACT)) ND SEE-CRUD OPT (ASK lSee meta-evaluation?|)) (SX META-VERSION)) (IF (A (LET ( (

(CPS-VERSION (CONVERT META-VERSION NIL (NOT (NULL OPT))))) IF (AND SEE-CRUD (ASK 1See CPS-conversion?l)) (CSX CPS-VERSION)) (CENVANALYZE CPS-VERSION NIL NIL) (BINDANALYZE CPS-VERSION NIL NIL) (DEPTHANALYZE CPS-VERSION 0) (CLOSEANALYZE CPS-VERSION NIL) (COMPILATE-ONE-FUNCTION CPS-VERSION NAME)))))))) (BLOCK (ENVANALYZE NODE REDO) (TRIVANALYZE NODE REDO) (IF OPT (EFFSANALYZE NODE REDO)) NO0E))) (SCHMAC CL (FNNAME) "(TEST-COMPILE ',FNNAME)) (DEFINE TEST-COMPILE (LAMBDA (FNNAME) (COND (ru (T "

(ASET (ASET'

(ASET'

(Asst(Aser (LET ((FN (GET FNNAME 'SCHEME!FUNCTION))) ' *TESTING* T) *TEST* NIL) ;PURELY TO RELEASE FORMER GARBAGE *ERROR-COUNT* 0) *ERROR-LIST* NIL) *TEST* (coMP1Le FNNAME ru 1 fMAYBE)) (SPRINTER *TEST*) "(,(IF (ZEROP *ERROR-COUNT*) 'NO *ERROR-COUNT*) ERRORS)) (,FNNAME NOT DEFINED))))))

[Page 144]

134 Here are the structured data types used for the pass 1 intermediate representation. Each piece of the program is represented as a NODE, which has various pieces of information associated with it. The FORM component is a structure of one of the types CONSTANT, VARIABLE, LAMBDA, IF, ASET, CATCH, LABELS, or COMBINATION. This structure holds information specific to a given type of program node, whereas the NODE structure itself holds information which is needed at every node of the program structure. (One may think of the FORM component as a PASCAL record variant.) The ALPHATIZE routine and its friends take the S-expression definition of a function (a lambda-expression) and make a copy of it using NODE structures.

This copy, like the S-expression, is a tree. Subsequent analysis routines will all recur on this tree, passing information up and down the tree, either distributing information from parent node to child nodes, or collating information from child nodes to pass back to parent nodes. Some information must move laterally within the tree, from branch to branch; this is accomplished exclusively by using the property lists of symbols, usually those generated for renamings of variables (since all lateral information is associated with variable references - which is no accident!).

The function NODIFY is used for constructing a node, with certain slots properly initialized. In particular, the METAP slot is initialized to NIL, indicating a node not yet processed by META-EVALUATE; this fact will be used later in the optimizer. A name is generated for the node, and the node is put on the property list of the name. This property is for debugging purposes only; given the name of a node one can get the node easily. The name itself may also be used for another purpose by CONVERT-COMBINATION, to represent the intermediate quantity which is the value of the form represented by the node.

[Page 145]

001 002 003 004 005 D06 007 DOB 009 D10 011 012 013 014 015 016 017 018 019 020 021 O22 023 024 025 026 027 025 029 030 031 032 O33 034 035 036 037 038 039 040 041 042 043 044 D45 045 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 062 063 064 D65 066 067 RABBIT §§5 _,__ 95(l5[?3, E°9%_¢ ;;; ALPHA-CONVERSION SL; HERE WE RENAME ALL VARIABLES, AND CONVERT THE EXPRESSION TO AN ;;; WITH EXTRA SLOTS TO BE FILLED IN LATER. AFTER THIS POINT, THE :Li VARIABLES, AND THE USER NAMES ARE USED ONLY FOR ERROR MESSAGES ;;; FORM WILL BE USED AND AUGMENTED UNTIL IT IS CONVERTED TO CONTINUATION-PASSING STYLE.

EQUIVALENT TREELIKE FORM NEW NAMES ARE USED FOR AND THE LIKE. THE TREELIKE ;;; WE ALSO FIND ALL USER-NAMED LAMBDA-FORMS AND SET UP APPROPRIATE PROPERTIES.

52; THE USER CAN NAME A LAMBDA-FORM BY WRITING (LAMBDA (X) BODY NAME).

(DEFTYPE NODE (NAME SEXPR ENV REFS ASETS TRIVP EFFS AFFD PEFFS PAFFD METAP SUBSTP FORM) (SEXPR)) (UEFTYPE (DEFTYPE (DEFTYPE (DEFTYPEI (DEFTYPEI (DEFTYPE (DEFTYPEI (DEFTYPE NAME; SEXPR:

ENV:

REPS:

ASETSZ TRIVP:

EFFS:

AFFD:

PEFFS:

PAFFD:

METAP:

SUBSTP:

FORM:

A GENSYM WHICH NAMES THE NODE'S VALUE THE S-EXPRESSION WHICH WAS ALPHATIZED TO MAKE THIS NODE (USED ONLY FOR WARNING MESSAGES AND DEBUGGING) THE ALL ALL NON-NIL IFF EVALUATION OF THIS NODE IS TRIVIAL OF SIDE EFFECTS POSSIBLY OCCURRING AT THIS OF SIDE EFFECTS WHICH CAN POSSIBLY AFFECT THIS NODE OR BELOW ENVIRONMENT OF THE NODE (USED ONLY FOR DEBUGGING) VARIABLES BOUND ABOVE AND REFERENCED BELOW LOCAL VARIABLES SEEN IN AN ASET BELOW THIS OR BY THE NODE NODE (A SUBSET OF REFS) SET SET ABSOLUTELY PROVABLE SET OF EFFS ABSOLUTELY PROVABLE SET OF AFFD NON-NIL IFF THIS NODE HAS BEEN EXAMINED BY THE META~EVALUATOR FLAG INDICATING WHETHER META-SUBSTITUTE ACTUALLY MADE A SUBSTITUTION ONE OF THE BELOW TYPES NODE OR BELOW CONSTANT (VALUE)) VALUE:

THE S-EXPRESSION VALUE OF THE CONSTANT .VARIABLE (VAR GLOBALP)) ;VAR:

THE NEW UNIOUE NAME FOR THE VARIABLE, GENERATED BY ALPHATIZE.

THE USER NAME AND OTHER INFORMATION IS ON ITS PROPERTY LIST.

IGLOBALP: NIL UNLESS THE VARIABLE IS GLOBAL (IN WHICH CASE VAR IS THE ACTUAL NAME) LAMBDA UVARS:

VARS:

BODY:

(UVARS VARS BODY)) THE USER NAMES FOR THE BOUND VARIABLES (STRICTLY FOR DEBUGGING (SEE SEXPRFY)) A LIST OF THE GENERATED UNIQUE NAMES FOR THE BOUND VARIABLES THE NODE FOR THE BODY OF THE LAMBDA-EXPRESSION IF (PRED CON ALT)) PRED:

CON:

ALT:

THE NODE FOR THE PREDICATE THE NODE FOR THE CONSEOUENT THE NODE FOR THE ALTERNATIVE ASET (VAR BODY GLOBALP)) VAR:

BODY:

THE GENERATED UNIOUE NAME FOR THE ASET VARIABLE THE NODE FOR THE BODY OF THE ASET .GLOBALP: NIL UNLESS THE VARIABLE IS GLOBAL (IN WHICH CASE VAR IS THE ACTUAL NAME) CATCH (uvAR VAR BODY)) UVAR:

VAR:

BODY:

LABELS THE USER NAME FOR THE BOUND VARIABLE (STRICTLY FOR DEBUGGING (SEE SEXPRFY)) THE GENERATED UNIQUE NAME FOR THE BOUND VARIABLE THE NODE FOR THE BODY OF THE CATCH (UFNVARS FNVARS FNDEFS BODY)) UFNVARS: THE USER NAMES FOR THE BOUND LABELS VARIABLES FNVARS: A LIST OF THE GENERATED UNIQUE NAMES FOR THE LABELS VARIABLES FNDEFS: A LIST OF THE NODES FOR THE LAMBDA-EXPRESSIONS BODY:

THE NODE FOR THE BOY OF THE LABELS ,COMBINATION (ARGS WARNP)) ARGS:

WARNP

(DEFINE NODIFY (LAMBDA A LIST OF THE NODES FOR THE ARGUMENTS (THE FIRST IS THE FUNCTION) NON-NIL IFF CHECK-COMBINATION-PEFFS HAS DETECTED A CONFLICT IN THIS COMBINATION (FORM SEXPR ENV) (LET ((N (CONS-NODE (NAME = (GENTEMP 'NODE)) (FORM = FORM) (SEXPR = SEXPR) (ENV = ENV) (METAP = NIL)))) (PUTPROP.(NODE\NAME N) N 'NODE) NH)

[Page 146]

136 ALPHATIZE takes an S-expression to convert, and an environment. The latter is a list of 2-lists; each 2-list is of the form (user-name new-name).

This is used for renaming each variable to a unique name. The unique names are generated within ALPHA-LAMBDA, ALPHA-LABELS, and ALPHA-CATCH, where the variable bindings are encountered. The new name pairings are tacked onto the front of the then-current environment, and the result used as the environment for converting the body.

ALPHATIZE merely does a dispatch on the type of form, to one of the sub-functions for the various types. It also detects forms which are really macro calls, and expands them by calling MACRO-EXPAND, which returns the form to be used in place of the macro call. (BLOCK is handled as a separate special case.

In the interpreter, BLOCK is handled specially rather than going through the general MACRO mechanism. This is done purely for speed. Defining BLOCK as a macro in the compiler can confuse the interpreter in which the compiler runs, and so it was decided simply to handle BLOCK as a special case in the compiler also.) ALPHATIZE allows the S-expression to contain already converted code in the form of NODEs; this fact is exploited by the optimizer (see META-IF-FUDGE below), but has no use in the initial conversion.

ALPHA-ATOM creates a CONSTANT structure for numbers and the special symbols NIL and T. Otherwise a VARIABLE structure is created. If the symbol (it better be a symbol!) occurs in the environment, the new-name is used, and otherwise the symbol itself. The slot GLOBALP is set to T iff the symbol was not in the environment.

ALPHA-LAMBDA generates new names for all the bound variables. It then converts its body, after using PAIRLIS to add the user-name/new-name pairs to the environment. The result is used to make a LAMBDA structure. A copy is made of the list of variables in the UVARS slot; it must be copied because later META-COMBINATION-LAMBDA may splice out elements of that list. If so, it will also splice out corresponding members of VARS, but that list was freshly consed by ALPHA-LAMBDA. '

[Page 147]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 D52 053 054 055 056 057 9595II_§§§EE9§l}§/?5__B°9= 9 iii ON NODE NAMES THESE PROPERTIES ARE CREATED:

NODE THE CORRESPONDING (DEFINE ALPHATIZE (LAMBDA (sexvn ENV) (COND ((ATOM ssxpn) NODE (ALPHA-ATOM SEXPR ENV)) ((HUNKP sexvn) (IF (eo (TYPE sexvn) 'Non£) SEXPR (ERROR '|P¢¢u11»r hunk - ALPHATIZEI ssxrn 'FAIL-Acr))) ((eo (CAR SEXPR) 'OUOT£) (NODIFY (cons-coNsrAur (VALUE = (CADR SEXPR))) sexpn £Nv)) ((£o (CAR SEXPR) 'LAMBOA) (ALPHA-LAMBDA sexpn euv)) ((Eo (CAR sexvn) 'xr) (ALPHA-IF sexvn £Nv)) ((£o (CAR SEXPR) ~Assr) (ALPHA-ASET sexvn eNv)) ((£o (CAR SEXPR) 'cA1cH) (ALPHA-CATCH sexrn zuv)) ((£o (CAR sexrn) 'LABELS) (ALPHA-LABELS szxvn £nv)) ((£o (CAR ssxvn) 'BLOCK) (ALPHA-sLocx sexvn £nv)) ((Auu (ATOM (CAR s£xPn)) (so (GET (CAR sexwn) 'A1Nr) 'AuAcno)) (ALPHATIZE (MAcno-£xPANn ssxrn) £NV)) (r (ALPHA-COMBINATION ssxvn ENV))))) (DEFINE ALPHA-ATOM (LAMBDA (SEXPR ENV) (IF (on (NUMBERP ssxvn) (NULL ssxpn) (eo sexvn 'r)).

(uoolrv (CONS-CONSTANT (VALUE = s£xPn)) sexvn suv) (LET ((SLOT (Asso sexvn eNv))) (nonlrv (CONS-VARIABLE (VAR = (IF sLo1 (cAnn sLo1) s£xPn)) sexrn £NV))))) (DEFINE ALPHA-LAMBDA (LAMBDA (SEXPR ENV) (GLOBALP = (NULL SLOT))) (LET ((VARS (DO ((I (LENGTH (CADR SEXPR)) (- I l)) (V NIL (CONS (GENTEHP 'VAR) V))) ((= I 0) (NREVERSE V))))) (IF (CDDDR SEXPR) (WARN 1Ma\formed LAMBDA expressionl SEXPR)) (NODIFY (CONS-LAMBDA (UVARS = (APPEND (CADR SEXPR) NIL)) sexvn ENV)))) ;;SEE META-COMBINATION-LAMBDA (VARS = VARS) (BODY = (ALPHATIZE (CADDR SEXPR) (PAIRLIS (CADR SEXPR) VARS £NV))))

[Page 148]

138 ALPHA-IF simply converts the predicate, consequent, and alternative, and makes an IF structure.

ALPHA-ASET checks for a non-quoted first argument. (Presently RABBIT does not allow for computed ASET variables. Since RABBIT was written, such computed variables have in fact been banned from the SCHEME language [Revised Report].) For simplicity, it also does not allow altering a global variable which is the name of a MacLISP primitive. This restriction is related only to the kludginess of the PDP-10 MacLISP SCHEME implementation, and is not an essential problem with the language. The ERROR function was used here rather than WARN because the problems are hard to correct for and occur infrequently.

Aside from these difficulties, ALPHA-ASET is much like ALPHA-ATOM on a variable; it looks in the environment, converts the body, and then constructs an ASET structure.

ALPHA-CATCH generates a new name "CATCHVAR-nn" for the bound variable. tacks it onto the environment, and converts the body; it then constructs a CATCH structure. it then all the way all is then ALPHA-LABELS generates new names "FNVAR-n" for all the bound variables; constructs in LENV the new environment, using PAIRLIS. It then converts bound function definitions and the body, using this environment. In this the function names are apparent to all the functions. A LABELS structure created.

[Page 149]

001 002 003 004 005 006 007 008 009 010 U11 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 U45 046 047 048 049 050 051 052 053 054 055 056 057 058 059 (DEFINE ALPHA-IF (LAMBDA (SEXPR ENV) (NODIFY (CONS-IF (PRED = (ALPHAT (CON = (ALPHATI (ALT = (ALPHATI SEXPR ENV))) (DEFINE ALPHA-ASET (LAMBDA (SEXPR ENV) (LET ((VAR (COND ((OR (ATOM (CAD 9455IT,§§§_n9§£l§lZ§vE?P9= 10 IZE (CADR SEXPR) £uv)) ze (CADDR saxvn) env)) ze (cAounn SEXPR) £nv))) R SEXPR)) (NOT (EO (CAADR SEXPR) '0UOTE))) (ERROR '|Can'l SEXPR Compile Non-quoted ASET Varjab1e| 'FAIL-ACT)) (1 (cAnAnn sexv (LET ((SLOT (Asso VAR env)) (IF (AND (NULL sLor) ( (ERROR '|x11=g»1 1 ssxrn -FAIL-Acr)) T))))) TRIVFN VAR)) o ASET a MacLISP primitivel (NODIFY (cons-Asst (VAR = (IF SLOT (CADR SLOT) vAR)) (GLOBALP = (NULL SLOT)) (BODY = (ALPHATIZE (cAnnn SEXPR) £uv))) SEXPR ENV))))) (DEFINE ALPHA-CATCH (LAMBDA (SEXPR ENV) (LET ((VAR (GENTEMP 'CATCNVAR))) (NODIFY (CONS-CATCH (VAR = (UVAR = (BODY = sexvn - ENV)))) (DEFINE ALPHA-LABELS (LAMBDA (SEXPR ENV) (LET ((UFNVARS (AMAPCAR (LAMBDA (CADR SE VAR) (cAun S£XPR)) (ALPHATIZE (cAnnn ssxpn) (cons (Llsr (cAnn ssxvn) VAR) ENV)))) (X) (lr (ATOM (CAR x)) (CAR x) (cAAn x))) XPR)))) (LET ((FNVARS (no ((| (LENGTH UFNVARS) (- I 1)) (v NIL (cons (esurcnv 'rNvAn) v))) ((= I 0) (LET ((LENV (vAxnLIs u (MODIFY (CONS-LAB szxvn ENV)))))) (nnevznse v))))) FNVARS rNvARs £nv))) eLs (UFNVARS = UFNVARS) (ruvnns = rNvAns) (FNDEFS - (AMAPCAR (LAMBDA (x) (ALPHA-LABELS-DEFN x L£uv)) (CADR 5EXPR))) (BODY - (ALPHATIZE (cAnnR sexrn) Leuv)))

[Page 150]

140 ALPHA-LABELS-DEFN parses one LABELS definition clause. An extension to the SCHEME language (made just after the publication of [Revised Report]!) allows a LABELS definition to take on any of the same three forms permitted by DEFINE. Thus this LABELS form actually defines FOO, BAR, and BAZ to be equivalent functions:

(LABELS ((FOO (LAMBDA (X Y) (BLOCK (PRINT X) (+ X Y)))) (BAR (X Y) (PRINT X) (+ X Y)) ((BAZ X Y) (PRINT X) (+ X Y))) (LIST (FOO 1 Z) (BAR 1 2) (BAZ 1 Z))) ALPHA-BLOCK implements the standard macro definition of BLOCK. (BLOCK x) is simply x, and (BLOCK x . y) expands into:

((LAMBDA (A B) (B)) X (LAMBDA () (BLOCK . y))) MACRO-EXPAND takes a macro call and expands it into a new form to be used in place of the macro call. In the PDP-10 MacLISP SCHEME implementation there are three different kinds of macros. Types MACRO and AMACRO are defined by MacLISP code, and so their defining functions are invoked using the MacLISP primitive FUNCALL. Type SMACRO is defined by SCHEME code which is in the value cell of an atomic symbol; thus SYMEVAL is used to get the contents of the value cell, and this SCHEME function is then invoked.

ALPHA-COMBINATION converts all the subforms of a combination, making a list of them, and creates a COMBINATION structure. If the function position contains a variable, it performs a consistency check using CHECK-NUMBER-OF-ARGS to make sure the right number of arguments is present.

[Page 151]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 D37 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 RA§BlIL§§A""Q§l!5/Z§__F°9¢ ll (DEFINE ALPHA-LABELS-DEFN (LAMBDA (LDEF LENV) (ALPHATIZE (IF (ATOM (CAR LDEF)) (IF (CDDR LDEF) '(LAMBDA ,(CADR LDEF) ,(BLOCKIFY (CDDR LDEF))) (CADR LDEF)) "(LAM8DA .(CDAR LDEF) ,(BLOCKIFY (CDR LDEF)))) LENV))) (DEFINE ALPHA-BLOCK (LAMBDA (sfxvn ENV) (COND ((NuLL (CDR SEXPR)) (VARN IBLOCK with no forms) "(ENV = ,(AMAPCAR CAR ENV))) (ALPHATIZE NIL ENV)) (1 (LABELS ((MUNG ~ (LAMBDA (BODY) (IF (NULL (CDR BODY)) (CAR BODY) "((LAMBDA (A B) (B)) ,(CAR BODY) (LAMBDA () ,(MUNG (CDR BODY)))))))) (ALPNATIZE (Muna (CDR SEXPR)) £uv)))))) (DEFINE MACRO-EXPAND (LAMBDA (sexpn) (LET ((M (GETL (CAR SEXPR) '(MACRO AMACRO SMACRO)))) (IF (NULL M) (BLOCK (WARN lmissing macro definitionl SEXPR) "(ERROR '1Undefined Macro Forml ',SEXPR 'FAIL-ACT)) (EOCASE (CAR M) (MACRO (FUNCALL (CADR M) S£XPR)) (AMACRO (FuNcALL (CABR M) s£xPn)) (SMACRO ((svM£vAL (CABR M)) s£xPn))))))) (DEFINE ALPHA-COMBINATION (LAMBDA (SEXPR ENV) (LET ((N (NODIFY (CONS-COMBINATION (VARNP = NIL) (ARGS = (AMAPCAR (LAMBDA (X) (ALPHATIZE X ENV)) SEXPR))) SEXPR ENVII) (LET ((M (NODE\FORM (CAR (COMBINATION\ARG$ (NDUE\FDRM N)))))) (IF (AND (EO (TYPE M) 'VARIA8LE) (VARIABLE\GL08ALP M)) (CHECK-NUMBER-DF-ARGS (VARIABLE\VAR M) (LENGTH (CDR (COMBINATION\ARGS (NODE\FORM N)))) NIL)) N))))

[Page 152]

142 Once the S-expression function definition has been copied as a NODE tree, COMPILE calls PASSIANALYZE to fill in various pieces of information. (If optimization is tx: be performed, COMPILE instead calls META-EVALUATE. META-EVALUATE in turn calls PASS!ANALYZE in a coroutining manner we will examine later.) PASSIANALYZE in turn calls ENVANALYZE, TRIVANALYZE, and EFFSANALYZE in order. Each of these has roughly the same structure. Each takes a node and a flag called REDOTHIS. Normally REDOTHIS is NIL and the information has not yet been installed in the node, and so the routine proceeds to analyze the node and install the appropriate information. when invoked by the optimizer, however, there may be information in the node already, but that information may be incorrect or obsolete as a result of the optimizing transformations. If REDOTHIS is non-NIL, then the given node must be reanalyzed, even if the information is already present. If REDOTHIS is in fact the symbol ALL, then all descendants of the given node must be reanalyzed.

Otherwise, only the given node requires reanalysis, plus any descendants which have not had the information installed at all. We will see later how these mechanisms are used in the optimizer.

The purpose of ENVANALYZE is to fill in for each node the slots REPS and ASETS. The first is a set (represented as a list) of the new-names of all variables bound above the node and referenced at or below the node, and the second (a subset of the first) is a set of such names which appear in an ASET at or below the node. These lists are computed recursively. A CONSTANT node has no such references; a VARIABLE node (with GLOBALP = NIL) refers to its own variable. An ASET node adds its variable to the ASET list for its body. Most other kinds of nodes merely merge together the lists for their immediate descendants. In order to satisfy the "bound above the node' requirement, those structures which bind variables (LAMBDA, CATCH, LABELS) filter out their own bound variables from the two sets.

As an example, consider this function:

(LAMBDA (X) ((LAMBDA (Y) ((LAMBDA (W) (ASET' Z (* X Y))) (ASET' Y (- Y 1)))) (- X 3))) The node for (- X 3) would have a REFS list (X) and an ASET list (). The node for the ASET on Z would have REFS=(X Y) (or perhaps (Y X)) and ASETS=(); Z does not appear in the ASETS list because it is not bound above. The node for the combination ((LAMBDA (W) ...) ...) would have REFS=(X Y) and ASETS=(Y). The node for the lambda-expression (LAMBDA (Y) ...) would have REFS=(X) and ASETS=(), because Y is filtered out.

[Page 153]

001 002 003 004 005 006 007 006 009 010 011 012 013 014 015 016 017 O18 019 020 021 D22 023 024 025 026 027 028 029 030 031 D32 033 034 035 D36 037 036 039 040 041 042 043 044 045 046 047 048 049 050 051 O52 053 054 O55 056 057 D58 059 050 061 062 063 D64 065 066 067 068 069 RAB9lI_§§9__Q§L}§!Z9u_EHQQEl? ii: ENVIRONMENT ANALYSIS.

FOR NODES ENCOUNTERED VE FILL IN
; REFS Q.; ASETS ;;; ON VARIABLE NAMES THESE PROPERTIES ARE CREATED

2 L BINDING THE NODE WHERE THE VARIABLE IS BOUND' i USER-NAME THE USER'S NAME FOR THE VARIABLE (WHERE BOUND) ; READ-REFS VARIABLE NODES VHICH READ THE VARIABLE ; VRITE-REFS ASET NODES WHICH SET THE VARIABLE $32 NORMALLY, ON RECURRING TO A LOVER NODE VE STOP IF THE INFORMATION GS; IS ALREADY THERE. MAKING THE PARAMETER "REDOTHIS" BE "ALL" FORCES ;;; RE-COMPUTATION TO ALL LEVELS; MAKING IT "ONCE" FORCES ;;; RECOMPUTATION OF THIS NODE BUT NOT OF SUBNODES.

(DEFINE ENV-ANALYZE (LAMBDA (NODE REDOTHIS) (IF (OR REDOTHIS (EMPTY (NODE\REFS NODE))) (LET ((FM (NODE\FORM NODE)) (REDO (IF (EO REDOTHIS 'ALL) 'ALL NIL))) (EOCASE (TYPE FM) (CONSTANT (ALTER-NODE None (Refs ;= NIL) (ASETS ¢= u1L))) (vAn1ABL£ (ADDPROP (VARIABLE\VAR FM) NODE '

(IF (VARIABLE\GLOBALP FM) READ-REFS) (SETPROP (VARIABLE\VAR FM) (VARIABLE\VAR FH) 'USER-NAME)) (ALTER-NODE NODE (REFS := (AND (NOT (VARIABLE\GLOBALP FM)) (LIST (VARIA8LE\VAR FM)))) (ASETS := NIL))) (LAMBDA (DO ((V (LAMBDA\VARS FM) (CDR V)) (uv (LAMaoA\uvAns rn) (CDR uv))) ((NULL V)) (SETPROP (CAR V) (CAR UV) 'US ER-NAME) (SETPROP (CAR V) NODE 'BINDING)) (LET ((B (LAMBDA\80OY FM))) (ENV-ANALYZE B REDO) (ALTER-NODE NODE (REFS := (SETDIF (ASETS := (SETDI (IF (LET ((PRED (IF\PRED FM)) (CON (IF\CON FM)) (ALT (1F\AL1 rn))) (Env-ANALYZE PRED Rsoo) (ENV-ANALYZE con Reno) (Env-ANALYZE ALT Reno) (ALTER-none Nous (REFS := (union F (NODE\REFS B) (LAMBDA\VARS FM))) FF (NODE\ASETS B) (LAMBDA\VARS FM)))))) (NODE\REFS PRED) (UNION (uoo£\R£Fs con) (NODE\REFS ALr)))) (ASETS := (UNION (NODE\ASETS PRED) (ASET (LET ((B (ASET\BODY FM)) (V (ASET\VAR FM))) (ENV-ANALYZE B REDO) (ADDPROP V NODE 'WRITE-REFS) (IF (ASET\GLOBALP FM) (ALTER-NODE NODE (UNION (NODE\ASETS CON) (NODE\ASETS ALT)))))))

[Page 154]

144 It should be easy to see the the topmost node of the node-tree must have REFS=() and ASETS=(), because no variables are bound above it. This fact is used in COMPILE for a consistency check. (After writing this last sentence, I noticed that in fact this consistency check was not being performed, and that it was a good idea. On being installed, this check immediately caught a subtle bug in the optimizer. Consistency checks pay off!) Another purpose accomplished by ENV-ANALYZE is the installation of several useful properties on the new-names of bound variables. Two properties, READ-REFS and WRITE-REFS, accumulate for each variable the set of VARIABLE nodes which refer to it and the set of ASET nodes that refer to it. These lists are very important to the optimizer. A nonempty WRITE-REFS set also calls for special action by the code generator. _ when a LAMBDA node is encountered, that node is put onto each new-name under the BINDING property, and the user-name is put under the USER-NAME property; these are used only for debugging.

[Page 155]

070 071 072 073 074 075 076 077 078 079 080 081 082 083 084 D85 086 087 D88 089 090 091 092 093 094 095 096 097 098 099 100 101 102 103 104 105 106 107 108 109 110 111 112 RA88IT_§§§__05/15/Z8 Page_12,1 (REFS := (NODE\REFS B)) (ASETS I2 (NODE\A$ETS B))) (ALTER-NODE NODE (REPS :I (ADJOIN V (NODE\REPS 8))) (ASETS := (ADJOIN V (NODE\ASETS 8))))))) (CATCH (LET ((B (CATCH\BODY PM)) (v (cA1cH\vAn rM))) (SETPROP V (CATCN\UVAR PM) 'USER-NAME) (SETPROP V NODE 'BINDlNG) (ENV~ANALYZE B REDO) (ALTER-NODE NODE (REPS := (REMOVE V (NODE\REPS 8))) (ASETS := (REMOVE V (NODE\ASETS B)))))) (LABELS (oo ((v (LA8£LS\FNVARS rn) (CDR v)) (uv (LAa£Ls\uruvAns rn) (con uv)) - (D (LABELS\FNDEFS FM) (CDR D)) (R NIL (UNION n (uoo£\n£rs (CAR o)))) (A NIL (UNION A (NODE\ASETS (CAR D))))) ((NULL V) (Let ((a (LAseLs\aonv rn))) (ENV-ANALYZE 8 REDO) (ALTER-NODE NODE (REPS 2= (SETDIFP (UNION R (NODE\REPS 8)) (LABELS\PNVARS rM))) (ASETS := (SETDIFF (UNION A (NO0E\ASETS a)) (LABELS\FNVARS FH)))))) (SETPROP (CAR V) (CAR UV) 'USER-NAME) (SETPROP (CAR v) nous 'a1nu|ns) (ENV-ANALYZE (CAR D) REDO))) (COMBINATION (LET ((ARGS (COMBINATION\ARGS PM))) (AMAPC (LAMBDA (X) (ENV-ANALYZE X REDO)) ARGS) (DO ((A ARGS (CDR A)) (R NIL (UNION R (NODE\REPS (CAR A)))) (S NIL (UNION S (NODE\A$ET$ (CAR A))))) ((NULL A) (ALTER-NODE NODE (REPS := R) (ASETS 2' S)))))))))))

[Page 156]

146 TRIV-ANALYZE fills in the TRIVP slot for each node. This is a flag which, if non-NIL, indicates that the code represented by that node and its descendants is "trivial", i.e. it can be executed as simple host machine (Macl.ISP) code because no SCHEME closures are involved. Constants and variables are trivial, as are combinations with trivial arguments and a provably trivial function. While lambda-expressions are in general nontrivial (because a closure must be constructed), a special case is made for ((LAMBDA ...) ...), i.e. a combination whose function is a lambda-expression. This is possible because the code generator will not really generate a closure for the lambda-expression.

This is the first example of a trichotomy we will encounter repeatedly.

Combinations are divided into three kinds: those with a lambda-expression in the function position, those with a trivial MacLISP- primitive (satisfying the predicate TRIVFN) in the function position, and all others.

All other expressions are, in general, trivial iff all their subparts are trivial. Note that a LABELS is trivial iff its body is trivial; the nontriviality of the bound functions does not affect this.

The triviality flag is used by phase 2 to control conversion to continuation-passing style. This in turn affects the code generator, which compiles trivial forms straightforwardly into MacLISP code, rather than using the more complex techniques required by nontrivial SCHEME code. It would be possible to avoid triviality analysis entirely; the net result would only be less optimal final code.

[Page 157]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 U25 026 027 028 029 030 031 032 033 034 035 036 037 038 039 D40 041 042 043 044 045 D46 047 048 049 050 051 052 053 054 D55 056 D57 058 059 060 061 RABBIT E551 LU§ll§!Z§_ _P999 _li 22; TRIVIALITY ANALYSIS ;;; FOR NODES ENCOUNTERED VE FILL IN:

; TRIVP ;;; A COMBINATION I5 TRIVIAL IFF ALL ARGUMENTS ARE TRIVIAL, AND ;;; THE FUNCTION CAN BE PROVED TO BE TRIVIAL. VE ASSUME CLOSURES ;;; TO BE NONTRIVIAL IN THIS CONTEXT, SO THAT THE CONVERT FUNCTION ;;; VILL BE FORCED TO EXAMINE THEM.

(DEFINE TRIV-ANALYZE (LAMBDA (NODE REDOTHIS) (IF (OR REDOTHIS (EMPTY (NODE\TRIVP NODEII) (LET ((FM (NODE\FORM NODE)) (REDO (IF (EO REDOTHIS 'ALL) 'ALL NIL))) (EOCASE (TYPE FM) (CONSTANT (ALTER-NODE NODE (TRIVP := T))) (VARIABLE (ALTER-NODE NODE (TRIVP := T))) (LAMBDA (TRIV-ANALYZE (LAMBDA\BODY FM) REDO) (ALTER-NODE NODE (TRIVP := NIL))) (IF (TRIV-ANALYZE (lF\PREO FM) REDO) (TRIV-ANALYZE (IF\CON FM) RE00) (TRIV-ANALYZE (IF\ALT rn) neoo) (ALTER-Nona none (raxvv ¢= (Ano (NOD£\TRIVP (1F\PR£o rn)) (NODE\TRIVP (xF\con Fn)) (NOUE\TRIVP (Ir\ALr rn)))))) (Asar (vnlv-ANALvze (AS£T\BODY rn) Reno) (ALTER-NODE None (vnlvv ¢= (NODE\TRIVP (AS£T\BODY rM))))) (CATCH (TRIV-ANALYZE (CATCH\BODY FM) RE00) (ALTER-None None (TRIVP == NlL))) (LABELS (AMAPC (LAMBDA (F) (TRIV-ANALYZE F R£no)) (LAaeLs\Fuu£Fs rM)) (rnrv-AuALvz£ (LAa£Ls\BooY rn) REDO) (ALTER-NODE None (TRIVP := nxL))) (COMBINATION (LET ((ARss (COHBINATl0N\ARGS rn))) (TRIV-ANALYZE (CAR ARGS) REDO) (no ((A (con Anas) (con A)) (sv 1 (Ann sw (NODE\TRlVP (CAR A))))) ((NULL A) (ALTER-NODE None _ URWP:-(MDSH (TR1v-ANALYZE-rn-P (CAR ARGS)))))) (TRIV-ANALYZE (CAR A) REDO))))))))) (DEFINE TRIV-ANALYZE-FN-P (LAMBDA (FN) (OR (AND (EO (TYPE (NOUE\FORM FN)) 'VARIABLEI (TRIVFN (VARIABLE\VAR (NODE\FORM FN)))) (AND (EQ (TYPE (NO0E\FORM FN)) 'LAMBDA) (NODE\TRIVP (LAMBDA\BODY (NODE\FORM FN)))))))

[Page 158]

148 EFFS-ANALYZE analyzes the code for side-effects. In each node the four slots EFFS, AFFD, PEFFS, and PAFFD are filled in. Each is a set of side effects, which may be the symbol NONE, meaning no side effects; ANY, meaning all possible side effects; or a list of specific side effect names. Each such name specifies a category of possible side effects. Typical names are ASET, RPLACD, and FILE (which means input/output transactions).

The four slots EFFS, AFFD, PEFFS, and PAFFD refer to the node they are in and all nodes beneath it. Thus each is computed by taking the union cn' the corresponding sets of all immediate descendants, then adjoining any effects due to the current node.

EFFS is the set of side effects which may possibly be caused at or below the current node; PEFFS is the set of side effects which can be proved to occur at or below the node. These may differ because of ignorance on RABBIT's part.

For example, the node for a combination (RPLACA A B) will have the side-effect name RPLACA adjoined to both EFFS and PEFFS, because the RABBIT knows that RPLACA causes an RPLACA side effect (how this is known will be discussed later).' On the other hand, for a combination (FOO A B), where FOO is some user function, RABBIT can only conjecture that FOO can cause any conceivable side effect, but cannot prove it. Thus EFFS will be forced to be ANY, while PEFFS will not.

AFFD is the set of side effects which can possibly affect the evaluation of the current node or its descendants. For example, an RPLACA side effect can affect the evaluation of (CAR X), but on the other hand an RPLACD side effect cannot. PAFFD is. the corresponding set of side effects for which it can be proved. (This set is "proved" in a less rigorous sense than for PEFFS. The name RPLACA would be put in the PAFFD set for (CAR X), even though the user might know that while there are calls to RPLACA in his program, none of them ever modify X.

PEFFS and PAFFD are only used by CHECK-COMBINATION-PEFFS to warn the user of potential conflicts anyway, and serve no other purpose. EFFS and AFFD, on the other hand, are used by the optimizer to prevent improper code motion. Thus EFFS and AFFD must be pessimistic, and err only on the safe side; while PEFFS and PAFFD are optimistic, so that the user will not be pestered with too many warning messages.) The CONS side effect is treated specially. A node which causes the CONS side effect must not be duplicated, because each instance will create a new object; but whereas two RPLACA side effects may not be executed out of order, two CONS side effects may be.

The computation of AFFD and PAFFD for variables depends on whether the variable is global or not. If it is, SETQ and RPLACD can affect it (RPLACD can occur because of the peculiarities of the PDP-10 MacLISP implementation); otherwise, ASET can affect it if indeed any ASET refers to it (in which case ENV-ANALYZE will have left a WRITE-REFS property); otherwise, nothing can affect it.

Similar remarks hold for the computation of EFFS and PEFFS for an ASET node. The name SETQ applies to modifications of global variables, while ASET applies to local variables.

[Page 159]

001 002 003 004 005 000 007 000 009 010 011 012 013 014 015 016 017 010 019 020 021 022 023 024 025 020 027 020 029 030 031 032 033 034 035 030 037 030 039 040 041 042 043 044 045 040 047 040 049 050 051 052 053 054 055 050 057 050 059 000 061 062 063 054 005 000 GRC SIDE-EFFECTS ANALYSIS PAQ?1T_§§§_,Q§!l§[Z9_LH¢9S li C13 FOR NOUES ENCOUNTERED HE FILL IN: EFFS, AFFD, PEFFS, PAFFD ;;; A SET OF SIDE EFFECTS HAY BE (DEFINE EFFS-ANALYZE (LAMBDA (none neooruls) (IF (OR nsuovuxs EITHER 'NONE OR 'ANY, OR A SET.

(EMPTY (NODE\EFF$ NODE))) (LET ((rn (NODE\FORM N00£)) (REDO (lf (so Rsooruls 'ALL) 'ALL NlL))) (EOCASE (TYPE FM) (CONSTANT (ALTER-NODE N005 (EFFS ¢= 'n0N£) (AFFD == 'Non£) (PEFFS == 'noN£) (PAFFD == ~uoN£))) (VARIABLE (LET ((A (COND ((VARIABLE\GLOBALP FM) '(SETO)) ((Oer (vAn1AaL£\vAn fn) 'WRITE-REFS) '(As£r)) (T 'N0NE)))) (ALTER-NODE None (srrs == 'nou£) (AFFD == A) (riffs ;= 'NouL) (PAFFD := A)))) (LAMBDA (EFFS-ANALYZE (LAn00A\0o0v rn) neoo) (ALTER-none Nous (srrs := '(cous)) ° (AFFD == NIL) (PEFFS == '(cons)) (PAFFD := NlL))) (lr (EFFS-ANALYZE-IF N005 rn n£00)) (Asst (EFFS-ANALYZE (ASET\BODY rn) nano) (LET ((ASETEFF$ (IF (ASET\GL08ALP rn) '(5£T0) '(ASET)))) (ALTER-NODE None'

(Lfrs == (EFFS~UNION ASEIEFFS (NOD£\EFFS (As£1\0o0v rn)))) (Afro == (no0£\Arr0 (ASET\800Y rn))) (vsrrs := (EFFS-UNION ASETEFFS (NODE\P£FFS (ASET\BODY rM)))) (PAFF0 == (NODE\PAFFD (ASET\BODY rn)))))) (CATCH (EFFS-ANALYZE (CATCH\8O0Y rn) nano) (ALTER-NODE none (zrrs == (uo0£\£rrs (CATCH\BODY rn))) (Arrn := (N00£\Arr0 (CATCH\BO0Y rn))) (Psrrs == (NODE\PEFFS (CATCH\BODY rM))) (PAFF0 == (NO0E\PAFFU (CATCH\BODY rn))))) (LABELS (AMAPC (LAMBDA (F) (EFFS-ANALYZE F REDO)) (LABELS\FND£FS rn)) (EFFS-ANALYZE (LABELS\BODY rn) nano) (ALTER~NODE none (EFFS == (errs-uunou '(CONS) ~ (uo0£\£rrs (LABELS\BODY rn)))) (Afro ;= (NO0E\AFFD (LABELS\80DY rn))) (PEFFS == (EFFS-UNION '(CONS) (n00£\P£rFs (LABELS\800Y rn)))) (PArr0 ¢= (NODE\PAFFD (LABELS\BODY rM))))) (COMBINATION (EFFS-ANALYZE-COMBINATION none rn ne0o)))))))

[Page 160]

150 (While it may be held that allowing ASET' on variables is unclean, and that the use of cells as in PLASMA is semantically neater, it is true that because of the lexical scoping rules it can always be determined whether a given variable is ever used in an ASET'. In this way one can say that variables are divided by the compiler into two classes: those which are implicitly cells, and those which are not.) A closure (LAMBDA-expression) causes a CONS side-effect. This is not so much because SCHEME programs depend on being able to do EQ on closures (it is unclear whether this is a reasonable thing to specify in the semantics of SCHEME), as because there is no point in creating two closures when one will suffice. Adjoining CONS to EFFS will prevent the creation of such duplicate code by the optimizer. The same idea holds for LABELS (which has LAMBDA-expressions within it).

Notice that a LAMBDA node does mat add to its four sets the information from its body's sets. This is because evaluation of a LAMBDA-expression does not immediately evaluate the body. Only later, when the resulting closure is invoked, is the body executed.

EFFS-UNION gives the union of two sets of side effects. It knows about the special symbols NONE and ANY.

EFFS-ANALYZE-IF computes the side-effect sets for IF nodes. It has been made a separate function only because its code is so bulky; it must perform a three-way union for each of four sets.

EFFS-ANALYZE-COMBINATION computes the side-effect sets for COMBINATION nodes. First the function is analyzed, then the arguments. The unions of the four sets over all the arguments are accumulated in EF, AF, PEF, and PAF. CHECK-COMBINATION-PEFFS is called to warn the user of any possible violations of the rule that SCHEME is privileged to choose the order in which to evaluate the subforms of a combination. Finally, there are three cases depending on the form of the function position. If it is a variable, then the property list of the variable name is searched for information about that function. (The generated names for local variables will never have any such information; thus information will be found only for global variables. This information is used to augment the sets. (A clever technique not used in RABBIT would be to arrange for situations like ((LAMBDA (F) <body1>) (LAMBDA (___) <bodyZ>), where F denotes a "known function"

(see the description of BIND-ANALYZE below), to put on the property list of F the side-effect information for <bodyZ>, to aid optimization in <bodyl>.) If the function position is a LAMBDA-expression, then the four sets of the body of the LAMBDA-expression are unioned into the four sets for the COMBINATION node. This is because in this case we know that the body LAMBDA-expression will be executed in the course of executing the COMBINATION node.

In any other case, an unknown function is computed, and so it must be assumed that any side-effect is possible for EFFS and AFFD.

[Page 161]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 O35 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 062 063 064 065 R5§§ll_§§§__Q§!l§/Z§__PP9€~!§ (UEPINE Errsunion (LAMBDA (A B) (COND ((Eo A 'NONE) B) ((Eo s nous) A) ((Eo A 'Auv) 'Anv) ((Eo a 'ANY) °AnY) (1 (UNION A e))))) (DEFINE EFFS-ANALYZE-IF (LAMBDA (NODE FH REDO) (BLOCK (EFFS-ANALYZE (IF\PRED FH) REDO) (EFFS-ANALYZE (lF\CON FM) REDO) (EFFS-ANALYZE (IF\ALT FM) REDO) (ALTER-NODE NODE (EFFS == (EFFSUNION (NouE\Errs (1E\PnEo rM)) (Erfs-union (NODE\EFFS (IF\CON (NODE\EFFS (1r\ALr (Afro ;= (Errsunion (uooE\Arru (1r\PnEn rn)) (EFFSUNION (NooE\APru (lr\cou (NODE\AFFO (IF\ALT (PEFFS == (EFFSUNION (NODE\PEFFS (Ir\PREo rn)) (EFFSUNION (NODE\PEFFS (IF\CON (NooE\PErrs (\P\AE1 (PArrn == (EFFSUNION (NODE\PAFFD (1F\PREo rn)) (Errsunion (NODE\PAFFD (IF\CON (NODE\PAFFD (lF\ALT (S£T' *CHECK-PEFFS* NIL) (DEFINE EFFS-ANALYZE-COMBINATION (LAMBDA (NODE FM REoo) (LET ((ARss (COMBINATION\ARGS FM))) (EFFS-ANALYZE (CAR ARGS) REDO) (oo ((A (CDR Anas) (con A)) (EF 'nous (Errsunion EF (NODE\EFFS (CAR A)))) (AP 'uoNE (Errsunion AF (nooE\Arro (CAR A)))) (PEP 'none (PAF 'NONE ((uuLL A) (EFFSUNION PAF (NooE\PArPn (CAR A (EFFSUNION PEF (nouE\PEPrs (CAR A)))) )))) I

I J

(IF *CHECK-PEFFS* (CHECK-COMBINATION-PEFFS FH)) (COND ((Eo (LET ((V (VARIABLE\VAR (NODE\FORM (CAR (TYPE (NODE\FORM (CAR ARGS))) 'vAn IABLE) ARGS))))) (LET ((vE (GET v 'FN-SIDE-EFFECTS)) (VA (GET v 'rn-s1uE-ArPEcrE (ALTER-none NODE (srrs ¢= (IF VE (AFFD :- (IF VA (PErrs == (Errs(PArro == (Errs((Eo (TYPE (NooE\roRM (CAR ARGS))) 'LAM 0))) (EFFS (EFFS UNION UNION BDA) FH FM UNION EF VE) UNION AF VA) PEF vE)) PAF VA)))))) (LET ((8 (LAM8DA\8ODY (NODE\FORH (CAR ARGS))))) (ALTER-NODE NODE FH)) ))))) )) FH))))) rM)) FM))))) FM)) FH))))))))) 'ANY)) 'ANY)) (EFFS := (EFFSUNION EF (NODE\EFFS B))) (AFFU := (EFFSUNION AF (NODE\AFFD B))) (PEFFS := (EFFSUNION PEF (NODE\PEFFS B))) (PAFFD 2' (EFFSUNION PAF (NODE\PAFFD B)))))) (T (ALTER-NooE NODE (Errs ;= 'ANY) (Afro == 'ANY) (PEEPS == (Errsunion PE F

(NODE\PEFFS (CAR ARGS)))) (PAFFD := (EFFSUNION PA F

(NODE\PAFFD (CAR ARGS)))))))) (EFFS-ANALYZE (CAR A) REDO)))))

[Page 162]

152 CHECK-COMBINATION-PEFFS checks all the argument forms of a combination (including the function position) to see if they are all independent of each other with respect to side effects. If not, a warning is issued. This is because the semantics of SCHEME specify that the arguments may be evaluated in any order, and the user may not depend on a particular ordering.

The test is made by comparing all pairs of arguments within the combination. If the side-effects of one can "provably" affect the evaluation of the other, or if they both cause a side effect of the same category (other than CONS, which is special), then the results may depend on which order they are evaluated in. The test is not completely rigorous, and may err in either direction, but "probably" a reasonably written SCHEME program will satisfy the test.

This check is controlled by the switch *CHECK-PEFFS* in EFFS-ANALYZE-COMBINATION. This switch is provided because empirical tests show that performing the test slows down compilation by twenty to thirty percent. The test has proved valuable in trapping programming errors, and so is normally on, but it can be turned off for speed in compiling programs in which one has confidence.

EFFDEF is a macro which expands into a number of DEFPROP forms. This is used to define side-effect information about primitive functions. For example:

(EFFDEF CADR NONE (RPLACA RPLACD)) states that CADR causes no side-effects, and is "provably' affected by the RPLACA and RPLACD categories of side-effects. Similarly:

(EFFDEF NEMO NONE (RPLACA RPLACD) T) specifies the same information for MEMQ, but the "T" means that a call to MEMQ with constant arguments may be "folded" (evaluated, and the result compiled instead), despite the fact that some side effects can affect it. This represents a judgement that it is extremely unlikely that someone will write a program which modifies a constant argument to be given to MEMQ.

[Page 163]

001 002 003 004 005 006 007 008 009 010 Ol!

012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 R"BB!TL§§§CC°§L!§/7§CCE!9€C15 (DEFINE CHECK-COMBINATION-Pefrs (LAMBDA (FM) (IF (NOT (COMBINATION\VARNP FM)) (oo ((A (COMBINATION\ARGS FM) (Con A))) ((NULL A)) (no ((a (Con A) (Con a))) ((NULL s)) (IF (NOT (EFFECTLESS (EFFS-INTERSECT (NOUE\PEFFS (CAR A)) (NODE\PAFFD (CAR B))))) (BLOCK (WARN Ico-argument may affect later onel (NODE\SEXPR (CAR A)) "(EFFECT$ = ,(NODE\PEFFS (CAR A))) (NODE\S£XPR (CAR B)) "(AFFECTED BY .(NODE\PAFFD (CAR B)))) (ALT£R-COMBINATION FH (VARNP := T)))) (IF (NOT (EFFECTLESS (EFFS-INTERSECT (NOUE\PEFFS (NODE\PAFFD (BLOCK (VARN Ico-argument may affect earlier (Non£\s£xPR (CAR a)) "(EFFECTS = ,(Non£\P£rrs (CAR s))) (NODE\SEXPR (CAR A)) "(AFFECTED BY .(NODE\PAFFU (CAR A)))) (ALTER-COMBINATION FM (VARNP := T)))) (IF (NOT (EFFECTLESS-EXCEPT-CONS (EFFS-INTERSECT (NODE\PEFFS (NO0E\PEFFS (BLOCK (YARN lco-arguments may have interfering effectsl (NODE\S£XPR (CAR A)) "(EFFECTS = ,(NODE\PEFF$ (CAR A))) (NODE\SEXPR (CAR B)) "(EFFECTS = ,(NODE\P£FF$ (CAR B)))) (ALTER-COMBINATION FH (HARNP := T))))))))) (CAR a)) (CAR A))))) onel (DEFHAC EFFDEF (FN EFFS AFFD _ FOLD) "(PROGN (DEFPROP ,FN ,EFFS FN-SIDE-EFFECTS) (DEFPROP ,FN ,AFFD FN-SIDE-AFFECTED) ,(ANU FOLD "(UEFPROP ,FN T OKAY-TO-FOLD)))) (DECLARE (/EDEFINE EFFDEF 1SIDE EFFECTS1)) (CAR A)) (CAR B)))))

[Page 164]

154 This page contains declarations of side-effect information for many standard primitive functions. The EFI-'DEF macro used to make the declarations is described on the previous page.

[Page 165]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 ORS 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 062 063 064 065 066 067 (PROGN 'COMPILE (EFFDEF (EFFDEF (EFFDEF (EFFDEF (Erruer (EFFDEF (srrnfr (errosr (srrnsr (EFFDEF (EFFDEF (zrroer (EFFDEF (EFFDEF (errosr (EFFDEF (£rfu£F (erroer (EFFDEF (EFFDEF (EFFDEF (EFFDEF (EFFUEF (efrozr (srrosr (erroer (erroer (sfrosr (EFFDEF (EFFDEF (errnsr (EFFDEF (errner (erroer (EFFDEF (EFFDEF (EFFDEF (errosr (errner (EFFDEF (errner (errnzr (errnzf (EFFDEF (srrnzr (zrrozf (erfoir (EFFDEF (erroer (erroer (Erroer (srrosf (EFFDEF (zrfoer (EFFDEF (EFFOEF (errner (srrnsr (EFFUEF (zrrner (EFFUEF (srroef (erruer (zrroer + None Nous) _ none Nous) ¢ None NONE) // noun NONE) = None None) < nous Nous) > Nona NONE) CAR none (RPLACA)) con none (RPLACO)) CAAR Nous (RPLAcA)) cnon Nous (RPLACA RPLACD)) coAR none (RPLACA RPLACD)) conn none (RPLACU)) CAAAR CAAOR CAOAR CADOR CDAAR CDADR CODAR CODUR CAAAAR CAAAOR CAADAR CAADOR CADAAR CADADR CAODAR CADUDR COAAAR CDAAUR CDADAR CDADDR CDDAAR CDDADR CDDDAR CDUDOR NONE NONE NONE NONE NONE NONE NONE NONE NONE NONE NONE NONE NONE NONE NONE NONE NONE NONE NONE NONE NONE NONE NONE NONE (RPLACA) (RPLACA (RPLACA (RPLACA (RPLACA (RPLACA (RPLACA (RPLACD) (RPLACA (RPLACA (RPLACA (RPLACA (RPLACA (RPLACA (RPLACA (RPLACA (RPLACA (RPLACA (RPLACA (RPLACA (RPLACA (RPLACA (RPLACA (RPLACO cxn none (RPLACA RPLACU)) RPLACA (RPLACA) NONE) nPLAco (RPLACA) none) RPLACX (RPLACA RPLACD) NONE) so Noni Nous) Aron Nous None) uumaenv None None) TYPEP nous noni) SYMBOLP Nous NONE) Huuxv Nous None) rxxp None None) rLoA1P None Noun) BIGP None NONE) nor None nous) NULL None None) cons (cons) NONE) LIST (cons) none) Appsno (cons) (RPLACD)) nemo None (RPLACA RPLACU) 1) Asso Nous (nPLAcA nPLAcu) PRINT (FILE) (FILE RPLACA PRIN1 (FILE) (FILE RPLACA Pnluc (FILE) (FILE RPLACA TERPRL (FILE) (FxL£)) rvo (FILE) (FILE)) READ ANY (rxL£)) Tvx ANY (FILE)) 'SIUE-EFFECTS-PROPERTIES) ) nPLAco)) RPLACU)) RPLACD)) nPLAco)) RPLACD)) nPLAcu)) )

)) RPLACD)) RPLACO)) RPLACO)) RPLACD)) RPLACD)) RPLACO)) RPLACD)) RPLACO)) RPLACD)) nPLAco)) RPLACD)) RPLACO)) RPLACD)) RPLAcn)) )) T) nvLAco)) RPLACU)) RPLACD)) FFF!!I_§§§L_Q§£!§£1§_LP°9€_ll

[Page 166]

156 ERASE-NODE and ERASE-ALL-NODES are convenient mnemonic macros used to invoke ERASE-NODES.

ERASE-NODES is used by the optimizer to destroy nodes which have been removed from the program tree because of some optimization. If ALLP is NIL (ERASE-NODE), then only the given node is erased; if it is T (ERASE-ALL-NODES), then the given node and all descendants, direct and indirect, are erased.

Erasing a node may involve removing certain properties from property lists. This is necessary to maintain the consistency of the properties. For example, if a VARIABLE node is erased, then that node must be removed from the READ-REFS property of the variable name. The optimizer depends on this so that, for example, it can determine whether all references to a variable have been erased.

It should be noted in passing that in principle all occurrences of ASET on a given variable could be erased, thereby reducing its WRITE-REFS property to NIL. Because the EFFS-ANALYZE computation on VARIABLE nodes used the WRITE-REFS property, a VARIABLE node might have ASET in its AFFD set after the optimizer had removed all the ASET nodes. Because of the tree-walking discipline of the optimizer, the VARIABLE nodes will not be reanalyzed immediately. This cannot hurt, however; it may just cause the optimizer later to be more cautious than necessary when examining a VARIABLE node. (If this doesn't make sense, come back after reading the description of the optimizer.) The flag *TESTING* is used to determine whether or not to remove the node from the NODE property on the node's name, When debugging, it is very useful to keep this information around to trace the optimizer's actions; but when compiling a large function for "production" purposes, the discarded nodes may bloat memory, and so they must be removed from the NODE property in order that they may be garbage-collected by LISP.

[Page 167]

001 002 D03 004 005 006 007 008 009 010 011 D12 D13 014 015 D16 017 018 019 020 021 022 D23 024 025 026 027 028 029 030 D31 032 033 034 035 036 037 D38 039 040 BABE II _51§5__9jLl§!Z9___R99'=, I9 S31 THIS ROUTINE IS USED TO UNDO ANY PASS 1 ANALYSIS ON A NODE.

(DEFMAC ERASE-NODE (NODE) "(ERASE-NODES ,NODE NIL)) (DEFMAC ERASE-ALL-NODES (NODE) "(ERASE-NODES ,NODE T)) (DEFINE ERASE-NODES (LAMBDA (NODE ALLP) (LET ((rM (Non£\1=oRn Nooe))) (OR (EQ (TYPE NODE) 'NODE) (ERROR '1Cannot erase a non-node) NODE 'FAIL-ACT)) (EOCASE (TYPE FM) (CONSTANT) (VARIABLE (DELPROP (VARIABLE\VAR FM) NODE 'READ-REFS)) (LAMBDA (IF ALLP (ERASE-ALL-NODES (LAMBDA\BODY FM))) (IF (NOT *TESTING*) (AMAPC (LAMBDA (V) (REMPROP V 'BINDING)) (LAMBDA\VARS FM)))) (IF (COND (ALLP (ERASE-ALL-NODES (IF\PR£D FH)) (ERASE-ALL-NODES (IF\CON FM)) (ERASE-ALL-NODES (IF\ALT FM))))) (ASET (IF ALLP (ERASE-ALL-NODES (A5ET\BODY FM))) (DELPROP (ASET\VAR FH) NODE 'WRITE-REFS)) (CATCH (IF ALLP (ERASE-ALL-NODES (CATCH\BODY FM))) (IF (NOT *TESTING*) (REMPROP (CATCH\VAR FM) 'BINDING))) (LABELS (COND (ALLP (AMAPC (LAMBDA (D) (ERASE-ALL-NODES D)) (LABELS\FNDEFS FM)) (ERASE-ALL-NODES (LABELS\BODY FM)))) (IF (NOT lTESTING*) (AMAPC (LAMBDA (V) (REMPROP V 'BINDlNG)) (LABELS\FNVARS FM)))) (COMBINATION (IF ALLP (AMAPC (LAMBDA (A) (ERASE-ALL-NODES A)) (COMBINATION\ARGS FM))))) (IF (NOT *TESTING*) (REMPROP (NODE\NAME NODE) 'NODE)))))

[Page 168]

158 META-EVALUATE is the top-level function of the optimizer. It accepts a node, and returns a node (not necessarily the same one) for an equivalent program. '

The METAP flags in the nodes are used to control reanalysis. META-EVALUATE checks this flag first thing, and returns the given node immediately if its METAP flag is non-NIL, meaning the node has already been properly optimized.

Otherwise it examines the node more carefully.

Some rules about the organization of the optimizer:

[I] A node returned by a call to META-EVALUATE will always have its METAP flag set.

[2] The descendants of a node must be meta-evaluated before any information in them is used.

[3] If a node has its METAP flag set, so do all of its descendants. Moreover, REANALYZE1 has been applied to the node, so all of the information filled in by pass-1 analysis (ENV-ANALYZE, TRIV-ANALYZE, and EFFS-ANALYZE) is up-to-date. when COMPILE calls META-EVALUATE, all the METAP flags are NIL, and no pass-1 analysis has been performed. META-EVALUATE, roughly speaking, calls itself recursively, and meta-evaluates the node tree from the bottom up. After meta-evaluating all the descendants of a node, it applies REANALYZE1 to perform pass-1 analysis on that node, sets the METAP flag, and returns the node.

Exceptions can be made to this discipline if a nontrivial optimization occurs.

If the (meta-evaluated) predicate part of an IF node is itself an IF node (and the debugging switch *FUDGE* is set), then META-IF-FUDGE is called. If it is a constant, then the value of the constant is used to select either the consequent CON or the alternative ALT. The other one is then erased, and the IF node is itself erased. The selected component node is then returned (it has already been meta-evaluated). The statistics counter *DEAD-COUNT* counts occurrences of this "dead code elimination" optimization.

The other two interesting cases are COMBINATION nodes whose function position contains either a trivial function or a LAMBDA node. META-COMBINATION-TRIVFN and META-COMBINATION-LAMBDA handle these respective cases.

[Page 169]

001 002 003 O04 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 D25 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 D41 D42 043 044 045 046 D47 046 049 050 051 052 053 054 055 056 057 058 059 060 061 062 063 064 065 066 067 068 FBBBII_5§§ __ 05115/73,,TP,H9¢ _19 ;;; THE VALUE OF META-EVALUATE IS THE (POSSIBLY NEW) NODE RESULTING FROM THE GIVEN ONE.

(SET' *FUDGE* T) (SET' *DEAD-COUNT* 0) (DEFINE META-EVALUATE (LAMBDA (NODE) ;SVITCN TO CONTROL META~lF-FUDGE ;COUNT OF DEAD-CODE ELIMINATIONS (IF (NODE\METAP NODE) NODE (LET ((FM (NODE\FORM NODE))) (EOCASE (TYPE FM) (CONSTANT (REANALYZE1 (ALTER-NODE (VARIABLE (REANALYZE1 (ALTER-NODE (LAMBDA (ALTER-LAMBDA FM (BODY := (META-EVALUATE (LAMBDA\BODY FM)))) (REANALYZEI NODE) (ALTER-NODE NODE (METAP := T))) NODE) NODE (METAP := T))) NODE) NODE (METAP := T))) (IF (ALTER-IF FM (PRED := (META-EVALUATE (lF\PRED FM))) (CON := (META-EVALUATE (IF\CON FM))) (ALT := (META-EVALUATE (IF\ALT FH)))) (IF (AND *FUDGE* (EO (TYPE (NODE\FORM (IF\PRED FM))) 'IF)) (META-IF-FUDGE NODE) (IF (EO (TYPE (NODE\FORM (IF\PRED FM))) 'CONSTANT) (LET ((CON (IF\CON FM)) (ALT (IF\ALT FM)) (VAL (CONSTANT\VALUE (NOOE\FORM (IF\PRED FH))))) (ERASE-NODE NODE) (ERASE-ALL-NODES (lF\PRED FM)) (INCREMENT *DEAD-COUNT*) (IF VAL (BLOCK (BLOCK (BLOCK (REANALYZE1 NODE) (ALTER-NODE NODE (METAP := T)))))) (ERASE-ALL-NODES ALT) CON) (ERASE-ALL-NODES CON) ALT))) (ASET (ALTER-Aser rn (souv := (META-£vALuAT£ (ASET\BODY rM)))) (REANALYZE1 None) (ALTER-Nous none (METAP ;= T))) (CATCH (ALTER-CATCH rn (BODY == (META-£vALuAT£ (CATCH\BODY rM)))) (REANALYZEI NODE) (ALTER-NODE none (HETAP == T))) (LABELS (no ((o (LAa£Ls\rnners FM) (CDR o))) ((NULL 0)) (RPLACA D (META-EVALUATE (CAR n)))) (ALTER-LABELS FM (BODY ¢= (META-EVALUATE (LAa£Ls\aooY rH)))) (neANALYze1 Nous) (ALTER-NODE Nous (METAP ¢= T))) (COMBINATION (LET ((rN (NODE\FORM (CAR (coMa|nAT1oN\Anss Fn))))) (COND ((Ann (eo (TYPE FN) 'vARxAaL£) (TRIVFN (VARIABLE\VAR FN))) (META-coMa1NAT|oN-Tnxvru non£)) ((£o (TYPE FN) 'LAH8DA) (META-COMBINATION-LAMBDA noo£)) (T (oo ((A (COMBINATION\ARGS rn) (CDR A))) ((NULL A)) (nPLAcA A (META-evALuAT£ (CAR A)))) (REANALYZEI none) (ALTER-NODE none (METAP == T)))))))))))

[Page 170]

160 For an IF nested within another IF, the transformation shown in the comment is performed. This involves constructing an S-expression of the appropriate form and then calling ALPHATIZE to convert it into a node-tree. (The node-tree could be constructed directly. but it is easier to call ALPHATIZE.

This is the reason why ALPHATIZE merely returns a NODE if it encounters one in the S-expression; META-IF-FUDGE inserts various nodes in the S-expression it constructs.) The original two IF nodes are erased, a statistics counter *FUDGE-COUNT* is incremented, and the new expression is meta-evaluated and returned in place of the nested IF nodes.

(The statistics counter shows that this optimization is performed with modest frequency, arising from cases such as (IF (AND ...)...).) META-COMBINATION-TRIVFN performs the standard recursive meta-evaluation of all the arguments, and then checks to see whether the combination can be "folded". This is possible all the arguments are constants, and if the function has no side effects and cannot be affected by side-effects, or has an OKAY-TO-FOLD property. If this is the case, the function is applied to the arguments, the combination node and its descendants are erased, the statistics counter *FOLD-COUNT* is bumped, and a new CONSTANT node containing the result is created and meta-evaluated. This might typically occur for (NOT NIL) => T, or (+ 3 4) => 7, or (NENQ 'BAR '(FOO BAR BAZ)) => '(BAR BAZ). If this optimization is not permissible, then the usual reanalysis and setting of the METAP flag is performed.

(The statistics counter shows that even in a very large program such as RABBIT this optimization is performed fewer than a dozen times. This may be due to my programming style, or because there are very few macros in the code for RABBIT which might expand into foldable code.)

[Page 171]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 062 063 064 RABBIT'§§§"_0§[I§/7§<MPage_2D ;;; TRANSFORM (IF (IF A a c) 0 E) Into; ; ; ((LAMBDA (01 ex) ; (IF A (IF a (01) (£1)) (IF c (01) (£1)))) ; (LAMBDA () n) ;.. (LAMBDA () E)) (SET' *FUDGE-COUNT* 0) ;COUNT OF IF-FUDGES (DEFINE META-IF-FUDGE (LAMBDA (NODE) (LET ((rn (NO0E\FORH N00£))) (LET ((Prn (uone\ronn (1r\PR£u rn)))) (LET ((u (ALPHATIZE (LET ((coNvAn (seurcnp 'META-CON)) (ALTVAR (GENTEMP 'MELA-ALr))) '((LAMaoA (,CONVAR ,ALTVAR) (IF .(1r\PREo Prn) (IF ,(IF\CON PFM) (,CONVAR) (,ALTVAR)) (IF ,(IF\ALT Prn) (,couvAn) (_ALrvAn)))) (LAMBDA () ,(1F\con rn)) (LAMBDA () ,(IF\ALT rn)))) (uoo£\£uv NOD£)))) ;oossn'r MATTER (ERASE-NODE Noor) (ERASE-NODE (IF\PRED rn)) (INCREMENT *FUDGE-COUNT¢) (META-EVALUATE n)))))) ;;; REDUCE A COMBINATION VITH A SIDE~EFFECT-LESS TRIVIAL ;;; FUNCTION AND CONSTANT ARGUMENTS TO A CONSTANT.

(SET' *FOLD-COUNT* 0) ;COUNT OF CONSTANT FOLDINGS (DEFINE META-COMBINATION-TRIVFN (LAMBDA (NODE) (LET ((FM (NODE\FORM NODE))) (LET ((ARGS (COMBINATION\ARGS FH))) (RPLACA ARGS (META-EVALUATE (CAR ARGS))) (DO ((A (CDR ARGS) (CDR A)) ° (CONSTP (LET ((FNNAME (VARIABLE\VAR (NODE\FORM (CAR ARGS))))) (OR (AND (EO (GET FNNAHE 'FN-SIDE-EFFECTS) 'NONE) (EO (GET FNNAME 'FN-SIDE-AFFECTED) 'NONE)) (GET FNNAME '0KAY-TO-FOLD))) (Ann consrv (eo (TYPE (NO0E\FORM (CAR A))) 'CONSTANT)))) ((nuLL A) (COND (CONSTP (LET ((VAL (APPLY (VARIA8LE\VAR (NODE\FORH (CAR ARGS))) (AMAPCAR (LAMBDA (x) (CONSTANT\VALUE (uooe\ronn x))) (CDR ARGS))))) (ERASE-ALL-MODES NODE) (xncneneur *FOLD~COUNT*) (META-EVALUATE (ALPHATIZE °(ouor£ ,VAL) NIL)))) (1 (n£AnALvze1 NODE) (AL1£n-none nous (MLIAP == r))))) -(RPLACA A (META-EVALUATE (CAR A))))))))

[Page 172]

162 META-COMBINATION-LAMBDA performs several interesting optimizations on combinations of the form ((LAMBDA ...) ...). It is controlled by several debugging switches, and keeps several statistics counters, which we will not describe further.

First all the arguments, but @ the LAMBDA-expression, are meta-evaluated by the first DO loop. Next, the body of the LAMBDA node is meta-evaluated and kept in the variable B in the second DO loop. This loop iterates over the LAMBDA variables and the corresponding arguments. For each variable-argument pair, SUBST-CANDIDATE determines whether the argument can "probably" be legally substituted for occurrences of the variable in the body. If so, META-SUBSTITUTE is called to attempt such substitution. when the loop finishes, B has the body with all possible substitutions performed. This body is then re-meta-evaluated. (The reason for this is explained later in the discussion of META-SUBSTITUTE.) Next an attempt is made to eliminate LAMBDA variables. A variable and its corresponding argument may be eliminated if the variable has no remaining references, and the argument either has no side effects or has been successfully substituted. (If an argument has side effects, then SUBST-CANDIDATE will give permission to attempt substitution only if no more than one reference to the corresponding variable exists. If the substitution fails, then the argument may not be eliminated, because its side effects must not be lost. It the substitution succeeds, then the argument must be eliminated, because the side effects must not be duplicated.) A consistency check ensures that in fact the variable is unreferenced within the body as determined by its REFS and ASETS slots; then the argument and variable are deleted, and the nodes of the argument are erased. when all possible variable-argument pairs have been eliminated, then there are two cases. If the LAMBDA has no variables left, then the combination containing it can be replaced by the body of the LAMBDA node. In this case the LAMBDA and COMBINATION nodes are erased. Otherwise the LAMBDA and COMBINATION nodes are reanalyzed and their METAP flags are set.

(The statistics counters show that when RABBIT compiles itself these three optimizations are performed hundreds of times. This occurs because many standard macros make use of closures to ensure that variables local to the code for the macro do not conflict with user variables. These closures often can be substituted into the code by the compiler and eliminated.)

[Page 173]

DO1 002 003 004 005 005 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 D27 028 029 030 031 D32 033 034 035 036-037 038 039 040 041 042 043 044 045 046 047 D48 049 050 051 052 053 054 055 056 057 RAPBITE§Q§EEQ§ll5(7¢mEP°9= 2?

(sEr' *FLUSH-Anas* 1) ;sv1TcH TO CONTROL VARIABLE ELIHINATION (ser' *FLUSH-COUNT* 0) ;counr or VARIABLES ELIMINATED (s£T' *CONVERT-COUNT* 0) ;COUNT or ruLL BETA-couvsnsxons (DEFINE META-COMBINATION-LAMBDA (LAMBDA (NODE) (LET ((FM (NODE\FORM NODE))) (LET ((ARGS (COMBlNATl0N\ARGS FM))) (DO ((A (CDR ARGS) (CDR A))) ((NULL A)) (RPLACA A (META~EVALUATE (CAR A))) (ALTER-NODE (CAR A) (SUBSTP := NIL))) (LET ((FN (NODE\FORM (CAR ARGS)))) (DO ((V (LAM8DA\VARS FN) (CDR V)) (A (CDR ARGS) (CDR A)) (8 (META-EVALUATE (LAMBDA\BODY FN)) (IF (SUBST-CANDIDATE (CAR A) (CAR v) B) (META-SUBSTITUTE (CAR A) (CAR V) B) B))) ((NULL V) (ALTER-LAMBDA FN (BODY := (META-EVALUATE B))) (oo ((v (LAMBDA\VARS rn) (con V)) (A (CDR ARGS) (CDR A))) '

((NULL A)) (IF (AND *FLUSH-ARGS* (NULL (GET (CAR V) 'READ-REFS)) (NULL (GET (CAR V) 'WRITE-REFS)) (OR (EFFECTLESS-EXCEPT-CONS (NODE\EFFS (CAR A))) (NODE\SUBSTP (CAR A)))) (BLOCK (IF (OR (MEMO V (NODE\REFS (LAMBDA\BODY FN))) (MEMO V (NODE\ASETS (LAHBDA\BODY FN)))) V (ERROR '1ReanaTysis lost - META-COMBINATION-LAMBDAI NODE 'FAIL-ACT)) (DELO (CAR A) ARGS) (ERASE-ALL-NODES (CAR A)) (INCREMENT *FLUSH-COUNT*) (ALTER-LAMBDA FN (VARS := (DELO (CAR V) (LAMBUA\VARS FN))) (UVARS := (DELO (GET (CAR V) 'USER-NAME) (LAMBDA\UVARS FN))))))) (COND ((NULL (LAM8DA\VARS FN)) (OR (NULL (CDR ARGS)) (ERROR '1Too many args in META-COMBINATION-LAMBDAI NODE 'FAIL-ACT)) (LET ((BOD (LAMBDA\BODY FN))) (ERASE-NODE (CAR ARGS)) (ERASE-NODE NODE) (INCREMENT *CONVERT-COUNT*) BOD)) (T (REANALYZE1 (CAR ARGS)) (ALTER-NODE (CAR ARGS) (METAP := T)) (REANALYZE1 NODE) (ALTER-NODE NODE (MEIAP := T)))))))))))

[Page 174]

164 (SUBST-CANDIDATE ARG VAR BOD) is a predicate which is true iff it is apparently legal to attempt to substitute the argument ARG for the variable VAR in the body BOD. This predicate is very conservative, because there is no provision for backing out of a bad choice. The decision is made on this basis:

[1] There must be no ASET references to the variable. (This is overly restrictive, but is complicated to check for correctly, and makes little difference in practice.) [2] One of three conditions must hold:

[Za] There is at most one reference to the variable. (Code with possible side effects must not be duplicated. Exceptions occur, for example, if there are two references, one in each branch of an IF, so that only one can be executed. This is hard to detect, and relaxing this restriction is probably not worthwhile.) [Zb] The argument is a constant or variable. (This is always safe because the cost of a constant or variable is no worse than the cost of referencing the variable it replaces.) [Zc] The argument is a LAMBDA-expression, and either:

[Zcl] There is no more than one reference. (This is tested again because of the presence of debugging switches in SUBST-CANDIDATE which can control various tests independently to help localize bugs.) [2c2] The body of the LAMBDA-expression is a combination, all of whose descendants are constants or variables, and the number of arguments of the combination (not counting the function) does not exceed the number of arguments taken by the LAMBDA-expression. (The idea here is that substitution of the LAMBDA-expression into function position of some combination will later allow reduction to a combination which is no worse than the original one. This test is a poor heuristic if references to the variable VAR occur in other than function position within BOD, because then several closures will be made instead of one, but is very good for code typically produced by the expansion of macros. In retrospect, perhaps ENV-ANALYZE should maintain a third property besides READ-REFS and WRITE-REFS called, say, NON-FN-REFS. This would be the subset of READ-REFS which occur in other than function position of a combination. SUBST-CANDIDATE could then use this information. Alternatively, META-SUBSTITUTE could, as it walked the node-tree of the body, keep track of whether a variable was encountered in function position, and refuse to substitute a LAMBDA-expression for a variable not in such a position which had more than one reference. This might in turn prevent other optimizations, however.)

[Page 175]

001 002 003 004 005 006 007 008 009 010 Ol!

012 013 014 015 016 017 018 019 020 021 022 023 024 RABBU 5§§E L93ZLlL§/ZQLLPPQELLZZ (sET' *SUBSTITUTE* T) ;SUlTCH TO CONTROL SUBSTITUTION (SET' *SINGLE-SUBST* T) ;SVITCH TO CONTROL SUBSTITUTION or ExPREss1oNs WITH SIDE EFFECTS (SET' *LAMBDA-SUBST* T) ;SVlTCN TO CONTROL SUBSTITUTION or LAMBDA-EXPRESSIONS (UEFXNE suasT-cANo1uATE (LAMBDA (ARG vnu BOD) (Ann *SUBSTITUTE* (NOT (GET vAn 'WRITE-REFS)) ;uE PARANOTU ron now (OR (AND *SINGLE-SUBST* (NULL (con (GET vnu 'READ-nErs)))) (MEMQ (TYPE (NOD£\FORM ARG)) '(CON5TANT VARIABLE)) (Ano *LAMBDA-SUBST* (Eo (TYPE (NouE\EoRM ARG)) 'LAMBDA) (OR (NULL (con (GET VAR 'READ-REFs))) (LET ((a (NooE\EoRM (LAM8DA\BODY (NonE\roRM ARG))))) (OR (MEMO (TYPE a) '(CONSTANT vARlAsLE)) (Ann (Eo (TYPE B) 'coMs1NAT1oN) (NOT (> (LENGTH (con (coMa1uATToN\AnGs a))) (LENGTH (LAMBDA\VARS (NooE\EonM AnG))))) (no ((A (coMaINATloN\ARGs a) (CDR A)) (P T (Ann P (MEMO (TYPE (NODE\FORH (CAR A))) '(coNsTANT VARIABLE))))) ((NULL A) P)))))))))))

[Page 176]

166 REANALYZEI calls PASS1-ANALYZE on the given node. The argument T means that optimization is in effect, and so EFFS-ANALYZE must be invoked after ENV-ANALYZE and TRIV-ANALYZE (EFFS-ANALYZE information is used only by the optimizer). The argument *REANALYZE* specifies whether reanalysis should be forced to all descendant nodes, or whether reanalysis of the current node will suffice. This variable normally contains the symbol ONCE, meaning reanalyze only the current node. META-EVALUATE normally ensures, before analyzing a node, that all descendant nodes are analyzed. Thus the initial pass-1 analysis occurs incrementally, interleaved with the meta-evaluation process.

The switch *REANALYZE* may be set to the symbol ALL to force all descendants of a node to be reanalyzed before analyzing the node itself. This ability is provided to test for certain bugs in the optimizer. If the incremental analysis should fail for some reason, then the descendant nodes may not contain correct information (for example, their information slots may be empty!). The ALL setting ensures that a consistent analysis is obtained. If the optimizer's behavior differs depending on whether *REANALYZE* contains ONCE or ALL, been then a problem with the incremental analysis is implicated. This switch has very useful for isolating such bugs.

The next group of functions are utilities for META-SUBSTITUTE which deal with sets of side-effects.

EFFS-INTERSECT takes the intersection of two sets of side-effects. It is just like INTERSECT, except that it also knows about the two special sets ANY and NONE.

EFFECTLESS is a predicate which is true of an empty set of side-effects.

EFFECTLESS-EXCEPT-CONS is a predicate true of a set of side-effects which is empty except possibly for the CONS side-effect.

PASSABLE takes a node and two sets of side-effects, which should be the EFFS and AFFD sets from some other node. PASSABLE is a predicate which is true if the given node, which originally preceded the second in the standard evaluation order, can legitimately be postponed until after the second is evaluated. That is, it is true iff the first node can 'pass' the second during the substitution process.

[Page 177]

001 002 003 004 005 006 007 008 009 010 O11 O12 013 014 015 D16 017 018 D19 020 021 DZZ 023 O24 025 026 D27 028 029 030 031 032 033 034 035 036 D37 036 039 040 041 042 043 044 045 046 047 048 049 D50 RFB?IL§§5__E9Q§{Z§,_£F9= £3 (venus REANALYZE1 (LAMBDA (NODE) (PASSI-ANALYZE NODE lREANALYZE° T))) (SET' ¢REANALYZE* 'ONCE) ;;; HERE WE DETERMINE, FOR EACH VARIABLE NODE WHOSE VAR IS THE ONE lil GIVEN, WHETHER IT TS POSSIBLE TO SUBSTITUTE IN FOR IT; THIS IS SCL DETERMINED ON THE BASIS OF SIDE EFFECTS. THIS IS DONE BY ;;; WALKING THE PROGRAM, STOPPING WHEN A SIDE-EFFECT BLOCKS IT.

A SUBSTITUTION IS MADE IFF IS VARIABLE NODE IS REACHED IN THE WALK.
THERE IS A BUG IN THIS THEORY TO THE EFFECT THAT A CATCH ;;; WHICH RETURNS MULTIPLY CAN CAUSE AN EXPRESSION EXTERNAL 121 TO THE CATCH TO BE EVALUATED TWICE. THIS IS A DYNAMIC PROBLEM ;;; WHICH CANNOT BE RESOLVED AT COMPILE TIME, AND SO WE SHALL LC; IGNORE IT FOR NOW.

25; WE ALSO RESET THE METAP FLAG ON ALL NODES WHICH HAVE A ZLL SUBSTITUTION AT OR BELOW THEM, SO THAT THE META-EVALUATOR WILL ;;; RE-PENETRATE TO SUBSTITUTION POINTS, WHICH MAY ADMIT FURTHER ;;; OPTIMIZATIONS.

(DEFINE EFFS-INTERSECT (LAMBDA (A B) (COND ((EO A 'ANY) B) ((E0 B 'ANY) A) ((EO A 'NONE) A) ((EO B 'NONE) B) (T (INTERSECT A B))))) (DEFINE EFFECTLESS (LAMBDA (X) (OR (NULL X) (EO X 'NONE)))) (DEFINE EFFECTLESS-EXCEPT-CONS (LAMBDA (X) (OR (EFFECTLESS X) (EQUAL X '(CONS))))) (DEFINE PASSABLE (LAMBDA (NODE EFFS AFFD) (BLOCK (IF (EMPTY (NODE\EFFS NODE)) (ERROR '1Pass l Analysis Missing - PASSABLEI NODE 'FAIL-ACT)) (AND (EFFECTLESS (EFFS-INTERSECT EFFS (NODE\AFFD NODE))) (EFFECTLESS (EFFS-INTERSECT AFFO (NODE\EFFS NODE))) (EFFECTLESS-EXCEPT-CONS (EFFS-INTERSECT EFFS (NODE\EFFS NODE)))))))

[Page 178]

168 META-SUBSTITUTE takes a node-tree ARG, a variable name VAR, and another node-tree BOD, and wherever possible substitutes copies of ARG for occurrences of VAR within BOD. The complexity of this process is due almost entirely to the necessity of determining the extent of "wherever possible".

META-SUBSTITUTE merely spreads out the EFFS and AFFD slots of ARG to make them easy to refer to, makes an error check, and then passes the buck to the internal LABELS routine SUBSTITUTE, which does the real work.

SUBSTITUTE recurs over the structure of the node-tree. At each node it first checks to see whether VAR is in the REFS set of that node. This is purely an efficiency hack: if VAR is not in the set, then it cannot occur anywhere below that node in the tree, and so SUBSTITUTE can save itself the work of' a complete recursive search of that portion of the node-tree.

SUBSTITUTE plays another efficiency trick in cahoots with META-EVALUATE to save work. whenever SUBSTITUTE actually replaces an occurrence of VAR with a copy of ARG, the copy of ARG will have its METAP flag turned off (set to NIL).

Now SUBSTITUTE propagates the METAP flag back up the node-tree; when all sub-nodes of a node have had SUBSTITUTE applied to them, then if the METAP flag of the current node is still set, it is set to the AND of the flags of the subnodes.

Thus any node below which a substitution has occurred will have its METAP flag reset. More to the point, any node which after the substitution still has its METAP flag set has had no substitutions occur below it. META-EVALUATE can then be applied to BOD after all substitutions have been tried (this occurs in META-COMBINATION-LAMBDA), and META-EVALUATE will only have to reexamine those parts of BOD which have changed. In particular, if no substitutions were successful, META-EVALUATE will not have to reexamine BOD at all.

If the variable is referenced at or below the node, it breaks down into cases according to the type of the node.

For a CONSTANT, no action is necessary.

For a VARIABLE, no action is taken unless the variable matches VAR, in which case the node is erased and a copy of ARG is made and returned in its place. The SUBSTP slot of the original ARG is set as a flag to META-COMBINATION-LAMBDA (q.v.), to let it know that at least one substitution succeeded.

For a LAMBDA, substitution can occur in the body only if ARG has no side-effects except possibly CONS. This is because evaluation of the LAMBDA-expression (to produce a closure) will not necessarily cause evaluation of the side-effect in ARG at the correct time. The special case of a LAMBDA occurring as the function in a COMBINATION is handled separately below.

For an IF, substitution is attempted in the predicate. It is attempted in the other two sub-trees only if ARG can pass the predicate.

For an ASET' or a CATCH, substitution is attempted in the body. The same is true of LABELS, but substitution is also attempted in the labelled function definitions.

[Page 179]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 062 063 064 065 066 067 068 069 FABBYL§§§__25115,/Z¢L_E39§,_24 (s£r' fsuasr-count* o) ;couN1 or suesrrrurlous (S£T' *LAMBDA-BODY-SUBST* T) ;sv11cH TO coNrRoL suasrxrurlou IN LAMBDA eoolzs (s£r~ ¢LAMBDA-BO0Y-SUBST-TRY-COUNT* o) ;couN1 Tuensor - Tales -(SET' *LAMBOA-BODY-SUBST-SUCCESS-COUNT* o) ;couur rueneor - successes (DEFINE META-SUBSTITUTE (LAMBDA (ARG VAR soo) (LET ((errs (NOU£\£FF$ ARG)) (AFFD (NODE\AFFD ARG))) (IF (EMPTY EFFS) (ERROR '1Pass 1 Analysis Screved Up - META-SUBSTITUTEI ARG 'FAIL-ACT)) (LABELS ((SUBSTITUTE (LAMBDA (NODE) (IF (OR (EMPTY (NODE\REFS NODE)) (nor (Memo VAR (NODE\REFS No0£)))) ;£rFxcl£ncv HACK NODE (LET ((Fn (NODE\FORM uoo£))) (EOCASE (TYPE FM) (CONSTANT NODE) (VARIABLE (IF (EO (VARIABLE\VAR FM) VAR) (BLOCK (ERASE-ALL-NODES NODE) (INCREMENT *SUBST-COUNT*) ~ (ALTER-NODE ARG (SUBSTP 1° T)) (covv-cone ARs)) .

NODE)) (LAMBDA (IF (AND (EFFECTLESS-EXCEPT-CONS EFFS) (EFFECTLESS AFFD)) (ALTER-LAMBDA FM (BODY Z* (SUBSTITUTE (LAMBDA\BODY FM))))) (IF (NODE\METAP NODE) (ALTER-NODE NODE (METAP := (NODE\METAP (LAMBDA\BODY FM))))) NODE) (IF (ALTER-IF FM (PRED Z= (SUBSTITUTE (IF\PRED FM)))) (IF (PASSABLE (IF\PRED FM) EFFS AFFD) (ALTER-IF FM (CON := (SUBSTITUTE (IF\CON FM))) (ALT 2= (SUBSTITUTE (IF\ALT FM))))) (IF (NODE\METAP NODE) (ALTER-NODE NODE (METAP ¢= (Ano (Non£\M£rAP (IF\PRED rn)) (NODE\METAP (IF\CON rn)) (NODE\METAP (IF\ALT FM)))))) NODE) (ASET (ALTER-ASET rn (BODY ;= (SUBSTITUTE (As£1\Boov rn)))) (IF (NODE\METAP NODE) (ALTER-NODE NODE (METAP := (NODE\METAP (ASET\BODY FM))))) NODE) (CATCH (ALTER-CATCH FM (BODY := (SUBSTITUTE (CATCH\BODY FM)))) (IF (NODE\METAP NODE) (ALTER-NODE NODE (METAP := (NODE\METAP (CATCH\BODY FM))))) NODE) (LABELS '

(ALTER-LABELS FM (BODY := (SUBSTITUTE (LABELS\BODY FM)))) (DO ((D (LABELS\FNDEFS FH) (CDR D)) (MP (NODE\METAP (LABELS\BODY FM)) (Ano np (NODE\METAP (CAR n))))) 1[(NULL D) (IF (NODE\METAP NODE) V (ALTER-NODE NODE (METAP :I MP)))) RPLACA D (SUBSTITUTE (CAR D)))) (I noon ja

[Page 180]

170 The most complicated case is the COMBINATION. First it is determined (in the variable X) whether ARG can correctly pass all of the arguments of the combination.

(It is not possible to substitute into any argument unless all can be passed, because at this time it has not been decided in what order to evaluate them. This decision is the free choice of CONVERT-COMBINATION below.) If it can, then substitution is attempted in all of the arguments except the function itself. Then two kinds of function are distinguished. If it is not a LAMBDA, a straightforward recursive call to SUBSTITUTE is used. If it is, then substitution substitution special case so it is all is attempted in the bog! of the LAMBDA (got in the LAMBDA itself; in a LAMBDA requires that ARG be EFFECTLESS-EXCEPT-CONS, but in this we know that the LAMBDA-expression will be invoked immediately, and right if ARG has side-effects).

[Page 181]

070 071 072 073 074 075 076 077 078 079 080 081 082 083 084 085 086 087 DB8 089 090 091 092 093 094 095 096 097 098 099 100 101 (SUBSTITUTE BOD))))) (COHBINATION FA55lT,§§§ _!5ll§lZ§E_P°9¢,?3;!

(LET ((ARGS (COMBINATION\ARGS FM))) (DO ((A A (x T ((nuL RGS (CDR A)) (AND X (PASSABLE (CAR A) EFFS AFFD)))) L A) (IF X (DO ((A (CDR ARGS) (CDR A))) (IF (no ((A A (nv ((uuL (IF N0UE))))))) ((NULL A)) (RPLACA A (SUBSTITUTE (CAR A))))) (AND *LAMBDA-BODY-SUBSTK (EO (TYPE (NODE\FORM (CAR ARGS))) 'LAHBDA)) (LET ((fN (NOD£\FORM (CAR ARGS)))) (INCREMENT *LAMBUA-BODY-SUBST-TRY-COUNT!) (COND (X (INCREHENT *LAMBDA-BODY-SUBST-SUCCESS-COUNT*) (ALTER-LAMBDA FN (BODY := (SUBSTITUTE (LAMBDA\BODY FN)))))) (IF (NODE\HETAP (CAR ARGS)) (ALTER-NODE (CAR ARGS) (METAP := (NODE\METAP (LAHBDA\BODY FN)))))) (IF X (RPLACA ARGS (SUBSTTTUTE (CAR ARGS))))))) RGS (CDR A)) T (AND MP (NODE\HETAP (CAR A))))) L A) (NODE\HETAP NDDE) (ALTER-NODE NODE (METAP := MP))))))

[Page 182]

COPY representing COPY argument ENV COPYNODES; purposes.

172 CODE is used by META-SUBSTITUTE to make copies of node-trees code. It invokes COPYNODES with appropriate additional arguments.

NODES does the real work. The argument ENV is analogous to the taken by ALPHATIZE. However, variables are not looked-up in ENV by ENV is maintained only to install in the new nodes for debugging The argument RNL is a "rename list" for variables. when a node is copied which binds variables, new variables are created for the copy. RNL provides a mapping from generated names in the original code to generated names in the copy (as opposed to ENV, which maps user names to generated names in the copy). Thus, when a LAMBDA node is copied, new names are generated, and PAIRLIS is used to pair new names with the LAMBDA\VARS of the old node, adding the new pairs to RNL.

A using the generated 34-73-156 turn was variables neat trick to aid debugging is that the new names are generated by old names as the arguments to GENTEMP. In this way the name of a variable contains a history of how it was created. For example, VAR-was created by copying the LAMBDA node which bound VAR-34-73, which in copied from the node which bound VAR-34. Copies of CATCH and LABELS are generated in the same way.

The large EQCASE handles the different types of nodes. The result is then given to NODIFY, the same routine which creates nodes for ALPHATIZE. Recall that NODIFY initializes the METAP slot to NIL; the next meta-evaluation which comes along will cause pass-1 analysis to be performed on the new copies.

Note particularly that the UVARS list of a LAMBDA node is copied, for the same reason that ALPHA-LAMBDA makes a copy: META-COMBINATION-LAMBDA may alter it destructively.

[Page 183]

001 002 003 004 005 006 007 006 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 045 047 048 049 050 051 052 053 054 055 056 057 058 059 060 (DEFINE covv-cons (LAMBDA (NODE) FBBEIT 555 Q§!l§!Z§__K99€_§§ (REANALYZE1 (COPY-NODES none (NO0E\ENV NODE) u1L)))) (DEFINE covv-uonis (LAMBDA (NODE suv RNL) (NODIFY (LET ((FM (NODE\FORM NODE (EOCASE (TYPE FM) (CONSTANT ))) (CONS-CONSTANT (VALUE = (CONSTANT\VALUE FM)))) (VARIABLE (CONS-VARIABLE (VAR = (LET ((SLOT (ASSO (VARIABLE\VAR FH) RNL))) (LAMBDA (LET ((VARS (IF SLOT (CADR SLOT) (VARIABLE\VAR FH)))) (GLQBALP = (VARIA8LE\GLOBALP FM)))) (AMAPCAR GENTEMP (LAM8DA\VARS rM)))) (CONS-LAMBDA (UVARS = (APPENU (LAHBUA\UVARS FM) NIL)) (IF (cons-xr (ASET (CONS-ASET (CATCH (LET ((VAR (uvAR (VARS = VARS) _ (BODY = (COPY-NODES (LAM8DA\80DY rn) (PAIRLIS (LAMBOA\UVARS rn) vAns ENV) (PAIRLIS (LAHBDA\VARS FM) vARs nnL)))))) (PRED = (COPY-NODES (1r\Pnen rn) suv RNL)) (con = (covv-nones (1r\con rn) env RNL)) (ALT = (corv-uonss (IF\ALT rn) suv nuL)))) (VAR = (LET ((SLOT (ASSO (ASET\VAR FM) RNL))) (IF SLOT (CADR SLOT) (ASET\VAR FM)))) (GLOBALP = (ASET\GLOBALP FM)) (BODY = (COPY-NODES (ASET\BODY FH) ENV RNL)))) (ssnvsnv (CATCH\VAR rn))) (CATCH\UVAR FM))) (CONS-CATCH (uvan - (CATCH\UVAR rn)) (LABELS (LET ((FNVA (LET ( (VAR = VAR) (aonv = (COPY-NODES (CATCH\BOUY rn) (cons (LIST uvAR VAR) suv) (cons (LIST (cArcu\vAn rn) VAR) nnL)))))) ns (AMAPCAR GENTEMP (LABELS\FNVARS rn)))) (LENV (PAIRLIS (LA8£LS\UFNVARS rn) ruvAns euv)) (LRNL (PAIRLIS (LAaeLs\ruvARs rn) rnvnns RNL))) (CONS-LABELS (UFNVARS = (LABELS\UFNVARS FM)) (COMBINATION (CONS-COHBI (NODE\S£XPR NODE) £NV))) (FNVARS = FNVARS) (FNUEFS = (AMAPcAn ' (LAMBDA (N) (COPY-NODES n Leuv LRNL)) (LABELS\FNDEFS rM))) (BODY = (COPY-NODES (LABELS\80DY rn) Lauv LRNL)))))) uAT1ou (Anas - (AnAPcAn (LAMBDA (N) (COPY-NODES n env nuL)) (coMa1nA11oN\Anss Fn))) (HARNP = (COM8INATl0N\VARNP rn))))))

[Page 184]

174 The next several functions process the node-tree produced, analyzed, and optimized by pass 1, converting it to another representation. This new representation is a tree structure very similar to the node-tree, but has different components for the pass-2 analysis. We will call this the "cnode-tree". The "c" stands for "Continuation-passing style': for the conversion process transforms the node-tree into a form which uses continuation-passing to represent the control and data flow within the program.

We define a new collection of data types used to construct cnode-trees.

The CNODE data type is analogous to the NODE data type; one component CFORM contains a variant structure which is specific to the programming construct represented by the CNODE. _ The types CVARIABLE, CLAMBDA, CIF, CASET, CLABELS, and CCOMBINATION correspond roughly to their non-C counterparts in pass 1.

Type TRIVIAL is used to represent pieces of code which were designated trivial in pass 1 (TRIVP slot = T) by TRIV-ANALYZE; the NODE component is simply the pass-1 node-tree for the trivial code. This is the only case in which part of the pass-1 node-tree survives the conversion process to be used in pass Z.

A CONTINUATION is just like a CLAMBDA except that it has only one bound variable VAR. This variable can never appear in a CASET, and so the CONTINUATION type has no ASETVARS slot; all other slots are similar to those in a CLAMBDA structure.

A RETURN structure is just like a CCOMBINATION, except that whereas a CCOMBINATION may invoke a CLAMBDA which may take any number of arguments, a RETURN may invoke only a CONTINUATION on a single value. Thus, in place of the ARGS slot of a CCOMBINATION, which is a list of cnodes, a RETURN has two slots CONT and VAL, each of which is a cnode.

(In retrospect, this was somewhat of a design error. The motivation was that the world of closures could be dichotomized into LAMBDA-closures and continuation-closures, as a result of the fundamental semantics of the language: one world is used to pass values "down" into functions, and the other to pass values "up" from functions. Combinations can similarly be dichotimized, and I thought it would be useful to reflect this distinction in the data types to enforce and error-check this dichotomy. However, as it turned out, there is a great deal of code in pass 2 which had to be written twice, once for each "world", because the data types involved were different. It would be better to have a single structure for both CLAMBDA and CONTINUATION, with an additional slot flagging which kind it was. Then most code in pass Z could operate on this structure without regard for which "world" it belonged to, and code which cared could check the flag.)

[Page 185]

001 002 003 004 005 006 D07 008 009 010 011 012 013 014 015 016 017 018 019 O20 021 022 023 024 025 026 027 028 O29 030 031 032 033 034 035 036 037 D38 039 040 041 042 043 044 045 D46 047 048 049 050 D51 RABBIT_S§B" QSLI5[]B _Page 26 ;;; CONVERSION TO CONTINUATION-PASSING STYLE ;;; THIS INVOLVES MAKING A COMPLETE COPY OP THE PROGRAM IN TERMS SS; OF THE FOLLOWING NEW DATA STRUCTURES:

(DEFTYPE (DEFTYPE (DEFTYPE (UEFTYPE (DEFTYPE (nsrrvve (oerrvve (UEFTYPE (DEFTYPE (DEFTYPE cnone (ENV Refs CLOVARS°CFORM)) ;ENV ENVIRONMENT (A LIST OF VARIABLES, NOT A MAPPING; DEBUGGING ONLY) 'REFS VARIABLES BOUND ABOVE AND REFERENCED BELOW THIS CNODE ;CLOVARS VARIABLES REFERRED TO AT OR BELOW THIS CNODE BY CLOSURES (SHOULD BE A SUBSET OF REFS) ;CFORM ONE OF THE BELOW TYPES TRIVIAL (NODE)) :NODE A PASS-1 NODE TREE CVARIABLE (VAR)) 'VAR GENERATED VARIABLE NAME CLAMBDA (VARS BODY FNP TVARS NAME DEP MAXDEP CONSENV CLOSEREFS ASETVARS)) ;FNP NON-NIL =) NEEDN'T MAKE A FULL CLOSURE OF THIS CLAMBDA. MAY BE 'NOCLOSE OR 'EZCLOSE (THE FORMER MEANING NO CLOSURE IS NECESSARY AT ALL. THE LATTER THAT THE CLOSURE IS MERELY THE ENVIRONMENT).

TVARS THE VARIABLES WHICH ARE PASSED THROUGH TEMP LOCATIONS ON ENTRY. NON-NIL ONLY IF FNP='NOCLOSE; THEN IS NORMALLY THE LAMBDA VARS. BUT MAY BE DECREASED TO ACCOUNT FOR ARGS WHICH ARE THEMSELVES KNOWN NOCLOSE'S, OR WHOSE CORRESPONDING PARAMETERS ARE NEVER REFERENCED.

THE TEMP VARS INVOLVED START IN NUMBER AT DEP.

NAME THE PROG TAG USED TO LABEL THE FINAL OUTPUT CODE FOR THE CLAMBDA ;DEP DEPTH OF TEMPORARY REGISTER USAGE WHEN THE CLAMBDA IS INVOKED ;MAXDEP MAXIMUM DEPTH OF REGISTER USAGE WITHIN CLAMBDA BODY ;CONSENV THE "CONSED ENVIRONMENT" WHEN THE CLAMBDA IS EVALUATED LCLOSEREFS VARIABLES REFERENCED BY THE CLAMBDA WHICH ARE NOT IN THE CONSED ENVIRONMENT AT EVALUATION TIME, AND SO MUST BE ADDED TO CONSENV AT THAT POINT TO MAKE THE CLOSURE ;ASETVARS THE ELEMENTS OF VARS WHICH ARE EVER SEEN IN A CASET CONTINUATION (VAR BODY FNP TVARS NAME DEP MAXDEP CONSENV CLOSEREFS)) SCOMPONENTS ARE AS FOR CLAMBDA CIF (PRED CON ALT)) CASET (CONT VAR BODY)) CLABELS (FNVARS FNDEFS FNENV EASY CONSENV BODY)) ;FNENV A LIST OF VARIABLES TO CONS ONTO THE ENVIRONMENT BEFORE CREATING THE CLOSURES AND EXECUTING THE BODY ZEASY NON-NIL IFF NO LABELED FUNCTION IS REFERRED TO AS A VARIABLE. CAN BE 'NOCLOSE OR 'EZCLOSE (REFLECTING THE STATUS OF ALL THE LABELLED FUNCTIONS) SCONSENV AS FOR CLAMBDA CCOMBINATION (ARGS)) SARGS LIST OF CNODES REPRESENTING ARGUMENTS RETURN (CONT VAL)) ;CONT CNODE FOR CONTINUATION 'VAL CNODE FOR VALUE

[Page 186]

176 CNODIFY is for cnode-trees what NODIFY was for node-trees. It takes a CFORM and wraps a CNODE structure around it.

CONVERT is the main function of the conversion process; it is invoked by COMPILE on the result (META-VERSION) of pass 1. CONVERT dispatches on the type of node to be converted, often calling some specialist which may call it back recursively to convert subnodes. CONT may be a cnode, or NIL. If it is a cnode, then that cnode is the code for a continuation which is to receive as value that produced by the code to be converted. That is, when CONVERT finishes producing code for the given node (the first argument to CONVERT), then in effect a RETURN is created which causes the value of the generated code to be returned to the code represented by CONT (the second argument to CONVERT). Sometimes this RETURN cnode is created explicitly (as for CONSTANT and VARIABLE nodes), and sometimes only implicitly, by passing CONT down to a specialist converter.

MP is T if optimization was performed by pass 1, and NIL otherwise. This argument is for debugging purposes only: CONVERT compares this to the METAP slot of the pass-1 nodes in order to detect any failures of the incremental optimization and analysis process. CONVERT also makes some other consistency checks.

[Page 187]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 R*§PII_§§§"~9§ll§lZ§_M?99£_ZZ (oarluf CNODIFY (LAMBDA (CFORM) (CONS-CNODE (CFORH = CFORM)))) (DEFINE CONVERT (LAMBDA (NODE CONT MP) (LET ((rM (NODE\FORH NODE))) (IF (EMPTY (NODE\TRIVP NODE)) (ERROR '1Pass I analysis missingl NODE 'FAIL-ACT)) (OR (EO (NODE\METAP NODE) MP) (ERROR '1Meta-evaluation Screved Up METAP1 NODE 'FAIL-ACT)) (EOCASE (TYPE FM) (CONSTANT (OR (NODE\TRIVP NODE) (ERROR '1Non-trivial Constant) NODE 'FAIL-ACT)) (MAKE-RETURN (CONS-TRIVIAL (NODE ' NODE)) CONT)) (VARIABLE (OR (NODE\TRIVP.NODE) (ERROR '1Non-trivia! Variablel 'FAIL-ACT)) (MAKE-RETURN (CONS-TRIVIAL (NODE = NODE)) CONT)) (LAMBDA (MAKE-RETURN (CONVERT-LAMBDA-FM NODE NIL MP) CONT)) (IF (OR CONT (ERROR '1NuIT Continuation to IFI NODE 'FAIL-ACT)) (CONVERT-IF NODE FM CONT MP)) (ASET (OR CONT (ERROR '1NuTT Continuation to ASETI NODE 'FAIL-ACT)) (CONVERT-ASET NODE FM CONT MP)) qcnrcn (on cout (ERROR '1Nu11 Continuation to cA1cH| none 'FAIL-ACT)) (CONVERT-CATCH NODE FM CONT HP)) (LABELS (OR CONT (ERROR 'INUTT Continuation to LABELS) NODE 'FAIL-ACT)) (CONVERT-LABELS NODE FM CONT MP)) (COMBINATION (OR CONT (ERROR '1Nu\I Continuation to Combination] NODE 'FAIL-ACT)) (CONVERT-COMBINATION NODE FM CONT MP))))))

[Page 188]

178 MAKE-RETURN takes a CFORM (one of the types TRIVIAL, CVARIABLE, ...) and a continuation, and constructs an appropriate returning of the value of the CFORM to the continuation. First the CFORM is given to CNODIFY. If the continuation is in fact NIL (meaning none), this new cnode is returned; otherwise a RETURN cnode is constructed.

CONVERT-LAMBDA-FM takes a LAMBDA node and converts it into a CLAMBDA cnode. The two are isomorphic, except that an extra variable is introduced as an extra first parameter to the CLANBDA. Conceptually, this variable will be bound to a continuation when the CLANBDA is invoked at run time; this continuation is the one intended to receive the value of the body of the CLAMBDA. This is accomplished by creating a new variable name CONT-nnn, which is added into the lambda variables. A new CVARIABLE node is made from it, and given to CONVERT as the continuation when the body of the LAMBDA node is to be recursively converted.

The CNAME argument is used for a special optimization trick by CONVERT-COMBINATION, described below.

CONVERT-IF distinguishes several cases, to simplify the converted code.

First, if the entire IF node is trivial, then a simple CTRIVIAL node may be created for it. Otherwise, the general strategy is to generate code which will bind the given continuation to a variable and evaluate the predicate. This predicate receives a continuation which will examine the resulting value (with a CIF), and then perform either the consequent or alternative, which are converted using the bound variable as the continuation. The reason that the original continuation is bound to a variable is because it would be duplicated by using it for two separate calls to CONVERT, thereby causing duplicate code to be generated for it. A schematic picture of the general strategy is:

NODE = (IF a b c) and CONT = k becomes ((CLAMBDA (q) (RETURN (CONTINUATION (p) (CIF p (RETURN q b) (RETURN q c))) a)) k) Now there are two special cases which allow simplification. First, if the given continuation is already a cvariable, there is no point in creating a new one to bind it to. This eliminates the outer CCOMBINATION and CLAMBDA. Second, if the predicate a is trivial (but the whole IF is not, because the consequent b or the alternative c is nontrivial), then the CONTINUATION which binds p is unnecessary.

[Page 189]

001 002 003 004 005 006 007 008 009 O10 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 O53 054 055 056 057 058 059 060 061 062 063 064 R*BBII"§§§""Q§Ll§£Z§_»RQQQ_QQ (DEFINE MAKE-RETURN (LAMBDA (CFORM CONT) (LET ((cu (CNODIFY croRM))) (IF conf (cnooxrv (CONS-RETURN (CONT = cout) (VAL = cu))) CN)))) (DEFINE CONVERT-LAMBDA-FM (LAMBDA (NODE CNAME MP) (LET ((CV (GENTEMP 'CONT)) (FM (NO0E\FORM NODE))) (CONS-CLAMBDA (VARS = (CONS CV (LAMBDA\VARS FM))) (BODY = (CONVERT (LAMBDA\BODY FM) (CNODIFY (cons-cvAR1AaL£ (VAR I HP)))))) FOR CONVERTING IF:

VHOLE IF IS TRIVIAL. MAY JUST CREATE A CTRIVIAL.

CONTINUATION IS NON-CVARIABLE, MUST BIND A VARIABLE TO IT.

PREDICATE IS TRIVIAL, MAY JUST STICK IT IN SIMPLE CIF.

221 ISSUES :;: (1) IF ;;; (2) IF zz: (3) IF (DEFINE CONVERT-IF (LAMBDA (NODE FM CONT MP) (IF (NODE\TRIVP NODE) (MAKE-RETURN (CONS-TRIVIAL (NODE = NODE)) CONT) (LET ((CVAR (IF (EO (TYPE (CNODE\CFORM CONT)) 'CVARIABLE) NIL , (GENTEMP 'CONT))) (PVAR (IF (NODE\TRIVP (IF\PRED FM)) NIL (NODE\NAME (IF\PRED FM))))) (LET ((lCONT (IF CVAR (CNODIFY (cons-cvAn1AaL£ (VAR = cvAn))) CONT)) (IPRED (IF PVAR (cuoulrv (cons-cvAn1AaL£ (VAR = PvAn))) (CNODIFY (CONS-TRIVIAL (NODE = (IF\PREU (LET ((c1r (cuoolrv (cons-cxr (PRED = IPRED) (CON = (CONVERT (IF\CON rn) :cout MP)) (ALT = (convent (IF\ALT rn) (cuoolrv (CONS-CVARIABLE (VAR = (CVARlABL€\VAR (CNODE\CFORH HP)))))) (LET ((FOO (IF PVAR (CONVERT (IF\PRED FM) (CNODIFY (CONS-CONTINUA MP) ClF))) (IF_CVAR (cuonlfv (CONS-CCOHBINATION (ARGS = (LIST (CNODIFY (CONS-CLAMBDA (VARS = (LIST CVAR)) (BODY I FOO))) CONT)))) FOO)))))))) (OR CNAME CV)))) FH))))))) ICONT))))) TION (VAR I PVAR) (BODY I CIF)))

[Page 190]

180 This is all done as follows. First CVAR and PVAR are bound to generated names if necessary, CVAR for binding the continuation and PVAR for binding the predicate value. Then ICONT and IPRED (the "I" is a mnemonic for "internal") are bound to the cnodes to be used for the two conversions of consequent and alternative, and for the predicate of the CIF, respectively. CIF is then bound to the cnode for the CIF code, including the conversions of consequent and alternative. Finally, using FOO as an intermediary, CONVERT-IF first conditionally arranges for conversion of a nontrivial predicate, and then conditionally arranges for the binding of a non-cvariable continuation. The result of all this is returned as the final conversion of the original IF node.

CONVERT-ASET is fairly straightforward, except that, as for IF nodes, a special case is made of trivial nodes, as determined by the TRIVP slot.

The CATCH construct may be viewed as the user's one interface between the "LAMBDA world" and the "continuation world". CONVERT-CATCH arranges its conversion in such a way as to eliminate CATCH entirely. Because CONTINUATION cnodes provide an explicit representation for the continuations involved, there is no need at this level to have an explicit CCATCH sort of cnode. The general idea is:

' NODE = (CATCH a b) and CONT = k becomes ((CLAMBDA (q) '

((CLANBDA (a) (RETURN q b)) (CLAMBDA (*IGNORE* V) (RETURN q V)))) k) In the case where the given continuation k is already a cvariable, then it need not be bound to a new one q. Note that the (renamed) user catch variable a is bound to a CLAMBDA which ignores its own continuation, and returns the argument V to the continuation of the CATCH. Thus the user variable a is bound not to an actual CONTINUATION, but to a little CLAMBDA which interfaces properly between the CLANBDA world and the CONTINUATION world. The uses of CVAR and ICONT are analogous to their uses in CONVERT-IF.

[Page 191]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 (DEFINE CONVERT-ASET '

(LAMBDA (NODE FM CONT MP) (IF (NODE\TRIVP NODE) RAB51T_§§§""Q§!l§l1§_ P°9¢ Z9 (MAKE-RETURN (CONS-TRIVIAL (NODE = NODE)) CONT) (CONVERT (ASET\BO0Y FM) (LET ((NM (NODE\NAME (ASET\8ODY FM)))) (CNODIFY (CONS-CONTINUATION (VAR = un) (BODY = (cuoolrv (CONS-CASET (CONT = cont) (VAR = (As£r\vAn rn)) (BODY = (CNODIFY (cons-cvAR1AsL£ (VAR = NH)))))))))) HP)))) iii ISSUES FOR CONVERTING CATCH:

(I) MUST BIND THE CATCH VARIABLE TO A FUNNY FUNCTION WHICH ;;; (Z) IF CONTINUATION IS NON-CVARIABLE, MUST BIND A VARIABLE (DEFINE convent-CATCH (LAMBDA (None FM cont MP) IGNORES ITS CONTINUATION

TO IT.

(LET ((cvAR (IF (eo (TYPE (CNOD£\CFORM coNT)) 'cvAn|AaL£) NIL (GENTEMP 'CONT)))) (LET ((lCONT (IF CVAR (CNODIFY (CONS-CVARIABLE (VAR I CVAR))) (LET ((CP (CNODIFY coN1))) (CONS-CCOMBINATION (ARGS = (LIST (cuooxrv (CONS-CLAMBDA (IF CVAR (CNODIFY (VARS = (LIST (CA (BODY = (CONVERT TcH\vAR rM))) (CATCH\BODY FM) :cont nP)))) (cuooxrv (cons-cLAMauA (VARS = '(*IGNORE* v)) (BODY = (MAKE-RETURN (cons-cvAn|AsL£ (VAR = 'v)) (cuooxrv (cons-cvAnlAaLe (VAR = (CVARlABLE\VAR (CONS-CCOMBINATION (ARGS = (LIST (CNOUIFY (CONS-CLAMBDA CONT)))) CP)))))) (CN0DE\CF0RH ICONT))))))))))))))) (vAAs = (LIST CVAR)) (BODY - cpm

[Page 192]

182 CONVERT-LABELS simply converts all the labelled function definitions using NIL as the continuation for each. This reflects the fact that no code directly receives the results of closing the definitions; rather, they simply become part of the environment. The body of the LABELS is converted using the given continuation.

To make things much simpler for the pass-2 analysis and the code generator, it is forbidden to use ASET' on a LABELS-bound variable. This is an arbitrary restriction imposed by RABBIT (out of laziness on my part and a desire to concentrate on more important issues), and not one inherent in the SCHEME language. This restriction is unnoticeable in practice, since one seldom uses ASET' at all, let alone on a LABELS variable.

The conversion of COMBINATION nodes is the most complex of all cases.

First, a trivial combination becomes simply a TRIVIAL cnode. Otherwise, the overall idea is that each argument is converted, and the continuation given to the conversion is the conversion of all the following arguments. The conversion of the last argument uses a continuation which performs the invocation of function on arguments, using all the bound variables of the generated continuations. The end result is a piece of code which evaluates one argument, binds a variable to the result, evaluates the next, etc., and finally uses the results to. perform a function call.

To simplify the generated code, the arguments are divided into two classes. One class consists of trivial arguments and LAMBDA-expressions (this class is precisely the class of "trivially evaluable" expressions defined in [Imperative]), and the other class consists of the remaining arguments. The successive conversion using successive continuations as in the general theory is only performed on the latter class of arguments. The trivially evaluable expressions are included along with the bound variables for nontrivial argument values in the final function call. For example, one might have something like:

NODE = (FOO (CONS A B) (BAR A) B (BAZ B)) and CONT = k becomes (RETURN (CONTINUATION (x) (RETURN (CONTINUATION (y) (BAZ B))) (FOO k (CONS A B) X B y)) (BAR A)) where FOO, (CONS A B), and B are trivial, but (BAR A) and (BAZ B) are not.

[Page 193]

001 D02 D03 004 D05 D06 007 008 009 010 011 012 D13 D14 015 016 017 D18 RA§BlT.§§§,_Q§£l§/Z§m_R99€_QQ iii ISSUES FOR CONVERTING LABELS:

332 (1) MUST CONVERT ALL THE NAMED LAMBDA-EXPRESSIONS. USING A NULL CONTINUATION. iii (2) TO MAKE THINGS EASTER LATER. VE FORBID ASET ON A LABELS VARIABLE.

(DEFINE CONVERT-LABELS (LAMBDA (NODE FM CONT MP) (DO ((F (LABELS\FNDEFS FM) (CDR F)) (V (LABELS\FNVARS FM) (CDR V)) (CF NIL (CONS (CONVERT (CAR F) NIL MP) CF))) ((NULL F) (CNODIFY (CONS-CLABELS (FNVARS = (LABELS\FNVARS FM)) (runsfs = (unsvcnse cr)) (BODY = (CONVERT (LABELS\BODY FM) CONT MP))))) (AND (GET (CAR V) 'WRITE-REFS) (ERROR ' |Ar~e you crazy, using ASET on a LABELS variable?| (CAR V) 'FAIL-ACT)))))

[Page 194]

184 The separation into two classes is accomplished by the outer DO loop.

DELAY-FLAGS is a list of flags describing whether the code can be "delayed" (not converted using strung-out continuations) because it is trivially evaluable. The inner DO loop of the three (which loops on variables A, D, and Z, not A, D, and F!) then constructs the final function call, using the "delayed" arguments and generated continuation variables. The names used for the variables are the names of the corresponding nodes, which were generated by NODIFY. Finally, the middle DO loop (which executes last because the "inner" DO loop occurs in the initialization, not the body, of the "middle" one) generates the strung-out continuations, converting the non-delayable arguments in reverse order, so as to generate the converted result from the inside out.

The net effect is that nontrivial arguments are evaluated from left-to-right, and trivial ones are also (as it happens, because of MacLISP semantics), but the two classes are intermixed. This is where RABBIT takes advantage of the SCHEME semantics which decree that arguments to a combination may be evaluated in any order. It is also why CHECK-COMBINATION-PEFFS tries to detect infractions of this rule.

A special trick is that if the given continuation is a variable, and the combination is of the form ((LAMBDA ...) ...), then it is arranged to use the given continuation as the continuation for converting the body of the LAMBDA, rather than the extra variable which is introduced for a continuation in the LAMBDA variables list (see CONVERT-LAMBDA-FM). This effectively constitutes the optimization of substituting one continuation variable for another, much as META-COMBINATION-LAMBDA may substitute one variable for another. (This turns out to be the only optimization of importance to be done on pass-Z cnode code; rather than building a full-blown optimizer for pass-2 cnode-trees, or arranging to make the optimizer usable on both kinds of data structures, it was easier to tweak the conversion of combinations.) The substitution is effected by passing a non-NIL CNAME argument to CONVERT-LAMBDA-FORM, as computed by the form (AND (NULL (CDR A)) ...).

[Page 195]

D01 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 U18 019 020 021 022 023 020 025 D26 027 028 029 030 031 032 D33 034 035 036 037 038 039 D40 041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 RABBIT 5§Qv"Q5[}§/78 Page 31 ;; ISSUES FOR CONVERTING COMBINATIONSZ ;;; (1) TRIVIAL ARGUMENT EVALUATIONS ARE DELAYED AND ARE NOT BOUND TO THE VARIABLE OF ';; A CONTINUATION. VE ASSUME THEREBY THAT THE COMPILER IS PERMITTED TO EVALUATE ;; OPERANDS IN ANY ORDER.

(Z) ALL NON-DELAYABLE COMPUTATIONS ARE ASSIGNED NAMES AND STRUNG OUT HITH CONTINUATIONS.
(3) IF CONT IS A CVARIABLE AND THE COMBINATION I5 ((LAMBDA ...) ...) THEN VHEN CONVERTING ;; THE LAMBDA-EXPRESSION VE ARRANGE FOR ITS BODY TO REFER TO THE CVARIABLE CONT RATHER THAN TO ITS OVN CONTINUATION. THIS CROCK EFFECTIVELY PERFORMS THE OPTIMIZATION OF SUBSTITUTING ONE VARIABLE FOR ANOTHER, ONLY ON CONTINUATION VARIABLES (WHICH COULDN'T ... BE CAUGHT BY META-EVALUATE).

(DEFINE coNvERT-coMaTnAT1ou (LAMBDA (NODE FM CONT MP) (IF (NOD£\TRlVP NODE) (MAKE-RETURN (cons-TRTVTAL (NODE = NonE)) CONT) (no ((A (COMBINATION\ARGS EM) (CDR A)) (DELAY-FLAGS NIL » (cons (OR (NonE\Tn1vP (cAR A)) (Eo (TYPE (NODE\FORM (CAR A))) ~LAnauA)) DELAY»FLAGS))) ((NULL A) (oo ((A (REVERSE (coMa1nAT|oN\Anss rn)) (con A)) (0 DELAY-FLAGS (CDR n)) (F (cuourrv (CONS-CCOMBINATION (ARGS = (oo ((A (REVERSE (COMBINATION\ARGS rn)) (con A)) (u oELAv-FLAGS (CDR o)) (z NIL (cons (IF (CAR n) (IF (Eo (TYPE (NO0E\FORM (CAR A))) 'LAManA) (CNODIFY (CONVERT-LAMBDA-FM (CAR A) (Ann (NULL (CDR A)) (Eo (TYPE (CNODE\CFORM conT)) 'cvARTAaLE) (CVARIABLE\VAR (cNonE\croun CONT))) MP)) (CNODIFY (cons-TRTVTAL (NonE = (CAR A))))) (cuoolrv (CONS-CVARIABLE (VAR = (uouE\NAnE (CAR A)))))) Z))) ((NULL A) (cons (CAR z) (cons CONT (con z)))))))) (IF (CAR u) F (CONVERT (CAR A) (cnoolrv (CONS-CONTINUATION (VAR = (NODE\NAME (CAR A))) (BODY = r))) HP)))) ((NUl-L A) F)))))))

[Page 196]

186 Once the pass-2 cnode-tree is constructed, a pass-2 analysis is performed in a manner very similar to the pass-1 analysis. As before, successive routines are called which recursively process the code tree and pass information up and down, filling in various slots and putting properties on the property lists of variable names.

The first routine, CENV-ANALYZE, is similar to ENV-ANALYZE, but differs in some important respects. Two slots are filled in for each cnode. The slot ENV is computed from the top down, while REFS is computed from the bottom up.

ENV is the environment, a list of bound variables visible to the cnode.

The ENV slot in the node-tree was a mapping (an alist), but this ENV is only a list. The argument ENV is used in the analysis of CVARIABLE and CASET nodes.

The cnode slot ENV is included only for debugging purposes, and is never used by RABBIT itself.

REFS is analogous to the REFS slot of a node-tree: it is the set of variables bound above and referenced below the cnode. It differs from the pass-1 analysis in that variables introduced to name continuations and variables bound by continuations are also accounted for. In the case of a TRIVIAL cnode, however, the REFS are precisely those of the contained node.

The argument FNP to CENV-ANALYZE in non-NIL iff the given cnode occurs in "functional position" of a CCOMBINATION or RETURN cnode. This is used when a variable is encountered; on the property list a VARIABLE-REFP property is placed iff FNP is NIL, indicating that the variable was referenced in 'variable (non-function) position". This information will be used by the next phase, BIND-ANALYZE.

[Page 197]

001 002 003 004 005 006 007 008 009 010 O11 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 062 063 064 065 066 067 068 069 ¢».

Div FAUUIIU§§§_EQ§Ll§£Z§~EF99€T32 ENVIRONMENT ANALYSIS FOR CPS VERSION ;;; WE WISH TO DETERMINE THE ENVIRONMENT AT EACH CNODE, AND DETERMINE WHAT VARIABLES ARE BOUND ABOVE AND ;;; REFERRED TO BELOW EACH CNODE.

FOR EACH CNODE WE FILL IN THESE SLOTS

' ENV THE ENVIRONMENT SEEN AT THAT CNODE (A LIST OF VARS) ; REFS VARIABLES BOUND ABOVE AND REFERRED TO BELOW THAT CNODE ;;; FOR EACH VARIABLE REFERRED TO IN NON-FUNCTION POSITION ;;; BY A CVARIABLE OR CTRIVIAL CNODE WE GIVE A NON-NIL VALUE TO THE PROPERTY:

°'; VARIABLE-REFP ;;; FNP IS NON~NIL IFF CNODE OCCURS IN FUNCTIONAL POSITION (DEFINE CENV-ANALYZE (EOCASE (LAMBDA (CNODE env rnp) (LET ((cFM (CNODE\CFORM cno0£))) (ALr£n-cnooe cnoue (ENV := £nv)) (TYPE CFM) (1R1v1AL (cenv-rnlv-AnALvz£ (TRIVIAL\NODE CFM) Fnv) (ALTER-CNODE cnoos (REFS := (NO0E\REFS (TRIVIAL\NODE CFM))))) (CVARIABLE (LET ((V (cvAR1AsL£\vAn CFM))) (Aoovnov v cnoue 'Reno-airs) (OR rnP (vurvnov v 1 'VARIABLE-R£FP)) (ALT£R-cnooz cnooe (REFS := (AND (NEMO V ENV) (LIST (CVARlABLE\VAR CFM))))))) ICENV-ANALYZE B (APPEND (CLAM8DA\VARS CFM) ENV) NIL) LET ((REFS (SETDIFF (CNODE\REFS B) (CLAMBDA\VARS CFM)))) (ALTER-CNODE CNODE (REFS := REFS))))) (cLAnanA (LET 1(a (cLAMaoA\aonY crn))) 1

K (CONTINUATION (LET 1(a (CONTINUATION\BODY crn))) 1

(CIF (LET1 1

1 1

1 (CASET (CENV-ANALYZE B (cons (CONTINUATION\VAR CFM) ENV) NIL) LET ((REFS (REMOVE (CONTINUATION\VAR CFM) (CNODE\REFS B)))) (ALTER-CNODE CNODE (REFS := REFS))))) (PRED (c1r\Pn£o CFM)) (CON (CIF\CON CFH)) (ALT (c1F\ALr crn))) ICENV-ANALYZE PRED ENV NIL) ICENV-ANALYZE CON ENV NIL) ICENV-ANALYZE ALT ENV NIL) ALTER-CNODE CNODE (REFS := (UNION (CNODE\REFS PRED) (UNION (CNODE\REFS CON) (CNODE\REFS ALT))))))) (LET ((V (CASET\VAR CFM)) (cn (CASET\CONT crn)) (a (CASET\BODY CFN))) (PUTPROP (CASET\VAR CFM) T 'VARIABLE~REFP) (CENV-ANALYZE CN ENV T) (CENV-ANALYZE B ENV NIL) '

(ALTER-CNODE CNODE (REFS := (LET ((R (UNION (CNODE\REFS CN) (CNOUE\REFS B)))) (IF (MEMO V ENV) (ADJOIN V R) R)))))) (CLABELS (LET ((L£nv (Avpeno (CLABELS\FNVARS CFM) £nv))) (oo ((r (cLAeeLs\rnuers crn) (con F)) (R NIL (UNION R (CNODE\REFS (CAR F)))))

[Page 198]

188 This page intentionally left blank except for an annoying and self-referential little sentence

[Page 199]

070 071 072 073 074 075 076 077 078 079 080 081 082 083 084 085 086 087 088 089 090 091 092 093 094 095 096 097 098 099 100 101 102 103 104 105 RABBIT"§§QL19§£}5f7§M_?°9= 32~l ((NULL F) (LET ((B (CLABELS\BODY CFM))) (ctuv-ANALYz£ B LENV NIL) (ALTER-CNOOE CNODE (REPS := (SETDIFF (UNION R (CNODE\REFS B)) (CLABELS\FNVARS CFM)))))) (CENV-ANALYZE (CAR F) LENV NIL)))) (CCOMBINATION (LET ((ARGS (CCOMBINATlON\ARGS CFM))) (CENV-ANALYZE (CAR ARGS) ENV T) (COND ((AND (EO (TYPE (CNODE\CFORM (CAR ARGS))) 'TRIVIAL) (EO (TYPE (NODE\FORH (TRIVIAL\NODE (CNO0E\CFORM (CAR ARGS))))) 'VARIABLE) (TRIVFN (VARIABLE\VAR (NODE\FORH (TRIVIAL\NODE (CNO0E\CFORM (CAR lRGS))))))) (CENV-ANALYZE (CAUR ARES) ENV T) (CENV-CCOMBINATION-ANALYZE CNODE ENV (CDDR ARGS) (UNION (CNODE\REFS (CAR ARGS)) (CNODE\REFS (CADR ARGS))))) (T (CENV-CCOMB[NATION-ANALYZE CNOUE ENV (CDR ARGS) (CNODE\REFS (CAR AnGs))))))) (RETURN (LET ((C (RETURN\CONT CFM)) (V (RETURN\VAL CFM))) (CENV-ANALYZE C ENV T) (CENV-ANALYZE V ENV NIL) (ALTER-CNODE CNODE (REFS := (UNION (CNO0E\REFS C) (CNODE\REFS V))))))))))

[Page 200]

190 The only purpose of CENV-TRIV-ANALYZE is to go through the code for a TRIVIAL cnode, looking for variables occurring in other than function position, in order to put appropriate VARIABLE-REF? properties. Notice that the types LAMBDA and LABELS do not occur in the EQCASE. expression, as nodes of those types can never occur in trivial expressions.

CENV-CCOHB[NATION-ANALYZE is a simple routine which analyzes CCOHBINATXON cnodes; it is a separate routine only because it is used in more than one place in CENV-ANALYZE. It could have been made a local subroutine by using a LABELS in CENV-ANALYZE, but I elected not to do so for purely typographical reasons.

[Page 201]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 FABBII §§§__9§{!§/Z§_mP?9¢_§3 ;;; THIS FUNCTION MUST GO THROUGH AND LOCATE VARIABLES APPEARING IN NON-FUNCTION POSITION.

(DEFINE CENV-TRIV-ANALYZE (LAMBDA (NODE suv) (LET ((rM (NO0E\FORM uoo£))) (eocAs£ (TYPE FM) (CONSTANT NIL) (VARIABLE (oR FNP (Pu1PRoP (VARIABLE\VAR fn) 1 'VARIABLE-REFP))) (LAMBDA _ (OR FRP (ERROR '|rr1vi»1 closure - CENV-TRIV-ANALYZEI none 'FAIL-ACT)) (CENV-TRIV-ANALYZE (LAH8DA\BODY rn) NlL)) (IF (CENV-TRIV-ANALYZE (IF\PRED FH) NIL) (CENV-TRIV-ANALYZE (IF\CON FH) NIL) (CENV-TRIV-ANALYZE (IF\ALT FH) NIL)) (ASET (PUTPROP (ASET\VAR FH) T 'VARIABLE-REFP) KCENV-TRIV-ANALYZE (ASET\BODY rn) NlL)) (COMBINATION (no ((A (COMBINATION\AR6S FM) (cnR A)) (F T u1L)) ((NULL A)) (CENV-TRIV-ANALYZE (CAR A) F))))))) (DEFINE CENV-CCOMBINATION-ANALYZE (LAMBDA (CNOUE Env ARGS FREFS) (oo ((A ARGs (cuR A)) (R FREFS (UNION R (cNoo£\R£rs (CAR A))))) ((NULL A) (ALTER-CNODE cuoos (Refs == R))) (CENV-ANALYZE (CAR A) suv NIL))))

[Page 202]

192 The binding analysis is the most complicated phase of pass 2. It determines for each function whether or not a closure structure will be needed for it at run time (and if so, whether the closure structure must contain a pointer to the code); it determines for each variable whether or not it can be referred to by a run-time closure structure; and it determines for each function how arguments will be passed to it (because for internal functions not apparent to the "outside world", any arbitrary argument-passing convention may be adopted by the compiler to optimize register usage; in particular, arguments which are never referred to need never even be actually passed). If flow analysis determines that a given variable always denotes (a closure of) a given functional (CLANBDA) expression, then a KNOWN-FUNCTION property is created to connect the variable directly to the function for the benefit of the code generator.

BIND-ANALYZE is just a simple dispatch to one of many specialists, one for each type of CNODE. TRIVIAL and CVARIABLE cnodes are handled directly because they are simple.

The argument FNP is NIL, EZCLOSE, or NOCLOSE, depending respectively on whether a full closure structure, a closure structure without a code pointer, or no closure structure will be needed if in fact CNODE turns out to be of type CLAPIBDA (or CONTINUATION). Normally it is NIL, unless determined otherwise by a parent CLABELS or CCONBINATION cnode.

The argument NAME is meaningful only if the CNODE argument is of type CLAMBDA or CONTINUATION. If non-NIL, it is a suggested name to use for the cnode. This name will later be used by the code generator as a tag. The only reason for using the suggestion rather than a generated name (and' in fact one will be generated if the suggested name is NIL) is to make it easier to trace things while debugging.

REFD-VARS is a utility routine. Given a set of variables, it returns the subset of them that are actually referenced (as determined by the READ-REFS and WRITE-REFS properties which were set up by ENV-ANALYZE and CENV-ANALYZE).

[Page 203]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 FABBII_§§§_"9§Ll§LZ§__R§9¢J§5 232 BINDING ANALYSIS.

... FOR ».

IDD ,., FOR -» IO nl!

..-lll ¢.

IO FOR PUT ~~» 011 '22 THE ,;; THE ... EACH CNODE WE FILL IN:

CLOVARS THE SET OF VARIABLES REFERRED TO BY CLOSURES AT OR BELOW THIS NODE (SHOULD ALWAYS BE A SUBSET OF REFS) EACH CLAMBDA AND CONTINUATION WE FILL IN:

NON-NIL IFF REFERENCED ONLY AS A FUNCTION.

WILL BE 'EZCLOSE IF REFERRED TO BY A CLOSURE, AND OTHERWISE 'NOCLOSE.

VARIABLES PASSED THROUGH TEMP LOCATIONS WHEN CALLING THIS FUNCTION THE NAME OF THE FUNCTION (USED FOR THE PROG TAG) EACH CLABELS WE FILL IN:

EASY REFLECTS FNP STATUS OF ALL THE LABELLED FUNCTIONS EACH VARIABLE WHICH ALWAYS DENOTES A CERTAIN FUNCTION WE THE PROPERTIES:

KNOWN-FUNCTION IFF THE VARIABLE IS NEVER ASET VALUE OF THE KNOWN-FUNCTION PROPERTY IS THE CNODE FOR FUNCTION DEFINITION.

EACH LABELS VARIABLE IN A LABELS OF THE 'EZCLOSE VARIETY FNP TVARS NAME ;;; WE PUT THE PROPERTY:

LABELS-FUNCTION '

,;; TO INDICATE THAT ITS "EASY" CLOSURE MUST BE CDR'D TO GET THE ;;; CORRECT ENVIRONMENT (SEE PRODUCE-LABELS).

»..

001 (DEFINE (DEFINE NAME, IF NON-NIL, IS A SUGGESTED NAME FOR THE FUNCTION BIND-ANALYZE (LAMBDA (cNooE FNP NAME) (LET ((cEN (cNonE\cronN CNODE))) (EOCASE (TYPE CEN) (TRIVIAL (ALTER-CNODE cNooE (CLOVARS ;= NIL))) (CVARTABLE (ALTER-CNODE cNooE (CLOVARS ¢= NTL))) (CLANEUA (BTNU-ANALYZE-cLANauA cNonE FNP NAME CFH)) (CONTINUATION (BIND-ANALYZE-CONTINUATION CNODE FNP NAME cEN)) (cur (BIND-ANALYZE-CIF CNODE crN)) (CASEY (BIND-ANALYZE-CASET cNouE cEN)) (CLABELS (BIND-ANALYZE-CLABELS CNODE cEN)) (CCOMBINATION (BIND-ANALYZE-CCOMBINATION cNooE cEN)) (RETURN _ (BIND-ANALYZE-RETURN CNODE cEN)))))) REFD-VARS (LAMBUA (VARS) (oo ((v vARs (CDR v)) (w NIL (IF (on (GET (CAR v) 'READ-nEEs) (GET (CAR v) 'WRITE-REFS)) (cons (CAR v) w) V))) ((NULL v) (NREVERSE v)))))

[Page 204]

194 For a CLAHBDA cnode, BIND-ANALYZE-CLAMBDA first analyzes the body. The CLOVARS component of the cnode is then calculated. If the CLAMBDA will have a run-time closure structure created for it, then any variable it references is obviously referred to by a closure. Otherwise, only the CLOVARS of its body are included in the set.

The TVARS component is the set of parameters for which arguments will be passed in a nonstandard manner. Nonstandard argument-passing is used only for NOCLOSE-type functions (though in principle it could also be used for EZCLOSE-type functions also). In this case, only referenced variables (as determined by REFD-VARS) are actually passed. The code generator uses TVARS for two purposes: when compiling the CLAMBDA itself, WARS is used to determine which arguments are in which registers; and when compiling calls to the function, WARS determines which registers to load (see LAMBDACATE).

The FNP slot is just filled in using the FNP parameter. If a name was not suggested for the NAME slot, an arbitrary name is generated.

BIND-ANALYZE-CONTINUATION is entirely analogous to BIND-ANALYZE-CLAHBDA.

BIND-ANALYZE-CIF straightforwardly analyzes recursively its sub-cnodes, and then passes the union of their CLOVARS up as its own CLOVARS.

BIND-ANALYZE-CASET tries to be a little bit clever about the obscure case produced by code such as:

(ASET' FOO (LAMBDA ...)) where the continuation is a CONTINUATION cnode (rather than a CVARIABLE). It is then known that the variable bound by the CONTINUATION (Qt the variable set by the CASET!!) will have as its value the (closure of the) CLAMBDA-expression.

This allows for the creation of a KNOWN-FUNCTION property, etc. This analysis is very similar to that performed by BIND-ANALYZE-RETURN (see below). Aside from this, the analysis of a CASET is simple; the CLOVARS component is merely the union of the CLOVAR slots of the sub-cnodes.

[Page 205]

DDI 002 003 004 D05 006 007 D08 009 DID DI1 012 D13 014 015 016 017 016 D19 020 021 DZZ D23 024 025 D26 027 028 029 030 031 032 033 034 D35 036 037 035 D39 040 041 042 043 044 045 D46 047 048 049 D50 051 D52 053 054 055 055 057 058 059 060 061 062 D63 "ABBIT_5§9vHQ§!l§(?§_mE§9€"€§ (DEFINE BIND-ANALYZE-CLAMBDA (LAMBDA (CNODE FNP NAME CFM) (BLOCK (BIND-ANALYZE (CLAMBDA\BODY CFM) NIL NIL) (ALTER-CNODE CNODE (CLOVARS := (IF (ED FNP 'NOCLOSE) (CNODE\CLOVARS (CLAMDDA\BODY CFM)) (CNODE\REFS CNODE)))) (ALTER-CLAMBDA CFM (FNP := FNP) (TVARS := (IF (EO FNP °NOCLOSE) (REFD-VARS (CLAMBDA\VARS CFM)) NIL)) (NAME := (OR NAME (GENTEMP 'F))))))) (DEFINE BIND-ANALYZE-CONTINUAIION (LAMBDA (cnooz ruv NAME crm) (BLOCK (slum-ANALYZE (cour1nuAr|on\aonY CFM) NIL NIL) (ALTER-CNODE cnooe (CLOVARS := (IF (so PNP 'nocLos£) (CNODE\CLOVARS (CONTINUATION\BODY crn)) (cnonf\usrs cuou£)))) (ALTER-conrruunrlon ern (Fur ;= fur) (rvAns ;= (IF (no ruv 'nocLos£) (R£ro~vARs (LISI (CONTINUATION\VAR CFH))) NlL)) (NAM: ;= (OR nuns (GENTEMP 'c))))))) (DEFINE BIND-ANALYZE-CIF (LAMBDA (CNODE CFM) (BLOCK (BIND-ANALYZE (CIF\PRED CFM) NIL NIL) (BIND-ANALYZE (CIF\CON CFM) NIL NIL) (BIND-ANALYZE (CIF\ALT CFM) NIL NIL) (ALTER-CNODE CNODE (CLOVARS := (UNION (CNODE\CLOVARS (CIF\PRED CFM)) (UNION (CNOg:\CLOVARS (CIF\CON CFM)) (CNO \CLOVARS (CIF\ALT CFH))))))))) (DEFINE BIND-ANALYZE-CASET (LAMBDA (CNODE CFM) (LET ((CN (CASET\CONT CFM)) (VAL (CASET\BODY CFM))) (BIND-ANALYZE CN 'NOCLOSE NIL) (COND ((AND (EO (TYPE (CNODE\CFORM CN)) 'CONTINUATION) (EO (TYPE (CNODE\CFORM VAL)) 'CLAMBDA)) (LET ((VAR (CONTINUATION\VAR (CNODE\CFORM CN)))) (PUTPROP VAR VAL 'KNOWN-FUNCTION) (BIND-ANALYZE VAL (AND (NOT (GET VAR 'VARIABLE-REFP)) (IF (MEMO VAR (CNODE\CLOVARS (CONTINUATION\BODY (CNODE\CFORM CN)))) 'EZCLOSE (BLOCK (ALTER-CONTINUATION (CNODE\CFORM CN) (TVARS :I NIL)) 'NOCLOSE))) NIL))) (T (BIND-ANALYZE VAL NIL NIL))) (ALTER-CNODE CNODE (CLOVARS :I (UNION (CNODE\CLOVARS CN) (CNODE\CLOVARS VAL)))))))

[Page 206]

196 The binding analysis of a CLABELS is very tricky because of the possibility of mutually referent functions. For example, suppose a single CLABELS binds two CLAHBDA expressions with names FOO and BAR. Suppose that the body of FOO refers to BAR, and that of BAR to FOO. Should FOO and BAR be of FNP-type NIL, EZCLOSE, or NOCLOSE? If either is of type EZCLOSE, then the other must be also; but the decision cannot be made sequentially. It is even more complicated if one must be of type NIL.

An approximate solution is used here, to prevent having to solve complicated simultaneous constraints. It is arbitrarily decreed that all functions of a single CLABELS shall all have the same FNP type. If any one must be of type NIL, then they all are. Otherwise, it is tentatively assumed that they all may be of type NOCLOSE. If this assumption is disproved, then the analysis is retroactively patched up.

The outer DO loop of BIND-ANALYZE-(LABELS creates KNOWN-FUNCTION properties, and determines (in the variable EZ) whether any of the labelled functions needs a full closure structure. (This can be done before analyzing the functions, because it is determined entirely by the VARIABLE-REF? properties created in the previous phase.) The inner DO loop then analyzes the functions. when this is done, if EZ is NOCLOSE, and it turns out that it should have been EZCLOSE after all, then the third DO loop forcibly patches the CLAHBDA cnodes for the labelled functions, and the AMAPC form creates LABELS-FUNCTION properties as a flag for the code generator.

BIND-ANALYZE-RETURN simply analyzes the continuation and return value recursively, and then merges to two CLOVARS sets to produce its own CLOVARS set.

A special case is when the two sub-cnodes are respectively a CONTINUATION and a CLAMBDA; then special work is done because it is known that the variable bound by the CONTINUATION will always denote the (closure of the) CLAHBDA-expression.

A nasty trick is that if it turns out that the CLAHBDA can be of type NOCLOSE, then the TVARS slot of the CONTINUATION is forcibly set to NIL (i.e. the empty set). This is because no argument will really be passed. (This fact is also known by the LAMBDACATE routine in the code generator.)

[Page 207]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 O50 051 052 053 054 055 056 BA5B!I_§§§" 05/15!Z§__B#9°L§§ (DEFINE also-AnALvzE-cLAeELs (LAMBDA (CNODE CFM) (BLOCK (alum-ANALYZE (cLAsELs\aooY CFM) NIL NIL) (oo ((v (CLA8£LS\FNVARS crn) (con v)) (u (CLABELS\FND£FS crn) (CDR o)) (EZ 'NOCLOSE (Ano (NULL (GET (CAR v) 'VARIABLE-nEEP)) Ez))) ((NULL v) (ALTER-cLAeELs crn (EASY := Ez)) (oo ((v (cLAaELs\EuvAns CFM) (con v)) (n (CLABELS\FNDEFS CFM) (con o)) (cv (CNODE\CLOVARS (CLABELS\BODY CFH)) (UNION cv (CNODE\CLOVARS (CAR o))))) ((NULL 0) (ALTER-cnooE CNOUE (CLOVARS ¢= cv)) (COND ((Auo EZ (INTERSECT cv (LABELS\FNVARS crn))) (oo ((o (cLAnELs\EnuEEs CFM) (CDR n)) (CV (cnooE\cLovAns (cLAaELs\aooY crn)) (UNION cv (CNO0E\CLOVARS (CAR o))))) ((NuLL 0) (ALTER-CNODE cNooE (CLOVARS ;= cv))) (ALTER-CLAMBDA (CNODE\CFORM (CAR o)) (FMP ¢= °EzcLosE) (TVARS ¢= NlL)) (ALTER-cNooE (CAR 0) (CLOVARS := (cuonE\nEEs (CAR o))))) (AMAPC (LAMBDA (v) KPUTPROP v T 'LABELS-FUNCTl0N)) (CLAB£LS\FNVARS CFH)) (ALTER-CLABELS crm (EASY := 'EzcLosE))))) (BIND-ANALYZE (cAn 0) EZ (CAR v)))) (PUTPROP (CAR v) (CAR n) 'KNOVN~FUNCTl0N))))) (DEFINE BIND-ANALYZE-RETURN (LAMBDA (cNooE crm) (LET ((cN (RETURN\CONT crn)) (VAL (RETURN\VAL cEM))) (BIND-ANALYZE cn 'NOCLOSE NIL) (COND ((Ano (Eo (TYPE (CNODE\CFORM cn)) 'coNTTuuAT1ou) (Eo (TYPE (CNODE\CFORM vAL)) 'CLAMBD~)) (LET ((vAn (CONTINUATION\VAR (CNODE\CFORH cN)))) (PUTPROP VAR vAL 'KNOWN-FUNCTION) (BIND-ANALYZE VAL (Ano (NOT (GET VAR 'vAnlAaLE-nEEP)) (IF (nzno VAR (CNO0E\CLOVARS (coNT1NuAT1on\Boov (cNouE\croRn cu)))) 'EZCLOSE (BLOCK (ALTER-CONTINUATION (CNODE\CFORH cn) (TVARS 1 nxL)) 'NOCLOSE))) Nil))) (T (Brno-ANALYZE VAL NIL NIL))) (ALTER-cuooE CNODE (cLovAns == (UNION (CNO0E\CLOVARS cu) (CNOU£\CLOVARS VAL)))))))

[Page 208]

198 BIND-ANALYZE~CCO|*IBINATION first analyzes the function position of the combination. It then distinguishes three cases: a trivial function, a CLAHBDA-expression function, and all others.

In the case of a trivial function, the continuation (which is the second item in ARGS) can be analyzed with FNP = NOCLOSE, because the compilation will essentially turn into "calculate all other arguments, apply the trivial function, and then give the result to the continuation". A CCOMBINATION which looks like:

(a-trivial-function (CONTINUATION (var) ...) argl argn) is compiled almost as if it were:

((CONTINUATION (var) ...) (a-trivial-function argl argn)) and of course the continuation can be treated as of type NOCLOSE.

In the case of a CLANBDA-expression, the arguments are all analyzed, and then the AMAPC expression goes back over the WARS list of the CLAMBDA, and removes from the TVARS set each variable corresponding to an argument which the analysis has proved to be a NOCLOSE-type KNOWN-FUNCTION. This is because no actual argument will be passed at run time for such a function, and so there is no need to allocate a register through which to pass that argument.

In the third case, the arguments are analyzed straightforwardly by BIND-CCONBINATION-ANALYZE.

BIND-CCONBINATION-ANALYZE does the dirty work of analyzing arguments of a CCONBINATION and updating the CLOVARS slot of the CCOMBINATION cnode. If VARS is non-NIL, then it is the variables of the CLAMBDA-expression which was in the function position of the CCOMBINATION. As the arguments are analyzed, KNOWN-FUNCTION properties are put on the variables as appropriate, and the correct value of FNP is determined for the recursive call to BIND-ANALYZE. If VARS is NIL, then this code depends on the fact that (CDR NIL)=NIl. in P1acLISP.

[Page 209]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 056 059 060 RA§§lI"§§§_lE!l5f1§,,f29F,;Z (DEFINE BIND-ANALYZE-CCOHBINATION (LAMBDA (CNODE CFH) (LET ((ARGS (CCOMBlNATION\ARGS CFH))) (BIND-ANALYZE (CAR ARGS) 'NOCLOSE NIL) (LET ((FN (CNOUE\CFORM (CAR ARGS)))) (COND ((AND (EO (TYPE FN) 'TRIVlAL) (EO (TYPE (NOUE\FORM (TRIVIAL\NODE FN))) 'VARIABLE) (TRIVFN (VARIABLE\VAR (NODE\FORH (TRIVIAL\NODE FN))))) (BIND-ANALYZE (CADR ARGS) 'NGCLOSE NIL) (BIND-CCOHBINATION-ANALYZE CNODE (CDUR ARGS) NIL (CNODE\CLOVARS (CAUR AnGs)))) ((£o (TYPE rn) 'cLAnaoA) (Brno-ccoMa1NATxoN-ANALYz£ cNon£ (CDR ARG5) (CLAMBDA\VARS Fu) (CNODE\CLOVARS (CAR ARGS))) (AMAPC (LAMBDA (v) (IF (LET ((xFN (GET v 'KNOWN-FUNCTION))) (Ann xru (zo (EOCASE (TYPE (CNODE\CFORM xru)) (CLAHBDA (cLAManA\rnP (CNODE\CFORH xrN))) (CONTINUATION (conT1NuAT|ou\FuP (CNODE\CFORM xFu)))) 'NOCLOSE ) )) (ALTsn-cLAMaoA ru (TVARS := (DELO V (CLAHBDA\TVARS FN)))))) (CLAMBDA\TVARS FN))) '

(T (8IND-CCOHBINATION-ANALYZE CNODE (CDR ARGS) NIL (cnou£\cLovAns (CAR ARGS))))))))) 53: VARS MAY BE NIL - HE DEPEND ON (CDR NIL)=NlL.

(DEFINE BIND-CCOMBINATION-ANALYZE (LAMBDA (CNODE ARGS vAAs FCV) (oo ((A ARGS (CDR A)) (v VARS (con v)) (cv rcv (UNION cv (;uoo£\cLovAAs (CAR A))))) ((nuLL A) (ALT£R-cnooe cnoos (CLOVARS ¢= cv))) (COND ((AND vAns _ (nemo (TYPE (CNOD£\CFORM (CAR A))) '(CLAHBDA CONTINUATION)) (uoT (GET (CAR v) 'VRITE-REFS))) (PUTPROP (CAR v) (CAR A) 'KNOWN-FUNCTION) (BIND-ANALYZE (CAR A) (Ann (NOT (GET (CAR v) 'VARIABLE-REFP)) (IF (MEMO (CAR v) rcv) 'ezcLos£ ~nocLose)) u1L)) (T (BIND-ANALYZE (CAR A) NIL nxL))))))

[Page 210]

200 DEPTH-ANALYZE allocates registers through which to pass arguments to NOCLOSE functions, i.e. for arguments corresponding to elements of WARS sets.

An unclever stack discipline is used for allocating registers. Each function is assigned a "depth", which is zero for a function whose FNP is NIL or EZCLOSE (such functions take their arguments in the standard registers **ONE** through **EIGHT**, assuming that "NUMBER-OF-ARG-REGSM is B, as it is in the current SCHEME implementation). For a NOCLOSE function the depth is essentially the depth of the function in whose body the NOCLOSE function appears, plus the number of TVARS belonging to that other function (if it is of type NOCLOSE) or the number of standard argument registers used by it (if it is NIL or EZCLOSE). For example, consider this code:

(CLANBDA (C X Y) ((CLANBDA (K F Z) ((CLAMBDA (Q W V) ...) CONT-57 '3 '4)) (CONTINUATION (V) ...) (CLANBDA (H) ...) 'F00)) Suppose that the outer CLANBDA is of type EZCLOSE for some reason. Its depth is 0. The two CLAMBDA-expressions and CONTINUATION immediately within it have depth 3 (assuming the CONTINUATION and second CLAMBDA are of type NOCLOSE the first CLAMBDA definitely is). The innermost CLAMBDA is then of depth 4 (for Z, which will be in TVARS K and F will not be because they are names for NOCLOSE functions, assuming K and F have no WRITE-REFS properties).

To each function is also attached a MAXDEP value, which is in effect the number of registers used by that function, including all NOCLOSE functions within it. This is used in only one place in the code generator, to generate a SPECIAL declaration for the benefit of the MacLISP compiler, which compiles the output of RABBIT. For most constructs this is simply the numerical maximum over the depths of all sub-cnodes. Toward this end the maximum depth of the cnode is returned as the value of DEPTH-ANALYZE.

[Page 211]

001 002 003 D04 005 006 D07 008 009 010 011 DIZ 013 014 015 016 017 018 019 020 021 022 D23 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 D48 049 050 051 052 053 054 055 056 057 058 059 RBH§lI_§§§__P§£l§!?§__P@9¢_39 ;;; DEPTH ANALYSIS FOR CPS VERSION.

FOR EACH GLAMDDA AND coNT1NuAT1oN VE FILL TN
DEP DEPTH or TEMP VAR USAGE AT THIS POINT HAxDEP MAX DEPTH BELOW THIS POINT
;; VALUE or DEPTH-ANALYZE IS THE MAX DEPTH (DEFINE DEPTH-ANALYZE (LAMBDA (CNODE DEP) (LET ((cFM (cNoDE\cEonM cNooE))) (EocAsE (TYPE CEM) (TRTVTAL DEP) (CVARIABLE DEP) (CLAMBDA (LET ((ND (DEPTH-ANALY1E (CLAMBDA\BODY CFM) (IF (Eo (CLAMBDA\FNP CFM) 'NOCLOS£) (+ DEP (LENGTH (CLAMBDA\TVARS crH))) (HTN (LENGTH (CLAM8DA\VARS crH)) (+ 1 ~»NuHaER-or-APG-nEGs*~)))))) (ALTER-CLAMBDA cEH (DEP
= (IF (EO (CLAMBDA\FNP CFM) 'NOCLOSE) DEP 0)) (MAXDEP 2= MD)) MD)) (CONTINUATION (LET ((MD (DEPTH-ANALYZE (CONTINUATION\BODY CFM) (IF (EO (CONTINUATION\FNP CFM) 'NOCLOSE) (+ DEP (LENGTH (CONTINUATION\TVARS CFM))) 2)))) (ALTER-CONTINUATION CFM (DEP
= (IF (EO (CONTlNUAT}ON\FNP CFM) 'NOCLOSE) DEP 0)) (MAXDEP := MD)) HD)) (CIF (MAX (DEPTH-ANALYZE (ClF\PRED CFM) DEP) (DEPTH-ANALYZE (CIF\CON CFM) DEP) (DEPTH-ANALYZE (CIF\ALT CFM) DEP))) (CASET (MAX (DEPTH-ANALYZE (CASET\CONT CFM) DEP) (DEPTH-ANALYZE (CASET\BODY CFM) DEP))) (CLABELS (LET ((DP (IF (EO (CLABELS\EASY CFM) 'NOCLOSE) DEP (+ DEP (LENGTH (CLABELS\FNVARS CFM)))))) (Do ((D (CLABELS\FNDEFS CFM) (con D)) (MD (DEPTH-ANALYZE (CLABELS\BODY crm) DP) (HAH MD (DEPTH-ANALYZE (CAR D) DP)))) ((NULL U) HD)))) (CCOMBINATION (Do ((A (CCOMBINATION\ARGS CFM) (con A)) (MD D (MAX MD (DEPTH-ANALYZE (CAR A) DEP)))) ((NULL A) HD))) (RETURN (MAX (DEPTH-ANALYZE (RETURN\CONT CFM) DEP) . (DEPTH-ANALYZE (RETunN\vAL CFM) DEP)))))))

[Page 212]

202 Just as DEPTH-ANALYZE assigns locations in registers ("stack locations") for variables, so CLOSE-ANALYZE assigns locations in consed ("heap-allocated") environment structures for variables. The general idea is that if the value of a an accessible variable is not in a register, then it is in the structure which is in the register **ENV**. This structure can in principle be any structure whatsoever, according to the whim of the compiler. RABBIT's whim is to be very unclever; the structure of **ENV** is always a simple list of variable values.

Thus a variable in the **ENV** structure is always accessed by a series of CDR operations and then one CAR operation.

(More clever would be to maintain the environment as a chained list of vectors, each vector representing a non-null contour. Then a variable could be accessed by a series of "CDR" operations equal to the number of contours (rather than the number of variables) between the binding and the reference, followed by a single indexing operation into the contour-vector. The number of "CDR" operations could be reduced by having a kind of "cache" for the results of such contour operations; such a cache would in fact be equivalent to the "display" used in many Algol implementations. If such a display were maintained, a variable could be accessed simply by a two-level indexing operation.) Within the compiler an environment structure is also represented as a simple list, with the name of a variable occupying the position which its value will occupy in the run-time environment.

For every CLANBDA, CONTINUATION, and CLABELS, a slot called CONSENV is filled in, which is a list representing what the environment structure will look like when the closure(s) for that construct are to be constructed, if any. This is done by walking over the cnode-tree and doing to the environment representation precisely what will be done to the real environment at run time.

There is a problem with the possibility that a variable may initially be in a register (because it was passed as an argument, for example), but must be transferred to a consed environment structure because the variable is referred to by the code of a closure to be constructed. There are two cases: either the variable has no WRITE-REFS property, or it does.

If it does not, then there is no problem with the value of the variable being in two or more places, so it is simply copied and consed into the environment as necessary. The CLOSEREFS slot of a function is a list of such variables which must be added to the consed environment before constructing the closure.

If the variable does have WRITE-REFS, then the value of the variable must have a single "home", to prevent inconsistencies when it is altered. (This is far easier than arranging for every ASET' operation to update all extant copies of a variable's value.) It is arranged that such variables, if they are referred to be closures (are in the CLOVARS set of the CLAMBDA which binds them) will exist only in the consed environment. Thus for each CLAMBDA the ASETVARS set is that subset of the lambda variables which have WRITE-REFS and are in the CLOVARS set. Before the body of the CLAMBDA is executed, a piece of code inserted by the code generator will transfer the variables from their registers immediately into the consed environment, and the values in the registers are thereafter never referred to.

[Page 213]

001 002 003 004 005 006 007 008 009 D10 011 DIZ 013 014 015 016 017 018 D19 D20 021 022 023 024 025 026 027 D28 029 030 031 D32 033 034 035 036 037 038 D39 040 041 D42 D43 D44 D45 RAB5IIM§§§"_2§ll§/Z§_"P°9¢U§?

CLOSURE ANALYSIS FOR CPS VERSION ;;; FOR EACH CLAMBOA, CONTINUATION, AND CLABELS HE FILL IN

Z CONSENV THE CONSED ENVIRONMENT OF THE CLAMBOA.

, CONTINUATION, OR CLABELS (BEFORE ANY ,,; CLOSEREFS HAVE BEEN CONSED ON) ;;; FOR EACH CLAMBOA AND CONTINUATION HE FILL IN:

A LIST OF VARIABLES REFERENCED BY THE CLAMBOA OR CONTINUATION WHICH ARE NOT IN THE CONSED ENVIRONMENT AT THE POINT OF THE CLAMBOA OR CONTINUATION AND SO MUST BE CONSED ONTO THE ENVIRONMENT AT CLOSURE TIME; HOWEVER, THESE NEED NOT BE CONSED ON IF THE CLAMBOA OR CONTINUATION IS IN FUNCTION POSITION OF 2 A FATHER VHICH IS A'CCOMBINATION OR RETURN iii FOR THE CLAMBDA'S IN THE FNDEFS OF A CLABELS, THESE MAY BE 231 SLIGHTLY ARTIFICIAL FOR THE SAKE OF OPTIMIZATION (SEE BELOW).

FOR EACH CLAMBOA HE FILL IN
'

A LIST OF THE VARIABLES BOUND IN THE CLAMBDA WHICH ARE EVER ASET AND SO MUST BE CONSED ONTO THE ENVIRONMENT IMMEDIATELY IF ANY 2 CLOSURES OCCUR IN THE BODY 233 FOR EACH CLABELS HE FILL IN:

VARIABLES TO BE CONSED ONTO THE CURRENT CONSENV BEFORE CLOSING THE LABELS FUNCTIONS ; CLOSEREFS 2 ASETVARS "L FNENV ;;; CENV IS THE CONSED ENVIRONMENT (A LIST OF VARIABLES) (DEFINE FILTER-cLos£Refs (LAMBDA (Refs CENV) (oo ((x Refs (con x)) (Y NIL (If (OR (nano (CAR x) CENV) (LET ((xru (GET (CAR x) 'KNOWN-FUNCTION))) (Ann xru (eo (cocnse (TYPE (cNou£\cronn xrn)) (cLAMaoA (cLAMaoA\ruP (CNODE\CFORH KFN))) (CONTINUATION (CONTINUATION\FNP (CNO0E\CFORM KFN)))) 'NOCLOSE)))) Y

(cons (CAR x) Y)))) ((NULL x) (nnevenss Y)))))

[Page 214]

204 For each CLABELS a set called FNENV is computed. This is strictly an efficiency hack, which attempts to arrange it such that the several closures constructed for a CLABELS share environment structure. The union over all the variables needed is computed, and these variables are, at run time, all consed onto the environment before any of the closures is constructed. The hope is that the intersection of these sets is large, so that the total environment consing is less than if a separate environment were consed for each labelled closure.

FILTER-CLOSEREFS is a utility routine which, given a set of variables and an environment representation, returns that subset of the variables which are not already in the environment and so do not denote known NOCLOSE functions. (Those variables which are already in the consed environment or which do denote NOCLOSE functions of course need not be added to that consed environment.) The argument CENV to CLOSE-ANALYZE is the representation of the consed environment (in **ENV**) which will be present when the code for CNODE is executed. The only processing of interest occurs for CLAMBDA, CONTINUATION, and CLABELS cnodes.

The CLOSEREFS of a CLANBDA are those and which are not already in CENV, provided The ASETVARS are precisely those VARS which The processing for a CONTINUATION is which are referred to by the CLAMBDA the CLAMBDA is not of type NOCLOSE. have WRITE-REFS and are in CLOVARS. similar. As a consistency check, we make sure the bound variable has no WRITE-REFS (it should be impossible for an ASET' to refer to the bound variable of a CONTINUATION).

For a CLABELS, the FNENV set is first calculated and added to CENY. This new CENV is then used to process the definitions and body of the CLABELS.

[Page 215]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 062 063 064 065 066 067 QAUQlT~§Q§_"9§Ll§£Z§_"P°9¢H29 (DEFINE CLOSE-ANALYZE (LAMBDA (CNOOE CENV) (LET ((cFn (CNODE\CFORM cNou£))) (EQCASE (TYPE crn) (TRTv1AL NIL) (CVARIABLE NIL) (cLAMaoA (LET ((cR (AND (NOT (so (CLAMBUA\FNP CFM) 'NOCLOSE)) (FILTER-CLOSEREFS (CNODE\REFS CNODE) CENV))) (AV (oo ((v (CLAMBOA\VARS (CNODE\CFORM CNODE)) (COR v)) (A NIL (IF (Ano (GET (CAR v) 'WRITE-REFS) (Mano (CAR v) (CNO0E\CLOVAR$ (CLAMBDA\BOUY crn)))) (CONS (CAR v) A) A))) ((NULL V) A)))) (ALTER-CLAMBDA cfm _ (CONSENV := CENV) (CLOSEREFS := CR) (ASETVARS := AV)) (CLOSE-ANALYZE (CLAMBDA\BODY CFM) (APPEND AV CR CENV)))) (coNT1NuATloN (AND (GET (CONTINUATION\VAR CFM) 'WRITE-REFS) (ERROR '1How could an ASET refer to a continuation variable?) CNODE 'FAIL-ACT)) (LET ((CR (AND (NOT (EO (CONTINUATION\FNP CFM) 'NOCLOSE)) (FILTER-CLOSEREFS (CNODE\REFS CNODE) CENV)))) (ALTER-CONTINUATION CFM (CONSENV := CENV) ' (CLOSEREFS := CR)) (CLOSE~ANALYZE (CONTINUATION\BODY CFM) (APPEND CR CENV)))) (CIF '

(CLOSE-ANALYZE (CIF\PRED CFM) CENV) (CLOSE-ANALYZE (CIF\CON CFM) CENV) (CLOSE-ANALYZE (CIF\ALT CFM) CENV)) (CASET (CLOSE-ANALYZE (CASET\CONT CFM) CENV) (CLOSE-ANALYZE (CAS£T\BODY CFM) CENV)) (CLABELS ((LAM8DA (CENV) (BLOCK (AMAPC (LAMBDA (D) (CLOSE-ANALYZE D C£NV)) (CLABELS\FNDEFS CFM)) (CLOSE-ANALYZE (CLA8EL$\BO0Y CFM) C£NV))) (COND ((CLA8ELS\£ASY CFM) (00 ((D (CLABELS\FND£FS CFM) (CDR D)) (R NIL (UNION R (CNO0E\R£FS (CAR D))))) ((NULL D) (LET ((E (FILTER-CLOSEREFS R C£NV))) (ALTER-CLABELS CFM (FNENV := E) (CONSENV := CENV)) (APPEND E CENV))))) (T (ALTER-CLABELS CFM (FNENV :I NIL) (CONSENV := CENV)) CENVHH (CCOMBINATION (ANARC (LAMBDA (A) (cLose-ANALYZE A c£uv)) (CCOMBlNATION\ARGS CFM))) (RETURN (CLOSE-ANALYZE (R£TURN\CONT crm) cenv) (CLOSE-ANALYZE (RETURN\VAL CFM) c£nv))))))

[Page 216]

206 we now come to the code generator, which is altogether about one-fourth of all the code making up RABBIT. Part of this is because much code which is conceptually singular is duplicated in several places (partly as a result of the design error in which CCOMBINATION and RETURN nodes, or CLAMBDA and CONTINUATION nodes, are treated distinctly; and also because a powerful text editor made it very easy to make copies of the code for various purposes!). The rest is just because code generation is fairly tricky and requires checking for special cases.

A certain amount of peephole optimization is performed; this is not so much to improve the efficiency of the output code, as to make the output code easier to read for a human debugging RABBIT. A large fraction of the output code (perhaps ten to twenty percent) is merely comments of various kinds intended to help the debugger of RABBIT figure out what happened.

One problem in the code generator is that most functions need to be able to return two things: the code generated for a given cnode-tree, and a list of functions encountered in the cnode-tree, for which code is to generated separately later. we solve this problem by a stylistic trick, namely the explicit use of continuation-passing style. Many functions in the code generator take an argument named "C". This argument is itself a function of two arguments: the generated code and the deferred-function list. The function which is given C is expected to compute its two results and then invoke C, giving results as arguments. (In practice a function which gets an argument an argument FNS, which is a deferred-functions list; the function is add its deferred functions onto this list FNS, and give the augmented C along with the generated code.) it the two C also gets expected to FNS list to Other arguments which are frequently passed within the code generator are CENV (a representation of the consed environment); BLOCKFNS, a list describing external functions compiled together in this "block" or "module" (this is used to compile a direct GOTO rather than a more expensive call to an external function, the theory being that several functions might be compiled together in a single module as with the InterLISP "block compiler"; this theory is not presently implemented, however, and so BLOCKFNS always has just one entry); PROGNAME, a symbol which at run time will have as its value the MacLISP SUBR pointer for the current module (this SUBR pointer is consed into closures of compiled functions, and so any piece of code which constructs a closure will need to refer to the value of this symbol); and variable names to pieces of variable is to be generated, found in RNL, and otherwise therefore global) is output).

RNL, the "rename list", an alist pairing internal code for accessing them (when code to reference a the piece of code in RNL is used if the variable is a reference to the variable name itself (which is CONPILATE is the topmost routine of the code generator. FN is the cnode-tree for a function to be compiled. The topmost cnode should of course be of type CLAMBDA or CONTINUATION. For a CLAMBDA, the call to REGSLIST sets up the initial RNL (rename list) for references to the arguments. Also, when COMP-BODY has returned the code (the innermost LAMBDA-expression in COMPILATE is the argument C given to COMP-BODY), SET-UP-ASETVARS is called to take care of copying the variables in the ASETVARS set into the consed environment. The code for a CONTINUATION is similar, except that a CONTINUATION has no ASETVARS and only one bound variable.

[Page 217]

001 002 D03 004 D05 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 054 Zi; CODE GENERATION ROUTINES ;;; PROGNAME: NAME OF A VARIABLE VHICH AT RUN TIME H ; ; AS VALUE THE SUBR POINTER FOR THE PROG ;;; FN: THE FUNCTION TO COMPILE (A CLAMBDA OR 22; EXTERNALP: NON-NIL IF THE FUNCTION IS EXTERNAL 22; RNL: INITIAL RENAME LIST (NON-NIL ONLY FOR ; ; ENTRIES ARE: (VAR _ CODE) ;;; BLOCKFNS: AN ALIST OF FUNCTIONS IN THIS BLOCK.

ENTRIES ARE
(USERNAME CNODE) ;;; FNS
A LIST OF TUPLES FOR FUNCTIONS YET TO ; ; EACH TUPLE IS (PROGNAME FN RNL) Si; C
A CONTINUATION, TAKING:
'; CODE
THE PIECE OF MACLISP CODE FOR FNS: AN AUGMENTED FNS LIST .»» Oil (DEFINE COMPILATE _ (LAMBDA (mzosmxne rn nm mocxrus rus c) (LU ((cm (CNOD£\CFORM rn))) (eocase (TYPE cm) RAB?!T_§§§__Q§!l§£Z§_"E!9€"2l ILL HAVE CONTINUATION CNODE) NOCLOSE FNS).

BE COMPILED; THE FUNCTION (CLAMBDA (LET ((CENV (APPEND (CLAM8DA\ASETVARS CFM) (CLAM8DA\CLOSEREFS CFM) (CLAM80A\CONSENV CFM)))) (COMP-BODY (CLAMBDA\8ODY CFM) (REGSLIST CFM T (ENVCARCDR CENV RNL)) PROGNAME BLOCKFNS CENV FNS (LAMBDA (CODE FNS) (C (SET-UP-ASEYVARS CODE (CLAM8DA\ASETVARS CFM) (REGSLIST CFM NIL NIL)) FNS))))) (CONTINUATION (LET ((CENV (APPEND (CONTINUATION\CLOSEREFS CFM) (CONTINUATION\CONSENV CFM)))) (COMP-BODY (CONTINUATION\BODY CFM) (IF (EO (IF (CON PROGNAME BLOCKFNS CENV Has C))))))) (CONTINUATION\FNP CFM) 'NOCLOSE) (NULL (CONTINUATION\TVARS CFM)) (ENVCARCDR CENV RNL) (CONS (CONS (CONTINUATION\VAR CFM) (TEMPLOC (CONTINUATION\DEP CFM))) (ENVCARCDR CENV RNL))) S (CONS (CONTINUATION\VAR CFM) (CAR *lARGUMENT-REGISTERS**)) (ENVCARCDR CENV RNL)))

[Page 218]

208 **ARGUNENT-REGISTERS** is a list of the standard "registers" through which arguments are passed. In the standard SCHEME implementation this list is:

(**ONE** **TWO** **THREE** **FOURk* **FIVE** **SIX** **SEVEN** **EIGHT**) DEPROGNIFYI is a peephole optimizer. It takes a MacLISP form and returns a list of MacLISP forms. The idea is that if the given form is (PROGN ...), the keyword PROGN is stripped off; also, any irrelevant computations (references to variables or constants other than in the final position) are removed.

(ATOMFLUSHP, when NIL, suppresses the removal of symbols, which in some cases may be MacLISP PROG tags). The purpose of this is to avoid multiple nesting of PROGN forms:

(PROGN (PROGN a b) (PROGN (PROGN c (PROGN d e) f) g)) Any code generation routine which constructs a PROGN with a component Q generated by another routine generally says:

"(PROGN (SETQ FOO 3) @(DEPROGNIFY Q) (GO ,THE-TAG)) The "@" means that the list of forms returned by the call to DEPROGNIFY (which is actually a macro which expands into a call to DEPROGNIFYI) is to be substituted into the list (PROGN ...) being constructed by the '"' operator. Thus rather than the nested PROGN code shown above, the code generator would instead produce:

(PROGN a b c d e f g) which is much easier to read when debugging the output of RABBIT.

TENPLOC is a little utility which given the number (in the DEP ordering used by DEPTH-ANALYZE) of a register returns the name of that register.

    • CONT+ARG-REGS** is the same as **ARGUMENT-REGISTERS** except that the name **CONT** is tacked onto the front. **CONT** is considered to be register 0. If N is greater than the_number of the highest standard argument register, then a new register name of the form "N" is invented. Thus the additional temporary registers are called ll, -12-, -13-, etc. ~

[Page 219]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 RAB9lI,§§§,L9§l!§IZ§__P99€m4?

DEPROGNIFY IS USED ONLY TO MAKE THE OUTPUT PRETTY BY ELIHINATING ;;; UNNECESSARY OCCURRENCES OF "PROGN'.

(DEFMAC DEPROGNIFY (FORM) "(nFPROGMlFv1 ,FORM NlL)) (SET' *DEPROGNIFY-COUNT* o) (DEFINE DEPROGNIFYI (LAMBDA (FORM ATOHFLUSHP) (IF (OR (ATOM FORM) (NOT (£O (CAR FORM) 'PROGN))) (LIST FORM) (oo ((x (COR FORM) (COR x)) (z MIL (COND ((NULL (COR x)) (CONS (CAR x) z)) ((NULL (CAR x)) (xNCR£M£n1 *DEPROGNIFY-COUNTG) Z) ((A1OM (CAR x)) (COND (ATOMFLUSHP (INCRFMFNT *DEPROGNIFY-COUNT*) Z) (1 (Cons (CAR x) z)))) ((£o (CAAR x) '0UOT£) (INCRFMFMT *DEPROGNIFY-COUNTQ) Z) (1 (Cons (CAR x) z))))) ((NULL x) (uRev£Rse z)))))) (DEFINE TEMPLOC (LAMBDA (n) (LABELS ((LooP (LAMBDA (REGS J) (IF (NULL Ress) (IMPLODE (APPEND °(-) (EXPLODEN M) '(-))) (IF (- J o) (CAR REGS) '

(LOOP (COR Rees) (- J 1))))))) (LOOP **CONT+ARG-REGS** n))))

[Page 220]

210 ENVCARCDR takes a set of variables VARS representing the consed environment, and an old rename list RNL, and adds to RNL new entries for the variables, supplying pieces of code to access the environment structure. For example, suppose RNL were NIL, and VARS were (A B C). Then ENVCARCDR would produce the list:

((C . (CAR (CDR (CDR **ENV**)))) (B . (CAR (CDR **ENV**))) (A . (CAR **ENV**))) where each variable has been paired with a little piece of code which can be used to access it at run time. This example is not quite correct, however, because the peephole optimizer DECARCDRATE is called cur the little pieces of code; DECARCDRATE collapses CAR-CDR chains to make them easier to read, and so the true result of ENVCARCDR would be:

((C _ (CADDR **ENV**)) (B . (CADR **ENV**)) (A . (CAR **ENV**)))

[Page 221]

001 002 003 004 005 006 007 RA8BlI_§§§_"9§f}§/Z§__PF9¢'f§ (DEFINE £NvcARcon (LAMBDA (vAns Run) (oo (<x °~~zuv¢» "(con ,x)) (v vans (con v)) (n nun (cons (CONS (CAR v) (DECARCDRATE '(cAR ,x))) n))) ((NULL V) R))))

[Page 222]

212 REGSLIST takes a CLAMBDA cnode, a switch AVP, and a rename list RNL. It tacks onto RNL new entries which describe how to access the arguments of the CLAMBDA. This is complicated because there are three cases. (1) A NOCLOSE function takes its arguments in nonstandard registers. (Z) Other functions of not more than **NUNBER-OF-ARGUMENT-REGISTERS** (the length of the **ARGUMENT-REGISTERS** list) arguments takes their arguments in the All other functions takes a list of arguments in the (**ONE**), except for the continuation in **CONT**. The or not the elements of ASETVARS should be included include). switch AVP tells (non-nil means As an example, suppose the CLAMBDA is a NOCLOSE with DEP = 12 and (A B C D), and suppose that AVP = T and RNL = NIL. Then the result would ((D _ -15-) (C _ -14-) (B _ -13-) (A . -12-)) standard registers. (3) first argument register whether do not TVARS = be:

As another example, suppose the CLAMBDA is of type EZCLOSE with VARS = (K X Y Z) and ASETVARS = (Y), and suppose that AVP = NIL and RNL = ((A . -12-)). Then the result would be:

((Z _ **THREE**) (X . **ONE**) (K . **CONT**) (A . -12-)) SET-UP-ASETVARS takes a piece of code (the code for a CLANBDA body), an ASETVARS set AV, and a rename list. If there are no ASETVARS, then just the code is returned, but otherwise a PROGN-form is returned, which ahead of the code has a SETQ which adds the ASETVARS to the environment. (LOOKUPICATE takes a variable and ai RNL and returns a piece of code for referring to that variable.) For example, suppose we had:

CODE = (GO FOO) AV = (A C) RNL = ((C . -14-) (B . -13-) (A . -12-)) Then SET-UP-ASETVARS would return the code:

(PROGN (SETQ **ENV** (CONS -lZ- (CONS -14- **ENV**))) (GO FOO))

[Page 223]

DO1 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 D25 026 027 028 029 030 031 032 033 D34 035 036 037 D38 039 040 RABREE-§§§W,Q§£l§[Z§"_R§9° 92 ;;; AVP NON-NIL MEANS THAT ASETVARS ARE TO BE EXCLUUED FROM THE CONSED LIST.

(DEFINE REGSLIST (LAMBDA (CLAM AVP RNL) (LET ((Av (AND AVP (cLAMauA\AservARs cLAM)))) (lr (eo (CLAMBOA\FNP CLAM) 'NOCLOSE) (oo ((J (cLAMaoA\oeP CLAM) (+ J 1)) (TV (CLAMBDA\TVARS CLAM) (coR 1v)) (R RNL (IF (Memo (CAR TV) AV) R

(CONS (CONS (CAR TV) (TEMPLOC J)) R)))) ((NULL TV) R)) '

(LET ((VARS (CLAMBDA\VARS CLAM))) (IF (> (LENGTH (CDR VARS)) **NUMBER-OF-ARG-REGS**) (DO ((X (CAR ¢*ARGUMENT-REGISTERS**) '(CDR ,X)) (V (CDR VARS) (CDR V)) (R (CONS (CONS (CAR VARS) '**CONT**) RNL) (IF (MEMO (CAR V) AV) R

(cons (CONS (CAR v) (u£cARcoRAT£ '(CAR ,x))) R)))) ((NULL v) R)) (oo ((v vARs (cnR v)) (x **CONT+ARG-RE6S°* (coR x)) (R RNL (IF (nano (CAR v) AV) R

(cons (CONS (CAR v) (CAR x)) R)))) ((NULL V) R)))))))) (DEFINE SET-UP-ASETVARS (LAMBDA (CODE AV RNL) (IF (NULL AV) CODE "(PROGN (SETO *°ENV** ,(DO ((A (REVERSE AV) (CDR A)) (E '**ENV** '(CONS ,(LO0KUPlCATE (CAR A) RNL) ,E))) ((NULL A) E))) !(DEPROGNIFY CODE)))))

[Page 224]

214 In the continuation-passing style, functions do not return values; instead, they apply a continuation to the value. Thus, the body of a CLAMBDA-expression is a form which is not expected to produce a value. On the other hand, such a form will have subforms which do produce values, for example references to variables.

Thus the forms to be dealt with in the code generator can be divided into those which produce values and those which do not. Initially the latter will always be attacked, as the body of a "function"; later the former will be seen.

COMP-BODY takes a valueless form and compiles it. The routine ANALYZE, which we will see later, handles valued forms.

COMP-BODY instantiates a by now familiar theme: it simply dispatches on the type of BODY to some specialist routine. In the case of a CLABELS, it first compiles the body of the CLABELS (which itself is valueless if the CLABELS is valueless, and so a recursive call to COMP-BODY is used), and then goes to PRODUCE-LABELS. For a CCOMBINATION or RETURN, it does a three-way (for RETURN, two-way) sub-dispatch on whether the function is a TRIVFN, a CLAMBDA (or CONTINUATION), or something else.

The PRODUCE series of routines produce code for valueless forms.

PRODUCE-IF calls ANALYZE on the predicate (which will produce a value), and COMP-BODY on the consequent and alternative (which produce no value because the entire CIF does not). The three pieces of resulting code are respectively called PRED, CON, and ALT. These are then given to CONDICATE, which generates a MacLISP COND form to be output.

[Page 225]

001 002 003 D04 005 006 007 008 D09 D10 D11 D12 013 D14 015 D16 D17 018 019 020 021 022 023 024 025 026 027 D28 D29 030 031 D32 D33 034 D35 036 D37 038 039 D40 041 042 043 044 D45 046 D47 048 D49 050 D51 052 053 D54 D55 056 D57 D58 059 D60 D61 D62 063 064 065 D66 067 068 8BB°lTM§§§__9§£l5lZ§,wP99¢_!§ ;;; RNL IS THE "RENAME L1s1~¢ AN AL1s1 oescnxalns now TO nares to THE vAn1AaL£s nu rue S32 ENVIRONMENT. CENV IS THE CONSED ENVIRONMENT (DEFINE COMP-BODY (LAMBDA (BODY RNL PROGNAME BLOCKFNS CENV FNS C) - (LET ((CFM (CNODE\CFORM BODY))) (EOCASE (TYPE CFM) SEEN BY TNE BODY.

(CIF (PRODUCE-IF BODY RNL PROGNAME BLOCKFNS CENV FNS C)) (CASET (PRODUCE-ASET BODY RNL PROGNAME BLOCKFNS CENV FNS C)) (CLABELS (OR (EQUAL CENV (CLABELS\CONSENV CFM)) (ERROR '|Env1ronment disagreementl BODY 'FAIL-ACT)) (LET ((LCENV (APPEND (CLABELS\FNENV CFM) CENV))) (COMP-BODY (CLABELS\BODY CFM) (ENVCARCDR LCENV RNL) PROGNAME BLOCKFNS LCENV FNS (LAMBDA (LBOD FNS) (PRODUCE-LABELS BODY LBOD RNL PROGNAME BLOCKFNS FNS C))))) (CCOMBINATION (LET ((FN (CNODE\CFORM (CAR (CCOMBlNATION\ARGS CFM))))) (COND ((£o (TYPE rn) 'CLAMBOA) (PRODUCE-LAMBDA-COMBINATION BODY RNL PROGNAME BLOCKFNS CENV FNS C)) ((AND (EO (TYPE FN) 'TRIVIAL) (EO (TYPE (NODE\FORM (TRIVIAL\NODE FN))) °VARIABLE) (TRIVFN (VARIABLE\VAR (NODE\FORM (TRIVIAL\NODE FN))))) (PRODUCE-TRIVFN-COMBINATION BODY RNL PROGNAME BLOCKFNS CENV FNS C)) (T (PRODUCE-COMBINATION BODY RNL PROGNAME BLOCKFNS CENV FNS C))))) (RETURN (LET ((ru (CNODE\CFORM (RETURN\CONT cFM)))) (1¥ (so (TYPE rn) 'CONTlNUATION) (PRODUCE-CONTINUATION-RETURN aoov RNL Pnosnnne aLocxrns ceuv rss c) (PRODUCE-RETURN aoov RNL Pnosnnns aLocxrns cenv rns c)))))))) (DEFINE PRODUCE-IF (LAMBDA (CNODE RNL PROGNAME BLOCKFNS CENV FNS C) (LET ((CFM (CNODE\CFORM CNODE))) (ANALYZE (CIF\PRED CFM) RNL PROGNAME BLOCKFNS FNS (LAMBDA (PRED FNS) (COMP-BODY (CIF\CON CFM) RNL PROGNAME BLOCKFNS CENV FNS (LAMBDA (CON FNS) (COMP-BODY (ClF\ALT crn) RNL vaosnnne BLOCKFNS csnv sus (LAMBDA (ALT rss) (c (CONDICATE Paso con ALI) rns))))))))))

[Page 226]

216 PRODUCE-ASET first calls ANALYZE on the body, which must produce a value (to be assigned to the CASET variable). There are then two cases, depending on whether the CASET\CONT is a CONTINUATION or not.

If it is, then the body of the continuation is compiled (using COMP-BODY), and then LANBDACATE is called to generate the invocation of the continuation. The routine OUTPUT-ASET generates the actual MacLISP SETQ (or other construct) for the CASET variable, using the environment location provided by LOOKUPICATE. All in all this case is very much like a RETURN with an explicit CONTINUATION, except that just before the continuation is invoked a SETQ is stuck in.

If the CASET\CONT is not a CONTINUATION, then ANALYZE is called on the CASET\CONT, and then a piece of code is output which sets **FUN** to the continuation, **ONE** (which is in the car of **ARGUMENT-REGISTERS**) to the value of the body (after also setting the CASET variable, using OUTPUT-ASET), and does (RETURN NIL), which is the SCHEME run-time protocol for invoking a continuation. '

[Page 227]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 E5§BlI_§§§__!§Ll§£Z§_»F!9$M2§ (DEFINE PRODUCE-ASET (LAMBDA (CNODE RNL PROGNAME aLocxrNs ceuv rns c) (LET ((crM (CNODE\CFORM CNODE))) (ANALYZE (CAS£T\800Y cfm) nuL _ Pnosuane eLocxrNs rns '

(LAMBDA (aoov FNS) (LET ((courcrn (CNODE\CFORH (CAS£T\CONT crn)))) (if (eo (TYPE CONTCFM) 'CONT|NUATION) (COMP-BODY (CONTINUATION\BODY CONTCFH) (IF (CONTINUATION\TVARS CONTCFH) (CONS (CONS (CAR (CONTINUAT l0N\TVARS CONTCFH)) _ (TEMPLOC (CONTINUATION\D£P CONTCFH))) (ENVCARCUR CENV RNL)) (ENVCARCDR CENV RNL)) PROGNAME BLOCKFNS CENV FNS (LAMBDA (CODE FNS) (C (LAHBDACATE (LIST (CONTINUATION\VAR CONTCFH)) (CONTINUATION\TVARS CONTCFM) (CONTINUATION\DEP CONTCFM) (LIST (OUTPUT-ASET (CASET\VAR CFM) (Looxuvxcare RNL souv)) (REMARK-ON (CAS£T\CONT crM)) ' iifuvli cone) FNS))) (ANALYZE (cAs£T\c0NT CFM) RNL PROGNAME BLOCKFNS FNS (LAMBDA (CONT FNS) (C '(PROGN (SETO **FUN** ,CONT) (SETO ,(CAR **ARGUMENT-REGISTERS**) ,(OUTPUT-ASET (LOOKUPICATE BODY)) (RETURN n1L)) FNS)))))))))) (CASET\VAR CFM) RNL)

[Page 228]

218 PRODUCE-LABELS takes an already-compiled body LBOD. FNENV-FIX is a (possibly empty) list of pieces of code which will fix up the consed environment by adding the variables common to all the closures to be made up (this set was computed by CLOSE-ANALYZE and put in the FNENV slot of the CLABELS). The code for this addition is built from the list of variables by CONS-CLOSEREFS.

There are then three cases, depending on the type of closures to be constructed (NOCLOSE, EZCLOSE, or NIL). Suppose that the CLABELS is:

(CLABELS ((FOO (LAMBDA ...)) (BAR (LAMBDA ...))) <body>) Let us see roughly what code is produced for each case.

For a NIL type (full closures), the idea is merely to create all the closures in standard form (but with a null environment), add them all tx: the consed environment, and then go back and clobber the environment portion of the closures with the new resulting environment, plus any other variables needed.

Now a standard closure looks like (CBETA <value of progname> <tag> .

<environment>). (At run time the value of the progname will be a MacLISP SUBR pointer for the module; the tag identifies the particular routine in the lnodule.) In the DO loop, FNS accumulates the function definitions (to be compiled separately later), RP accumulates RPLACD forms for clobbering the closures, and CB accumulates constructors of CBETA lists. For our example, the generated code looks like:

((LAMBDA (FOO BAR) (SETQ **ENV** (CONS ... (CONS X43 **ENV***)...)) (RPLACD (CDDR BAR) (CONS ... (CONS X72 **ENV**)...)) (RPLACD (CDDR FOO) (CONS ... (CONS X69 **ENV**)...)) <body>) (LIST 'CBETA ?-453 'FOO-TAG) (LIST 'CBETA ?-453 'BAR-TAG)) where ?-453 is the PROGNANE for the module containing the CLABELS, and FOO-TAG and BAR-TAG are the tags (whose names will actually look like FNVAR-91) for FOO and BAR. (Now in fact CLOSE-ANALYZE creates a null FNENV for type NIL CLABELS, and so the first SETQ would in fact not appear. However, the decision as to the form of the FNENV is only a heuristic, and so PRODUCE-LABELS is written so as to be prepared for any possible choice of FNENV and CLOSEREFS of individual labelled functions. In this way the heuristic in CLOSE-ANALYZE can be freely adjusted without having to change PRODUCE-LABELS.) For the EZCLOSE case the "closures" need only contain environments, not also code pointers. A trick is needed here, however, to build the circular environment. when adding the labelled functions to the environment, vue must somehow 'cons in an object; but we want this object to possibly be the environment itself! what we do instead is to make up a list of the tag, and later RPLACO this list cell with the environment. The tag is never used, but is useful for debugging. This method also makes the code very similar to the NIL case, the only difference being that the atom CBETA and the value of the PROGNAHE are not consed onto each closure.

[Page 229]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 016 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 (DEFINE PRODUCE~LABELS (LAMBDA (CNODE LBOD RNL PROGNAME BLOCKFNS FNS C) (LET ((CFM (CNOD£\CFORM CNOUE))) (LET ((VARS (CLABELS\FNVARS CFM)) (UEFS (CLABELS\FNDEFS CFH)) (FNENV (CLABELS\FNENV CFM))) (LET ((FNENV-FIX (IF FNENV (EOCASE 8A§9}I_§§§__9§Ll§/ElLR99=v5] "((szro -*env-f ,(cons-cLoseR:rs rneuv nuL)))))) (CLABELS\EASY crm) (NIL (no ((v vnns (con v)) (u ness (con n)) (FNS FNS (CONS (LIST PROGNAME (CAR U) NIL) FNS)) (RP NIL (CONS '(RPLACD (CDDR .(CAR V)) ,(CONS-CLOSEREFS (CLAMBDA\CLOSEREFS (CNOUE\CFORM (CAR D))) RNL)) )) ~(Lxs1 'ce£TA ,wnosunnz ',(CAR v)) cs))) RP (CB NIL (CONS ((NULL_V) (C "((LAMBDA ,VARS EFNENV-FIX ERP !(o£PRoGn1rY LBOD)) e(NR£v£ns£ ce)) .

FNS)))) (DO ((V VARS (CDR V)) (D DEFS (CDR D)) (FNS FNS (CONS (LIST PROGNAME (CAR U) NIL) FNS)) (RP NIL (CONS '(RPLACD .(CAR V) ,(CONS-CLOSEREFS (CLAMBDA\CLOSEREF$ (CNOD£\CFORH (CAR D))) RNL)) (EZCLOSE RPN (CB NIL (CONS "(LIST T,(CAR V)) CB))) ((NULL V) (C "((LAHBDA ,VARS !(DEPROGNIFY LBOD)) erusnv-rlx env s(nn£vERs£ ca)) FN5)))) (NOCLOSE (c "(Puocn erN£nv~r|x p(nePnoen1rY LBOD)) (oo ((v vAns (con v)) (n ncrs (con n)) (rns rns (cons (LIST Pnoeunnz (cAn u) nnL) rns))) ((NULL V) FNS))))))))))

[Page 230]

ZZO One problem is that these "closures" are not of the same form as ordinary EZCLOSE closures, which do not have the tag. This is the purpose of the LABELS-FUNCTION properties which BIND-ANALYZE created; when a call to an EZCLOSE function is generated, the presence of a LABELS-FUNCTION property indicates that the "closure" itself is not the environment, but rather its cdr is. (It would be possible to do without the cell containing the tag, by instead making up the environment with values of NIL, then constructing the "closures" as simple environments, and then going back and clobbering the environment structure with the closure objects, rather than clobbering the closure objects themselves. The decision not to do this was rather arbitrary.) The generated code for the EZCLOSE case thus looks like:

((LAMBDA (FOO BAR) (SETQ **ENV** (CONS ... (CONS X43 **ENV***)...)) (RPLACD (CDDR BAR) (CONS ... (CONS X72 **ENV**)...)) (RPLACD (CDDR FOO) (CONS ... (CONS X69 **ENV**)...)) <body>) (LIST 'FOO-TAG) (LIST 'BAR-TAG)) In the NOCLOSE case, no closures are made at run time for the labelled functions, and so the code consists merely of the FNBNV-FIX (which, again, using the current heuristic in CLOSE-ANALYZE will always be null in the NOCLOSE case) and the code for the body:

(PROGN (SETQ **ENV** (CONS (CONS X43 **ENV**)...)) <body>) In any case, of course, the labelled functions are added to the FNS list which is handed back to C for later compilation.

PRODUCE-LAMBDA-COMBINATION generates code for the case of ((CLAHBDA ...) argl argn). First a number of consistency checks are performed, to make sure the pass-2 analysis is not completely awry. Then code is generated for the body of the CLANBDA, using COMP-BODY. Then all the arguments, which are of course expected to produce values, are given to MAPANALYZE, which will call ANALYZE on each in turn and return a list of the pieces of generated code (here called ARGS in the continuation handed to MAPANALYZE). Finally, LAMBDACATE is called to generate the code for entering the body after setting up the arguments in an appropriate manner. Notice the use of SET-UP-ASETVARS to generate any necessary additional code for adding ASETVARS to the consed environment on entering the body. (A more complicated compiler would in this situation add the argument values to the consed environment directly, rather than first putting them in registers (which is done by LAMBDACATE) and then moving the registers into the consed environment (which is done by SET-UP-ASETVARS). To do this, however, would involve destroying the modular distinction between LAMBDACATE and SET-UP-ASETVARS. The extra complications were deemed not worthwhile because in practice the ASETVARS set is almost always empty anyway.)

[Page 231]

001 002 003 006 005 006 D07 006 009 010 D11 012 013 014 015 016 017 D18 019 020 021 U22 023 D24 025 026 027 026 029 030 031 032 033 D34 035 036 037 038 039 F9991Tm§§§_vQ§l!§Z?§_d§°9° B?

(DEFINE PRODUCE-LAMBDA-COHB[NATION (LAMBDA (CNODE RNL PROGNAME BLOCKFNS CENV FNS C) (LET ((CFM (CNO0E\CFORM QNonE))) (LET ((FN (CNODE\CFORM (CAR (CCOMBlNATION\ARGS CFM))))) (AND (CLAMBDA\CLOSEREFS FN) (ERROR '1Functiona\ LAMBDA has CLOSEREFSI CNODE 'FAIL-ACT)) (OR (EQUAL CENV (CLAM8DA\CONSENV FN)) (ERROR '1Environment disagreementl CNODE 'FAIL-ACT)) (OR (EO (CLAMBDA\FNP FN) 'NOCLOSE) (ERROR '1Non-NOCLOSE LAMBDA In funciion posiiionl CNODE 'FAIL-ACY)) (COMP-BODY (CLAMBOA\BODY FN) _ (ENVCARCDR (CLAM8DA\ASETVARS FN) (REGSLIST FN T (ENVCARCDR CENV RNL))) PROGNAME BLOCKFNS (APPEND (CLAHBDA\ASETVARS FN) CENV) FNS (LAMBDA (BODY FNS) (MAPANALYZE (CDR (CCOHBINATl0N\ARGS CFM)) RNL PROGNAME BLOCKFNS FNS (LAMBDA (ARGS FNS) (C (LAMBDACATE (CLAMBDA\VARS FN) (CLAMBDA\TVARS FN) (CLAMBDA\DEP FN) ARGS (REMARK-ON (CAR (CCOMBlNATl0N\ARGS '\uENVn1 (SET-UP-ASETVARS BODY (CLAMBDA\ASETVARS FN) FNS))))))))) (REGSLIST FN NIL NlL))) CFM)))

[Page 232]

222 PRODUCE-TRIVFN-COMBINATION handles a case like (CONS continuation argl argZ), i.e. a CCONBINATION whose function position contains a TRIVFN. First all the arguments (excluding the continuation!) are given to MAPANALYZE; then a dispatch is made on whether the continuation is a CONTINUATION or a CVARIABLE, and one of two specialists is called.

PRODUCE-TRIVFN-COMBINATION-CONTINUATION handles a Case like (CONS (CONTINUATION (Z) <body>) argl argZ). The idea here is to compile it approximately as if it were ((CONTINUATION (Z) <body>) (CONS argl arg2)) That is, the arguments are evaluated, the trivial function is given them to produce a value, and that value is then given to the continuation. Accordingly, the body of the CONTINUATION is compiled using COMP-BODY, and then LAMBDACATE takes care of setting up the argument (the fourth argument to LAMBDACATE is a list of the NacLISP code for invoking the trivial function) and invoking the body of the (necessarily NOCLOSE) CONTINUATION.

[Page 233]

001 002 003 004 005 006 007 008 009 010 011 012 013 016 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 PABBII.§§5n_9§l}5£Z§NwH°9¢ Q9 (DEFINE PRODUCE-TRIVFN-COMBINATION (LAMBDA (CNODE RNL PROGNAME BLOCKFNS CENV FNS C) (LET ((CFM (CNODE\CFORM CNODE))) (LET ((FN (CNODE\CFORM (CAR (CCOHBINATION\ARGS CFM)))) (CONT (CNODE\CFORM (CADR (CCOHBlNATION\ARGS CFH))))) (MAPANALYZE (CDDR (CCOHBINATION\ARGS CFH)) RNL PROGNAHE BLOCKFNS FNS (LAMBDA (ARGS FNS) (EOCASE (TYPE CONT) (CONTINUATION (PRODUCE-TRIVFN-COMBINATION-CONTINUATION CNODE RNL PROGNAHE BLOCKFNS CENV FNS C CFM FN CONT ARGS)) ' (CVARIABLE (PRODUCE-TRIVFN-COHBINATION-CVARIABLE CNODE RNL PROGNAHE BLOCKFNS CENV _ FNS C CFM FN CONT ARGS))))))))) (DEFINE PRODUCE-TRIVFN-COMBINATION-CONTINUATION (LAMBDA (cnooe nnL Pnosunne eLocxrns ceuv sus c crn rn conf ARGS) (aLocx (AND (CONTINUATION\CLOSEREFS conr) (ennon 'ICONTINUATION for rnxvrn has cLoseners1 cnooe -sA1L-Aer); (on (so (CONTINUATION\FNP coury 'nocLose> (ennon ~|n°n-uocLose courxnunrnou for rn|vrn| cnonz 'FAIL-ACT)) (COMP-BODY (CONTINUATION\BODY CONT) (IF (CONTINUATION\TVARS CONT) (CONS (CONS (CAR (CONTINUATION\TVARS CONT)) (TEHPLOC (CONTINUATION\0EP CONT))) (ENVCARCDR CENV'RNL)) (envcnncon cenv RNL)) vnoeunne BLOCKFNS cenv rns (LAMBDA (aouv sus) (C (LAMBDACATE (LIST (CONTINUATION\VAR CONT)) (CONTINUATION\TVARS CONT) (CONTINUATION\D£P conf) (LIST ~(,(vAn1AnLe\vAu (NODE\FORH (TRIVIAL\NOUE rn))) - eAa@s)) <nsnAnx-on (cnun (CCOMBINATION\ARGS crn))) ' iifuvii aouv) FNS))))))

[Page 234]

224 PRODUCE-TRIVFN-COMBINATION-CVARIABLE handles a case like (CONS CONT-43 argl arg2), where the continuation for a trivial function call is a CVARIABLE.

In this situation the continuation is given to ANALYZE to generate MacLISP code for referring to it; 'there are then two cases, depending on whether the CVARIABLE has a KNOWN-FUNCTION property. (Note that before the decision is made, VAL names the piece of MacLISP code for calling the trivial function on the arguments.) If the CVARIABLE denotes a KNOWN-FUNCTION, then it should be possible to invoke it by adjusting the environment, setting up the arguments in registers, and jumping to the code. First the environment adjustment is computed; ADJUST-KNOWNFN-CENV generates a piece of MacLISP code which will at run time compute the correct new environment in which the continuation will expect to run. There are then two subcases, depending on whether the KNOWN-FUNCTION is of type NOCLOSE or not. If it is, then LAMBDACATE is used to set up the arguments in the appropriate registers (the last argument of NIL indicates that there is no "body", but rather that the caller of LAMBDACATE takes the responsibility' of jumping to the code). If it is not, then PSETQIFY is used, because the value will always go in **ONE** (which is the car of **ARGUMENT-REGISTERS**). In either case, a GO is generated to jump to the code (within the current module, of course) for the continuation.

If the continuation is not a KNOWN-FUNCTION, then the standard function linkage mechanism is used: the continuation is put into **FUN**, the value into **ONE**, and then (RETURN NIL) exits the module to request the SCHEME run-time interface to invoke the continuation in whatever manner is appropriate.

[Page 235]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 . 903.1 T_§§§_L_2§Z_1_5(_Z§ L,_f_!9& §9 (DEFINE PRODUCE-TRIVFN-COMBINATION-CVARIABLE (LAMBDA (CNODE RNL PROGNAHE BLOCKFNS CENV FNS C CFM FN CONT ARGS) (ANALYZE (CADR (CCOMBlNATl0N\ARGS CFM)) RNL PROGNAME BLOCKFNS FNS (LAMBUA (coutr rus) (LET ((xf (GET (CVARIABL£\VAR cout) 'KNOWN-FUNCTION)) (vAL "(,(VARIABLE\VAR (NODE\FORM (TRIVIAL\NODE rn))) eAnss))) (IF KF (LET ((xcrn (CNODE\CFORH xr))) (LET ((£uvAoJ (ADJUST-KNOHNFN-C£NV CENV (CVARIABL£\VAR CONT) CONTF (CONTlNUATl0N\FNP KCFM) (APPEND (CONTlNUATl0N\CLOSEREFS KCFH) (CONTINUATl0N\CONSENV KCFH))))) (c '(PnoGn !(lF (EQ (CONTlNUATl0N\FNP KCFH) 'NOCLOSE) (DEPROGNIFY (LAMBDACATE (LIST (CONTINUATION\VAR KCFM)) (CONTlNUATl0N\TVARS KCFM) (CONTINUATl0N\D£P KCFH) (LIST vAL) (REMARK-ON xr) - ENVADJ NlL)) (PSETOIFY (LIST envnna VAL) (LIST '**£NVi* (CAR ¢#ARGUHENT-R£Gl$T£RS¢¢)))) (Go ,(CONTINUATl0N\NAH£ xcfn))) FNS))) (c "(PROGN (sera ffrun-~ ,coN1r) (SETO ,(CAR *IARGUMENT-REGISTERS*¢) ,VAL) (RETURN n1L)) FNS)))))))

[Page 236]

226 PRODUCE-COMBINATION handles combinations whose function positions contain neither TRIVFNs nor CLAMBDAS. All of the arguments, including the function position itself and the continuation, are given to MAPANALYZE, resulting in a list FORM of pieces of MacLlSP code. There are then two cases. If the function position is a VARIABLE (within a TRIVIAL - not a CVARIABLE!), then PRODUCE-CONBINATION-VARIABLE is used. Otherwise code is generated to use the standard SCHENE run-time interface: first set **FUN** to the function, then set up the arguments in the standard argument registers (PSETQ-ARGS generates the code for this), then set **NARGS** to the number of arguments (this does not include the continuation), and exit the module with (RETURN NIL).

PRODUCE-COMBINATION-VARIABLE first determines whether the variable has a KNOWN-FUNCTION property. If so, then the approach is very much as in TRIVFN-COMBINATION-CVARIABLE: first the environment adjustment is computed, then either LANBDACATE or PSETQ-ARGS-ENV is used to adjust the environment and set up the arguments, and finally a GO to the piece of code for the KNOWN-FUNCTION is generated.

If the variable is not a KNOWN-FUNCTION, then it may still be in the list BLOCKFNS (which, recall, is a list of user functions included in this module).

If so, the effect on the code generation strategy is roughly as ii' it were a KNOWN-FUNCTION. The environment adjustment is done differently, but a GO is generated to the piece of code for the called function.

In any other case, the standard interface is used. **FUN** is set to the function, the arguments are set up, **NARGS** is set to the number of arguments, and (RETURN NIL) exits the module.

[Page 237]

001 002 003 004 005 006 007 008 009 010 011 OIZ 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 (DEFINE PRODUCE-COMBINATION B8BBlIM§§§_EQ§!l§ZZ9__R!9SL§l (LAMBDA (CNOUE RNL PROGNAME BLOCKFNS CENV FNS C) (MAPANALYZE (CCOMBlNATl0N\ARGS (CNODE\CFORM CNODE)) RNL PROGNAME BLOCKFNS FNS (LAMBDA (FORM FNS) (C (LET ((F (CNODE\CFORM (CAR (CCOMBINAT10N\ARGS (CNODE\CFORM CNODE)))))) (IF (AND (EO (TYPE F) 'TRIVIAL) (EQ (TYPE (NODE\FORM (TRIVIAL\NODE F))) 'VARIABLE)) (LET ((V (VARIABL£\VAR (NODE\FORM (TRIVIAL\NODE F))))) (PRODUCE-COMB[NATION-VARIABLE CNODE RNL PROGNAME BLOCKFNS CENV FNS C FORM V (GET V 'KNOWN-FUNCTION))) '(PROGN (SETO **FUN*l ,(CAR FORM)) C(PSETO-ARGS (CDR FORH)) (SETO **NARGS** ',(L£NGTH (CDDR FORM))) (RETURN n1L)))) FHS))))) (DEFINE PRODUCE-COMBINATION-VARIABLE (LAMBDA (CNODE RNL PROGNAME BLOCKFNS CENV FNS C FORM V KFN) (IF KFN (LET ((ENVADJ (ADJUST-KNOVNFN-CENV CENV V , (CAR FORM) (CLAMBDA\FNP (CNODE\CFORM KFN)) (APPEND (CLAHBDA\CLOSEREFS (CNODE\CFORM KFN)) ' (CLAMBDA\CONSENV (CNODE\CFORM KFN)))))) (OR (EQ (TYPE (CNODE\CFORM KFN)) 'CLAM8DA) (ERROR '1Known function not CLAMBDA1 CNODE 'FAIL-ACT)) "(PROGN P(IF (EO (CLAMBDA\FNP_(CNODE\CFORM KFN)) 'NOCLOSE) (IF (ASSO V "(PROGN "(PnosN (DEPROGNIFY (LAMBDACATE (CLAMBDA\VARS (CNODE\CFORM »<F~>) .

(CLAMBDA\TVARS (CNODE\CFORM KFN)) (CLAMBUA\DEP (CNODE\CFORM KFN)) (con FORM) (REMARK-on nrn) zuvnna . NlL)) (PSETO~ARGS-ENV (CDR FORM) £NVADJ)) (so ,(CLAM8DA\NAME (CNODE\CFORM xrn))))) BLOCKFNS) @(PSETO-ARGS (con ronM)) @(lF (NOT (eouAL (cLAMaoA\cons£nv (cNooe\cronn (cAon (Asso v BLOCKFNS)))) C£NV)) "((s£To ~~£Nvf~ (CDDDR ,(cAR ronM))))) (so ,(cLAMaoA\NAM£ (CNODE\CFORM (cunn (Asso v (SETO ¢»ruuff ,(cAR ronM)) @(PSETO-ARGS (con FORM)) (sero **NARGS** ',(LENGTH (cook FORM))) (narunu n1L)))))) BLOCKFNS))))))

[Page 238]

Z28 ADJUST-KNOWNFN-CENV computes a piece of code for adjusting the environment. CENV is the internal representation (as a list of variable names) of the environment in which the generated code will be used. VAR is the name of the variable which names the function to be invoked, and for whose sake the environment is to be adjusted. VARREF is a piece of MacLISP code by which the run-time value of VAR may be accessed. FNP is the FNP type of the KNOWN-FUNCTION denoted by VAR. LCENV is the representation of the environment for the function.

Thus, the generated code should compute LCENV given CENV.

The two easy cases are when LCENV=CENV, in which case the environment does not change, and when LCENV=NIL, in which case the run-time environment will also be NIL. Otherwise it breaks down into three cases on FNP.

For FNP=NOCLOSE, it must be true that LCENV is some tail of CENV; that is, there is a stack-like discipline for NOCLOSE functions, and so CENV was constructed by adding things to LCENV. The piece of code must therefore consist of some number of CDR operations on **ENV**. If this operation does not in fact produce LCENV, then there is an inconsistency in the compiler.

For FNP=EZCLOSE, then VARREF can be used to reference the run-time "closure"; this may require a CDR operation if the function is an EZCLOSE LABELS-FUNCTION (see PRODUCE-LABELS).

For FNP=NIL, then VARREF will refer to a full closure; the CDDDR of this closure is the environment.

PRODUCE-CONTINUATION-RETURN is, mutatis mutandis, identical to PRODUCE-LAMBDA-COMBINATION. This is a good example of the fact that much code was duplicated because of the early design decision to treat COMBINATION and RETURN as distinct data types.

[Page 239]

O01 002 O03 004 005 006 007 008 009 010 011 012 O13 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 O42 043 044 045 046 047 O48 049 050 051 052 053 054 055 056 057 058 059 060 86Q§!1 5§9_ 05/15/76 PPO* 5?

(nerxue Aoausr-xnovnrn-ceuv (LAMBDA (CENV vAn vAnner rnv LCENV) (COND ((£ouAL Lcenv CENV) '»»env¢») ((NULL LCENV) 'NIL) (1 (EOCASE run (NOCLOSE (no ((x cenv (con x)) (Y '¢~euv»» "(COR ,v)) (1 (- (LENGTH ceuv) (LENGTH LCENV)) (- 1 1))) ((< I 1) (IF (EOUAL X LCENV) (OECARCORATE Y) (ERROR '|C|nnot recover environment for known functionl VAR 'FAIL-ACT))))) (EZCLOSE (IF (GET VAR 'LABELS-FUNCTION) '(COR ,VARREF) VARREF)) (NIL "(COOOR ,VARREF))))))) (OEFLNE PRODUCE-CONTINUATION-RETURN (LAMBDA (CNOOE RNL PROGNAHE BLOCKFNS CENV FNS C) (LET ((CFM (CNOOE\CFORH CNODE))) (LET ((FN (CNOOE\CFORM (RETURN\CONT CFM)))) (AND (CONTlNUATl0N\CLOS£R£FS FN) (ERROR 'lfunctional CONTINUATION has CLOSEREFSI CNOOE 'FAIL-ACT)) (OR (EQUAL CENV (CONTINUATION\CONSENV FN)) (ERROR '1Environment disagreementl CNOOE 'FAIL-ACT)) (OR (EO (CONTINUATION\FNP FN) 'NOCLOSE) (ERROR '1Non-NOCLOSE CONTINUATION in function positionl CNOOE 'FAIL-ACT)) (COMP-BODY (CONTINUATION\BOOY FN) (IF (CONTINUATION\TVARS FN) ' (CONS (CONS (CAR (CONTINUATION\TVARS FN)) (TEHPLOC (CONTINUATION\OEP FN))) (ENVCARCOR CENV RNL)) (ENVCARCOR CENV RNL)) PROGNAME BLOCKFNS CENV rns (LAMBDA (BODY rus) (ANALYZE (RETURN\VAL CFM) RNL PROGNAHE BLOCKFNS rus (LAMBDA (VAL rss) (c (LAHBOACATE (LIST (CONTINUATION\VAR rn)) (CONTlNUAT10N\TVAR$ rn) (CONTINUATION\OEP rn) (LIST VAL) (REMARK-ON (RETURN\CONT crn)) 'eagnvff BODY) FNS)))))))))

[Page 240]

230 PRODUCE-RETURN and PRODUCE-RETURN-1 together are almost identical to PRODUCE-COMBINATION and PRODUCE-COMBINATION-VARIABLE, except that the division between the two parts is different, and the BLOCKFNS trick is not applicable to RETURN.

PRODUCE-RETURN merely calls ANALYZE on each of the continuation and the value, and calls PRODUCE-RETURN-1.

PRODUCE-RETURN-l checks to see whether the continuation is a KNOWN-FUNCTION. If so, the environment adjustment is computed, and code is generated in a way similar to previous routines. If not, the standard interface (involving (RETURN NIL)) is used. Notice the check to see if VAL is in fact **ONE** (the car of **ARGUMENT-REGISTERS**); if so, the redundant code (SETQ **ONE** **ONE**) is suppressed.

[Page 241]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 (DEFINE Pnonuce-RETURN (LAMBDA (cuooe RNL Pno (LET ((crM (cn (ANALYZE (DEFINE PRODUCE-RETURN-1 ¥"5§LI_§§° Q5/15lZ§,_E!9§,§§ GNAME BLOCKFNS ceuv rus c) oo£\cronM cNon£))) (n£runN\vAL CFM) RNL vnosNAn£ BLOCKFNS rns (LAMBDA (VAL rss) (AnALYz£ (R£tuRu\coNr CFM) RNL Pnoeunnf BLOCKFNS rns (LAMBDA (CONT FNS) (rnooucz-n£1unn-1 CNODE RNL PROGNAHE BLOCKFNS CENV FNS C CFM VAL CONT)))))))) (LAMBDA (CNODE RNL PROGNAME BLOCKFNS CENV FNS C CFM VAL CONT) (IF (Ano (eo ( (GET (LET ((KCF (OR 2 (LET (c '(Pnoeu FNS)))) TYPE (CNODE\CFORM (RETURN\CONT CFH))) 'CVARlABLE) (CVARlABLE\VAR (CNODE\CFORH (RETURN\CONT CFH))) 'KNOWN-FUNCTION)) H (CNODE\CFORM (GET (CVARlABLE\VAR (CNO0E\CFORM (RETURN\CONT CFH))) 'KNOWN-FUNCTl0N)))) E0 (TYPE KCFH) 'CONT|NUATION) ERROR '1Known function not CONTXNUATIONI CNODE 'FAIL-ACT)) ((ENVADJ (ADJUST-KNOVNFN-CENV CENV (CVARIABLE\VAR (CNODE\CFORH (RETURN\CONT CFH))) CONT (CONTINUATION\FNP KCFH) (npvenn (CONTlNUATl0N\CLOSEREFS KCFM) (CONTlNUATl0N\CONSENV KCFM))))) (C '(PROGN !(lF (EQ (CONTINUATl0N\FNP KCFH) 'NOCLOSE) (DEPROGNIFY (LAMBDACATE (LIST (CONTINUATl0N\VAR KCFM)) (CONTINUATION\TVARS KCFM) (CONTINUATl0N\DEP KCFM) (LIST VAL)'

(REMARK-ON (GET (CVARIABLE\VAR (CNODE\CFORM (RETURN\CONT CFM))) 'KNOWN-FUNCTl0N)) ENVADJ NlL)) (PSETQIFY (LIST ENVADJ VAL) (LIST '*iENV¢¢ (CAR **ARGUHENT-REGISTERS**)))) (GO ,(CONTINUATl0N\NAME KCFM))) FNS))) (SETO **FUN** ,CONT) e(lr (NOT (eo VAL (CAR **ARGUMENT-REGlSTERS¢¢))) '((SETO .(CAR ¢*ARGUHENT-REGlSTERS¢*) ,vAL))) (RETURN NIL))

[Page 242]

232 LAMBDACATE generates code for invoking a NOCLOSE KNOWN-FUNCTION. It arranges for the arguments to be evaluated and put in the proper registers, and also performs some optimizations.

VARS is a list of the variables which are to be bound. TVARS is a list of those variables (a subset of VARS) which will actually be passed through registers, as specified by the TVARS slot of the CLAMBDA or CONTINUATION; this is used for a consistency check on the optimizations of LANBDACATE. DEP is the register depth of the function (the DEP slot). ARGS is a list of pieces of MacLISP code which have been generated for the arguments to the function. REM is a comment (usually one generated by REMARK-ON) to be included in the generated code for debugging purposes; this comment typically details the state of the environment and what variables are being passed through registers at this point.

ENVADJ is a piece of MacLISP code (usually generated by ADJUST-KNOWNFN-CENV) to whose value HENVM is to be set, to adjust the environment. BODY may be a list of pieces of l'lacLISP code which constitute the body of the known function, to be executed after the arguments are set up (typically because of a combination like ((LAMBDA ...) ...)), or it may be NIL, implying that the caller of LAMBDACATE intends to generate a GO to the code.

LAMBDACATE divides ARGS into three classes: (1) arguments which are themselves NOCLOSE KNOWN-FUNCTIONs such arguments actually have no actual run-time representation as a MacLISP data object, and so are not passed at all; (Z) arguments whose corresponding variables are never referenced these are accumulated in EFFARGS, a list of arguments to be evaluated for effect only (presumably the optimizer eliminated those unreferenced arguments_which had no side effects); and (3) arguments whose values are needed and are to be passed through the registers these are accumulated in REALARGS, and the corresponding variables in REALVARS. when this loop is done, (the reverse of) REALVARS should equal TVARS, for it is the set of actually passed arguments.

The generated code first evaluates all the EFFARGS (if any), then sets all the proper registers to the REALARGS (this code is generated by PSETQ-TEMPS), then (after the remark REM) executed the BODY (which, if NIL, is empty).

For example, consider generating code for:

((LAMBDA (F A B) (F A) ...) (LAMBDA (X) ...) (CONS X Y) (PRINT Z)) where F denotes a NOCLOSE KNOWN-FUNCTION, and B is never referred to. Then the call to LAMBDACATE might look like this:

[Page 243]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 F659II_§§§__95Ll§lZ§_LE!99L§£ ;;; HANDLE CASE OF INVOKING A KNOWN NOCLOSE FUNCTION OR CONTINUATION.

FOR AN EXPLICIT ((LAMBDA ... BODY) ...), BODY IS THE BODY.

232 OTHERWISE, IT IS NIL, AND SOMEONE VILL DO AN APPROPRIATE GO LATER.

(DEFINE LAHBDACATE ~ (LAMBDA (vAns 1vARs uev ARGS n (LABELS ((LOOP (LAMBDA (v A ;:RE (IF (LOOP VARS ARG EH ENVADJ BODY) REALVARS REALARGS EFFARGS) ALVARS IS COMPUTED PURELY FOR.ERROR-CHECKING (NULL A) (LET ((8 '(PROGN !(PSETO-TEMPS (NREVERSE REALARGS) DEP ENVADJ) ,REM !(DEPROGNIFY BODY))) (RV (NREVERSE REALVARS))) (IF (NOT (EQUAL RV TVARS)) (ERROR °1TVARS screwup in LAMBDACATEI '((VARS = .VARS) (TVARS = ,TVARS) (REALVARS = ,RV)) 'FAIL-ACT)) (IF EFFARGS '(PROGN DEFFARGS !(DEPROGNIFY B)) B)) (COND ((L£r ((xru (ser (CAR v) 'KNOWN-FUNCTION))) (AND urn (so (EOCASE (TYPE (CNO0E\CFORM KFN)) (CLAMBDA (CLAMBDA\FNP (CNO0E\CFORH KFN))) (CONTINUATION (CONTINUATION\FNP (CNODE\CFORH xrn)))) 'NOCLOSE))) (LOOP (con v) (con A) n£ALvAns REALARGS EFFARGS)) ((on (GET (CAR v) 'neAn~n£rs) (GET (CAR v) 'WRITE-REFS)) (LOOP (CDR v) (con A) (cons (CAR v) n£ALvAas) (cons (CAR A) neALAnss) EFFARGS)) (T (LOOP (CDR v) (con A) REALVARS n£ALAnes (CONS (CAR A) £frAnss)))))))) s NIL NIL NlL))))

[Page 244]

234 (LAHBDACATE '(F A B) '(»\) IZ '(<illegal) (CONS X43 V69) (PRINT Z91)) (remark) afiuvna <body)) where <illegal> is an object that should never be looked at (see ANALYZE-CLAMBDA); X43. Y69, and Z91 are pieces of code which refer to the variables X, Y, and Z; (remark) is some remark; the environment adjustment is assumed to be trivial; and <body> is the code for the body of the LAMBDA. The generated code would look something like this:

(PROGN (PRINT Z91) (SETQ -lZ- (CONS X43 Y69)) <remark> <body>) Notice that LAMBDACATE explicitly takes advantage of the fact that the execution of arguments for a combination may be arbitrarily reordered.

The various PSETQ,.. routines generate code to perform Parallel SETQS, i.e. the simultaneous assignment of several values to several values. The parallel nature is important, because some of the values may refer to other registers being assigned to, and a sequential series of assignments might not work.

The main routine here is PSETQIFY, which takes a list of arguments (pieces of NacLISP code which will generate values when executed at run-time) and a list of corresponding registers. One of two different methods is used depending on the number of values involved, Method Z produces better code (this is obvious only when one understands the properties of the HacLISP compiler which will compile the 1'IacLISP code into PDP-10 machine language). Unfortunately, it happened that when RABBIT was written there was a bug in the MacLISP compiler such that it often found itself unable to compile the code generated by Method Z.

Moreover, the primary maintainer of the HacLISP compiler was on leave for a year.

For this reason Method 3 was invented, which always works, but is considerably more expensive in terms of the PDP-10 code produced. (I concerned myself with this low level of detail only for this routine, because the code it produces is central to the whole code generator, and so its efficiency is of the greatest importance.) In order to achieve the best code, I determined empirically that Method 2 never failed as long as fewer than five values were involved. I might also add that a Method I was once used, which happened to provoke a different bug in the I'lacLISP compiler; Method Z was invented in an attempt to circumvent that first bug! Now that the maintainer of the MacLISP compiler (Jon L White) has returned, it may soon be possible to remove Method 3 from RABBIT; but I think this story serves as an excellent example of pragmatic engineering to get around immediate obstacles (also known as a "kludge').

[Page 245]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 RAQQIT ssa 05/15/7§ Qgge sg 322 GENERATE PARALLEL SETO'ING OF REGISTERS TO ARGS.

32; RETURNS A LIST OF THINGS; ONE HRITES !(PSETOIFY ...) VITNIN '.

(DEFINE Pserolrv (LAMBDA (Anas REGISTERS) (IF (< (LENGTH ARGS) 5) (vsarolrv-Meruou-2 ARGS REGISTERS) (PSETOIFY-METHOD-3 Anas RE6I$T£RS)))) (DEFINE PSETOIFY-METHOD-2 (LAMBDA (Anas REGISTERS) (LABELS ((PS£TO1 (LAMBDA (A ness ovAns seros USED) (xr (NULL A) (IF (NULL SETOS) NIL (IF (NULL (CDR SETOS)) "((s£ro ,(CAUAR SETOS) ,(cAN USEO))) ;;IMPORTANT: oo NOT NREVERSE rue sevose ;;HAKES HACLISP COMPILER NxN Barren.

'(((LANBDA ,(NREVERSE ovANs) QSETOS) Q( NREVERSE USED))))) (IF (so (CAR A) (CAR R£GS)) ;AVOID USELESS seTO°s (PSETOI (CDR A) (con ness) ovAns seros usso) ((LAMBDA (ov) (PSETO1 (con A) (con ness) (CONS OV OVARS) (CONS '(S£TO .(CAR REGS) ,0V) SETOS) (CONS (CAR A) USEU))) (GENTEHP '0))))))) (rserox ARGS Nesrsreas NIL NIL NIL))))

[Page 246]

Z36 Method Z essentially uses local HacLISP LAMBDA variables to temporarily name the values before assignment to the registers, while Method 3 uses global variables. (Method Z produces better code because the MacLlSP compiler can allocate the local variables on a stack, one by one, and then pop them off in reverse order into the 'registers".) Both methods perform two peephole optimizations: (1) If a value-register pair calls for setting the register to its own contents, that SETQ is eliminated. (2) If this elimination reduces the number of SETQs to zero or one, then NIL or a single SETQ is produced, rather than the more complicated and general piece of code.

As examples, (PSETQIFY '(-lZ- -lZ- (CDR -13-)) '(ll -lZ- -I3-)) would produce:

((LAMBDA (Q-43 Q-44) (SETQ 'l3- O-44) (SETQ ll Q-43)) -12-(CDR -13-)) (note that (SETQ -l2- -12-) was eliminated), and (PSETQIFY '(-Z3- -Zl- -Z4- -Z5- -ZZ-) '(-2l- -ZZ- -Z3- -Z4- -Z5-)) would produce:

(FROG () (DECLARE (SPECIAL -21--TEMP -ZZ~-TEMP -Z3--TEMP -Z4--TEMP -'25--TEMP) Z5 ZZ ) (SETQ TEMP - -(SETO TEMP -Z5-) (SETQ TEMP -24-) (SETQ TEMP -Zl~) (SETQ TEMP -Z3-) (SETQ -25--TEMP) (SETO -Z4- TEMP) (SETQ -23- TEMP) (SETQ -ZZ- TEMP) (SETQ -Zl- TEMP)) The only reason for using FROG is so that the DECLARE form could be included for the benefit of the MacLISP compiler.

The examples here are slightly incorrect; PSETQIFY actually produces a list of MacLlSF' forms, so that when no SETQS are produced the resulting NIL is interpreted ns no code at all.

In principle the elimination of redundant SETQS should be performed before choosing which method to use, so that there will be a maximal chance of using the more efficient Method 2. I chose not to only so that the two methods would remain distinct pieces of code and thus easily replaceable.

[Page 247]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 O15 016 017 018 019 020 021 022 023 024 025 R9B9!Ti!¥ _!§!15/7§L_E29§_§§ (DEFINE Pseroxrv-Nernou-3 (LAMBDA (ARGS REGISTERS) (LABELS ((PSET01 (LAMBDA (A ness OVARS seros USED) (IF (NULL A) (IF (NULL SETOS) NIL (IF (NULL (con s£ros)) "((S£T0 .(CADAR SETOS) ,(CADOR (CAR USED)))) "((PROG () (DECLARE (SPECIAL e0vAns)) luseo eseros) ))) (xr (eo (CAR A) (CAR REG$)) ;AV0lD USELESS s£ro's (PSET01 (CDR A) (con REGS) OVARS seros USED) ((LAM80A (ov) (PSETOI (con A) (con REGS) (CONS OV OVARS) (CONS "(SETO ,(CAR REGS) ,0V) S£TOS) (CONS "(SETO ,OV ,(CAR A)) US£U))) (CATENATE (CAR REGS) '1-TEMP1))))))) (PSET01 ARGS REGISTERS NIL NIL NlL))))

[Page 248]

238 PSETQ-ARGS is a handy routine which calls PSETQ-ARGS-ENV with an ENVADJ of **ENV**, knowing that later the redundant '(SETQ **ENV** **ENV**)" will be eliminated.

PSETQ-ARGS-ENV takes a set of arguments and an environment adjustment, and arranges to call PSETQIFY so as to set up the standard argument registers.

Recall that how this is done depends on whether the number of arguments exceeds **NUNBER-OF-ARG-REGS**; if it does, then a list of the arguments (except the continuation) is passed in **ONE**. **ENV+CONT+ARG-REGS**° is the same as **ARGUMENT-REGISTERS**, except that both the names **ENV** and **CONT** are adjoined to the front. It can be quite critical that **ENV** and the argument registers be assigned to in parallel, because the computation of the argument values may well refer to variables in the environment, whereas the environment adjustment may be taken from a closure residing in one of the argument registers.

PSETQ-TEMPS is similar to PSETQ-ARGS-ENV, but is used on registers other than the standard argument-passing registers. It takes ARGS and ENVADJ as before, but also a depth DEP which is the number of the first register to be assigned to. TEHPLOC is used to generate the register names, then **ENV** is tacked on and PSETQIFY does the real work.

MAPANALYZE is a simple loop which maps over a list of cnode-trees and calls ANALYZE on each. A list of the results returned by ANALYZE is given to C.

Also, FNS is chained through the calls to ANALYZE, so that all functions to be compiled later will have been accumulated properly.

[Page 249]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 (DEFINE Psero-Anas (LAMBDA (Anas) . (PSETO-ARGS-ENV Anss 'l#ENV°f)) (DEFINE PSETQ-ARGS-ENV (LAMBDA (ARGS ENVADJ) (IF () (LENGTH ARGS) (+ f¢NUMBE (PSETOIFY (LIST ENVADJ (CAR **ENV+CONT+ARG-RE (PSETOIFY (CONS ENVADJ ARGS (DEFINE Psero-temps (LAMBDA (ARGS DEP ENVADJ) (Do ((A ARGS (CDR A)) (J DEP (+ J 1)) gAgg;1vssa 05/1§/78 9.9; 57 )

R-OF-ARG-REGS*¢ l)) ARGS) (CONS 'LIST (CDR ARGS))) GSAA) ) **ElV+CONT+ARG-REGS**)))) (R NIL (CONS (TEHPLOC J) R))) ((NULL A) (PSETQIFY (CONS ENVADJ ARG (CONS '**ENV** ( (DEFINE MAPANALYZE , (LAMBDA (FLIST AAL PnosuAne BLOCKFNS rn (LABELS ((LooP (LAMBDA (r z FNS) (IF (NULL F) (c (uaevs (ANALYZE (LOOP rL1s1 NIL rns)))) S) nnsvfnss n))))))) S C) use z) rns) (CAR F) RNL PADGNAME BLOCKFNS rus (LAMBDA (STUFF rns) . (LOOP (CDR r) (CONS stuff z) FNS)))))))

[Page 250]

240 ANALYZE is the routine called to compile a piece of code which is expected to produce a value. ANALYZE itself is primarily a dispatch to specialists. For the "simple" case of a "trivial" form, TRIVIALIZE is used to generate the code. For the simple case of a CVARIABLE, ANALYZE simply uses LOOKUPICATE to get the code for the variable reference.

ANALYZE-CLANBDA has three cases based on FNP. For type NIL, code is generated to create a full closure of the form (CBETA (value of progname> <tag> .

<environment>). CONS-CLOSEREFS generates the code to add the CLOSEREFS to the existing consed environment for making this closure. For type EZCLOSE, just the environment part is created, again using CONS-CLOSEREFS. For type NOCLOSE, the generated "code" should never be referenced at all it is not even passed as an argument as such and so a little message to the debugger is returned as the "code", which of course must not appear in the final code for the module. For all three cases, the code for the function is added to the FNS list for later compilation.

ANALYZE-CONTINUATION is essentially identical to ANALYZE-CLAMBDA.

[Page 251]

D01 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 D21 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 D50 051 052 053 054 055 gzgasu sea __05(l§/78 Pege sa (DEFINE ANALYZE (LAMBDA (CNODE RNL PROGNAME BLOCKFNS FNS C) (LET ((CFM (CNODE\CFORM CNODE))) (EOCASE (TYPE CFM) (TRIVIAL (C (TRIVIALIZE (TRIVIAL\N00£ CFM) RNL) FNS)) QCVARIABLE (C (LOOKUPICATE (CVARlABL£\VAR CFM) RNL) FN$)) (CLAMBDA (ANALYZE-CLAMBDA CNODE RNL PROGNAME DLOCKFNS FNS C CFM)) KCONTINUATION (ANALYZE-CONTINUATION CN00£ RNL PROGNAME BLOCKFNS FNS C CFN)) (CIF (ANALYZE-CIF CNODE RNL PROGNAME BLOCKFNS FNS C CFM)) HCLABELS (ANALYZE-CLABELS CNODE RNL PROGNAME BLOCKFNS FNS C CFM)) HCCOMBINATION (ANALYZE-CCOMBINATION CNODE RNL PROGNAME BLOCKFNS FNS C CFN)) (RETURN) (ANALYZE-RETURN CNODE RNL PROGNANE BLOCKFNS FNS C CFM)))))) (DEFINE ANALYZE-CLAMBDA _ (LAMBDA (CNODE RNL PROGNAME BLOCKFNS FNS C CFM) (EOCASE (CLAMBDA\FNP CFM) (NIL (C '(CONS 'CBETA (CONS .PROGNAME (CONS ',(CLAMBDA\NAME CFM) ,(CONS-CLOSEREFS (CLAMBDA\CLOSEREFS CFM) . RNL)))) (cons (LIST PROGNAME cuooe NIL) rns))) (EZCLOSE (C (CONS-CLOSEREFS (CLAMBDA\CLOSEREF$ CFM) RNL) (CONS (LIST PROGNAME CNODE NIL) FNS))) (NOCLOSE (C '|ShauIdn't ever be seen - NOCLOSE CLAMBDA1 (CONS (LIST PROGNAME CNODE RNL) FNS)))))) (DEFINE ANALYZE-CONTINUATION (LAMBDA (CNODE RNL PROGNAME BLOCKFNS FNS C CFM) (EOCASE (CONTINUATION\FNP CFM) (NIL (C '(CONS 'CBETA (CONS ,PROGNAME (CONS ',(CONTINUATION\NAME CFM) ,(CONS-CLOSEREFS (CONTINUATION\CLOSEREFS CFM) RNL)))) (CONS (LIST PROGNAME CNODE NIL) FN$))) (EZCLOSE (C (CONS-CLOSEREFS (CONTINUATION\CLOSEREFS CFM) RNL) (CONS (LIST PROGNAME CNODE NIL) FNS))) '

(NOCLOSE (C '|Shou1dn't eve( be seen - NOCLOSE CONTINUATIONI (CONS (LIST PROGNAME CNOUE RNL) FNS))))))

[Page 252]

242 ANALYZE-CIF merely calls ANALYZE recursively cum the predicate, consequent, and alternative, and then uses CONDICATE to construct a MacLISP COND form.

ANALYZE-CLABELS calls ANALYZE recursively on the body of the CLABELS, and then calls PRODUCELABELS to do the rest. (Unlike the other PRODUCE functions, PRODUCELABELS does not depend on generating code which does not produce a value.

It accepts an already-compiled body, and builds around that the framework for constructing the mutually referent functions. If the body was compiled using COMP-BODY, then the code generated by PRODUCELABELS will produce no value; but if the body was compiled using ANALYZE, then it will produce a value.)

[Page 253]

001 002 003 004 005 006 007 006 009 U10 011 012 013 014 O15 O16 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 RF§§lIS§§B °§Ll§Ll§__Ee99_§9 (DEFINE ANALYZE»ClF (LAMBDA (CNODE nu( PROGNAME sLoc¢rus rns c crm) - (ANALYZE (c1F\Pn£o CFM) RNL PROGNAME aLocxrNs rns (LAMBDA (Penn rms) (ANALYZE (CIF\CON cfm) RNL PROGNAME BLOCKFNS fns .

(LAMBDA (con FNS) (ANALYZE (CIF\ALT CFM) RNL PnosuAM£ aLocxrns rns (LAMBDA (ALT FNS) (c (CONDICATE Pano con AL1) FNS))))))))) (DEFINE ANALYZE-CLABELS (LAMBDA (CNODE RNL PROGNAME BLOCKFNS FNS C CFM) (ANALYZE (CLABELS\BODY CFM) (ENVCARCDR (APPEND (CLABELS\FN£NV CFM) (CLABELS\CONSENV CFM)) RNL) PROGNAME BLOCKFNS FNS (LAMBDA (LBOD FNS) (PRODUCE-LABELS CNODE LBOD RNL PROGNAME BLOCNFNS FNS C)))))

[Page 254]

244 ANALYZE-CCOMBINATION requires the function to be a CLAMBDA (for if it were not, then something too complicated for continuation-passing style is going on). ANALYZE is called on the body of the CLAMBDA, and then on all the arguments (using NAPANALYZE); finally LAMBDACATE is used to generate the code.

(LAMBDACATE is much like PRODUCE-LABELS, in that it is handed a body, and whether the generated code produces a value depends only on whether the body does.) ANALYZE-RETURN is essentially just like ANALYZE-CCOMBINATION.

[Page 255]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 062 BFBQ!I_§§§__9§£l§lZ§"_E¢99_Q2 (DEFINE AuALvz£-ccouaxunrxon (LAMBDA (CNODE RML Pnocunnf BLOCKFNS sus c crm) (LET ((rn (CNODE\CFORH (can (ccoMaxnArxon\nnss crn))))) (IF (eo (TYPE rn) 'CLAM80A) (ANALYZE (cLAMaoA\aooY rn) (ENVCARCDR (CLAMBDA\ASETVARS ru) (REGSLIST rn 1 (ENVCARCDR (CLAHBOA\CONSENV rn) auL))) Pnosnnne _ _ anocxrns rss (LAMBDA (BODY rms) (nnvnunlvze (con (CCOM8INATl0N\ARGS CFH)) RNL Pnosunne BLOCKFNS FNS (LAMBDA (ARGS FNS) (C (LAHBDACATE (CLAMBDA\VARS FN) (CLAHBDA\TVARS FN) (CLAHBDA\UEP FN) ARGS (REMARK-Ol (CAR (CCOMBlNATl0N\ARGS CFH))) 'iafhvaa (SET-UP-ASETVARS BODY (CLAMBDA\ASETVARS FN) (REGSLIST FN NIL NIL))) FN$))))) (ERROR '1Non-trivial Function in ANALYZE-CCOHBINATIONI CNODE 'FAIL-ACT))))) (DEFINE ANALYZE-Rsrunn (LAMBDA (CNODE RNL PnoGnAn£ aLocxrus rns c CFM) (LET ((rn (CNO0E\CFORM (RETURN\CONT cfn)))) (IF (eo (TYPE fn) 'cou1|nuA11ou) (AuALvz£ (CONTlNUATl0N\BODY rn) (IF (CONTINUATION\TVARS ru) (CONS (CONS (CAR (CONTlNUATl0N\TVARS Fu)) - (T£MPLOC (CONTINUATION\DEP rn))) (ENVCARCDR (couTxnuA11on\cous£uv ru) nnL)) (cnvcnncon (CONTINUAT10N\CONS£NV rn) RNL)) Pnosunne aLocxrus rns (LAMBDA (BODY rns) (ANALYZE (RETURN\VAL CFM) RNL PROGNAME BLOCKFNS rss (LAnaoA (ARG FNS) (c (LAnaoAcA1£ (LIST (cou1xuuA11ou\vAn rn)) (CONTlNUATl0N\TVARS fu) (CONTINUATION\DEP ru) (LIST ARG) (REMARK-ON (RETURN\CONT CFH)) 'ffgnvau BODY) FNS))))) (ERROR '1Non-trivial Function in ANALYZE-RETURNI CNODE 'FAIL-ACT)))))

[Page 256]

246 LOOKUPICATE (I make no apology for the choice of the name of this or any other function; suffice it to say that a function named LOOKUP already existed in the SCHEME interpreter) takes a variable name VAR and a rename list RNL, and returns a piece of I*IacLISP code for referring to that variable. If an entry is in RNL for the variable, that entry contains the desired code. Otherwise a global variable reference must be constructed. This will simply be a reference to the NacLISP variable, unless it is the name of a TRIVFN. In this case a GETL form is constructed. (This is a big kludge which does not always work, and is done this way as a result of a rather unclean hack in the SCHEME interpreter which interfaces MacLISP functions with SCHEME functions.) CONS-CLOSER!IFS constructs a piece of I'IacLISP code which will cons onto the value of HENVM all the variables in the set CLOSEREFS. This is a simple loop which uses LOOKUPICATE to generate code, and constructs a chain of calls to CONS. For example, (CONS-CLOSEREFS '(A B C) NIL) would produce:

(CONS A (CONS B (CONS C **ENV**))) Notice the use of REVERSE to preserve an order assumed by other routines.

OUTPUT-ASET takes two pieces of code: VARREF, which refers to a variable, and BODY, which produces a value to be assigned to the variable. From the form of VARREF a means of assigning to the variable is deduced. (This implies that OUTPUT-ASET knows about all forms of code which might possibly be returned by LOOKUPICATE and, a fortiori, which might appear in a RNL.) For example, if the reference is (CADR (CDDDDR **ENV**)), OUTPUT-ASET would generate (RPLACA (CDR (CDDDDR **ENV**)) <body>).

[Page 257]

001 002 003 004 005 006 007 006 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 ggpgll sau 05/15/78 Pag§_§1 (DEFINE LooxuP1cA1e '

(LAMBDA (VAR RNL) _ ((LAMBDA (SLOT) (IF sLor (con sLo1) (IF (raxvrn VAR) "(GETL ~,vAn '(exPn suan LSUBR)) VAR))) (ASSO vAn nnL)))) (DEFINE CONS-CLOSEREFS (LAMBDA (CLOSEREFS RNL) (DO ((CR (REVERSE CLOSEREFS) (CDR CR)) '

(X '¢*ENV** "(CONS ,(LO0KUPlCATE (CAR CR) RNL) ,X))) ((NULL CR) X)))) (DEFINE OUTPUT-ASET (LAMBDA (VARREF BODY) (COND ((ATOM VARREF) "(SETO ,VARREF ,80DY)) ((E0 (CAR VARREF) 'CAR) "(CAR (RPLACA ,(CADR VARREF) ,BODY))) ((E0 (CAR VARREF) 'CADR) "(CAR (RPLACA (CDR ,(CADR VARREF)) ,BODY))) .((E0 (CAR VARREF) 'CADDR) "(CAR (RPLACA (CDDR .(CADR VARREF)) »BODY))) ((E0 (CAR VARREF) 'CADDDR) '(CAR (RPLACA (CDDDR ,(CADR VARREF)) ,80DY))) (T (ERROR '1Unknovn ASET discipline - OUTPUT-ASETI VARREF 'FAIL-ACT)))))

[Page 258]

248 CONDICATE takes the three conponents of an IF conditional, and constructs a MacLISP COND form. It also performs a simple peephole optimization:

(COND (a b) (T (COND (c d) ...))) becomes:

(COND (a b) (c d) ...) Also, DEPROGNIFY is used to take advantage of the fact that MacLISP COND clauses are implicitly PROGN forms. Thus:

(CONDICATE '(NULL X) '(PROGN (PRINT X) Y) '(COND ((NULL Y) X) (T FOO))) would produce:

(COND ((NULL X) (PRINT X) Y) ((NULL Y) X) (T FOO)) DECARCDRATE is a peephole optimizer which attempts to collapse CAR/CDR chains in a piece of MacLISP code to make it more readable. For example:

(CAR (CDR (CDR (CAR (CDR (CAR (CDR (CDR (CDR (CDR X)))))))))) would become:

(CADDR (CADR (CADDDR (CDR X)))) The arbitrary heuristic is that "A" should appear only initially in a "C...R' composition. DECARCDRATE also knows that MacLISP ordinarily has defined CAR/CDR compositions up to four long.

[Page 259]

001 002 003 004 005 006 O07 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 RBQQIT 555 05/15/7§_MR£9€_§§ 22; CONDICATE TURNS AN IF INTO A COND; IN SO DOING IT TRIES TO MAKE THE RESULT PRETTY (DEFINE CONDICATE ~ (LAMBDA (PRED CON ALT) (IF (OR (ATOM ALT) (nor (eo (CAR ALT) 'COND))) ~(conn (,vnso e(oePnoen1rY con)) (r e(oePnoen1rY ALr))) '(COND (,Pneo s(oePnocn1fY CON)) e(con ALT))))) 321 DECARCDRATE MAKES CAR-CDR CHAINS PRETTIER.

(DEFINE DECARCDRATE (LAMBDA (X) (COND ((ATOM X) X) ((£o (CAR x) 'CAR) (IF (ATOM (CADR x)) x (LET ((Y (DECARCDRATE (CADR X)))) (COND ((£o (cnn ((E0 (CAR ((E0 (CAR ((£0 (CAN (T '(cAa ((co (CAR x) 'cnn) , (IF (ATOM (cAnn x)) x Y) 'cAa) ~(cAAn ,(cAon Y))) Y) 'CUR) ~(cAoa ,(cAon Y))) Y) 'CUDR) '(cAuon .(CADR Y))) Y) ~coooa) '(CADDOR ,(CADR Y))) Y)))))) (LET ((Y (DECARCDRATE (CADR X)))) (T X)))) (COND ((£o (CAR ((E0 (CAR ((E0 (CAR (1 ~(cnn Y) 'con) '(coon ,(cAon Y))) Y) 'CODR) °(cuoon ,(CADR Y))) Y) 'CDO0R) '(cooooR .(cAnn Y))) Y))))))

[Page 260]

250 TRIVIALIZE is the version of ANALYZE which handles trivial forms. Recall that these are represented as pass-1 node-trees rather than as pass-Z cnode-trees. The task of TRIVIALIZE is to take such a node-tree and generate value-producing code. Recall that the subforms of a trivial form must themselves be trivial.

For a CONSTANT, a quoted copy of the value of the constant is generated.

For a VARIABLE, a reference to the variable is generated using LOOKUPICATE.

For an IF, the components are recursively given to TRIVIALIZE and then CONDICATE is used to generate a MacLISP COND form.

For an ASET, a reference to the ASET variable is generated using LOOKUPICATE, and code for the body is generated by calling TRIVIALIZE recursively; then OUTPUT-ASET generates the code for the ASET.

For a COMBINATION, the function must be either a TRIVFN or a LAMBDA-expression. For the former, a simple MacLISP function call is generated, after generating code for all the arguments. For the latter, TRIV-LAMBDACATE is invoked after generating code for the arguments and the LAMBDA body.

TRIV-LAMBDACATE is, so to speak, a trivial version of LAMBDACATE. The arguments are divided into two classes, those which are referenced and those which are not (the possibility of a referenced argument which is a KNOWN-FUNCTION cannot arise). when this is done, a MacLISP ((LAMBDA ...) ...) form is generated, preceded by any unreferenced arguments (which presumably have side-effects). For example:

(TRIV-LAMBDACATE '(Vl V2 V3) '((CAR X) (PRINT Y) (CDR Z)) '(PROGN (PRINT VI) (LIST V1 V3))) ought to produce:

(PROGN (PRINT Y) ((LAMBDA (VI V3) (COMMENT (VARS = (A C))) (PRINT VI) (LIST VI V3)) (CAR X) (CDR Z))) Note that a MacLISP LAMBDA body is an implicit PROGN. TRIV-LAMBDACATE also takes advantage of the ability to arbitrarily reorder the execution of arguments to a combination.

[Page 261]

001 002 D03 D04 005 006 007 008 D09 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060 061 062 063 FABBIT"§§§n_9§Ll5lZ§__£§9£_§?

(DEFINE TRIVIALIZE (LAMBDA (NODE RNL) (LET ((FM (NODE\FORM NODE))) 4 (EOCASE (TYPE FM) (CONSTANT ",(CONSTANT\VALUE FH)) (VARIABLE (LOOKUPICATE (VARIABLE\VAR FM) RNL)) (IF (CONDICATE (TRIVIALIZE (IF\PRED FM) RNL) (TRIVIALIZE (IF\CON FM) RNL) (TRIVIALIZE (IF\ALT FM) RNL))) (ASET (OUTPUT-ASET (LOOKUPICATE (ASET\VAR FM) RNL) (TRIVIALIZE (ASET\BODY FM) RNL))) (COMBINATION (LET ((ARGS (COMBINATION\ARGS FM))) (LET ((FN (NODE\FORM (CAR ARGS)))) (IF (AND (EO (TYPE FN) 'VARIABLE) (VARIABLE\GL08ALP FN) _ (TRIVFN (VARIABLE\VAR ru))) (CONS (VARIABLE\VAR FN) (AMAPCAR (LAMBDA (A) (TRIVIALIZE A RNL)) (CDR ARG$))) (IF (EQ (TYPE FN) 'LAMBDA) (TRIV-LAMBDACATE (LAMBDA\VARS FN) (AMAPCAR (LAMBDA (A) (TRIVIALIZE A RNL)) (con ARGS)) (TRIVIALIZE (LAMBDA\BODY FN) RNL)) (ERROR '1Strange Trivial Function - TRIVIALIZEI NODE 'FAIL-ACT)))))))))) (DEFINE rnlv-LAnaoAcA1e (LAMBDA (VARS ARGS soov) (LABELS ((LooP (LAMBDA (v A n£ALvAAs AeALAnss EFFARGS) (IF (NULL A) (LET ((av (NREVERSE REALVARS))) (on (NULL v) (ERROR '|w¢ buev It in raxv-LAManAcA1e| v 'FAIL-ACT)) (LET ((s (IF nv '((LAHBDA .Av » (COMMENT (vAAs = ,(HAP-USER-NAMES av))) |(nePnocu1rv BODY)) !(NREVERSE n£ALAnss)) aonv))) (IF EFFARGS -(Pnosu DEFFARGS !(o£PAoGnxrv s)) B))) (IF (oA (GET (CAR v) 'READ-REFS) (est (CAR v) 'VRITE-REFS)) (LOOP (CDR v) (CDR A) (CONS (CAR v) REALVARS) (CONS (CAR A) REALARGS) EFFARG5) (LOOP (CDR v) (CDR A) REALVARS REALARGS (cons (CAR A) £rrAAss))))))) (LOOP vAns ARGS NIL NIL nxL))))

[Page 262]

252 we have examined the entire code generator, and now turn to high-level control routines. COMPILATE-ONE-FUNCTION is the highest-level entry to the code generator, called by COMPILE. It takes a code-tree and the user-name for the function, and returns a complete piece of MacLISP code constituting a module for the user function. It generates a global name for use as the module name (PROGNAME), and invokes COMPILATE-LOOP (which really ought to have been a LABELS function, but was too big to fit on the paper that way). The last argument is a list of two MacLISP forms; one causes a SCHEME compiled closure form (a CBETA list) to be put in the value cell of the user-name, so that it will be a globally defined SCHEME function, and the other creates a property linking the PROGNAME with the USERNAME for debugging purposes. '

COMPILATE-LOOP repeatedly calls COMPILATE, giving it the next function on the FNS list. Of course, the invocation of COMPILATE may cause new entries to appear on the FNS list. COMPILATE-LOOP iterates until the FNS list converges to emptiness. As it does so it takes each piece of generated code and strings it together as PROGBODY. It also calculates in TMAX the maximum over all MAXDEP slots; this is the only place where the MAXDEP slot is ever used. when FNS is exhausted, a module is constructed. This contains a comment, a MacLISP DEFUN form for defining a MacLlSP function, a SETQ form to put the SUBR pointer in the value cell of the PROGNAME (for the benefit of code which creates CBETA forms), and extra "stuff". TMAX is used to generate a list of all temporary variables used in the module; a MacLISP SPECIAL declaration is created to advise the MacLISP compiler.

USED-TEMPLOCS takes a TMAX value and generates the names of all temporary registers (whose names are of the form -nn-; standard argument registers are not included) up to that number.

[Page 263]

D01 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 D20 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 D44 045 046 047 048 049 050 051 052 053 054 055 055 057 058 059 060 061 062 063 R8BQlI_§§Q__Q§£l5£l§__B99€_§5 (DEFINE COMPILATE-ONE-FUNCTION ;coMPLxcuvz-one-function?

(LAMsuA (cuooe USERNAME) (LET ((PnoenAMs (GEN-GLOBAL-NAME))) -(COMPILATE-LOOP us£nuAnz PRocuAne (LIST (LIST us£nnAM£ cnooe)) (LIST (LIST PRoGNAn£ cuone N1L)) NIL o (LIST '(S£T0 ,usenunnz (LIST °ce£rA ,PnosnAMz ~,(cLAnsoA\nAne (cuoo£\croRn cuoue)))) '(ozrPnoP ,pnosunnc ,usenunne USER-FUNCTION)))))) (DEFINE COMPILATE-LOOP '

(LAMBDA (USERNAME PROGNAME BLOCKFNS FNS PROGBODY TMAX STUFF) (IF (NULL FNS) "(PROGN 'COMPILE (COMMENT MODULE FDR FUNCTION ,USERNAME) (DEFUN .PROGNAME () (FROG () (DECLARE (SPECIAL .PROGNAME !(USED-TEMPLOCS TMAX))) (60 (PROGZ NIL (CAR *¢ENV**) (SETO *¢ENV** (CDR f¢£Nv¢¢)))) e(unzv£ns£ PROG80DY))) (SETO ,PROGNAME (GET ',PROGNAME 'SU8R)) QSTUFF) (coMP1LAr£ (CAR (CAR rNs)) (CADR (CAR FNS)) - (CADDR (CAR FNS)) BLOCKFNS (con FNS) (LAMBDA (CODE NEVFNS) (LET ((CFM (CNODE\CFORM (CADR (CAR FNS))))) (COMPILATE-LOOP USERNAME -PROGNAME BLOCKFNS NEVFNS (NCONC (REVERSE (DEPROGNIFYI CODE T)) (cons (nenunx-on (CADR (CAR fus))) (CONS (EOCASE (TYPE CFM) (CLAMBDA (CLAM8DA\NAME crM)) (CONTINUATIDN (couT|nuAr|ou\uAne CFM))) PROGBODY))) (MAX TMAX (EQCASE (TYPE CFM) (CLAMBDA (CLAMBDA\MAXDEP CFM)) (CONTINUATION STUFF))))))) (CONTINUATION\MAXDEP crn)))) (DEFINE usso-1cMPLocs (LAMBDA (N) _ (oo ((J (+ ~-nunasn-or-Ana-ness-~ 1) (+ J 1)) (x NIL (cons (rznrnoc J) x))) ((> J N) (unevenss x)))))

[Page 264]

254 REMARK-ON takes a cnode for a CLANBDA or CONTINUATION and generates a comment containing pertinent information about invoking that function. This comment will presumably be inserted in the output code to guide the debugging process.

MAP-USER-NAMES takes a list of internal variable names and returns a list of the corresponding user names for the variables, as determined by the USER-NAME property. (If a variable is an internally generated one, e.g. for a continuation, then it will have no USER-NAME property, and the internal name itself is used.)

[Page 265]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 D26 027 028 D29 030 BAE!!!L§§§__9§l15/7§L_£95§_£§ (DEFINE REMARK-ou (Lamson (cuooe) (LET ((cfM (cuou£\croRH cnone))) (LABELS ((R£MARKl (LAMBDA (new fur vAas env) "(COMHENT (DEPTH = ,DEP) '(FNP = ,FNP) e(1r vARs ~((vAns - ,(MAP-uses-NAMES vAns)))) !(IF ENV '((ENV = ,(HAP-USER-NAMES ENV)))))))) (EOCASE (TYPE CFM) (CLAHBDA (n£MAnx1 (cLAnaoA\u£P crm) (cLAMaoA\ruP CFM) (IF (zo (cLAManA\ruP CFM) 'NOCLOSE) (cLAnaoA\1vARs CFM) (CLAM8DA\VARS crn)) (APPEND (CLAM8DA\CLOSEREFS cfs) (CLAM8DA\CONS£NV crM)))) (CONTINUATION (REHARKI (CONTINUATION\DEP crm) (CONTINUATION\FNP CFM) NIL ;n£vsn lnrsnssrlus ANYVAY (APPEND (CONTlNUATl0N\CLOSEREFS ern) (CONTlNUATl0N\CONS£NV crn))))))))) (DEFINE MAP-USER-NAMES (LAMBDA (VARS) (AHAPCAR (LAMBDA (X) (OR (GET X 'USER-NAME) X)) VARS)))

[Page 266]

256 The next few pages contain routines which deal with files. COMFILE takes a file name, and compiles all the code in that file, producing an output file of MacLISP code suitable for giving to the MacLISP compiler. It also computes the CPU time required to compile the file.

FN gets the given file name, processed and defaulted according to ITS/MacLISP standard conventions. RT and GCT save runtime and gc-runtime information.

IFILE and OFILE get MacLISP "file objects" created by the OPEN function, which opens the file specified by its first argument. (The output file names are initially " RABB OUTPUT", conforming to an ITS standard. These will later be renamed.) *GLOBAL-GEN-PREFIX* is initialized as a function of the file name, to "directory=firstname". This is to guarantee that the global symbols generated for two different compiled files of SCHEME code will not conflict should the two files be loaded into the same SCHEME together. (Notice the use of SYMEVAL. This is necessary because MacLISP allows names to be used in two different kinds of contexts, one meaning the "functional" value, and the other meaning the "variable" value. SCHEME does not make this distinction, and tries to make the functional value available, but does not do this consistently. This is a problem which results from a fundamental difference in semantics between SCHEME and MacLISP. For such variables as DEFAULTF and TYO, which in MacLISP are used for both purposes, it is necessary to use SYMEVAL to specify that the variable, rather than the function, is desired.) (SYMEVAL 'TYO) refers to the file object for the terminal; this is used to print out messages to the user while the file is being compiled. Various information is also printed to the file, including identification and a timestamp. The DECLARE form printed to the output file contains the names of the standard argument registers, and also **ENV**, **FUN**, and **NARGS**. (This is why USED-TENPLOCS need not generate names of standard argument registers this single global declaration covers them.) The second DECLARE form defines to the MacLISP compiler a function called DISPLACE for obscure reasons having to do with the implementation of SCHEME macros.

TRANSDUCE does the primary work of processing the input file. when it is done, another timestamp is printed to the output file, so that the real time consumed can be determined; then the runtime statistics are calculated and printed, along with the number of errors if any occurred. The output file is then renamed as "firstname LISP" and closed. The statistics message is returned so that it will be printed as the last message on the terminal.

[Page 267]

001 002 003 004 005 006 007 00a 009 010 011 DIZ 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 026 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 (DEFINE COMFILE (LAMBDA (FNAME) RQDDIT 560 05410470 P»g¢_§g (LET ((rN (DEFAULTF (nenssr FNAME '(~ >)))) (RT (RUNTlME)) (GCT (STATUS 6cTIME))) (LET ((IFILE (OPEN FN '1N)) (OFILE (OPEN (MERGEF '(_RABB_ OUTPUT) FN) '0UT))) (SET' *GLOBAL-GEN-PREFIX* (CATENATE (CADAR (SYMEVAL 'DEFAULTF)) (CADR (SYHEVAL 'DEFAULTF)))) (LET ((TN (NAMESTRING (TRUENAME IFIL£)))) (PRINT '(COMMENT THIS IS THE RABBIT LISP co0E FOR .TN) OFILE) (TIMESTAMP OFILE) (TERPRI OFILE) (TERPRI (SYMEVAL 'TYO)) (PRINC '1;Beginning RABBIT compilation on I (SYMEVAL 'TYO)) (PRINC TN (SYMEVAL 'TYO))) (PRINT "(DECLARE (SPECIAL @*¢CONT+ARG-REGS** °*ENV** **FUN** **NARGS**)) OFILE) (PRINT '(DECLARE (DEFUN DISPLACE (X Y) Y)) OFILE) (AsET~ *TESTING* NIL) (AsET' *ERROR-COUNT* 0) (ASET' *ERROR-LIST* NIL) (TRANSDUCE IFILE OFILE (LIST NIL) (CATENATE 'IINIT-I (CADR (TRUENAME IFILE)))) (TIMESTAMP OFILE) (LET ((X (KOUO (- (RUNTIME) RT) 1.0E6)) (Y (DOUO (- (STATUS GCTIME) GCT) 1.0E6))) (LET ((MSG '(COMPILE TIME: ,X SECONDS (GC (NE !(I TIME .Y SECONDS) T ,(S X Y) SECONDS) F (NDT (ZEROP *ERROR-COUNTl)) "((,*ERROR-COUNT* ERRORS)))))) (PRINT "(COMMENT ,MSG) OFILE) (RENAMEF OFILE (Mauser (LIST (cnon ru) 'LIsP) )) (CLOSE oFILE) MSG)))))) FN

[Page 268]

258 TRANSDUCE processes forms from IFILE, one by one, calling PROCESS-FORM to do the real work on each one. PROCESS-FORM may print results on OFILE, and may also return a list of "random forms" to be saved.

The business of "random forms" has to do with the fact that the file being compiled may contains forms which are not function definitions. The expectation is that when the file is loaded these forms will be evaluated during the loading process, and this is indeed true if the interpreter loads the original file of source forms.

Now MacLISP provides a facility for evaluating random forms within a compiled file, but they are evaluated as MacLISP forms, not SCHEME forms. To get around this whole problem, I chose another solution. All the random forms in the file are accumulated, and then compiled as the body of a function named "INIT-firstname". In this way, once the compiled code is loaded, the user is expected to say (INIT-firstname) to cause the random forms to be evaluated.

This idea would have worked perfectly except that files typically have a large number of random forms in them (macro definitions create one or two random forms as well as the definition of the macro-function). Putting all the random forms together in a single function often creates a function too big for RABBIT to compile, given PDP-10 memory limitiations. The four lines of code in TRANSDUCE for this have therefore been commented out with a ";" at the beginning of each line.

The final solution was to compile each random form as its own function, and arrange for all these little functions to be chained; each one executes one random form and then calls the next. A call to INIT-firstname starts the chain going.

This, then, is the purpose of the big DO loop in TRANSDUCE: to construct all the little functions for the random forms. The third argument to PROCESS-FORM may be NIL, which suppresses the printing of any messages on the terminal; this spares the user having to see tens or hundreds of uninteresting messages concerning the compilation of these initialization functions. However, so that the user will not be dismayed at the long pause, a message saying how many random forms there were is printed first.

READIFY implements the MacLISP convention that if the value of the variable READ is non-nil, then that value is the read-in function to use; while if it is NIL, then the function READ is the read-in function. (This "hook" is the method by which CGOL works, for example.)

[Page 269]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 025 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 5659!I_§§§E_2§l}§(Z§__E!9€,§Z (DEFINE TRANSDUCE (LAMBDA (IFILE OFILE EOF INITNAME) (LABELS ((LooP (LAMBDA (FORM RANDOM-FORMS) (IF (EQ FORM EOF) (DO ((X (GENTEMP INITNAME) (GENTEMP INITNAME)) _ (v nu x) (Z RANDOM-FORMS (CDR Z))) ((NULL z) (IF RANDOM-FORMS (PRINT '(,(LENGTH RANDOM-FORMS) RANDOM FORMS IN FILE TO COMPILE) (SYMEVAL 'TYO))) (IF Y (PROCESS-FORM '(DECLARE (SPECIAL ,Y)) OFILE TI) (Pnocess-ronn "(DEFINE ,mnumz (LAMBDA () ,(IF Y (LIST Y) NIL))) OFILE .

TI).

(IF Y (PROCESS-FORM '(DECLARE (SPECIAL ,Y)) OFILE NIL)) (PROCESS-FORM '(DEFINE ,X (LAMBDA () (BLOCK ,(CAR Z) ,(lF Y (LIST Y) NlL)))) OFILE NIL)) ; (PROCESS-FORM I "(DEFINE ,INITNAME _ ; (LAMBDA () (BLOCK URANDOM-FORMS NIL NlL))) 2 OFILE) (LET ((X (PROCESS-FORM FORM OFILE T))) (LOOP (READIFY IFILE EOF) (NCONC X RANDOM-FORMS))))))) (LOOP (READIFY IFILE EOF) NIL)))) (DEFINE READIFY ;FUNNY MACLISP CONVENTION - READlFY'LL DO THE JOB!

(LAMBDA,(IFILE EOF) (IF (SYMEVAL 'READ) (APPLY (SYMEVAL 'READ) IFILE EOF) (READ IFILE EOF))))

[Page 270]

260 PROCESS-FORM takes a form, an output file, and a switch saying whether to be "noisy". The form is broken down into one of many special cases and processed accordingly. The returned value is a list of "random forms' for TRANSDUCE to save for later handling.

An atom, while pretty useless, is transduced directly to the output file.

A DEFINE-form, which defined a function to be compiled, is given to PROCESS-DEFINE-FORM. This is the interesting case, which we will discuss on the next page.

A special hack handed down from MacLISP is that a form (PROGN 'COMPILE ...) (and, for SCHEME, the analogous (BLOCK 'COMPILE ...)) should be treated as if all the subforms of the PROGN (or BLOCK) after the first should be processed as if they had been read as "top-level" forms from the file. This allows a macro call to generate more than one form to be compiled, for example. It is necessary to accumulate all the results of the calls to PROCESS-FORM so that they may be collectively returned.

A PROCLAIM form contains a set of forms to be evaluated by RABBIT at compile time. The evaluation is accomplished by constructing a LAMBDA form and using the SCHEME primitive ENCLOSE to create a closure, and then invoking the closure. As a special service, the variable OFILE is made apparent to the evaluated form so that it can print information to the output file if desired.

A DECLARE form is meant to be seen by the MacLISP compiler, and so it is passed on directly. _ A COMMENT form is simply eliminated. (It could be passed through directly with no harm.) A DEFUN form is passed directly, for compilation by the MacLISP compiler.

A form which is a macro call is expanded and reprocessed. As a special hack, those which are calls to DEFMAC, SCHMAC, or MACRO are also evaluated (MacLISP evaluation serves), so that the defined macro will be available for compiling calls to it later in the file.

Any other form is considered "random", and is returned to TRANSDUCE provided *BUFFER-RANDOM-FORMS* is non-NIL. This switch is provided in case it is necessary to force a random form (e.g. an ALLOC form) to be output early in the file. In this case any random forms must be MacLISP-evaluable as well as SCHEME-evaluable. (This requirement is the reason RABBIT has random forms like "(SET'

FOO ...)"; SETQ is unacceptable to SCHEME, while ASET' is unacceptable to MacLISP, but SET' happens to work in both languages for setting a global variable.) RABBIT itself sets *BUFFER-RANDOM-FORMS*' to NIL on page 1 in a PROCLAIM form.

[Page 271]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 026 029 030 031 032 033 034 035 036 037 038 (SET' *optimize* 1) FA3BlI_§§§__9§£l§LZ§_,H99¢_§§ (SET' *BUFFER-RANDOM-FORMS* T) (DEFINE PROCESS-FORM (LAMBDA (FORM OFILE NOISYP) (COND (

( (

( (

( (

( (

(ATOM FORM) (PRINT roam or1L£) NIL) (EO (CAR FORM) 'DEFINE) (PROCESS-DEFINE-FORM FORM OFILE NOISYP) NIL) (AND (MEMO (CAR FORM) '(8LOCK PROGN)) (EQUAL (CADR FORM) "COMPILE)) (DO ((F (CDDR FORM) (CDR F)) (Z NIL (NCONC Z (PROCESS-FORM (CAR F) OFILE NOISYP)))) ((NULL F) Z))) (EO (CAR FORM) 'PROCLAIM) (AMAPC (LAMBDA (X) ((ENCLOSE "(LAMBDA (OFILE) ,X)) OFILE)) (CDR FORM)) NIL) (EO (CAR Fonn) 'DECLAR£) (PRINT ronn OFILE) NIL) (EO (CAR FORM) 'COMMENT) NIL) (EO (CAR FORM) 'DEFUN) (PRINT FORM OFILE) NIL) (AND (ATOM (CAR ronn)) (eo (GET (CAR FORM) °A1nr) 'AMACRO) (NOT (co (GET (can ronn) 'AMAcno) 'Arsusn))) (IF (nemo (CAR roam) -(nernnc SCHMAC_MACRO)) (EVAL ronn)) (PROCESS-FORM (MACRO-exPANo roam) or1L£ NOISYP)) 1 (COND (*BUFF£R-RANDOM-FORMS* (LIST FORH)) (1 (PRINT ronn or1Le) u1L))))))

[Page 272]

262 PROCESS-DEFINE-FORM disambiguates the three DEFINE formats permitted in SCHEME:

(DEFINE FOO (LAMBDA (X Y) ...)) (DEFINE FOO (X Y) ...) (DEFINE (FOO X Y) ...) and constructs an appropriate LAMBDA-expression in standard form.

PROCESS-DEFINITION takes this LAMBDA-expression and compiles it, after some error checks. It then cleans up, and if desired prints a message on the console to the effect that the function was compiled successfully.

CLEANUP is used to clear out garbage left around by the compilation process which is no longer needed (but is useful during the compilation, whether for compilation proper or only for debugging should the compilation process fail).

REPLACE has to do with macros which displace calls to them with the expanded forms, but retain enough information to undo this. REPLACE undoes this and throws away the saved information. (The DISPLACE facility is normally turned off anyway, and is used only to make the compiler run faster when it itself is being run under the SCHEME interpreter. This was of great use when RABBIT wasn't running well enough to compile itself!) GENFLUSH removes from the MacLISP OBARRAY all the temporary generated symbols created by GENTEMP.

The MAPATOMS form removes from every atom on the OBARRAY the properties shown. This takes more time but less space than recording exactly which atoms had such properties created for them.

[Page 273]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 016 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 (DEFINE PROCESS-DEFINE-FORM (LAMBDA (FORM OFILE NOISYP) (COND ((ATOM (CADR FORM)) (PROCESS-DEFINITION FORM (DEFINE CLEANUP (LAMBDA () OFILE NOISYP gA§q;r sea 05/15/76 Page 69 (cmnn ronn) (lr (NULL (CDDDR ronn)) (CADDR roam) ~(LAnaoA ,(CAO0R FORM) (T (PROCESS-DEFINITION FORM OFILE NOISYP (BLOCK (CDDDR FORM)))))) (CAADR FORM) '(LAMBDA ,(CDADR FORM) (BLOCK . (CDDR FORM)))))))) (DEFINE PROCESS-DEFINITION (LAMBDA (FORM OFILE NOISYP NAME LAMBDA-EXP) (COND ((Nor (eo (1vP£P NAME) 'svMaoL)) (WARN 1Function Name Not SYMBOLI NAME FORH)) ((OR (NOT (EO (CAR LAMBDA-EXP) 'LAMBDA)) (AND (ATOM (CADR LAMBDA-EXP)) (nor (NULL (cAoR LAMBDA-EXP))))) (WARN 1Ha\formed LAMBDA-expressionl LAMBDA EXP FORH)) (T (PRINT (COMPILE NAME OFILE) (CLEANUP) (IF NOISYP (BLOCK (REPLACE) (GENFLUSH) (MAPATOMS '(LAMBDA LAMBDA-EXP NIL *OPTIMIZE*) (PRINT (LIST NAME 'coMP|L£o) (SYMEVAL °1vo))))))) (X) (REMPROP (nenpnop (REMPROP (RENPROP (REHPROP (aenpnor (REMPROP 'READ-REFS) 'WRITE-REFS) 'NODE) 'BINDlNG) 'USER-NAME) 'xuovn-function) 'EASY-LABELS FUNCTl0N))))))

[Page 274]

264 SEXPRFY and CSEXPRFY are debugging aids which Lake a node-tree or :nods-tree and produce a fairly readable S-expression version of the code it represents. They are used by the SX and CSX macros defined earlier. The USERP switch for SEXPRFV specifies whether internal variables names or user variable names should be used in the construction.

[Page 275]

001 002 003 004 005 006 007 D08 009 010 011 DIZ D13 014 015 016 017 018 019 020 021 D22 023 D24 025 026 D27 D28 029 030 031 032 033 034 D35 036 037 038 039 D40 041 042 043 044 045 046 047 048 D49 050 051 D52 053 D54 055 D56 057 058 059 060 D61 062 RABQJI_§§§__!§ll§lZ§__?!9&_ZR ;;; INVERSE OF ALPHATIZE. USED BY SX, E.G., FOR DEBUGGING.

(DEFINE SEXPRFY (LAMBDA (NODE USERP) (LET ((FM (NODE\FORM NODE))) (EOCASE (DEFINE CSEXPRFY (LAMBDA (CNODE) (TYPE rn) (CONSTANT '(ouor£ ,(CONSTANT\VALUE rn))) (VARIABLE (xr (AND usenv (NOT (VARIABLE\GLOBALP rn))) (GET (VARIA8LE\VAR rn) 'USER-NAME) (VARIABLE\VAR rM))) (LAMBDA '(LAHBDA ,(Ir usenr (LAMBDA\UVARS FM) (LAMBDA\VARS rn)) 1

(IF °(If ,(S£XPRFY .(s£xPRrv -.(s£xPnrY (ASET '(AS£T' ,(1r ISEXPRFY (LAMBDA\BODY FM) USERP))) 1r\Pn£o rn) USERP) 1r\con rn) usenr) IF\ALT rn) usenP))) (AND usenr (NOT (ASET\GLOBALP rn))) [GET (ASET\VAR rn) °us£n-uAn£) 1ASET\VAR rn)) I

.(SEXPRFY (ASET\BODY FM) USERP))) (CATCH '(CATCN ,(IF USERP (GET (CATCH\VAR rn) 'USER-NAME) (CATCN\VAR rn)) ,(SEXPRFY (CATCH\BODY FM) USERP))) (LABELS '(LABELS ,(AMAPCAR (LAMBDA (V D) '(.(IF USERP (GET v V) ,(SEXPRFY D (LABELS\FNVARS FM) (LABEL$\FNDEFS FM)) ,(SEXPRFY (LABELS\BODY FM) USERP))) (COMBINATION (AMAPCAR (LAMBDA (A) (sexvnrv A us£nP)) (COMBINATION\ARGS rn))))))) (LET ((CFM (CNODE\CFORM CNODE))) (£ocAs£ (TYPE crm) (TRIVIAL "(TRIVIAL ,(SEXPRFY (TRIVIAL\NOD£ CFM) NIL))) (cvAn1AaL£ (CVARlABLE\VAR crn)) (CLAMBDA '(CLAMBDA .(CLAMBDA\VARS crm) (CONTINUATION ,(CSEXPRFY (CLAMBDA\BDDY crn)))) "(CONTINUATION (,(conr|nuA11oN\vAn crn)) ,(csexPRrv (CONTlNUATl0N\BODY crn)))) (CIF ~(cxr ,(cs£xPnrv (cxf\Pneo CFM)) .(CSEXPRFY (CIF\CON crn)) ,(cs£xPnrv (CIF\ALT crM)))) (cAs£r ~(cAs£r' ,(cs£xPnrv (CASET\CONT crn)) ,(cAs£r\vAn crm) .(cs£xPnrv (cAs£T\Boov cFn)))) (cLAa£Ls "(CLABELS ,(AMAPCAR (LAMBDA (v n) ~(,v ,(cs£xPnrv u))) (CLABELS\FNVARS CFM) (CLABELS\FNDEFS CFM)) .(CSEXPRFY (CLABELS\BODY CFM)))) (CCOMBINATION (AMAPCAR csexrnrv (CCOM8INATION\ARG$ crn))) (RETURN ~(n£runu .(csexPnrv (R£TURN\CONT crm); ,(cs£xrnrv (ne1unn\vAL CFM)))))))) 'USER-NAME) USERP)))

[Page 276]

266 CHECK-NUMBER-OF-ARGS is used by COMPILE and ALPHA-COMBINATION to make sure that function calls and definitions agree on the number of arguments taken by a function. If a mismatch is detacted, a warning is issued. This check frequently catches various typographical errors. The argument DEFP is NIL unless this call is on behalf of a definition rather than a call. The DEFINED property is used only so that a more comprehensive warning may be given.

  • EXPR and *LEXPR are two special forms suitable for use in a PROCLAIM form for declaring that certain names refer to MacLISP functions rather than SCHEME functions. An example, for PRINT-SHORT, occurs on page 1 of RABBIT.

DUMPIT establishes a simple user interface for RABBIT. After loading a compiled RABBIT into a SCHEME run-time system, the call (DUMPIT) initializes the RABBIT, then suspends the NacLISP environment, with an argument which is an ITS command for dumping the core image. when this core image is later loaded and resumed, DUMPIT prints "FILE NAME:" and reads a line. All the user need do is typoe a file name and a carriage return to compile the file. when this is done, the call to QUIT kills the RABBIT job.

STATS prints out statistics accumulated in the counters listed in *STAT-VARS*. RESET-STATS resets all the counters.

[Page 277]

001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050 D51 052 053 054 055 056 RA95lI_5§§ _A__ _9§£l§!7§_s?99¢ I) (DEFINE CHECK-NUMBER-OF-ARGS (LAMBDA (NAME NARGS DEEP) (OR (GETL NAME '(*LEXPR LSUBR)) (LET ((N (GET NAME 'NUMBER-OF-ARGS))) (IF N (IF (NOT (= N NARGS)) (IF DEFP (WARN ldefinition disagrees with earlier use on number of argsl NAME NARGS N) (IF (GET NAME 'DEFlNEO) (WARN luse disagrees with definition on number of argsl NAME NARGS N) (WARN Itwo uses disagree before definition on number of argsl NAME NARGS N)))) (vuivnop NAME NAAQS 'NUMBER-OF-ARGS)) b (IF DEFP (PUTPROP MAME 'T 'DEFINED)))))) (DEFUN *EXPR rexvn (x) (MAPCAR '(LAM8DA (Y) (PUTPROP Y 'T '*EXPR)) x)) (DEFPROP *EXPR AFSUBR AMACRO) (DEFPROP *EXPR AMACRO AINT) (DEFUN *LEXPR r£xPR (x) (MAPCAR '(LAMBDA (Y) (PUTPROP Y 'T '*LEXPR)) x)) (DEFPROP *LEXPR AFSUBR AMACRO) (DEFPROP *LEXPR AMACRO AINT) (DEFINE DUMPIT (LAMBDA () (BLOCK (INIT-RABBIT) (SUSPENU 'IZPDUMP 0SK:SCHEME;T$ RABBITI) (TERPRI) (PRINC '1Fi\e name: I) (COMFILE (READLlNE)) (0UIT)))) (DEFINE srnrs (LAMBDA () (AMAPC (LAMBDA (VAR) (BLOCK (renvnl) (PRINI VAR) (PRINC '1 = |) (rnnul (SYMEVAL VAR)))) *STAT-VARS*))) (DEFINE RESET-STATS (LAMBDA () (AMAPC (LAMBDA (VAR) (SET VAR 0)) *STAT-VARS*)))

[Page 278]

> .....

Symbol * .....

  • EXPR .

Table For: OUUX;RABBIT 558 *EXPR ._ *EXPR _ *EXPR .

  • LEXPR *LEXPR *LEXPR *LEXPR + ......

/ ._ < ._ SIDE EFFECTS ..

FEXPR PROPERTY ....

AMACRO AINT ._ FEXPR PROPERTY .H AMACRO AINT ._ SIDE EFFECTS ... ..._ SIDE EFFECTS ...

SIDE EFFECTS ..

SIDE EFFECTS SIDE EFFECTS ...

SIDE EFFECTS ._ ACCESSFN ._ ACCESSFN ..

ADDPROP ..........,.

ADJOIN ..............

ADJUST-KNOVNFN-CENV ._ AINT ......,_........

AINT ..............

QDEFINE .......

MACLISP MACRO .

SCHEME FUNCTION SCHEME FUNCTION SCHEME FUNCTION PROPERTY ......

PROPERTY ......

ALPHA-ALPHA ALPHA-ALPHA ALPHA ALPHA-ASET ...

ATOM ...

CATCH BLOCK ..... _ COMBINATION ...

ALPHA-LABELS ALPHA ALPHA IF ..........

-LABELS-DEFN _..

LAMBDA SCHEME FUNCTION ALPHATI AMACRO AMACRO ANALYZE ZE ._ ANALYZE-CCOMBINATION ...

ANALYZE ANALYZE ANALYZE ANALYZE ANALYZE APPEND -CIF ..........

-CLABELS .....

-CLAMBDA ........

~CONTINUATION ...

RETURN .......

ASET ...

ASK .__ ASSO ...

ATOM .......

BIGP .............

BIND-ANALYZE ............

BIND-ANALYZE-CASET ........ U.

BIND-ANALYZE-CCOMBINATION BIND-ANALYZE-CIF ........

BIND-ANALYZE-CLABELS ....

BIND-ANALYZE-CLAMBDA ...... .H BIND-ANALYZE-CONTINUATION BIND-ANALYZERETURN ........ .H BIND-CCOMBINATION-ANALYZE CAAAAR ................_... ...

CAAADR _.

CAAAR ..

CAADAR ..

CAADDR ._ CAADR ._ CAAR ...

CADAAR ._ CADADR ._ CADAR ._ SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION PROPERTY ......

PROPERTY ......

SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION SIDE EFFECTS ...... .,..

DATA TYPE ........... ....

PDP-10 SCHEME MACRO ......

SIDE EFFECTS ........ ....

SIDE EFFECTS ...... ....

SIDE EFFECTS ..... ....

SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION SIDE EFFECTS ... ....

SIDE EFFECTS ... ....

SIDE EFFECTS ._ SIDE EFFECTS ...

SIDE EFFECTS _._ ...

SIDE EFFECTS ..

SIDE EFFECTS ._ SIDE EFFECTS ___ ....

SIDE EFFECTS ._ SIDE EFFECTS ._ 017 071 071 071 071 071 071 071 071 017 017 017 017 017 017 004 004 006 006 052 071 071 010 009 011 010 011 010 010 011 009 009 071 071 058 060 059 059 058 056 060 017 008 003 017 017 017 034 035 037 035 036 035 035 036 037 017 017 017 017 017 017 017 017 017 017 005 026 027 029 029 031 032 034 034 003 004 006 008 007 009 002 004 00A 029 002 029 034 010 032 011 029 037 002 040 002 042 005 029 034 002 003 002 025 023 040 033 057 042 024 059 045 052 030 040 002 030 002 002 016 033 043 024 025 016 026 027 017 012 028 029 018 05/15/78 Page I

[Page 279]

Symbol Tah1e fur: 0UUX;RABBIT 568 CADDAR ......... ..................

CADDDR __ CADDR .U CADR ...

CAR ....

CASET ...

CATCH ......

CATENATE ....

CCOMBINATION ..

CDAAAR ......

CDAADR ...._ CDAAR H.

CDADAR.._ CDADDR ._ COADR .H CDAR ....

CDOAAR ,_ CDDADR __ CDDAR _._ CODDAR .

CDDDDR ._ CDDDR .N CDDR .......

CDR .,..............,_...

CENV-ANALYZE .....,........

CENV-CCOMBINATION-ANALYZE H CENV-TRIV-ANALYZE .........

CHECK-COMBINATION-PEFFS ...

CHECK-NUMBER-OF-ARGS ...

CIF ._................

CL . ....,.......... ._ CLABELS ._ CLAMBDA N CLEANUP .....

CLOBBER .,.,...

CLOSE-ANALYZE __ CNAME .........

CNODE ..,....

CNODIFY .......

COMBINATION _._ COMFILE .....

COMP-BODY ,....

COMPILATE ..._......,.

COMPILATE-LOOP ,.... _...

COMPILATE-ONE-FUNCTION H COMPILE ..,....,......

COMPONENT-NAMES ..

CONDICATE ......

CONS ....,......,.

CONS-CLOSEREFS ...

CONSTANT .......

CONTINUATION ..

CONVERT ,..... _ CONVERT-ASET ......

CONVERT-CATCH ,...,..

CONVERT-COMBINATION ._ CONVERT-IF ..........

CONVERT-LABELS ....

CONVERT-LAMBDA-FM ...

COPY-CODE .,.,....

COPY-NODES ,...

CSEXPRFY ...

CSX ........

CVARIABLE ..

CXR .........

DECARCDRATE ...

DEFINE ......

SIDE EFFECTS ._ SIDE EFFECTS ._ SIDE EFFECTS __ SIDE EFFECTS ._ SIDE EFFECTS ._ DATA TYPE ....

DATA TYPE .....

MACLISP MACRO .__ DATA TYPE .....

SIDE EFFECTS ..

SIDE EFFECTS ., SIDE EFFECTS ._ SIDE EFFECTS ..

SIDE EFFECTS __ SIDE EFFECTS ....

SIDE EFFECTS ....

SIDE EFFECTS ._ SIDE EFFECTS ..

SIDE EFFECTS ._ SIDE EFFECTS ._ SIDE EFFECTS ._ SIDE EFFECTS ._ SIDE EFFECTS ._ SIDE EFFECTS ....

SCHEME FUNCTION __ SCHEME FUNCTION ._ SCHEME FUNCTION ._ SCHEME FUNCTION ._ SCHEME FUNCTION .... ...

DATA TYPE ........... ...

PDP-10 SCHEME MACRO DATA TYPE . ..... ,... ...

DATA TYPE ........

SCHEME FUNCTION ._ MACLISP MACRO ...

SCHEME FUNCTION _.

MACLISP MACRO ...

DATA TYPE .......

SCHEME FUNCTION ._ DATA TYPE .......

SCHEME FUNCTION _.

SCHEME FUNCTION ..

SCHEME FUNCTION ..

SCHEME FUNCTION _.

SCHEME FUNCTION ._ SCHEME FUNCTION ._ PROPERTY ........

SCHEME FUNCTION _.

SIDE EFFECTS .....

SCHEME FUNCTION ._ DATA TYPE .......

DATA TYPE ........

SCHEME FUNCTION ._ SCHEME FUNCTION ..

SCHEME FUNCTION ._ SCHEME FUNCTION ._ SCHEME FUNCTION _.

SCHEME FUNCTION ..

SCHEME FUNCTION ._ SCHEME FUNCTION ._ SCHEME FUNCTION ._ SCHEME FUNCTION ._ MACLISP MACRO ...

DATA TYPE .._._ SIDE EFFECTS ..._ SCHEME FUNCTION ._ EDEFINE ......._.

017 017 017 017 017 026 008 002 026 017 017 017 017 017 017 017 017 017 017 017 017 017 017 017 032 033 033 016 071 026 007 026 026 069 004 040 001 026 027 008 066 045 041 061 064 007 005 062 017 061 008 026 027 029 029 031 028 030 028 025 025 070 003 026 017 062 001 030 031 019 013 010 039 046 016 D47 032 033 020 034 035 021 014 036 037 022 038 039 023 015 011 018 028 004 002 D02 036 049 040 017 036 025 002 015 007 002 055 002 006 016 017 002 010 050 004 055 011 025 036 006 002 020 014 024 006 009 002 U07 036 029 015 040 015 062 05/15/75 Page II

[Page 280]

Symbol Table for: OUUX;RABBIT 568 DEFMAC .......... ................

UEFTYPE ._ UEFTYPE ._ DELPROP ...

OEPROGNIFY ...

DEPROGNIFY1 ....

DEPTH-ANALYZE ..

DISPLACE .....

OUMPIT ...

EFFDEF ...

QDEFINE .....

PDEFINE .......

MACLISP MACRO .

SCHEME FUNCTION MACLISP MACRO .

SCHEME FUNCTION SCHEME FUNCTION EXPR ..........

SCHEME FUNCTION QDEFINE .......

EFFOEF ......

EFFECTLESS ..

EFFECTLESS-EXCEPT-CONS ... ...

EFFS EFFS EFFS EFFS EFFS ANALYZE ANALYZE-COMBINATION ... ...

ANALYZE-INTERSECT .

UNION ..

EMPTY .......

ENV-ANALYZE .

ENVCARCOR ...

EO .

».¢¢¢¢¢¢» EOCASE ......

ERASE-ALL-NODES ERASE-NODE ..

ERASE-NODES .

FILTER~CLOSEREFS FIXP FLOATP ...

MACLISP MACRO .

SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION SIDE EFFECTS ._ MACLISP MACRO .

MACLISP MACRO .

MACLISP MACRO .

SCHEME SCHEME FUNCTION FUNCTION SIDE EFFECTS ..

SIDE EFFECTS »»»¢o» una FN ..... .

FN .............. ...

FN ................ .

FN-SIUE-AFFECTED ...

FN-SIDE-EFFECTS ... .

GEN-GLOBAL-NAME ... .

GENFLUSH ....... _ GENTEMP ._ _ HUNKFN ... .

HUNKFN ...

HUNKP ....

IF ........

INCREMENT ...

INTERSECT ... .

LABELS ....

LAMBDA ....

LAMBDACATE ._ LIST .........

LOOKUPICATE ._ MACRO ........

MACRO-EXPAND ...

MAKE-RETURN .... .

MAP-USER-NAMES ._ MAPANALYZE ............ .

FN-SIDE-EFFECTS FN-SIDE-AFFECTED .. ...

OKAY-TO-FOLD ._ PROPERTY ......

PROPERTY ......

SCHEME FUNCTION SCHEME FUNCTION SCHEME FUNCTION PDEFINE .......

MACLISP MACRO .

SIDE EFFECTS ..

DATA TYPE .....

MACLISP MACRO .

SCHEME FUNCTION DATA TYPE .....

DATA TYPE .....

SCHEME FUNCTION SIDE EFFECTS ..

SCHEME FUNCTION QDEFINE .......

SCHEME FUNCTION SCHEME FUNCTION SCHEME FUNCTION SCHEME FUNCTION ACCESS MEMO ........

META-COMBINATION-LAMBDA .. ...

META-COMBINATION-TRIVFN ._ ...

META EVALUATE META IF-FUDGE META SUBSTITUTE ... ...

NAME ........

NAME ........

NAME ._ SIDE EFFECTS ._ SCHEME SCHEME SCHEME SCHEME SCHEME FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION MACLISP MACRO .

MACRO ......

NAME .... ...

NODE ..... .H NODIFY ...

NOT .....

NULL ..

COMPONENT-NAMES SUPPRESSED-COMPONENT-NAMES DATA TYPE ........

SCHEME FUNCTION SIDE EFFECTS ... ...

SIDE EFFECTS ...

001 005 005 006 042 042 038 001 071 016 016 023 023 014 015 015 023 015 002 012 043 017 003 018 018 018 039 017 017 016 016 016 016 016 002 002 002 004 004 017 008 002 006 008 008 054 017 061 001 011 028 065 057 017 021 020 019 020 024 004 004 005 005 008 008 017 017 063 002 008 012 005 009 010 006 037 039 034 036 039 006 031 010 028 002 005 018 002 044 032 005 004 007 030 050 051 035 036 037 036 035 041 036 030 028 030 049 038 014 039 050 034 006 056 002 065 026 002 028 023 058 007 037 007 010 009 011 031 060 061 012 059 053 054 05/15/78 Page III

[Page 281]

Symbol Table for:

NUMBERP .........

OKAY-TO-FOLD ...

OUTPUT-ASET ._ PAIRLIS ....

PASS1-ANAL1ZE ...

PASSABLE .......

PRIN1 .... ...

PRINC .......

PRINT ........

PRINT-SHORT ......,.

OUUX;RABBIT 568 PRINT-WARNING .........

PROCESS-DEFINE-FORM PROCESS-DEFINITION PROCESS-FORM .......

PRODUCE-ASET ................

PRODUCE-COMBINATION ..........

PRODUCE-COMBINATION-VARIABLE .Q PRODUCE-CONTINUATION-RETURN ..

PRODUCE-IF ..................

PRODUCE-LABELS ..............

PRODUCE-LAMBDA-COMBINATION .H PRODUCE-RETURN ..............

SIDE EFFECTS ._ PROPERTY ......

SCHEME FUNCTION SCHEME FUNCTION SCHEME FUNCTION SCHEME FUNCTION SIDE EFFECTS ._ SIDE EFFECTS ._ SIDE EFFECTS ..

EXPR ..........

PRODUCE-RETURN-1 ......................

PRODUCE-TRIVFN-COMBINATION ............

PRODUCE-TRIVFN-COMBINATION-CONTINUATION PRODUCE-TRIVFN-COMBINATION-CVARIABLE PSETO-ARGS ............................

PSETO-ARGS-ENV ..

PSETO-TEMPS .....

PSETOIFY .........

PSETOIFY-METHOD-2 ._ PSETOIFY-METHOD-3 H READ .............

READIFY ......

REANALYZE1 ...

REFD-VARS ...

REGSLIST ._ REMARK-ON ...

REMOVE ......

RESET-STATS ._ RETURN ......

RPLACA ....

RPLACD ...

RPLACX .........

SCHMAC ..........

SET-UP-ASETVARS ._ SETDIFF ........

SETPROP ...

SEXPRFY ........

STATS ....................

SUBST-CANDIDATE ...........

SUPPRESSED-COMPONENT-NAMES ..

SX ................

SYMBOLP ._ TEMPLOC .....

TERPRI .......

TEST-COMPILE ...

TRANSDUCE .......

TRIV-ANALYZE .....

TRIV-ANALYZE-FN-P ._ TRIV-LAMBDACATE ..

TRIVFN ..........

TRIVIAL .....

TRIVIALIZE ..

TYI .......

TYO ...

TYPE ..

SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION SIDE EFFECTS ._ SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION DATA TYPE .....

SIDE EFFECTS ._ SIDE EFFECTS ..

SIDE EFFECTS ._ UDEFINE .......

SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION PROPERTY ......

MACLISP MACRO _ SIDE EFFECTS ..

SCHEME FUNCTION SIDE EFFECTS ._ SCHEME SCHEME SCHEME SCHEME SCHEME SCHEME FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION FUNCTION DATA TYPE .....

SCHEME FUNCTION SIDE EFFECTS ..

SIDE EFFECTS ......... ...

HUNK ACCESS MACRO .... ...

017 016 061 007 007 023 017 017 017 003 003 069 069 068 046 051 051 052 045 047 048 053 053 049 049 050 057 057 057 055 055 056 017 067 023 034 044 065 006 071 026 017 017 017 001 044 006 006 070 071 022 005 003 017 042 017 007 067 013 013 063 002 026 063 017 017 005 046 037 017 002 041 042 061 062 060 016 005 002 020 006 003 002 026 023 042 003 003 002 021 002 023 002 002 006 013 005 012 002 065 042 002 053 004 002 047 055 049 041 042 043 064 032 058 016 004 046 006 061 028 048 029 063 051 002 012 056 033 009 013 002 066 064 006 05/15/78 Page IV

[Page 282]

Symbol Table for: 0UUX;RABBIT 568 TYPEP ........ . ....... SIDE EFFECTS . .

UNION ......... . _ . SCHEME FUNCTION USED-TEMPLOCS _ _ . . . SCHEME FUNCTION VARIABLE ...... DATA TYPE WARN ......... . _ . MACLISP MACRO . . . _ . .

017 006 064 008 003 047 033 059 030 002 05/151 78 Page V