[SICP 6] Notes on SICP Chapter 1.1 “The Elements of Programming” - Sections 1.1.3, 1.1.4, 1.1.5

1.1.3 Evaluating Combinations

enter rules of expression evaluation

Now that we established primitive and compound expressions and naming things via special form define it’s time to understand how they are evaluated. (Also note that, names of built-in procedures have no special treatment, they are just predefined). Since expressions can be nested their evaluation has a recursive nature. Book says:

  1. Evaluate the sub-expressions of the combination.
  2. Apply the procedure that is the value of the leftmost sub-expression (the operator) to the arguments that are the values of the other sub-expressions (operands).

Evaluating an combination requires evaluation of each element in the combination. (this is where combinations differ from special forms).

The elements of a combination (sub-expressions) can be primitives or can be themselves a combination. Primitives evaluate easily to their values: Literals are evaluated to their meanings, say, digits 10 is evaluated to number 10, and variable names are evaluated to their associated object. Recursive means that application of the rule invokes itself.

The book gives following combination expression, E1, as an example (* (+ 2 (* 4 6)) (+ 3 5 7))

  • First apply rule 1 to E1
    • * evaluates to the built-in multiplication procedure. (if you evaluate this symbol in interpreter MIT Scheme says #[arity-dispatched-procedure 12], Racket says #<procedure:*>)
    • E2, (+ 2 (* 4 6)), itself is a combination hence its evaluation requires application of rule one to it
      • + evaluates to built-in addition procedure (#[arity-dispatched-procedure 13] in MIT Scheme, and #<procedure:+> in Racket)
      • Symbol, literal 2 evaluates to number 2
      • E3 (* 4 6) revokes rule-1 to be evaluated
        • * → multiplication procedure
        • 4 → number 4
        • 6 → number 6
        • Now, all sub-expressions of E3 are evaluated, we can revoke rule-2.
        • Apply procedure multiplication to arguments 4 and 6 → 24
      • Now, all sub-expressions of E2 are evaluated. Revoke rule-2
      • Apply procedure addition to arguments 2 and 24 → 26
    • E4 (+ 3 5 7) is a combination. Apply rule-1 to evaluate all sub-expressions.
      • All of them are primitives. Apply addition to arguments 3, 5 and 7 → 15
  • Apply multiplication to arguments 26 and 15 → 390.

In order to emphasize the tree structure of recursive rule application we can re-indent the expression in following format, where I indicated evaluations of combinations via comments.

(* ; 390
 (+ ; 26
  2
  (* ; 24
   4
   6))
 (+ ; 15
  3
  5
  7))

In this tree, each combination is a node with branches where left-most branch is an operator and the rest are operands. We do not treat the operator branch as a special case. When coming from an imperative background it might make more sense to put the operator as the label of a node and put the operands as branches, however, because the first element itself can be a combination that evaluates into a procedure it is not treated differently. (We did not see that use-case yet.)

Terminal nodes (leaves) are primitive expressions (numbers or operators). “The values of the operands percolate upward, starting from the terminal nodes and then combining at higher and higher levels”. In general, recursion is an essential technique to deal with hierarchical, tree-like objects where this percolation process is called tree accumulation.

Also observe that how an expression naturally is a tree. Lisp syntax makes Abstract Syntax Trees redundant. The syntax itself is a tree!

When evaluating primitive expressions

  • literals evaluate to corresponding object, i.e. numerals evaluate to corresponding number, letters inside double quotes evaluate to string made of that characters.
  • names evaluate to associated objects.
    • sometimes these names already exist in the global environment. Those are called built-ins, such as + and *
    • and sometimes they are defined by the user.

From the perspective of the Scheme interpreter built-ins and user-defined symbols are not different. Any expression requires an environment “in determining the meaning of symbols in expressions”. It’s totally possible to have a “funny” environment where + means * and vice-versa.

(define + *)
(+ 10 10)
=> 100 ; trololo

As a side trolling, I recommend 2+2=5 in Java - YouTube short video from Lex Fridman’s channel about how to change the meaning of number literal 4 to make 2 + 2 to evaluate to 5 in Java. Apparently numbers smaller than 256 are already evaluated and stored in an array and it’s possible to manipulate that array. Great resource for disgruntled employees. :-P In Java, it’s not as advanced as replacing the procedure that symbol + is associated with, but still, this discussion reminded me of that.

The rule for evaluation of combinations does not apply to define. (define x 1) does not evaluate x first (can’t evaluate before having a definition!), nor it does apply define operator to x and 1. The combination with define is an exception to the general combination rule. These exceptions are called special forms, a compound expression that is not a combination.

“Each special form has its own evaluation rule. The various kinds of expressions (each with its associated evaluation rule) constitute the syntax of the programming language”. Compared to other languages Lisp’s syntax is very simple!

Parentheses always denote combinations expect special forms.

Definitions are not expressions and there are restrictions where they can be written. Any expression can appear as the operator of a combination or in the consequent of a conditional, but a definition cannot.

1.1.4 Compound Procedures

enter define for procedures

Section 1.1.2 was about the simple level of abstraction where we associate a name with the value of an expression. This section is a deeper abstraction for constructing user-defined procedures, called procedure definitions.

Procedures used so far were already built-in in the interpreter, which are also called primitive procedures. Using procedure definitions a compound operation can be treated as a single unit, called a compound procedure (a user-defined procedure) of which implementation is abstracted away. Its syntax is a follows:

(define (<name> <formal parameters>)
  <body>)

Authors use squaring as example. square is already built-in in MIT Scheme, however, it’s possible for the user to redefine it. Definition for square procedure is as follows:

(define (square x)    (*       x       x)))
 |       |      |      |       |       |
To  square something, multiply it by itself

This is a compound procedure named square. x a.k.a “it” is a local name, that can be thought as a “pronoun” in a natural language. Evaluation of this definition creates the compound procedure, stores it in global environment with its associated name <name>. “<formal parameters> are the names used within the body of the procedure to refer to the corresponding arguments of the [applied] procedure.” Their associations only exist local to the procedure definition and do not leak to the global environment. (The importance of this locality will be taught soon).

“The <body> is an expression that will yield the value of the procedure application when the formal parameters are replaced by the actual arguments to which the procedure is applied”. However, note that the <body> section does not need to be a single expression, it can be a sequence of multiple expressions, and the yielded value will be the value of the last one. This can be used to introduce side-effects. (First side-effect will happen in an exercise to print values.)

Later we’ll see that procedure definition special form is a “syntactic sugar” (a term coined by Peter Landin, a designer of ALGOL) for definitions with lambda functions, i.e. (define (square x) (* x x)) is actually (define square (lambda (x) (* x x)) but since students need to know lambda functions, pedagogically they go with this second special case before explaining that it is not actually a special-case. Also probably procedure definitions is so common that they deserve their own special form.

An example procedure definition that rely on formerly user-defined procedures is (define (sum-of-squares x y) (+ (square x) (square y))) This one takes two arguments. Though, the usage of compound procedures is indistinguishable from primitive procedures (built-in ones). By looking at the definition of sum-of-squares it is not possible to tell whether square is built-in or not.

We also saw that it is possible to provide arbitrary number of arguments to a procedure (such as +) however, the syntax defining such a procedure will be taught later.

1.1.5 The Substitution Model for Procedure Application

This was the most educative section for me. The exact details of the mechanism of how an expression with a procedure application is handled, i.e. rule-2 of evaluating combinations. (rule-1 was elaborated in previous section.) Scheme users can assume that “the mechanism for applying primitive procedures to arguments is built into the interpreter”. For compound procedures

To apply a compound procedure to arguments, evaluate the body of the procedure with each formal parameter replaced by corresponding argument.

Let’s follow an application of sum-of-squares.

  • (sum-of-squares 3 4) a combination → apply rule-1: evaluate sub-expressions
    • sum-of-expression evaluates to the procedure defined above.
    • numerals evaluate to numbers.
    • Now that everything is evaluated, apply the procedure (as defined previously) to arguments
  • (+ (square 3) (square 4)) again a combination → evaluate sub-expressions
    • + primitive evaluates to built-in addition procedure
    • Next is to evaluate operands. Which one will be evaluated first? Well… “order of evaluation of sub-expressions are not specified in Scheme.” It could be random :-), left-to-right etc. If none of the sub-expressions have a side-effect the end result should be the same whatever the run-time order is. Different interpreters will choose different orders.
    • Let’s evaluate from right to left.
    • (square 4) a combination.
      • square evaluates to previously defined procedure
      • numeral 4 is number 4
      • apply square to argument 4
      • (* 4 4) combination
        • * → built-in multiplication. 4’s are number 4’s.
        • apply → 16
  • (+ (square 3) 16)
    • similarly (square 3) evaluates to 9
  • (+ 9 16) → combination
    • + → built-in addition. numerals → numbers
  • 25

When studying procedure application, one can take short-cuts such as evaluating each sub-expression in parallel (do (square 3) and (square 4) in the same step. Also, when it’s obvious to which procedure/value primitives evaluates to omit that explanation. So, a more succinct study could be:

(sum-of-squares 3 4)
(+ (square 3) (square 4))
(+ 9 16)
25

Also, when run in interpreter, can see that every line above separately evaluates to 25.

I’ve tried following compound procedure call with side effects in MIT Scheme and Racket.

(define (display-and-return x)
  (display x)
  x)
(define (check-application-order x y z)
  (+ (display-and-return x)
     (display-and-return y)
     (display-and-return z)))

Looks like their implementations are different! Former printed 3 2 1 where latter printed 1 2 3. Apparently, MIT Scheme evaluates sub-expressions from right-to-left whereas Racket does it from left-to-right. Hope this preference in evaluation order won’t cause a civil war among Lilliputians.

Instructor’s manual wants special cases to be demonstrated explicitly such as

(define five-val 5)
(define (five-func) 5)
  • five-val directly evaluates to value 5 due to name association
  • whereas (five-func) is a combination and rules for combination evaluation and procedure application (without arguments) are needed. “We evaluate the sub-expression five appearing in the operator position. This is a name, so its value is the procedure that was associated with the name by the above define. We then apply this procedure to no arguments. To apply the procedure, we evaluate its body (after “substituting” the non-existent arguments for the non-existent formal parameters) and obtain the value 5.”

This process we follow to evaluate procedure calls is called substitution model for procedure application. SICP emphasizes that this is not necessarily how interpreters compute under the hood. It’s a model for programmers to help them thinking about procedure application.

Real interpreters do not do text/symbol manipulation, like we did by replacing parts of an expressions with another form according to “substitution rules”. We are thinking more in terms of “formal system” paradigm. Inside the interpreter the substitution is done using a “local environment” which will be explained in later chapters when they teach how to implement interpreters. (can’t wait!) Authors warn us about the fact that “substitution model” breaks down when dealing with mutable data, and has to be replaced with a more complex and better fitting model.

Applicative order versus normal order

Another juicy sub-section! The evaluation rules given in 1.1.3 ask for first to “evaluate the operator and operands, then apply the resulted procedure to the resulting arguments”. However, this is not the necessary or only way for evaluating nested expressions.

An alternative evaluation model could delay the evaluation of operands until it is definitely necessary, i.e. until hitting a primitive procedure as the operator. The evaluator can replace the compound procedures with their definition bodies and substitute its formal parameters with the sub-expressions for the operands without evaluating them.

In this case (sum-of-squares (+ 1 2) (+ 1 3)) won’t become (+ (square 3) (square 4) where both operator and operands were evaluated, but becomes (+ (square (+ 1 2)) (square (+ 1 3))). Remember that the definition of sum-of-squares was (define (sum-of-squares x y) (+ (square x) (square y))), and this step replaces starting expression with the body of definition and substitutes x(+ 1 2) and y(+ 1 3). See that + in operands in the sum-of-squares call are not evaluated yet.

Now, the operator of this expression, +, is primitive (i.e. it’s a single step for the interpreter. There is no explicit definition body to be substituted), we have to evaluate the operands, which are square procedures

(+ (* (+ 1 2) (+ 1 2)) (* (+ 1 3) (+ 1 3)))

Now that all operators are primitive procedures, it’s time to do computations.

(+ (* 3 3)) (* 4 4))
(+ 9 16)
25

The result is, of course, the same. However, note that, in this model (+ 1 2) and (+ 1 3) expressions appeared and computed twice.

“This alternative ‘fully expand and then reduce’ model is known as normal-order evaluation, in contrast to the ‘evaluate the arguments and then apply” method that the interpreter actually uses, which is called applicative-order evaluation.”

For procedure applications that can be modeled with substitution model (procedures of first two chapters) these two order evaluation approaches result in the same value. Exercise 1.5 is an example where these approaches give different results.

One of the reasons why Lisp uses applicative-order is because it prevents duplicate computation which makes it more efficient. Authors warn us about the difficulty of normal-order evaluation for the cases where substitution model breaks down. (This make me really interested about learning when and how the model becomes inadequate.) Looks like normal-order evaluation is useful in lazy-evaluation and generator stuff that deals with stream processing.

published at: 2020-09-07 14:13 UTC-5
tags: