Y combinator discussion in “The Little Schemer”

Question

Great question. For the benefit of those without a functioning DrRacket installation ~~(myself included)~~ I’ll try to answer it.

First, let’s use some sane (short) variable names, easily trackable by a human eye/mind:

((lambda (h)     ; A.   
     (h h))            ; apply h to h
 (lambda (g)
     (lambda (lst)
       (if (null? lst) 0
         (add1 
               ((g g) (cdr lst)))))))

The first lambda term is what’s known as little omega, or U combinator. When applied to something, it causes that term’s self-application. Thus the above is equivalent to

(let ((h (lambda (g)
           (lambda (lst)
             (if (null? lst) 0
               (add1 ((g g) (cdr lst))))))))
  (h h))

When h is applied to h, new binding is formed:

(let ((h (lambda (g)
           (lambda (lst)
             (if (null? lst) 0
               (add1 ((g g) (cdr lst))))))))
  (let ((g h))
    (lambda (lst)
            (if (null? lst) 0
              (add1 ((g g) (cdr lst)))))))

Now there’s nothing to apply anymore, so the inner lambda form is returned — along with the hidden linkages to the environment frames (i.e. those let bindings) up above it.

This pairing of a lambda expression with its defining environment is known as closure. To the outside world it is just another function of one parameter, lst. No more reduction steps left to perform there at the moment.

Now, when that closure — our list-length function — will be called, execution will eventually get to the point of (g g) self-application, and the same reduction steps as outlined above will again be performed (recalculating the same closure). But not earlier.

Now, the authors of that book want to arrive at the Y combinator, so they apply some code transformations to the first expression, to somehow arrange for that self-application (g g) to be performed automatically — so we may write the recursive function application in the normal manner, (f x), instead of having to write it as ((g g) x) for all recursive calls:

((lambda (h)     ; B.
     (h h))            ; apply h to h
 (lambda (g)
   ((lambda (f)           ; 'f' to become bound to '(g g)',
      (lambda (lst) 
        (if (null? lst) 0
          (add1 (f (cdr lst))))))  ; here: (f x) instead of ((g g) x)!
    (g g))))                       ; (this is not quite right)

Now after few reduction steps we arrive at

(let ((h (lambda (g)
           ((lambda (f)    
              (lambda (lst) 
                (if (null? lst) 0
                  (add1 (f (cdr lst))))))
            (g g)))))
  (let ((g h))
    ((lambda (f)
       (lambda (lst)
            (if (null? lst) 0
              (add1 (f (cdr lst))))))
     (g g))))

which is equivalent to

(let ((h (lambda (g)
           ((lambda (f)    
              (lambda (lst) 
                (if (null? lst) 0
                  (add1 (f (cdr lst))))))
            (g g)))))
  (let ((g h))
    (let ((f (g g)))           ; problem! (under applicative-order evaluation)
       (lambda (lst)
            (if (null? lst) 0
              (add1 (f (cdr lst))))))))

And here comes trouble: the self-application of (g g) is performed too early, before that inner lambda can be even returned, as a closure, to the run-time system. We only want it to be reduced when the execution gets to that point inside the lambda expression, after the closure was called. To have it reduced before the closure is even created is ridiculous. A subtle error. 🙂

Of course, since g is bound to h, (g g) is reduced to (h h) and we’re back again where we started, applying h to h. Looping.

Of course the authors are aware of this. They want us to understand it too.

So the culprit is simple — it is the applicative order of evaluation: evaluating the argument before the binding is formed of the function’s formal parameter and its argument’s value.

That code transformation wasn’t quite right, then. It would’ve worked under normal order where arguments aren’t evaluated in advance.

This is remedied easily enough by “eta-expansion”, which delays the application until the actual call point: (lambda (x) ((g g) x)) actually says: “will call ((g g) x) when called upon with an argument of x“.

And this is actually what that code transformation should have been in the first place:

((lambda (h)     ; C.
     (h h))            ; apply h to h
 (lambda (g)
   ((lambda (f)           ; 'f' to become bound to '(lambda (x) ((g g) x))',
      (lambda (lst) 
        (if (null? lst) 0
          (add1 (f (cdr lst))))))  ; here: (f x) instead of ((g g) x)
    (lambda (x) ((g g) x)))))

Now that next reduction step can be performed:

(let ((h (lambda (g)
           ((lambda (f)    
              (lambda (lst) 
                (if (null? lst) 0
                  (add1 (f (cdr lst))))))
            (lambda (x) ((g g) x))))))
  (let ((g h))
    (let ((f (lambda (x) ((g g) x))))       ; here it's OK
      (lambda (lst)
            (if (null? lst) 0
              (add1 (f (cdr lst))))))))

and the closure (lambda (lst) ...) is formed and returned without a problem, and when (f (cdr lst)) is called (inside the closure) it is reduced to ((g g) (cdr lst)) just as we wanted it to be.

Lastly, we notice that (lambda (f) (lambda (lst ...)) expression in C. doesn’t depend on any of the h and g. So we can take it out, make it an argument, and be left with … the Y combinator:

( ( (lambda (rec)            ; D.
      ( (lambda (h) (h h))  
        (lambda (g)
          (rec  (lambda (x) ((g g) x))))))   ; applicative-order Y combinator
    (lambda (f)
        (lambda (lst) 
          (if (null? lst) 0
            (add1 (f (cdr lst)))))) )
  (list 1 2 3) )                            ; ==> 3

So now, calling Y on a function is equivalent to making a recursive definition out of it:

( y (lambda (f) (lambda (x) .... (f x) .... )) ) 
===  define f = (lambda (x) .... (f x) .... )

… but using letrec (or named let) is better — more efficient, defining the closure in self-referential environment frame. The whole Y thing is a theoretical exercise for the systems where that is not possible — i.e. where it is not possible to name things, to create bindings with names “pointing” to things, referring to things.

Incidentally, the ability to point to things is what distinguishes the higher primates from the rest of the animal kingdom ⁄ living creatures, or so I hear. 🙂

Leave a Comment Cancel reply