repak shawahb
a.k.a. gorgeous gergis

^

   

rsw@jfet.org


blogroll

       
Sun, 14 Jun 2009

yay!(cd)

or, Yet Another Y-Combinator Derivation

In the comp.lang.ml FAQ one of the questions asks about the Y-combinator; the answer is provided without derivation or explanation of the subtle datatype backflip you need to use. So we'll derive it together here.

We've all seen the standard Y-combinator derivation in Scheme, right? Y is a fixed-point combinator which represents, in effect, the distilled essence of recursion. To sketch Gabriel's argument (in Scheme first), we start with an example recursive function—factorial, of course:

(define fact (lambda (n) (if (= n 0)
                             1
                             (* n (fact (- n 1))))))

> (fact 5)
120
>

Now, what we want to do is define fact without using fact—so what if we change it so that we can pass the function in that we'll call recursively?

(define f (lambda (fz n) (if (= n 0)
                             1
			     (* n (fz fz (- n 1))))))

But what function do we pass in as fz? f itself, of course—it'll happily keep passing itself all the way down to the base case, and what we get out is factorial.

> (f f 5)
120
>

Now, for reasons which will become clear in a second here, we will curry this function, i.e., turn one function of two arguments into two nested functions taking one argument each:

(define ff (lambda (fz) (lambda (n) (if (= n 0)
                                        1
					(* n ((fz fz) (- n 1)))))))
> ((ff ff) 5)
120
>

See? Basically the same thing, but now ff takes one argument—another function—and returns a function that takes a number and returns its factorial. But we're not done yet: we started with (fact n) and now we have ((ff ff) 5). Well, let's start by fixing the recursive call, recognizing that we can hide the (fz fz) inside lambda:

(define ff (lambda (fz) (lambda (n) (if (= n 0)
                                        1
					(* n ((lambda (n) ((fz fz) n)) (- n 1)))))))

Well, that lambda in there is ugly, but we can always just pull it out and then pass it in via a variable:

(define ff (lambda (fz)
                   ((lambda (z)
		            (lambda (n) (if (= n 0)
			                    1
					    (* n (z (- n 1))))))
		    (lambda (n) ((fz fz) n)))))
> ((ff ff) 5)
120
>

It doesn't look like we're getting any better in terms of recovering the original (fact 5) syntax, but I assure you we're getting closer. Recognize that we could recover the original behavior by just defining a new function fff like this:

(define fff (lambda (n) ((ff ff) n)))

Yup, we've seen this trick before. But let's expand out the (ff ff) so we can do just one definition:

(define fff ((lambda (fz)
                     ((lambda (z)
		              (lambda (n) (if (= n 0)
			                      1
					      (* n (z (- n 1))))))
		      (lambda (n) ((fz fz) n))))
             (lambda (fz)
	             ((lambda (z)
		              (lambda (n) (if (= n 0)
			                      1
					      (* n (z (- n 1))))))
		      (lambda (n) ((fz fz) n))))))
> (fff 5)
120
>

And we've done it. Nothing but arithmetic operations and passed-in function names for defining a recursive operation. But we can generalize this by realizing that we can separate out the "guts" of factorial relatively easily:

(define Fguts (lambda (z)
                      (lambda (n) (if (= n 0)
		                      1
				      (* n (z (- n 1)))))))

This is a function that requires us to pass in the recursive continuation, so it is not itself recursive, and even better, now that we've defined it we can pull Fguts out of fff and make it somewhat prettier:

(define f3 ((lambda (fz) (Fguts (lambda (n) ((fz fz) n))))
            (lambda (fz) (Fguts (lambda (n) ((fz fz) n))))))

This is identical to fff, but we've folded the mess into Fguts. Note that we are still using (lambda (n) ((fz fz) n)) just as before—because this was really the first glimmer of the prize. Consider adding one more layer of lambda where we completely generalize by passing in Fguts or any other "guts"-style function:

(define Y (lambda (X)
                  ((lambda (fz) (X (lambda (n) ((fz fz) n))))
		   (lambda (fz) (X (lambda (n) ((fz fz) n)))))))

And we've arrived at the applicative-order Y-combinator. Taa daaaaaaa!

> ((Y Fguts) 5)
120
>

So could we do the same thing in Standard ML? Well... kind of. The type system gets kind of bitchy if we try to follow exactly the same derivation:

- fun f fz 0 = 1
=   | f fz n = (n * (fz fz (n-1)));
stdIn:3.20-3.31 Error: operator is not a function [circularity]
  operator: 'Z
  in expression:
    fz fz

Many ML programmers are unaware that there is a trick to deriving the Y-combinator in the same wy as above; we simply have to define a recursive datatype so that the type checking unification function doesn't complain about circular substitutions:

datatype 'a t = T of 'a t -> 'a

Now we can start at the top (almost; we'll take advantage of SML's automatic currying for now):

fun ff (T fz) 0 = 1
  | ff (T fz) n = (n * (fz (T fz) (n-1)))

- ff (T ff) 5;
val it = 120 : int

The above step is conceptually the most important and difficult in terms of getting Hindley-Milner on our side. ff has type (int -> int) t -> int -> int, meaning that we will need to keep the same fz (T fz) structure throughout the rest of this derivation. Without it, the type inference system will shun our circular definitions.

Continuing the same argument as before, we wrap the ugly fz (T fz) inside a lambda, then pull it out as an argument (with a little acrobatics because we can't take advantage of automatic currying for this step):

fun ff (T fz) 0 = 1
  | ff (T fz) n = (n * ((fn n => (fz (T fz) n)) (n-1)))

(* now pulling the new "inner function" out *)

fun ff (T fz) = ( ( fn z => ( fn 0 => 1
                               | n => (n * (z (n-1)))
		            ) )
                  ( fn n => (fz (T fz) n) ) )

Now we're ready for fff, and ditching the fun syntactic sugar completely.

(* fun fff n = (ff (T ff) n) *)

val fff = ((fn (T fz) => ((fn z => (fn 0 => 1
                                     | n => (n * (z (n-1)))))
                          (fn n => (fz (T fz) n))))
           (T 
	   (fn (T fz) => ((fn z => (fn 0 => 1
                                     | n => (n * (z (n-1)))))
                          (fn n => (fz (T fz) n))))))

We're close now! Time for Fguts (returning for the sake of brevity to automatic currying) and then f3:

fun Fguts z 0 = 1
  | Fguts z n = (n * (z (n - 1)))

val f3 = ((fn (T fz) => (Fguts (fn n => (fz (T fz) n))))
          (T
	  (fn (T fz) => (Fguts (fn n => (fz (T fz) n))))))

So close we can taste it. Abstracting Fguts gives us the Y combinator once more:

val Y = fn X => ((fn (T fz) => (X (fn n => (fz (T fz) n))))
                 (T
		 (fn (T fz) => (X (fn n => (fz (T fz) n))))))

Thus sayeth the interpreter:

val Y = fn : (('a -> 'b) -> 'a -> 'b) -> 'a -> 'b
- (Y Fguts);
val it = fn : int -> int
- (Y Fguts) 5;
val it = 120 : int

By the way, there is a ridiculously simple way to define precisely the same Y-combinator in SML:

fun Y f n = f (Y f) n

(* or, using the built in compose operator "o" *)
fun Y f n = (f o Y) f n

Of course, both of these "cheat" because they're explicitly recursive.


[ permalink | 0 comments (add one you lazy bastard!) ]