Anyone interested in this might get a kick out of the book “To Mock A Mockingbird”.
The first half of the book is an almost exhaustive exploration of self-referential riddles. Like “if the barber shaves everyone who doesn’t shave themselves, who shaves the barber?” and “if one gremlin always tells the truth and the other always lies, how can you ask them for directions?”
The second half of the book is a rigorous examination of combinatory logic through an analogy of birds who make calls depending on the calls of other birds they hear. It touches on function composition, recursion, fixed points, Gödel's theorems, and the Y combinator (the sage bird, which the book constructs from the mockingbird).
My only gripe with the book is that it stays completely within the bird metaphor, and doesn’t always explicitly state the underlying math concept.
If anyone wants this in a humorous 1 hour format, watch the legendary Jim Weirich (rip) explain it here: https://www.youtube.com/watch?v=FITJMJjASUs
Don't be discouraged that it is in Ruby as the concepts are completely general. Great watch even though I never coded a line of Ruby.
Is there a technique to remember this? I will understand it today and forget after a few weeks.
There's a way to remember smaller and smaller bits of it until you terminate at a piece small enough to remember permanently.
It's easier to remember if you use a more common notation:
Or express it in python, which is still a bit weird but probably still more readable than pure LC to pretty much everybody:
> (Y(f))(x) = f(f(f(...(x)...)))
I think that should be (Y(f))(x) = f(f(f(...)))(x)
Think of f as a function that takes a value and returns a value. Y takes a function and returns a function. Y(f) returns a new function that takes a value and applies f to it recursively, ultimately converging on a value (or diverging).
In your example, f(...) would have to return a function that is then applied to x.
I realize there are no non-function values in LC.
That misses the whole point, in that you cannot refer to Y inside of Y in an environment that doesn't have recursion.
Fair enuf
Yes.
Lambda calculus is about template substitution. It is the abstract mathematics of substituting templates into other templates. Alonzo Church got interested in this when he realized that it had numbers,
type NumF = <X>(f: (x: X) => X) => (x: X) => X
const zero: NumF = f => x => x
const one: NumF = f => x => f(x)
const two: NumF = f => x => f(f(x))
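Applying a numeral to a concrete step function and a base value counts the applications (toNumber here is an illustrative helper, not from the original):

const toNumber = (n: NumF): number => n((k: number) => k + 1)(0)
toNumber(two) // 2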
Addition is not too hard to define and multiplication is actually just function composition (both sketched after the next code block), but decrement is an unexpected motherf$@&er. To understand this last point, it may help to understand that this encoding of numbers is basically zero as [], one as [null], two as [null, null], etc., and you are trying to use Array.prototype.reduce to compute the array with one fewer element. You can do it with a (last, curr) pair or with a null initial value, but you have to know what those look like in the lambda calculus to translate,
// predecessor and successor
const succ = (n: NumF): NumF => f => x => f(n(f)(x))
// Maybe fns: in λ-calc you'd uncurry
type MaybeF<x> = <z>(ifNil: z, ifJust: (x: x) => z) => z
const nil: MaybeF<any> = (ifNil, _ifJust) => ifNil
function just<x>(x: x): MaybeF<x> {
  return (_ifNil, ifJust) => ifJust(x);
}
// pred(zero) is nil; pred(succ(n)) is just(n)
const pred = (n: NumF): MaybeF<NumF> =>
  n((maybe_n: MaybeF<NumF>) => maybe_n(just(zero), k => just(succ(k))))(nil)
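A quick sanity check of pred, plus the addition and multiplication sketches promised above (unwrap, add, and mul are names I'm making up here):

const unwrap = (m: MaybeF<NumF>): NumF => m(zero, n => n)
toNumber(unwrap(pred(two))) // 1

const add = (m: NumF, n: NumF): NumF => f => x => m(f)(n(f)(x))
const mul = (m: NumF, n: NumF): NumF => f => m(n(f))
toNumber(add(one, two)) // 3
toNumber(mul(two, two)) // 4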
Now you asked for a technique to remember the Y combinator. The basic thing to remember is the infinite template substitution discovered by Haskell Curry,
(f => f(f))(f => f(f))
Forgetting everything you know about programming languages, and just thinking about template substitution, this is a template on the left, being substituted with a value on the right, and the reason it's an infinite loop is that after the substitution you get the exact same expression that you had before the substitution. [On the right we could have also written g => g(g) if that helps, the two variables are defined in separate scopes and only happen to have the same name f to make it clearer that this will loop infinitely if you try to eagerly evaluate it.]
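To see the loop explicitly, here is one substitution step written out (an illustrative trace):

// (f => f(f))(f => f(f))
// substitute f := (f => f(f)) into the body f(f):
// (f => f(f))(f => f(f))
// which is the exact expression we started with, forever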
Terms like this were why Curry wanted a type theory. He immediately understood that this was kind of a weird expression, because if you tried to express the idea of f(f), it was secretly holding on to a sort of paradox of self-reference, or an infinity, or whatever you want to call it. It is a function type which takes only that same function type as an argument, and returns God-knows-what. If you just forbid self-reference in the typing judgments, you don't get this sort of weird f(f) pathology.
So if you are remembering f(f) as being somehow the key to the Y combinator, then you immediately see that you can kind of step it by deferring evaluation,
const Q = (f => () => f(f))(f => () => f(f))
So as a matter of algebra, Q() = Q, although a programming language struggles to prove equality between functions, so as a matter of programming the two might not compare equal. This is a function you can call as Q()()()() and it's just infinitely callable without ever producing any other value.
But why () =>, and why not x => ? Well, you can introduce an x, but Q will just ignore it; it doesn't know what to do with it. You need to write f(f)(x) to plumb it through to do something.
Note briefly that there IS a difference between f(f) and x => f(f)(x): the latter delays execution of f(f) until needed and so is still stepped, while the former is an infinite loop.
And the last thing is that you need to hand this internal function to someone else, some decider, to decide when to call the recursion. This leads to the full form,
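In JS, following the steps above, one way to write the full form (a sketch; g, fact, and recur below are my names, with g as the decider, which receives the recursive function):

const Y = g => (f => g(x => f(f)(x)))(f => g(x => f(f)(x)))

Usage with an example decider (fact here is an illustrative factorial):

const fact = Y(recur => n => (n <= 1 ? 1 : n * recur(n - 1)))
fact(5) // 120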
> To understand this last point, it may help to understand that this encoding of numbers is basically zero as [], one as [null], two as [null, null], etc
Isn't it rather nil, [nil], [[nil]], [[[nil]]] and so on?
No, although you do see that in set theory; this is a specific idea about the type signatures and encodings.
List type:
type Stack<x> = null | {first: x, rest: Stack<x>}
The Church encoding of a data structure is to go through all of the cases separated by |, define a handler for each one, and then define a function which accepts all those handlers and calls the appropriate one for its case:
type StackF<x> = <z>(
  ifNull: (n: null) => z,
  ifItem: (first: x, rest: z) => z
) => z;
The Scott encoding is actually a little more practical, and has `rest: StackF<x>` in that second handler, but that's not what we have here.
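For comparison, the Scott-encoded version would look something like this (StackS is a name I'm choosing; note the recursive reference):

type StackS<x> = <z>(
  ifNull: (n: null) => z,
  ifItem: (first: x, rest: StackS<x>) => z
) => z;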
Now the point is that for a unit type like `null`, which has only one possible value, that first handler can actually be simplified to just `ifNull: z`. Similarly, if we consider a list of nulls, we would get
type NullStackF = <z>(
  ifEnd: z,
  ifItem: (rest: z) => z
) => z;
And after swapping the arguments and currying, you get precisely the type of the Church numerals.
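Concretely, the two-element list of nulls comes out as (an illustrative value):

const twoNulls: NullStackF = (ifEnd, ifItem) => ifItem(ifItem(ifEnd))
// swap the arguments and curry, and you get f => x => f(f(x)), i.e. `two`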
The Church encoding of a list is effectively replacing that list with its reduce function, so that's why I am saying that it's basically Array.prototype.reduce on an Array<null> type. (I actually think it's reduceRight, but this type is symmetric under .reverse(), so who gives a darn.) So it's
// note the explicit seed x: [].reduce(f) with no seed throws in JS
const zero_ = f => x => [].reduce(f, x)
const one_ = f => x => [null].reduce(f, x)
const two_ = f => x => [null, null].reduce(f, x)
in JS. But if you had just Scott-encoded the numbers instead, you would have had no problems with pred(), though you would need the Y combinator to actually use the numerals for something interesting:
type ScottNum = <z>(next: (rest: ScottNum) => z) => (z: z) => z;
const zero: ScottNum = _f => x => x
const one: ScottNum = f => _x => f(zero)
const two: ScottNum = f => _x => f(one)
function succ(n: ScottNum): ScottNum {
  return f => _x => f(n)
}
function pred(n: ScottNum): ScottNum {
  return n((x: ScottNum) => x)(zero)
}
// convert a Scott numeral to its Church iterator; note the bare
// recursive call to church, which is exactly what the Y combinator
// has to supply in the pure lambda calculus
function church<z>(n: ScottNum, f: (z: z) => z): (z: z) => z {
  return z => n(m => f(church(m, f)(z)))(z)
}
// e.g. church(two, (s: string) => s + "!")("hi") === "hi!!"
Does Q stand for quine?
Yes! But it's not a true quine, because it returns “something functionally equivalent to myself,” not “my own source code.” So then I debated what else to call it, before deciding that I was already taking too long writing an HN comment and it didn't matter what it was called, haha. Well spotted.
Personally I don't have as much trouble wrapping my brain around the omega combinator as I do around the Y combinator, so I just try to remember that the Y combinator is an omega combinator with an extra f injected at the site of replication, if that makes sense.
Here's a write-up I did on deriving the Y combinator from Lawvere's fixed-point theorem: https://emarzion.github.io/Y-Comb/
From scratch, it's also easy to dress up a quoted Y combinator as in Löb's theorem. Start with the usual Y combinator
Y = \f. (\x.f(x x)) (\x.f(x x))
: (p -> p) -> p
which can produce a thing of type p from a function f : p -> p.
Now Löb's theorem states that []([]p -> p) -> []p, which can be read, "if you have the source code to transform (the source code of a thing of type p) to a thing of type p, then you can produce the source code for a thing of type p". Let's embellish Y using {} to denote the quoting comonad.
To get there, just add quotes as needed: f must be quoted, f's result must be quoted, and f takes a quoted input.
> you technically can't do name = term
But you can do `name = term; rest` by de-sugaring it to `(\name. rest) term` (where rest stands for the rest of the program that uses name).
This is not relevant since this de-sugaring cannot be performed if `name` is recursive (i.e., if `term` refers to `name`), which `Y` is. So the author's derivation is still required in the pure lambda calculus.
I'm a constructivist, but I still believe in proof by contradiction. In fact, I don't think you can believe in proof by contradiction without constructivism. How do you know you can take an arbitrary proof P, and show it leads to a contradiction, if you don't even know you can touch P to begin with?
Anyway, how I would construct a proof by contradiction is:
1. Suppose you want to know if there exists a proof P of length less than N.
2. Do the usual "proof by contradiction" with a placeholder P.
3. You can write a very short algorithm that then plugs in all 2^N possible proofs into your proof by contradiction algorithm.
4. And voila! You have a constructive proof.
At first I thought they were going to implement it in MIT Scratch.
that would admittedly be more impressive