Definitions

Evaluation order:
Reduce to:

Böhm's Theorem

Any two normal forms are either η-convertible or separable [1].

By separable, we mean that there exists a discriminator Δ such that for normal forms u and v, (Δ u) evaluates to (λt f. t) and (Δ v) evaluates to (λt f. f), the Church encodings of true and false.
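
To make this concrete, here is a minimal sketch in Python, encoding λ-terms directly as Python functions. The encoding, the names, and this particular Δ are my own illustration (not taken from [1] or [2]); it separates the normal forms I = λx.x and K = λx y. x, which are not η-convertible.

  # Two distinct normal forms to separate
  I = lambda x: x                 # λx. x
  K = lambda x: lambda y: x       # λx y. x

  # A discriminator for this pair: Δ M = M (λa b. b) (λz. z) (λw t f. t) (λt f. f)
  def delta(m):
      step = m(lambda a: lambda b: b)(lambda z: z)
      return step(lambda w: lambda t: lambda f: t)(lambda t: lambda f: f)

  # Δ I reduces to λt f. t (Church true), Δ K to λt f. f (Church false)
  assert delta(I)('yes')('no') == 'yes'
  assert delta(K)('yes')('no') == 'no'

This Δ was found by hand using exactly the Böhm-out steps described below.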


Not only does a Δ exist, but there is a constructive algorithm called the Böhm-out technique that gives us a concrete Δ! We will loosely follow [2].

Path Construction

The first step is to find a difference between the two normal forms, if one exists. For u = λx1...xm. (g a1...ap) and v = λy1...yn. (h b1...bq), we first η-expand the term with fewer binders until m = n, then α-rename y1...yn to x1...xn so that the two terms bind the same variables. These transformations preserve the semantics of u and v, so a difference path on the transformed terms is also a difference path on the originals. Now, they can differ in any of 3 ways:
  1. The head is different, g ≠ h
  2. The number of args is different, p ≠ q
  3. Or, the difference is nested somewhere in the ith arg, ai ≠ bi
The result of this search is a list of indices ci denoting a path through the c1th arg, then the c2th arg of that term, and so on. The path terminates with a difference in either the head var or the number of args. We construct Δ by induction on this path.
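
As a rough sketch of this search, here is a small Python version over a minimal normal-form datatype. The NF representation, find_path, and the simplifying assumption that the binders of u and v already agree at every level (i.e. that η-expansion and α-renaming have been done) are my own.

  from dataclasses import dataclass

  # A β-normal form always has the shape λx1...xn. (h a1...ap),
  # where h is a variable and each ai is again a normal form.
  @dataclass
  class NF:
      binders: list   # [x1, ..., xn]
      head: str       # h
      args: list      # [a1, ..., ap], each an NF

  def find_path(u, v):
      # Returns a difference path (a list of 1-based arg indices),
      # or None if the two terms are identical.
      if u.head != v.head or len(u.args) != len(v.args):
          return []                  # difference in the head or number of args
      for i, (a, b) in enumerate(zip(u.args, v.args), start=1):
          p = find_path(a, b)
          if p is not None:
              return [i] + p         # difference is nested in the ith arg
      return None

  u = NF(['x'], 'x', [NF(['a', 'b'], 'a', [])])   # λx. x (λa b. a)
  v = NF(['x'], 'x', [NF(['a', 'b'], 'b', [])])   # λx. x (λa b. b)
  assert find_path(u, v) == [1]                   # the heads differ inside arg 1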

Discriminator Construction

We construct Δ by induction on the difference path for normal forms u and v. As above, we assume in each case that u and v have already been η-expanded and α-renamed to bind the same variables.

Case: head difference.
Given terms u = λx1...xn. (xg a1...ap) and v = λx1...xn. (xh b1...bq), we know g ≠ h. Apply λx1...xp t f. t as the arg for xg, λx1...xq t f. f as the arg for xh, and any other term (e.g. λx.x) for the rest of x1...xn. Each selector discards the head's arguments (whatever they become under the substitution) and returns its chosen boolean, so applying these args to u yields (λt f. t) and applying them to v yields (λt f. f).
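
A tiny instance of this case, in the same functions-as-terms Python encoding as before (the terms and selector names are my own):

  u = lambda x: lambda y: x(y)   # λx y. (x y) — head x, applied to 1 arg
  v = lambda x: lambda y: y      # λx y. y     — head y, applied to 0 args

  sel_x = lambda a: lambda t: lambda f: t   # λx1 t f. t : drop x's one arg, return true
  sel_y = lambda t: lambda f: f             # λt f. f    : y has no args, return false

  assert u(sel_x)(sel_y)('yes')('no') == 'yes'   # u Böhms out to λt f. t
  assert v(sel_x)(sel_y)('yes')('no') == 'no'    # v Böhms out to λt f. f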

Case: number of args difference.
Given terms u = λx1...xn. (xh a1...ap) and v = λx1...xn. (xh b1...bq), we know p ≠ q. Without loss of generality, assume p < q. By applying λy1...yq+1. yq+1 as the arg for xh and λx.x for the rest, we get
(λx1...xh...xn. (xh a1...ap)) (λx.x)...(λy1...yq+1. yq+1)...(λx.x)
  →β (λy1...yq+1. yq+1) a1[x1↦λx.x,...]...ap[x1↦λx.x,...]
  →β λyp+1...yq+1. yq+1
  =α λz1...zq-p+1. zq-p+1
and
(λx1...xh...xn. (xh b1...bq)) (λx.x)...(λy1...yq+1. yq+1)...(λx.x)
  →β (λy1...yq+1. yq+1) b1[x1↦λx.x,...]...bq[x1↦λx.x,...]
  →β λyq+1. yq+1
  =α λz1. z1
  =η λz1...zq-p+1. (z1 z2...zq-p+1)
which is a head difference: since p < q, the residue of u has head zq-p+1 while the residue of v has head z1, and these are distinct variables. So we can conclude by using the first case above!
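
Here is a hand-worked instance of this case in the same Python encoding (the terms and the follow-up selectors are my own): u = λx. x and v = λx. (x x) share the head x but apply it to 0 and 1 args respectively.

  u = lambda x: x        # λx. x     — head x, p = 0 args
  v = lambda x: x(x)     # λx. (x x) — head x, q = 1 arg

  S = lambda y1: lambda y2: y2   # λy1...yq+1. yq+1 with q = 1, i.e. λy1 y2. y2

  # u S reduces to λy1 y2. y2 (head is the 2nd binder), while v S reduces to
  # λy2. y2, which η-expands to λz1 z2. (z1 z2) (head is the 1st binder).
  # That is a head difference, finished off with the selectors of the first case:
  A = lambda a: lambda t: lambda f: f   # fed to the 1st residual binder
  B = lambda t: lambda f: t             # fed to the 2nd residual binder

  assert u(S)(A)(B)('yes')('no') == 'yes'
  assert v(S)(A)(B)('yes')('no') == 'no'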

Case: difference in ith arg.
Given terms u = λx1...xn. (xh a1...ai-1 b ai+1...ap) and v = λx1...xn. (xh a1...ai-1 c ai+1...ap), the difference path continues with a path P to a difference between b and c. If xh does not occur at the head along P except here, then this is easy! We simply apply λy1...yp. yi as the arg for the head var xh, and recursively apply this process with P and the terms λx1...xh-1 xh+1...xn. b and λx1...xh-1 xh+1...xn. c.
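
A minimal instance of this easy sub-case, in the same Python encoding (terms and names mine): the head x occurs only once, so projecting its 1st arg keeps the difference.

  u = lambda x: x(lambda a: lambda b: a)   # λx. x (λa b. a)
  v = lambda x: x(lambda a: lambda b: b)   # λx. x (λa b. b) — they differ inside arg 1

  proj1 = lambda y1: y1                    # λy1. y1 : make the head x return its 1st arg

  # After x := proj1, we are left separating λa b. a from λa b. b,
  # which is a head difference handled exactly as in the first case.
  T = lambda t: lambda f: t
  F = lambda t: lambda f: f

  assert u(proj1)(T)(F)('yes')('no') == 'yes'
  assert v(proj1)(T)(F)('yes')('no') == 'no'
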
Things are a little more complicated if xh occurs at the head several times along the path. Right now the difference is buried inside the ith arg of xh, but somewhere inside that arg there may be another term with xh at the head whose difference lies in an arg other than the ith. So if we make xh select only its ith arg, we could lose the difference later on. To avoid this, first determine the maximum number of args M applied to any head occurrence of xh along P. Then, η-expand all head occurrences of xh in u and v along P until all have exactly M args. Then, use (λx1...xM+1. (xM+1 x1...xM)) as the argument for xh, so that every such occurrence
(xh a1...aM)
  →β (λx1...xM+1. (xM+1 x1...xM)) a1...aM
  →β λxM+1. (xM+1 a1...aM)
which introduces a new, distinct head variable for each head occurrence of xh. We can then proceed with the simpler case above, because each new head occurs only once along the path.
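
The effect of this tupling argument is easy to see in the same Python encoding. With M = 2, the permutator λx1 x2 x3. (x3 x1 x2) turns an application (xh a1 a2) into a "pair" of a1 and a2 under a fresh head, so no argument is thrown away yet, and each occurrence can be projected differently later. This small demo and its names are my own.

  # Böhm permutator for M = 2: λx1 x2 x3. (x3 x1 x2)
  P2 = lambda x1: lambda x2: lambda x3: x3(x1)(x2)

  # (xh a1 a2) with xh := P2 reduces to λx3. (x3 a1 a2): both args survive,
  # and the fresh head x3 can be instantiated with a projection later.
  pair = P2('a1')('a2')
  assert pair(lambda a: lambda b: a) == 'a1'   # project the 1st component here...
  assert pair(lambda a: lambda b: b) == 'a2'   # ...or the 2nd at another occurrence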


References
[1] G. Huet. An Analysis of Böhm’s Theorem. 2008.
[2] S. Guerrini, A. Piperno, M. Dezani-Ciancaglini. Böhm's Theorem. 2015.