Designing a Theorem Prover, Part 6

Unifier Link to heading

As we saw in Part 3, unification is the process of finding substitutions for variables in two terms \(t\) and \(u\) to make equation \(t = u\) true.

The key assumptions:

Distinctness.
All functions and constants in the terms are distinct (unless explicitly stated otherwise).
Equality rules.
The only valid equalities are:
- Reflexivity (\(t = t\)): Any term is always equal to itself.
- Substitution of equals for equals: if two terms \(a\) and \(b\) are equal (\(a = b\)), then one can replace the other.

Basically we need to solve a set of equations \(\{t_1 = u_1, \dots, t_n = u_n\}\).
Each term can be a function application, a parameter, or a meta-variable:

If one term is a parameter, the other must be the same parameter.
If one term is a function application, the other must be an application of the same function with the same number of arguments.

An equation is solved when it is simplified to \(meta–variable = term\).

Let’s take a look at different cases:

\(?a = t\) (or \(t = ?a\)):
- If \(t\) is the same as \(?a\) then we just ignore the equation
- Otherwise we need to check if meta-variable \(?a\) appears in \(t\) (do the occurs check)
  - If \(?a\) is found in \(t\) then this equation has no solution (meta-variable can not appear in its own definition)
  - Otherwise it is solved by \(?a \mapsto t\)
\(?a = ?b\): Either variable can be substituted with the other

Once a solution is found, it must be substituted into the other equations and then a new equation should be selected to solve next. If we find an unsolvable equation, the process ends. Otherwise, it continues until all equations are solved. In the end we should have a solution set (or unifier) \(\{?a_1 = u_1, ?a_2 = u_2, \dots, ?a_k = u_k\}\), where \(?a_i\) are different from each other and do not appear in any of the terms \(u_i\). A unifier is simply a set of substitutions for the meta-variables \(\{?a_1, ?a_2, \dots, ?a_k\}\).

Examples (solvable):

Given \(g(t_1, t_2) = g(u_1, u_2)\) unification works if both subterms can be unified:
- \(t_1 = u_1\)
- \(t_2 = u_2\)
Given \(g(?x,h(?y)) = g(a,h(b))\) we can unify:
- \(?x = a\)
- \(?y = b\)
- \(g(a, h(b)) = g(a, h(b))\)
Given \(f(?x, ?x) = f(a, a)\) we can do:
- \(?x = a\)
- \(f(a, a) = f(a, a)\)

Examples (unsolvable):

\(f(?a) = b\)
There is no term \(?a\) such that applying \(f\) to it equals \(b\).
\(f(a) = f(b)\)
Unification fails because \(a\) and \(b\) are distinct constants.
\(f(?c) = g(?a, ?b)\)
Here, \(f\) and \(g\) are different functions, and even if they were the same, the arities don’t match.
\(g(a, c) = g(?b, ?b)\)
- \(?b = a\)
- \(?b = c\)
This creates a conflict because \(?b\) cannot simultaneously map to two different terms.
\(g(?a, f(?a)) = g(?b, ?b)\)
- \(?a = ?b\)
- \(f(?a) = ?b\)
After substituting \(?a = ?b\) in \(f(?a) = ?b\) we have \(?b = f(?b)\), which fails “occurs check”.
It means that the variable \(?b\) must be equal to a term \(f(?b)\) that contains itself.
This creates an infinite recursive structure: \(?b = f(f(f(\dots)))\)
\(h(?a, f(?a), ?d) = h(g(0, ?d), ?b, ?c)\)
Step 1:
- \(?a = g(0, ?d)\)
- \(f(?a) = ?b\)
- \(?d = ?c\)
Step 2 (substituting \(?a\) for \(g(0, ?d)\) in the second equation):
- \(?a = g(0, ?d)\)
- \(f(g(0, ?d)) = ?b\)
- \(?d = ?c\)
Step 3 (solving the third equation, substituting \(?d\) for \(?c\)):
- \(?a = g(0, ?c)\)
- \(?b = f(g(0, ?c))\)
- \(?d = ?c\)
Now we got a most general unifier:
\(\{?a = g(0, ?c), ?b = f(g(0, ?c)), ?d = ?c \}\).

If a unifier exists then a most general unifier will be found.
Two most general unifiers can differ up to variable renaming.

Parameter dependencies Link to heading

In Part 3 we introduced the concept of dependency:
If a parameter \(b\) depends on a meta-variable \(?a\), we write this as \(b[?a]\).

Parameter \(b\) generated in a quantifier rule has the form \(b[?a_1, \dots , ?a_n]\),
where the \(?a_i\) are all the variables in the conclusion.

Occurs check ensures that no variable depends on itself.
If a variable refers to itself, the equation is unsolvable, as it would lead to infinite recursion.

For example:
\(?a = g(?c) \quad ?c = b[?a]\)
Substituting \(?c \mapsto b[?a]\) in the first equation gives:
\(?a = g(b[?a])\)
Here, \(?a\) indirectly refers to itself through \(b[?a]\), making the equation unsolvable.

Now let’s see how parameter substitution works.

Substitutions in Parameters Link to heading

When generating a parameter like \(b[?a_1, \dots ?a_n]\), a dependency check ensures that all variables \(?a_i\) from the conclusion are present in the parameter. This step is essential to maintain the logical dependencies between variables.

When \(?a_i\) is replaced by a term \(t\), any occurrence of \(?a_i\) in \(b[?a_1, \dots,?a_n]\) is replaced by the meta-variables present in \(t\). But if \(t\) itself depends on \(?a_i\) it leads to an infinite loop.

Let’s go through examples from Paulson’s paper:

Unsolvable

\(?a = g(?c) \quad ?c = b[?a]\)
Here, “\(?a_i\)” is \(?a\), and the “term \(t\)” is \(g(?c)\).
So we should replace any occurrence of \(?a\) in \(b[?a]\) by meta-variables present in \(g(?c)\).
We have only one such meta-variable – \(?c\).
Substituting \(?a \mapsto ?c\) in \(b[?a]\) gives:
\(?a = g(?c) \quad ?c = b[?c]\)
This is unsolvable because \(?c\) appears within \(g(?c)\), creating a cyclic dependency that cannot be resolved.

Solvable

Now let’s consider a case where \(?a\) uses parameter \(d[?c]\):
\(?a = d[?c] \quad ?c = b[?a]\)
Substituting \(?a = d[?c]\) into \(?c = b[?a]\) gives:
\(?a = d[?c] \quad ?c = b\)
Here, the dependency is broken because \(?c\) does not appear in \(d[?c]\).
So \(b[d[?c]]\) is just \(b\). This is a solvable since \(?c\) no longer depends on itself.

Let’s break this down to clarify why \(b[d[?c]]\) simplifies to just \(b\). The parameter \(d[?c]\) represents a dependency on \(?c\), but it doesn’t mean that \(d[?c]\) literally contains \(?c\) in its structure. We can think of \(d[?c]\) as a “container labeled with \(?c\)” but the contents of the container do not contain \(?c\). If \(d[?c]\) were truly dependent on \(?c\), for example \(d[?c] = g(?c)\), where \(g\) is some function of \(?c\) (like in the unsolvable example above), then substituting \(d[?c]\) into \(b\) would result in \(b[g(?c)]\), where \(?c\) persists in the “computation”. But in our solvable case, \(d[?c]\) doesn’t actually use \(?c\), so \(b[d[?c]]\) simplifies to \(b\).

Wrapping Up Link to heading

In this part we looked into some details of how unification works and tried to understand the importance of occurs checks and parameter dependencies. Next time, we’ll attempt to turn this theory into practice by writing unification code in OCaml ;)