# SK logic in egglog: part 2, property-checking extensionality

**Update: part 3
is here**

This is a spiritual follow-on from part 1 of my
experiments with SK logic in egglog: that post serves as the
*motivation* for this one, but the content here is somewhat
standalone. It’s the result of a rabbit hole I entered, after getting
results that were so unexpected and confusing that it made me question
some of the assumptions I’d been making in those experiments.

I was modelling a small programming language called SK
(combinatory) logic; and whilst I’m no stranger to being wrong about
my code, I had chosen SK since it’s *very* simple and
*very* familiar. Implementing SK is like writing “hello world”:
something I’ve done countless times in all sorts of languages and
paradigms, hence why it was an obvious choice when I wanted to learn
egglog. In fact, combinatory
logic even appears in egglog’s `examples/`

folder!
However, an innocuous-looking one-line extension brought the whole thing
crashing down: all structure vanished and every term collapsed into one
big blob, where everything equalled everything else.

**Note for non-logicians:** That’s not
good!

Despite several rewrites, trying various different encodings, and
much scouring of egglog’s
Rust source, I kept running into the same problem. This made me
seriously doubt my understanding, particularly of free
variables. If I could spend years immersed in the worlds of
mathematics, computer science, post-graduate academia, and software
development; yet fail to grasp such a fundamental aspect in all that
time; then what *else* might I be wrong about? Am I a complete
fraud?

Thankfully I also trained as a physicist, so I know what to do when
theoretical understanding goes awry: **experiment**! Or, as
software engineers call it: **test**! I wanted to
thoroughly test the assumptions I was making in those egglog
experiments, so I went crawling back to the comfort of Haskell where I
could use my favourite hammer: property-based
testing. This would also let me try out the new falsify package
(which I’ve blogged
about before): in essence, to try and “falsify myself”, to figure
out what I’d misunderstood.

This page gives the background of what I was attempting; along with
executable Haskell code, properties it should satisfy, and
`falsify`

checks of those properties. The latter are defined
using active code.

## The setup

I’ll explain some of this jargon below, but in short: that
problematic line of code which broke everything was my attempt to
implement extensional
equality; the idea that if two things always behave in the same way,
then we can consider them equal. To specify what “always” means, we need
some notion of universal
quantification: that doesn’t quite fit into the formal system of
egglog, so my idea was to *approximate* universally-quantified
variables using symbolic
execution. This *should* be a conservative approach: failing
to spot some equalities, but never mistakenly claiming two things to be
equal. Instead, it seemed to do the extreme opposite: *always*
claiming that *everything* is equal!

## Preamble boilerplate…

Here’s the initial boilerplate for the script; it just needs GHC with
`falsify`

in its package database. For more details on how
it’s executed, see this
page’s Markdown source.

```
module Main (main) where
import Control.Applicative ((<|>))
import Control.Exception (assert)
import Data.Maybe (isJust)
import GHC.Natural (Natural)
import System.Environment (getEnv)
import Test.Falsify.Generator (Gen, Tree(..))
import qualified Test.Falsify.Generator as Gen
import Test.Falsify.Predicate (satisfies)
import Test.Falsify.Property (testGen)
import Test.Falsify.Range (Range)
import qualified Test.Falsify.Range as Range
import Test.Tasty (defaultMain)
import Test.Tasty.Falsify (testProperty)
-- Useful helpers. These are available in libraries, but I want this code to be
-- self-contained (other than falsify)
-- | Like 'uncurry', but for three arguments
uncurry3 :: (a -> b -> c -> d) -> (a, b, c) -> d
= f a b c
uncurry3 f (a, b, c)
-- | Combine two results into a single tuple
tuple2 :: Applicative f => f a -> f b -> f (a, b)
= (,) <$> a <*> b
tuple2 a b
-- | Combine three results into a single tuple
tuple3 :: Applicative f => f a -> f b -> f c -> f (a, b, c)
= (,,) <$> a <*> b <*> c
tuple3 a b c
-- | Combine four results into a single tuple
tuple4 :: Applicative f => f a -> f b -> f c -> f d -> f (a, b, c, d)
= (,,,) <$> a <*> b <*> c <*> d
tuple4 a b c d
-- | Like a list, but never ends.
data Stream a = Cons { sHead :: !a, sTail :: Stream a } deriving (Functor)
instance Show a => Show (Stream a) where
show (Cons x _) = "[" ++ show x ++ ",...]"
sPrefix :: [a] -> Stream a -> Stream a
= ys
sPrefix [] ys :xs) ys = Cons x (sPrefix xs ys)
sPrefix (x
-- | Drop elements from a 'Stream' which don't satisfy the given predicate.
sFilter :: (a -> Bool) -> Stream a -> Stream a
Cons x xs) = (if p x then Cons x else id) (sFilter p xs)
sFilter p (
-- | Extract the nth element of a 'Stream'.
sAt :: Integral n => n -> Stream a -> a
Cons x xs) = if i <= 0 then x else sAt (i - 1) xs
sAt i (
-- | All prefixes of the given Stream, e.g. [], [x], [x, y], [x, y, z], ...
sInits :: Stream a -> Stream [a]
Cons x xs) = Cons [] ((x:) <$> sInits xs)
sInits (
-- | All String values, in order of length
allStrings :: Stream String
= go 0
allStrings where strs 0 = [""]
= [c:s | c <- [minBound..maxBound], s <- strs (n - 1)]
strs n = sPrefix (strs n) (go (n + 1))
go n
-- | Like a list, but only stores data at the end. Useful for representing
-- | general recursion.
data Delay a = Now a | Later (Delay a) deriving (Eq, Functor, Ord, Show)
instance Applicative Delay where
Now f) <*> x = f <$> x
(Later f) <*> x = Later (f <*> x)
(pure = Now
instance Monad Delay where
Now x) >>= f = f x
(Later x) >>= f = Later (x >>= f)
(
-- | Unwrap (up to) a given number of 'Later' wrappers from a 'Delay' value.
runDelay :: Integral n => n -> Delay a -> Delay a
| n <= 0 = x
runDelay n x @(Now _) = x
runDelay _ xLater x) = runDelay (n - 1) x
runDelay n (
-- | Try to extract a value from the given 'Delay', or use the given default
runDelayOr :: Integral n => a -> Delay a -> n -> a
= case runDelay n x of
runDelayOr def x n Now x' -> x'
Later _ -> def
```

## SK (combinatory logic)

Combinatory logic, which I’ll shamelessly abbreviate as “SK”, is a
particularly simple programming language. It can be written entirely
with two symbols, traditionally called `S`

and
`K`

, along with parentheses for grouping symbols together.
We’ll represent SK expressions using the following Haskell datatype
called `Com`

(this
representation also matches my definition in egglog):

`data Com = C String | App Com Com deriving (Eq, Ord)`

That `deriving`

clause asks Haskell to automatically implement some useful
interfaces:

`Eq`

provides the`==`

function. Deriving this lets us check whether two`Com`

values are identical (same structure and same`String`

data).`Ord`

provides comparisons like`<`

. Deriving this gives us a lexicographic ordering, but the details aren’t important: as long as we have*some*implementation of`Ord`

, we can use efficient`Set`

datastructures.

We’ll implement the `Show`

interface
ourselves, to render expressions in traditional SK notation:

```
instance Show Com where
show (C c) = c
show (App x (App y z)) = show x ++ "(" ++ show (App y z) ++ ")"
show (App x y) = show x ++ show y
```

We represent the “primitives” `S`

and `K`

using
the `C`

constructor (annoyingly, value names in Haskell must begin with a
lowercase letter; so we call these `s`

and `k`

instead):

```
-- | Primitive SK expressions
k :: Com
s,= C "S"
s = C "K" k
```

`App`

groups expressions together two at a time, like a pair of parentheses.
Traditional SK notation only shows parentheses “on the right”, e.g.
`App s (App s k)`

is written `S(SK)`

, but `App (App s k) k`

is simply `SKK`

. The following functions extract the left and
right children of an `App`

value
(returning `Maybe`

, in case
they’re given a primitive expression):

```
right :: Com -> Maybe Com
left,App x _) = Just x
left (= Nothing
left _ App _ y) = Just y
right (= Nothing right _
```

### Anatomy of SK expressions

We can refer to various parts of an SK expression using the following terminology:

- The “root” is the top-most constructor.
- The “head” is the left-most leaf.
- The “spine” is the chain of left-nested
`App`

constructors between the head and the root. If the expression is a leaf, its spine is empty. - The “arguments” (or “args”) of that head are the right-hand children hanging off the spine, ordered from the head to the root. If the spine is empty, there are no arguments.

Examples can be seen in the figure below. We will often describe an expression as “applying 〈its head〉 to 〈its arguments〉”.

### Running SK expressions

We can “run” SK expressions using two rewriting rules:

- An expression applying
`K`

to two args can be replaced by the first arg. - An expression applying
`S`

to three args can be replaced by an expression which applies the first arg to the third arg and «the second applied to the third».

We can implement these as Haskell functions, returning `Nothing`

if the
rule didn’t match:

```
-- | Replaces Kxy with x, otherwise 'Nothing'
stepK :: Com -> Maybe Com
App (App (C "K") x) _) = Just x
stepK (= Nothing
stepK _
-- | Replaces Sxyz with xz(yz), otherwise 'Nothing'
stepS :: Com -> Maybe Com
App (App (App (C "S") x) y) z) = Just (App (App x z) (App y z))
stepS (= Nothing stepS _
```

Notice that arguments can be *any* `Com`

value,
represented using the Haskell variables `x`

, `y`

and `z`

; but only `App`

, `s`

and `k`

have any effect on the output (this
will be important for symbolic execution later!).

Remarkably, these two rules make SK a universal
(Turing-complete) programming language! The following
`step`

function tries to apply both of these rules. It also
applies the rules recursively to sub-expressions of `App`

: note that
we try stepping both children at once, to prevent any reducible parts of
an expression getting “stuck” waiting for their sibling to finish. `Nothing`

is
returned when neither rule matched and no sub-expressions were
rewritten; we say the expression is in normal
form:

```
-- | Attempt to reduce a K or S combinator, or the children of an 'App'.
-- | 'Nothing' if the argument is in normal form.
step :: Com -> Maybe Com
= stepK c <|> stepS c <|> app l' r' <|> app l' r <|> app l r'
step c where app x y = App <$> x <*> y
= (left c , right c )
(l , r ) = (l >>= step, r >>= step) (l', r')
```

Next we’ll need to *iterate* the `step`

function until its argument
reaches normal form. We’ll distinguish normalised expressions using a
separate type called `Normal`

:

```
-- | Wraps Com values which are assumed to be in normal form
newtype Normal = N Com deriving (Eq, Ord)
instance Show Normal where
show = show . toCom
-- | Coerce a 'Normal' back to a 'Com', i.e. forget that it's in normal form.
N c) = c
toCom (
-- | 'step' the given 'Com': if it worked, returns that in 'Left'; otherwise
-- | returns the original value (now 'Normal') in 'Right'.
toNormal :: Com -> Either Com Normal
= case step c of
toNormal c Just c' -> Left c'
Nothing -> Right (N c)
```

##
Understanding `Normal`

…

The `Normal`

type
has the invariant
that it only contains `Com`

values
which are in normal form. More specifically, we can say `step . toCom`

is (extensionally) equal to `const Nothing`

.
Values of type `Normal`

act as
evidence
that the `Com`

they
contain is in normal form (i.e. that it will return `Nothing`

when
given to `step`

).

This invariant is not *guaranteed* to be the case; e.g. we can
violate it by writing code like `N (App (App k s) k)`

(since `KSK`

reduces to `S`

via the `stepK`

rule). To prevent such
violations, we avoid using the `N`

constructor
directly: instead always using the `toNormal`

function, which only returns
a `Normal`

value when it is found to return `Nothing`

from
`step`

(i.e. when our invariant
holds).

`toNormal`

is a smart
constructor: a function which only returns values of the desired
type when it’s checked they’re valid. Ideally we would *enforce*
usage of `toNormal`

, by encapsulating
the `N`

constructor, making the `Normal`

type opaque. That’s
usually achieved in Haskell using modules,
but I want the code for this post to all live in one module (to make
compilation and execution easier).

`toNormal`

tells us when to
stop iterating, but since SK is a universal programming language,
there’s no way to know beforehand if it ever will. We’ll account for
this by wrapping such undecidable computations in a `Delay`

type:

```
-- | Step the given 'Com' until it reaches 'Normal' form.
reduce :: Com -> Delay Normal
= case toNormal c of
reduce c Left c' -> Later (reduce c')
Right n -> Now n
```

### Normal equivalence

The rules of SK are confluent,
meaning that every expression has at most one `Normal`

form,
and applying the reduction rules in any order, to any parts of an
expression, will not change its `Normal`

form or
the ability to reach it using those rules. As a consequence, all SK
expressions with the same `Normal`

form
are equivalent and interchangable.

To see this, consider two expressions *X* and *Y* with the same `Normal`

form
*N*. Whenever *X* occurs inside a larger expression,
confluence allows us to replace *X* with *N* without affecting the overall
result. Now imagine we instead replace *X* with *Y*: whatever that gives us, it will
be unaffected (thanks to confluence) if we reduce *Y* to its `Normal`

form.
Since the `Normal`

form of
*Y* is also *N*, and we already deduced that using
*N* is equivalent to using *X*, the result of using *Y* must *also* be equivalent
to using *X*. Hence any
expressions with the same `Normal`

form
are equivalent, in the sense that they can be interchanged inside any
expression without altering its `Normal`

form.

We’ll call this relationship “`Normal`

equivalence”. It’s undecidable in general, so it also gets wrapped in
`Delay`

:

```
-- | Result of comparing two values
data Compared a = Same a | Diff a a deriving (Show)
diff :: Compared a -> Bool
same,Same _) = True
same (= False
same _ Diff _ _) = True
diff (= False
diff _
-- | Uses (==) to find identical values
comparer :: Eq a => a -> a -> Compared a
= if x == y then Same x else Diff x y
comparer x y
-- | Compare the 'Normal' forms of the given 'Com' values.
normalEq :: Com -> Com -> Delay (Compared Normal)
= comparer <$> reduce x <*> reduce y normalEq x y
```

## Property-based testing

Let’s test this `normalEq`

function, to see whether it actually behaves in the way our theory of SK
predicts it should. For example, every equality relation should have the
“reflexive
property”, meaning that values should be equal to themselves; so any
call like `normalEq foo foo`

should always result in `True`

.
Unfortunately, such results are buried in a `Delay`

which
prevents us testing them directly. `Delay`

values
*could* be a never-ending chain of `Later`

wrappers, like the following:

```
loop :: Delay a
= Later loop loop
```

Hence any attempt to extract a result from them must eventually give up. We’ll work around this using a parameter that counts down until we give up; an approach known as fuel:

```
-- | Represents a parameter for "when to give up"
type Fuel = Word
```

Now we can write a predicate (a function returning a boolean) to
check whether its argument is `normalEq`

to itself, before its `Fuel`

runs
out:

```
normalEqToItself :: (Fuel, Com) -> Bool
= runDelayOr False (same <$> normalEq x x) n normalEqToItself (n, x)
```

We can use this predicate to make statements, by *quantifying*
its argument. Tests commonly use existential
quantification, which asserts that *some* argument satisfies
`normalEqToItself`

(AKA
“example-based” testing). For example:

`= assert (normalEqToItself (0, k)) (putStrLn "PASS") main `

```
PASS
```

However, that isn’t really what we want to say: reflexivity doesn’t
just apply to a few hand-picked examples, it says that *every
argument* satisfies `normalEqToItself`

. Talking about “every
argument” is universal
quantification. Universally-quantified predicates are called
“properties”, so this approach is known as property-based testing.

To test a property, we give it to a “property checker” which searches
for *counterexamples*: argument values which cause the assertion
to fail. If no counterexample can be found, the test passes. Search
techniques range from simple
enumeration all the way up to sophisticated
AI algorithms. Haskell has many property checkers, beginning with
the wonderful QuickCheck
package. We’ll use the state-of-the-art for 2024, which is falsify.

### Data generators

`falsify`

searches through argument values *at
random*, sampling them from a given “generator” with the Haskell
type `Gen a`

(for
generating values of some type `a`

). We can build up generators using
familiar interfaces (`Applicative`

,
`Monad`

,
etc.); from primitives such as `Gen.inRange`

, which samples fixnums
from a `Range`

and is
perfect for generating `Fuel`

:

```
-- | Generates a (relatively small) amount of Fuel
= genFuelN limit
genFuel
-- | Generates up to a certain amount of Fuel
genFuelN :: Fuel -> Gen Fuel
= Gen.inRange . to
genFuelN
-- | A reasonable default for procedures requiring Fuel. Small enough to keep
-- | exponentially-growing terms from blowing up.
limit :: Fuel
= 20
limit
-- | We usually want Ranges from zero upwards
to :: Fuel -> Range Fuel
= Range.between . (0,) to
```

##
Generating recursive `Com`

values…

Generating `Com`

values is
more tricky, since they are recursive. A naïve generator that simply
calls itself recursively will make `Com`

values of
*exponential* size: either blowing up memory (if the recursive
case `App`

is likely to be chosen) or being limited to a handful of tiny values (if
the recursive case is unlikely to be chosen). To generate a diverse
spread of values, without risking out-of-memory conditions, we can use
the same `Fuel`

trick as
above; this time dividing it up (unevenly) between recursive calls.

This is such a useful approach that I’ve implemented it for
`QuickCheck`

, `ScalaCheck`

,
`Hypothesis`

, etc. over the years. In contrast,
`falsify`

provides it out-of-the-box! The `Gen.tree`

function generates binary
`Tree`

values of a given size, so we just need to transform those into the
`Com`

values we need:

```
-- | Create a Com from a binary Tree (with unit values () at the nodes)
treeToCom :: Tree () -> Com
= case t of
treeToCom t -- The smallest Trees have one and two leaves. Turn them into K, since that
-- has the simplest step rule.
Leaf -> k
Branch _ Leaf Leaf -> k
-- There are two Trees with three leaves. Turn those into S.
Branch _ Leaf (Branch _ Leaf Leaf) -> s
Branch _ (Branch _ Leaf Leaf) Leaf -> s
-- For larger Trees, we recurse on their children and combine using App
Branch _ l r -> App (treeToCom l) (treeToCom r)
-- | Generate a Com, with size bounded by the given Fuel
genComN :: Fuel -> Gen Com
= treeToCom <$> Gen.tree (to n) (pure ())
genComN n
-- | Generate (relatively small) Com values
= genComN limit genCom
```

`falsify`

integrates with Haskell’s `tasty`

test framework. Normally a project would declare a big test suite to run
all at once, but for this literate/active style we’ll be testing things
as we go, using the following function:

```
-- | Turn the given predicate into a falsify property, universally quantified
-- | over the given generator's outputs. Check it on (at least) 100 samples.
check :: Show a => (a -> Bool) -> Gen a -> IO ()
pred gen = do
check <- getEnv "NAME" -- Avoids repetition
name pred)) gen)) defaultMain (testProperty name (testGen (satisfies (name,
```

### Included middles

The observant among you may have noticed that `normalEqToItself`

**does
not** hold for all argument values! `falsify`

can
generate a counterexample to show us why:

`= check normalEqToItself (tuple2 genFuel genCom) main `

```
normalEqToItself: FAIL (0.04s)
failed after 14 successful tests and 9 shrinks
not (normalEqToItself generated)
generated: (0,KKK)
Logs for failed test run:
Use --falsify-replay=01040523b283c5642978889c9d68d5d40d to replay.
1 out of 1 tests failed (0.05s)
```

The precise counterexample `falsify`

finds may vary
depending on the random seed, but they’ll all have the following in
common: the `Fuel`

will be
`0`

,
whilst the `Com`

expression will not be in normal form. For
example, it may produce `KKK`

(the Haskell value `App (App k k) k`

).
`stepK`

will reduce that to a
single `K`

, so whilst the value `KKK`

*is*
equal to itself (as we expect), `normalEq`

wraps it in a `Later`

constructor, which may cause `normalEqToItself`

to give up for
*some* `Fuel`

parameters: in particular, when given `0`

`Fuel`

. This
disproves our claim that the property holds for *any* amount of
`Fuel`

.

We could alter our claim to *existentially* quantify the
amount of `Fuel`

: that for
any SK expression `x`

, there is
*some* value of `n`

where
`runDelayOr False (same <$> normalEq x x) n`

holds. However that is *also* false, since there are infinite
loops which have no `Normal`

form;
hence never produce a `Now`

value; and therefore cannot have
a result extracted regardless of how much `Fuel`

we use.
Even when a `Normal`

form
*does* exist, there is no way to bound the amount of `Fuel`

required
to find it.

The correct way to fix our claim is to universally quantify `x`

and `n`

as before, but add two negations to
our predicate: instead of claiming every expression is equal to itself
(regardless of `Fuel`

), we
claim that every expression is *not unequal* to itself
(regardless of `Fuel`

):

```
notUnnormalEqToItself :: (Fuel, Com) -> Bool
= runDelayOr True (not . diff <$> normalEq x x) n notUnnormalEqToItself (n, x)
```

`= check notUnnormalEqToItself (tuple2 genFuel genCom) main `

```
notUnnormalEqToItself: OK (0.04s)
100 successful tests
All 1 tests passed (0.04s)
```

You may have learned in school that double-negatives are redundant: that anything “not false” must be “true” (the law of the excluded middle). However, we’re not dealing with the true/false ideals of classical logic, which ignore computation and its associated undecidability. All we have are the more pragmatic falsified/unfalsified results of a property checker, where there are non-excluded middles such as “don’t know”, “timed out” and “gave up”!

## Extensional equality

My problems arose when trying to extend the equality of SK expressions even further to include extensional equality. To explain how this works, we’ll need a bit more terminology…

### Inputs and agreement

To understand the behaviour of a particular SK expression, we can see
what happens when it’s applied to other SK expressions. We’ll call the
latter “input values”, to emphasise that they are not part of the
original expression we’re focused on. For example, we can understand how
the expression `KS`

behaves by applying it to an input value
like `S`

: giving `KSS`

(which reduces to `Normal`

form
`S`

). Note that `KS`

contains head `K`

and argument `S`

; whilst the resulting expression
`KSS`

also contains the input value `S`

as an
extra argument.

If we apply two expressions to the same input value, and the results
reduce to the same `Normal`

form,
we say those two expressions “agree on” that input value. For example,
`KK`

and `SKK`

agree on the input value
`K`

, since `KKK`

reduces to `K`

; and
`SKKK`

reduces to `KK(KK)`

then to `K`

.
Note that they *disagree* on the input value `S`

,
since `KKS`

reduces to `K`

; whilst
`SKKS`

reduces to `KS(KS)`

then to `S`

.
Expressions can also agree on a *sequence* of input values, where
we apply them both to the first value, then apply those results to the
second value, and so on. We’ll represent such sequences using a `Stream`

, which
is like a list except there is no “nil” case, so it goes on forever:

```
type InputValue = Com
type InputValues = Stream InputValue
-- | Apply the given 'Com' expressions to more and more of the 'InputValues',
-- | checking whether they reach the same 'Normal' form.
agree :: Com -> Com -> InputValues -> Stream (Delay (Compared Normal))
Cons x xs) = Cons (normalEq f g) (agree (App f x) (App g x) xs) agree f g (
```

Everything that satisfies `normalEq`

should also satisfy `agree`

, which we can state with the
following property:

```
normalEqImpliesAgree :: (Fuel, Com, Com, InputValues) -> Bool
=
normalEqImpliesAgree (n, f, g, xs) case runDelay n (tuple2 (normalEq f g) (sHead (agree f g xs))) of
Now (Same _, Diff _ _) -> False
-> True
_
-- | Generate a 'Stream' of Com values, with element size bounded by 'Fuel'
genComsN :: Fuel -> Gen InputValues
= Cons <$> genComN fuel <*> genComsN fuel
genComsN fuel
-- | Generate (relatively small) lists of Com values
genComs :: Gen InputValues
= genComsN limit genComs
```

`= check normalEqImpliesAgree (tuple4 genFuel genCom genCom genComs) main `

```
normalEqImpliesAgree: OK (0.07s)
100 successful tests
All 1 tests passed (0.07s)
```

We say expressions “agree on *N* inputs” when they agree on
*every* sequence of *N*
input values; i.e. the input values are universally-quantified, but the
length of the sequence is not. For example, `SK`

and
`S(K(SK))(KK)`

agree on two inputs; which we can test by
asserting that they *never disagree*:

```
skNeverDisagreesWithSKSKKK :: (Fuel, InputValues) -> Bool
=
skNeverDisagreesWithSKSKKK (n, xs) True (not . diff <$> sAt (2 + n) (agree f g xs)) n
runDelayOr where f = App s k
= App (App s (App k (App s k))) (App k k) g
```

`= check skNeverDisagreesWithSKSKKK (tuple2 genFuel genComs) main `

```
skNeverDisagreesWithSKSKKK: OK (1.14s)
100 successful tests
All 1 tests passed (1.14s)
```

Note that any expressions which agree on *N* inputs also agree on *N* + 1 inputs (and so on), since the
left child of each root has the same `Normal`

form
(by definition of agreement on *N* inputs).

```
agreementIsMonotonic :: ((Fuel, Fuel), (Com, Com), InputValues) -> Bool
=
agreementIsMonotonic ((n, m), (f, g), xs) case runDelay n (tuple2 (sAt n (agree f g xs))
+ m) (agree f g xs))) of
(sAt (n Now (Same _, Diff _ _) -> False
-> True _
```

```
= check agreementIsMonotonic (tuple3 (tuple2 genFuel genFuel)
main
(tuple2 genCom genCom ) genComs)
```

```
agreementIsMonotonic: OK (0.85s)
100 successful tests
All 1 tests passed (0.85s)
```

### Extensionality

Now we’ve defined inputs and agreement, extensional equality becomes
quite simple: SK expressions are extensionally equal iff they agree on
*some* number of inputs. We saw that `SK`

and
`S(K(SK))(KK)`

agree on *some* number of inputs
(namely: on two inputs), so they are extensionally equal. Expressions
which are normally equivalent are also extensionally equal, since they
agree on zero inputs.

Notice that the *number* of inputs is
existentially-quantified, whilst the *values* of those inputs are
universally-quantified. This makes extensional equality difficult to
determine:

- If we compare two expressions on
*N*input values and see that they*agree*, that doesn’t mean they’re extensionally equal; since there might be other input values of length*N*for which they disagree. - If we compare two expressions on
*N*input values and see that they*disagree*, that doesn’t mean they’re*not*extensionally equal; since they might only agree on inputs longer than*N*.

In order to make more definitive assertions about extensional equality, we need to borrow techniques from the world of formal methods!

## Symbolic execution

The reason I like property-checking is that it can provide a lot of confidence for relatively little effort, compared to e.g. example-based testing (less confidence for a similar effort), manual testing (less confidence for more effort) or formal verification (more confidence for more effort). However, there are occasions when the scales tip in favour of other approaches, and extensional equality checking turns out to be particularly suited to a formal method called symbolic execution.

In this approach, we avoid using real, “concrete” SK expressions for
our input values; and instead use abstract
symbols. These can be copied, discarded or rearranged during program
execution; which is perfect for seeing where different inputs end up in
the resulting `Normal`

form.
They are otherwise inert or “uninterpreted”: causing reduction to stop
if it would depend on the particular value of an input.

Implementing symbolic execution can often be laborious, but our
definition of `Com`

actually
makes it trivial to graft on top: we can represent abstract symbols
using the `C`

constructor
applied to any `String`

(*except* `"S"`

or `"K"`

,
which we’re treating specially). Execution of the `stepK`

and `stepS`

functions will treat these as atomic
symbols, to be copied, discarded and rearranged as needed.

Once we’ve normalised a `Com`

value
containing symbols, we need a semantics, or abstract
interpretation, to understand what they mean. Since we’re using each
symbol to represent a distinct input, the way they appear in `Normal`

forms
tells us something about the overall behaviour of the expression and,
crucially, how it depends on the particular *concrete* `Inputs`

it may
be applied to. It’s undecidable whether part of an SK expression will
influence its reduction (or just get discarded by the `stepK`

rule), so we’ll limit our
ambitions to understanding a few simple situations.

### Symbolic agreement proves extensional equality

Expressions which agree on symbolic inputs will agree on *all*
inputs, since symbols do not reduce, and hence those inputs must have
been irrelevant for the reductions which lead to the agreement. Unlike
property-checking, where we rely on double-negatives to “fail to
disprove agreement”, this is real positive proof that expressions will
*always* agree, and hence we can claim definitively that they
*are* extensionally equal.

We don’t need to implement this check specially, since we can reuse
`agree`

, just using symbolic
values as our inputs instead of concrete SK expressions:

```
-- | Synonym to indicate when we're dealing with uninterpreted symbolic values
type Symbol = String
-- | All Symbols except "S", "K" and "" (which doesn't show well)
symbols :: Stream Symbol
= sFilter keep allStrings
symbols where keep s = s /= "" && s /= "S" && s /= "K"
```

The following `agreeSym`

function applies two expressions to more and more symbolic inputs, to
see if they reduce to the same `Normal`

form:

```
-- | Checks whether two Com values agree on more and more symbolic input values.
agreeSym :: Com -> Com -> Stream (Delay (Compared Normal))
= agree f g (C <$> symbols) agreeSym f g
```

The return type of `agreeSym`

is a little awkward, since it’s “two dimensional”: travelling further
along the `Stream`

applies
more and more symbolic inputs; travelling further along any of those
`Delay`

values applies more and more steps to that expression. To get at the
results inside (assuming any of those expressions has a `Normal`

form)
we need a linear path which traverses this structure. We can’t use
breadth-first search since the `Stream`

never
ends; we also can’t use depth-first search, since a `Delay`

might
never end. The following `race`

function is a bit smarter: it acts like a round-robin scheduler, each
iteration running more expressions until one of them finishes (if
ever):

```
-- | Runs every 'Delay' element in an interleaved fashion until one of them
-- | produces a 'Now'. Returns that result, its index in the 'Stream', and a
-- | 'Stream' of the remaining 'Delay' elements.
race :: Stream (Delay a) -> Delay (a, Natural, Stream (Delay a))
= go [] [] s
race s where go ran run next = case (run, next) of
Cons z zs) -> go [] (reverse (z:ran)) zs
( [], Later y:ys, zs) -> Later (go (y:ran) ys zs)
(Now y:ys, zs) -> Now ( y
( fromIntegral (length ran)
, reverse ran ++ ys) zs
, sPrefix ( )
```

##
Checking `race`

…

The `race`

algorithm is a bit
tricky, so we’ll double-check it with a few property tests. First some
helper functions:

```
-- | Wrap the given number of 'Later' constructors around '()'
| n <= 0 = Now ()
wrapLater n = Later (wrapLater (n - 1))
wrapLater n
-- | Count the number of 'Later' wrappers around a 'Now' value. Gives up if the
-- | given 'Fuel' runs out.
countLaters :: Integral n => Fuel -> Delay a -> Maybe n
= go 0
countLaters where go !n _ ( Now _) = Just n
0 _ = Nothing
go _ !n m (Later x) = go (n + 1) (m - 1) x
go
= countLaters (n + m) (wrapLater n) == Just n countMatchesWrap (n, m)
```

`= check countMatchesWrap (tuple2 genFuel genFuel) main `

```
countMatchesWrap: OK
100 successful tests
All 1 tests passed (0.01s)
```

```
= case runDelay n (wrapLater n) of
runUndoesWrap n Now _ -> True
-> False _
```

`= check runUndoesWrap genFuel main `

```
runUndoesWrap: OK (0.01s)
100 successful tests
All 1 tests passed (0.01s)
```

`= countLaters m (runDelay n (wrapLater (n + m)) ) == Just m runStops (n, m) `

`= check runStops (tuple2 genFuel genFuel) main `

```
runStops: OK (0.02s)
100 successful tests
All 1 tests passed (0.02s)
```

Now on to `race`

itself: it
searches for a `Now`

value
using a triangular “flood fill” approach, illustrated below:

The numbers in the first column (i.e. which steps `race`

allocates to running the first
element of the `Stream`

) are
precisely the triangular numbers.
We can check this, by counting the number of `Later`

wrappers
`race`

produces for a `Stream`

whose
first element has a known number of `Later`

wrappers, and whose other elements are never-ending loops (and hence
don’t contribute any `Now`

values of
their own):

```
-- | Calculates the nth triangular number https://oeis.org/A000217
triangle :: (Integral n, Integral m) => n -> m
= fromIntegral . go . fromIntegral . max 0
triangle where go (n :: Natural) = (n * (n + 1)) `div` 2
loops :: Stream (Delay a)
= Cons loop loops
loops
raceFirstUsesTriangularFuel :: Fuel -> Bool
=
raceFirstUsesTriangularFuel n Cons (wrapLater n) loops)) == Just tn
countLaters tn (race (where tn = triangle n
```

`= check raceFirstUsesTriangularFuel genFuel main `

```
raceFirstUsesTriangularFuel: OK (0.03s)
100 successful tests
All 1 tests passed (0.04s)
```

At the other extreme, `race`

will run the *first* step of element `n`

just before it goes back to the
first element; i.e. at step `triangle (n + 1) - 1`

(this is OEIS A000096):

```
=
raceHitsNthElementBeforeNextTriangularNumber n == Just x
countLaters x (race (prog n)) where x = triangle (n + 1) - 1
0 = Cons (Now ()) loops
prog = Cons loop (prog (m - 1)) prog m
```

`= check raceHitsNthElementBeforeNextTriangularNumber genFuel main `

```
raceHitsNthElementBeforeNextTriangularNumber: OK (0.04s)
100 successful tests
All 1 tests passed (0.04s)
```

We can use `race`

to check
whether two expressions ever agree on symbolic inputs, and therefore
whether they’re extensionally equal:

```
-- | 'True' iff the given expressions ever agree for any number of inputs.
everAgree :: Com -> Com -> Delay Bool
= race (agreeSym x y) >>= go
everAgree x y where go (Same _ , _, _) = Now True
Diff _ _, _, s) = race s >>= go go (
```

Note that `everAgree`

is more
general than `normalEq`

, since the
latter only checks for agreement on 0
inputs; although it requires more `Fuel`

to
account for running it interleaved with 1 input, 2
inputs, etc. (see the fold-out section above for more details):

```
normalEqImpliesEverAgree :: (Fuel, Com, Com) -> Bool
=
normalEqImpliesEverAgree (n, x, y) if go (same <$> normalEq x y) n
then go ( everAgree x y) (triangle n)
else True
where go = runDelayOr False
```

`= check normalEqImpliesEverAgree (tuple3 genFuel genCom genCom) main `

```
normalEqImpliesEverAgree: OK (0.05s)
100 successful tests
All 1 tests passed (0.05s)
```

However, `everAgree`

is not yet
a predicate for checking extensional equality, since the `Compared`

values returned by `agreeSym`

don’t represent “equal/unequal”; only “equal/unsure”. Hence the `everAgree`

function *claims* to
return `Delay Bool`

,
which we can think of as “at most one boolean”. Yet its result is not
really a boolean, since it can never be `False`

! It
would be more accurate to return a unit value `Now ()`

, of
type `Delay ()`

. That
type is isomorphic to the
natural numbers, counting how long it took to prove equality. That
can’t be a predicate, since it begs the
question!

A *useful* predicate for extensional equality not only needs
to return `True`

for some
inputs it’s sure are equal; but *also* return `False`

for some
inputs it’s sure are unequal. When it’s unsure (which is unavoidable due
to undecidability), it can use a never-ending chain of `Later`

wrappers
to avoid ever returning at all!

### Different symbolic heads prove disagreement

We can easily show that two expressions disagree for *N* inputs, by checking if the *N*th element of `agreeSym`

results in `Diff`

. To be
*sure* they are not extensionally equal, we also need to prove
they will disagree for all inputs *longer* than *N*.

A simple way to prove this is when the head of each expression is a
different symbolic input. In that case, their `Normal`

forms
depend entirely on those inputs, so disagreement can be “forced” by
choosing input values which reduce to different results. For example,
say `Xab`

is some concrete SK expression applied to two
symbolic inputs, which happens to reduce to `aS`

; hence its
result depends on the *first* input, and is independent of the
second. If some other expression `Yab`

reduces those inputs
to `bS`

, it depends on the *second* input, and is
independent of the first. They can be forced to disagree by choosing
concrete input values in place of `a`

and `b`

,
which reduce to values that are known to be unequal. In this example we
could use input values `KS`

and `KK`

, which
`X`

reduces to `S`

but `Y`

reduces to
`K`

. This works even if both inputs appear in a result,
e.g. `ab`

can still be forced independently of
`b`

, by choosing a concrete expression for `a`

which discards its argument.

This ability to “force” the result of expressions which use an input
as their head can be used to extend disagreements from *N* inputs to *all* inputs
*M* > *N*. To see
this, note that the result of applying an expression to *M* > *N* inputs is
equivalent to first applying it to *N* inputs (which we can force to
result in any value we like), then applying that result to the remaining
*M* − *N* inputs. We can
choose those intermediate results to be expressions that consume the
remaining *M* − *N*
inputs (e.g. through repeated use of `K`

) *and then*
disagree with each other. ∎

The following function will spot when two expressions have different symbols in head position, and are hence provably unable to agree, even with more inputs:

```
-- | Return the head (left-most leaf) of the given Com
headPos :: Com -> Symbol
C c) = c
headPos (App l r) = headPos l
headPos (
-- | Whether a String represents an uninterpreted Symbol (i.e. not S or K)
isSym :: String -> Bool
= s /= "S" && s /= "K"
isSym s
-- | Whether the given Com values have distinct symbolic values in head position
distinctSymbolicHeads :: Com -> Com -> Bool
= isSym hX && isSym hY && hX /= hY
distinctSymbolicHeads x y where (hX, hY) = (headPos x, headPos y)
```

We can perform a few sanity checks, to confirm that our argument above holds:

```
-- | Generate String values to represent uninterpreted symbolic values
genSymN :: Fuel -> Gen Symbol
= (`sAt` symbols) <$> genFuelN n
genSymN n
-- | Generate (relatively small) symbolic values
genSym :: Gen Symbol
= genSymN limit
genSym
-- | Generate Com values which may also contain symbols
genSymComN :: Fuel -> Gen Com
= toSymCom <$> Gen.tree (to n) (genSymN n)
genSymComN n where toSymCom t = case t of
Leaf -> k
Branch _ Leaf Leaf -> k
Branch _ Leaf (Branch _ Leaf Leaf) -> s
Branch s (Branch _ Leaf Leaf) Leaf -> C s
Branch _ l r -> App (toSymCom l) (toSymCom r)
-- | Generate (relatively small) Com values which may contain symbols
= genSymComN limit
genSymCom
-- | Shouldn't matter which order we compare two Com values
distinctSymbolicHeadsCommutes :: (Com, Com) -> Bool
= distinctSymbolicHeads x y
distinctSymbolicHeadsCommutes (x, y) == distinctSymbolicHeads y x
```

`= check distinctSymbolicHeadsCommutes (tuple2 genSymCom genSymCom) main `

```
distinctSymbolicHeadsCommutes: OK (0.04s)
100 successful tests
All 1 tests passed (0.04s)
```

### Different numbers of arguments prove disagreement

Expressions whose heads are *the same* symbolic input, but
applied to a different number of arguments, can also be forced to
disagree. For example, if `Xab`

reduces to `aSb`

,
and `Yab`

reduces to `aS`

, then `distinctSymbolicHeads`

can’t be sure if
they’re different. Yet we can still force them to disagree, by giving
them *three* inputs (`a`

, `b`

and
`c`

):

- We choose a value for
`a`

which consumes two arguments and reduces to the second, such as`SK`

. - We give
`b`

the value`Kd`

(where`d`

can represent any expression). - We leave
`c`

unconstrained.

`X`

and `Y`

will reduce these inputs as
follows:

```
X(SK)(Kd) c | Y(SK)(Kd)c | Xabc and Yabc
SKS(Kd) c | SKS c | Reduced per example
K (Kd)(S(Kd))c | K c(Sc) | Applied S rule to each
Kd c | c | Applied K rule to each
d | c | Applied K rule again to first
```

Since `X`

and `Y`

reduce to different symbols,
we can use this scheme to force their results to *any* value,
including unequal `Normal`

forms
or (as above) expressions which consume any number of subsequent inputs
*then* become unequal `Normal`

forms.
∎

The following `unequalArgCount`

function will spot when two expressions apply a symbol to an unequal
number of arguments, and are hence sure to be extensionally unequal:

```
-- | Split a Com value into its head and any arguments that's applied to
headAndArgs :: Com -> (String, [Com])
C x) = (x, [])
headAndArgs (App l r) = case headAndArgs l of
headAndArgs (-> (h, args ++ [r])
(h, args)
-- | True iff the given expressions have the same Symbol in their head, but
-- | applied to a different number of arguments.
unequalArgCount :: Com -> Com -> Bool
= isSym headX
unequalArgCount x y && headX == headY
&& length argsX /= length argsY
where (headX, argsX) = headAndArgs x
= headAndArgs y (headY, argsY)
```

Again, we can sanity-check this:

```
-- | Order of Com values shouldn't affect result
unequalArgCountCommutes :: (Com, Com) -> Bool
= unequalArgCount x y
unequalArgCountCommutes (x, y) == unequalArgCount y x
```

`= check distinctSymbolicHeadsCommutes (tuple2 genSymCom genSymCom) main `

```
distinctSymbolicHeadsCommutes: OK (0.03s)
100 successful tests
All 1 tests passed (0.03s)
```

### Disagreeing arguments prove disagreement

Finally, we can force expressions to disagree by forcing a
disagreement *inside* them, then propagating it up to the overall
result. This is useful, since it can be achieved when each expression
has the *same* `Symbol`

at their heads (avoiding `distinctSymbolicHeads`

), and they apply
it to the *same* number of arguments (avoiding `unequalArgCount`

); as long as we can
force the *values* of those arguments to disagree.

For example, let’s say `Xabc`

reduces to `ab`

and `Yabc`

reduces to `ac`

: these have the same
symbol `a`

as their heads, and both apply it to a single
argument. However, the values of those arguments (`b`

and
`c`

) are provably unequal (via `distinctSymbolicHeads`

, in this case).
We can hence use the first input `a`

to propagate its unequal
argument, e.g. using the input value `SKK`

. Since
`X(SKK)bc`

reduces to `b`

, and
`Y(SKK)bc`

reduces to `c`

, we can force a
disagreement by choosing unequal values for `b`

and
`c`

; and this extends to more inputs by having them consume
the remainder, as before. ∎

The following function checks for the situation described above. It
checks whether argument values are unequal using a given `unEq`

parameter, which is more general
and useful than if we hard-coded some specific check:

```
-- | Whether the given Com values have matching symbols in their heads, but
-- | applied to unequal values (determined by the given unEq function)
symbolGivenUnequalArgs :: (Com -> Com -> Bool) -> Com -> Com -> Bool
= isSym headX
symbolGivenUnequalArgs unEq x y && headX == headY
&& any id (zipWith unEq' argsX argsY)
where (headX, argsX) = headAndArgs x
= headAndArgs y
(headY, argsY) = unEq a b || unEq b a -- Compare both ways round
unEq' a b
-- | The order of arguments shouldn't alter the result
symbolGivenUnequalArgsCommutes :: (Com -> Com -> Bool) -> Com -> Com -> Bool
= symbolGivenUnequalArgs f x y
symbolGivenUnequalArgsCommutes f x y == symbolGivenUnequalArgs f y x
```

##
Generating `unEq`

functions to
test with…

The `symbolGivenUnequalArgs`

function doesn’t care *how* arguments are deemed to be unequal,
so it lets the caller decide by passing in a function. The choice of
function shouldn’t affect commutativity, so the property `symbolGivenUnequalArgsCommutes`

is
universally quantified over that choice; allowing a counterexample
search to try many different functions.

This may seem surprising if you’ve not encountered it before, but we can generate a function value just as easily as a “data-like” value. For example, we can generate functions which look up their arguments in a (generated) lookup table:

```
genViaTable :: Eq a => Range Fuel -> Gen a -> Gen b -> Gen (a -> b)
range genA genB = do
genViaTable <- genB
def <- Gen.list range (tuple2 genA genB)
pairs pure (maybe def id . (`lookup` pairs))
```

However, such ordinary function values are opaque “black boxes”,
which makes them less useful for property-checking: firstly, it’s hard
to `show`

them (e.g. if they form part of a counterexample); and secondly it’s
hard to shrink them (to look for “simpler” alternatives).
Property-checkers avoid these problems by replacing ordinary functions
`a -> b`

with
their own alternative, whose constructors ‘remember’ how to `show`

and
shrink themselves. In `falsify`

this alternative is `Gen.Fun a b`

,
so we’ll need to convert the `Com -> Com -> Bool`

argument of `symbolGivenUnequalArgsCommutes`

into
either `Fun Com (Fun Com Bool)`

(curried form) or,
equivalently, `Fun (Com, Com) Bool`

(uncurried form). We’ll use the uncurried form, since the tuples will be
handled automatically by type class instance resolution. The following
helpers let `symbolGivenUnequalArgsCommutes`

take a
`Fun`

as argument:

```
-- | Transform a higher-order function to take a Fun instead
liftFun :: ((a -> b) -> c) -> Gen.Fun a b -> c
= f . Gen.applyFun
liftFun f
-- | Lift a function with binary argument to take an uncurried Fun instead
liftFun2 :: ((a -> b -> c) -> d) -> Gen.Fun (a, b) c -> d
= liftFun (f . curry) liftFun2 f
```

To check (the lifted version of) `symbolGivenUnequalArgsCommutes`

, we
need a generator of type `Gen (Fun (Com, Com) Bool)`

.
We can get one from `Gen.fun`

, but
that requires an instance of `Gen.Function (Com, Com)`

.
There’s an existing instance for tuples, but we still need to implement
`Gen.Function Com`

ourselves:

```
instance Gen.Function Com where
= Gen.functionMap comToPre preToCom <$> Gen.function gen function gen
```

This is piggybacking on an existing instance (the `Gen.function gen`

call at the end) by
converting our `Com`

values
back-and-forth to another type that already implements `Gen.Function`

.
The type I’ve chosen is `[Maybe (Either Bool Symbol)]`

,
and the conversions are implemented by `comToPre`

& `preToCom`

:

```
-- | Prefix notation for expressions. `Nothing` represents an 'apply' operation.
type Prefix = [Maybe (Either Bool Symbol)]
-- | Converts expressions from "applicative style" Com to "prefix style" Prefix.
comToPre :: Com -> Prefix
= case x of
comToPre x -- Treat K and S specially, so they're more likely to be generated
| x == k -> [Just (Left False)]
_ | x == s -> [Just (Left True )]
_ -- Any other symbol
C c -> [Just (Right c)]
-- Nothing acts like an apply operator for two expressions that follow it
App l r -> [Nothing] ++ comToPre l ++ comToPre r
-- | Parses "prefix style" Prefix back into an "applicative style" Com.
preToCom :: Prefix -> Com
= fst . go -- Discard any remaining suffix
preToCom where go [] = (k, []) -- Makes preToCom total, but overlaps 'Left False'
:xs) = case x of
go (xJust (Left False) -> (k , xs)
Just (Left True ) -> (s , xs)
Just (Right c ) -> (C c, xs)
Nothing -> let (l, ys) = go xs
= go ys
(r, zs) in (App l r, zs)
comToPreRoundtrips :: Com -> Bool
= preToCom (comToPre c) == c comToPreRoundtrips c
```

`= check comToPreRoundtrips genCom main `

```
comToPreRoundtrips: OK (0.03s)
100 successful tests
All 1 tests passed (0.03s)
```

```
preToComAlmostRoundtrips :: Prefix -> Bool
= comToPre (preToCom p') == p'
preToComAlmostRoundtrips p where p' = comToPre (preToCom p) -- Extra roundtrip to make p "correct"
genPre :: Gen Prefix
= Gen.list (to limit) genEntry
genPre where genEntry = Gen.choose (Just <$> genLeaf) (pure Nothing)
= Gen.choose (Left <$> Gen.bool False) (Right <$> genSym) genLeaf
```

`= check preToComAlmostRoundtrips genPre main `

```
preToComAlmostRoundtrips: OK (0.02s)
100 successful tests
All 1 tests passed (0.02s)
```

The `Prefix`

type is
a prefix
form for expressions, as opposed to the “applicative form” of `Com`

. They’re
equivalent, but the structure of an expression is less obvious in prefix
form, which is why it only appears in this hidden section. This is the
encoding used in binary
combinatory logic, although we’re allowing arbitrary `Symbol`

values
rather than just `S`

and `K`

.

The functions `preToCom`

and
`comToPre`

don’t *quite*
form an isomorphism. This
is essentially due to `Prefix`

being a
prefix(-free)
code:

- The empty list
`[]`

does not correspond to a particular`Com`

value. It a valid*prefix*, since appending it with anything will either form a complete expression or another valid prefix; but that’s not so useful for`preToCom`

, which might have only been given a*finite*list and needs to return a`Com`

. In this case, it just returns`K`

(arbitrarily). `Prefix`

values may contain “too many”`Nothing`

values. These act like`App`

, indicating that the following expression is applied to the one after. That “following expression” may itself use`Nothing`

to consume subsequent list elements, and so on; corresponding to the arbitrary nesting which`App`

allows. However, the given`Prefix`

may “run out” before specifying what those following elements should be! Again, that’s a perfectly good*prefix*, but not a complete value. When this happens, we will hit the same`[]`

case as above, and simply return`K`

(potentially many times, as the applications get “popped off the stack”).- Expressions encoded as
`Prefix`

values are self-delimiting, meaning that the encoding itself tells us when to finish parsing: if the value we parsed was`Just _`

, we have finished that expression; if the value is`Nothing`

we can finish after parsing exactly two more expressions. Hence any elements appearing*after*a correctly-encoded expression will be completely ignored, resulting in the same`Com`

. - Our hacky use of
`K`

as a fallback for empty`Prefix`

values introduces some ambiguity, since an expression like`KK`

could be encoded as`[Nothing, Just (Left False), Just (Left False)]`

(the “correct” way); or as`[Nothing, Just (Left False)]`

(relying on the fallback behaviour to introduce another`K`

); or indeed as`[Nothing]`

(relying on the fallback for*both*`K`

expressions). Hence the fallback behaviour should not be relied upon; also, it breaks when composing expressions together (since missing`K`

values will only be introduced at the end of a list). - There’s also some redundancy between
`Left False`

/`Left True`

and`Right "K"`

/`Right "S"`

. This was a deliberate choice, to make`falsify`

generate more`S`

and`K`

expressions.

These aren’t a problem for our use of `Prefix`

in
generating values; but you may need to keep them in mind when using this
encoding for other purposes (in which case I’d recommend
*embracing* its nature as a prefix(-free) code, since it has all
sorts of cool applications!)

```
= check (uncurry3 (liftFun2 symbolGivenUnequalArgsCommutes))
main False)) genSymCom genSymCom) (tuple3 (Gen.fun (Gen.bool
```

```
symbolGivenUnequalArgsCommutes: OK (0.04s)
100 successful tests
All 1 tests passed (0.04s)
```

### Combining disagreement provers

We can never spot *all* disagreements, but the simple checks
above can be combined into a single, reasonably-useful function. Not
only is this easier to use than the individual checks, but we can also
use it as the `unEq`

argument of
the `symbolGivenUnequalArgs`

function, which lets us recurse through the given expressions looking
for disagreement at any depth!

```
provablyDisagree :: Com -> Com -> Bool
= distinctSymbolicHeads x y
provablyDisagree x y || unequalArgCount x y
|| symbolGivenUnequalArgs provablyDisagree x y
provablyDisagreeCommutes :: (Com, Com) -> Bool
= provablyDisagree x y == provablyDisagree y x provablyDisagreeCommutes (x, y)
```

`= check provablyDisagreeCommutes (tuple2 genSymCom genSymCom) main `

```
provablyDisagreeCommutes: OK (0.05s)
100 successful tests
All 1 tests passed (0.05s)
```

## An extensional equality test

Now we have ways to spot both extensional equality *and*
inequality (in certain situations), we can fold them over the result of
`agreeSym`

. We’ll begin by
extracting the numerical index provided by `race`

, which should tell us how many
inputs it takes for the given expressions to start agreeing:

```
-- | 'Just n' if the given expressions agree on 'n' inputs (they might also
-- | agree on fewer). 'Nothing' if it is determined that they won't agree; if
-- | that cannot be determined, the 'Delay' will never end.
extensionalInputs :: Com -> Com -> Delay (Maybe Natural)
= race (agreeSym x y) >>= go 0
extensionalInputs x y where go !n (Same _ , m, _) = Now (Just (n+m))
!n (Diff a b, _, s) = if provablyDisagree (toCom a) (toCom b)
go then Now Nothing
else race s >>= go (n+1)
agreeOnExtensionalInputs :: (Fuel, Com, Com, InputValues) -> Bool
=
agreeOnExtensionalInputs (n, x, y, inputs) True (extensionalInputs x y >>= check) n
runDelayOr where check Nothing = Now True
Just i) = same <$> sAt i (agree x y inputs) check (
```

`= check agreeOnExtensionalInputs (tuple4 genFuel genCom genCom genComs) main `

```
agreeOnExtensionalInputs: OK (0.11s)
100 successful tests
All 1 tests passed (0.11s)
```

This is easy to transform into a legitimate predicate for testing
extensional equality, that’s able to answer both `True`

and `False`

(at
least, some of the time):

```
-- | Whether the given expressions are extensionally equal, i.e. cannot be
-- | distinguished by an SK expression.
extEq :: Com -> Com -> Delay Bool
= isJust <$> extensionalInputs x y extEq x y
```

`extEq`

is a generalisation of
our previous checks:

```
extEqGeneralisesEqAndNormalEqAndEverAgree :: (Fuel, Com, Com) -> Bool
=
extEqGeneralisesEqAndNormalEqAndEverAgree (n, x, y) case ( go ( extEq x y)
, go ( everAgree x y)<$> normalEq x y)
, go (same ==) x y) of
, (Now False, Now True, _ , _ ) -> False
(Now False, _ , Now True, _ ) -> False
(Now False, _ , _ , True) -> False
(-> True
_ where go = runDelay n
```

```
= check extEqGeneralisesEqAndNormalEqAndEverAgree
main (tuple3 genFuel genCom genCom)
```

```
extEqGeneralisesEqAndNormalEqAndEverAgree: OK (0.16s)
100 successful tests
All 1 tests passed (0.16s)
```

## Next time

We’ve seen how extensional equality works *in theory*, part 3 explores these
definitions in more depth, to find out where I was going wrong in my
egglog.