# Bottom, Null, Unit and Void

When programming (or theorem-proving), it’s important to distinguish
between the *empty type* and the *unit type*. In (type)
theory there are *no* values of the empty type, and there is
*one* value of the unit type. In a language without “bottom” (AKA
`_|_`

),
like Agda, we might define them something like this:

```
data Empty : Type
data Unit : Type where
: Unit unit
```

This says that `Empty`

is a
`Type`

,
that `Unit`

is a
`Type`

,
that `unit`

is a `Unit`

. Since
we’ve given no constructors (or “introduction forms”) for `Empty`

, there’s
no way to make one. Likewise we can write function types like `Int -> Empty`

which have no values (since there’s no way for any function to return a
value of type `Empty`

), we can
write tuple types like `Pair Int Empty`

that have no values, etc. We can also write function types like `Empty -> Int`

which *do* have values (e.g. `const 42`

),
but which can never be called (since there are no `Empty`

values
to give as an argument).

Incidentally, the fact that we can have types with no values is one
of the reasons we can’t have a value with type `forall a. a`

(in Haskell) or `(a : Type) -> a`

(in Agda): they claim to return a value of any type, but we might ask
for an `Empty`

.

The unit type is trivial, literally. Since there’s only one value of
this type (`unit`

), whenever we
know that a value will have type `Unit`

we can
infer that the value *must* be `unit`

, and hence:

- If it’s a return value, we can always optimise the function to
simply
`unit`

- If it’s an argument, it has no effect on the return value (there’s nothing to branch on).

In this sense, the unit type contains 0 bits of information (compared
to `Bool`

which contains 1 bit).

As an aside, `Either`

is
called a “sum type” because it contains all the values from its first
argument *plus* those of its second: `Either Unit Unit`

is equivalent to `Bool`

(with
values `Left unit`

and
`Right unit`

).
Pairs are called “product types” since they contain every combination of
their first and second arguments, which is the number of values in each
*multiplied* together. In this sense `Empty`

acts
like zero: the type `Either Empty a`

doesn’t contain any `Left`

values,
so it’s equivalent to `a`

, just
like `0 + a = a`

for numbers. Likewise `Pair Empty a`

contains no values, since there’s nothing to put in the first position,
just like `0 * a = 0`

for numbers. `Either Unit a`

acts like `1 + a`

, since we have all of the values of `a`

(wrapped in `Right`

) plus
the extra value `Left unit`

;
note that this is also the same as `Maybe a`

, where
`Right`

acts like `Just`

and `Left unit`

acts
like `Nothing`

. `Pair Unit a`

is the same as `a`

since each
`a`

value can only be paired with
one thing, and we know that thing is always `unit`

; this is just like
`1 * a = a`

with numbers.

If we want to compare these things in “real” languages, it can get
confusing, especially since a lot of the terminology is overloaded. In
Haskell the type `Unit`

is
written `()`

and the value `unit`

is *also* written `()`

.

In Haskell the type `Empty`

is
called `Void`

, but
that is *not* the same as `void`

in languages
like Java! In particular, remember that we cannot have a function which
returns `Empty`

, so
`void`

being
`Empty`

would mean we could not implement methods like `public static void putStrLn(String s)`

.
Such methods certainly *do* exist, since they’re all over the
place in Java, but what happens if we call them? They will run, and
return a value back to us (hence it can’t be `Empty`

). What
*is* the value we get back? We know, without even running the
method, that we’ll get back `null`

. That’s just
like the `Unit`

type
(where we know the value will be `unit`

without having to run the code).
Hence Java’s `void`

acts like
`Unit`

,
and `null`

acts like `unit`

.

If we follow a similar line of reasoning in Python, we find that
`None`

acts
like `unit`

(or Java’s `null`

) and hence
`NoneType`

is like `Unit`

(or
Java’s `void`

). AFAIK
Python doesn’t have anything which acts like `Empty`

(Haskell’s `Void`

) since,
lacking any values, `Empty`

is only
useful for type-level calculations (of the sort which get erased during
compilation), which Python doesn’t tend to do.

That just leaves “bottom”. There are a couple of ways to think about
it: we can be “fast
and loose” where we ignore bottom as a nuisance, which mostly works
since bottom isn’t a “proper” value: in particular, we can’t branch on
it; hence any time we do a calculation involving bottoms which results
in a “proper” value, we know that the bottoms were irrelevant to the
result, and hence can be ignored. Any time a bottom *is*
relevant, we have to abandon our nice, logical purity in any case, since
catching them requires `IO`

, so why
bother complicating our pure logic by trying to include them?

Alternatively we can treat bottom as an extra value of every type, in
a similar way to `null`

inhabiting
every type in Java. From this perspective Haskell’s `Void`

type
*does* contain a value (bottom), and the `()`

type contains *two* values
(`()`

and bottom). In this sense
we might think of Java’s `void`

corresponding to Haskell’s `Void`

, but I
find this line of thinking difficult to justify. In particular, we
*can* branch on `null`

in Java (in
fact we *should* be doing such “null checks” all the time!),
whereas (pure) Haskell doesn’t even let us tell whether we have `()`

or bottom.

As a final thought, whenever comparing dynamic and static languages,
it’s important to not conflate different concepts which just-so-happen
to have similar names. In particular I find it useful to think of
dynamically typed languages as being “uni-typed”:
that is, every value has the same type, which we might write in Haskell
as a sum like `data DynamicValue = S String | I Int | L List | O Object | E Exception | ...`

.
Python is actually even simpler than this, since “everything is an
object” (not quite as much as in Smalltalk, but certainly more than
e.g. Java), so Python values are more like a pair of a class (which
itself is a Python value) and a collection of instance properties.

This is important because “dynamic types”, like in Python, aren’t
directly comparable to “static types” like those of Haskell; they’re
more comparable to “tags” (e.g. constructors). In particular, think
about a Python function branching based on its argument’s “dynamic
type”; this is the same as a Haskell function branching on its
argument’s *constructor*. What does this tell us about “bottom”
in Python? From this perspective, there isn’t one (at least not reified;
an infinite loop might be comparable, but it’s not something we can
e.g. assign to a variable). Python’s `None`

is just a
normal value like any other, in the same way that Haskell’s `Nothing`

is a
normal value (we might sometimes use it to *stand for* an error,
but that’s not inherent to the system); likewise Python’s exceptions are
*also* normal values (we can assign them to variables, return
them from functions, etc.); the idea of “throwing and catching” (or
`raise`

/`except`

for
Python) is actually a perfectly normal method of control flow (it’s
actually a limited form of delimited
continuation), and this is orthogonal to error representation and
handling.

This makes raising an exception in Python *very* different to
triggering a bottom in Haskell, since Haskell provides no way to branch
on a bottom (or whether we even have one). In Python we can raise a
value (with the exception “tag”) as an alternative to using `return`

, and we
can catch those values, inspect their tag and value, etc. to determine
what our overall return value will be, with none of this being visible
by the caller. To do anything like that in Haskell we need some “proper”
data to inspect and branch on, hence why I say `None`

,
exceptions, etc. are just normal values (even though they’re used to
represent errors, and we treat them specially because of that; but the
same could be said about `Either String a`

values in Haskell, for example). Consider that in Haskell the only way
to even check if a bottom exists is to use `IO`

, but the
idea behind `IO`

is that
it’s like `State RealWorld`

,
i.e. every `IO a`

value is
a *pair* of `(RealWorld, a)`

,
so conceptually we never “actually” see a bottom; it’s more like
triggering a bottom changes the `RealWorld`

, and
we’re branching on that.