Calculating Arity

Posted on 2015-11-12 by Chris Warburton

The arity of a value is the number of arguments it can accept. We can define arity by pattern-matching on types (a feature known as typecase):

arity (I -> O) = 1 + arity O
arity  T       = 0

However, it’s difficult to implement this kind of type-level function. Here are some approaches I tried in Haskell.

Type Classes

We want arity to have a different value, depending on the type we’re using. We can do this by overloading arity using type classes:

class HasArity a where
  arity :: Int

However, we hit problems if we try to implement this type class:

instance (HasArity o) => HasArity (i -> o) where
  arity = 1 + arity

instance HasArity t where
  arity = 0

There are two issues here. Firstly, we don’t know which arity to use in the expression 1 + arity:

/build/runghcXXXX27-0.hs:2:3: error: [GHC-39999]
    • Could not deduce ‘HasArity a0’
      from the context: HasArity a
        bound by the type signature for:
                   arity :: forall {k} (a :: k). HasArity a => Int
        at /build/runghcXXXX27-0.hs:2:3-14
      The type variable ‘a0’ is ambiguous
    • In the ambiguity check for ‘arity’
      To defer the ambiguity check to use sites, enable AllowAmbiguousTypes
      When checking the class method:
        arity :: forall {k} (a :: k). HasArity a => Int
      In the class declaration for ‘HasArity’
  |
2 |   arity :: Int
  |   ^^^^^^^^^^^^

We can fix this by turning arity into a function:

class HasArity a where
  arity :: a -> Int

instance (HasArity o) => HasArity (i -> o) where
  -- undefined inhabits every type, including i
  arity f = 1 + arity (f undefined)

instance HasArity t where
  arity _ = 0

The other problem is that these two instances overlap; the first is just a special case of the second:

main = print (arity not)

/build/runghcXXXX47-0.hs:10:15: error: [GHC-43085]
    • Overlapping instances for HasArity (Bool -> Bool)
        arising from a use of ‘arity’
      Matching instances:
        instance HasArity t -- Defined at /build/runghcXXXX47-0.hs:8:10
        instance HasArity o => HasArity (i -> o)
          -- Defined at /build/runghcXXXX47-0.hs:4:10
    • In the first argument of ‘print’, namely ‘(arity not)’
      In the expression: print (arity not)
      In an equation for ‘main’: main = print (arity not)
   |
10 | main = print (arity not)
   |               ^^^^^

Unlike pattern-matching, which tries alternatives from top to bottom, type classes don’t have such an ordering. This makes it difficult to implement a “catch-all” instance which works for any type other than functions.

Multi-Parameter Type Classes

The classic way to implement type-level computation in Haskell is to use multi-parameter type classes as a form of logic programming. Similar to Prolog, to represent a total function of the form arity x = a, we instead define a partial relation of the form arity x a. We can then implement that relation as a multi-parameter type class, relating the input type to its arity:

class HasArity a n where
  arity :: a -> n

Unfortunately, Haskell doesn’t have dependent types, so we can’t represent arities with regular numbers:

instance (HasArity o n) => HasArity (i -> o) (1 + n) where
  arity f = 1 + arity (f undefined)

instance HasArity t 0 where
  arity _ = 0

/build/runghcXXXX67-0.hs:3:47: error: [GHC-68567]
    Illegal type: ‘1’
    Suggested fix:
      Perhaps you intended to use DataKinds
      You may enable this language extension in GHCi with:
        :set -XDataKinds
  |
3 | instance (HasArity o n) => HasArity (i -> o) (1 + n) where
  |                                               ^

/build/runghcXXXX67-0.hs:3:49: error: [GHC-76037]
    Not in scope: type constructor or class ‘+’
  |
3 | instance (HasArity o n) => HasArity (i -> o) (1 + n) where
  |                                                 ^

/build/runghcXXXX67-0.hs:6:21: error: [GHC-68567]
    Illegal type: ‘0’
    Suggested fix:
      Perhaps you intended to use DataKinds
      You may enable this language extension in GHCi with:
        :set -XDataKinds
  |
6 | instance HasArity t 0 where
  |                     ^

Type-level programming in Haskell can only involve types, not regular values like numbers, so we need some type-level notion of numbers instead. In other words, an arity n is both a type and a number; it’s not a “numeric type”, like Int, since Int itself is not a number; likewise it’s not an element of a numeric type like Int, like 3 :: Int, since Ints are not types. We need a new notion, which satisfies both of these requirements:

-- Type level Peano numerals
data Zero   = Z
data Succ n = S n

To make life easier, we provide a toInt function to turn these Peano numbers into regular Ints:

class ToInt a where
  toInt :: a -> Int

instance ToInt Zero where
  toInt _ = 0

instance (ToInt n) => ToInt (Succ n) where
  toInt (S x) = 1 + toInt x

Now we can return these Peano numbers from our arity functions:

instance (HasArity o n) => HasArity (i -> o) (Succ n) where
  arity f = S (arity (f undefined))

instance HasArity t Zero where
  arity _ = Z

This gets us part of the way there; our arities are now living at the same level as our types, but now we have a couple of problems regarding ambiguity:

main = print (arity not)

/build/runghcXXXX89-0.hs:19:8: error: [GHC-39999]
    • Ambiguous type variable ‘a0’ arising from a use of ‘print’
      prevents the constraint ‘(Show a0)’ from being solved.
      Probable fix: use a type annotation to specify what ‘a0’ should be.
      Potentially matching instances:
        instance Show Ordering -- Defined in ‘GHC.Internal.Show’
        instance Show Integer -- Defined in ‘GHC.Internal.Show’
        ...plus 25 others
        ...plus 13 instances involving out-of-scope types
        (use -fprint-potential-instances to see them all)
    • In the expression: print (arity not)
      In an equation for ‘main’: main = print (arity not)
   |
19 | main = print (arity not)
   |        ^^^^^

/build/runghcXXXX89-0.hs:19:15: error: [GHC-39999]
    • Ambiguous type variable ‘a0’ arising from a use of ‘arity’
      prevents the constraint ‘(HasArity
                                  (Bool -> Bool) a0)’ from being solved.
        (maybe you haven't applied a function to enough arguments?)
      Probable fix: use a type annotation to specify what ‘a0’ should be.
      Potentially matching instances:
        instance HasArity t Zero
          -- Defined at /build/runghcXXXX89-0.hs:17:10
        instance HasArity o n => HasArity (i -> o) (Succ n)
          -- Defined at /build/runghcXXXX89-0.hs:14:10
    • In the first argument of ‘print’, namely ‘(arity not)’
      In the expression: print (arity not)
      In an equation for ‘main’: main = print (arity not)
   |
19 | main = print (arity not)
   |               ^^^^^

Functional Dependencies

These ambiguities are due to the open nature of type classes: we can declare new instances at any point, so the fact that we just-so-happen to have instances with Zero and Succ n as the second parameter doesn’t actually tell Haskell very much; it might come across other definitions elsewhere which use other types as the second parameter, and hence there is ambiguity.

To make the second parameter unambiguous we can use functional dependencies to indicate that the second parameter is completely determined by the first one:

class HasArity a n | a -> n where
  arity :: a -> n

Now we can make our two instances:

instance (HasArity o n) => HasArity (i -> o) (Succ n) where
  arity f = S (arity (f undefined))

instance HasArity t Zero where
  arity _ = Z

However, we still have an ambiguity problem; the “catch-all” parameter t of the second instance overlaps with the more-specific function type i -> o of the first:

/build/runghcXXXX109-0.hs:14:10: error: [GHC-46208]
    Functional dependencies conflict between instance declarations:
      instance HasArity o n => HasArity (i -> o) (Succ n)
        -- Defined at /build/runghcXXXX109-0.hs:14:10
      instance HasArity t Zero
        -- Defined at /build/runghcXXXX109-0.hs:17:10
   |
14 | instance (HasArity o n) => HasArity (i -> o) (Succ n) where
   |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Type Families

Type families are the solution we’ve been looking for. I originally gave up trying to understand these, after banging my head against GHC for a while. I was recently inspired to have another go, after attending a talk by Lennart Augustsson which involved some pretty hairy type-level hackery. During the course of his presentation, type families finally clicked for me, and I was able to bang out the following approach:

type family Arity t :: * where
  Arity (i -> o) = Succ (Arity o)
  Arity t        = Zero

This defines Arity as a closed type family. A type family is basically a type-level function (unlike the type-level relations encoded by multiparameter typeclasses). Type families can be open, allowing new pattern clauses to be added at any point, provided they don’t overlap (just like typeclass instantiation); alternatively they can be closed, where a fixed set of, potentially overlapping, pattern clauses are used (just like function definition). We use the latter, to allow our “catch-all” pattern to overlap with the i -> o pattern.

With this type family, we can finally get the arity of each type. However, one problem still remains: the result will only be accessible at the type-level. To associate a value with each Arity type, we can use type classes; we no longer have problems with ambiguity, since we can treat all arities in the same way.

First we use the fact that our type-level numbers are singletons, i.e. they only contain one value (ignoring undefined). We use a type class to associate the name singleton :: a with the single value of type a:

class Singleton a where
  singleton :: a

instance Singleton Zero where
  singleton = Z

instance (Singleton n) => Singleton (Succ n) where
  singleton = S singleton

We can use the Singleton class to write our arity function. All of our calculation is performed by Arity at the type level, which results in either Zero or Succ n for some n; we use singleton to access the (only) value of that result type:

arity :: (Singleton (Arity a)) => a -> Arity a
arity _ = singleton

-- Handy helper functions

-- Return the arity of a value, as an Int
intArity :: (Singleton (Arity a), ToInt (Arity a)) => a -> Int
intArity = toInt . arity

-- Pretty output
format s x = concat ["Arity of '", s, "' is ", show (intArity x)]

Now, at last, we can get the arity of values:

main = mapM_ putStrLn [
    format "Bool -> Bool"           not
  , format "String"                 "hello"
  , format "Bool -> Bool -> Bool"   (&&)
  , format "(a -> b) -> [a] -> [b]" map
  ]

Arity of 'Bool -> Bool' is 1
Arity of 'String' is 0
Arity of 'Bool -> Bool -> Bool' is 2
Arity of '(a -> b) -> [a] -> [b]' is 2

It’s also interesting to note the values which don’t work. For example, we fail to get the arity of the identity function id :: a -> a:

main = putStrLn (format "a -> a" id)

/build/runghcXXXX154-0.hs:34:18: error: [GHC-39999]
    • Ambiguous type variable ‘a0’ arising from a use of ‘format’
      prevents the constraint ‘(Singleton (Arity a0))’ from being solved.
      Probable fix: use a type annotation to specify what ‘a0’ should be.
      Potentially matching instances:
        instance Singleton n => Singleton (Succ n)
          -- Defined at /build/runghcXXXX154-0.hs:21:10
        instance Singleton Zero
          -- Defined at /build/runghcXXXX154-0.hs:18:10
    • In the first argument of ‘putStrLn’, namely
        ‘(format "a -> a" id)’
      In the expression: putStrLn (format "a -> a" id)
      In an equation for ‘main’: main = putStrLn (format "a -> a" id)
   |
34 | main = putStrLn (format "a -> a" id)
   |                  ^^^^^^

This is because the output type of id is completely polymorphic: it could be anything, including a function. We know the arity must be at least 1, but it could be arbitrarily higher; we don’t have enough information to know, and hence the compiler can’t finish the calculation.

To get around this, we can restrict the output type of id. We don’t have to completely monomorphise it, we just need to provide enough information to perform the type-level pattern-match for Arity:

main = mapM_ putStrLn [
    format "Bool       -> Bool      " (id :: Bool       -> Bool)
  , format "[a]        -> [a]       " (id :: [a]        -> [a])
  , format "Maybe a    -> Maybe a   " (id :: Maybe a    -> Maybe a)
  , format "(a -> [b]) -> (a -> [b])" (id :: (a -> [b]) -> (a -> [b]))
  ]

Arity of 'Bool       -> Bool      ' is 1
Arity of '[a]        -> [a]       ' is 1
Arity of 'Maybe a    -> Maybe a   ' is 1
Arity of '(a -> [b]) -> (a -> [b])' is 2