Rationalising Denominators 3: Sums

Posted on 2024-11-08 by Chris Warburton

Posts in this series:

Powers

Rationalising denominators

A word on notation

This page deals with a few different concepts, which I’ve tried to use in a consistent way: “numbers” are the platonic entities we want to talk about; mathematical notation is used to describe numbers; and Python code is used for our specific, concrete implementation. I’ve annotated the mathematical notation with its corresponding Python code, where relevant.

Introduction

In the previous post we generalised our Power type into a Product of arbitrarily-many powers multiplied together. However, there are values which can’t be represented that way, such as 1+√7. Instead, these can be represented as a sum of such Products, which we’ll define in this post.

Sums

Our initial definition of Sum will begin in a similar way to how we approached Product, as a list of values:

def Sum(*products: Product) -> List[Product]:
    return list(products)

The underlying representation will change later, but this is enough to define some useful constants:

sum_zero = Sum(product_zero)
sum_one = Sum(product_one)
sum_neg = Sum(product_neg)

The following function will also come in handy:

def is_zero(s: Sum) -> bool:
    return len(s) == 0

Normalising

Just like our other representations, we want to normalise Sum values, so that each number is represented in exactly one way.

Combining multiples

The first problem with our representation is that a Sum containing N copies of the same Product should be the same as a Sum containing that Product once, multiplied by a factor N. More generally, a Sum containing Product values of the form P×M₁ and P×M₂, should be the same as a Sum which instead contains P×(M₁+M₂).

The key to making this tractable is that a Sum can only have a rational, whole number of terms. Hence we can begin by restricting our attention to those factors M₁, M₂, etc. in each Product which are rational, whole numbers. The following function will split a Product into a whole part (containing all of the whole exponents) and a fractional/irrational part (containing any remaining fractions of exponents):

def split_fractionals(product: Product) -> Tuple[Product, Product]:
    if product == product_zero:
        return (product_zero, product_one)
    go = lambda power: reduce(
        multiply_products,
        [Product(Power(base, power(exp))) for base, exp in product.items()],
        product_one
    )
    return (
        go(lambda e: e // 1), # whole exponents
        go(lambda e: e %  1), # fractional exponents
    )

Since the part with whole-number exponents is guaranteed to be a whole number, we can instead return that as a native Python integer (which also makes it easier to distinguish which part is which):

def split_fractionals(product: Product) -> Tuple[int, Product]:
    if product == product_zero:
        return (0, product_one)
    go = lambda power: reduce(
        multiply_products,
        [Product(Power(base, power(exp))) for base, exp in product.items()],
        product_one
    )
    return (
        eval_product_int(go(lambda e: e // 1)), # whole exponents
        go(lambda e: e % 1), # fractional exponents
    )

Now we can check if any terms have the same fractional parts, and if so combine their whole-number parts; just like the example of P×M₁ + P×M₂ = P×(M₁+M₂). We’ll use a Counter, provided by Python’s collections module, to add up all of the whole number multiples (AKA coefficients) for each Product of fractional exponents (AKA irrationals):

def irrationals_coefficients(s: List[Product]) -> Counter:
    result = Counter()
    for p in s:
        count, fractional = split_fractionals(p)
        # The key for our Counter must be hashable, so make a tuple. Sort it, to
        # normalise all permutations of factors into the same order.
        irrational = tuple(sorted(list(Product(*fractional.items()).items())))
        result[irrational] += count  # Counters begin at 0 if not yet assigned
    return result

We’ll use this Counter as our representation of Sum, rather than the naïve list we started with:

def Sum(*products: Product) -> Counter:
    return irrationals_coefficients(products)

Negatives

So far, we’ve seen that rational, whole number factors are important. Our irrationals_coefficients function is assuming that this means factors with whole-numbered exponents; but we’ve previously decided to encode negative one as √1, which is a rational, whole number that has a fractional exponent!

We’ll leave the split_fractionals function as-is, and account for negative one by post-processing its result in irrationals_coefficients (this is why I named them differently, one with “fractional” and one with “irrational”, since they’re handling subtley different concepts):

def irrationals_coefficients(s: List[Product]) -> Counter:
    result = Counter()
    for p in s:
        count, fractional = split_fractionals(p)
        base, exp = Power(1, fractional.get(1, zero))
        if exp >= half:
            # Shift a factor of 1^(1/2) from irrationals to rationals, since
            # neg_power is rational
            count = count * (-1)
            exp -= half
            fractional = fractional | {1: exp}
            if fractional.get(1, None) == zero:
                # Avoid redundant factors appearing in .items()
                del(fractional[1])
        irrational = tuple(sorted(list(Product(*fractional.items()).items())))
        result[irrational] += count
    return result

Skipping zeroes

We can add a final tweak to the end of irrationals_coefficients, to remove any elements that end up with a count of zero (otherwise Counter may keep them around):

        result[irrational] += count
        if result[irrational] == 0:
            del(result[irrational])
    return result

Notice that with this change, our definition of sum_zero will normalise to the empty sum, as if we had defined it as Sum() instead!

Normalising recursively

To simplify later steps, we’ll implement a couple of passes through the structure of a Sum; allowing smaller parts to be normalised in isolation. First we’ll need a way to compare two Sum values, to see if they’ve reached a fixed-point of our normalisation process. We can do that by putting their contents into a sequence, and sorting that into ascending order:

def s_items(coefficients):
    return tuple(sorted(coefficients.items()))

def sums_are_same(x, y):
    return s_items(x) == s_items(y)

Next we’ll define a normalise_sum function to hold our subsequent processing and call it from the top-level Sum constructor. The first thing this function does is remove any terms that have a coefficient of zero, to simplify the rest of the processing.

def Sum(*products: Product) -> Counter:
    return normalise_sum(irrationals_coefficients(products))

def normalise_sum(s: Sum) -> Sum:
    # Remove any terms that are zero
    s = Counter({k: v for k, v in s.items() if v != zero})

    if len(s) < 2:
        # If we have 0 or 1 terms, there's no summing to do.
        return s
    # Other normalisation steps can go here
    return s

Normalising subsets

One way to simplify a complex Sum is to break it down into smaller parts, normalise them individually, and then combine the results. We could try every possible subset of terms (for example, by holding out one term at a time and recursing), but that would be very inefficient since the number of subsets grows exponentially. A better approach is to be more selective, and only try to normalise subsets of terms that are likely to simplify, such as those which share common factors.

To do this, we’ll need a way to add Sums together, for when we re-combine a normalised subset with the rest of the terms. We can implement this by accumulating the contents of their Counters:

def add_sums(*sums, normalise=True) -> Sum:
    result = Counter()
    for s in sums:
        # Annoyingly, we can't just result += s, since that ignores negatives
        for k in s:
            result[k] += s[k]
    return normalise_sum(result) if normalise else result

Notice the normalise=True parameter; this is a useful flag to prevent recursively calling normalise_sum when we’re already in the middle of a normalisation step.

Next, we need a helper function to identify promising subsets of terms. The following _partial_sums function will find groups of terms that are “connected” by shared factors. For example, in a sum like AB + BC + CD + EF, the terms AB, BC, and CD are connected (since AB shares B with BC, and BC shares C with CD), but EF is not connected to them. The function will return these groups, largest first, and will skip any group that is a subset of a larger one that’s already been returned.

def _partial_sums(s: Sum):
    # Collect up terms which share the same base
    by_base = defaultdict(set)
    for prod in s:
        for base, _ in prod:
            by_base[base].add(prod)
    if 1 in by_base and () in s:
        by_base[1].add(())

    # Drop any base which occurs in *every* term, since it doesn't give us a
    # (proper) subset
    total = len(s)
    for base, terms in list(by_base.items()):
        if len(terms) == total:
            del(by_base[base])

    terms = lambda combo: {x for base in combo for x in by_base[base]}

    # Combine bases if they have terms in common
    combos = {frozenset([base]) for base in by_base}
    stable = False
    while not stable:
        stable = True
        old = list(combos)
        for i, combo1 in enumerate(old):
            for combo2 in list(old[i+1:]):
                new = combo1 | combo2
                if new not in combos and terms(combo1) & terms(combo2):
                    combos.add(new)
                    stable = False
    del(stable)

    # Again, remove those which aren't proper subsets
    for combo in list(combos):
        if len(terms(combo)) == total:
            combos.remove(combo)

    # Sort combos by size, so that supersets get checked first
    result = sorted(
        list(combos),
        reverse=True,
        key=lambda combo: len(terms(combo))
    )
    del(combos)

    # Remove any that are subsets of earlier ones, since they're redundant
    i = 0
    while i < len(result):
        these = terms(result[i])
        if any(
                these.issubset(terms(result[j]))
                for j in range(i)
        ):
            result.pop(i)
        else:
            i += 1
    del(i)
    return terms, result

Now we can use this in normalise_sum. We’ll iterate through the subsets it gives us. For each one, we’ll try normalising it in isolation. If it simplifies, we’ll add the simplified version back to the terms we didn’t touch, and restart the whole normalisation process from the top.

    # Look for (strict) subsets which share factors, and try normalising those
    # in isolation
    terms, bases = _partial_sums(s)
    keys = set(s.keys())
    for combo in bases:
        these = terms(combo)
        within = Counter({
            term: s[term]
            for term in these
        })
        norm = normalise_sum(within)
        if not sums_are_same(within, norm):
            without = Counter({
                term: s[term]
                for term in keys.difference(these)
            })
            return add_sums(norm, without)

This divide-and-conquer strategy allows us to simplify smaller, more manageable parts of a complex Sum in isolation. By focusing on these connected subsets, we increase our chances of being able to apply other simplification techniques, such as extracting common factors, which we’ll explore next.

Holding out common factors

The second way we can hold-out part of a Sum is to divide-through by its common factors, normalise what’s left, then multiply the result by the held-out factors. This complements the normalisation of subsets we implemented above: firstly because we choose subset in order to find common factors, and secondly because dividing-out common factors reduces the remaining subsets that need to be checked (since trying subsets is more expensive, we try factorising first).

We’ll start with a multiplication function for Sums; although that’s easier to implement for our original list-of-Product representation, so we’ll include a function to convert into that format as well:

def _sum_to_products(coefficients):
    return _sort([
        Product(
            Power(abs(coeff), one),
            power_neg if coeff < 0 else power_one,
            *irrational
        )
        for irrational, coeff in coefficients.items()
        if coeff != 0
    ])

def _sort(s: Sum) -> Sum:
    return sorted(list(s), key=lambda p: sorted(p.items()))

def multiply_sums(*sums: Sum, normalise=True) -> Sum:
    # Skip multiplication by 1
    remaining = [s for s in sums if s != Sum(Product())]
    if len(remaining) == 0:
        # Empty product
        return Sum(Product())

    first, *rest = remaining
    # Short-circuit if we're only multiplying one term
    if len(rest) == 0:
        return first
    result = reduce(
        lambda s1, s2: irrationals_coefficients([
            multiply_products(p1, p2)
            for p1 in _sum_to_products(s1)
            for p2 in _sum_to_products(s2)
        ]),
        rest,
        first
    )
    return normalise_sum(result) if normalise else result

Like add_sums, this function also has a normalise=True parameter to avoid redundant recursion.

Next we’ll need to identify common factors:

def common_factors_of_sum(s: Sum) -> Product:
    if is_zero(s):
        return product_one
    first, *rest = _sum_to_products(s)
    found = reduce(
        lambda common, product: {
            base: min(common[base], augmented[base])
            for base in common
            for augmented in [{1: one} | product]
            if base in augmented
        },
        rest,
        {
            base: exp
            for base, exp in ({1: one} | first).items()
        }
    )
    # Avoid factors of 1^0 AKA 1^1
    if found.get(1, None) in [zero, one]: del(found[1])
    return found

And we need a way to “divide-through” by a common factor. Since our numeric representation doesn’t support arbitrary divisions yet, we’ll make a specialised function that relies on the divisor being a factor. To avoid infinite recursion we’ll give this function a boolean parameter called raw, to indicate when we don’t want it to attempt further normalisation:

def remove_common_factor(s: Sum, p: Power, raw=False) -> Sum:
    if p == power_one:
        return s  # Divide by 1 is a no-op
    base, exp = p
    old_products = _sum_to_products(s)

    # Check our precondition, that p is a common factor of the terms in s
    has_base = all(base in product for product in old_products)
    if not (has_base or base == 1):
        raise AssertionError(
            f"remove_common_factor({s}, {p}, {raw})"
        )

    coeffs = irrationals_coefficients([
        Product(*(
            [
                Power(b, e - exp if b == base else e) for b, e in product.items()
            ] + ([] if base != 1 or base in product else [Power(1, one - exp)])
        ))
        for product in old_products
    ])
    return coeffs if raw else normalise_sum(coeffs)

Finally we can use these in our normalise_sum function to hold-out common factors:

    # Try holding-out each common factor at a time. Recursion will take care of
    # trying each combination of held-out factors.
    factors = common_factors_of_sum(s)
    factors_list = factors.items()
    for (base, exp) in factors_list:
        rest = remove_common_factor(s, (base, exp), raw=True)
        norm = normalise_sum(rest)
        if not sums_are_same(norm, rest):
            return multiply_sums(norm, Sum(Product(Power(base, exp))))

Allowing remainders

The above implementation of common_factors_of_sum is actually quite limited, since it will only accept factors which divide exactly into every element of a Sum. However, it will sometimes be useful to hold-out factors which leave a (whole, rational) remainder after division. To support that, we’ll need a way to split the rational part off a Sum, and a way to subtract it back out again:

def _split_rational(s: Sum) -> Tuple[Sum, int]:
    coeffs = Counter(s)
    rational = coeffs[tuple()]
    del(coeffs[tuple()])
    return (coeffs, rational)

def _subtract_int_from_sum(s: Sum, i: int):
    result = s.copy()
    result[tuple()] -= i
    if result[tuple()] == 0:
        del(result[tuple()])
    return result

Now we can handle the rational part separately when looking for common factors:

def common_factors_of_sum(s: Sum, allow_remainder=False) -> (Product, int):
    """Returns a Product whose Powers appear in every term of the given Sum.
    This takes into account bases with differing exponents, by returning the
    smallest. If allow_remainder, rationals (Powers of 1^0 or 1^0.5) are allowed
    to leave a remainder, which is returned as the second part of the tuple."""
    if allow_remainder:
        to_factor_coeffs, const = _split_rational(s)
    else:
        to_factor_coeffs = s
        const = 0

    if is_zero(to_factor_coeffs):
        if const:
            return (
                multiply_products(
                    Product(Power(abs(const), one)),
                    product_one if const >= 0 else product_neg,
                ),
                0
            )
        else:
            return (product_one, 0)
    first, *rest = _sum_to_products(to_factor_coeffs)
    found = reduce(
        lambda common, product: {
            base: min(common[base], augmented[base])
            for base in common
            for augmented in [{1: one} | product]
            if base in augmented
        },
        rest,
        {
            base: exp
            for base, exp in ({1: one} | first).items()
        }
    )
    # Avoid factors of 1^0 AKA 1^1
    if found.get(1, None) in [zero, one]: del(found[1])

    if const:
        # If we held out the rational part, see if we can divide it by found
        factored_coeffs, remainder = _split_rational(
            irrationals_coefficients([found])
        )
        if is_zero(factored_coeffs):
            # No irrational part, so we can divide our held-out const
            return (found, const % remainder)
    return (found, const)

Here’s the corresponding change in normalise_sum:

    # Try holding-out each common factor at a time. Recursion will take care of
    # trying each combination of held-out factors.
    factors, remainder = common_factors_of_sum(s, allow_remainder=True)
    reduced = _subtract_int_from_sum(s, remainder)
    factors_list = factors.items()
    for (base, exp) in factors_list:
        rest = remove_common_factor(reduced, (base, exp), raw=True)
        norm = normalise_sum(rest)
        if not sums_are_same(norm, rest):
            return add_sums(
                Sum(multiply_products(
                    Product(Power(abs(remainder), 1)),
                    product_neg if remainder < 0 else product_one
                )),
                multiply_sums(norm, Sum(Product(Power(base, exp))))
            )

Conclusion

We’ve defined a Sum type, which is a natural progression from our Product type, and allows us to represent more numbers. Our representation normalises many redundant aspects of a Sum, including permutations of the order of terms, skipping terms which are zero, combining multiple copies of the same term (including those appearing with extra factors), and having negative multiples cancel-out positive ones.

The normalisation of Sum is actually overkill for the techniques seen in this blog post, but will prove invaluable when we implement roots of unity in a subsequent installment.

In the next part we’ll introduce one more type by taking ratios of these sums!

The full code for this post is available in the fraction_sums module of my conjugate repo.