[posts] [talks] [rss]

### Combinatorial Algorithms in Scala

Combinatorics is a branch of mathematics that mostly focuses on problems of counting the structures of given size and kind. The most famous and well-known examples of such problems might be often asked as job interview questions. This blog post presents four generational problems (combinations, subsets, permutations and variations) along with their purely functional implementations in Scala.

#### Implicit Classes

Scala’s implicit classes provide simple and composable way of extending the API of third-party classes. For example, the following implicit class extends default `Int` class within a new method `times(fn: Unit => Unit): Unit` that executes given function `fn` n-times.

This gives us a very neat usage way. All one need to do is import an implicit class into the current namespace and let the magic happened.

We’ll use this approach in order to extend Scala’s `List` with four new methods that implement our combinatorial algorithms. The only one restriction we have to satisfy here: new functions’ names shouldn’t conflict with an existent API. Thus, we’ll use a prefix `x` (from eXtended) for new functions. The following listing represents a skeleton class we’re going to implement.

This tiny class might be used as follows (in the exact way as `IntOps` was used below).

#### Optimistic Programming

Optimistic Programming is an implementation technique of recursive programs when it’s believed that a recursive function works as expected on a smaller input (on a sub-problem) in order to use its result for solving the full-size problem. In other words, the body of recursive function may be implemented in terms of following ideas: (a) when called recursively it gives the right answer for any sub-problem, but (b) some additional work should be done in order to merge these sub-problem solutions into the single solution of the entire problem. Doesn’t that sound optimistic? The recursive function is pretended to be correctly implemented before its body is actually being written.

An optimistic programming lie between Divide and Conquer and Dynamic Programming techniques. Rather then focussing on how sub-problems are being splited (whether or not the sub-problems overlap) an optimistic programming focuses on the nature of recursive programs and provides a simple tool making the programming of complex problems much easier.

We’ll use an optimistic programming for solving combinatorial problems in a functional setting.

#### Combinations

Imagine you’re given a standard deck of fifty-two cards and asked to select any two of them. Those pair of cards you’ll select is called a combination (i.e., a 2-combination). And there are 1326 such 2-card combinations that may be possibly selected from a standard card deck. More formally, a binomial coefficient defines the number of k-combinations from a set of `n` distinct elements.

The combination elements’ order doesn’t matter. So, `[a, b]` and `[b, a]` are the same combinations.

It’s time to use the power of optimistic programming for solving the problem of generating k-combinations. An optimistic programming guarantees that a recursive function being called on a sub-problem produces a correct answer. A sub-problem of generating k-combinations is generating (k-1)-combinations. The only one question’s left: how to solve an entire problem then? This is when the things become interesting. Obviously, there is should be an extra element in the set, which being added to a (k-1)-combination upgrades it to a full-size k-combination. A set’s extra element is nothing different from a regular set’s element. And set `S` itself is a recursive object, which without one element is still a set `S'` and may be processed recursively. Thus, the final solution contains both `S'`’s k-combinations and `S'`’s (k-1)-combinations with an extra element appended.

There are also two corner cases that we have to handle separately. There is nothing to do when `k > n` (combination’s size is greater then an entire set’s size). And there is no further grouping required if it’s a generation of 1-combinations.

#### Subsets

A set’s k-combination may also be referenced as a subset. The other combinatorial problem is generating all the subsets (all k-combinations, where `k = 1..n`) of a given set.

The implementation is straightforward - combinations of all the possible sizes should be merged together. That may be done by List’s `foldLeft` operation.

There are `2^n` subset of an n-size set. It’s choice of two: every set’s element is either taken or not into the particular subset.

#### Variations

Unlike combinations, the order of elements inside a variation does matter. Thus, tuples `[a, b]` and `[b, a]` are different variations (i.e., 2-variations). In general, variations are denoted as partial permutations or k-permutations, where `0 < k <= n`.

The number of k-permutations of `n` is the following product: `n * (n-1) * ... * (n-k+1)`. That’s a bit different from a binomial coefficient (the number of k-combinations of `n`): there is no `k!` in a denominator, since it counts all the possible k-permutations rather then treating them equal.

The same ideas of an optimistic programming may be used in generating the variations (k-permutations) of a given set. The corner cases are the same: there’s nothing to do with `k > n` or `k = 1`. Just like in combinations, these two cases should be handled separately. More interesting is the regular case: upgrading a recursively generated (k-1)-permutation to a full-size one. It’s no longer a problem of getting an extra element from a set, as well as the upgrading itself is no longer a merging.

Since the order does matter, an extra element should be inserted into the every possible place of a permutation rather then just being merged with it. So, instead of 1-by-1 mapping between unfinished and finished combination it comes to 1-by-k mapping for permutations: there are `k` places in (k-1)-permutation where an extra element may be inserted.

Ultimately, by analogy with k-combinations, k-permutations of `S` contain all the k-permutations of `S'`, where `S'` is a without-out-element version of `S`.

#### Permutations

Permutations are just set’s size variations or k-permutations with `k = n`. A permutation may also be viewed as a result of a set’s shuffle operation. In other words, every iteration of a shuffling the deck of cards process gives a new permutation. Permutations are counted by a product: `n * (n-1) * ... 1`, which is `n!`.

The implementation of a purely-functional algorithm of generating the permutations is quite simple in terms of variations.

#### Further Improvements

The full version of `CombinatorialOps` class might be found at GitHub. In order to reduce the memory footprint a bit of lazinesses may be involved by (a) replacing the output data type `List[List[A]]` with `Iterable[List[A]]` and (b) generating each piece of data on-demand.