Then sunk, and roll'd along the sand in blood.

Optimizing Differentiability in Weighted-Frequency Color Coding

Problem statement

For motivation, see my post about semantic traveler wristbands.

We have $n$ categories we'd like to color code. Each category $t$ has frequency $f_t$, where $\sum\limits_i f_i = 1$. We'd like to assign colors to each category $t$ so that the most frequent categories have very distinct colors, but less frequent categories may be similar in color to other ones.

A naive solution is to space out all the categories evenly. However, if there are a lot of infrequent categories, this will unfairly impact the differentiability of more frequent categories.


We will only consider hues, which can be arranged in a circle like this (imagine the two ends are glued together) [source Kalan:Wikipedia].

A color will be represented as a real number "mod 360", the meaning of which I will formalize. Or a computable number "mod 360", for the recursion theorists out there.

Linear programming? Maybe not.

My first guess was to try to formalize this using linear programming. Now I don't think it's possible to solve this with LP, but I'll show how I came to that conclusion for completeness (skip to the next section if you like).

If you're not familiar with linear programming, the basic idea is to formalize the problem as a set of variables where we want to maximize one objective function and we have several inequalities, but all expressions must be linear with real coefficients, no powers or roots allowed. More information can be found at PurpleMath or this video. Then we can use a simplex solver to find the values that maximize the objective function.

We present only three categories for illustrative purposes. Each color's variable is $c_i$ and $d_{i,j}$ is the distance from $i$ to $j$. I'm going to ignore frequencies for now because I don't need to consider them to show that linear programming is unsuitable for this problem.

# We want to maximize the distances between the colors. $$ d_{0, 1} + d_{0, 2} + d_{1, 2} $$ subject to
# Each color is between 0 and 360, tail exclusive. This maps nicely onto HSL hues with a bit of rounding. $$ \forall i.0 \leq c_i \lt 360 $$ # This constraint lets us compute distances easily without worrying about absolute values, without loss of generality. $$ c_0 \leq c_1 \leq c_2 $$ # The distance between two colors is the minimum of the "inner distance" $a$ and the "outer distance" $b$. $$ \forall i\forall j \gt i.d_{i,j} \leq a_{i, j} $$ $$ \forall i\forall j \gt i.d_{i,j} \leq b_{i,j} $$ # The inner distance between two colors is the difference. $$ \forall i\forall j \gt i.a_{i,j} = c_j - c_i $$ # If $a \leq b$, the outer distance is the distance from $a$ to 0 to $b$ via the boundary of the color circle. $$ \forall i\forall j \gt i.b_{i,j} = (c_i - 0) + (360 - c_j) $$

Plug it into a simplex solver and we get...$c_0 = 0, c_1 = c_2 = 180$. These aren't spread out! The problem is the total pairwise distance is 360, whereas the optimal solution, $c_0 = 0, c_1, = 120, c_2 = 240$ also has a total pairwise distance of 360. The linear nature of our expressions make it okay to shift around distances without incurring penalty (recall for a linear function $f$, $f(a+b) = f(a) + f(b)$).

We'd like to formalize that we'd rather have the distances "spread out" rather than be clumped together. In other words, increasing the distances between two colors should yield diminishing returns. An easy way to implement this is to just maximize the sum of the square roots of the distances. For example, $\sqrt{180} \lt \sqrt{90} + \sqrt{90}$. Unfortunately, our function is now non-linear so we can't use LP.

Approximating the solution

We can still try to use approximation methods to get a good result. First we want to weight some of the distances more than others, how should we come up with the weights? Here is one way.

We weight the importance of the pairwise distances of first two by their average frequency, the pairwise distances of the first three by their average frequency, etc. If you think another weighting strategy is better, try it out and let me know!

$$ f = \frac{\sum\limits_{i=0}^1 f_i}{2}\frac{d_{0,1}}{2} + \frac{\sum\limits_{i=0}^2 f_i}{3}\frac{(d_{0,1} + d_{0,2} + d_{1,2})}{3} $$ where $$ d_{ij} = \sqrt{\operatorname{min}(c_j - c_i, 360 - c_j + c_i)} $$

We can use scipy's scipy.optimize.fmin_slsqp function which uses sequential least squares to find an approximate global maximum. However there are a few issues. fmin_slsqp works better if we can specify the gradient of the function, but the minimum function is nondifferentiable, so we replace it with an approximation. $$ \operatorname{min}(x, y) \approx \left(\frac{1}{2}(x^{-k} + y^{-k})\right)^{-\frac{1}{k}} $$ where $k$ is a constant. This approximation gets better with higher $k$, but I've set it to 5 and it seems to work fine.

We also need to negate $f$ since the scipy function is a minimizer and we want to maximize, and finally we can use WolframAlpha to compute partial derivatives of $f$ with respect to each of the three color variables.

[In]: from scipy.optimize import fmin_slsqp 
[In]: f, gradient = ...
[In]: initial_guesses = [0, 120, 240]
[In]: bounds = [(0, 360), (0, 360), (0, 360)]
[In]: fmin_slsqp(f, initial_guesses, bounds=bounds, fprime=gradient)
[Out]: array([0., 165.78662431, 262.92332524])

Sample Results

I ran the program for the following input corresponding to the normalized three most spoken languages in France: French at .7431, English at .1869, German at .0700. Clearly French and English are very different, but German has hints of both cyan and red because it's not as far apart.

Language Swatch HSL
French (0, 100, 50)
English (165.78, 100, 50)
German (262.92, 100, 50)

Have anything to add? Let me know at surya at [this domain name].