How not to shuffle a list

A frequent thing that people want to do in making games or interactive applications is to shuffle a list. One common and intuitive approach that people take is to simply sort the list, but use a random number generator as the comparison operation. (For example, this is what’s recommended in Fuzzball’s MPI documentation, and it is a common answer that comes up on programming forums as well.)

This way is very, very wrong.

First, there are two approaches which work really well that I’ll describe briefly; one is the Fisher-Yates shuffle algorithm, which works by walking down the array and randomly swapping the current element with any of the elements which follow it (including itself). This is an \(\mathcal{O}(N)\) operation which is trivial to implement in pretty much any imperative language and can be done in-place.

If you are in an environment which does not allow array-level swapping like that (such as, say, a functional programming language or a spreadsheet), an alternate approach is to map the array to one where each element is prefixed with a random number, then sort based on that Random prefix, and then remove the prefix. Here is how to do it in Excel, and how to do it in Fuzzball MPI. This generally is an \(\mathcal{O}(N \log_2 N)\) operation: not as efficient, but it still generates high-quality results.

But why are these approaches better? Well, both of them have an equal chance of any input cell ending up in any output cell; there is absolutely no bias to where things might end up, because all of their placements are independent from one another, and each placement happens effectively once.

But the coin-flip sorting approach introduces a lot of bias; cells can move an arbitrary number of times, and where they can move to depends on where they are located within the array to begin with.

So, let’s measure this bias using a \(\chi^2\) test.

Briefly, \(\chi^2\) measures how close your output distribution is to an expected distribution. If we were shuffling our lists uniformly, then we’d expect every value to end up in every cell about the same number of times; for an array of length \(N\), with a shuffle repeated \(K\) times, you’d expect each value to be in each cell \(\frac{K}{N}\) times.

Here is what the \(\chi^2\) distribution is for the good algorithms, with \(N=235\) and \(K=100000\):

And here it is for three different random-sort algorithms, bubble sort, std::stable_sort, and merge sort, all using a coin-toss ordering function:

(The reason I went with std::stable_sort instead of std::sort is that the latter uses some optimizations that assume that the sort is transitive – which isn’t the case when you’re randomly flipping a coin. So std::sort tends to crash when this constraint is violated.)

For a better comparison of how the different algorithms stack up as \(N\) changes, here’s a chart:

N Fisher-Yates Random prefix Bubble sort std::stable_sort Merge sort
4 0.00005 0.00004 0.139569 0.158157 0.00004
5 0.00002 0.00005 0.218803 0.273358 0.0487565
6 0.00006 0.00008 0.29487 0.400601 0.0330905
7 0.0000 0.00006 0.364549 0.527574 0.0342361
8 0.00006 0.00006 0.430965 0.660637 0.00005
10 0.0001 0.00010 0.557523 0.93249 0.0335068
12 0.000127262 0.000108561 0.674842 1.19897 0.024186
15 0.000148504 0.000128046 0.831853 1.60815 0.0178444
18 0.000186571 0.000189251 0.971268 2.01685 0.0259307
22 0.000212744 0.000221075 1.14734 2.56097 0.0321786
27 0.000249649 0.000268954 1.34988 3.24243 0.0271932
33 0.000348344 0.000328459 1.56487 4.06064 0.0116926
41 0.000417497 0.000404196 1.82521 5.1443 0.0301371
51 0.000494652 0.000514062 2.11672 6.50646 0.026951
63 0.000612439 0.000614844 2.4303 8.14154 0.0057482
78 0.000753832 0.000759035 2.7903 10.1829 0.0270242
97 0.000952913 0.000986311 3.19424 12.7711 0.022091
121 0.0011958 0.0012227 3.65048 16.0263 0.0137545
151 0.00147693 0.0015004 4.16317 2.2788 0.026883
188 0.00185188 0.00186338 4.7293 2.72105 0.0242407
235 0.00231242 0.00233842 5.3744 3.24268 0.0189719

And here is a logarithmic graph showing how the total \(\chi^2\) error changes as \(N\) increases:

So, as N increases, you would expect the \(\chi^2\) to increase if \(K\) remains constant, as it did in this case (as there are fewer overlapping samples). And in fact the bubble sort and std::stable_sort approaches increase pretty much in parallel with the Fisher-Yates and random prefix approaches (aside from the discontinuity at \(N=128\) due to the performance-tuning heuristics that algorithm uses) – but this is in a logarithmic graph, meaning that the random sorts increase orders of magnitude faster than the other ones.

Merge sort is a bit interesting, in that it is highly dependent on the actual size of the array, and in particular how close it is to a power of 2, with perfect powers of 2 having the same \(\chi^2\) as the ideal algorithms. (I leave this as an exercise to the reader to consider why this would be.) But there’s still a lot of inherent bias, and it’s not even casually-predictable bias.

Anyway, this is why when shuffling a list, you should use Fisher-Yates if possible, or sort based on a randomly-generated unassociated number if not.

(Here is the rather ugly source code if you want to play with it yourself.)