How not to shuffle a list

A frequent thing that people want to do in making games or interactive applications is to shuffle a list. One common and intuitive approach that people take is to simply sort the list, but use a random number generator as the comparison operation. (For example, this is what’s recommended in Fuzzball’s MPI documentation, and it is a common answer that comes up on programming forums as well.)

This way is very, very wrong.

First, there are two approaches which work really well that I’ll describe briefly; one is the Fisher-Yates shuffle algorithm, which works by walking down the array and randomly swapping the current element with any of the elements which follow it (including itself). This is an \(\mathcal{O}(N)\) operation which is trivial to implement in pretty much any imperative language and can be done in-place.

If you are in an environment which does not allow array-level swapping like that (such as, say, a functional programming language or a spreadsheet), an alternate approach is to map the array to one where each element is prefixed with a random number, then sort based on that Random prefix, and then remove the prefix. Here is how to do it in Excel, and how to do it in Fuzzball MPI. This generally is an \(\mathcal{O}(N \log_2 N)\) operation: not as efficient, but it still generates high-quality results.

But why are these approaches better? Well, both of them have an equal chance of any input cell ending up in any output cell; there is absolutely no bias to where things might end up, because all of their placements are independent from one another, and each placement happens effectively once.

But the coin-flip sorting approach introduces a lot of bias; cells can move an arbitrary number of times, and where they can move to depends on where they are located within the array to begin with.

So, let’s measure this bias using a \(\chi^2\) test.

Briefly, \(\chi^2\) measures how close your output distribution is to an expected distribution. If we were shuffling our lists uniformly, then we’d expect every value to end up in every cell about the same number of times; for an array of length \(N\), with a shuffle repeated \(K\) times, you’d expect each value to be in each cell \(\frac{K}{N}\) times.

Here is what the \(\chi^2\) distribution is for the good algorithms, with \(N=235\) and \(K=100000\):


And here it is for three different random-sort algorithms, bubble sort, std::stable_sort, and merge sort, all using a coin-toss ordering function:


(The reason I went with std::stable_sort instead of std::sort is that the latter uses some optimizations that assume that the sort is transitive – which isn’t the case when you’re randomly flipping a coin. So std::sort tends to crash when this constraint is violated.)

For a better comparison of how the different algorithms stack up as \(N\) changes, here’s a chart:

N Fisher-Yates Random prefix Bubble sort std::stable_sort Merge sort
4 0.00005 pick-4.png 0.00004 prefix-4.png 0.139569 sort_bubble-4.png 0.158157 sort_std-4.png 0.00004 sort_merge-4.png
5 0.00002 pick-5.png 0.00005 prefix-5.png 0.218803 sort_bubble-5.png 0.273358 sort_std-5.png 0.0487565 sort_merge-5.png
6 0.00006 pick-6.png 0.00008 prefix-6.png 0.29487 sort_bubble-6.png 0.400601 sort_std-6.png 0.0330905 sort_merge-6.png
7 0.0000 pick-7.png 0.00006 prefix-7.png 0.364549 sort_bubble-7.png 0.527574 sort_std-7.png 0.0342361 sort_merge-7.png
8 0.00006 pick-8.png 0.00006 prefix-8.png 0.430965 sort_bubble-8.png 0.660637 sort_std-8.png 0.00005 sort_merge-8.png
10 0.0001 pick-10.png 0.00010 prefix-10.png 0.557523 sort_bubble-10.png 0.93249 sort_std-10.png 0.0335068 sort_merge-10.png
12 0.000127262 pick-12.png 0.000108561 prefix-12.png 0.674842 sort_bubble-12.png 1.19897 sort_std-12.png 0.024186 sort_merge-12.png
15 0.000148504 pick-15.png 0.000128046 prefix-15.png 0.831853 sort_bubble-15.png 1.60815 sort_std-15.png 0.0178444 sort_merge-15.png
18 0.000186571 pick-18.png 0.000189251 prefix-18.png 0.971268 sort_bubble-18.png 2.01685 sort_std-18.png 0.0259307 sort_merge-18.png
22 0.000212744 pick-22.png 0.000221075 prefix-22.png 1.14734 sort_bubble-22.png 2.56097 sort_std-22.png 0.0321786 sort_merge-22.png
27 0.000249649 pick-27.png 0.000268954 prefix-27.png 1.34988 sort_bubble-27.png 3.24243 sort_std-27.png 0.0271932 sort_merge-27.png
33 0.000348344 pick-33.png 0.000328459 prefix-33.png 1.56487 sort_bubble-33.png 4.06064 sort_std-33.png 0.0116926 sort_merge-33.png
41 0.000417497 pick-41.png 0.000404196 prefix-41.png 1.82521 sort_bubble-41.png 5.1443 sort_std-41.png 0.0301371 sort_merge-41.png
51 0.000494652 pick-51.png 0.000514062 prefix-51.png 2.11672 sort_bubble-51.png 6.50646 sort_std-51.png 0.026951 sort_merge-51.png
63 0.000612439 pick-63.png 0.000614844 prefix-63.png 2.4303 sort_bubble-63.png 8.14154 sort_std-63.png 0.0057482 sort_merge-63.png
78 0.000753832 pick-78.png 0.000759035 prefix-78.png 2.7903 sort_bubble-78.png 10.1829 sort_std-78.png 0.0270242 sort_merge-78.png
97 0.000952913 pick-97.png 0.000986311 prefix-97.png 3.19424 sort_bubble-97.png 12.7711 sort_std-97.png 0.022091 sort_merge-97.png
121 0.0011958 pick-121.png 0.0012227 prefix-121.png 3.65048 sort_bubble-121.png 16.0263 sort_std-121.png 0.0137545 sort_merge-121.png
151 0.00147693 pick-151.png 0.0015004 prefix-151.png 4.16317 sort_bubble-151.png 2.2788 sort_std-151.png 0.026883 sort_merge-151.png
188 0.00185188 pick-188.png 0.00186338 prefix-188.png 4.7293 sort_bubble-188.png 2.72105 sort_std-188.png 0.0242407 sort_merge-188.png
235 0.00231242 pick-235.png 0.00233842 prefix-235.png 5.3744 sort_bubble-235.png 3.24268 sort_std-235.png 0.0189719 sort_merge-235.png

And here is a logarithmic graph showing how the total \(\chi^2\) error changes as \(N\) increases:


So, as N increases, you would expect the \(\chi^2\) to increase if \(K\) remains constant, as it did in this case (as there are fewer overlapping samples). And in fact the bubble sort and std::stable_sort approaches increase pretty much in parallel with the Fisher-Yates and random prefix approaches (aside from the discontinuity at \(N=128\) due to the performance-tuning heuristics that algorithm uses) – but this is in a logarithmic graph, meaning that the random sorts increase orders of magnitude faster than the other ones.

Merge sort is a bit interesting, in that it is highly dependent on the actual size of the array, and in particular how close it is to a power of 2, with perfect powers of 2 having the same \(\chi^2\) as the ideal algorithms. (I leave this as an exercise to the reader to consider why this would be.) But there’s still a lot of inherent bias, and it’s not even casually-predictable bias.

Anyway, this is why when shuffling a list, you should use Fisher-Yates if possible, or sort based on a randomly-generated unassociated number if not.

(Here is the rather ugly source code if you want to play with it yourself.)


Before commenting, please read the comment policy.

Avatars provided via Libravatar