fanf: (Default)
[personal profile] fanf

https://dotat.at/@/2025-12-25-shuffle.html

Here's a pearlescent xmas gift for you!

There are four variants of the algorithm for shuffling an array, arising from two independent choices:

  • whether to swap elements in the higher or lower parts of the array
  • whether the boundary between the parts moves upwards or downwards

The variants are perfectly symmetrical, but they work in two fundamentally different ways: sampling or permutation.

The most common variant is Richard Durstenfeld's shuffle algorithm, which moves the boundary downwards and swaps elements in the lower part of the array. Knuth describes it in TAOCP vol. 2 sect. 3.4.2; TAOCP doesn't discuss the other variants.

(Obeying Stigler's law, it is often called a "Fisher-Yates" shuffle, but their pre-computer algorithm is arguably different from the modern algorithm.)

the four variants

Below, min and max are the inclusive bounds on the array to be shuffled; the arguments to rand() are the inclusive bounds on its return value; and the loop bounds are inclusive too. I chose this style to make the symmetries more obvious.

In all variants, it's possible for the indexes this (the boundary between the parts of the array) and that (chosen at random) to be the same, in which case the swap is a no-op.

I could have written the loop bounds as min and max instead of min+1 and max-1 to make the variants look as similar as possible, but it's more realistic to omit the loop iterations when this and that are guaranteed to be equal.

It should be clear that rand() is invoked for spans of each size between 2 and N (where N = max - min + 1) so the algorithms produce N! possible permutations as expected.

boundary moves down, pick from lower

shuffle(a, min, max)
    for this = max to min+1 step -1
        that = rand(min, this)
        swap a[this] and a[that]

boundary moves up, pick from higher

shuffle(a, min, max)
    for this = min to max-1 step +1
        that = rand(this, max)
        swap a[this] and a[that]

boundary moves down, pick from higher

shuffle(a, min, max)
    for this = max-1 to min step -1
        that = rand(this, max)
        swap a[this] and a[that]

boundary moves up, pick from lower

shuffle(a, min, max)
    for this = min+1 to max step +1
        that = rand(min, this)
        swap a[this] and a[that]

the two kinds of shuffle

To merge four variants into two kinds, I'll introduce some names that allow me to describe them independently of the direction of iteration. The todo part of the array is ahead of the iterator, i.e. lower if we are moving the boundary downwards, higher if we are moving it upwards. The done part of the array is behind the iterator, vice versa.

Earlier I called this the boundary between the parts, but that was a deliberate fencepost error which allowed me to gloss over an important difference. In fact the boundary is next to this, and which side it's on depends on the kind of shuffle.

shuffle by sampling

The first two variants above work by sampling.

In this kind of shuffle, the indexes this and that are on the todo side of the boundary. We pick an element at random from the todo part of the array, and swap it into position next to the boundary. Then the boundary is moved so that the picked element is appended to the done part of the array.

The loop invariant or inductive hypothesis is that the done part of the array is a random sample of all the elements from the original array.

shuffle by permutation

The last two variants above work by permutation.

In this kind of shuffle, the indexes this and that are on the done side of the boundary. First we move the boundary to append an element from todo to done. Then we pick a random index in done, and swap the element next to the boundary into that position.

The loop invariant or inductive hypothesis is that the done part of the array is a random permutation of the elements that were originally in the same prefix or suffix of the array.

coda

I was inspired to write this by "The Fisher-Yates shuffle is backward" and the examples that were added to Wikipedia's article on the Fisher-Yates shuffle a few months ago.

The article by "Possibly Wrong" correctly points out that there are two kinds of shuffle, but it misses out one of the four variants. Wikipedia's examples show all four variants but they are cluttered by conventional array indexing, and it doesn't explain there are actually two kinds of shuffle.

There's a classic off-by-one mistake that turns a shuffle into Sattolo's algorithm for cyclic permutations. It happens when the random bounds are too narrow, so that is never equal to this, and an element can never land in the same place that it started. This change has the same effect on both kinds of shuffle.

I enjoyed revisiting this old classic and trying to beautify it. I hope you like the results!

This account has disabled anonymous posting.
(will be screened if not on Access List)
(will be screened if not on Access List)
If you don't have an account you can create one now.
HTML doesn't work in the subject.
More info about formatting

If you are unable to use this captcha for any reason, please contact us by email at support@dreamwidth.org

December 2025

S M T W T F S
 123456
78910111213
14151617181920
21222324 252627
28293031   

Most Popular Tags

Style Credit

Expand Cut Tags

No cut tags
Page generated 2025-12-26 05:58
Powered by Dreamwidth Studios