The Fisher-Yates shuffle (named after Ronald Fisher and Frank Yates), also known as the Knuth shuffle (after Donald Knuth), is an algorithm for producing a random permutation of a finite set; in plain terms, for randomly shuffling the set. A variant of the Fisher-Yates shuffle, known as Sattolo's algorithm, can be used to generate uniformly random cyclic permutations of length n. A correctly implemented Fisher-Yates shuffle is unbiased: every permutation is generated with equal probability. The modern version of the algorithm is also efficient, requiring time proportional to the number of elements being shuffled and no additional space.
The basic Fisher-Yates procedure resembles randomly drawing numbered slips from a hat, or cards from a deck, one after another until none remain. The algorithm provides an efficient and rigorous way to perform this operation, guaranteeing an unbiased result.
Original Fisher-Yates Method
The Fisher-Yates shuffle, in its original form, was described in 1938 by Ronald Fisher and Frank Yates in their book Statistical tables for biological, agricultural and medical research [1] (the latest edition of the book describes a different shuffling algorithm, attributed to C. R. Rao). Their method was designed for pencil and paper and used precomputed tables of random numbers as the source of randomness. It worked as follows:
1. Write down the numbers from 1 to N.
2. Pick a random number k between 1 and the number of numbers not yet struck out.
3. Counting from the low end, strike out the k-th number not yet struck out, and write it down at the end of a separate list.
4. Repeat from step 2 until all numbers have been struck out.
5. The sequence of numbers written down in step 3 is a random permutation of the original numbers.
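The pencil-and-paper steps above translate directly into Python. The following is an illustrative sketch (the function name and the use of Python's `random` module are choices made here, not part of the original method):

```python
from random import randrange

def fisher_yates_original(n):
    """Original 1938 method: repeatedly strike out the k-th
    remaining number and record it, until none remain."""
    remaining = list(range(1, n + 1))    # step 1: write numbers 1..N
    result = []
    while remaining:
        k = randrange(len(remaining))    # step 2: random 0-based index
        result.append(remaining.pop(k))  # step 3: strike out and record
    return result                        # step 5: a random permutation
```

Note that `list.pop(k)` makes each strike-out cost O(n), which is exactly the O(n²) behavior that motivated Durstenfeld's later modification.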
If the numbers chosen in step 2 are truly random and unbiased, so is the resulting permutation. Fisher and Yates described how to obtain such random numbers for any interval, and supplied tables that avoid bias. They also suggested a possible simplification: picking random numbers from 1 to N and discarding duplicates could be used to generate the first half of the permutation, with the more complex procedure applied only to the remaining half, since otherwise repeated numbers would come up too often.
Modern Algorithm
The modern version of the Fisher-Yates shuffle, designed for computer use, was introduced by Richard Durstenfeld in 1964 in Communications of the ACM, volume 7, issue 7, as "Algorithm 235: Random permutation" [2], and was popularized by Donald Knuth in volume 2 of his book The Art of Computer Programming as "Algorithm P" [3]. Neither Durstenfeld nor Knuth, in the first edition of his book, mentioned the algorithm of Fisher and Yates; they appear not to have been aware of it. In the second edition of The Art of Computer Programming, however, this omission was corrected [4].
The algorithm described by Durstenfeld differs from that of Fisher and Yates in a small but significant way. Rather than have a computer implementation waste time counting the remaining numbers in step 3, Durstenfeld's version moves each chosen number to the end of the list by swapping it with the last number not yet chosen. This modification reduces the time complexity of the algorithm from O(n²) for the original to O(n) [5], and leads to the following algorithm.
To shuffle an array a of n elements (indices 0..n−1):

for i from n − 1 down to 1 do
    j ← random integer with 0 ≤ j ≤ i
    swap a[j] and a[i]
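This pseudocode can be rendered as a minimal Python sketch (the function name is chosen here for illustration):

```python
from random import randint

def fisher_yates_shuffle(a):
    """Durstenfeld's in-place Fisher-Yates shuffle: O(n) time,
    no extra space beyond the loop variables."""
    for i in range(len(a) - 1, 0, -1):  # i from n-1 down to 1
        j = randint(0, i)               # random j with 0 <= j <= i
        a[j], a[i] = a[i], a[j]         # swap a[j] and a[i]
```

After the call, `a` contains the same elements in a uniformly random order.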
The "Inside-Out" Algorithm
The Fisher-Yates shuffle in Durstenfeld's variant is an in-place shuffle: given a filled array, it rearranges the elements within that array rather than producing a rearranged copy. This can be a significant advantage when shuffling large arrays.
Efficiency can be improved slightly by combining initialization and shuffling, using the "inside-out" version of the shuffle. In this version, source element i is placed at a random position j among the first i + 1 positions of the output, after the element previously at position j has been copied to position i. If the random number chosen equals i, the as yet unassigned value is copied first, but it is immediately overwritten by the correct value. No separate initialization is needed in this variant, and no swaps are performed. If the source is defined by a simple function, such as the integers 0 to n − 1, the source can simply be replaced by that function, since it is never altered during execution.
To create an array a of n randomly shuffled source elements:

for i from 0 to n − 1 do
    j ← random integer with 0 ≤ j ≤ i
    a[i] ← a[j]
    a[j] ← source[i]
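A Python sketch of the inside-out variant (names are illustrative; the `j == i` branch makes explicit the case where the unassigned slot would be copied onto itself):

```python
from random import randint

def inside_out_shuffle(source):
    """Builds a shuffled copy of `source` while initializing it;
    no separate initialization pass and no swaps."""
    a = []
    for i, x in enumerate(source):
        j = randint(0, i)   # 0 <= j <= i
        if j == i:
            a.append(x)     # value lands directly in the new slot
        else:
            a.append(a[j])  # a[i] <- a[j]
            a[j] = x        # a[j] <- source[i]
    return a
```

The source sequence is left unchanged, matching the in-place versus copy distinction drawn above.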
The correctness of the inside-out shuffle can be proved by induction: each of the n! possible sequences of random numbers produced during a run of the algorithm yields a distinct permutation, so every permutation is obtained exactly once.
Another advantage of this approach is that it produces uniformly distributed permutations even when n, the number of source elements, is not known in advance. Below, the array a is built iteratively, starting from empty; a.length denotes its current length.
while the source has more elements do
    j ← random integer with 0 ≤ j ≤ a.length
    if j = a.length
        a.add(source.nextElement)
    else
        a.add(a[j])
        a[j] ← source.nextElement
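The unknown-length variant can be sketched in Python as a single pass over a stream (an illustrative sketch; the function name is chosen here):

```python
from random import randint

def online_shuffle(source):
    """Shuffle a stream of unknown length: each incoming element is
    appended, then possibly exchanged with a random earlier slot."""
    a = []
    for x in source:            # consume one element at a time
        j = randint(0, len(a))  # 0 <= j <= a.length
        if j == len(a):
            a.append(x)         # new element goes to the end
        else:
            a.append(a[j])      # copy a[j] to the end...
            a[j] = x            # ...and put the new element at j
    return a
```

Because `source` is only iterated once, this works with generators and other streams whose length cannot be queried.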
Examples
With Pencil and Paper
As an example, we will permute the numbers from 1 to 8 using the original Fisher-Yates method. First, write the numbers down on paper:

| Interval | Selected | Draft | Result |
|---|---|---|---|
| | | 1 2 3 4 5 6 7 8 | |
Now draw a random number k from 1 to 8 (say, 3), cross out the k-th (that is, the third) remaining number, which is of course 3, and transfer it to the result:

| Interval | Selected | Draft | Result |
|---|---|---|---|
| 1-8 | 3 | 1 2 4 5 6 7 8 | 3 |
Now draw a second random number, this time from the interval 1-7: say, 4. Cross out the fourth number not yet struck from the draft (this is the number 5) and add it to the result:

| Interval | Selected | Draft | Result |
|---|---|---|---|
| 1-7 | 4 | 1 2 4 6 7 8 | 3 5 |
Now draw a random number from the interval 1-6, then from 1-5, and so on, repeating the crossing-out process described above:

| Interval | Selected | Draft | Result |
|---|---|---|---|
| 1-6 | 5 | 1 2 4 6 8 | 3 5 7 |
| 1-5 | 3 | 1 2 6 8 | 3 5 7 4 |
| 1-4 | 4 | 1 2 6 | 3 5 7 4 8 |
| 1-3 | 1 | 2 6 | 3 5 7 4 8 1 |
| 1-2 | 2 | 2 | 3 5 7 4 8 1 6 |
| | | | 3 5 7 4 8 1 6 2 |
Modern Method
We now do the same using Durstenfeld's method. This time, instead of crossing out the chosen numbers and copying them elsewhere, we swap them with the numbers not yet chosen. As before, start by writing out the numbers from 1 to 8:

| Interval | Selected | Draft | Result |
|---|---|---|---|
| | | 1 2 3 4 5 6 7 8 | |
Select the first random number from 1 to 8; let it be 6, so we swap the 6th and 8th numbers in the list:
| Interval | Selected | Draft | Result |
|---|---|---|---|
| 1-8 | 6 | 1 2 3 4 5 8 7 | 6 |
Select the next random number, from the interval 1-7; let it be 2. Now swap the 2nd and 7th numbers:
| Interval | Selected | Draft | Result |
|---|---|---|---|
| 1-7 | 2 | 1 7 3 4 5 8 | 2 6 |
Choose the next random number from the interval 1-6; let it be 6, which means the 6th number stays in place (after the previous swaps, the number 8 is there now). We continue in this way until the whole permutation is formed:
| Interval | Selected | Draft | Result |
|---|---|---|---|
| 1-6 | 6 | 1 7 3 4 5 | 8 2 6 |
| 1-5 | 1 | 5 7 3 4 | 1 8 2 6 |
| 1-4 | 3 | 5 7 4 | 3 1 8 2 6 |
| 1-3 | 3 | 5 7 | 4 3 1 8 2 6 |
| 1-2 | 1 | 7 | 5 4 3 1 8 2 6 |
| | | | 7 5 4 3 1 8 2 6 |
Variants
Sattolo Algorithm
A very similar algorithm, generating uniformly distributed cycles of (maximal) length n, was published in 1986 by Sandra Sattolo [6]. The only difference between Durstenfeld's algorithm and Sattolo's is in step 2: in Sattolo's algorithm, the random number j is chosen from the interval from 1 to i − 1, not from 1 to i. This simple modification means the algorithm always produces single-cycle permutations.
In fact, as described below, it is easy to implement Sattolo's algorithm by accident when intending to implement the Fisher-Yates shuffle. This mistake restricts the generated permutations to the smaller set of (n − 1)! cycles of length n, instead of the full set of n! possible permutations.
The fact that Sattolo's algorithm always produces a cycle of length n can be shown by induction. Suppose the first iteration moves element n to some position k < n, and that the remaining n − 1 iterations, by the induction hypothesis, form a cycle of length n − 1 on the first n − 1 elements. Trace the chain of movements: take some element, it moves to some position p, displacing the element located there to some new position, and so on; by the hypothesis, we return to the starting position only after visiting every other position. Now compare the permutation π of all n elements with the permutation σ that the swaps of the first n − 1 elements produce on their own, and let l be the position to which σ sends the element at position k. Tracing the movements as above, π and σ agree until position k is reached. In π, the element at position k moves to the last position instead of to position l, and the element at the last position moves to position l. From that point on, the sequence of movements in π again coincides with σ, so all n positions are visited before the chain returns to its starting point, as required.
As with the proof that the permutations are equiprobable, it suffices to show that Sattolo's algorithm produces (n − 1)! distinct outcomes, which, given an unbiased random number generator, occur with equal probability; these (n − 1)! outcomes exactly cover the set of cycles of length n.
A simple Python implementation of Sattolo's algorithm:

from random import randrange

def sattoloCycle(items):
    i = len(items)
    while i > 1:
        i = i - 1
        j = randrange(i)  # 0 <= j <= i-1
        items[j], items[i] = items[i], items[j]
Comparison with other shuffling algorithms
The Fisher-Yates shuffle is quite efficient; indeed, its asymptotic time and space complexity are optimal. With a high-quality, unbiased random number generator, it also guarantees an unbiased result. A further advantage is that if only part of the resulting permutation is needed, the algorithm can be stopped halfway, or stopped and restarted repeatedly.
An alternative method assigns a random number to each element of the set and then sorts the set by these numbers. The sorting step takes O(n log n) time even in the best case, which compares unfavorably with the O(n) running time of the Fisher-Yates shuffle. Like the Fisher-Yates shuffle, the sorting method produces unbiased permutations, and it is less sensitive to possible defects in the random number generator. However, care must be taken that the assigned random numbers are never duplicated, since sorting algorithms generally do not order tied elements randomly.
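The assign-and-sort method can be sketched as follows (an illustrative sketch; `random.random()` keys collide only with negligible probability, but a robust implementation would still have to handle the ties discussed above):

```python
import random

def sort_shuffle(items):
    """Shuffle by pairing each element with a random key and sorting
    by the keys: O(n log n), versus O(n) for Fisher-Yates."""
    keyed = [(random.random(), x) for x in items]
    keyed.sort(key=lambda pair: pair[0])  # sort only by the random key
    return [x for _, x in keyed]
```

Sorting only on `pair[0]` (rather than on the whole tuple) avoids falling back to comparing the elements themselves when keys tie.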
A variant of the sorting method sorts the list using a comparison function that returns random results. This is an exceptionally bad method: it is very likely to produce a non-uniform distribution, which moreover depends on the sorting algorithm used [7][8]. For instance, quicksort compares the remaining elements of the list against a chosen pivot element (less than or greater than it) to determine their final positions; if each comparison returns "less" and "greater" with equal probability, the pivot ends up near the center with far greater probability than near the ends. A method based on merge sort can produce a more uniform distribution, but it is still flawed: merging two sequences by choosing between them with equal probability (until one is exhausted) does not give a uniform result, since the probability of choosing a sequence should be proportional to the number of elements remaining in it. In fact, no method based on coin flips, that is, on a sequence of binary choices, can produce a uniform distribution over permutations of more than two elements, since every event in such a scheme has a probability whose denominator is a power of two, while the required probability is 1/n!.
In principle, such shuffling methods can even cause the sort to loop forever or to make out-of-bounds memory accesses, because the correctness of a sorting algorithm may depend on properties of the ordering, such as transitivity [9]. While this kind of failure should not occur in sorts whose comparisons cannot be predicted from earlier comparisons, there can be reasons for such predictions: for example, the fact that an element must always compare equal to itself may be exploited, for efficiency, via a cached attribute or flag, and random comparisons then violate the sort's assumptions.
Potential sources of uneven distribution
Care must be taken when implementing the Fisher-Yates shuffle, both in the implementation of the algorithm itself and in the random number generator it is built on. Some common implementation errors are listed below.
Errors in the implementation of the algorithm
A common mistake when implementing the Fisher-Yates shuffle is to draw the random numbers from the wrong interval [10]. The defective algorithm may appear to work correctly, yet it cannot produce every permutation with equal probability, and it may be unable to produce some permutations at all. For example, a common off-by-one error is to choose the index j of the element to swap so that it is always strictly less than the index i it is swapped with. The result is not the Fisher-Yates shuffle but Sattolo's algorithm, which only produces permutations consisting of a single cycle involving all elements: in particular, no element can remain in its original position.
Similarly, choosing j from the full range of array indices at every iteration also produces non-uniform permutations, though less obviously so. This follows from the fact that such an implementation yields nⁿ distinct sequences of swaps, while an array of n elements has only n! possible permutations. Since nⁿ is never divisible by n! for n > 2 (n! is divisible by n − 1, which shares no prime factors with n), some permutations must occur more often than others. As a concrete example, consider shuffling the three elements [1, 2, 3]. There are 3! = 6 possible permutations of this set, but the algorithm produces 3³ = 27 swap sequences. Of these, [1, 2, 3], [3, 1, 2], and [3, 2, 1] each result from 4 of the 27 sequences, while each of the remaining three permutations results from 5.
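The 27-outcome count for three elements can be checked by exhaustively enumerating every sequence of choices made by the biased shuffle (a small illustrative script):

```python
from itertools import product
from collections import Counter

def biased_shuffle_counts(n):
    """Enumerate all n**n sequences of choices of the biased shuffle
    (j drawn from the full range 0..n-1 at every step) and count the
    permutations each produces."""
    counts = Counter()
    for choices in product(range(n), repeat=n):
        a = list(range(1, n + 1))
        for i, j in enumerate(choices):
            a[i], a[j] = a[j], a[i]
        counts[tuple(a)] += 1
    return counts

counts = biased_shuffle_counts(3)
# (1,2,3), (3,1,2) and (3,2,1) each arise from 4 of the 27 sequences;
# the other three permutations arise from 5 each.
```

Running this confirms the 4-versus-5 split claimed above: equal probabilities are impossible because 27 is not divisible by 6.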
The bias can be visualized as a matrix giving, for each element of a 7-element list, the probability of its ending up in each final position. For most elements, remaining in the initial position (the main diagonal of such a matrix) has the lowest probability, and moving one position to the left has the highest.
Non-Uniform Distribution from Modulo Reduction
The Fisher-Yates shuffle requires uniformly distributed random numbers from various ranges. Most random number generators, however, whether truly random or pseudorandom, supply numbers in a fixed range, say from 0 to 2³² − 1. A simple and commonly used way to reduce such numbers to a desired interval is to take the remainder of division by the interval's upper bound. But the shuffle needs random numbers in every interval from 0-1 up to 0-n, so some of these interval sizes will not evenly divide the generator's natural range; as a result, the distribution is not uniform, and small remainders occur more often.
Suppose, for example, that the generator yields random numbers between 0 and 99 (as in Fisher and Yates's original tables), and that numbers between 0 and 15 are wanted. Simply taking the remainder modulo 16 makes the numbers 0-3 appear about 17% more often than the rest. The reason is that 16 does not evenly divide 100: the largest multiple of 16 not exceeding 100 is 6 × 16 = 96, and the leftover numbers in the range 96-99 create the imbalance. The easiest way to avoid this problem is to discard such numbers before taking the remainder. Although such discards can in principle recur, the expected number of retries is always less than one.
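Rejection before taking the remainder can be sketched like this (the 0-99 raw source mirrors the example above; the function and parameter names are illustrative):

```python
from random import randrange

def bounded_rand(m, raw=lambda: randrange(100)):
    """Unbiased draw from 0..m-1 given a raw source of 0..99:
    discard raw values at or above the largest multiple of m."""
    limit = (100 // m) * m  # for m = 16 this is 96
    while True:
        x = raw()
        if x < limit:       # accept only 0..limit-1
            return x % m    # now every remainder is equally likely
```

For m = 16, the values 96-99 are rejected, so the chance of needing a retry on any given draw is only 4%.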
A similar problem arises with generators that return floating-point numbers, usually in the range [0, 1): the result is multiplied by the size of the desired range and rounded. The issue is that even carefully generated random floating-point values have finite precision, so only finitely many distinct values are possible; as above, these values fall into buckets that do not divide the range evenly, making some outcomes more likely than others.
Pseudorandom Generators: Problems of Internal State, Seeding, and Use
Additional problems arise when a pseudorandom number generator (PRNG) is used. The sequence such a generator produces is entirely determined by its internal state at the start of generation, so a shuffle driven by such a generator cannot produce more distinct permutations than the generator has internal states. Even when the number of possible states exceeds the number of permutations, some permutations may occur more often than others. To avoid a noticeably uneven distribution, the number of internal states should exceed the number of permutations by several orders of magnitude.
For example, the built-in pseudorandom number generator of many programming languages and libraries uses a 32-bit internal state, which means it can produce only 2³² different sequences. If such a generator is used to shuffle a deck of 52 playing cards, it can ever produce only a vanishingly small fraction of the 52! ≈ 2^225.6 possible permutations. A generator with fewer than 226 bits of internal state cannot produce all permutations of a 52-card deck; to obtain an approximately uniform distribution, a generator with at least about 250 bits of state is considered necessary.
Naturally, no pseudorandom number generator can produce more distinct sequences, over all possible seeds, than it has distinct seeds. Thus a generator with 1024 bits of internal state but a 32-bit seed can still produce only 2³² different sequences of permutations. More permutations can be obtained by drawing many numbers from the generator before using it in the shuffle, but this approach is very inefficient: drawing, say, 2³⁰ random numbers before the shuffle raises the number of reachable permutations only to 2⁶².
Yet another problem arises when a simple linear congruential PRNG is combined with the modulo reduction described above. The low-order bits of a linear congruential generator are less random than its high-order bits: the lowest n bits of the generator have a period of at most 2ⁿ. When the divisor is a power of two, taking the remainder keeps only the low-order bits, discarding the more random high-order ones, and thus significantly reduces randomness.
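The weakness of the low-order bits is easy to observe with a toy linear congruential generator (the constants below are common glibc-style values, used purely for illustration):

```python
def lcg(seed, a=1103515245, c=12345, m=2 ** 31):
    """A classic linear congruential generator, yielding successive states."""
    while True:
        seed = (a * seed + c) % m
        yield seed

g = lcg(42)
low_bits = [next(g) & 1 for _ in range(8)]
# Because a and c are both odd, each step flips the parity of the state,
# so the lowest bit simply alternates: its period is 2 for any seed.
```

A shuffle that reduced such outputs modulo a power of two would be driven almost entirely by this nearly deterministic low-bit pattern.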
Finally, even with an excellent generator, an implementation can be flawed by using the generator improperly. Imagine, for example, a Java implementation that creates a new generator for every call to the shuffle routine, without passing arguments to the constructor, so the generator is seeded from the current time (System.currentTimeMillis()). Two calls made less than a millisecond apart will then produce identical permutations, which is almost certain to happen if shuffling is performed repeatedly and quickly, resulting in an extremely uneven distribution of permutations. The same problem can occur when random numbers are drawn from two different threads. It is more correct to use a single static generator instance defined outside the shuffling routine.
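In Python terms, the same pitfall is avoided by sharing one generator instance across calls rather than constructing a fresh time-seeded one inside each call (a sketch; the module-level name is chosen here):

```python
import random

_rng = random.Random()  # created once; all shuffle calls share it

def shuffle(a):
    """Fisher-Yates shuffle driven by the single shared generator."""
    for i in range(len(a) - 1, 0, -1):
        j = _rng.randint(0, i)
        a[j], a[i] = a[i], a[j]
```

Because `_rng` is seeded once (from OS entropy by default), rapid successive calls advance one stream instead of restarting identical streams.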
See also
- RC4, a stream cipher based on shuffling an array
- Reservoir sampling, in particular Algorithm R, which is a specialization of the Fisher-Yates shuffle
Notes
- ↑ Fisher, R. A.; Yates, F. Statistical tables for biological, agricultural and medical research. 3rd ed. London: Oliver & Boyd, 1948. P. 26–27. (Note: the sixth edition, ISBN 0-02-844720-4, is available online, archived October 23, 2009, at the Wayback Machine, but it gives a different shuffling algorithm, due to C. R. Rao.)
- ↑ Durstenfeld, Richard (July 1964). "Algorithm 235: Random permutation". Communications of the ACM 7 (7): 420.
- ↑ Knuth, Donald E. The Art of Computer Programming, volume 2: Seminumerical Algorithms. Reading, MA: Addison-Wesley, 1969. P. 124–125.
- ↑ Knuth. The Art of Computer Programming, vol. 2. 3rd ed. Boston: Addison-Wesley, 1998. P. 145–146. ISBN 0-201-89684-2.
- ↑ Black, Paul E. Fisher–Yates shuffle. Dictionary of Algorithms and Data Structures. National Institute of Standards and Technology (December 19, 2005). Retrieved August 9, 2007. Archived June 4, 2013.
- ↑ Wilson, Mark C. (2004-06-21). "Overview of Sattolo's Algorithm", in Algorithms Seminar 2002–2004, F. Chyzak (ed.), summary by Éric Fusy. INRIA Research Report 5542: 105–108. ISSN 0249-6399.
- ↑ A simple shuffle that proved not so simple after all. require 'brain' (June 19, 2007). Retrieved August 9, 2007. Archived June 4, 2013.
- ↑ Doing the Microsoft Shuffle: Algorithm Fail in Browser Ballot. Rob Weir: An Antic Disposition (February 27, 2010). Retrieved February 28, 2010. Archived June 4, 2013.
- ↑ Writing a sort comparison function, redux. require 'brain' (May 8, 2009). Retrieved May 8, 2009. Archived June 4, 2013.
- ↑ Jeff Atwood. The Danger of Naïveté. Coding Horror: programming and human factors (December 7, 2007). Archived June 4, 2013.
Links
- Mike Bostock. Fisher–Yates Shuffle (January 14, 2012). Interactive example.