How many shuffles are necessary to randomize a deck of cards? Mathematically, card-shuffling can be viewed as a random walk on a finite group and, thus, it can be modeled by a Markov chain. We measure the degree of “randomness” of a deck of cards after
shuffles using the Kullback-Leibler distance between the probability distribution of the Markov chain after
shuffles and the uniform probability distribution. Lastly, we discuss an intriguing phenomenon that resembles a phase transition.
__________
Introduction
Imagine that we have a deck of cards. We assign number
to the card at the top of the deck, number
to the next card, and so forth, until we assign number
to the card at the bottom of the deck. This ordered arrangement of cards can be represented by an
-tuple
. The simplest possible shuffle would be to take the card at the top of the deck and insert it at some random position in the deck, say, at the bottom. The new ordered arrangement of cards would then be represented by the
-tuple
. This shuffling technique is called the top-in-at-random shuffle [1]. In this paper, we will consider the dovetail shuffle instead [2].
In order to model the shuffling process, we need to introduce some mathematical terminology first. We define . Let
denote the symmetric group on set
, i.e.,
is the group that consists of all bijections
with function composition as the group operation. The elements
are called permutations. Given
, then
is the bijection obtained by applying first
and then
. Every ordered arrangement, or configuration, of a deck of
cards can be associated with an element
. A sequence of
shuffles can be thought of as a composition of
bijections
, as follows
.
Hence, the card to which we assigned number , and that was at position
initially, will be at position
after
shuffles. One can think of the shuffling process as a random walk on
. Thus, we can define a Markov chain on
. The shuffling method that one uses defines the transition probabilities of the Markov chain. It would be an interesting challenge to start with a desired transition probability matrix and try to design shuffling methods that yield a Markov chain as close as possible to the desired one.
Let us look again at the top-in-at-random shuffling example given above. If we take the top card and insert it at the bottom of the deck, then we know exactly which permutation we did apply. However, the goal of card-shuffling is to “randomize” a deck, and if we know which permutation was applied with each shuffle, then we know exactly the new ordering of the deck. This is desirable only if our purpose is cheating. Therefore, in order to randomize a deck of cards we must apply a sufficient number of permutations chosen at random from
, so that there is enough uncertainty on what the new ordering of the deck of cards is. In this post we will study how many shuffles suffice to attain a desired level of uncertainty.
__________
The Dovetail Shuffle
The dovetail shuffle, or riffle shuffle, is a common card-shuffling method that consists of cutting a deck of cards in two packets, and then interleaving the two packets together. A variation is the -shuffle, which consists of cutting the deck in
packets and then interleaving them all together. In this post, we will consider dovetail
-shuffles only.
A rising sequence is defined by Bayer & Diaconis in [2] as a “maximal subset of an arrangement of cards, consisting of successive face values displayed in order”. Initially, the ordered arrangement of the deck is for all
and, hence, there is exactly one rising sequence. Each dovetail shuffle tends to double the number of rising sequences, until a saturation effect takes place due to the finiteness of the size of the deck. This is an “experimental” result that anyone can verify by playing with an actual deck of cards. Do note that:
- the minimum number of rising sequences in a deck of
cards is
. A deck with a single rising sequence is one in which the ordering of the cards is
for all
.
- the maximum number of rising sequences in a deck of
cards is
. A deck with
rising sequences is one in which the ordering of the cards is
for all
.
- after
dovetail shuffles, there are at most
rising sequences.
We introduce function , where
is defined to be the number of rising sequences in
. From Bayer & Diaconis’ paper [2], we know that the probability mass function on
after
dovetail shuffles,
, is defined by
.
Since we know the probability mass function on after
shuffles, we can study shuffling from an information-theoretic viewpoint. In particular, we are interested in how the entropy of the probability mass function
evolves as the deck is shuffled.
__________
Entropy Generation
The Shannon entropy [3] of the probability mass function is
After shuffles there are at most
rising sequences in the deck. Therefore, we only need to sum
terms
.
As mentioned in [2], the cardinality of the set is the Eulerian number
. Let
denote the probability
for any
, as follows
for . The entropy of the probability mass function
is thus given by
.
Hence, we can compute by summing
terms (instead of summing
terms), which makes the computation tractable for modestly large values of
, assuming that the Eulerian numbers can be computed efficiently, of course. A non-recursive formula for the Eulerian number
is
.
Using the expression for above to compute the entropy
is potentially dangerous: the binomial coefficients may take values which are too big, and this might lead to numerical errors. Therefore, it would be wise to break up the binomial coefficients into products
and to write logarithms of products as sums of logarithms
.
However, do note that might take very small values as we might want to consider decks with, say,
cards. A safer way would be to “combine”
with the Eulerian numbers
.
Let the uniform probability mass function on ,
, be defined by
for all
. The entropy of the uniform probability mass function
is
and, since the uniform distribution maximizes entropy, we conclude that . We denote
. In [1] Aldous & Diaconis use an algebraic argument to show that
as
, i.e., the probability mass funtion converges to the uniform one as the number of shuffles tends to infinity. Therefore, we can conclude that
as
and, hence, for every
there exists a
such that
implies that
. If one is given a value of
, then one can find how many shuffles suffice to attain an entropy higher than
.
The question “how many shuffles suffice to randomize a deck of cards?” is ill-posed. One must specify a measure of “randomness” (in this post, we use the Shannon entropy to measure randomness) and a value of
to measure how much randomness is “enough” for us to say that a deck is mixed.
As mentioned before, each dovetail shuffle tends to double the number of rising sequences, until saturation occurs. Hence, after dovetail shuffles, where
is “small”, there should be
rising sequences. We can make the approximation
which yields
and, thus
for “small” . We conclude that dovetail shuffling generates entropy at a rate of approximately
bits per shuffle, until the deck becomes saturated. Using the approximation
, we have that maximum entropy would be attained after approximately
shuffles.
In figures 1 and 2 we plot (in blue) the normalized entropy as a function of the number of dovetail shuffles,
, for decks of sizes
and
, respectively. We plot also (in red) the approximation
, and we plot (in magenta) the maximum normalized entropy.
[ Figure 1: Normalized entropy as a function of the number of shuffles for a deck with cards ]
[ Figure 2: Normalized entropy as a function of the number of shuffles for a deck with cards ]
Note that, as predicted theoretically, the entropy increases in an almost linear fashion during the first few shuffles. In fact, the approximation is surprisingly good until
for
, and until
for
. For
, we have
, which is a rough estimate of the number of shuffles after which the deck becomes saturated.
__________
Approach to Uniformity
As shown by Aldous & Diaconis in [1], the stationary probability mass function of the Markov chain is the uniform one. Therefore, the entropy generated by dovetail shuffling converges to the maximum entropy. We measure how “close” to the maximum entropy we are after shuffles using the Kullback-Leibler distance [3] between the probability mass functions
and
, as follows
and, hence, we obtain
.
Since , we have that
.
We can rewrite the equation above in the form
,
or in the form
,
because among all probability mass functions on , the one that maximizes the entropy is the uniform distribution
. Since the logarithm is a strictly concave function, we can use Jensen’s inequality [3] to obtain
and, therefore, we have the following upper bound on the Kullback-Leibler distance
.
Since , the inequality above can be rewritten as follows
.
Imagine that our initial probability mass function is a unit mass at permutation , i.e.,
, where
is the Kronecker delta. If this is the case, then there is zero uncertainty and, hence, the entropy is
, while the
-norm is maximum,
. At the other extreme, the uniform probability mass function has maximum entropy
, but minimum
-norm,
. Hence, we have that
.
Since as
, then we conclude that
and, from inequality
, we have that
. For
sufficiently large, the decay of
is given by the second largest (in absolute value) eigenvalue of the transition probability matrix of the Markov chain. Hence, for large
there exist constants
and
, such that
and, thus, . Applying logarithms, we obtain
where we used the inequality . From the inequality above and the inequality on the relative entropy, we conclude that
for sufficiently large
. Hence, the Kullback-Leibler distance converges to zero exponentially.
__________
Concluding Remarks
We have concluded that during the first shuffles, where
, the dovetail shuffling process generates entropy at a rate of approximately
bits per shuffle. However, when
, then
becomes
and the sum in the formula for the entropy “saturates” in the sense that the number of terms being summed no longer increases with
. For
, the deck is saturated and entropy approaches the maximum entropy asymptotically, as
decays exponentially. This interesting phenomenon resembles a phase transition and was pointed out by Trefethen & Trefethen in [4], while in [5] Stark et al. present detailed analytical results.
__________
References
[1] David Aldous, Persi Diaconis, Shuffling cards and stopping times, The American Mathematical Monthly, Vol. 93, No. 5, pp. 333–348, Mathematical Association of America, 1986.
[2] Dave Bayer, Persi Diaconis, Trailing the Dovetail Shuffle to its Lair, The Annals of Applied Probability, Vol. 2, No. 2, pp. 294-313, May 1992.
[3] Thomas M. Cover and Joy A. Thomas, Elements of Information Theory, John Wiley & Sons 2006.
[4] L. N. Trefethen, L. M. Trefethen, How many shuffles to randomize a deck of cards?, Proceedings of the Royal Society of London, Vol. 456, No. 2002, pp. 2561-2568, 2000.
[5] Dudley Stark, A. Ganesh, Neil O’Connell, Information Loss in Card Shuffling, HP Laboratories Bristol, September 1999.
__________
Related:
Tags: Card Shuffling


January 4, 2010 at 10:40 |
Nice post. I’d heard that if you shuffle the cards 6 times that should be enough to get to randomness. When playing bridge there was always a discussion that with insufficiently shuffled cards you didn’t get as many voids and singletons as you would expect with a random distribution.
Amusingly though if you are able to do 8 “perfect” shuffles in a row the cards return to their original order.
I’ve seen this on TV and it was very impressive. (The guy could also cut the deck to a particular card, say the 17th.) It’s hard to imagine how much practice it would take to achieve though.
January 4, 2010 at 16:06 |
I have never played Bridge, but it should be possible to simulate Bridge games and see what the effect of insufficiently shuffled decks is. Thanks for the observation.
Indeed, perfect shuffles are amazing. If you want to know more on them, I suggest the following 1983 paper by Persi Diaconis:
The Mathematics of Perfect Shuffles
I did not even know it was possible for a human to perform perfect shuffles in succession! That must take years of work.
March 28, 2010 at 13:48 |
Nice post, reminds me of an exercise I gave to students:
0-On a computer generate, a set of N random numbers between 0 and 1
1-Multiply them by 2
2-Substract one to those who are greater than one
3-Count the number of unique occurences is the modified set
4-Go back to step 1
March 28, 2010 at 14:05 |
In Bayer & Diaconis’ 1992 paper, Trailing the Dovetail Shuffle to its Lair, they briefly discuss the use of the dyadic map for shuffling purposes.
April 12, 2010 at 07:57 |
[...] Decks and shuffling cards , how many permutations are needed to make sure that the deck is shuffled?. It is a very good article that explains it very carefully. December 18, 2009 Alejandro [...]