## Fixed-Odds Betting Arbitrage

Imagine that we want to bet on an event with $m \geq 2$ possible outcomes, and that there are $n \geq 2$ bookmakers taking bets on those outcomes. Note that the odds are known at the time the bets are placed, i.e., we have fixed-odds betting instead of pari-mutuel betting.

We would like to know how to allocate our (limited) money, i.e., how much to bet on each outcome at each bookmaker. As we cannot foresee the future, we would very much like to lock in risk-less profits. In other words, we would like to find arbitrage opportunities (aka: “arbs”). In this post we discuss how, given the (fixed) odds posted by $n$ bookmakers, we can detect the existence of arbitrage opportunities.

__________

Odds, Outcomes, and Profit

Let $[m] := \{1, 2, \dots, m\}$, and $[n] := \{1, 2, \dots, n\}$. Suppose that the decimal odds offered by bookmaker $j \in [n]$ for outcome $i \in [m]$ are $\omega_{ij} \geq 1$. Let the $m \times n$ odds matrix be the following

$\Omega = \left[\begin{array}{cccc}\omega_{11} & \omega_{12} & \ldots & \omega_{1n}\\ \omega_{21} & \omega_{22} & \ldots & \omega_{2n}\\\vdots & \vdots & \ddots & \vdots\\ \omega_{m1} & \omega_{m2} & \ldots & \omega_{mn}\\\end{array}\right]$.

Let $x_{ij} \geq 0$ be the amount of money that we bet on outcome $i \in [m]$ at bookmaker $j \in [n]$, and let

$X = \left[\begin{array}{cccc}x_{11} & x_{12} & \ldots & x_{1n}\\x_{21} & x_{22} & \ldots & x_{2n}\\\vdots & \vdots & \ddots & \vdots\\x_{m1} & x_{m2} & \ldots & x_{mn}\\\end{array}\right]$

be the $m \times n$ money allocation matrix. Since we have a limited amount of money available, $c > 0$, we must impose the inequality constraint

$\displaystyle\sum_{i \in [m]} \sum_{j \in [n]} x_{ij} \leq c$

to ensure that we do not spend more than what we have. Let $\mathbb{F}_2 := \{0,1\}$. We introduce binary variables $\theta_1, \dots, \theta_m \in \mathbb{F}_2$, where

$\theta_i = \begin{cases} 1, & \text{if outcome } i \text{ occurs}\\ 0, & \text{otherwise}\end{cases}$

and we define $\theta := \left(\theta_1, \theta_2, \ldots, \theta_m\right) \in \mathbb{F}_2^m$. Let $\Theta \subseteq \mathbb{F}_2^m$ be the set of all admissible $\theta$ vectors. The $m$ outcomes are mutually exclusive if

$\theta_i = 1 \Rightarrow \theta_k = 0, \quad{} \forall k \in [m] \setminus \{i\}$.

If the $m$ outcomes are mutually exclusive, then there are only $m$ admissible $\theta$ vectors. If the $m$ outcomes are not mutually exclusive, then the number of admissible $\theta$ vectors is $m \leq |\Theta|\leq 2^m$.

The odds for outcome $i \in [m]$ offered by bookmaker $j \in [n]$ are $\omega_{ij}$. Hence, if we bet $x_{ij} > 0$ on  outcome $i$ at bookmaker $j$ and outcome $i$ does occur, then we get paid $\omega_{ij} x_{ij}$ and make a profit of $(\omega_{ij} - 1) x_{ij}$. However, if outcome $i$ does not occur, we make a profit of $-x_{ij}$. Thus, our profit from betting on outcome $i$ at bookmaker $j$ is

$p_{ij} = (\theta_i \omega_{ij} - 1 ) x_{ij}$.

Betting on all outcomes and at all bookmakers, our total profit is

$P (\theta, X; \Omega) = \displaystyle\sum_{i \in [m]}\sum_{j \in [n]} p_{ij} = \displaystyle\sum_{i \in [m]}\sum_{j \in [n]} (\theta_i \omega_{ij} - 1 ) x_{ij}$

where $\theta$ is chosen by fate, $X$ is our choice, and $\Omega$ is the bookmakers’ choice. Note that $\theta$ and $X$ are variables, whereas the odds $\Omega$ are parameters. Suppose we want to bet on the outcomes of a tennis match, say, Sharapova versus Kirilenko; as soon as the tennis match is over, vector $\theta$ is known and, thus, we know our profit / loss.

Let us define

$R (\theta; \Omega) := \left[\begin{array}{cccc}(\theta_1\omega_{11} - 1) & (\theta_1\omega_{12} - 1) & \ldots & (\theta_1\omega_{1n} - 1)\\ (\theta_2\omega_{21} - 1) & (\theta_2\omega_{22} - 1) & \ldots & (\theta_2\omega_{2n} - 1)\\\vdots & \vdots & \ddots & \vdots\\ (\theta_m\omega_{m1} - 1) & (\theta_m\omega_{m2} - 1) & \ldots & (\theta_m\omega_{mn} - 1)\\\end{array}\right]$

and let $\text{diag}(\theta)$ be the $m \times m$ diagonal matrix whose $m$ diagonal entries are the entries of vector $\theta$. Moreover, let $1_m = (1,1,\dots,1)$ be the $m$-dimensional vector of ones. Then, we can write $R (\theta; \Omega)$ in a more compressed form, as follows

$R (\theta; \Omega) = \text{diag}(\theta) \Omega - 1_m 1_n^T$.

Finally, we can write the total profit $P (\theta, X; \Omega)$ more economically, as follows

$P (\theta, X; \Omega) = 1_m^T \left(R (\theta; \Omega) \circ X\right) 1_n$

where $R (\theta; \Omega) \circ X$ is the Hadamard product (entry-wise matrix multiplication) of matrices $R (\theta; \Omega)$ and $X$.

Let $\text{vec}(\cdot)$ denote the vectorization operator, which stacks a given matrix’s columns in one large column vector. Note that if we apply operator $\text{vec}(\cdot)$ to a scalar, we obtain the same scalar. Hence, we have that $\text{vec}(P (\theta, X; \Omega)) = P (\theta, X; \Omega)$. Thus,

$P (\theta, X; \Omega) = \text{vec}\left(1_m^T \left(R (\theta; \Omega) \circ X\right) 1_n\right) = 1_{mn}^T \text{vec} \left(R (\theta; \Omega) \circ X\right)$

because $1_n^T \otimes 1_m^T = 1_{mn}^T$. It can be shown that

$\text{vec} \left(R (\theta; \Omega) \circ X\right) = \text{diag}\left( \text{vec} \left( R (\theta; \Omega)\right) \right) \text{vec}(X)$

and, hence, the total profit is

$P (\theta, X; \Omega) = 1_{mn}^T \text{diag}\left( \text{vec} \left( R (\theta; \Omega)\right) \right) \text{vec}(X)$.

It’s not difficult to show that

$1_{mn}^T \text{diag}\left( \text{vec} \left( R (\theta; \Omega)\right) \right) = \text{vec}^T \left( R (\theta; \Omega) \right)$

and, thus, we obtain

$P (\theta, X; \Omega) = \text{vec}^T \left( R (\theta; \Omega) \right) \text{vec}(X)$

which is the inner product of two $m n$-dimensional vectors. The usefulness of writing the profit as an inner product will be evident later on. Note that the inequality constraint

$\displaystyle\sum_{i \in [m]} \sum_{j \in [n]} x_{ij} \leq c$

can be written in a rather more compressed form as $1_m^T X 1_n \leq c$. Or, using vectorization, as $1_{m n}^T \text{vec}(X) \leq c$.

__________

Arbitrage

Given the odds matrix $\Omega$, the set of admissible $\theta$ vectors $\Theta \subseteq \mathbb{F}_2^m$, and a budget $c > 0$, we would like to find a money allocation matrix $X$ such that $1_{m n}^T \text{vec}(X) \leq c$ and

$P (\theta, X; \Omega) \geq 0, \quad{} \forall \theta \in \Theta$.

In other words, we would like to allocate the money such that, no matter the outcomes, we do not lose money. This is a constraint satisfaction problem. To be more precise, we have to satisfy $|\Theta| + 1$ inequality constraints.

From a game-theoretic viewpoint, we are playing a sequential game with fate. We play first, by choosing $X$. Fate plays next by choosing $\theta \in \Theta$. We have an arbitrage opportunity when, no matter what $\theta$ fate does happen to choose, we make a profit. We want to “Dutch” fate.

__________

Betting on Mutually Exclusive Outcomes

If $1_m^T \theta = 1$, i.e., if only one entry in $\theta$ can take take the value $1$, then the $m$ outcomes are mutually exclusive. Why is the case where the outcomes are mutually exclusive interesting? The reason is that when the outcomes are mutually exclusive, the set of admissible $\theta$ vectors $\Theta \subseteq \mathbb{F}_2^m$ has minimal cardinality, $|\Theta| = m$, and the $m$ elements of $\Theta$ are easy to index.

If the outcomes are mutually exclusive, then

$\Theta = \left\{ \left[\begin{array}{c} 1 \\ 0 \\ \vdots \\ 0\end{array}\right], \left[\begin{array}{c} 0 \\ 1 \\ \vdots \\ 0\end{array}\right], \dots, \left[\begin{array}{c} 0 \\ 0 \\ \vdots \\ 1\end{array}\right] \right\}$

i.e., the $m$ elements of $\Theta$ are the vectors of the canonical basis for $\mathbb{R}^m$. Let us denote

$\mathrm{e}_1 := \left[\begin{array}{c} 1 \\ 0 \\ \vdots \\ 0\end{array}\right], \quad{} \mathrm{e}_2 := \left[\begin{array}{c} 0 \\ 1 \\ \vdots \\ 0\end{array}\right], \dots, \quad{} \mathrm{e}_m := \left[\begin{array}{c} 0 \\ 0 \\ \vdots \\ 1\end{array}\right]$

so that $\Theta = \{\mathrm{e}_1, \mathrm{e}_2, \dots, \mathrm{e}_m\}$. Let $\tilde{x} := \text{vec}(X)$ be a $m n$-dimensional decision vector, and $q_i := \text{vec}\left( R (\mathrm{e}_i; \Omega)\right)$. Then, the arbitrage condition

$P (\theta, X; \Omega) = \text{vec}^T \left( R (\theta; \Omega) \right) \text{vec}(X) \geq 0, \quad{} \forall \theta \in \Theta$

can be written as $m$ inequality constraints

$q_1^T \tilde{x} \geq 0, \quad{} q_2^T \tilde{x} \geq 0, \dots, \quad{} q_m^T \tilde{x} \geq 0$.

Let $Q := \left[\begin{array}{cccc} q_1 & q_2 & \ldots & q_m\end{array}\right]$ be the $m n \times m$ matrix whose $i$-th column is $q_i$, then the $m$ arbitrage inequalities can be written simply as $Q^T \tilde{x} \geq 0_m$, where $0_m$ is the $m$-dimensional vector whose entries are all zero.

Due to our limited budget, we have the inequality $1_{mn}^T \tilde{x} \leq c$. The $m$ arbitrage inequalities can also be written as $-Q^T \tilde{x} \leq 0_m$. Hence, we have $1+m$ inequalities

$\left[\begin{array}{c} 1_{mn}^T\\ -Q^T\end{array}\right] \tilde{x} \leq \left[\begin{array}{c} c \\ 0_m\end{array}\right]$.

From a geometric viewpoint, these $1+m$ inequalities correspond to the intersection of $1+m$ closed half-spaces in $\mathbb{R}^{mn}$, which defines a polytope

$\left\{ \tilde{x} \in (\mathbb{R}_0^+)^{mn} : \left[\begin{array}{c} 1_{mn}^T \\ -Q^T\end{array}\right] \tilde{x} \leq \left[\begin{array}{c} c \\ 0_m\end{array}\right] \right\}$.

If this polytope is empty, there are no arbitrage opportunities. If the polytope is non-empty, there is at least one arbitrage opportunity. Who would have thought that betting arbitrage could be studied using Discrete Geometry? ;-)

Determining whether the polytope is nonempty is as hard as solving as linear program (so says Peter Shor). Hence, we can determine if the polytope is non-empty by solving a linear program with an arbitrary objective function

$\begin{array}{ll} \displaystyle\min_{\tilde{x} \in \mathbb{R}^{mn}} & 0_{m n}^T \tilde{x} \\ \text{subject to} & \left[\begin{array}{c} 1_{mn}^T\\ -Q^T\end{array}\right] \tilde{x} \leq \left[\begin{array}{c} c \\ 0_m\end{array}\right], \quad{} \tilde{x} \geq 0_{m n}\end{array}$

If this linear program is feasible, then an arbitrage opportunity does exist. Any solution that the LP solver returns will be an “arb” :-)

We have considered mutually exclusive outcomes in this section. If the outcomes are not mutually exclusive, then $|\Theta| > m$ and it gets harder to enumerate all the admissible $\theta$ vectors. However, the exact same approach can be used, although we will have more than $m$ arbitrage inequalities.

__________

Example: Sharapova versus Kirilenko

Suppose that $n = 2$ bookmakers are posting odds for the Sharapova versus Kirilenko tennis match we mentioned before. Suppose further that the bookies are offering odds for $m = 2$ mutually exclusive outcomes: Sharapova wins, or  Kirilenko wins. They can’t both win the match, of course. One thing that we do know beforehand is that the winner will be called Maria ;-)

Imagine that the bookies posted the following decimal odds

$\Omega = \left[\begin{array}{cc} 1.25 & 1.43\\ 3.90 & 2.85\end{array}\right]$

which means that bookie #1 is offering odds $1.25$ for a Sharapova victory, whereas bookie #2 is offering odds $1.43$ for the same outcome. The following Python 2.5 / CVXOPT script solves a linear program to determine whether there are arbitrage opportunities:


from cvxopt import matrix
from cvxopt import solvers

# available budget
c = 100

# decimal odds posted by the bookies
omega_11 = 1.25
omega_12 = 1.43
omega_21 = 3.90
omega_22 = 2.85

# build q vectors
q1 = matrix([(omega_11 - 1.0), -1.0, (omega_12 - 1.0), -1.0])
q2 = matrix([-1.0, (omega_21 - 1.0), -1.0, (omega_22 - 1.0)])

# build 4x4 identity matrix
Id = matrix([[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]])

# ---------------
# linear program
# ---------------

# build objective vector
f = matrix(0.0, (4,1))

# build inequality constraint matrices
A = matrix([matrix(1.0, (1,4)), -q1.T, -q2.T, -Id])
b = matrix([c, matrix(0.0, (6,1))])

# solve linear program
solvers.options['show_progress'] = True
solution = solvers.lp(f, A, b)

# print solution and profits
x = solution['x']
print "\nMoney allocation:"
print x
print "Profit if Sharapova wins:"
print q1.T * x
print "Profit if Kirilenko wins:"
print q2.T * x
print "Total amount at stake:"
print sum(x)


The output of this script is the following:

     pcost       dcost         gap    pres   dres   k/t
0:  0.0000e+000 -1.0000e+002  1e+002 2e-001 2e+000 1e+000
1:  0.0000e+000 -1.7972e+001  3e+001 4e-002 5e-001 3e+000
2:  0.0000e+000 -5.3598e-001  6e-001 1e-003 2e-002 3e-001
3:  0.0000e+000 -5.3759e-003  6e-003 1e-005 2e-004 3e-003
4:  0.0000e+000 -5.3757e-005  6e-005 1e-007 2e-006 3e-005
5:  0.0000e+000 -5.3757e-007  6e-007 1e-009 2e-008 3e-007
6:  0.0000e+000 -5.3757e-009  6e-009 1e-011 2e-010 3e-009
Optimal solution found.

Money allocation:
[ 1.09e+001]
[ 2.07e+001]
[ 4.86e+001]
[ 2.21e-001]

Profit if Sharapova wins:
[ 2.71e+000]

Profit if Kirilenko wins:
[ 9.14e-001]

Total amount at stake:
80.4618403492


Since a solution has been found, we conclude that the linear program is feasible and, therefore, at least one arbitrage opportunity exists. Note that the profit is positive regardless of who wins the match. The money allocation matrix is approximately

$X = \left[\begin{array}{cc} 10.9 & 48.6\\ 20.7 & 0.22\end{array}\right]$

and the total amount at stake is approximately $80.46$, for a budget of $100$. Who is paying the “free lunch”? ;-)

You should never allocate any money to an inferior line. Therefore, your proposed solution is dominated by [[0,58.84],[21.58,0]] for guaranteed profit of $3.72, no matter who wins. Furthermore, to determine if arbitrage exists, add reciprocals of the best line of each mutually exclusive outcome. If this sum <1, arbitrage exists. 1/1.43 + 1/3.9 = 0.956. • Rod Carvalho Says: You should never allocate any money to an inferior line. You are quite correct. In fact, the first time I played with betting arbitrage a few years ago, I noticed that the LP solver would produce a solution in which two of the entries were zero, which is quite intuitive. So, why did the LP solver fail to provide a sparse solution now? The reason is that I chose the objective function $0_4^T \tilde{x}$, as I was concerned with constraint-satisfaction, rather than optimality. The two possible profits are $q_1^T \tilde{x}$ and $q_2^T \tilde{x}$, so it would make sense to not only ensure that they are positive, but also to try to maximize them somehow. For example, we could try to maximize the sum of two possible profits $(q_1 + q_2)^T \tilde{x}$, which is the same as minimizing $- (q_1 + q_2)^T x$. Using this objective function, the LP solver outputs the following: Money allocation: [ 0.00e+000] [ 3.00e+001] [ 7.00e+001] [ 0.00e+000] Profit if Sharapova wins: [ 1.00e-001] Profit if Kirilenko wins: [ 1.70e+001]  which is in agreement with intuition. However, when $m, n$ are large and the outcomes are not mutually exclusive, intuition breaks down. That’s when the model I presented in this post becomes truly interesting. 2. Guy Gur-Ari Says: That was beautifully written, and quite interesting. The part that shows how to write the profit P as a vector product is redundant though, since it is obvious from the original definition of P. Simply treat the pair (ij) as a vector index (a.k.a. vectorization), and then you see that P = R^T X. • Rod Carvalho Says: You are totally right, of course. If $R_{ij} (\theta; \Omega)$ is the $(i,j)$ entry of matrix $R (\theta; \Omega)$, then the profit is $P (\theta, X; \Omega) = \displaystyle\sum_{i \in [m]}\sum_{j \in [n]} (\theta_i \omega_{ij} - 1 ) x_{ij} = \displaystyle\sum_{i \in [m]}\sum_{j \in [n]}R_{ij} (\theta; \Omega) x_{ij}$ which is the Frobenius inner product of matrices $R (\theta; \Omega)$ and $X$. We can also write the profit as the inner product of the vectorized matrices $P (\theta, X; \Omega) = \text{vec}^T\left(R (\theta; \Omega)\right) \text{vec}\left(X\right)$ as you suggested. An even simpler way would be the following $P (\theta, X; \Omega) = \text{tr} \left(R^T (\theta; \Omega) X\right)$ where $\text{tr}(\cdot)$ is the trace operator. Thanks for the kudos, btw. 3. Sam Says: Does this seriously work? Is there any way you can lose using this? Is it simply just really hard to find enough variance in odds to find sure thing games? Great post regardless. • Rod Carvalho Says: Finding arbitrage opportunities is not an easy task. Any opportunity that is easy to find will almost certainly be exploited and vanish rather quickly. If you want to give it a try, you should look for arbitrage opportunities that are very hard to find. For example, in the post above I assumed that the outcomes are mutually exclusive. What if they are not mutually exclusive? The complexity of the problem then explodes, and you can exploit correlations between outcomes. This is a surprisingly rich problem in intellectual terms. Lots and lots of Mathematics can be applied, and there’s the possibility (though remote) of finding free money. If it’s an arbitrage opportunity, then (by definition) one cannot lose money, in theory. In practice, it’s not that simple: the bookmakers / betting exchanges charge commissions. A more realistic model would include the commissions, of course, but I opted to neglect them to keep things simple. Do not search for variance in odds. Search for outcomes that are not mutually exclusive. • Sam Says: Thanks for the quick reply! Are you using these methods to obtain ‘free’ money? If so what type of things are you betting on? What are not mutually exclusive betting options? • Rod Carvalho Says: I cannot answer your first question ;-) Let us consider soccer (aka: “football” outside of the U.S.). The obvious outcomes are 1) home team wins, 2) draw, 3) visiting team wins. These are mutually exclusive, of course. After all, the home team cannot win and lose simultaneously! However, if you can bet on the number of goals, then you can perhaps find arbitrage opportunities. For instance, saying that the home team wins is equivalent to saying that the home team scored more goals. Perhaps you cannot find an arbitrage opportunity exploiting such non-mutually exclusive outcomes, but you can synthesize bets that are profitable with high-probability. In other words, you sacrifice the requirement of making money no matter the outcome, and hope to be able to turn betting into a casino in which you have an edge. Then, play many times, and be thankful for the Central Limit Theorem. It all depends on what kind of outcomes the bookmakers / betting exchanges allow you to bet on. 4. Peter Says: Why would you not put all your money for Sharapova (10.9 + 48.6) on the better odds –$1.43 and the same for Kirilenko (20.7 + 0.22) on \$3.9??