Posts Tagged ‘Operations Research’

Fixed-Odds Betting Arbitrage II

December 28, 2010

My recent post on fixed-odds betting arbitrage left some rather intriguing questions unanswered. Although I used Linear Programming to find an arbitrage opportunity, the focus was on constraint satisfaction, rather than optimality. But, then, if betting can be viewed as playing a sequential game with fate, what would the word “optimal” even mean in this context?

In this post, I will focus on optimality. I will assume the reader is already acquainted with my previous post and, therefore, for convenience I will abstain from “beginning at the beginning”. In other words, this post is not self-contained.

__________

Arbitrage

Given the odds matrix \Omega, the set of admissible \theta vectors \Theta \subseteq \mathbb{F}_2^m, and a budget c > 0, we would like to find a money allocation matrix X such that 1_{m n}^T \text{vec}(X) \leq c and

P (\theta, X; \Omega) \geq 0, \quad{} \forall \theta \in \Theta

so that, regardless of what \theta vector fate happens to choose, we make a profit. Let us enumerate the \theta vectors, \Theta = \{\theta^{(1)}, \theta^{(2)}, \dots, \theta^{(|\Theta|)}\}, where m \leq |\Theta|\leq 2^m. Since P (\theta, X; \Omega) = \text{vec}^T \left( R (\theta;  \Omega) \right) \text{vec}(X), the |\Theta| arbitrage inequalities can be written as

\left[\begin{array}{c} \text{vec}^T \left( R (\theta^{(1)};  \Omega) \right) \\ \text{vec}^T \left( R (\theta^{(2)};  \Omega) \right)\\ \vdots \\ \text{vec}^T \left( R (\theta^{|\Theta|};  \Omega) \right)\end{array}\right] \text{vec}(X) \geq \left[\begin{array}{c} 0 \\ 0 \\ \vdots \\ 0\end{array}\right].

Let \tilde{x} := \text{vec}(X). Hence, the inequality 1_{m n}^T \text{vec}(X) \leq c can be rewritten as 1_{m n}^T \tilde{x} \leq c. Let us define q_k := \text{vec} \left( R (\theta^{(k)}; \Omega) \right), and let

Q := \left[\begin{array}{cccc} q_1 & q_2 & \ldots & q_{|\Theta|}\end{array}\right]

be the m n \times |\Theta| matrix whose k-th column is q_k. Thus, the |\Theta| arbitrage inequalities can be written more compactly as Q^T \tilde{x} \geq 0_{|\Theta|} or, equivalently, as -Q^T \tilde{x} \leq 0_{|\Theta|}. Hence, so far, we have 1+|\Theta| inequality constraints

\left[\begin{array}{c} 1_{mn}^T\\ -Q^T\end{array}\right] \tilde{x} \leq \left[\begin{array}{c} c \\ 0_{|\Theta|}\end{array}\right].

Since the m n entries of matrix X must be non-negative, we have m n additional inequality constraints \tilde{x} \geq 0_{m n} or, equivalently, - I_{m n} \tilde{x} \leq 0_{m n}, where I_{m n} is the m n \times m n identity matrix.

To make the model more realistic, let us suppose that the amount of money being betted on outcome i \in [m] at bookmaker j \in [n] is bounded above by b_{ij} \geq 0, i.e., 0 \leq x_{ij} \leq b_{ij}. This is useful not only because bookmakers impose upper bounds on how much money one can bet, but also because it may happen that not all bookmakers are posting odds on the very same outcomes. By making b_{ij} = 0 one can make sure that no money will be betted on a certain outcome at a certain bookmaker. If bookmaker j is not imposing any upper bound on x_{ij} then we can make b_{ij} = c, i.e., the most we can bet on outcome i at bookmaker j is our budget c > 0. Let

B = \left[\begin{array}{cccc}b_{11} & b_{12} & \ldots & b_{1n}\\b_{21} & b_{22} & \ldots & b_{2n}\\\vdots & \vdots & \ddots & \vdots\\b_{m1} & b_{m2} & \ldots & b_{mn}\\\end{array}\right]

be the matrix of upper bounds, and let \tilde{b} := \text{vec}\left(B\right). Then, the m n inequalities x_{ij} \leq b_{ij} can be written more compactly as \tilde{x} \leq \tilde{b} or, equivalently, as I_{m n} \tilde{x} \leq \tilde{b}.

Finally, we have a total of 1+|\Theta| + 2 m n inequality constraints

\left[\begin{array}{c} 1_{mn}^T\\ -Q^T\\ - I_{m n}\\ I_{m n}\end{array}\right] \tilde{x}  \leq \left[\begin{array}{c} c \\ 0_{|\Theta|} \\ 0_{m n} \\ \tilde{b}\end{array}\right].

If we can find an m n-dimensional decision vector \tilde{x} that satisfies the 1+|\Theta| + 2 m n inequality constraints above, then we conclude that an arbitrage opportunity does exist. This is a constraint satisfaction problem.

As we discussed before, the 1+|\Theta| + 2 m n inequality constraints define a polytope in \mathbb{R}^{m n}. If this polytope is non-empty, then there exists at least one arbitrage opportunity. We can determine if the polytope is non-empty by solving a linear program with an arbitrary objective function

\begin{array}{ll} \displaystyle\min_{\tilde{x} \in \mathbb{R}^{mn}} & 0_{m n}^T \tilde{x} \\ \text{subject to} & \left[\begin{array}{c} 1_{mn}^T\\ -Q^T\\ - I_{m n}\\ I_{m  n}\end{array}\right] \tilde{x}  \leq \left[\begin{array}{c} c \\  0_{|\Theta|} \\ 0_{m n} \\ \tilde{b}\end{array}\right]\end{array}

where we chose 0_{m n}^T \tilde{x}, the most nihilistic of all linear objective functions. If this linear program is feasible, then an arbitrage opportunity does exist. In other words, we solve the constraint satisfaction problem via linear programming.

__________

Minimum Guaranteed Profit

We can go beyond mere arbitrage. Suppose we want to guarantee a minimum return of \alpha \geq 0 regardless of what \theta is chosen by fate, i.e.,

P (\theta, X; \Omega) \geq \alpha c, \quad{} \forall \theta \in \Theta.

Hence, instead of the linear inequality constraints Q^T \tilde{x} \geq 0_{|\Theta|}, we now have the linear inequality constraints Q^T \tilde{x} \geq \alpha c 1_{|\Theta|} or, equivalently, -Q^T \tilde{x} \leq -\alpha c 1_{|\Theta|}. Suppose also that the bookmakers impose no upper bounds on the x_{ij}. Hence, we have b_{ij} = c, and \tilde{b} = c 1_{m n}. We then have the following 1+|\Theta| + 2 m n inequality constraints

\left[\begin{array}{c} 1_{mn}^T\\ -Q^T\\ - I_{m n}\\ I_{m  n}\end{array}\right] \tilde{x}  \leq \left[\begin{array}{c} c \\ -\alpha c 1_{|\Theta|} \\ 0_{m n} \\ c 1_{m n}\end{array}\right].

Yet once again, we have a constraint satisfaction problem that can be solved via linear programming with an arbitrary objective function

\begin{array}{ll} \displaystyle\min_{\tilde{x} \in  \mathbb{R}^{mn}} & 0_{m n}^T \tilde{x} \\ \text{subject to} & \left[\begin{array}{c} 1_{mn}^T\\ -Q^T\\ - I_{m n}\\ I_{m   n}\end{array}\right] \tilde{x}  \leq \left[\begin{array}{c} c \\ -\alpha  c 1_{|\Theta|} \\ 0_{m n} \\ c 1_{m n}\end{array}\right]\end{array}.

Starting with a small \alpha, say \alpha = 0.01, and increasing it in a successive fashion until the linear program is no longer feasible, one can obtain money allocation matrices that yield higher and higher minimum guaranteed profits.

__________

Beyond Constraint Satisfaction

Instead of trying various values of \alpha until the linear program that solves the constraint satisfaction problem is no longer feasible, we can find the maximum \alpha via optimization. Note that Q^T \tilde{x} \geq \alpha c 1_{|\Theta|} can be rewritten as

\alpha c 1_{|\Theta|} - Q^T \tilde{x} \leq 0_{|\Theta|}.

Hence, we can find the maximum \alpha that yields a feasible linear program by solving the following maximization problem in (\alpha, \tilde{x}) with the additional inequality constraint \alpha \geq 0 or, equivalently, -\alpha \leq 0

\begin{array}{ll} \displaystyle\max_{(\alpha, \tilde{x}) \in   \mathbb{R}^{1 + mn}} & \left[\begin{array}{cc} 1 & 0_{m n}^T \end{array}\right]\left[\begin{array}{c} \alpha \\ \tilde{x}\end{array}\right] \\\\ \text{subject to} &  \left[\begin{array}{cc} -1 & 0_{m n}^T\\ 0 & 1_{mn}^T\\ c 1_{|\Theta|} & -Q^T\\ 0_{m n} & - I_{m n}\\ 0_{m n} & I_{m    n}\end{array}\right] \left[\begin{array}{c} \alpha \\ \tilde{x}\end{array}\right]  \leq \left[\begin{array}{c} 0\\ c \\ 0_{|\Theta|} \\ 0_{m n} \\ c 1_{m n}\end{array}\right]\end{array}

Note that now we do have an actual optimization problem, i.e., the linear program above is a maximization problem, not merely a constraint satisfaction problem. Since maximizing \alpha is the same as minimizing -\alpha, we can rewrite the maximization problem as a minimization problem, as follows

\begin{array}{ll} \displaystyle\min_{(\alpha, \tilde{x}) \in    \mathbb{R}^{1 + mn}} & \left[\begin{array}{cc} -1 & 0_{m n}^T  \end{array}\right]\left[\begin{array}{c} \alpha \\  \tilde{x}\end{array}\right] \\\\ \text{subject to} &  \left[\begin{array}{cc} -1 & 0_{m n}^T\\ 0  & 1_{mn}^T\\ c 1_{|\Theta|} & -Q^T\\ 0_{m n} & - I_{m n}\\  0_{m n} & I_{m    n}\end{array}\right] \left[\begin{array}{c} \alpha  \\ \tilde{x}\end{array}\right]  \leq \left[\begin{array}{c} 0\\ c \\  0_{|\Theta|} \\ 0_{m n} \\ c 1_{m n}\end{array}\right]\end{array}

The solutions of the maximization / minimization problems are maximin solutions, i.e., we are maximizing the minimum profit. In other words, we are pessimists and assume that fate will always choose the most damaging \theta \in \Theta. Hence, we choose a money allocation matrix X such that the minimum profit will be maximized. We are looking for the best worst-case scenario.

__________

Example: Sharapova versus Kirilenko

Let us revisit the tennis match we considered previously. Suppose that n = 2 bookmakers are posting odds for a Sharapova versus Kirilenko match, and that the bookies are offering odds for m = 2 mutually exclusive outcomes: Sharapova wins, or Kirilenko wins. Imagine that the bookies posted the following decimal odds

\Omega = \left[\begin{array}{cc} 1.25 & 1.43\\ 3.90 & 2.85\end{array}\right]

which means that bookie #1 is offering odds 1.25 for a Sharapova victory, whereas bookie #2 is offering odds 1.43 for the same outcome. Note that |\Theta| = 2 and that the Q matrix is

Q = \left[\begin{array}{cc} 0.25 & -1\\ -1 & 2.90\\ 0.43 & -1\\ -1 & 1.85\end{array}\right].

Assuming that the bookies impose no upper bounds on the x_{ij}, then the maximin return and the maximin money allocation matrix can be found by solving the linear program

\begin{array}{ll} \displaystyle\min_{(\alpha, \tilde{x}) \in     \mathbb{R}^{5}} & \left[\begin{array}{cc} -1 & 0_{4}^T   \end{array}\right]\left[\begin{array}{c} \alpha \\   \tilde{x}\end{array}\right] \\\\ \text{subject to} &   \left[\begin{array}{cc} -1 & 0_{4}^T\\ 0  & 1_{4}^T\\ c  1_{2} & -Q^T\\ 0_{4} & - I_{4}\\  0_{4} & I_{4}\end{array}\right] \left[\begin{array}{c} \alpha  \\  \tilde{x}\end{array}\right]  \leq \left[\begin{array}{c} 0\\ c \\   0_{2} \\ 0_{4} \\ c 1_{4}\end{array}\right]\end{array}

where c > 0 is our budget. The following Python 2.5 / CVXOPT script solves a linear program to determine whether there are arbitrage opportunities and, if so, what the maximin solution is:

 from cvxopt import matrix
 from cvxopt import spmatrix
 from cvxopt import solvers

 # available budget
 c = 100

 # build (decimal) odds matrix
 Omega = matrix([[1.25, 3.90],
                 [1.43, 2.85]])

 # build q vectors and Q matrix
 q1 = matrix([0.25, -1.0, 0.43, -1.0])
 q2 = matrix([-1.0, 2.90, -1.0, 1.85])
 Q = matrix([[q1], [q2]])

 # build 4x4 identity matrix
 Id = spmatrix(1.0, range(4), range(4))

 # ---------------
 # linear program
 # ---------------

 # build objective vector
 f = matrix([-1.0, matrix(0.0, (4,1))])

 # build inequality constraint matrices
 A = matrix([[-1.0, 0.0, c*matrix(1.0, (2,1)), matrix(0.0, (8,1))],
 [matrix(0.0, (1,4)), matrix(1.0, (1,4)), -Q.T, -Id, Id]])
 b = matrix([0.0, c, matrix(0.0, (6,1)), c*matrix(1.0, (4,1))])

 # solve linear program
 solvers.options['show_progress'] = True
 solution = solvers.lp(f, A, b)

 # print solution and profits
 sol = solution['x']
 alpha = sol[0]
 x = sol[1:5]

 # build money allocation matrix
 X = matrix([[sol[1], sol[2]],[sol[3], sol[4]]])

 print "\nMaximin return (%):"
 print 100 * alpha
 print "\nMaximin money allocation matrix:"
 print X
 print "Profit if Sharapova wins:"
 print q1.T * x
 print "Profit if Kirilenko wins:"
 print q2.T * x
 print "Total amount at stake:"
 print sum(x)

The output of this script is the following:

     pcost       dcost         gap    pres   dres   k/t
 0:  1.7194e-002 -5.0054e+002  7e+002 4e-001 2e+002 1e+000
 1:  1.9534e-001 -1.7513e+001  2e+001 2e-002 7e+000 1e+000
 2:  1.3994e-001 -2.3748e+000  2e+000 2e-003 1e+000 2e-001
 3:  1.4821e-002 -8.9421e-001  9e-001 8e-004 4e-001 7e-002
 4: -1.7326e-002 -9.9421e-002  7e-002 7e-005 3e-002 9e-004
 5: -3.8136e-002 -5.1946e-002  1e-002 1e-005 6e-003 3e-004
 6: -4.6244e-002 -4.6469e-002  2e-004 2e-007 9e-005 8e-006
 7: -4.6340e-002 -4.6343e-002  2e-006 2e-009 9e-007 8e-008
 8: -4.6341e-002 -4.6341e-002  2e-008 2e-011 9e-009 8e-010
Optimal solution found.

Maximin return (%):
4.63414537166

Maximin money allocation matrix:
[ 2.31e-006  7.32e+001]
[ 2.68e+001  1.53e-006]

Profit if Sharapova wins:
[ 4.63e+000]

Profit if Kirilenko wins:
[ 4.63e+000]

Total amount at stake:
99.999998456

Since a solution has been found, we conclude that at least one arbitrage opportunity exists. Moreover, the maximin return is 4.63\% and, hence, no matter what \theta fate does happen to choose, we have a guaranteed profit of at least 4.63\%. Indeed, note that regardless of which Maria wins the tennis match, the profit will be 4.63 for a budget of c = 100. The money allocation matrix is approximately

X = \left[\begin{array}{cc} 0.00 & 73.2\\ 26.8 & 0.00\end{array}\right].

As we have rounded off the money allocation matrix produced by the CVXOPT script, the profits will deviate slightly from the maximin ones:

Profit if Sharapova wins:
[ 4.68e+000]

Profit if Kirilenko wins:
[ 4.52e+000]

Total amount at stake:
100.0

Note that the profit in case Sharapova wins is slightly increased, whereas the profit in case Kirilenko wins is slightly decreased.

How “good” is the maximin solution? Let us compare the maximin solution we have just obtained with the mere constraint-satisfying solution we found before:

Profit if Sharapova wins:
[ 2.71e+000]

Profit if Kirilenko wins:
[ 9.14e-001]

Total amount at stake:
80.4618403492

where we have a return of 3.37\% in case Sharapova wins, and a return of 1.14\% in case Kirilenko wins. It is clear that the maximin solution is superior, not only because the returns are higher, but also because the variance of the returns is considerably lower.

Fixed-Odds Betting Arbitrage

December 20, 2010

Imagine that we want to bet on an event with m \geq 2 possible outcomes, and that there are n \geq 2 bookmakers taking bets on those outcomes. Note that the odds are known at the time the bets are placed, i.e., we have fixed-odds betting instead of pari-mutuel betting.

We would like to know how to allocate our (limited) money, i.e., how much to bet on each outcome at each bookmaker. As we cannot foresee the future, we would very much like to lock in risk-less profits. In other words, we would like to find arbitrage opportunities (aka: “arbs”). In this post we discuss how, given the (fixed) odds posted by n bookmakers, we can detect the existence of arbitrage opportunities.

__________

Odds, Outcomes, and Profit

Let [m] := \{1, 2, \dots, m\}, and [n] := \{1, 2, \dots, n\}. Suppose that the decimal odds offered by bookmaker j \in [n] for outcome i \in [m] are \omega_{ij} \geq 1. Let the m \times n odds matrix be the following

\Omega  = \left[\begin{array}{cccc}\omega_{11} & \omega_{12} & \ldots & \omega_{1n}\\ \omega_{21} & \omega_{22} & \ldots & \omega_{2n}\\\vdots &  \vdots & \ddots & \vdots\\ \omega_{m1} & \omega_{m2} & \ldots & \omega_{mn}\\\end{array}\right].

Let x_{ij} \geq 0 be the amount of money that we bet on outcome i \in [m] at bookmaker j \in [n], and let

X  = \left[\begin{array}{cccc}x_{11} & x_{12} & \ldots &  x_{1n}\\x_{21} & x_{22} & \ldots & x_{2n}\\\vdots &  \vdots & \ddots & \vdots\\x_{m1} & x_{m2} & \ldots &  x_{mn}\\\end{array}\right]

be the m \times n money allocation matrix. Since we have a limited amount of money available, c > 0, we must impose the inequality constraint

\displaystyle\sum_{i \in [m]} \sum_{j \in [n]} x_{ij} \leq c

to ensure that we do not spend more than what we have. Let \mathbb{F}_2 := \{0,1\}. We introduce binary variables \theta_1, \dots, \theta_m \in \mathbb{F}_2, where

\theta_i = \begin{cases} 1, & \text{if outcome } i \text{ occurs}\\ 0, & \text{otherwise}\end{cases}

and we define \theta := \left(\theta_1, \theta_2, \ldots,  \theta_m\right) \in \mathbb{F}_2^m. Let \Theta \subseteq \mathbb{F}_2^m be the set of all admissible \theta vectors. The m outcomes are mutually exclusive if

\theta_i = 1 \Rightarrow \theta_k = 0, \quad{} \forall k \in [m] \setminus \{i\}.

If the m outcomes are mutually exclusive, then there are only m admissible \theta vectors. If the m outcomes are not mutually exclusive, then the number of admissible \theta vectors is m \leq |\Theta|\leq 2^m.

The odds for outcome i \in [m] offered by bookmaker j \in [n] are \omega_{ij}. Hence, if we bet x_{ij} > 0 on  outcome i at bookmaker j and outcome i does occur, then we get paid \omega_{ij} x_{ij} and make a profit of (\omega_{ij} - 1) x_{ij}. However, if outcome i does not occur, we make a profit of -x_{ij}. Thus, our profit from betting on outcome i at bookmaker j is

p_{ij} = (\theta_i \omega_{ij} - 1 ) x_{ij}.

Betting on all outcomes and at all bookmakers, our total profit is

P (\theta, X; \Omega) = \displaystyle\sum_{i \in [m]}\sum_{j \in [n]} p_{ij} = \displaystyle\sum_{i \in [m]}\sum_{j \in [n]} (\theta_i \omega_{ij} - 1 ) x_{ij}

where \theta is chosen by fate, X is our choice, and \Omega is the bookmakers’ choice. Note that \theta and X are variables, whereas the odds \Omega are parameters. Suppose we want to bet on the outcomes of a tennis match, say, Sharapova versus Kirilenko; as soon as the tennis match is over, vector \theta is known and, thus, we know our profit / loss.

Let us define

R (\theta; \Omega) := \left[\begin{array}{cccc}(\theta_1\omega_{11} - 1) & (\theta_1\omega_{12} - 1) & \ldots  & (\theta_1\omega_{1n} - 1)\\ (\theta_2\omega_{21} - 1) & (\theta_2\omega_{22} - 1) & \ldots & (\theta_2\omega_{2n} - 1)\\\vdots &  \vdots & \ddots & \vdots\\ (\theta_m\omega_{m1} - 1) & (\theta_m\omega_{m2} - 1) & \ldots & (\theta_m\omega_{mn} - 1)\\\end{array}\right]

and let \text{diag}(\theta) be the m \times m diagonal matrix whose m diagonal entries are the entries of vector \theta. Moreover, let 1_m = (1,1,\dots,1) be the m-dimensional vector of ones. Then, we can write R (\theta; \Omega) in a more compressed form, as follows

R (\theta; \Omega) = \text{diag}(\theta) \Omega - 1_m 1_n^T.

Finally, we can write the total profit P (\theta, X; \Omega) more economically, as follows

P (\theta, X; \Omega) = 1_m^T \left(R (\theta; \Omega) \circ X\right) 1_n

where R (\theta; \Omega) \circ X is the Hadamard product (entry-wise matrix multiplication) of matrices R (\theta; \Omega) and X.

Let \text{vec}(\cdot) denote the vectorization operator, which stacks a given matrix’s columns in one large column vector. Note that if we apply operator \text{vec}(\cdot) to a scalar, we obtain the same scalar. Hence, we have that \text{vec}(P (\theta, X; \Omega)) = P (\theta, X; \Omega). Thus,

P (\theta, X; \Omega) = \text{vec}\left(1_m^T \left(R (\theta; \Omega) \circ X\right) 1_n\right) = 1_{mn}^T \text{vec} \left(R (\theta; \Omega) \circ X\right)

because 1_n^T \otimes 1_m^T = 1_{mn}^T. It can be shown that

\text{vec} \left(R (\theta; \Omega) \circ X\right) = \text{diag}\left( \text{vec} \left( R (\theta; \Omega)\right) \right) \text{vec}(X)

and, hence, the total profit is

P (\theta, X; \Omega) = 1_{mn}^T \text{diag}\left( \text{vec} \left( R (\theta; \Omega)\right) \right) \text{vec}(X).

It’s not difficult to show that

1_{mn}^T \text{diag}\left( \text{vec} \left( R (\theta; \Omega)\right) \right) = \text{vec}^T \left( R (\theta; \Omega) \right)

and, thus, we obtain

P (\theta, X; \Omega) = \text{vec}^T \left( R (\theta; \Omega) \right) \text{vec}(X)

which is the inner product of two m n-dimensional vectors. The usefulness of writing the profit as an inner product will be evident later on. Note that the inequality constraint

\displaystyle\sum_{i \in [m]} \sum_{j \in [n]} x_{ij} \leq c

can be written in a rather more compressed form as 1_m^T X 1_n \leq c. Or, using vectorization, as 1_{m n}^T \text{vec}(X) \leq c.

__________

Arbitrage

Given the odds matrix \Omega, the set of admissible \theta vectors \Theta \subseteq \mathbb{F}_2^m, and a budget c > 0, we would like to find a money allocation matrix X such that 1_{m n}^T \text{vec}(X) \leq c and

P (\theta, X; \Omega) \geq 0, \quad{} \forall \theta \in \Theta.

In other words, we would like to allocate the money such that, no matter the outcomes, we do not lose money. This is a constraint satisfaction problem. To be more precise, we have to satisfy |\Theta| + 1 inequality constraints.

From a game-theoretic viewpoint, we are playing a sequential game with fate. We play first, by choosing X. Fate plays next by choosing \theta \in \Theta. We have an arbitrage opportunity when, no matter what \theta fate does happen to choose, we make a profit. We want to “Dutch” fate.

__________

Betting on Mutually Exclusive Outcomes

If 1_m^T \theta = 1, i.e., if only one entry in \theta can take take the value 1, then the m outcomes are mutually exclusive. Why is the case where the outcomes are mutually exclusive interesting? The reason is that when the outcomes are mutually exclusive, the set of admissible \theta vectors \Theta \subseteq \mathbb{F}_2^m has minimal cardinality, |\Theta| = m, and the m elements of \Theta are easy to index.

If the outcomes are mutually exclusive, then

\Theta = \left\{ \left[\begin{array}{c} 1 \\ 0 \\ \vdots \\ 0\end{array}\right], \left[\begin{array}{c} 0 \\ 1 \\ \vdots \\ 0\end{array}\right], \dots, \left[\begin{array}{c} 0 \\ 0 \\ \vdots \\ 1\end{array}\right] \right\}

i.e., the m elements of \Theta are the vectors of the canonical basis for \mathbb{R}^m. Let us denote

\mathrm{e}_1 := \left[\begin{array}{c} 1 \\ 0 \\ \vdots \\  0\end{array}\right], \quad{} \mathrm{e}_2 := \left[\begin{array}{c} 0 \\ 1 \\ \vdots \\  0\end{array}\right], \dots, \quad{} \mathrm{e}_m := \left[\begin{array}{c} 0 \\ 0 \\ \vdots \\  1\end{array}\right]

so that \Theta = \{\mathrm{e}_1, \mathrm{e}_2, \dots, \mathrm{e}_m\}. Let \tilde{x} := \text{vec}(X) be a m n-dimensional decision vector, and q_i := \text{vec}\left( R (\mathrm{e}_i; \Omega)\right). Then, the arbitrage condition

P (\theta, X; \Omega) = \text{vec}^T \left( R (\theta; \Omega) \right) \text{vec}(X) \geq 0, \quad{} \forall \theta \in \Theta

can be written as m inequality constraints

q_1^T \tilde{x} \geq 0, \quad{} q_2^T \tilde{x} \geq 0, \dots, \quad{} q_m^T \tilde{x} \geq 0.

Let Q := \left[\begin{array}{cccc} q_1 & q_2 & \ldots & q_m\end{array}\right] be the m n \times m matrix whose i-th column is q_i, then the m arbitrage inequalities can be written simply as Q^T \tilde{x} \geq 0_m, where 0_m is the m-dimensional vector whose entries are all zero.

Due to our limited budget, we have the inequality 1_{mn}^T \tilde{x} \leq c. The m arbitrage inequalities can also be written as -Q^T \tilde{x} \leq 0_m. Hence, we have 1+m inequalities

\left[\begin{array}{c} 1_{mn}^T\\ -Q^T\end{array}\right] \tilde{x} \leq \left[\begin{array}{c} c \\ 0_m\end{array}\right].

From a geometric viewpoint, these 1+m inequalities correspond to the intersection of 1+m closed half-spaces in \mathbb{R}^{mn}, which defines a polytope

\left\{ \tilde{x} \in (\mathbb{R}_0^+)^{mn} : \left[\begin{array}{c} 1_{mn}^T \\ -Q^T\end{array}\right] \tilde{x} \leq \left[\begin{array}{c} c \\ 0_m\end{array}\right] \right\}.

If this polytope is empty, there are no arbitrage opportunities. If the polytope is non-empty, there is at least one arbitrage opportunity. Who would have thought that betting arbitrage could be studied using Discrete Geometry? ;-)

Determining whether the polytope is nonempty is as hard as solving as linear program (so says Peter Shor). Hence, we can determine if the polytope is non-empty by solving a linear program with an arbitrary objective function

\begin{array}{ll} \displaystyle\min_{\tilde{x} \in \mathbb{R}^{mn}} & 0_{m n}^T \tilde{x} \\ \text{subject to} & \left[\begin{array}{c} 1_{mn}^T\\ -Q^T\end{array}\right] \tilde{x} \leq \left[\begin{array}{c} c \\ 0_m\end{array}\right], \quad{} \tilde{x} \geq 0_{m n}\end{array}

If this linear program is feasible, then an arbitrage opportunity does exist. Any solution that the LP solver returns will be an “arb” :-)

We have considered mutually exclusive outcomes in this section. If the outcomes are not mutually exclusive, then |\Theta| > m and it gets harder to enumerate all the admissible \theta vectors. However, the exact same approach can be used, although we will have more than m arbitrage inequalities.

__________

Example: Sharapova versus Kirilenko

Suppose that n = 2 bookmakers are posting odds for the Sharapova versus Kirilenko tennis match we mentioned before. Suppose further that the bookies are offering odds for m = 2 mutually exclusive outcomes: Sharapova wins, or  Kirilenko wins. They can’t both win the match, of course. One thing that we do know beforehand is that the winner will be called Maria ;-)

Imagine that the bookies posted the following decimal odds

\Omega = \left[\begin{array}{cc} 1.25 & 1.43\\ 3.90 & 2.85\end{array}\right]

which means that bookie #1 is offering odds 1.25 for a Sharapova victory, whereas bookie #2 is offering odds 1.43 for the same outcome. The following Python 2.5 / CVXOPT script solves a linear program to determine whether there are arbitrage opportunities:


 from cvxopt import matrix
 from cvxopt import solvers

 # available budget
 c = 100

 # decimal odds posted by the bookies
 omega_11 = 1.25
 omega_12 = 1.43
 omega_21 = 3.90
 omega_22 = 2.85

 # build q vectors
 q1 = matrix([(omega_11 - 1.0), -1.0, (omega_12 - 1.0), -1.0])
 q2 = matrix([-1.0, (omega_21 - 1.0), -1.0, (omega_22 - 1.0)])

 # build 4x4 identity matrix
 Id = matrix([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])

 # ---------------
 # linear program
 # ---------------

 # build objective vector
 f = matrix(0.0, (4,1))

 # build inequality constraint matrices
 A = matrix([matrix(1.0, (1,4)), -q1.T, -q2.T, -Id])
 b = matrix([c, matrix(0.0, (6,1))])

 # solve linear program
 solvers.options['show_progress'] = True
 solution = solvers.lp(f, A, b)

 # print solution and profits
 x = solution['x']
 print "\nMoney allocation:"
 print x
 print "Profit if Sharapova wins:"
 print q1.T * x
 print "Profit if Kirilenko wins:"
 print q2.T * x
 print "Total amount at stake:"
 print sum(x)

The output of this script is the following:

     pcost       dcost         gap    pres   dres   k/t
 0:  0.0000e+000 -1.0000e+002  1e+002 2e-001 2e+000 1e+000
 1:  0.0000e+000 -1.7972e+001  3e+001 4e-002 5e-001 3e+000
 2:  0.0000e+000 -5.3598e-001  6e-001 1e-003 2e-002 3e-001
 3:  0.0000e+000 -5.3759e-003  6e-003 1e-005 2e-004 3e-003
 4:  0.0000e+000 -5.3757e-005  6e-005 1e-007 2e-006 3e-005
 5:  0.0000e+000 -5.3757e-007  6e-007 1e-009 2e-008 3e-007
 6:  0.0000e+000 -5.3757e-009  6e-009 1e-011 2e-010 3e-009
Optimal solution found.

Money allocation:
[ 1.09e+001]
[ 2.07e+001]
[ 4.86e+001]
[ 2.21e-001]

Profit if Sharapova wins:
[ 2.71e+000]

Profit if Kirilenko wins:
[ 9.14e-001]

Total amount at stake:
80.4618403492

Since a solution has been found, we conclude that the linear program is feasible and, therefore, at least one arbitrage opportunity exists. Note that the profit is positive regardless of who wins the match. The money allocation matrix is approximately

X = \left[\begin{array}{cc} 10.9 & 48.6\\ 20.7 & 0.22\end{array}\right]

and the total amount at stake is approximately 80.46, for a budget of 100. Who is paying the “free lunch”? ;-)

Optimal Account Balancing II

July 28, 2009

Consider a network of n = 10 agents, each owing money to and being owed money by other agents. The debts are represented pictorially by the weighted directed graph

IOU graph - 10 nodes and 20 edgeswhich I called an IOU graph on my previous post on this topic. The IOU graph representing this network is an ordered triple \mathcal{G} = (\mathcal{N}, \mathcal{E}, w), where \mathcal{N} = \{1, 2, \ldots, 10\} is the node set (each node of the IOU graph represents a different agent), \mathcal{E} \subset \mathcal{N} \times \mathcal{N} is the edge set (containing |\mathcal{E}| = 20 elements), and w: \mathcal{E} \rightarrow \mathbb{R}^{+} is a weight function that assigns positive weights to each edge in \mathcal{E}. Each directed edge of graph \mathcal{G} is an ordered pair (i,j), where i is the edge’s tail and j is the edge’s head. If (i,j) \in \mathcal{E} then node i owes money to node j. We do not allow nodes to owe money to themselves and, therefore, for each i \in \mathcal{N} we must have (i,i) \notin \mathcal{E}. The weight of edge (i, j) \in \mathcal{E} is given by w\left( (i, j) \right) > 0, and it tells us the amount of money that node i owes to node j.

Alternatively, the information contained in the IOU graph can be encoded in the following nonnegative matrix

\left[\begin{array}{cccccccccc} 0 & 10 & 0 & 20 & 20 & 5 & 15 & 0 & 0 & 0\\ 0 & 0 & 5 & 10 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 15 & 0 & 5 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 10 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 5 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 10 & 0 & 0 & 0 & 0\\ 5 & 20 & 0 & 0 & 0 & 0 & 5 & 0 & 0 & 0\\ 0 & 15 & 0 & 0 & 0 & 0 & 0 & 20 & 0 & 10\\ 0 & 5 & 15 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ \end{array}\right]

where entry (i,j) is the amount of money node i owes to node j. Since we do not allow nodes to owe money to themselves, the entries on the main diagonal of this matrix are zero. We will call this matrix the IOU matrix. Note that the IOU matrix is the adjacency matrix of the IOU graph. Every nonnegative matrix with zero entries on the main diagonal is an admissible IOU matrix, and we can associate with it a corresponding IOU graph.

If the IOU matrix contains all the information about the IOU graph, then we can work with the IOU matrix instead of the IOU graph. Is there any advantage in doing so? I believe there is. In my opinion, the IOU graph is good for conceptual thinking and intuition, while the IOU matrix is a data structure that is convenient for computation. However, the matrix is not the only data structure that one can use to encode the information in an IOU graph. Another possibility would be to use adjacency lists, for example.

__________

Divergence and IOU Matrices

Suppose we are given an IOU graph \mathcal{G}_0 = (\mathcal{N}, \mathcal{E}_0, w_0), whose node set is \mathcal{N} = \{1, 2, \ldots, n\}. Note that (i,i) \notin \mathcal{E}_0 for each i \in \mathcal{N}, because nodes cannot owe money to themselves. We now introduce variables y_{ij} \geq 0 as follows

y_{ij} = \displaystyle\left\{\begin{array}{cl} w_0 ((i,j)) & \text{if} \quad{} (i,j) \in \mathcal{E}_0\\ 0 & \text{if} \quad{} (i,j) \notin \mathcal{E}_0\\ \end{array}\right.

and build the (nonnegative) adjacency matrix of the IOU graph \mathcal{G}_0

Y = \left[\begin{array}{ccccc} 0 & y_{12} & y_{13} & \ldots & y_{1n}\\ y_{21} & 0 & y_{23} & \ldots & y_{2n}\\ y_{31} & y_{32} & 0 & \ldots & y_{3n}\\\vdots & \vdots & \vdots & \ddots & \vdots\\ y_{n1} & y_{n2} & y_{n3} & \ldots & 0\\\end{array}\right]

which is the IOU matrix associated with the IOU graph \mathcal{G}_0. The (i,j) entry of matrix Y \in \mathbb{R}^{n \times n} denotes the amount of money that node i owes to node j. The divergence of node i \in \mathcal{N} is a scalar defined as

\mbox{div}_i (\mathcal{G}_0) = \displaystyle \sum_{j \neq i} y_{ij} - \displaystyle \sum_{j \neq i} y_{ji}

and it gives us the amount of money node i owes to other nodes minus the amount of money node i is owed to by other nodes. In terms of the IOU matrix Y, the divergence of node i \in \mathcal{N} is given by

\mbox{div}_i (\mathcal{G}_0) = e_i^T Y 1_n - e_i^T Y^T 1_n = e_i^T Y 1_n - 1_n^T Y e_i

where e_i \in \mathbb{R}^{n} is the i-th vector of the canonical basis for \mathbb{R}^n, i.e., the i-th column of the n \times n identity matrix I_n, and 1_n is an n-dimensional vector of ones. Let \mbox{vec} denote the vectorization operator, which stacks a given matrix’s columns in one large column vector. Note that if we apply operator \mbox{vec} to a scalar, we obtain the same scalar. Hence, we have that

e_i^T Y 1_n = \mbox{vec} (e_i^T Y 1_n) = (1_n^T \otimes e_i^T) \mbox{vec} (Y)

and

1_n^T Y e_i = \mbox{vec} (1_n^T Y e_i) = (e_i^T \otimes 1_n^T) \mbox{vec} (Y)

where \otimes denotes the Kronecker product. Therefore, the divergence of node i \in \mathcal{N} can be written as

\mbox{div}_i (\mathcal{G}_0) = \left[(1_n^T \otimes e_i^T) - (e_i^T \otimes 1_n^T)\right] \mbox{vec} (Y).

We define also the divergence vector of graph \mathcal{G}_0 as the n-dimensional vector whose i-th component is \mbox{div}_i (\mathcal{G}_0)

\mbox{div} (\mathcal{G}_0) = \left[\begin{array}{c} (1_n^T \otimes e_1^T) - (e_1^T \otimes 1_n^T)\\ (1_n^T \otimes e_2^T) - (e_2^T \otimes 1_n^T)\\ \vdots\\ (1_n^T \otimes e_n^T) - (e_n^T \otimes 1_n^T)\\\end{array}\right] \mbox{vec} (Y).

Alternatively, we can use equation \mbox{div}_i (\mathcal{G}_0) = e_i^T Y 1_n - e_i^T Y^T 1_n, and write the divergence vector as

\mbox{div} (\mathcal{G}_0) = I_n Y 1_n - I_n Y^T 1_n = (Y - Y^T) 1_n

and, since

I_n Y 1_n = \mbox{vec} (I_n Y 1_n) = (1_n^T \otimes I_n) \mbox{vec} (Y)

and

I_n Y^T 1_n = \mbox{vec}(I_n Y^T 1_n) = (1_n^T \otimes I_n) \mbox{vec} (Y^T),

we obtain

\mbox{div} (\mathcal{G}_0) = (1_n^T \otimes I_n) \mbox{vec} (Y) - (1_n^T \otimes I_n) \mbox{vec} (Y^T).

Note that n^2-dimensional vectors \mbox{vec} (Y) and \mbox{vec} (Y^T) contain the same entries, but in a different order. Hence, there exists a n^2 \times n^2 permutation matrix P such that

\mbox{vec} (Y^T) = P \mbox{vec} (Y)

which allows us to write

\mbox{div} (\mathcal{G}_0) = (1_n^T \otimes I_n) (I_{n^2} - P) \mbox{vec} (Y).

Do note that the permutation matrix depends on n only. It does not depend on the entries of the matrix to which operator \mbox{vec} is applied. Henceforth, we will denote \tilde{y} = \mbox{vec}(Y). Finally, we have a rather compact equation for the divergence vector

\mbox{div} (\mathcal{G}_0) = (1_n^T \otimes I_n) (I_{n^2} - P) \tilde{y}.

The divergence vector of an IOU graph \mathcal{G}_0 tells us how much money each node in the graph will lose after debts have been paid. We say that two IOU graphs (on the same node set) are equi-divergent if their divergence vectors are the same. This means that, although the debt relationships are different, both IOU graphs result in every node getting paid what he is owed and paying what he owes.

__________

Minimum Money Flow

Suppose we are given an IOU graph \mathcal{G}_0 = (\mathcal{N}, \mathcal{E}_0, w_0). We would like to find a new IOU graph \mathcal{G} = (\mathcal{N}, \mathcal{E}, w) on the same node set \mathcal{N}, such that the two IOU graphs are equi-divergent, i.e., \mbox{div} (\mathcal{G}) = \mbox{div} (\mathcal{G}_0). Moreover, we would like the new IOU graph to be “optimal” in some sense. On this post, we will consider the following optimality criterion:

minimum money flow: we want to minimize the total amount of money being exchanged between the nodes. This is the same as minimizing the sum of the weights of the IOU graph \mathcal{G}.

Let us start with matrix X \in \mathbb{R}^{n \times n}, which will be an admissible IOU matrix if we impose two constraints:

  • nonnegativity: the entries of matrix X must be nonnegative. This is equivalent to imposing the constraint that vector \tilde{x} = \mbox{vec}(X) has nonnegative entries, i.e., \tilde{x} \geq 0_{n^2}, which is equivalent to (-I_{n^2}) \tilde{x} \leq 0_{n^2}.
  • zero entries on the main diagonal: we must have x_{ii} = 0 for all i \in \mathcal{N}. Note that x_{ii} = e_i^T X e_i = \mbox{vec}(e_i^T X e_i), and therefore x_{ii} = 0 is equivalent to (e_i^T \otimes e_i^T) \tilde{x} = 0. Hence, imposing the constraint that the entries on the main diagonal be zero is equivalent to imposing the n equality constraints

\left[\begin{array}{c} e_1^T \otimes e_1^T\\ e_2^T \otimes e_2^T\\ \vdots\\ e_n^T \otimes e_n^T\\\end{array}\right] \tilde{x} = 0_n.

Once these two constraints have been imposed, matrix X is the adjacency matrix of an IOU graph. Finally, we must impose the constraint that the divergence vector of the new IOU graph is the same as the divergence vector of the given IOU graph, i.e., \mbox{div} (\mathcal{G}) = \mbox{div} (\mathcal{G}_0), which yields

(1_n^T \otimes I_n) (I_{n^2} - P) \tilde{x} = \mbox{div} (\mathcal{G}_0)

where \mbox{div} (\mathcal{G}_0) = (Y - Y^T) 1_n can be computed from the given IOU graph. Considering the minimum money flow criterion, the cost function to be minimized is the following

\displaystyle\sum_{i \in \mathcal{N}} \sum_{j \neq i} x_{ij} = 1_n^T X 1_n = (1_n^T \otimes 1_n^T) \tilde{x} = 1_{n^2}^T \tilde{x}

and, hence, we obtain the following linear program in \tilde{x} \in \mathbb{R}^{n^2}

\begin{array}{ll} \text{minimize} & \displaystyle 1_{n^2}^T \tilde{x}\\\\ \text{subject to} & (1_n^T \otimes I_n) (I_{n^2} - P) \tilde{x} = \mbox{div} (\mathcal{G}_0)\\\\ & \left[\begin{array}{c} e_1^T \otimes e_1^T\\ e_2^T \otimes e_2^T\\ \vdots\\ e_n^T \otimes e_n^T\\\end{array}\right] \tilde{x} = 0_n\\\\ & (-I_{n^2}) \tilde{x} \leq 0_{n^2}\\\\\end{array}

which has 2 n equality and n^2 inequality constraints. One can easily solve this linear program with MATLAB using function linprog.

However, this formulation looks a bit silly. Note that initially we have n^2 degrees of freedom (the n^2 entries of matrix X), then we impose n equality constraints x_{ii} = 0 to ensure that the entries on the main diagonal are zero, which leaves us with m = n^2 - n = n (n-1) degrees of freedom. If we already know that the entries on the main diagonal are zero, why consider them as decision variables? Wouldn’t it make much more sense to consider only the m = n (n-1) entries off the main diagonal as decision variables?

Let R be a m \times n^2 binary matrix such that \hat{x} = R \tilde{x} is the reduced vector of m decision variables, containing solely the entries of X off the main diagonal. We now eliminate the n columns of n \times n^2 matrix (1_n^T \otimes I_n) (I_{n^2} - P) which were to be multiplied by the x_{ii} variables. We do so by multiplying matrix (1_n^T \otimes I_n) (I_{n^2} - P) on the right by R^T, which produces a n \times m matrix (1_n^T \otimes I_n) (I_{n^2} - P) R^T.

The new vector of decision variables is \hat{x} \in \mathbb{R}^m. Since we no longer consider the entries on the main diagonal as decision variables, we don’t need to impose the n equality constraints x_{ii} = 0. Hence, we obtain a lower-dimensional linear program

\begin{array}{ll} \text{minimize} & \displaystyle 1_{m}^T \hat{x}\\\\ \text{subject to} & (1_n^T \otimes I_n) (I_{n^2} - P) R^T \hat{x} = \mbox{div} (\mathcal{G}_0)\\\\ & (-I_m) \hat{x} \leq 0_{m}\\\\\end{array}

where we now have n equality constraints (divergences at the nodes), and m inequality constrains (x_{ij} \geq 0). My intuition tells me that

(1_n^T \otimes I_n) (I_{n^2} - P) R^T = \left[\begin{array}{c} (1_n^T \otimes e_1^T) - (e_1^T \otimes 1_n^T)\\ (1_n^T \otimes e_2^T) - (e_2^T \otimes 1_n^T)\\ \vdots\\ (1_n^T \otimes e_n^T) - (e_n^T \otimes 1_n^T)\\\end{array}\right] R^T

is an incidence matrix of the directed complete graph on node set \mathcal{N}. What do you think? Do you agree?

__________

A Numerical Example

Consider a network of n = 10 agents whose debts are represented by the following IOU graph (with 15 edges)

IOU graph - 10 nodes and 15 edgesThe IOU matrix associated with this given IOU graph is

Y = \left[\begin{array}{cccccccccc} 0 & 0 & 0 & 20 & 20 & 5 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 10 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 5 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 10 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 5 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 10 & 0 & 0 & 0 & 0\\ 5 & 20 & 0 & 0 & 0 & 0 & 5 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 20 & 0 & 10\\ 0 & 5 & 15 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ \end{array}\right]

Solving the linear program, we obtain a new IOU matrix

X = \left[\begin{array}{cccccccccc} 0 & 6.097 & 3.924 & 10.776 & 10.776 & 8.427 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0.850 & 0.644 & 1.234 & 1.234 & 1.039 & 0 & 0 & 0 & 0\\ 0 & 1.690 & 1.185 & 2.505 & 2.505 & 2.114 & 0 & 0 & 0 & 0\\ 0 & 4.672 & 3.063 & 7.980 & 7.980 & 6.306 & 0 & 0 & 0 & 0\\ 0 & 1.690 & 1.185 & 2.505 & 2.505 & 2.114 & 0 & 0 & 0 & 0\\\end{array}\right]

which is the adjacency matrix of a new IOU graph. The given IOU graph and the new IOU graph are equi-divergent. The given IOU graph had 15 edges and a total of 165 flowing in it, while the new IOU graph has 25 edges and a total of 95 flowing in it. While the flow of money has been reduced, the number of edges (i.e., the number of transactions) has increased dramatically!!

On my previous post on this topic, I gave two very simple examples, and in both of them the minimum money flow solution also yielded the least number of transactions. I wondered back then if that was always the case. This example clearly shows that it is not. Generally speaking, not only does the minimum money flow criterion fail to minimize the number of transactions, but it can actually result in more transactions than the ones in the initial IOU graph. I cannot think of any reason why minimizing the total amount of money flowing in the IOU graph at the cost of a larger number of transactions would be a good idea…

Therefore, the next step is to abandon the minimum money flow criterion, and focus solely on the least number of transactions. Minimizing the number of transactions is a hard combinatorial problem, but maybe there are approximation algorithms that make the problem tractable.

Optimal Account Balancing

June 20, 2009

Suppose that Alice owes Bob $5 and is owed $10 by Bob. These debts can be represented as the weighted directed graph

IOU graph - Alice and Bobwhich we will call an IOU graph. Alice pays Bob $5 and Bob pays Alice $10. The debts are paid and $15 have been exchanged in a total of two transactions. Let us now consider the following cash flow diagram

FBD - Alice and Boband, accounting for the net cash flows, we obtain

FBD - Alice and Bob 2which illustrates that if Bob paid Alice $5, then the debts would still be paid, though only $5 would be exchanged and only one transaction would be made. This looks much cleverer and more efficient than exchanging $15 in a total of two transactions. Hence, the following IOU graph

IOU graph - Alice and Bob 2is “equivalent” to the original IOU graph. In both IOU graphs, if debts are paid then Alice will be $5 up, while Bob will be $5 down.

We add a third agent, Charlie. Consider now the IOU graph

IOU graph - Alice Bob and Charliewhere:

  • Alice must pay a total of $20 and be paid a total of $30.
  • Bob must pay a total of $15 and be paid a total of $20.
  • Charlie must pay a total of $35 and be paid a total of $20.

Hence, after the debts are paid, Alice will have $30 – $20 = $10 more in her account, Bob will have $20 – $15 = $5 more in his account, and Charlie will have $20 – $35 = -$15 more (= $15 less) in his account. Note that $10 + $5 – $15 = $0, i.e., money is conserved. In total, $70 have been exchanged in six transactions. We can do better than that. For example, we could pay all debts in just three transactions, as illustrated in the following IOU graph

iou-graph-alice-bob-and-charlie-2bwhere a total of $20 are exchanged. Note that Alice is $10 up, Bob is $5 up, and Charlie is $15 down, just like before. We can do even better than this. If Charlie pays Bob’s debt to Alice, then all debts can be paid in just two transactions as shown in the IOU graph below

IOU graph - Alice Bob and Charlie 3where only $15 are exchanged. This is an “optimal” debt-paying scheme because the debts cannot be paid in less than two transactions, nor exchanging less than $15. Do you agree?

__________

Debts and IOU Graphs

An IOU graph is a weighted directed graph given by an ordered triple \mathcal{G} = (\mathcal{N}, \mathcal{E}, w), where:

  • \mathcal{N} is the node set. Each node of the IOU graph represents a different agent.
  • \mathcal{E} \subset \mathcal{N} \times \mathcal{N} is the edge set. Each directed edge is an ordered pair (i,j), where i is the edge’s tail and j is the edge’s head. If (i,j) \in \mathcal{E} then node i owes money to node j. We do not allow nodes to owe money to themselves and, therefore, for each i \in \mathcal{N} we must have (i,i) \notin \mathcal{E}.
  • w: \mathcal{E} \rightarrow \mathbb{R}^{+} is a weight function that assigns positive weights to each edge in \mathcal{E}. The weight of edge (i, j) \in \mathcal{E} is given by w\left( (i, j) \right) > 0, and it tells us the amount of money that node i owes to node j.

We define the in-neighborhood and out-neighborhood of node i as follows

\begin{array}{ll} \mathcal{N}_i^{in} &= \displaystyle\{ j \in \mathcal{N} \mid (j,i) \in \mathcal{E}\}\\\\ \mathcal{N}_i^{out} &= \displaystyle\{ j \in \mathcal{N} \mid (i,j) \in \mathcal{E}\}\end{array}

and illustrate them below

IN & OUT neighborhoods

The divergence [1] of node i of graph \mathcal{G} is a scalar defined as

\textbf{div}_i (\mathcal{G}) = \displaystyle \sum_{j \in \mathcal{N}_i^{out}} x_{ij} - \displaystyle \sum_{j \in \mathcal{N}_i^{in}} x_{ji}

and it gives us the amount of money node i owes to other nodes minus the amount of money node i is owed to by other nodes. The divergence of a node can be viewed as the discrete, graph-theoretic analogue of the divergence operator in vector calculus. We define also the divergence vector of graph \mathcal{G} as the vector whose i-th component is \textbf{div}_i (\mathcal{G})

\textbf{div} (\mathcal{G}) = \left[\begin{array}{c} \vdots\\ \textbf{div}_i (\mathcal{G})\\ \vdots \\ \end{array}\right].

The divergence vector of an IOU graph \mathcal{G} tells us how much money each node in the graph will lose after debts have been paid. We say that two IOU graphs (on the same node set) are equi-divergent if their divergence vectors are the same. This means that, although the debt relationships are different, both IOU graphs result in every node getting paid what he is owed and paying what he owes.

__________

Optimal Account Balancing

Suppose we are given an IOU graph \mathcal{G}_0 = (\mathcal{N}, \mathcal{E}_0, w_0). We would like to find a new IOU graph \mathcal{G} = (\mathcal{N}, \mathcal{E}, w) on the same node set \mathcal{N}, such that the two IOU graphs are equi-divergent

\textbf{div} (\mathcal{G}) = \textbf{div} (\mathcal{G}_0).

Moreover, we would like the new IOU graph to be “optimal” in some sense. I can think of the two following optimality criteria:

minimum money flow: we want to minimize the total amount of money being exchanged between the nodes. This is the same as minimizing the sum of the (positive) weights of the edges of the IOU graph.

least number of transactions: we want to minimize the total number of transactions being made. This is the same as minimizing the number of edges in the IOU graph, i.e., maximizing the sparsity of the IOU graph.

Can you think of other interesting optimality criteria?

It would be reasonable to consider the case where the transaction fees are nonzero. If these are variable and proportional to the amount of money being transferred, then we would like to minimize the amount of money flowing in the network of agents. If the transaction fees are fixed, then we would like to minimize the total number of transactions being made. For example, in the two examples above the minimum money flow solution is the same as the least number of transactions solution. Is this always the case?

In order to find the optimal IOU graph, we start with a weighted directed graph on node set \mathcal{N} whose edge set is the one of the directed complete graph on \mathcal{N}. Hence, every two distinct nodes are connected by two directed edges with opposite directions. The weight of edge (i,j) is denoted by x_{ij} \geq 0. Since some of these weights might be zero, there might be edges in this graph which do not represent debt relationships! Therefore, although this graph is weighted and directed, it is not necessarily an IOU graph. We optimize for the n (n-1) nonnegative variables \{x_{ij}\} and, in the end, we remove the edges whose weight is zero so that we obtain an IOU graph. I view this approach as somewhat akin to scaffolding ;-)

For example, considering the minimum money flow criterion, we have the following minimum cost flow problem [1]

\begin{array}{ll} \text{minimize} & \displaystyle \sum_{i \in \mathcal{N}}\sum_{j \neq i} x_{ij}\\\\ \text{subject to} & \displaystyle \sum_{j \neq i} x_{ij} - \displaystyle \sum_{j \neq i} x_{ji} = \textbf{div}_i (\mathcal{G}_0), \quad{} \forall i \in \mathcal{N}\\\\ & x_{ij} \geq 0, \quad{} \forall i \neq j \in \mathcal{N}\\\\\end{array}

 

which can be solved via linear programming [2]. In words: we want to find the debts x_{ij} \geq 0 such that all debts are paid (each node should be paid what it is owed minus what it owes to other nodes) and the total amount of debt is minimized, which results in the total amount of money flowing to be also minimized.

The least number of transactions criterion is much harder to tackle because we want to maximize the sparsity of the graph, while ensuring that the graph and the original IOU graph equi-divergent. We want to optimize the topology of the graph and, hence, a combinatorial flavor is added to the problem. I will write more about it on future posts.

__________

A Numerical Example

Let us again consider the 3-agent example where the agents are Alice, Bob and Charlie. The IOU graph is

IOU graph - Alice Bob and Charliewhere the node set is \mathcal{N} = \{\text{a}, \text{b}, \text{c}\}. We introduce vector

x = (x_{ab}, x_{ac}, x_{ba}, x_{bc}, x_{ca}, x_{cb})

where x_{ij} \geq 0 is the amount of money node i owes to node j. The divergence of the new graph at each node must be the same as the divergence of the given IOU graph at each node, which yields the equality constraint C x = b, where

C = \left[\begin{array}{rrrrrr} -1 & -1 & 1 & 0 & 1 & 0\\ 1 & 0 & -1 & -1 & 0 & 1\\ 0 & 1 & 0 & 1 & -1 & -1\\\end{array}\right]

is the incidence matrix of the graph, and b = [\begin{array}{ccc} 10 & 5 & -15\end{array}]^T is the opposite of the divergence vector of the given IOU graph above. This equality constraint guarantees that the given IOU graph and the new graph we obtain are equi-divergent.

Let I_6 be the 6 \times 6 identity matrix, let 1_6 be the 6-dimensional vector of ones, and let 0_6 be the 6-dimensional vector of zeros. The 6 nonnegativity constraints x_{ij} \geq 0 can be written in matrix form as (-I_6) x \leq 0_6. Considering the minimum money flow criterion, we obtain the linear program

\begin{array}{ll} \text{minimize} & 1_6^T x\\\\ \text{subject to} & C x = b, \quad{} (-I_6) x \leq 0_6\\\\\end{array}

Note that the equality constraint C x = b is equivalent to two inequality constraints: C x \leq b and C x \geq b. Thus, we can transform the linear program above into a linear program with inequality constraints only

\begin{array}{ll} \text{minimize} & 1_6^T x\\\\ \text{subject to} & \left[\begin{array}{r} C\\ -C\\ -I_6\\\end{array}\right] x \leq \left[\begin{array}{r} b\\ -b\\ 0_6\\\end{array}\right]\\\\\end{array}

If you have MATLAB, you can easily solve these linear programs using function linprog. The following Python 2.5 script solves the latter linear program using CVXOPT:

from cvxopt import matrix
from cvxopt import solvers

# -------------------
# auxiliary matrices
# -------------------

# defines incidence matrix
C = matrix([ [-1.0, 1.0, 0.0],
             [-1.0, 0.0, 1.0],
             [1.0, -1.0, 0.0],
             [0.0, -1.0, 1.0],
             [1.0, 0.0, -1.0],
             [0.0, 1.0, -1.0] ])

# defines desired divergence vector
d = matrix( [-10.0, -5.0, 15.0] )

# defines 6x6 identity matrix
Id = matrix([ [1, 0, 0, 0, 0, 0],
              [0, 1, 0, 0, 0, 0],
              [0, 0, 1, 0, 0, 0],
              [0, 0, 0, 1, 0, 0],
              [0, 0, 0, 0, 1, 0],
              [0, 0, 0, 0, 0, 1] ])

# ---------------
# linear program
# ---------------

# defines objective vector
c = matrix(1.0, (6,1))

# defines inequality constraint matrices
G = matrix( [C, -C, -Id] )
h = matrix( [-d, d, matrix(0.0, (6,1))] )

# solves linear program
solvers.options['show_progress'] = True
solution = solvers.lp(c, G, h)

# prints solution
print solution['x']

The solution is

x = \left[\begin{array}{r}5.86\text{e-}008\\ -1.50\text{e-}008\\ -1.36\text{e-}008\\ -1.99\text{e-}008\\ 1.00\text{e+}001\\ 5.00\text{e+}000\\\end{array}\right],

which is “contaminated” with numerical noise. Adjusting the tolerances of the numerical solver, one can reduce the numerical noise and obtain the true solution, which is x = (0, 0, 0, 0, 10, 5). Removing the edges with zero weight, we obtain an IOU graph with 2 edges

IOU graph - Alice Bob and Charlie 3

and, hence, only two transactions are required, and only $15 need to be exchanged. This is the exact same solution we obtained previously via heuristic methods! :-)

__________

Concluding Remarks

The two optimal debt-paying schemes considered on this post rely on an “accountant” who has perfect global information about all the debts in the network of agents. This accountant solves an optimization problem to find out what transactions should be carried out. This is a centralized approach, as the decision power is centralized in the accountant. Another (and arguably more interesting) approach would be a decentralized one, where the nodes have only local information (i.e., each node only knows who owes him money and to whom he owes money) and where each node solves a local optimization problem using, for example, message-passing algorithms. I will write about it on future posts.

The only modeling of this problem I have found was on Demetri Spanos‘ PhD thesis [3]. If you know of any papers on this topic, please let me know.

Last but not least, allow me to add a personal note: I first started thinking of this problem years ago, while going to dinners with friends and trying to figure out how to split the bill in an efficient manner. The real world is a great source of inspiration when it comes to interesting problems.

__________

References

[1]  Dimitri Bertsekas, Network Optimization: Continuous and Discrete Models, Athena Scientific, 1998.

[2] Stephen Boyd and Lieven Vandenberghe, Convex Optimization, Cambridge University Press, 2004.

[3] Demetri Spanos, Distributed gradient systems and dynamic coordination, Ph.D. thesis, California Institute of Technology, December 2005.

Optimal Downloading Strategies

February 24, 2008

Imagine that you go on a road trip with some friends. One of your friends has a video camera and films the best moments of the trip. After the trip, your friend edits the footage and creates a movie with the very best moments. Your friend could then burn a CD (or a DVD) containing the movie, and send it to you via post-mail. However, that solution involves:

  • wasting time to go to the mail office.
  • spending money on CD’s, stamps, and envelopes.

We could do better than that: why not compress the movie into one or more ZIP files and upload them to a hosting website such as Rapidshare? Bandwidth is cheap these days, and downloading involves only a few mouseclicks. Your friend uploads the ZIP files to a hosting website and sends you the list of URL’s via email, including the size of each file next to each URL. Let us assume that:

  • in this hosting website, you need to wait 1 minute for each MB you downloaded in the previous file download.
  • you need to download all ZIP files before you can unzip them and watch the movie your friend created.

You then think: “in which order should I download the files”? This is a decision problem under certainty. Let us try to model it!

__________

Mathematical Model

Let us say that we have a total of n files to download, where n \geq 2. We consider the following variables and parameters (and respective units):

  • x_i: size of the i-th file (unit: Megabytes).
  • T_i^D: time it takes to download the i-th file (unit: seconds).
  • T_i^W: time one needs to wait after downloading the i-th file (unit: seconds).
  • R: downloading speed (unit: Megabytes per second)

for all i \in \mathcal{I}, where \mathcal{I} = \{1, 2, \dots, n\}. The time it takes to download the i-th file is

T_i^D = \displaystyle\frac{x_i}{R}

and the time one needs to wait after downloading the i-th file is

T_i^W = 60 x_i.

The total downloading time is the sum of each file’s downloading time plus the time one needs to wait after downloading each file. Note that after we download the last file, we can unzip the n files and obtain the full-length movie. Therefore, when counting the total downloading time, we do not need to include the waiting time of the last downloaded file. Let us denote by j \in \mathcal{I} the index of the last downloaded file. Thus, the total downloading time is

T = \displaystyle\sum_{i \in \mathcal{I}} T_i^D + \displaystyle\sum_{i \in \mathcal{I} \setminus \{j\} } T_i^W,

which can also be written as

T = \displaystyle\sum_{i \in \mathcal{I}} \left(T_i^D + T_i^W\right) - T_j^W,

or, then, as follows

T = \left(60 + \frac{1}{R}\right)\displaystyle\sum_{i \in \mathcal{I}} x_i - 60 x_j.

__________

Optimal Downloading Strategies

Suppose that our objective is to minimize the total downloading time T. Let us introduce the cost function C: \mathcal{I} \to \mathbb{R}_0^+ defined by

C(j) = T = \left(60 + \frac{1}{R}\right)\displaystyle\sum_{i \in  \mathcal{I}} x_i - 60 x_j.

Note that our decision variable is j \in \mathcal{I}, i.e., our goal is to find the value of j \in \mathcal{I} that minimizes the total downloading time T. Note also that:

  • \left(60 + \frac{1}{R}\right)\displaystyle\sum_{i \in \mathcal{I}} x_i is the “fixed cost”.
  • \left(- 60 x_j \right) is the “variable cost”.

We cannot “control” the fixed cost, but we can try to minimize the variable cost. The minimal cost is thus

C^* = \displaystyle\min_{j \in \mathcal{I}} C(j) = \left(60 + \frac{1}{R}\right)\displaystyle\sum_{i \in \mathcal{I}} x_i - \max_{j \in \mathcal{I}} \left\{60 x_j \right\}.

Hence, we can conclude that the optimal j \in \mathcal{I} is

j^* = \displaystyle\arg\max_{j \in \mathcal{I}} \left\{ x_j \right\},

i.e., we should leave the biggest of all n files for last! This is quite intuitive. Given that

  • the downloading speed, R, is assumed to be constant.
  • we need to download all the n files.

we can conclude that the only way to minimize the total downloading time is to minimize the total waiting time, and the way to do that is to download the biggest file at the end.

Alternatively, we could have used the following cost function

C(j) = \displaystyle\sum_{i \in \mathcal{I} \setminus \{j\} } T_i^W =  \left(\displaystyle\sum_{i \in \mathcal{I}} 60 x_i\right) - 60 x_j

where cost equals waiting time (i.e., we ignore the fixed cost). The optimal solution will be the same as before

j^* = \displaystyle\arg\max_{j \in \mathcal{I}} \left\{ x_j \right\},

and the only difference will be that the minimum cost will now be the minimum waiting time, not the minimum total downloading time.

__________

Concluding Remarks

The solution we have computed only tells us which file to download in the n-th download. The solution tells us nothing on which files to download on the previous n-1 downloads. We can then conclude that:

  • there are n! distinct ways of downloading the files.
  • out of those n! ways, (n-1)! of them are optimal.

If all the files have the same size (i.e., if x_i = \bar{x}, \, \forall i \in \mathcal{I}), then there are n! optimal ways of downloading them. In other words, when all files have the same size, the order in which one downloads them is irrelevant, which is somewhat obvious.

This problem could perhaps be generalized. Any ideas?

__________

DISCLAIMER: please note that downloading copyrighted material is illegal under U.S. law. I do not advocate downloading copyrighted material.


Follow

Get every new post delivered to your Inbox.

Join 77 other followers