Archive for July, 2009

Optimal Account Balancing II

July 28, 2009

Consider a network of n = 10 agents, each owing money to and being owed money by other agents. The debts are represented pictorially by the weighted directed graph

IOU graph - 10 nodes and 20 edgeswhich I called an IOU graph on my previous post on this topic. The IOU graph representing this network is an ordered triple \mathcal{G} = (\mathcal{N}, \mathcal{E}, w), where \mathcal{N} = \{1, 2, \ldots, 10\} is the node set (each node of the IOU graph represents a different agent), \mathcal{E} \subset \mathcal{N} \times \mathcal{N} is the edge set (containing |\mathcal{E}| = 20 elements), and w: \mathcal{E} \rightarrow \mathbb{R}^{+} is a weight function that assigns positive weights to each edge in \mathcal{E}. Each directed edge of graph \mathcal{G} is an ordered pair (i,j), where i is the edge’s tail and j is the edge’s head. If (i,j) \in \mathcal{E} then node i owes money to node j. We do not allow nodes to owe money to themselves and, therefore, for each i \in \mathcal{N} we must have (i,i) \notin \mathcal{E}. The weight of edge (i, j) \in \mathcal{E} is given by w\left( (i, j) \right) > 0, and it tells us the amount of money that node i owes to node j.

Alternatively, the information contained in the IOU graph can be encoded in the following nonnegative matrix

\left[\begin{array}{cccccccccc} 0 & 10 & 0 & 20 & 20 & 5 & 15 & 0 & 0 & 0\\ 0 & 0 & 5 & 10 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 15 & 0 & 5 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 10 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 5 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 10 & 0 & 0 & 0 & 0\\ 5 & 20 & 0 & 0 & 0 & 0 & 5 & 0 & 0 & 0\\ 0 & 15 & 0 & 0 & 0 & 0 & 0 & 20 & 0 & 10\\ 0 & 5 & 15 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ \end{array}\right]

where entry (i,j) is the amount of money node i owes to node j. Since we do not allow nodes to owe money to themselves, the entries on the main diagonal of this matrix are zero. We will call this matrix the IOU matrix. Note that the IOU matrix is the adjacency matrix of the IOU graph. Every nonnegative matrix with zero entries on the main diagonal is an admissible IOU matrix, and we can associate with it a corresponding IOU graph.

If the IOU matrix contains all the information about the IOU graph, then we can work with the IOU matrix instead of the IOU graph. Is there any advantage in doing so? I believe there is. In my opinion, the IOU graph is good for conceptual thinking and intuition, while the IOU matrix is a data structure that is convenient for computation. However, the matrix is not the only data structure that one can use to encode the information in an IOU graph. Another possibility would be to use adjacency lists, for example.

__________

Divergence and IOU Matrices

Suppose we are given an IOU graph \mathcal{G}_0 = (\mathcal{N}, \mathcal{E}_0, w_0), whose node set is \mathcal{N} = \{1, 2, \ldots, n\}. Note that (i,i) \notin \mathcal{E}_0 for each i \in \mathcal{N}, because nodes cannot owe money to themselves. We now introduce variables y_{ij} \geq 0 as follows

y_{ij} = \displaystyle\left\{\begin{array}{cl} w_0 ((i,j)) & \text{if} \quad{} (i,j) \in \mathcal{E}_0\\ 0 & \text{if} \quad{} (i,j) \notin \mathcal{E}_0\\ \end{array}\right.

and build the (nonnegative) adjacency matrix of the IOU graph \mathcal{G}_0

Y = \left[\begin{array}{ccccc} 0 & y_{12} & y_{13} & \ldots & y_{1n}\\ y_{21} & 0 & y_{23} & \ldots & y_{2n}\\ y_{31} & y_{32} & 0 & \ldots & y_{3n}\\\vdots & \vdots & \vdots & \ddots & \vdots\\ y_{n1} & y_{n2} & y_{n3} & \ldots & 0\\\end{array}\right]

which is the IOU matrix associated with the IOU graph \mathcal{G}_0. The (i,j) entry of matrix Y \in \mathbb{R}^{n \times n} denotes the amount of money that node i owes to node j. The divergence of node i \in \mathcal{N} is a scalar defined as

\mbox{div}_i (\mathcal{G}_0) = \displaystyle \sum_{j \neq i} y_{ij} - \displaystyle \sum_{j \neq i} y_{ji}

and it gives us the amount of money node i owes to other nodes minus the amount of money node i is owed to by other nodes. In terms of the IOU matrix Y, the divergence of node i \in \mathcal{N} is given by

\mbox{div}_i (\mathcal{G}_0) = e_i^T Y 1_n - e_i^T Y^T 1_n = e_i^T Y 1_n - 1_n^T Y e_i

where e_i \in \mathbb{R}^{n} is the i-th vector of the canonical basis for \mathbb{R}^n, i.e., the i-th column of the n \times n identity matrix I_n, and 1_n is an n-dimensional vector of ones. Let \mbox{vec} denote the vectorization operator, which stacks a given matrix’s columns in one large column vector. Note that if we apply operator \mbox{vec} to a scalar, we obtain the same scalar. Hence, we have that

e_i^T Y 1_n = \mbox{vec} (e_i^T Y 1_n) = (1_n^T \otimes e_i^T) \mbox{vec} (Y)

and

1_n^T Y e_i = \mbox{vec} (1_n^T Y e_i) = (e_i^T \otimes 1_n^T) \mbox{vec} (Y)

where \otimes denotes the Kronecker product. Therefore, the divergence of node i \in \mathcal{N} can be written as

\mbox{div}_i (\mathcal{G}_0) = \left[(1_n^T \otimes e_i^T) - (e_i^T \otimes 1_n^T)\right] \mbox{vec} (Y).

We define also the divergence vector of graph \mathcal{G}_0 as the n-dimensional vector whose i-th component is \mbox{div}_i (\mathcal{G}_0)

\mbox{div} (\mathcal{G}_0) = \left[\begin{array}{c} (1_n^T \otimes e_1^T) - (e_1^T \otimes 1_n^T)\\ (1_n^T \otimes e_2^T) - (e_2^T \otimes 1_n^T)\\ \vdots\\ (1_n^T \otimes e_n^T) - (e_n^T \otimes 1_n^T)\\\end{array}\right] \mbox{vec} (Y).

Alternatively, we can use equation \mbox{div}_i (\mathcal{G}_0) = e_i^T Y 1_n - e_i^T Y^T 1_n, and write the divergence vector as

\mbox{div} (\mathcal{G}_0) = I_n Y 1_n - I_n Y^T 1_n = (Y - Y^T) 1_n

and, since

I_n Y 1_n = \mbox{vec} (I_n Y 1_n) = (1_n^T \otimes I_n) \mbox{vec} (Y)

and

I_n Y^T 1_n = \mbox{vec}(I_n Y^T 1_n) = (1_n^T \otimes I_n) \mbox{vec} (Y^T),

we obtain

\mbox{div} (\mathcal{G}_0) = (1_n^T \otimes I_n) \mbox{vec} (Y) - (1_n^T \otimes I_n) \mbox{vec} (Y^T).

Note that n^2-dimensional vectors \mbox{vec} (Y) and \mbox{vec} (Y^T) contain the same entries, but in a different order. Hence, there exists a n^2 \times n^2 permutation matrix P such that

\mbox{vec} (Y^T) = P \mbox{vec} (Y)

which allows us to write

\mbox{div} (\mathcal{G}_0) = (1_n^T \otimes I_n) (I_{n^2} - P) \mbox{vec} (Y).

Do note that the permutation matrix depends on n only. It does not depend on the entries of the matrix to which operator \mbox{vec} is applied. Henceforth, we will denote \tilde{y} = \mbox{vec}(Y). Finally, we have a rather compact equation for the divergence vector

\mbox{div} (\mathcal{G}_0) = (1_n^T \otimes I_n) (I_{n^2} - P) \tilde{y}.

The divergence vector of an IOU graph \mathcal{G}_0 tells us how much money each node in the graph will lose after debts have been paid. We say that two IOU graphs (on the same node set) are equi-divergent if their divergence vectors are the same. This means that, although the debt relationships are different, both IOU graphs result in every node getting paid what he is owed and paying what he owes.

__________

Minimum Money Flow

Suppose we are given an IOU graph \mathcal{G}_0 = (\mathcal{N}, \mathcal{E}_0, w_0). We would like to find a new IOU graph \mathcal{G} = (\mathcal{N}, \mathcal{E}, w) on the same node set \mathcal{N}, such that the two IOU graphs are equi-divergent, i.e., \mbox{div} (\mathcal{G}) = \mbox{div} (\mathcal{G}_0). Moreover, we would like the new IOU graph to be “optimal” in some sense. On this post, we will consider the following optimality criterion:

minimum money flow: we want to minimize the total amount of money being exchanged between the nodes. This is the same as minimizing the sum of the weights of the IOU graph \mathcal{G}.

Let us start with matrix X \in \mathbb{R}^{n \times n}, which will be an admissible IOU matrix if we impose two constraints:

  • nonnegativity: the entries of matrix X must be nonnegative. This is equivalent to imposing the constraint that vector \tilde{x} = \mbox{vec}(X) has nonnegative entries, i.e., \tilde{x} \geq 0_{n^2}, which is equivalent to (-I_{n^2}) \tilde{x} \leq 0_{n^2}.
  • zero entries on the main diagonal: we must have x_{ii} = 0 for all i \in \mathcal{N}. Note that x_{ii} = e_i^T X e_i = \mbox{vec}(e_i^T X e_i), and therefore x_{ii} = 0 is equivalent to (e_i^T \otimes e_i^T) \tilde{x} = 0. Hence, imposing the constraint that the entries on the main diagonal be zero is equivalent to imposing the n equality constraints

\left[\begin{array}{c} e_1^T \otimes e_1^T\\ e_2^T \otimes e_2^T\\ \vdots\\ e_n^T \otimes e_n^T\\\end{array}\right] \tilde{x} = 0_n.

Once these two constraints have been imposed, matrix X is the adjacency matrix of an IOU graph. Finally, we must impose the constraint that the divergence vector of the new IOU graph is the same as the divergence vector of the given IOU graph, i.e., \mbox{div} (\mathcal{G}) = \mbox{div} (\mathcal{G}_0), which yields

(1_n^T \otimes I_n) (I_{n^2} - P) \tilde{x} = \mbox{div} (\mathcal{G}_0)

where \mbox{div} (\mathcal{G}_0) = (Y - Y^T) 1_n can be computed from the given IOU graph. Considering the minimum money flow criterion, the cost function to be minimized is the following

\displaystyle\sum_{i \in \mathcal{N}} \sum_{j \neq i} x_{ij} = 1_n^T X 1_n = (1_n^T \otimes 1_n^T) \tilde{x} = 1_{n^2}^T \tilde{x}

and, hence, we obtain the following linear program in \tilde{x} \in \mathbb{R}^{n^2}

\begin{array}{ll} \text{minimize} & \displaystyle 1_{n^2}^T \tilde{x}\\\\ \text{subject to} & (1_n^T \otimes I_n) (I_{n^2} - P) \tilde{x} = \mbox{div} (\mathcal{G}_0)\\\\ & \left[\begin{array}{c} e_1^T \otimes e_1^T\\ e_2^T \otimes e_2^T\\ \vdots\\ e_n^T \otimes e_n^T\\\end{array}\right] \tilde{x} = 0_n\\\\ & (-I_{n^2}) \tilde{x} \leq 0_{n^2}\\\\\end{array}

which has 2 n equality and n^2 inequality constraints. One can easily solve this linear program with MATLAB using function linprog.

However, this formulation looks a bit silly. Note that initially we have n^2 degrees of freedom (the n^2 entries of matrix X), then we impose n equality constraints x_{ii} = 0 to ensure that the entries on the main diagonal are zero, which leaves us with m = n^2 - n = n (n-1) degrees of freedom. If we already know that the entries on the main diagonal are zero, why consider them as decision variables? Wouldn’t it make much more sense to consider only the m = n (n-1) entries off the main diagonal as decision variables?

Let R be a m \times n^2 binary matrix such that \hat{x} = R \tilde{x} is the reduced vector of m decision variables, containing solely the entries of X off the main diagonal. We now eliminate the n columns of n \times n^2 matrix (1_n^T \otimes I_n) (I_{n^2} - P) which were to be multiplied by the x_{ii} variables. We do so by multiplying matrix (1_n^T \otimes I_n) (I_{n^2} - P) on the right by R^T, which produces a n \times m matrix (1_n^T \otimes I_n) (I_{n^2} - P) R^T.

The new vector of decision variables is \hat{x} \in \mathbb{R}^m. Since we no longer consider the entries on the main diagonal as decision variables, we don’t need to impose the n equality constraints x_{ii} = 0. Hence, we obtain a lower-dimensional linear program

\begin{array}{ll} \text{minimize} & \displaystyle 1_{m}^T \hat{x}\\\\ \text{subject to} & (1_n^T \otimes I_n) (I_{n^2} - P) R^T \hat{x} = \mbox{div} (\mathcal{G}_0)\\\\ & (-I_m) \hat{x} \leq 0_{m}\\\\\end{array}

where we now have n equality constraints (divergences at the nodes), and m inequality constrains (x_{ij} \geq 0). My intuition tells me that

(1_n^T \otimes I_n) (I_{n^2} - P) R^T = \left[\begin{array}{c} (1_n^T \otimes e_1^T) - (e_1^T \otimes 1_n^T)\\ (1_n^T \otimes e_2^T) - (e_2^T \otimes 1_n^T)\\ \vdots\\ (1_n^T \otimes e_n^T) - (e_n^T \otimes 1_n^T)\\\end{array}\right] R^T

is an incidence matrix of the directed complete graph on node set \mathcal{N}. What do you think? Do you agree?

__________

A Numerical Example

Consider a network of n = 10 agents whose debts are represented by the following IOU graph (with 15 edges)

IOU graph - 10 nodes and 15 edgesThe IOU matrix associated with this given IOU graph is

Y = \left[\begin{array}{cccccccccc} 0 & 0 & 0 & 20 & 20 & 5 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 10 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 5 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 10 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 5 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 10 & 0 & 0 & 0 & 0\\ 5 & 20 & 0 & 0 & 0 & 0 & 5 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 20 & 0 & 10\\ 0 & 5 & 15 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ \end{array}\right]

Solving the linear program, we obtain a new IOU matrix

X = \left[\begin{array}{cccccccccc} 0 & 6.097 & 3.924 & 10.776 & 10.776 & 8.427 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0.850 & 0.644 & 1.234 & 1.234 & 1.039 & 0 & 0 & 0 & 0\\ 0 & 1.690 & 1.185 & 2.505 & 2.505 & 2.114 & 0 & 0 & 0 & 0\\ 0 & 4.672 & 3.063 & 7.980 & 7.980 & 6.306 & 0 & 0 & 0 & 0\\ 0 & 1.690 & 1.185 & 2.505 & 2.505 & 2.114 & 0 & 0 & 0 & 0\\\end{array}\right]

which is the adjacency matrix of a new IOU graph. The given IOU graph and the new IOU graph are equi-divergent. The given IOU graph had 15 edges and a total of 165 flowing in it, while the new IOU graph has 25 edges and a total of 95 flowing in it. While the flow of money has been reduced, the number of edges (i.e., the number of transactions) has increased dramatically!!

On my previous post on this topic, I gave two very simple examples, and in both of them the minimum money flow solution also yielded the least number of transactions. I wondered back then if that was always the case. This example clearly shows that it is not. Generally speaking, not only does the minimum money flow criterion fail to minimize the number of transactions, but it can actually result in more transactions than the ones in the initial IOU graph. I cannot think of any reason why minimizing the total amount of money flowing in the IOU graph at the cost of a larger number of transactions would be a good idea…

Therefore, the next step is to abandon the minimum money flow criterion, and focus solely on the least number of transactions. Minimizing the number of transactions is a hard combinatorial problem, but maybe there are approximation algorithms that make the problem tractable.

Geometry with POV-Ray

July 28, 2009

If you would like to use POV-Ray to visualize 3-dimensional geometrical objects, I highly recommend Friedrich A. Lohmueller‘s beautiful tutorial on Analytical Geometry with POV-Ray. Other POV-Ray tutorials by Friedrich A. Lohmueller can be found here.

The Shadow of a Pyramid (by Friedrich A. Lohmueller, 2007)

[ image courtesy of Friedrich A. Lohmueller ]

In my humble opinion, this is how Analytical Geometry should be taught. Descartes had no access to 3-d graphics. We do. Why not take advantage of the technology?

The virtual Apollo Guidance Computer

July 20, 2009

I remember on the trip home on Apollo 11 it suddenly struck me that that tiny pea, pretty and blue, was the Earth. I put up my thumb and shut one eye, and my thumb blotted out the planet Earth. I didn’t feel like a giant. I felt very, very small.

Neil Armstrong [1]

On this day 40 years ago, the Apollo 11′s Eagle lunar module landed on the Moon. As Neil Armstrong and Buzz Aldrin became the first humans to walk on the lunar surface, Michael Collins orbited the Moon inside the Columbia command module, experiencing “an aloneness unknown to man before”, as aviator Charles Lindbergh put it [2].

The Apollo 11 crew portrait. Left to right are Armstrong, Michael Collins, and Buzz Aldrin.

[ photo courtesy of NASA ]

The Apollo program was an enormous technical achievement. I shall forever be amazed at how the engineers who worked on this program managed to accomplish so much with the technology they had available at the time. The rocket engine was a relatively recent technology back then, the transistor had been invented in 1947, the integrated circuit had been invented in 1958, and the microprocessor had not even been invented yet. In particular, the Apollo Guidance Computer (AGC) was a wonderful work of ingenuity: it was the world’s first modern real-time embedded system, and it led to the development of fly-by-wire systems.

To commemorate the 40th anniversary of the Apollo 11 mission, the command module code (Comanche054) and the lunar module code (Luminary099) have been transcribed from scanned images by the Virtual AGC and AGS project! [3] The code can be found here.

For example, take a look at the following modules: ascent guidance, Kalman filter, master ignition routine. Yes, programming the AGC seems to have been a spartan endeavor! ;-) If you happen to dislike vintage assembly programming languages, take a look at the yaAGC code, which is written in C.

__________

References

[1] The Greening of the Astronauts, Time Magazine, December 11, 1972.

[2] Robin McKie, How Michael Collins became the forgotten astronaut of Apollo 11, The Observer, July 18, 2009.

[3] Nathaniel Manista, Apollo 11 mission’s 40th Anniversary: One large step for open source code…, The official Google Code blog, July 20, 2009.

Matching plus Allocation

July 9, 2009

Suppose we are given n = 3 red balls and n = 3 black balls. All of these balls are labeled with a real number. For example

Red + Black ballsWe now pair red balls with black balls. There are n! = 3! = 6 such matchings, as illustrated below

6 matchings

Additionally, suppose that a pair composed of a red ball with label r \in \mathbb{R} and a black ball with label b \in \mathbb{R} can be exchanged for a white ball with label b - r. For example

3 reds & 3 blacks produce 3 whitesSince there are 3! = 6 matchings, in general there will be 6 collections of 3 white balls, as depicted below

6x3 white ballsNote that the sum of the labels of the white balls is always 6, regardless of the matching. This is due to the fact that addition is commutative and associative. Note also that the 3! = 6 resulting collections of 3 white balls are not necessarily distinct.

Suppose now that we would like to allocate the n = 3 white balls to m = 2 bowls such that the sum of the labels of the white balls in each bowl is 50% of the total sum of the labels of the white balls. Pictorially, we have

123-50-50 allocationand, since the sum of the labels of the white balls in each bowl is exactly 3, the allocation is feasible. Nonetheless, this allocation is not unique: we could also have allocated ball with label 3 to bowl 1 and balls with labels 1 and 2 to bowl 2. If we wanted to allocate 1/3 of the total to bowl 1 and 2/3 of the total to bowl 2, we could have allocated ball with label 2 to bowl 1, and balls with labels 1 and 3 to bowl 2, for instance. This allocation would also be feasible.

However, not all desired allocations are feasible. For example, suppose that we want to allocate 90% to bowl 1 and 10% to bowl 2. This allocation is unfeasible. We can allocate a total of 5 to bowl 1 and 1 to bowl 2, which corresponds to 83.3% and 16.7% instead of 90% and 10%.

123-90-10 allocationThough we cannot attain the desired allocation exactly, we are able to get relatively “close” to it. We could use, for instance, a quadratic criterion to measure how “far” a particular allocation is from the desired one.

In each bowl we are attempting to reconstruct a given real number from a finite collection of real numbers. Hence, the unfeasibility of a desired allocation stems from the “chunkiness” of the “parts” we have at our disposal. But here is the interesting aspect of this problem: we can change the “parts” (i.e., the white balls) by matching the red and black balls in a different way. Therefore, allocation can not be “decoupled” from matching, so to say. The two operations are now intertwined. When performing the matching, we should strive to build a good collection of white balls so that the actual allocation is as “close” as possible to the desired one.

If we have n red balls and n black balls, then there are n! matchings. Each red-black pair is then allocated to one of m \leq n bowls. Hence, our search space has n! m^n points, and an exhaustive search will run in time O(n! m^n) which is intractable for “large” values of n. For example, if n = 10 and m = 3, the search space has over 200 billion points! Yet once again, we are doomed by the curse of combinatorial explosion.

Our challenge (and only hope) is to design a heuristic-based algorithm which, although not optimal, is “good enough” most of the time. Any ideas?

This problem seems to have some of the ingredients of many classical problems such as integer partitions, knapsack, and bipartite / tripartite matching. It is not difficult to formulate the problem as a constrained quadratic binary optimization problem, but solving it is hard.


Follow

Get every new post delivered to your Inbox.

Join 77 other followers