Logicomix

September 27, 2009 by Rod Carvalho

It cannot be a complete coincidence that several outstanding logicians of the 20th century found shelter in asylums at some time in their lives.

-Gian-Carlo Rota (1932-1999)

Logicomix: an Epic Search for Truth is an ambitious and highly original graphic novel on the heroic quest to make the foundations of Mathematics exact, consistent, and complete. The narrator and protagonist of the story is the great logician and philosopher Bertrand Russell. Some of the other characters are David Hilbert, Henri Poincaré, Georg Cantor, Gottlob Frege, Alfred North Whitehead, Kurt Gödel, Ludwig Wittgenstein, and John von Neumann. It is a book about ideas, passions, and madness.

Logicomix

[ image courtesy of Logicomix ]

The novel was written by Apostolos Doxiadis and Christos Papadimitriou. Alecos Papadatos did the drawings, and Annie Di Donna added color to them. Here are some videos on the making of Logicomix:

[ part 1 | part 2 | part 3 ]

If you happen to be interested in this topic, I would suggest you read Greg Chaitin’s nice essay on Computers, Paradoxes and the Foundations of Mathematics [pdf].

__________

Reviews:

Automated generation of machining sequences

September 18, 2009 by Rod Carvalho

When I was an undergraduate student I took a course on Machining. I did not enjoy to study the details of how various machine tools are built or how they work, but I had a lot of fun creating machining sequences, which required ingenuity. The regular visits to the machine shop were also fun. After all, it’s not everyday that one gets to see an object taking shape right before one’s eyes :-)

The instructor would give us a certain mechanical part and ask us to come up with a machining sequence for it. If you are not acquainted with machinist’s jargon, here’s a definition:

Definition: a machining sequence is a sequence of machining operations (such as turning, drilling, milling, boring, broaching, etc) that remove material from, say, an aluminium block in order to obtain a desired object (such as a piston or a cylinder for a car engine).

Note that machining is about removing material, i.e., it’s a subtractive technology. By contrast, rapid prototyping is an additive technology.

Let us consider an example. Suppose I give you the following 3D model:

3d-model

What machining sequence would you propose? Here’s my proposal:

  1. Start with a crude block of aluminium.
  2. Given the object’s rotational symmetry about the longitudinal axis, use the lathe to shape the cylindrical / conical parts. Use the lathe also to drill the central hole along the longitudinal axis.
  3. Note that the surface of the central hole is not smooth. It has “teeth”. I would say that a combined lathe / broach would allow one to obtain the “teeth”.
  4. Drill 6 holes at the base.
  5. Use a milling machine to remove material at the base, in between the 6 holes.

Here’s a sketchy illustration of the machining sequence I proposed, where a machine tool is assigned to various sections of the object being fabricated:

machining operations

I am not an experienced machinist, and I make no claims that the machining sequence I have just proposed is any good. If you have ideas, feel free to share.

Suppose you have a 3D model of some object you would like to fabricate via machining. An experienced machinist is able to visualize the object and devise a machining sequence. Could a computer be taught to do the same?

Problem: develop algorithms that, given a 3D model of some object, automatically generate machining sequences. These algorithms should determine which machining operations to use, and in which order.

The machining sequence to be carried out is a function of the geometry and the topology of the object to be fabricated. Hence, algorithms that automatically generate machining sequences would have to “understand” the geometry and the topology of the given 3D model. This suggests that ideas from Computational Geometry and Computational Topology could perhaps be applied in this particular problem. For example, if the object to be fabricated has cylindrical holes in it, then it’s likely that one should use a drill or a lathe. These holes are a geometrical / topological property of the object. How can a computer be taught to understand them?

Furthermore, there probably is more than one possible machining sequence to fabricate the desired object from the same crude block of metal. If there are many possible machining sequences, then one can start thinking of optimality. Some possible optimality criteria would be to minimize the total number of operations, or to minimize the total fabrication time, for example. Can you think of other criteria? Given one optimality criterion, one can then try to find the “best” machining sequence.

I wonder if Machine Learning could help, too. Give the same 3D model to an experienced machinist and to the software system we’re developing. Both devise machining sequences. The human will probably beat the computer. Feed the computer with the machining sequence devised by the human. Repeat this procedure many times. Iterate until one has obtained a skilled “machining expert system“. The ingenuity and creativity required to transform crude blocks of metal into beautifully machined parts would be emulated by algorithms. I make no claims that this would be easy to implement ;-)

I looked for work on this area. I found a few patents [1]-[5], and a bunch of technical papers [6]-[12] dedicated to this problem. Interestingly, most patents I found on this topic were filed by Japanese. This should come as no surprise. After all, the Japanese are notoriously good at optimizing manufacturing processes.

This post is admittedly vague. I propose a problem, but I do not present any sketch of a solution. Basically, I wanted to share my thoughts in order to invite constructive discussion. Any ideas?

__________

References

[1] Hajimu Kishi, Masaki Seki, Kunio Tanaka, Teruyuki Matsumura, Automatic machining process determination method in an automatic programming system, U.S. patent #4723203. Filed: 1985 / Issued: 1988.

[2] Koichi Asakura, Machine tool with tool selection and work sequence determination, U.S. patent #4739488. Filed: 1985 / Issued: 1988.

[3] Yasushi Fukaya, Yuto Mizukami, Method for determining a machining method in numerical control information generating apparatus, U.S. patent #5107414. Filed: 1989 / Issued: 1992.

[4] Kotaro Watanabe, CAD/CAM unit data generating apparatus and process, U.S. patent #5297022. Filed: 1992 / Issued: 1994.

[5] Takashi Takegahara, Shigetoshi Takagi, Koji Suzuki, Machining sequence determining method and apparatus for wire-cut electric discharge machining and computer readable medium storing a machining sequence determining program, U.S. patent #6445972. Filed: 1999 / Issued: 2002.

[6] In-Ho Kim, Jung-Soo Oh and Kyu-Kab Cho, Computer aided setup planning for machining processes, Computers & Industrial Engineering, 18th International Conference on Computers and Industrial Engineering, Vol. 31, Issues 3-4, pp. 613-617, December 1996.

[7] Z. Gu, Y.F. Zhang, A.Y.C. Nee, Identification of important features for machining operations sequence generation, International Journal of Production Research, Vol. 35, Issue 8, pp. 2285-2308, August 1997.

[8] Majid Tolouei-Rad, An efficient algorithm for automatic machining sequence planning in milling operations, International Journal of Production Research, Vol. 41, Issue 17, pp. 4115-4131, November 2003.

[9] W.D. Li, S. K. Ong, A.Y.C. Nee, Optimization of process plans using a constraint-based tabu search approach, International Journal of Production Research, Vol. 42, Issue 10, pp. 1955-1985, May 2004.

[10] D. Kingsly Jeba Singh, C. Jebaraj, Feature-based design for process planning of machining processes with optimization using genetic algorithms, International Journal of Production Research, Vol. 43, Issue 18, pp. 3855-3887, September 2005.

[11] L. Wang, N. Cai, H.-Y. Feng, Z. Liu, Enriched machining feature-based reasoning for generic machining process sequencing, International Journal of Production Research, Vol. 44, Issue 8, pp 1479-1501, April 2006.

[12] Rezo Aliyev, A strategy for selection of the optimal machining sequence in high speed milling process, International Journal of Computer Applications in Technology, Vol. 27, Issue 1, pp. 72-82, September 2006.

Interviews with Carver Mead

September 9, 2009 by Rod Carvalho

To understand reality, you have to understand how things work. If you do that, you can start to do engineering with it, build things. And if you can’t, whatever you’re doing probably isn’t good science. To me, engineering and science aren’t separate endeavors.

-Carver Mead

Prof. Carver Mead (b. 1934) has made major contributions to solid-state electronics, such as his work on “electron tunneling” and building the first GaAs MESFET.

Carver Mead

Here are a couple of interesting interviews with Carver Mead:

  • Interview with Carver A. Mead (Caltech – July 17, 1996): in this interview Mead talks about his childhood in California, how he got interested in electrical apparatuses and radios at a tender age, his undergraduate years at Caltech, how he started working in solid-state electronics, fundamental research, technology transfer, etc.
  • Carver Mead’s natural inspiration (Technology Review, Sept. 2004): Mead talks about science and engineering, his way of doing research, the startups he helped get off the ground, the business of technology, “demand pull” versus “technology push”, neuromorphic electronic systems, etc. A PDF copy of this interview can be found here.

__________

Related:

Are we running out of helium?

September 2, 2009 by Rod Carvalho

Contrary to popular belief, helium is not used only to inflate party balloons. Given that helium is inert, it is used to create nonreactive atmospheres in semiconductor fabrication and in arc welding, for instance. However, the most important industrial application of this noble gas is probably in cryogenics. Liquid helium is used to cool infrared sensors, nuclear reactors, and superconducting magnets (for example, in MRI equipment). Helium is also used for leak detection, and as a gain medium in He-Ne lasers.

Helium is the 2nd most abundant element in the universe, right after hydrogen. Helium is produced in stars by nuclear fusion:

Deuterium-Tritium Fusion

[ source ]

On Earth, the helium-4 isotope is by far the most abundant, a byproduct of millions of years of alpha decay of uranium and thorium (note that the nucleus of helium-4 is the same as the one of an alpha particle). Helium is extracted during the refining of natural gas. If it’s not extracted during this refining process, it soars off and is forever lost.

Since nuclear fusion technology is still decades away and given that alpha decay is an extremely slow process, industrial production of helium is indeed very far from becoming a reality. We will need to keep extracting helium from natural gas reserves. The problem is that helium reserves may be depleted within the next decade [1][2]. Once that happens, where will we get the helium demanded by high-tech industries?

If helium becomes a sought-after commodity, then the same companies that extract natural gas will have an economic incentive to extract helium as well, instead of letting it escape to the atmosphere. However, helium is a non-renewable resource. Sooner or later, we will deplete the reserves that have built up over millions and millions of years.

The Sun produces colossal amounts of helium by nuclear fusion. In fact, helium is named after Helios, the Greek god who personified the Sun (helium was first found in the Sun). Harvesting helium from the Sun is impossible, however. At least directly. A possibility would be to harvest helium-3 (to serve as fuel for fusion reactors) from moon rocks [3][4][5], where it can be found due to billions of years of bombardment by the solar wind. If the idea of mining the Moon sounds insane to you, please note that Harrison Schmitt (geologist and Apollo 17 astronaut) is one of the proponents.

Is the Moon the El Dorado of helium-3? I don’t know, but I have heard rumors that India and China are developing robotic systems to harvest helium-3 from moon rocks. It seems that Space is the next frontier in the race for natural resources. What next? Will we be mining asteroids to extract platinum for fuel cells?

__________

References:

[1] Emily Jenkins, A Helium Shortage?, Wired Magazine, Aug. 2000.

[2] Laura Deakin, The coming helium shortage, Chemical Innovation, June 2001.

[3] Harrison Schmitt, Mining the Moon, Popular Mechanics, Oct. 2004.

[4] John Lasker, Race to the Moon for Nuclear Fuel, Wired Magazine, 2006.

[5] Mark Williams, Mining the Moon, Technology Review, Aug. 2007.

Dangerous Knowledge

September 1, 2009 by Rod Carvalho

Bring vor, was wahr ist.
Schreib so, dass es klar ist.
Und verficht’s, bis es mit dir gar ist.

-Ludwig Boltzmann

Dangerous Knowledge is a 90-minute long BBC documentary about the lives of four great thinkers: Georg Cantor (1845-1918), Ludwig Boltzmann (1844-1906), Kurt Gödel (1906-1978), and Alan Turing (1912-1954). Cantor founded Set Theory, Boltzmann founded Statistical Mechanics, Gödel’s incompleteness theorems had an immense impact on Mathematical Logic, and Turing was one of the fathers of Computer Science. Why these four scientists? What is the pattern? The answer is that all four of them committed suicide.

Cantor Boltzmann Gödel Turing

The documentary is a bit too sensationalist for my taste. Its message is that these four brilliant minds were driven to madness (and, ultimately, to suicide) due to the earth-shattering nature of their ideas. While one could argue that such a claim is not too far-fetched in the cases of Cantor and Boltzmann, it seems somewhat distasteful when applied to Turing. Turing did not go insane because of the depth of his revolutionary insights. Turing’s homosexuality was viewed as a security problem during the troubled times of the Cold War, and the brutal punishment which he underwent (chemical castration) was probably what led him to take his life. Gödel feared that someone was poisoning his food, and would refuse to eat unless his wife would taste his food first. When his wife was hospitalized for some time, Gödel starved himself to death. Hence, the claim that these four suicides were due to “dangerous knowledge” seems rather crude to me.

On the bright side, the claim that these four geniuses were ahead of their time sounds rather plausible. Cantor’s or Boltzmann’s ideas did not meet fierce resistance because they were technically wrong, but because they challenged the belief that the universe was perfect and orderly. No one wanted to hear about Boltzmann’s work on entropy, because entropy is, by definition, a measure of disorder. As Hilbert’s program tried to fix the inconsistencies in  the foundations of Mathematics, Gödel’s incompleteness theorems made Hilbert’s monumental mission “wir müssen wissen, wir werden wissen” seem even harder to accomplish.

In my most humble opinion the times were not yet ripe for theories that imposed limitations on human knowledge. “What about Heisenberg’s uncertainty principle?”, I hear you ask. Good point. Do note that two decades can make a world of difference. Heisenberg’s work was created when Quantum Mechanics was already out of the box. By contrast, Boltzmann had to fight the 19th Century establishment that was still deeply entrenched in the Newtonian paradigm.

If you have a couple of hours, here are the documentary’s videos:

[ part 1 | part 2]

Greg Chaitin and Sir Roger Penrose are included in this documentary, which itself makes it worth watching.

Carnival of Mathematics #56

August 27, 2009 by Rod Carvalho

Welcome to the 56th installment of the Carnival of Mathematics!

Turing test

Qiaochu Yuan writes about Kraft’s inequality for prefix codes and the probabilistic method, and also on linear relationships between square roots.

Prolific blogger John Cook writes about an intriguing nonlinear differential equation \dot{y}(t) = t^2 + y^2(t), with initial value y(0) = 1. Suppose we would like to compute y(1). Numerical methods produce rather bizarre results. The reason is that there is no solution at t=1 (finite escape time, anyone?). The moral of the story is that the theorems on existence and uniqueness of solutions to initial value problems do indeed matter.

Rick Regan discovered that positive integers of the form 10^n -1 have binary representations where the n least significant bits are exactly 1. For example, 999_{10} = 1111100111_2. It’s interesting to note that this was a somewhat serendipitous finding.

Harrison Brown has a post on Roth’s proof of Roth’s theorem on arithmetic progressions from a combinatorial viewpoint.

Dave Richeson writes about the prime 8675309, twin primes and Pythagorean triples.

Américo Tavares lists three gamma function identities.

Mike Croucher writes about Philipp Kühl and Daniel Kirsch’s awesome Detexify project, a \LaTeX symbol classifier. This is not mathematics per se, but it is an interesting tool for those who need to type mathematical expressions. And if you are interested in machine learning, you will definitely be intrigued.

Since there were few submissions to this Carnival, I take the liberty of including some more links of interest to me:

  • Math Alive was a course on mathematical concepts that have changed the world. It covers many interesting topics: Cryptography, Error Correction Codes, Data Compression, Probability and Statistics, Chaos and Complex Systems, Graph Theory, Voting and Social Choice Theory. Take a look at the lecture notes. The course was taught at Princeton University by Ingrid Daubechies.
  • Edsger Dijkstra once said that “Computer Science is no more about computers than Astronomy is about telescopes”. Computer Science Unplugged is a very interesting initiative whose aim is to teach the principles of Computer Science via a series of recreational learning activities that are suitable for kids of all ages.
  • If you are interested in Combinatorial Game Theory, you have to read Games of No Chance.
  • Gil Kalai wrote on strategy-proof mechanisms and an impossibility theorem. Fascinating stuff!
  • Ron Doerfler wrote three beautiful posts on the lost art of Nomography. If you like to write code in Python, you can create your own nomographs with PyNomo. If this looks pre-historical to you, be reminded that electrical engineers (such as myself) still use Smith charts :-P
  • If you are studying or teaching elementary Game Theory, give Gambit a try!! Human beings were not designed to compute Nash equilibria with pencil and paper…
  • A very interesting applied problem I recently read about is the German tank problem. During WW2, the allies wanted to estimate the number of tanks the Germans were producing each year, so that they could assess the likelihood of success of an invasion of Continental Europe. Intelligence reports were not conclusive. The allies’ statisticians were able to estimate with great precision the number of German tanks being produced from the serial numbers of captured German tanks. The fact that the Germans, in typical German fashion, numbered their tanks sequentially turned out to be quite a blunder ;-)
  • Erico Guizzo wrote a thesis on Claude Shannon and the making of Information Theory.

The next edition of the CoM will be posted in two weeks, on 360.

The Binary Marble Adding Machine

August 25, 2009 by Rod Carvalho

Matthias Wandel built a rudimentary digital computer out of wood. The ingenious Binary Marble Adding Machine is, basically, a 6-bit adder that runs on gravity and uses mechanical flip-flops for memory storage.

schematic of the marble adding machine

[ schematic courtesy of Matthias Wandel ]

Here’s a demo video:

More info on the lovely Marble Adding Machine:

Other interesting creations of Matthias Wandel:

(hat tip: Rick Regan)

Optimal Account Balancing II

July 28, 2009 by Rod Carvalho

Consider a network of n = 10 agents, each owing money to and being owed money by other agents. The debts are represented pictorially by the weighted directed graph

IOU graph - 10 nodes and 20 edgeswhich I called an IOU graph on my previous post on this topic. The IOU graph representing this network is an ordered triple \mathcal{G} = (\mathcal{N}, \mathcal{E}, w), where \mathcal{N} = \{1, 2, \ldots, 10\} is the node set (each node of the IOU graph represents a different agent), \mathcal{E} \subset \mathcal{N} \times \mathcal{N} is the edge set (containing |\mathcal{E}| = 20 elements), and w: \mathcal{E} \rightarrow \mathbb{R}^{+} is a weight function that assigns positive weights to each edge in \mathcal{E}. Each directed edge of graph \mathcal{G} is an ordered pair (i,j), where i is the edge’s tail and j is the edge’s head. If (i,j) \in \mathcal{E} then node i owes money to node j. We do not allow nodes to owe money to themselves and, therefore, for each i \in \mathcal{N} we must have (i,i) \notin \mathcal{E}. The weight of edge (i, j) \in \mathcal{E} is given by w\left( (i, j) \right) > 0, and it tells us the amount of money that node i owes to node j.

Alternatively, the information contained in the IOU graph can be encoded in the following nonnegative matrix

\left[\begin{array}{cccccccccc} 0 & 10 & 0 & 20 & 20 & 5 & 15 & 0 & 0 & 0\\ 0 & 0 & 5 & 10 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 15 & 0 & 5 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 10 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 5 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 10 & 0 & 0 & 0 & 0\\ 5 & 20 & 0 & 0 & 0 & 0 & 5 & 0 & 0 & 0\\ 0 & 15 & 0 & 0 & 0 & 0 & 0 & 20 & 0 & 10\\ 0 & 5 & 15 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ \end{array}\right]

where entry (i,j) is the amount of money node i owes to node j. Since we do not allow nodes to owe money to themselves, the entries on the main diagonal of this matrix are zero. We will call this matrix the IOU matrix. Note that the IOU matrix is the adjacency matrix of the IOU graph. Every nonnegative matrix with zero entries on the main diagonal is an admissible IOU matrix, and we can associate with it a corresponding IOU graph.

If the IOU matrix contains all the information about the IOU graph, then we can work with the IOU matrix instead of the IOU graph. Is there any advantage in doing so? I believe there is. In my opinion, the IOU graph is good for conceptual thinking and intuition, while the IOU matrix is a data structure that is convenient for computation. However, the matrix is not the only data structure that one can use to encode the information in an IOU graph. Another possibility would be to use adjacency lists, for example.

__________

Divergence and IOU Matrices

Suppose we are given an IOU graph \mathcal{G}_0 = (\mathcal{N}, \mathcal{E}_0, w_0), whose node set is \mathcal{N} = \{1, 2, \ldots, n\}. Note that (i,i) \notin \mathcal{E}_0 for each i \in \mathcal{N}, because nodes cannot owe money to themselves. We now introduce variables y_{ij} \geq 0 as follows

y_{ij} = \displaystyle\left\{\begin{array}{cl} w_0 ((i,j)) & \text{if} \quad{} (i,j) \in \mathcal{E}_0\\ 0 & \text{if} \quad{} (i,j) \notin \mathcal{E}_0\\ \end{array}\right.

and build the (nonnegative) adjacency matrix of the IOU graph \mathcal{G}_0

Y = \left[\begin{array}{ccccc} 0 & y_{12} & y_{13} & \ldots & y_{1n}\\ y_{21} & 0 & y_{23} & \ldots & y_{2n}\\ y_{31} & y_{32} & 0 & \ldots & y_{3n}\\\vdots & \vdots & \vdots & \ddots & \vdots\\ y_{n1} & y_{n2} & y_{n3} & \ldots & 0\\\end{array}\right]

which is the IOU matrix associated with the IOU graph \mathcal{G}_0. The (i,j) entry of matrix Y \in \mathbb{R}^{n \times n} denotes the amount of money that node i owes to node j. The divergence of node i \in \mathcal{N} is a scalar defined as

\mbox{div}_i (\mathcal{G}_0) = \displaystyle \sum_{j \neq i} y_{ij} - \displaystyle \sum_{j \neq i} y_{ji}

and it gives us the amount of money node i owes to other nodes minus the amount of money node i is owed to by other nodes. In terms of the IOU matrix Y, the divergence of node i \in \mathcal{N} is given by

\mbox{div}_i (\mathcal{G}_0) = e_i^T Y 1_n - e_i^T Y^T 1_n = e_i^T Y 1_n - 1_n^T Y e_i

where e_i \in \mathbb{R}^{n} is the i-th vector of the canonical basis for \mathbb{R}^n, i.e., the i-th column of the n \times n identity matrix I_n, and 1_n is an n-dimensional vector of ones. Let \mbox{vec} denote the vectorization operator, which stacks a given matrix’s columns in one large column vector. Note that if we apply operator \mbox{vec} to a scalar, we obtain the same scalar. Hence, we have that

e_i^T Y 1_n = \mbox{vec} (e_i^T Y 1_n) = (1_n^T \otimes e_i^T) \mbox{vec} (Y)

and

1_n^T Y e_i = \mbox{vec} (1_n^T Y e_i) = (e_i^T \otimes 1_n^T) \mbox{vec} (Y)

where \otimes denotes the Kronecker product. Therefore, the divergence of node i \in \mathcal{N} can be written as

\mbox{div}_i (\mathcal{G}_0) = \left[(1_n^T \otimes e_i^T) - (e_i^T \otimes 1_n^T)\right] \mbox{vec} (Y).

We define also the divergence vector of graph \mathcal{G}_0 as the n-dimensional vector whose i-th component is \mbox{div}_i (\mathcal{G}_0)

\mbox{div} (\mathcal{G}_0) = \left[\begin{array}{c} (1_n^T \otimes e_1^T) - (e_1^T \otimes 1_n^T)\\ (1_n^T \otimes e_2^T) - (e_2^T \otimes 1_n^T)\\ \vdots\\ (1_n^T \otimes e_n^T) - (e_n^T \otimes 1_n^T)\\\end{array}\right] \mbox{vec} (Y).

Alternatively, we can use equation \mbox{div}_i (\mathcal{G}_0) = e_i^T Y 1_n - e_i^T Y^T 1_n, and write the divergence vector as

\mbox{div} (\mathcal{G}_0) = I_n Y 1_n - I_n Y^T 1_n = (Y - Y^T) 1_n

and, since

I_n Y 1_n = \mbox{vec} (I_n Y 1_n) = (1_n^T \otimes I_n) \mbox{vec} (Y)

and

I_n Y^T 1_n = \mbox{vec}(I_n Y^T 1_n) = (1_n^T \otimes I_n) \mbox{vec} (Y^T),

we obtain

\mbox{div} (\mathcal{G}_0) = (1_n^T \otimes I_n) \mbox{vec} (Y) - (1_n^T \otimes I_n) \mbox{vec} (Y^T).

Note that n^2-dimensional vectors \mbox{vec} (Y) and \mbox{vec} (Y^T) contain the same entries, but in a different order. Hence, there exists a n^2 \times n^2 permutation matrix P such that

\mbox{vec} (Y^T) = P \mbox{vec} (Y)

which allows us to write

\mbox{div} (\mathcal{G}_0) = (1_n^T \otimes I_n) (I_{n^2} - P) \mbox{vec} (Y).

Do note that the permutation matrix depends on n only. It does not depend on the entries of the matrix to which operator \mbox{vec} is applied. Henceforth, we will denote \tilde{y} = \mbox{vec}(Y). Finally, we have a rather compact equation for the divergence vector

\mbox{div} (\mathcal{G}_0) = (1_n^T \otimes I_n) (I_{n^2} - P) \tilde{y}.

The divergence vector of an IOU graph \mathcal{G}_0 tells us how much money each node in the graph will lose after debts have been paid. We say that two IOU graphs (on the same node set) are equi-divergent if their divergence vectors are the same. This means that, although the debt relationships are different, both IOU graphs result in every node getting paid what he is owed and paying what he owes.

__________

Minimum Money Flow

Suppose we are given an IOU graph \mathcal{G}_0 = (\mathcal{N}, \mathcal{E}_0, w_0). We would like to find a new IOU graph \mathcal{G} = (\mathcal{N}, \mathcal{E}, w) on the same node set \mathcal{N}, such that the two IOU graphs are equi-divergent, i.e., \mbox{div} (\mathcal{G}) = \mbox{div} (\mathcal{G}_0). Moreover, we would like the new IOU graph to be “optimal” in some sense. On this post, we will consider the following optimality criterion:

minimum money flow: we want to minimize the total amount of money being exchanged between the nodes. This is the same as minimizing the sum of the weights of the IOU graph \mathcal{G}.

Let us start with matrix X \in \mathbb{R}^{n \times n}, which will be an admissible IOU matrix if we impose two constraints:

  • nonnegativity: the entries of matrix X must be nonnegative. This is equivalent to imposing the constraint that vector \tilde{x} = \mbox{vec}(X) has nonnegative entries, i.e., \tilde{x} \geq 0_{n^2}, which is equivalent to (-I_{n^2}) \tilde{x} \leq 0_{n^2}.
  • zero entries on the main diagonal: we must have x_{ii} = 0 for all i \in \mathcal{N}. Note that x_{ii} = e_i^T X e_i = \mbox{vec}(e_i^T X e_i), and therefore x_{ii} = 0 is equivalent to (e_i^T \otimes e_i^T) \tilde{x} = 0. Hence, imposing the constraint that the entries on the main diagonal be zero is equivalent to imposing the n equality constraints

\left[\begin{array}{c} e_1^T \otimes e_1^T\\ e_2^T \otimes e_2^T\\ \vdots\\ e_n^T \otimes e_n^T\\\end{array}\right] \tilde{x} = 0_n.

Once these two constraints have been imposed, matrix X is the adjacency matrix of an IOU graph. Finally, we must impose the constraint that the divergence vector of the new IOU graph is the same as the divergence vector of the given IOU graph, i.e., \mbox{div} (\mathcal{G}) = \mbox{div} (\mathcal{G}_0), which yields

(1_n^T \otimes I_n) (I_{n^2} - P) \tilde{x} = \mbox{div} (\mathcal{G}_0)

where \mbox{div} (\mathcal{G}_0) = (Y - Y^T) 1_n can be computed from the given IOU graph. Considering the minimum money flow criterion, the cost function to be minimized is the following

\displaystyle\sum_{i \in \mathcal{N}} \sum_{j \neq i} x_{ij} = 1_n^T X 1_n = (1_n^T \otimes 1_n^T) \tilde{x} = 1_{n^2}^T \tilde{x}

and, hence, we obtain the following linear program in \tilde{x} \in \mathbb{R}^{n^2}

\begin{array}{ll} \text{minimize} & \displaystyle 1_{n^2}^T \tilde{x}\\\\ \text{subject to} & (1_n^T \otimes I_n) (I_{n^2} - P) \tilde{x} = \mbox{div} (\mathcal{G}_0)\\\\ & \left[\begin{array}{c} e_1^T \otimes e_1^T\\ e_2^T \otimes e_2^T\\ \vdots\\ e_n^T \otimes e_n^T\\\end{array}\right] \tilde{x} = 0_n\\\\ & (-I_{n^2}) \tilde{x} \leq 0_{n^2}\\\\\end{array}

which has 2 n equality and n^2 inequality constraints. One can easily solve this linear program with MATLAB using function linprog.

However, this formulation looks a bit silly. Note that initially we have n^2 degrees of freedom (the n^2 entries of matrix X), then we impose n equality constraints x_{ii} = 0 to ensure that the entries on the main diagonal are zero, which leaves us with m = n^2 - n = n (n-1) degrees of freedom. If we already know that the entries on the main diagonal are zero, why consider them as decision variables? Wouldn’t it make much more sense to consider only the m = n (n-1) entries off the main diagonal as decision variables?

Let R be a m \times n^2 binary matrix such that \hat{x} = R \tilde{x} is the reduced vector of m decision variables, containing solely the entries of X off the main diagonal. We now eliminate the n columns of n \times n^2 matrix (1_n^T \otimes I_n) (I_{n^2} - P) which were to be multiplied by the x_{ii} variables. We do so by multiplying matrix (1_n^T \otimes I_n) (I_{n^2} - P) on the right by R^T, which produces a n \times m matrix (1_n^T \otimes I_n) (I_{n^2} - P) R^T.

The new vector of decision variables is \hat{x} \in \mathbb{R}^m. Since we no longer consider the entries on the main diagonal as decision variables, we don’t need to impose the n equality constraints x_{ii} = 0. Hence, we obtain a lower-dimensional linear program

\begin{array}{ll} \text{minimize} & \displaystyle 1_{m}^T \hat{x}\\\\ \text{subject to} & (1_n^T \otimes I_n) (I_{n^2} - P) R^T \hat{x} = \mbox{div} (\mathcal{G}_0)\\\\ & (-I_m) \hat{x} \leq 0_{m}\\\\\end{array}

where we now have n equality constraints (divergences at the nodes), and m inequality constrains (x_{ij} \geq 0). My intuition tells me that

(1_n^T \otimes I_n) (I_{n^2} - P) R^T = \left[\begin{array}{c} (1_n^T \otimes e_1^T) - (e_1^T \otimes 1_n^T)\\ (1_n^T \otimes e_2^T) - (e_2^T \otimes 1_n^T)\\ \vdots\\ (1_n^T \otimes e_n^T) - (e_n^T \otimes 1_n^T)\\\end{array}\right] R^T

is an incidence matrix of the directed complete graph on node set \mathcal{N}. What do you think? Do you agree?

__________

A Numerical Example

Consider a network of n = 10 agents whose debts are represented by the following IOU graph (with 15 edges)

IOU graph - 10 nodes and 15 edgesThe IOU matrix associated with this given IOU graph is

Y = \left[\begin{array}{cccccccccc} 0 & 0 & 0 & 20 & 20 & 5 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 10 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 5 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 10 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 5 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 10 & 0 & 0 & 0 & 0\\ 5 & 20 & 0 & 0 & 0 & 0 & 5 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 20 & 0 & 10\\ 0 & 5 & 15 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ \end{array}\right]

Solving the linear program, we obtain a new IOU matrix

X = \left[\begin{array}{cccccccccc} 0 & 6.097 & 3.924 & 10.776 & 10.776 & 8.427 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0.850 & 0.644 & 1.234 & 1.234 & 1.039 & 0 & 0 & 0 & 0\\ 0 & 1.690 & 1.185 & 2.505 & 2.505 & 2.114 & 0 & 0 & 0 & 0\\ 0 & 4.672 & 3.063 & 7.980 & 7.980 & 6.306 & 0 & 0 & 0 & 0\\ 0 & 1.690 & 1.185 & 2.505 & 2.505 & 2.114 & 0 & 0 & 0 & 0\\\end{array}\right]

which is the adjacency matrix of a new IOU graph. The given IOU graph and the new IOU graph are equi-divergent. The given IOU graph had 15 edges and a total of 165 flowing in it, while the new IOU graph has 25 edges and a total of 95 flowing in it. While the flow of money has been reduced, the number of edges (i.e., the number of transactions) has increased dramatically!!

On my previous post on this topic, I gave two very simple examples, and in both of them the minimum money flow solution also yielded the least number of transactions. I wondered back then if that was always the case. This example clearly shows that it is not. Generally speaking, not only does the minimum money flow criterion fail to minimize the number of transactions, but it can actually result in more transactions than the ones in the initial IOU graph. I cannot think of any reason why minimizing the total amount of money flowing in the IOU graph at the cost of a larger number of transactions would be a good idea…

Therefore, the next step is to abandon the minimum money flow criterion, and focus solely on the least number of transactions. Minimizing the number of transactions is a hard combinatorial problem, but maybe there are approximation algorithms that make the problem tractable.

Analytical Geometry with POV-Ray

July 28, 2009 by Rod Carvalho

If you would like to use POV-Ray to visualize 3-dimensional geometrical objects, I highly recommend Friedrich A. Lohmueller’s beautiful tutorial on Analytical Geometry with POV-Ray. Other POV-Ray tutorials by Friedrich A. Lohmueller can be found here.

The Shadow of a Pyramid (by Friedrich A. Lohmueller, 2007)

[ image courtesy of Friedrich A. Lohmueller ]

In my humble opinion, this is how Analytical Geometry should be taught. Descartes had no access to 3-d graphics. We do. Why not take advantage of the technology?

Information, Physics and Computation

July 20, 2009 by Rod Carvalho

Andrea Montanari and Marc Mézard have written a book: Information, Physics and Computation. It is a unique, ambitious book which covers the exciting intersection of Statistical Physics,  Theoretical Computer Science, Discrete Mathematics, Information Theory, and Coding Theory.

parity check code & factor graph (small)

[ graphs courtesy of Montanari & Mézard ]

The book was published recently. Hopefully, the PDF files will continue to be available online. If you liked David MacKay’s gorgeous book, you will probably like this one too.