## Posts Tagged ‘Mathematics’

### Matrix decompositions with sparsity constraints

December 12, 2008

Suppose we are given a square matrix $A \in \mathbb{R}^{n \times n}$. We would like to factorize $A$ into a finite number $N$ of matrices

$A = W_{N-1} W_{N-2} \cdots W_{2} W_{1} W_{0}$,

where factors $W_0, W_1, \ldots, W_{N-1} \in \mathbb{R}^{n \times n}$ are required to have a certain sparsity pattern, i.e., a subset of their $n^2$ entries must be zero.

Definition: Let $\mathcal{I}_n = \{1, 2, \ldots, n\}$ and $\mathcal{I}_n^2 = \mathcal{I}_n \times \mathcal{I}_n$. A sparsity pattern $\mathcal{S}_n$ is a set of ordered pairs $(i,j) \in \mathcal{I}_n^2$ that indicate which entries of an $n \times n$ matrix are constrained to be zero.

Note that $\mathcal{S}_n \subseteq \mathcal{I}_n^2$. If there are no sparsity constraints, then $\mathcal{S}_n$ is the empty set. The set $\mathcal{I}_n^2 \setminus \mathcal{S}_n$ contains the indices of the entries of an $n \times n$ matrix which are unconstrained, i.e., “free”.

__________

Problem: Given a matrix $A \in \mathbb{R}^{n \times n}$ and a particular sparsity pattern $\mathcal{S}_n$, how can we determine whether it is possible to factorize $A$ into $N$ factor matrices (where $N$ is finite) with the specified sparsity pattern? If such a factorization is possible, how can we compute $N$ and each factor $W_0, W_1, \ldots, W_{N-1}$?

__________

This problem is easy to pose, but I have found it very hard to attack. I have tried various approaches… and none worked. Does anyone have any ideas?

__________

Example: suppose we are given a $5 \times 5$ real matrix

$A = \left[\begin{array}{ccccc} 1 & 1 & 1 & 1 & 1\\ 1 & 1 & 1 & 1 & 1\\ 1 & 1 & 1 & 1 & 1\\ 1 & 1 & 1 & 1 & 1\\ 1 & 1 & 1 & 1 & 1\\ \end{array}\right]$.

We would like to factorize matrix $A$ into $N$ factors $W_0, W_1, \ldots, W_{N-1} \in \mathbb{R}^{5 \times 5}$ whose common sparsity pattern is represented pictorially by

$\left[\begin{array}{ccccc} \times & \times & 0 & 0 & 0\\ \times & \times & \times & 0 & 0\\ 0 & \times & \times & \times & 0\\ 0 & 0 & \times & \times & \times\\ 0 & 0 & 0 & \times & \times\\\end{array}\right]$,

where a “$\times$” in the $(i,j)$ entry means that the $(i,j)$ entry is unconstrained, i.e., it is allowed to take any value in $\mathbb{R}$. If you are familiar with numerical PDEs, you have certainly encountered matrices with this sparsity pattern when solving diffusion or wave equations in one spatial dimension using finite difference methods.
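As a concrete aid (my addition, not part of the original post), the tridiagonal pattern above can be encoded as the set $\mathcal{S}_5$ of constrained index pairs, here using 0-based indices; the function names are illustrative assumptions:

```python
def tridiagonal_sparsity_pattern(n):
    """Constrained (zero) entries S_n for the banded pattern above.

    Free entries are those with |i - j| <= 1; everything else must be zero.
    Indices are 0-based.
    """
    return {(i, j) for i in range(n) for j in range(n) if abs(i - j) > 1}

def respects_pattern(W, pattern):
    """Check that matrix W (list of lists) is zero on every constrained entry."""
    return all(W[i][j] == 0 for (i, j) in pattern)

S5 = tridiagonal_sparsity_pattern(5)
assert len(S5) == 12  # 25 entries total: 13 free (|i - j| <= 1), 12 constrained

# The 5x5 identity trivially respects the pattern (all off-diagonal entries are zero)
I5 = [[1 if i == j else 0 for j in range(5)] for i in range(5)]
assert respects_pattern(I5, S5)
```

Encoding the pattern this way at least makes it easy to test whether a candidate factorization produced by some search procedure satisfies the constraints.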

__________

Even though this example seems fairly “simple” (it’s only a $5 \times 5$ matrix after all!), I have not yet found an approach that works.

### Building a polynomial from its roots II

December 10, 2008

Suppose we are given an $n$-tuple of (not necessarily distinct) complex numbers $(w_1, w_2, \ldots, w_n)$ and we construct a monic univariate polynomial of degree $n$ over field $\mathbb{C}$ whose $n$ roots are the elements of $(w_1, w_2, \ldots, w_n)$

$p_n(z) = \displaystyle\prod_{i=1}^n (z - w_i) = \displaystyle\sum_{m=0}^n a(m,n) z^m$,

where $a(m,n) \in \mathbb{C}$ is the coefficient of monomial $z^m$ of degree $m$ in polynomial $p_n(z)$ of degree $n$. This double-indexing might seem a bit confusing, but its usefulness will be made evident later on.

Usually, we take a univariate polynomial and try to find its roots. Now we are interested in the reverse operation, i.e., we would like to find the values of the coefficients $\{a(m,n)\}_{m=0}^{n}$ of polynomial $p_n(z)$ in terms of the roots $(w_1, w_2, \ldots, w_n)$. Note that every permutation of the $n$-tuple $(w_1, w_2, \ldots, w_n)$ results in the same monic polynomial of degree $n$.

_____

Constructing polynomials

Say we know the coefficients of polynomial $p_k(z)$ of degree $k$. Then we can write the coefficients $\{a(m, k+1)\}_{m=0}^{k+1}$ of polynomial $p_{k+1}(z)$ in terms of the coefficients $\{a(m, k)\}_{m=0}^{k}$ and the $(k+1)$-th element of the $n$-tuple $(w_1, w_2, \ldots, w_n)$. In my previous post on this topic, I derived the recursion

$a(m,k+1) = \displaystyle\left\{\begin{array}{rl} - w_{k+1} a(0,k) & \text{if} \quad{} m = 0\\ a(m-1,k) - w_{k+1} a(m,k) & \text{if} \quad{} 1 \leq m \leq k\\ a(k,k) & \text{if} \quad{} m=k+1\\ \end{array}\right.$

Let us start with a monic polynomial of degree zero, $p_0(z) = 1$, whose only coefficient is $a(0,0) = 1$. We take the first element of the $n$-tuple $(w_1, w_2, \ldots, w_n)$ and construct the monic polynomial $p_1(z) = z - w_1$ of degree $1$ using the recursion above. Then we take the 2nd root $w_2$ and construct the monic polynomial of degree $2$ whose roots are $w_1, w_2$

$p_2(z) = z^2 - (w_1 + w_2) z + w_1 w_2$.

Iterating successively, we obtain a monic polynomial $p_n(z)$ of degree $n$ whose roots are $w_1, w_2, \ldots, w_n$.
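The iteration just described translates directly into code. Here is a minimal sketch in Python (my addition; the function name is an illustrative choice), applying the recursion above once per root:

```python
def poly_from_roots(roots):
    """Coefficients a(0,n), ..., a(n,n) of the monic polynomial with the given roots.

    Starts from p_0(z) = 1 and applies, once per root w_{k+1}:
      a(0, k+1)   = -w_{k+1} a(0, k)
      a(m, k+1)   =  a(m-1, k) - w_{k+1} a(m, k)   for 1 <= m <= k
      a(k+1, k+1) =  a(k, k)
    """
    a = [1]  # p_0(z) = 1, so a(0,0) = 1
    for w in roots:
        k = len(a) - 1
        nxt = [0] * (k + 2)
        nxt[0] = -w * a[0]
        for m in range(1, k + 1):
            nxt[m] = a[m - 1] - w * a[m]
        nxt[k + 1] = a[k]
        a = nxt
    return a

# p_2(z) = z^2 - (w_1 + w_2) z + w_1 w_2, with w_1 = 2, w_2 = 3
assert poly_from_roots([2, 3]) == [6, -5, 1]
```

Complex roots work unchanged, since Python arithmetic handles `complex` operands transparently.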

_____

A graphical approach

Consider the following set of 15 nodes arranged in 5 columns and 5 rows, where column $j$ contains the $j+1$ nodes in rows $0$ through $j$ (so there are $1 + 2 + \cdots + 5 = 15$ nodes in total). Each node is labelled with an ordered pair $(i,j)$, where $i$ is the row index and $j$ is the column index.

We now attach a complex number $a(i,j)$ to each node $(i,j)$. The coefficient of monomial $z^m$ in polynomial $p_n(z)$ can be thought of as the number attached to node $(m,n)$. For example, the coefficients of polynomial $p_2(z)$ are the numbers attached to the nodes of the 3rd column (note that we count rows and columns starting at the zero index, like in the C programming language). The double-indexing now makes some sense ;-)

Finally, we add directed edges linking nodes in adjacent columns, and obtain the following directed graph (which reminds me of the trellis diagrams used in convolutional coding)

Recall the recursion formula which allows us to write the coefficients of polynomial $p_{k+1}(z)$ in terms of the coefficients of polynomial $p_{k}(z)$

$a(m,k+1) = \displaystyle\left\{\begin{array}{rl} - w_{k+1} a(0,k) & \text{if} \quad{} m = 0\\ a(m-1,k) - w_{k+1} a(m,k) & \text{if} \quad{} 1 \leq m \leq k\\ a(k,k) & \text{if} \quad{} m=k+1\\ \end{array}\right.$.

Since the coefficients $a(\cdot, \cdot)$ are attached to the nodes of the directed graph depicted above, we can use the recursion formula to compute the values attached to the nodes of a given column as a function of the values attached to the nodes in the column to the left.

This graphical approach is (most likely) of no practical use. Nonetheless, this approach does reveal the “structure” of the process of constructing a polynomial from its roots, and such structure is of much greater beauty than cumbersome algebraic manipulation.

_____

An example

Suppose we want to compute the coefficients of the monic polynomial $p_4(z) = (z - 1)^4$, which has a root of multiplicity $4$ at $z = 1$. In this case we have $w_1 = w_2 = w_3 = w_4 = 1$. From the well-known binomial theorem, we know that

$p_4(z) = z^4 - 4 z^3 + 6 z^2 - 4 z + 1$,

and therefore, the coefficients are $a(0,4) = 1$, $a(1,4) = -4$, $a(2,4) = 6$, $a(3,4) = -4$, and $a(4,4) = 1$. Note that the absolute values of these coefficients are the binomial coefficients and can be found in the 5th row of Pascal’s triangle.

Let us now construct this polynomial one step at a time using the graphical approach described above. We start with monic polynomial $p_0(z) = 1$, whose only coefficient is $a(0,0) = 1$

and then we compute the coefficients of the monic polynomial $p_1(z) = z-1$ using the recursion formula. Graphically, this means that the values of the 2 nodes in the 2nd column can be computed from $a(0,0)$, as follows

where the directed edges now have weights. For example, the edge linking nodes $(0,0)$ and $(0,1)$ has weight $-1$, while the edge linking $(0,0)$ and $(1,1)$ has weight $1$. If the weight of an edge is equal to $1$, then we do not label the edge in order to avoid cluttering the diagram.

From the coefficients of $p_1(z)$ and the second root, $w_2 = 1$, we can compute the coefficients of $p_2(z)$ using the recursion again, i.e., the values of the nodes in the 3rd column can be computed from the values in the nodes in the 2nd column, as follows

The values of the nodes in the 3rd column are known, and since the 3rd root is $w_3 = 1$ we can now obtain the coefficients of $p_3(z)$

and since $w_4 = 1$, we can finally obtain the coefficients of the monic polynomial $p_4(z)$, which are the values of the nodes in the 5th column

In this example the roots I chose have zero imaginary parts, but the approach works just as well for roots with nonzero imaginary parts. This method can, of course, also be used to compute the coefficients of polynomials of degree greater than $4$.
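As a quick sanity check (my addition, not in the original post), the coefficients computed in this example can be recovered directly from the binomial theorem, since $a(m,4) = (-1)^{4-m}\binom{4}{m}$:

```python
from math import comb

# Signed binomial coefficients of p_4(z) = (z - 1)^4, listed as a(0,4), ..., a(4,4)
coeffs = [(-1) ** (4 - m) * comb(4, m) for m in range(5)]
assert coeffs == [1, -4, 6, -4, 1]
```

The absolute values `1, 4, 6, 4, 1` are exactly the 5th row of Pascal's triangle, as noted above.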

### Building a polynomial from its roots

November 22, 2008

Suppose we are given a set of $n \geq 2$ distinct real numbers $\mathcal{R} = \{r_1, r_2, \ldots, r_n\}$, and we build a monic univariate polynomial (over field $\mathbb{R}$) of degree $n$ whose $n$ distinct roots are the elements of set $\mathcal{R}$

$p_n(x) = \displaystyle\prod_{i=1}^n (x - r_i) = \displaystyle\sum_{m=0}^n a_{m,n} x^m$,

where $a_{m,n} \in \mathbb{R}$ is the coefficient of monomial $x^m$. The second subscript in $a_{m,n}$ tells us that the coefficient corresponds to a polynomial whose degree is $n$.

Problem: for $n \geq 2$, what are the values of the coefficients $\{a_{m,n}\}_{m=0}^{n}$ in terms of the roots $\{r_i\}_{i=1}^n$?

At first glance, this problem seems elementary. However, I had never thought of it and I found it quite interesting. I prefer to think of examples before attempting to see the “big picture”, so let us consider the simplest cases.

_____

Case $n=1$

The monic polynomial of degree $1$ whose root is $r_1$ is

$p_1(x) = x - r_1$

and the coefficients are

$\left[\begin{array}{c}a_{0,1}\\a_{1,1}\end{array}\right] = \left[\begin{array}{c} -r_1\\1\end{array}\right]$.

_____

Case $n=2$

The monic polynomial of degree $2$ whose roots are $r_1, r_2$ is

$\begin{array}{rl} p_2(x) &= (x - r_1) (x - r_2)\\ &= x^2 - (r_1 + r_2) x + r_1 r_2\end{array}$

and the coefficients are

$\left[\begin{array}{c}a_{0,2}\\a_{1,2}\\a_{2,2}\end{array}\right] = \left[\begin{array}{c}r_1 r_2\\-(r_1+r_2)\\1\end{array}\right]$.

_____

Case $n=3$

The monic polynomial of degree $3$ whose roots are $r_1, r_2, r_3$ is

$\begin{array}{rl} p_3(x) &= (x - r_1) (x - r_2) (x - r_3)\\ &= x^3 - (r_1+r_2 + r_3) x^2 + (r_1 r_2 + r_1 r_3 + r_2 r_3) x - r_1 r_2 r_3\\\end{array}$

and the coefficients are

$\left[\begin{array}{c}a_{0,3}\\a_{1,3}\\a_{2,3}\\a_{3,3}\end{array}\right] = \left[\begin{array}{c} - r_1 r_2 r_3\\ r_1 r_2 + r_1 r_3 + r_2 r_3\\ -(r_1+r_2 + r_3)\\1\end{array}\right]$.

_____

The coefficients $\{a_{m,n}\}_{m=0}^{n}$ can be computed in a recursive fashion. Consider the monic polynomials

$p_{k}(x) = \displaystyle\prod_{i=1}^{k} (x - r_i)$

and

$p_{k+1}(x) = \displaystyle\prod_{i=1}^{k+1} (x - r_i) = (x - r_{k+1}) \displaystyle\prod_{i=1}^k (x - r_i)$,

then $p_{k+1}(x) = (x - r_{k+1}) p_k(x)$, and thus

$\begin{array}{rl} p_{k+1}(x) &= (x - r_{k+1})\displaystyle\sum_{m=0}^k a_{m,k} x^m\\ &= \displaystyle\sum_{m=0}^k a_{m,k} x^{m+1} - \displaystyle\sum_{m=0}^k r_{k+1} a_{m,k} x^m.\\\end{array}$

Note that

$\begin{array}{rl} \displaystyle\sum_{m=0}^k a_{m,k} x^{m+1} &= \displaystyle\sum_{m=1}^k a_{m-1,k} x^m + a_{k,k} x^{k+1}\\ \displaystyle\sum_{m=0}^{k} r_{k+1} a_{m,k} x^m &= r_{k+1} a_{0,k} + \displaystyle\sum_{m=1}^k r_{k+1} a_{m,k} x^m\end{array}$

and therefore

$p_{k+1}(x) = a_{k,k} x^{k+1} + \displaystyle\sum_{m=1}^k \left(a_{m-1,k} - r_{k+1} a_{m,k}\right) x^m - r_{k+1} a_{0,k}$.

We can write $p_{k+1}(x)$ in the expanded form in terms of the coefficients $\{a_{m,k+1}\}_{m=0}^{k+1}$

$p_{k+1}(x) = a_{k+1,k+1} x^{k+1} + \displaystyle\sum_{m=1}^k a_{m,k+1} x^m + a_{0,k+1}$

and therefore we can write the coefficients $\{a_{m,k+1}\}_{m=0}^{k+1}$ in terms of the coefficients $\{a_{m,k}\}_{m=0}^{k}$, as follows

$a_{m,k+1} = \displaystyle\left\{\begin{array}{rl} - r_{k+1} a_{0,k} & \text{if} \quad{} m = 0\\ a_{m-1,k} - r_{k+1} a_{m,k} & \text{if} \quad{} 1 \leq m \leq k\\ a_{k,k} & \text{if} \quad{} m=k+1\\ \end{array}\right.$

Hence, starting with the zero-degree monic polynomial $p_0(x)$, whose only coefficient is $a_{0,0} = 1$, we can build $p_1(x)$, the monic polynomial of degree $1$ whose root is $r_1$. From $p_1(x)$, and given a second root $r_2$, we can build $p_2(x)$, the monic polynomial of degree $2$ whose roots are $r_1, r_2$. Iterating successively, we can build $p_n(x)$, the monic polynomial of degree $n$ whose roots are $r_1,r_2, \ldots, r_n$.
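The recursion can be checked numerically against the closed-form coefficients worked out for the $n = 3$ case above. A small sketch (my addition; the function name is an illustrative choice):

```python
def coeffs_via_recursion(roots):
    """Apply the a(m, k+1) recursion, one root at a time, starting from a(0,0) = 1."""
    a = [1.0]
    for r in roots:
        k = len(a) - 1
        nxt = [0.0] * (k + 2)
        nxt[0] = -r * a[0]
        for m in range(1, k + 1):
            nxt[m] = a[m - 1] - r * a[m]
        nxt[k + 1] = a[k]
        a = nxt
    return a

r1, r2, r3 = 2.0, -1.0, 3.0
expected = [-r1 * r2 * r3,                # a_{0,3}
            r1 * r2 + r1 * r3 + r2 * r3,  # a_{1,3}
            -(r1 + r2 + r3),              # a_{2,3}
            1.0]                          # a_{3,3}
assert coeffs_via_recursion([r1, r2, r3]) == expected
```

The coefficients agree with the elementary symmetric functions of the roots (Vieta's formulas), up to alternating signs.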

### Representing complex numbers as 2×2 matrices

November 9, 2008

We can represent complex numbers as $2 \times 2$ real matrices, such that arithmetic operations $(+, -, \times, \div)$ on complex numbers become equivalent to arithmetic operations on their matrix representations.

Consider a matrix-valued function $M: \mathbb{C} \rightarrow \mathbb{R}^{2 \times 2}$ defined as

$M(z) = \left[\begin{array}{cc} \Re\{z\} & -\Im\{z\}\\ \Im\{z\} & \Re\{z\}\\ \end{array}\right]$,

where $\Re\{z\}$ and $\Im\{z\}$ are the real and imaginary parts of $z \in \mathbb{C}$, respectively. It can be shown that for all $z_1, z_2 \in \mathbb{C}$, we have that

• $M(z_1 + z_2) = M(z_1) + M(z_2)$.
• $M(z_1 - z_2) = M(z_1) - M(z_2)$.
• $M(z_1 z_2) = M(z_1) M(z_2)$.
• $M(z_1 / z_2) = [M(z_2)]^{-1} M(z_1)$, if $z_2 \neq 0$.

There seems to be a (ring) homomorphism here.

__________

Matrix Representation of a Complex Number

Let $z = a + i b$, where $i = \sqrt{-1}$ and $a,b \in \mathbb{R}$, be a complex number. Its matrix representation is

$M(z) = M(a + i b) = \left[\begin{array}{cc} a & -b\\ b & a\\ \end{array}\right] = a \left[\begin{array}{cc} 1 & 0\\ 0 & 1\\ \end{array}\right] + b \left[\begin{array}{cc} 0 & -1\\ 1 & 0\\ \end{array}\right]$.

Let matrices $I_2, Q \in \mathbb{R}^{2 \times 2}$ be

$I_2 = \left[\begin{array}{cc} 1 & 0\\ 0 & 1\\ \end{array}\right], \quad{} Q = \left[\begin{array}{cc} 0 & -1\\ 1 & 0\\ \end{array}\right]$,

then the matrix representation of $z \in \mathbb{C}$ is

$M(z) = M(a + i b) = a I_2 + b Q$.

Note that both $I_2$ and $Q$ are orthogonal matrices, and therefore $Q^{-1} = Q^T$. Since $I_2$ is the $2 \times 2$ identity matrix, we have that $I_2$ and $Q$ commute, i.e., $I_2 Q = Q I_2$.

The matrix representation of the complex conjugate of $z$ is

$M(z^*) = M(a - i b) = a I_2 - b Q = M^T (z)$,

which follows from the fact that matrix $Q$ is skew-symmetric, i.e., $Q^T = -Q$. The complex conjugate of a complex number can thus be computed by transposing its matrix representation. Since $Q^T = Q^{-1}$ and $Q^T = - Q$, we have $Q^2 = - I_2$.

The square of the magnitude of a complex number, $|z|^2 = zz^*$, is equal to the determinant of its matrix representation, as follows

$\det(M(z)) = \det(M(a + i b)) = \begin{vmatrix} a & -b \\ b & a\\ \end{vmatrix} = a^2 + b^2 = |z|^2$.

A complex number written in Cartesian form, $z = a + i b$, can be converted to its polar form, $z = \rho e^{i \theta}$, where $\rho = \sqrt{a^2 + b^2}$ and $\theta = \arg(z)$. Given a complex number in the polar form $z = \rho e^{i \theta}$, its Cartesian form is

$z = \rho (\cos(\theta) + i \sin(\theta))$,

and its matrix representation is

$M(z) = \rho \left[\begin{array}{cc} \cos(\theta) & -\sin(\theta)\\ \sin(\theta) & \cos(\theta)\\ \end{array}\right] = \rho R(\theta)$,

where

$R(\theta) = \left[\begin{array}{cc} \cos(\theta) & -\sin(\theta)\\ \sin(\theta) & \cos(\theta)\\ \end{array} \right]$

is a rotation matrix that rotates a vector in $\mathbb{R}^2$ counterclockwise by an angle of $\theta$ radians, leaving the vector's $\ell_2$ norm unchanged. Matrix $R(\theta)$ is orthogonal, and thus $R^{-1}(\theta) = R^T(\theta)$. Given that

$\sin(-\theta) = - \sin(\theta), \quad{} \cos(-\theta) = \cos(\theta)$,

we have $R(-\theta) = R^T(\theta) = R^{-1}(\theta)$. As $\det(R(\theta)) = 1$ for any value of $\theta$, we can conclude that $R(\theta) \in \text{SO}(2)$.

Finally, note that $Q = R(\frac{\pi}{2})$, i.e., matrix $Q$ rotates a vector in $\mathbb{R}^2$ counterclockwise by $\pi/2$ radians. If we multiply a complex number by $i$, we rotate it by a counterclockwise angle of $\pi/2$ radians

$i z = i \rho e^{i \theta} = \rho e^{i (\theta + \frac{\pi}{2})}$,

and in terms of matrix representations, this is equivalent to multiplying $M(z)$ by matrix $Q$. It does not matter whether we multiply by $Q$ on the left or on the right, because $M(z) = a I_2 + b Q$ is a linear combination of $I_2$ and $Q$, both of which commute with $Q$.

__________

Arithmetic Operations

Let $z_1 = a_1 + i b_1$ and $z_2 = a_2 + i b_2$ be two complex numbers. Their addition/subtraction is thus

$z_1 \pm z_2 = (a_1 \pm a_2) + i (b_1 \pm b_2)$,

and from the rules of matrix addition/subtraction it can easily be shown that $M(z_1) \pm M(z_2) = M(z_1 \pm z_2)$.

It’s easier to compute products of complex numbers if we write them in the polar form, i.e., $z_1 = \rho_1 e^{i \theta_1}$ and $z_2 = \rho_2 e^{i \theta_2}$. The product is thus $z_1 z_2 = \rho_1 \rho_2 e^{i (\theta_1 + \theta_2)}$. The product of the matrix representations is

$M(z_1) M(z_2) = \rho_1 R(\theta_1) \rho_2 R(\theta_2) = \rho_1 \rho_2 R(\theta_1 + \theta_2) = M(z_1 z_2)$,

since $R(\theta_1) R(\theta_2) = R(\theta_1 + \theta_2)$. Dividing $z_1 \in \mathbb{C}$ by $z_2 \in \mathbb{C} \setminus \{0\}$ yields

$\displaystyle\frac{z_1}{z_2} = \frac{\rho_1}{\rho_2} e^{i (\theta_1 - \theta_2)}$,

whose corresponding matrix representation is

$\displaystyle M(z_1 / z_2) = \frac{\rho_1}{\rho_2} R(\theta_1 - \theta_2)$.

Note that

$M^{-1} (z_2) M(z_1) = \displaystyle \frac{\rho_1}{\rho_2} R(-\theta_2) R(\theta_1) = \frac{\rho_1}{\rho_2} R(\theta_1 - \theta_2)$

and therefore $M^{-1} (z_2) M(z_1) = M(z_1 / z_2)$.

We can conclude that arithmetic operations on complex numbers are equivalent to arithmetic operations on the matrices representing such complex numbers.
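These equivalences are easy to verify numerically. A minimal sketch in pure Python (my addition; the helper `M` mirrors the notation above, and `matmul`/`matadd` are illustrative names):

```python
def M(z):
    """2x2 real matrix [[Re z, -Im z], [Im z, Re z]] representing complex z."""
    return [[z.real, -z.imag], [z.imag, z.real]]

def matmul(A, B):
    """Product of two 2x2 matrices given as lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(A, B):
    """Entrywise sum of two 2x2 matrices."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

z1, z2 = 3 + 4j, 1 - 2j

# addition and multiplication of complex numbers map to the matrix operations
assert matadd(M(z1), M(z2)) == M(z1 + z2)
assert matmul(M(z1), M(z2)) == M(z1 * z2)

# det(M(z)) = |z|^2
a, b = z1.real, z1.imag
assert a * a + b * b == abs(z1) ** 2
```

The exact integer-valued inputs make the equality checks safe here; with general floating-point values, one would compare entries up to a tolerance instead.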

I have absolutely no idea what use this matrix representation could possibly have, but it’s quite interesting nonetheless.

__________


January 18, 2008

Consider a set $S$ with $|S|=n$ elements. We can build ordered lists of $n$ elements by taking elements from set $S$ and using them (with no repetition) to build $n$-tuples. With $n$ elements we can build $n!$ distinct $n$-tuples.

Let us think in terms of vectors in $\mathbb{R}^n$, where each vector represents a particular ordered list of the elements of some set with $n$ elements. If $x, y \in \mathbb{R}^n$ are permutations of one another, then they contain the same elements, but in a different order; consequently, for every $p \in \mathbb{N}$ we have

$\displaystyle\sum_{i=1}^n x_i^p = \displaystyle\sum_{i=1}^n y_i^p$.

Let $I: \mathbb{R}^n \times \mathbb{R}^n \rightarrow \{0,1\}$ be an indicator function, such that $I(x,y) = 1$ if and only if $x,y$ are permutations of one another ($I(x,y)=0$ otherwise). Let $g_p: \mathbb{R}^n \rightarrow \mathbb{R}$ be defined as

$g_p(z) = \displaystyle\sum_{i=1}^n z_i^p$,

for all $p \in \mathbb{N}$. We can then conclude that

$I(x,y) = 1 \Rightarrow g_p(x) = g_p(y), \quad{} \forall p \in \mathbb{N}$.

The condition $g_p(x) = g_p(y)$ for all $p \in \mathbb{N}$ is therefore necessary. Is it also sufficient? In other words, if $g_p(x) = g_p(y)$ for all $p \in \mathbb{N}$, can we conclude that vectors $x,y$ are permutations of one another?

This is an interesting problem. Think about it!
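A quick numerical illustration (my addition): the power sums agree for vectors that are permutations of one another, while agreement of the first power sum alone does not force a permutation:

```python
def g(z, p):
    """Power sum of order p over the entries of vector z."""
    return sum(zi ** p for zi in z)

x = [2, 5, 3]
y = [5, 3, 2]  # a permutation of x
assert all(g(x, p) == g(y, p) for p in range(1, 6))

u = [1, 3]
v = [2, 2]  # not a permutation of u, yet g_1 agrees
assert g(u, 1) == g(v, 1)
assert g(u, 2) != g(v, 2)
```

So any single $p$ is not enough on its own; the question is whether the full family of conditions, taken together, pins the vectors down up to permutation.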

Update (Jan. 25, 2008): this post was featured on the 25th edition (Silver Jubilee) of the Carnival of Mathematics.