Archive for June, 2012

June 29, 2012

Last time I implemented a Runge-Kutta method was in December 2001. Back then, I implemented the classical 4th order Runge-Kutta (RK4) method in C. However, in the past decade, whenever I needed a numerical solution of an ordinary differential equation (ODE), I used MATLAB. Now, I would like to use Haskell instead.

Consider the following initial value problem (IVP)

$\dot{x} (t) = f ( x (t) )$,     $x (t_0) = x_0$

where the vector field $f : \mathbb{R}^n \to \mathbb{R}^n$ and the initial condition $x_0$ are given. Solving an instance of the IVP consists of finding a function $x : [t_0, \infty) \to \mathbb{R}^n$ that satisfies both the differential equation and the initial condition.

Let $h > 0$ be the time-step, and let $t_k = t_0 + k h$ be the $k$-th time instant, where $k \in \mathbb{N}_0 := \{0, 1, 2, \dots\}$. Suppose that we solve a given instance of the IVP and obtain a closed-form solution $x : [t_0, \infty) \to \mathbb{R}^n$; we then discretize the closed-form solution to obtain an infinite sequence $x_0, x_1, x_2, \dots$ where $x_k = x (t_k)$. However, not all instances of the IVP have a closed-form solution (or even an analytic solution), and even in the cases where a closed-form solution does exist, it can be quite hard to find such a solution. Therefore, instead of discretizing the closed-form solution, one is tempted to discretize the ODE itself to obtain an approximation of the sequence $x_0, x_1, x_2, \dots$, which we will denote by $\tilde{x}_0, \tilde{x}_1, \tilde{x}_2, \dots$. We will call the approximate sequence a numerical solution.

Consider now the following discretized initial value problem (DIVP)

$\tilde{x}_{k+1} = g ( \tilde{x}_k )$,     $\tilde{x}_0 = x_0$

where $g : \mathbb{R}^n \to \mathbb{R}^n$ depends on what particular numerical method one wants to use. We will use the (classical) 4th order Runge-Kutta (RK4) method [1], which is given by

$g (x) = \displaystyle x + \frac{1}{6} \left( k_1 + 2 k_2 + 2 k_3 + k_4 \right)$

where the $k_i$ variables are given by

$\begin{array}{rl} k_1 &= h f (x)\\\\ k_2 &= h f (x + \frac{1}{2} k_1)\\\\ k_3 &= h f (x + \frac{1}{2} k_2)\\\\ k_4 &= h f (x + k_3)\end{array}$

In a nutshell, one hopes that by making the time-step $h$ “small enough”, the numerical solution $\tilde{x}_0, \tilde{x}_1, \tilde{x}_2, \dots$ will be “close enough” to $x_0, x_1, x_2, \dots$, the discretization of the actual solution $x$. This hoping involves a fair amount of faith. Especially so if floating-point arithmetic is used, as underflow, overflow, and round-off error are not to be dismissed.

__________

The following Haskell script implements the RK4 method:

-- define 4th order Runge-Kutta map (RK4)
rk4 :: Floating a => (a -> a) -> a -> a -> a
rk4 f h x = x + (1/6) * (k1 + 2*k2 + 2*k3 + k4)
where k1 = h * f (x)
k2 = h * f (x + 0.5*k1)
k3 = h * f (x + 0.5*k2)
k4 = h * f (x + k3)

Let us now test this rather succinct implementation.

__________

Example

Consider the following IVP where $t_0 = 0$, $x_0 = 1$, and $n = 1$

$\dot{x} (t) = - x (t)$,     $x (0) = 1$.

The closed-form solution of this IVP is $x (t) = e^{-t}$. We load the script above into GHCi and compute the numerical solution of the IVP using the higher-order function iterate:

*Main> -- discretize closed-form solution
*Main> let xs = [ exp (-t) | t <- [0,0.1..]]
*Main> -- define function f
*Main> let f x = -x
*Main> -- define time-step h
*Main> let h = 1 / 100000
*Main> -- define initial condition
*Main> let x0 = 1.0
*Main> -- compute numerical solution
*Main> let xs_tilde = iterate (rk4 f h) x0
*Main> -- decimate numerical solution
*Main> let xs_dec = [ xs_tilde !! k | k <- [0,1..], rem k 10000 == 0]

Note that the time-step is $h = 1 / 100000$. Hence, the numerical solution will have $10^5$ samples in between $t = 0$ and $t = 1$, which probably is too fine a discretization. Thus, we decimate the numerical solution by a factor of $10^4$, so that we are left with only eleven samples in between $t = 0$ and $t = 1$, with a uniform spacing of $\Delta t = 0.1$ between consecutive samples. We can now compare the discretization of the closed-form solution with the heavily decimated numerical solution, as follows:

*Main> -- print discretized closed-form solution
*Main> take 11 xs
[1.0,0.9048374180359595,0.8187307530779818,
0.7408182206817179,0.6703200460356392,
0.6065306597126333,0.5488116360940264,
0.49658530379140947,0.44932896411722156,
0.4065696597405991,0.36787944117144233]
*Main> -- print decimated numerical solution
*Main> take 11 xs_dec
[1.0,0.9048374180359563,0.8187307530779842,
0.7408182206817214,0.6703200460356442,
0.6065306597126376,0.5488116360940266,
0.49658530379141175,0.44932896411722334,
0.40656965974059905,0.367879441171439]
*Main> -- compute difference of two solutions
*Main> let es = zipWith (-) xs_dec xs
*Main> -- print difference
*Main> take 11 es
[0.0,-3.219646771412954e-15,2.3314683517128287e-15,
3.552713678800501e-15,4.9960036108132044e-15,
4.3298697960381105e-15,2.220446049250313e-16,
2.275957200481571e-15,1.7763568394002505e-15,
-5.551115123125783e-17,-3.3306690738754696e-15]
*Main> -- compute energy of the difference
*Main> sum $take 11$ map (^2) es
9.161263559461204e-29

Note how small the values of difference sequence are, of the order of $10^{-15}$. The implementation of the RK4 method seems to be working. Who needs MATLAB now? ;-)

__________

References

[1] Richard W. Hamming, Numerical Methods for Scientists and Engineers, Dover Publications, 1973.

Deciding the convexity of semialgebraic sets II

June 23, 2012

Let $S_3 \subset \mathbb{R}^{3 \times 3}$ denote the set of real symmetric $3 \times 3$ matrices. We introduce a matrix-valued function $A: \mathbb{R}^3 \to S_3$ defined by

$A (x) = \left[\begin{array}{ccc} 1 & x_1 & x_2\\ x_1 & 1 & x_3\\ x_2 & x_3 & 1\end{array}\right]$

The elliptope $E_3 \subset \mathbb{R}^3$ is defined by

$E_3 = \left\{ x \in \mathbb{R}^3 \mid A (x) \succeq 0 \right\}$

where $A (x) \succeq 0$ means that matrix $A (x)$ is positive semidefinite. Since the elliptope $E_3$ is defined by the linear matrix inequality (LMI) $A (x) \succeq 0$, we can conclude that $E_3$ is a spectrahedron and, therefore, we have that $E_3$ is convex.

The $E_3$ elliptope (also known as “inflated tetrahedron”) is the yellow part (surface and interior) of the Cayley’s cubic surface plotted below

[ source ]

__________

$E_3$ is a semialgebraic set

Saying that $A (x)$ is positive semidefinite is equivalent to saying that the $2^3 - 1 = 7$ principal minors of $A (x)$ are nonnegative. If $A (x) \succeq 0$, then the three 1st order principal minors (which are the diagonal entries of $A (x)$) lead to the following three (admittedly uninteresting) inequalities

$1 \geq 0$,     $1 \geq 0$,     $1 \geq 0$

which we can discard. The three 2nd order principal minors (determinants of $2 \times 2$ submatrices of $A (x)$) and the 3rd order principal minor (the determinant of $A (x)$) can be computed using the following Python 2.5 + SymPy script:

from sympy import *

# symbolic variables
x1 = Symbol('x1')
x2 = Symbol('x2')
x3 = Symbol('x3')

# build matrix A(x)
A = Matrix([[1,x1,x2],
[x1,1,x3],
[x2,x3,1]])

print "\nMatrix A(x):"
print A

print "\n2nd order principal minors:"
print A.extract([0,1],[0,1]).det()
print A.extract([0,2],[0,2]).det()
print A.extract([1,2],[1,2]).det()

print "\n3rd order principal minor:"
print A.det()


which produces the following output:

Matrix A(x):
[ 1, x1, x2]
[x1,  1, x3]
[x2, x3,  1]

2nd order principal minors:
-x1**2 + 1
-x2**2 + 1
-x3**2 + 1

3rd order principal minor:
-x1**2 + 2*x1*x2*x3 - x2**2 - x3**2 + 1

Hence, the 2nd order principal minors yield the following inequalities

$1 - x_1^2 \geq 0$,     $1 - x_2^2 \geq 0$,     $1 - x_3^2 \geq 0$

which define the $3$-dimensional cube $[-1,1]^3 \subset \mathbb{R}^3$. The 3rd order principal minor (i.e., the determinant) yields the inequality

$1 + 2 x_1 x_2 x_3 - (x_1^2 + x_2^2 + x_3^3) \geq 0$.

We can thus conclude that $E_3$ is a basic closed semialgebraic set.

__________

Deciding the convexity of $E_3$

In my last post, I used quantifier elimination to decide the convexity of two semialgebraic sets in $\mathbb{R}^2$. Let us now decide the convexity of $E_3$, which is of higher “complexity” than the sets I considered last time. What is the goal? The goal is to find out whether REDLOG can decide the convexity of $E_3$ in less than 10 minutes.

Based on the scripts in my previous post, we write the following REDUCE + REDLOG script to decide the convexity of $E_3$:

% decide the convexity of the elliptope E3

rlset ofsf;

% define predicates in antecedent
p1 := (1-x1^2>=0) and (1-x2^2>=0) and (1-x3^2>=0) and
(1-x1^2-x2^2-x3^2+2*x1*x2*x3>=0);
p2 := (1-y1^2>=0) and (1-y2^2>=0) and (1-y3^2>=0) and
(1-y1^2-y2^2-y3^2+2*y1*y2*y3>=0);
p3 := (theta>=0) and (theta<=1);

% define predicate in the consequent
z1 := theta*x1+(1-theta)*y1;
z2 := theta*x2+(1-theta)*y2;
q  := (1-z1^2>=0) and (1-z2^2>=0) and (1-z3^2>=0) and
(1-z1^2-z2^2-z3^2+2*z1*z2*z3>=0);

% define universal formula
phi := all({x1,x2,x3,y1,y2,y3,theta}, (p1 and p2 and p3) impl q);

% perform quantifier elimination
rlqe phi;

end;

After 30 minutes (!!!) waiting for the decision procedure to terminate, I killed the reduce.exe process (which was eating all my machine’s CPU cycles and RAM memory). This is disappointing…

Deciding the convexity of semialgebraic sets

June 16, 2012

Given a basic closed semialgebraic set [1], $S \subset \mathbb{R}^n$, defined by

$S = \{ x \in \mathbb{R}^n \mid g_1 (x) \geq 0 \land \dots \land g_m (x) \geq 0\}$

where $m \in \mathbb{N}$ and $g_1, \dots, g_m \in \mathbb{R}[x]$, how can one decide if set $S$ is convex? Can one use quantifier elimination?

__________

Convexity

Let us recall the definition of convexity [2]:

Definition: A set $S$ is convex if the line segment between any two points in $S$ lies in $S$, i.e., if for any $x, y \in S$ and any $\theta$ with $0 \leq \theta \leq 1$, we have $\theta x + (1-\theta) y \in S$.

Let us introduce a predicate $p : \mathbb{R}^n \to \{\text{True}, \text{False}\}$, defined by

$\displaystyle p (x) = \bigwedge_{i = 1}^m g_i (x) \geq 0$

so that we can write $S$ in the more parsimonious form

$S = \{ x \in \mathbb{R}^n \mid p (x) \}$.

Hence, writing $x \in S$ is equivalent to asserting that $p (x) = \text{True}$. From the definition above, it follows that set $S$ is convex if and only if the following universally quantified formula

$\forall x \, \forall y \, \forall \theta \, \left[ \, p(x) \land p(y) \land (\theta \geq 0 \land \theta \leq 1) \implies p (\theta x + (1-\theta) y) \, \right]$

where $x, y$ range over $\mathbb{R}^n$ and $\theta$ ranges over $\mathbb{R}$, evaluates to $\text{True}$. Since $g_1, \dots, g_m$ are polynomials, the formula above can be decided using quantifier elimination software like QEPCAD or REDLOG. Unfortunately, decidability does not imply that real quantifier elimination will decide the formula in an expedite manner. Let us now consider two instances of the decision problem under study.

__________

Example #1

Consider the following semialgebraic set

$S_1 := \{ x \in \mathbb{R}^2 \mid x_1 - x_2^2 + 3 \geq 0 \land -x_1 - x_2^2 + 3 \geq 0\}$

which we depict below

Visual inspection of the plot above allows us to conclude that set $S_1$ is convex. However, relying on the human visual system is extremely limiting. Therefore, we would like to automate the decision. We decide the convexity of $S_1$ via quantifier elimination using the following REDUCE + REDLOG script:

% decide the convexity of a semialgebraic set

rlset ofsf;

% define predicates in antecedent
p1 := (x1-x2^2+3>=0) and (-x1-x2^2+3>=0);
p2 := (y1-y2^2+3>=0) and (-y1-y2^2+3>=0);
p3 := (theta>=0) and (theta<=1);

% define predicate in the consequent
z1 := theta*x1+(1-theta)*y1;
z2 := theta*x2+(1-theta)*y2;
q  := (z1-z2^2+3>=0) and (-z1-z2^2+3>=0);

% define universal formula
phi := all({x1,x2,y1,y2,theta}, (p1 and p2 and p3) impl q);

% perform quantifier elimination
rlqe phi;

end;

This script returns the truth value $\text{True}$ and, hence, $S_1$ is, indeed, convex. However, it took REDLOG approximately 60 seconds to decide the convexity of $S_1$, which is considerably longer than any of my previous experiments with REDLOG. Since we are performing quantifier elimination on a formula with five universal quantifiers, I am not incredibly surprised that it takes a while to obtain an answer.

__________

Example #2

Consider now the following semialgebraic set

$S_2 := \{ x \in \mathbb{R}^2 \mid x_1 - x_2^2 + 3 \geq 0 \land x_1 - x_2^2 + 1 \leq 0\}$

part of which we depict below

Visual inspection of the plot above allows us to conclude that set $S_2$ is not convex. We decide the convexity of $S_2$ using the following REDUCE + REDLOG script:

% decide the convexity of a semialgebraic set

rlset ofsf;

% define predicates in antecedent
p1 := (x1-x2^2+3>=0) and (x1-x2^2+1<=0);
p2 := (y1-y2^2+3>=0) and (y1-y2^2+1<=0);
p3 := (theta>=0) and (theta<=1);

% define predicate in the consequent
z1 := theta*x1+(1-theta)*y1;
z2 := theta*x2+(1-theta)*y2;
q  := (z1-z2^2+3>=0) and (z1-z2^2+1<=0);

% define universal formula
phi := all({x1,x2,y1,y2,theta}, (p1 and p2 and p3) impl q);

% perform quantifier elimination
rlqe phi;

end;

This script returns the truth value $\text{False}$ and, hence, $S_2$ is, indeed, not convex. Interestingly, it took REDLOG less than one second to decide the formula. That is over two orders of magnitude faster than in example #1.

__________

References

[1] Markus Schweighofer, Describing convex semialgebraic sets by linear matrix inequalities, International Symposium on Symbolic and Algebraic Computation, Korea Institute for Advanced Study, Seoul, July 2009.

[2] Stephen Boyd, Lieven Vandenberghe, Convex Optimization.

ODE notation considered harmful

June 5, 2012

Whenever I want to learn a new subject, I announce a graduate course in it, since the best way to learn is by teaching. But even better than teaching humans is teaching computers, i.e. program! Since computers will not let you wave your hands and wing it.

Doron Zeilberger (1999)

Suppose that we are given an $n$-dimensional first-order vector ordinary differential equation (ODE)

$\dot{x} = f (x)$

where $f : \mathbb{R}^n \to \mathbb{R}^n$ is a vector field. A solution of this ODE is a function $x : \mathbb{R} \to \mathbb{R}^n$ whose first derivative, $\dot{x}$, is equal to $f (x)$. Elementary, my dear Watson!

Despite the fact that the notation $\dot{x} = f (x)$ is rather pervasive, it is dangerous, in my humble opinion. One should keep in mind that such notation is an abbreviation. How could it not be an abbreviation? Note that $x$ is a function, and that the argument of $f$ is a vector in $\mathbb{R}^n$, not a function! In other words, writing $f (x)$ results in a type error, and a compiler / interpreter with type checking would never accept the declaration $\dot{x} = f (x)$. We, fallible humans, accept it because we know (do we?) that it is an abbreviation, and because it is tradition. Who are we to challenge a centuries-old tradition?

If $\dot{x} = f (x)$ is an abbreviation, what exactly is it an abbreviation for? It is an abbreviation for the following

$\dot{x} (t) = f ( x (t) )$

where the $=$ symbol denotes equality of vectors. Note that the argument of $f$ is the vector $x (t)$ (i.e., function $x$ evaluated at $t$), not the function $x$. But, what does the equation above say? It says the following:

Proposition: Given a vector field $f : \mathbb{R}^n \to \mathbb{R}^n$, there exists a function $x : \mathbb{R} \to \mathbb{R}^n$, whose first derivative is $\dot{x} : \mathbb{R} \to \mathbb{R}^n$, such that for every $t \in \mathbb{R}$ we have that $\dot{x} (t) = f ( x (t) )$.

This proposition may be true or false. Not all differential equations have solutions, after all, which is why the study of existence of solutions is (was?) an active area of research. Note the presence of the words “for every” in the proposition above. Do you see any universal quantifier, $\forall$, anywhere? You do not, and that is due to the fact that $\dot{x} (t) = f ( x (t) )$ is also an abbreviation! The universal quantifier is implicit. If we make it explicit, we obtain

$\forall t \left( \dot{x} (t) = f ( x (t) ) \right)$

where $t$ ranges over $\mathbb{R}$. The universally quantified formula above is the conjunction of infinitely many equations. However, do note that the words “there exists” also appear in the proposition, which suggests that an existential quantifier is missing. Therefore, the non-abbreviated notation would be as follows

$\exists x \, \forall t \left( \dot{x} (t) = f ( x (t) ) \right)$

where $x$ ranges over the set of all (continuous?) functions from $\mathbb{R}$ to $\mathbb{R}^n$, and $t$ ranges over $\mathbb{R}$.

__________

A better notation

So far, I have criticized the abbreviated notation for differential equations. I consider it harmful, as confusing a function $x$ with its evaluation $x(t)$ is an atrocious crime against types. To make this post more “constructive”, I will take the liberty of proposing a better notation.

We start by noting that $f ( x (t) )$ can be written in the form $(f \circ x) (t)$, where the symbol $\circ$ denotes function composition. We now introduce a differential operator $D$, which maps functions to functions, so that we obtain the first derivative of function $x$ via $\dot{x} = D (x)$. Hence, $\dot{x} (t) = (D (x)) (t)$, which allows us to rewrite $\forall t \left( \dot{x} (t) = f ( x (t) ) \right)$ in the following form

$\forall t \left( (D(x)) (t) = (f \circ x) (t) \right)$

where $t$ ranges over $\mathbb{R}$. The universally quantified formula above states that functions $D(x)$ and $f \circ x$, both of which from $\mathbb{R}$ to $\mathbb{R}^n$ evaluate to the same values for all possible choices of the input $t$, i.e., functions $D(x)$ and $f \circ x$ are equal. The formula above can thus be rewritten more compactly as $D(x) = f \circ x$.

Finally, we conclude that $\exists x \, \forall t \left( \dot{x} (t) = f ( x (t) ) \right)$ is equivalent to the following existentially quantified formula

$\exists x \left( D(x) = f \circ x \right)$

where the $=$ symbol denotes equality of functions. This new notation may not be as compact as $\dot{x} = f (x)$, but at least now the RHS does not result in type error. Criticism would be most welcome.