Posts Tagged ‘Ordinary Differential Equations’

June 29, 2012

The last time I implemented a Runge-Kutta method was in December 2001. Back then, I implemented the classical 4th order Runge-Kutta (RK4) method in C. However, in the past decade, whenever I needed a numerical solution of an ordinary differential equation (ODE), I used MATLAB. Now, I would like to use Haskell instead.

Consider the following initial value problem (IVP)

$\dot{x} (t) = f ( x (t) )$,     $x (t_0) = x_0$

where the vector field $f : \mathbb{R}^n \to \mathbb{R}^n$ and the initial condition $x_0$ are given. Solving an instance of the IVP consists of finding a function $x : [t_0, \infty) \to \mathbb{R}^n$ that satisfies both the differential equation and the initial condition.

Let $h > 0$ be the time-step, and let $t_k = t_0 + k h$ be the $k$-th time instant, where $k \in \mathbb{N}_0 := \{0, 1, 2, \dots\}$. Suppose that we solve a given instance of the IVP and obtain a closed-form solution $x : [t_0, \infty) \to \mathbb{R}^n$; we then discretize the closed-form solution to obtain an infinite sequence $x_0, x_1, x_2, \dots$ where $x_k = x (t_k)$. However, not all instances of the IVP have a closed-form solution (or even an analytic solution), and even in the cases where a closed-form solution does exist, it can be quite hard to find such a solution. Therefore, instead of discretizing the closed-form solution, one is tempted to discretize the ODE itself to obtain an approximation of the sequence $x_0, x_1, x_2, \dots$, which we will denote by $\tilde{x}_0, \tilde{x}_1, \tilde{x}_2, \dots$. We will call the approximate sequence a numerical solution.

Consider now the following discretized initial value problem (DIVP)

$\tilde{x}_{k+1} = g ( \tilde{x}_k )$,     $\tilde{x}_0 = x_0$

where $g : \mathbb{R}^n \to \mathbb{R}^n$ depends on what particular numerical method one wants to use. We will use the (classical) 4th order Runge-Kutta (RK4) method [1], which is given by

$g (x) = \displaystyle x + \frac{1}{6} \left( k_1 + 2 k_2 + 2 k_3 + k_4 \right)$

where the $k_i$ variables are given by

$\begin{array}{rl} k_1 &= h f (x)\\ k_2 &= h f (x + \frac{1}{2} k_1)\\ k_3 &= h f (x + \frac{1}{2} k_2)\\ k_4 &= h f (x + k_3)\end{array}$

In a nutshell, one hopes that by making the time-step $h$ “small enough”, the numerical solution $\tilde{x}_0, \tilde{x}_1, \tilde{x}_2, \dots$ will be “close enough” to $x_0, x_1, x_2, \dots$, the discretization of the actual solution $x$. This hope involves a fair amount of faith, especially if floating-point arithmetic is used, as underflow, overflow, and round-off error are not to be dismissed.

__________

The following Haskell script implements the RK4 method:

-- define 4th order Runge-Kutta map (RK4)
rk4 :: Floating a => (a -> a) -> a -> a -> a
rk4 f h x = x + (1/6) * (k1 + 2*k2 + 2*k3 + k4)
  where k1 = h * f x
        k2 = h * f (x + 0.5*k1)
        k3 = h * f (x + 0.5*k2)
        k4 = h * f (x + k3)

Let us now test this rather succinct implementation.

__________

Example

Consider the following IVP where $t_0 = 0$, $x_0 = 1$, and $n = 1$

$\dot{x} (t) = - x (t)$,     $x (0) = 1$.

The closed-form solution of this IVP is $x (t) = e^{-t}$. We load the script above into GHCi and compute the numerical solution of the IVP using the higher-order function iterate:

*Main> -- discretize closed-form solution
*Main> let xs = [ exp (-t) | t <- [0,0.1..]]
*Main> -- define function f
*Main> let f x = -x
*Main> -- define time-step h
*Main> let h = 1 / 100000
*Main> -- define initial condition
*Main> let x0 = 1.0
*Main> -- compute numerical solution
*Main> let xs_tilde = iterate (rk4 f h) x0
*Main> -- decimate numerical solution
*Main> let xs_dec = [ xs_tilde !! k | k <- [0,1..], rem k 10000 == 0]

Note that the time-step is $h = 1 / 100000$. Hence, the numerical solution will have $10^5$ steps between $t = 0$ and $t = 1$, which is probably too fine a discretization. Thus, we decimate the numerical solution by a factor of $10^4$, so that we are left with only eleven samples between $t = 0$ and $t = 1$, with a uniform spacing of $\Delta t = 0.1$ between consecutive samples. We can now compare the discretization of the closed-form solution with the heavily decimated numerical solution, as follows:

*Main> -- print discretized closed-form solution
*Main> take 11 xs
[1.0,0.9048374180359595,0.8187307530779818,
0.7408182206817179,0.6703200460356392,
0.6065306597126333,0.5488116360940264,
0.49658530379140947,0.44932896411722156,
0.4065696597405991,0.36787944117144233]
*Main> -- print decimated numerical solution
*Main> take 11 xs_dec
[1.0,0.9048374180359563,0.8187307530779842,
0.7408182206817214,0.6703200460356442,
0.6065306597126376,0.5488116360940266,
0.49658530379141175,0.44932896411722334,
0.40656965974059905,0.367879441171439]
*Main> -- compute difference of two solutions
*Main> let es = zipWith (-) xs_dec xs
*Main> -- print difference
*Main> take 11 es
[0.0,-3.219646771412954e-15,2.3314683517128287e-15,
3.552713678800501e-15,4.9960036108132044e-15,
4.3298697960381105e-15,2.220446049250313e-16,
2.275957200481571e-15,1.7763568394002505e-15,
-5.551115123125783e-17,-3.3306690738754696e-15]
*Main> -- compute energy of the difference
*Main> sum $ take 11 $ map (^2) es
9.161263559461204e-29

Note how small the values of the difference sequence are: on the order of $10^{-15}$. The implementation of the RK4 method seems to be working. Who needs MATLAB now? ;-)
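The rk4 map above handles only the scalar case ($n = 1$). As a sketch of how one might extend it to vector fields, here is a version that represents states as lists of coordinates (the function name rk4v and the list representation are my own choices, not part of the original script):

```haskell
-- vector version of the RK4 map: states are represented as lists
rk4v :: Floating a => ([a] -> [a]) -> a -> [a] -> [a]
rk4v f h x = x `plus` map (/ 6) (k1 `plus` scale 2 k2 `plus` scale 2 k3 `plus` k4)
  where
    plus    = zipWith (+)   -- componentwise vector addition
    scale c = map (c *)     -- scalar multiplication
    k1 = scale h (f x)
    k2 = scale h (f (x `plus` scale 0.5 k1))
    k3 = scale h (f (x `plus` scale 0.5 k2))
    k4 = scale h (f (x `plus` k3))
```

For instance, with the harmonic oscillator `f [p, v] = [v, -p]`, the numerical solution is again obtained via `iterate (rk4v f h) [1, 0]`.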

__________

References

[1] Richard W. Hamming, Numerical Methods for Scientists and Engineers, Dover Publications, 1973.

ODE notation considered harmful

June 5, 2012

Whenever I want to learn a new subject, I announce a graduate course in it, since the best way to learn is by teaching. But even better than teaching humans is teaching computers, i.e. program! Since computers will not let you wave your hands and wing it.

Doron Zeilberger (1999)

Suppose that we are given an $n$-dimensional first-order vector ordinary differential equation (ODE)

$\dot{x} = f (x)$

where $f : \mathbb{R}^n \to \mathbb{R}^n$ is a vector field. A solution of this ODE is a function $x : \mathbb{R} \to \mathbb{R}^n$ whose first derivative, $\dot{x}$, is equal to $f (x)$. Elementary, my dear Watson!

Despite the fact that the notation $\dot{x} = f (x)$ is rather pervasive, it is dangerous, in my humble opinion. One should keep in mind that such notation is an abbreviation. How could it not be an abbreviation? Note that $x$ is a function, and that the argument of $f$ is a vector in $\mathbb{R}^n$, not a function! In other words, writing $f (x)$ results in a type error, and a compiler / interpreter with type checking would never accept the declaration $\dot{x} = f (x)$. We, fallible humans, accept it because we know (do we?) that it is an abbreviation, and because it is tradition. Who are we to challenge a centuries-old tradition?

If $\dot{x} = f (x)$ is an abbreviation, what exactly is it an abbreviation for? It is an abbreviation for the following

$\dot{x} (t) = f ( x (t) )$

where the $=$ symbol denotes equality of vectors. Note that the argument of $f$ is the vector $x (t)$ (i.e., function $x$ evaluated at $t$), not the function $x$. But, what does the equation above say? It says the following:

Proposition: Given a vector field $f : \mathbb{R}^n \to \mathbb{R}^n$, there exists a function $x : \mathbb{R} \to \mathbb{R}^n$, whose first derivative is $\dot{x} : \mathbb{R} \to \mathbb{R}^n$, such that for every $t \in \mathbb{R}$ we have that $\dot{x} (t) = f ( x (t) )$.

This proposition may be true or false. Not all differential equations have solutions, after all, which is why the study of existence of solutions is (was?) an active area of research. Note the presence of the words “for every” in the proposition above. Do you see any universal quantifier, $\forall$, anywhere? You do not, and that is due to the fact that $\dot{x} (t) = f ( x (t) )$ is also an abbreviation! The universal quantifier is implicit. If we make it explicit, we obtain

$\forall t \left( \dot{x} (t) = f ( x (t) ) \right)$

where $t$ ranges over $\mathbb{R}$. The universally quantified formula above is the conjunction of infinitely many equations. However, do note that the words “there exists” also appear in the proposition, which suggests that an existential quantifier is missing. Therefore, the non-abbreviated notation would be as follows

$\exists x \, \forall t \left( \dot{x} (t) = f ( x (t) ) \right)$

where $x$ ranges over the set of all (continuous?) functions from $\mathbb{R}$ to $\mathbb{R}^n$, and $t$ ranges over $\mathbb{R}$.

__________

A better notation

So far, I have criticized the abbreviated notation for differential equations. I consider it harmful, as confusing a function $x$ with its evaluation $x(t)$ is an atrocious crime against types. To make this post more “constructive”, I will take the liberty of proposing a better notation.

We start by noting that $f ( x (t) )$ can be written in the form $(f \circ x) (t)$, where the symbol $\circ$ denotes function composition. We now introduce a differential operator $D$, which maps functions to functions, so that we obtain the first derivative of function $x$ via $\dot{x} = D (x)$. Hence, $\dot{x} (t) = (D (x)) (t)$, which allows us to rewrite $\forall t \left( \dot{x} (t) = f ( x (t) ) \right)$ in the following form

$\forall t \left( (D(x)) (t) = (f \circ x) (t) \right)$

where $t$ ranges over $\mathbb{R}$. The universally quantified formula above states that the functions $D(x)$ and $f \circ x$, both from $\mathbb{R}$ to $\mathbb{R}^n$, evaluate to the same values for all possible choices of the input $t$, i.e., the functions $D(x)$ and $f \circ x$ are equal. The formula above can thus be rewritten more compactly as $D(x) = f \circ x$.

Finally, we conclude that $\exists x \, \forall t \left( \dot{x} (t) = f ( x (t) ) \right)$ is equivalent to the following existentially quantified formula

$\exists x \left( D(x) = f \circ x \right)$

where the $=$ symbol denotes equality of functions. This new notation may not be as compact as $\dot{x} = f (x)$, but at least now the RHS does not result in a type error. Criticism would be most welcome.
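The type discipline argued for above can be made literal in Haskell, where the type checker itself enforces the distinction. A toy sketch for $n = 1$, with a hypothetical trajectory $x(t) = e^{-t}$ and vector field $f(v) = -v$ chosen purely for illustration:

```haskell
type Time  = Double
type State = [Double]   -- R^n represented as a list of coordinates

-- a (hypothetical) trajectory x : R -> R^n, here x(t) = e^{-t}
x :: Time -> State
x t = [exp (-t)]

-- a vector field f : R^n -> R^n, here f(v) = -v
f :: State -> State
f = map negate

-- "f x" is rejected by the type checker: x is a function, not a State.
-- The composition f . x, however, is well-typed, just as the post argues:
rhs :: Time -> State
rhs = f . x
```

Evaluating `rhs 0` yields `[-1.0]`, i.e., $f(x(0)) = -e^{0}$, while attempting to compile `f x` produces a type error, exactly the "atrocious crime against types" the abbreviation hides.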

Safety verification via barrier certificates

June 19, 2011

Suppose that we are given a dynamical system of the form $\dot{x} (t) = f ( x (t) )$, where $x : [0, \infty) \to \mathbb{R}^n$ is the state trajectory, and $f : \mathbb{R}^n \to \mathbb{R}^n$ is a known vector field [1]. Suppose further that we are given an initial set $\mathcal{X}_0 \subset \mathbb{R}^n$, and an unsafe set $\mathcal{X}_u \subset \mathbb{R}^n$.

If there exists a state trajectory $x : [0, \infty) \to \mathbb{R}^n$ with initial condition $x(0) \in \mathcal{X}_0$ such that $x(T) \in \mathcal{X}_u$ for some $T \geq 0$, we say that the system is unsafe. An example of an unsafe system is depicted below

If there exists no state trajectory $x : [0, \infty) \to \mathbb{R}^n$ with initial condition $x(0) \in \mathcal{X}_0$ such that $x(T) \in \mathcal{X}_u$ for some $T \geq 0$, we say that the system is safe. Note that if $\mathcal{X}_0 \cap \mathcal{X}_u \neq \emptyset$, we can conclude immediately that the system is unsafe, as there exists an initial condition $x(0) \in \mathcal{X}_u$.

The safety verification problem can then be stated as follows:

Given a vector field $f : \mathbb{R}^n \to \mathbb{R}^n$, an initial set $\mathcal{X}_0 \subset \mathbb{R}^n$ and an unsafe set $\mathcal{X}_u \subset \mathbb{R}^n$, decide whether the dynamical system $\dot{x} = f (x)$ is safe.

To be more precise, safety is not a property of the dynamical system $\dot{x} = f (x)$ per se, but rather of the ordered triple $(f, \mathcal{X}_0, \mathcal{X}_u)$.

How do we go about solving the safety verification problem? The most obvious approach would be to “propagate” the initial set $\mathcal{X}_0$ forwards in time, i.e., to compute (forward) reachable sets. However, computing exact reachable sets is seldom possible, which forces us to compute approximations of reachable sets.

When performing safety verification, we are asking a binary question: is $(f, \mathcal{X}_0, \mathcal{X}_u)$ safe? Hence, computation of the reachable sets seems to provide more information than we actually need. This is a personal opinion, not a fact.

__________

Barrier certificates

Probably inspired by Lyapunov stability theory [1], Stephen Prajna introduced a few years ago a new method for safety verification of continuous-time dynamical systems that relies on what he termed barrier certificates [2]. A barrier certificate is a function of the state, $B : \mathbb{R}^n \to \mathbb{R}$, that satisfies the following inequalities

$\begin{array}{ll} B(x) \leq 0 & \forall x \in \mathcal{X}_0\\ B(x) > 0 & \forall x \in \mathcal{X}_u\\ \displaystyle\frac{\partial B}{\partial x} (x) \, f (x) \leq 0 & \forall x \in \mathbb{R}^n\end{array}$
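As a sanity check, the three inequalities can be verified numerically on a grid of sample points. The sketch below uses a toy one-dimensional instance that I made up for illustration: $f(x) = -x$, $\mathcal{X}_0 = [-0.5, 0.5]$, $\mathcal{X}_u = \{x : |x| \geq 2\}$, and candidate certificate $B(x) = x^2 - 1$ (grid checking is of course no substitute for a proof):

```haskell
-- toy one-dimensional system and hypothetical barrier certificate
f :: Double -> Double
f x = -x

b, db :: Double -> Double
b  x = x * x - 1   -- candidate certificate B
db x = 2 * x       -- its derivative dB/dx

grid :: [Double]
grid = [-3, -2.99 .. 3]

-- check the three barrier inequalities on the sample points:
-- B <= 0 on X0, B > 0 on Xu, and (dB/dx) f <= 0 everywhere
certified :: Bool
certified = all ((<= 0) . b) [x | x <- grid, abs x <= 0.5]
         && all ((> 0)  . b) [x | x <- grid, abs x >= 2]
         && all (\x -> db x * f x <= 0) grid
```

Here the Lie derivative is $\frac{\partial B}{\partial x}(x) f(x) = -2x^2 \leq 0$ for every $x$, so the third inequality holds globally, not just on the grid.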

Let us now introduce a function of time $b (t) := B (x (t))$ and fix the initial condition $x(0)$. Taking the derivative of $b$ with respect to time, we obtain

$\dot{b} (t) = \displaystyle\frac{\partial B}{\partial x} (x(t)) \dot{x} (t) = \displaystyle\frac{\partial B}{\partial x} (x(t)) f (x(t)) \leq 0$

which follows from the non-positivity of the Lie derivative of $B$ along the flow of $f$. From $\dot{b}(t) \leq 0$, we conclude that $b(t) \leq b(0)$ for all $t \geq 0$, i.e., we have that $B(x(t)) \leq B(x(0))$ for all $t \geq 0$. Hence, the following sublevel set

$\Omega_0 := \{ x \in \mathbb{R}^n \mid B(x) \leq B(x(0))\}$

is positively invariant [1]. In other words, a solution starting in $\Omega_0$ will remain in $\Omega_0$ for all $t \geq 0$. Since $B(x(0)) \leq 0$ for all $x(0) \in \mathcal{X}_0$, we conclude that $B(x(t)) \leq 0$ for all $t \geq 0$. However, since $B(x) > 0$ for all $x \in \mathcal{X}_u$, we can finally conclude that the safety of $(f, \{x(0)\}, \mathcal{X}_u)$ is guaranteed. Alternatively, one could argue that since, by construction, $\Omega_0 \cap \mathcal{X}_u = \emptyset$, safety is certified.

Once we know that $\Omega_0$ is a positively invariant set (from $B(x(t)) \leq B(x(0))$), we conclude that the state trajectory must live in $\Omega_0$, which is painted in dark gray below

Please do recall that $B(x(0)) \leq 0$, and also note that the unsafe set is contained in the superlevel set $\{ x \in \mathbb{R}^n \mid B(x) > 0\}$.

So far we assumed that the initial condition was fixed. Henceforth, we shall make the initial condition free to take any values in the initial set $\mathcal{X}_0$. Let us define

$b_0 := \displaystyle\sup_{x_0 \in \mathcal{X}_0} B(x_0)$

so that we get an upper bound on $B(x(0))$, as follows

$B(x(t)) \leq B(x(0)) \leq b_0 \leq 0$

where we used the inequality $B(x) \leq 0$ for all $x \in \mathcal{X}_0$ to conclude that $b_0 \leq 0$. This allows us to conclude that the sublevel set

$\tilde{\Omega}_0 := \{ x \in \mathbb{R}^n \mid B(x) \leq b_0\}$

is positively invariant. Since $\tilde{\Omega}_0 \cap \mathcal{X}_u = \emptyset$, safety of $(f, \mathcal{X}_0, \mathcal{X}_u)$ is guaranteed. Note that the sublevel set $\tilde{\Omega}_0$ is the union of all possible $\Omega_0$ sublevel sets, i.e.,

$\tilde{\Omega}_0 = \displaystyle\bigcup_{x_0 \in \mathcal{X}_0} \{ x \in \mathbb{R}^n \mid B(x) \leq B(x_0)\}$.

This new sublevel set is depicted below, painted in dark gray

Note that $\mathcal{X}_0 \subset \tilde{\Omega}_0$. Essentially, the zero level set of the barrier certificate $B$ separates an unsafe set, $\mathcal{X}_u$, from all system trajectories starting in the initial set $\mathcal{X}_0$, which are contained in $\tilde{\Omega}_0$, thus providing a proof of safety without requiring the computation of reachable sets. Personally, I like to think of this approach as a sophisticated generalization of the well-known separating hyperplane theorem.
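The bound $b_0$ and the disjointness $\tilde{\Omega}_0 \cap \mathcal{X}_u = \emptyset$ can also be checked numerically. The sketch below uses hypothetical choices of my own, $B(x) = x^2 - 1$, $\mathcal{X}_0 = [-0.5, 0.5]$, and $\mathcal{X}_u = \{x : |x| \geq 2\}$, estimating the supremum over a sample of $\mathcal{X}_0$:

```haskell
-- hypothetical barrier certificate B(x) = x^2 - 1
b :: Double -> Double
b x = x * x - 1

-- estimate b0 = sup over X0 of B, on a sample of X0 = [-0.5, 0.5];
-- here the supremum is attained at the endpoints, giving b0 = -0.75
b0 :: Double
b0 = maximum [ b x | x <- [-0.5, -0.49 .. 0.5] ]

-- no grid point lies in both the sublevel set {x | B(x) <= b0}
-- and the unsafe set Xu = {x | abs x >= 2}
disjoint :: Bool
disjoint = null [ x | x <- [-3, -2.99 .. 3], b x <= b0, abs x >= 2 ]
```

Since $b_0 = -0.75 \leq 0$ and $B(x) \geq 3$ on $\mathcal{X}_u$, the sublevel set $\tilde{\Omega}_0$ indeed misses the unsafe set in this toy instance.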

At this point, you may be thinking that this framework is too good to be true. Instead of computing the flow or the reachable sets, all we need to do is find a function $B : \mathbb{R}^n \to \mathbb{R}$ that satisfies three inequalities, and then (voilà!) infinite-time safety of $(f, \mathcal{X}_0, \mathcal{X}_u)$ is certified! As it usually happens, the devil is in the details…

If finding Lyapunov functions [1] for low-dimensional dynamical systems is often “hard”, why should we expect the hunt for barrier certificates to be an “easy” endeavor? Quel panache! Are we merely transferring the complexity of the problem from the computation of reachable sets to the satisfaction of three inequalities? Will we fail to find barrier certificates due to excessive conservativeness? Should we abandon all hope? Please be patient, dear reader, for we will address these issues and alleviate your doubts in future posts ;-)

The next installment will be devoted to an example. See you soon!

__________

References

[1] Hassan K. Khalil, Nonlinear Systems, 3rd edition, Prentice Hall.

[2] Stephen Prajna, Optimization-based methods for nonlinear and hybrid systems verification, Dissertation (Ph.D.), California Institute of Technology, 2005.

Sailing in state space

April 11, 2011

Consider the following controlled (autonomous) dynamical system [1]

$\dot{x} (t) = f (x (t), u (t))$

where $x : [0, \infty) \to \mathbb{R}^n$ is the state trajectory, $u : [0,\infty) \to \mathcal{U}$ is the control input history, $\mathcal{U} \subseteq \mathbb{R}^m$ is the set of admissible control inputs, and $f : \mathbb{R}^n \times \mathcal{U} \to \mathbb{R}^n$ is a (known) vector field.

If we know the control input $u (t)$ for all $t \geq 0$, then we can integrate the ODE above with initial condition $x(0) = x_0$ and obtain the state trajectory $x : [0, \infty) \to \mathbb{R}^n$, as follows

$x (t) = x_0 + \displaystyle \int_{0}^{t} f \left(x (\tau), u (\tau) \right) \mathrm{d} \tau$

which can be represented pictorially by a single streamline in $\mathbb{R}^n$

I am (quite explicitly) alluding to fluid flow. Just like a cork on the ocean will follow a certain path depending on the velocity field (i.e., “ocean currents”), a point in state space will flow along a certain streamline.

__________

Let us fix the control input, i.e., $u (t) = \bar{u}$ for all $t \geq 0$, and define

$\begin{array}{rl} v^{(\bar{u})} : \mathbb{R}^n & \to \mathbb{R}^n\\ x & \mapsto f(x, \bar{u})\\\end{array}$

For the sake of simplicity, let us consider $\mathcal{U} = \{-1, +1\}$. Since $\bar{u}$ can only take two values, we have two vector fields, $v^{(+1)}(x) := f(x, +1)$ and $v^{(-1)}(x) := f(x, -1)$. Integrating the following ODEs

$\begin{array}{rl} \dot{x} (t) &= v^{(+1)} ( x(t) )\\ \dot{x} (t) &= v^{(-1)} ( x(t) )\\\end{array}$

for various initial conditions, we obtain two families of streamlines, as depicted below

For each fixed control input and collection of initial conditions, we will have a family of streamlines in state space.

Imagine that we have the following control input history

$u (t) = \begin{cases} +1, & t \in [0, t^{\star})\\ -1, & t \geq t^{\star}\\\end{cases}$

where we toggle the control input at time $t^{\star} > 0$, thus interrupting the flow along a “blue” streamline and initiating the flow along a “pink” streamline. Pictorially, we have

If the control input is fixed, say, $\bar{u} = +1$, then given an initial condition $x(0)$, we will flow along the “blue” streamline that passes through $x(0)$. This streamline is $1$-dimensional and does not allow us to “explore” the $n$-dimensional state space. However, if we switch between $\bar{u} = +1$ and $\bar{u} = -1$, then we are able to “travel” to other regions of the state space. Allowing the control input to take values in $\mathcal{U} = [-1,+1]$ would give us even more freedom.
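The switching story above can be simulated directly. The sketch below forward-Euler-integrates a controlled system $\dot{x} = f(x, u)$ under a bang-bang input that toggles at $t^{\star} = 1$; the two-dimensional dynamics, the switching time, and the step size are all hypothetical choices made for illustration:

```haskell
type State = [Double]

-- hypothetical controlled dynamics f(x, u): a forced oscillator
fCtl :: State -> Double -> State
fCtl [x1, x2] u = [x2, u - x1]
fCtl _        _ = error "two-dimensional states only"

-- bang-bang control input history, toggled at tStar = 1
uHist :: Double -> Double
uHist t = if t < 1 then 1 else -1

-- one forward-Euler step of size h on the pair (time, state)
step :: Double -> (Double, State) -> (Double, State)
step h (t, x) = (t + h, zipWith (+) x (map (h *) (fCtl x (uHist t))))

-- simulate from x(0) = (0, 0) up to t = 2 with h = 0.01:
-- the first half flows along one family of streamlines, the
-- second half along the other
trajectory :: [(Double, State)]
trajectory = take 201 (iterate (step 0.01) (0, [0, 0]))
```

Replacing forward Euler with the RK4 map from the first post would sharpen the accuracy, but the qualitative picture of switching between two families of streamlines is the same.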

Lastly, we arrive at a most childish idea: in some cases, controller design can be viewed as shaping the streamlines differently in different parts of the state space. This is childish because it is purely conceptual. One obviously cannot design optimal controllers by doodling with crayons.

__________

References

[1] Hassan K. Khalil, Nonlinear Systems, 3rd edition, Prentice Hall.