## Archive for August, 2012

### Born under the second law of thermodynamics

August 26, 2012

Neil Armstrong once portrayed himself as follows [1, 2]:

I am, and ever will be, a white-socks, pocket-protector, nerdy engineer—born under the second law of thermodynamics, steeped in the steam tables, in love with free-body diagrams, transformed by Laplace, and propelled by compressible flow. As an engineer, I take a substantial amount of pride in the accomplishments of my profession.

Neil “slipped the surly bonds of earth” and “trod the high untrespassed sanctity of space” [3]. He died yesterday [4, 5].

__________

Sources

[1] Neil A. Armstrong, Greatest Engineering Achievements of the 20th Century (transcript of speech), National Press Club, February 22, 2000.

[2] Neil A. Armstrong, Greatest Engineering Achievements of the 20th Century (audio of speech – Neil’s address starts at approximately 5:30), National Press Club, February 22, 2000.

[3] John Gillespie Magee, Jr., High Flight, England, 1941.

[4] John Noble Wilford, Neil Armstrong, first man on Moon, dies at 82, The New York Times, August 25, 2012.

[5] Craig Nelson, Neil Armstrong, hero with a slide rule, The Wall Street Journal, August 25, 2012.

### From classical deduction to Bayesian probability

August 15, 2012

Terence Tao on classical deduction and Bayesian probability:

In classical logic, one can represent one’s information about a system as a set of possible states that the system could be in, based on the information at hand. With each new measurement of the system, some possibilities could be eliminated, leading to an updated posterior set of information that is an improvement over the prior set of information. A good example of this type of updating occurs when solving a Sudoku puzzle; each new cell value that one learns about constrains the possible values of the remaining cells. Other examples can be found in the classic detective stories of Arthur Conan Doyle featuring Sherlock Holmes. Proof by contradiction can also be viewed as an instance of this type of deduction.

A modern refinement of classical deduction is that of Bayesian probability. Here, one’s information about a system is not merely represented as a set of possible states, but by a probability distribution on the space of all states, indicating one’s current beliefs on the likelihood of each particular state actually being the true state. Each new measurement of the system then updates a prior probability distribution to a posterior probability distribution, using Bayes’ formula

$\displaystyle P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$.

Bayesian probability is widely used in statistics, in machine learning, and in the sciences.

To relate Bayesian probability to classical deduction, recall that every probability distribution has a support, which (in the case when the space of states is discrete) is the set of all states that occur with non-zero probability. When performing a Bayesian update on a discrete space, any state which is inconsistent with the new piece of information will have its posterior probability set to zero, and thus be removed from the support. Thus we see that whilst the probability distribution evolves by Bayesian updating, the support evolves by classical deductive logic. Thus one can view classical logic as the qualitative projection of Bayesian probability, or equivalently, one can view Bayesian probability as a quantitative refinement of classical logic.
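As a concrete illustration of this updating (my own sketch, not part of Tao's original text), here is a minimal Haskell model of a discrete Bayesian update; the names `update` and `support` are mine, and the sketch assumes the observed evidence has non-zero prior probability:

```haskell
import Data.Ratio

-- a discrete distribution over states of type a, as (state, probability) pairs
type Dist a = [(a, Rational)]

-- Bayesian update: multiply each prior weight by the likelihood of the
-- observed evidence under that state, then renormalise.
-- (Assumes the evidence has non-zero probability under the prior.)
update :: (a -> Rational) -> Dist a -> Dist a
update likelihood prior = [ (x, p * likelihood x / z) | (x, p) <- prior ]
  where z = sum [ p * likelihood x | (x, p) <- prior ]

-- the support: states carrying non-zero probability
support :: Dist a -> [a]
support d = [ x | (x, p) <- d, p > 0 ]
```

For example, starting from a uniform prior over a die roll and observing the evidence "the value is even" (likelihood 1 for even states, 0 for odd ones), the posterior is uniform over $\{2, 4, 6\}$ and the odd states drop out of the support — the distribution evolves by Bayes' formula while the support evolves by classical elimination.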

Alternatively, one can view Bayesian probability as a special case of classical logic by taking a frequentist interpretation. In this interpretation, one views the actual universe (or at least the actual system) as just one of a large number of possible universes (or systems). In each of these universes, the system is in one of the possible states; the probability assigned to each state is then the proportion of the possible universes in which that state is attained. Each new measurement eliminates some fraction of the universes in a given state, depending on how likely or unlikely that state was to actually produce that measurement; the surviving universes then have a new posterior probability distribution, which is related to the prior distribution by Bayes’ formula.

It is instructive to interpret Sherlock Holmes’ famous quote, “When you have eliminated all which is impossible, then whatever remains, however improbable, must be the truth,” from a Bayesian viewpoint. The statement is technically correct; however, when performing this type of elimination to an (a priori) improbable conclusion, the denominator in Bayes’ formula is extremely small, and so the deduction is unstable if it later turns out that some of the possibilities thought to have been completely eliminated, were in fact only incompletely eliminated. (See also the mantra “extraordinary claims require extraordinary evidence”, which can be viewed as the Bayesian counterpoint to Holmes’ classical remark.)

Another interesting place where one can contrast classical deduction with Bayesian deduction is with regard to taking converses. In classical logic, if one knows that $A$ implies $B$, one cannot then deduce that $B$ implies $A$. However, in Bayesian probability, if one knows that the presence of $A$ elevates the likelihood that $B$ is true, then an observation of $B$ will conversely elevate the prior probability that $A$ is true, thanks to Bayes’ formula: if $P(B \mid A) > P(B)$, then $P(A \mid B) > P(A)$. This may help explain why taking converses is an intuitive operation to those who have not yet been thoroughly exposed to classical logic. (It is also instructive to understand why this disparity between the two types of deduction is not in conflict with the previously mentioned links between the two. This disparity is roughly analogous to the disparity between worst-case analysis and average-case analysis.)
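This symmetry is easy to check on a toy example (again my own aside, not Tao's): fix a hypothetical joint distribution over two events $A$ and $B$ — the weights below are made up for illustration — and compute the conditional probabilities directly in Haskell:

```haskell
import Data.Ratio

-- a toy joint distribution over (A holds, B holds); the weights are made up
joint :: [((Bool, Bool), Rational)]
joint = [ ((True,  True ), 3 % 10), ((True,  False), 1 % 10)
        , ((False, True ), 2 % 10), ((False, False), 4 % 10) ]

-- probability of an event, given as a predicate on states
prob :: ((Bool, Bool) -> Bool) -> Rational
prob event = sum [ p | (s, p) <- joint, event s ]

-- conditional probability P(event | given), via Bayes' formula
cond :: ((Bool, Bool) -> Bool) -> ((Bool, Bool) -> Bool) -> Rational
cond event given = prob (\s -> event s && given s) / prob given
```

With these weights, $P(B \mid A) = 3/4$ exceeds $P(B) = 1/2$, and correspondingly $P(A \mid B) = 3/5$ exceeds $P(A) = 2/5$: observing $B$ elevates the probability of $A$, just as Bayes' formula predicts.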

Bayesian probability can be generalised further; for instance, quantum mechanics (with the Copenhagen interpretation) can be viewed as a noncommutative generalisation of Bayesian probability, though the connection to classical logic is then lost when one is dealing with observables that do not commute. But this is another story…

__________

Please do note that Terence Tao’s original post contains neither links nor boldface highlighting. I took the liberty of adding those for convenience and emphasis. To improve legibility I also wrote the mathematical expressions in $\LaTeX$.

__________

Source:

Terence Tao, “A modern refinement of classical deduction is that of Bayesian probability”, Google Buzz, April 4, 2010.

### Towards the Cantor set

August 3, 2012

Today we will “construct” the standard (ternary) Cantor set [1]. We start with the closed unit interval $\mathcal{C}_0 := [0,1]$, and then remove its open middle third $]1/3, 2/3[$ to obtain $\mathcal{C}_1 := [0, 1/3] \cup [2/3, 1]$. We then remove the open middle thirds of each of these two intervals to obtain

$\mathcal{C}_2 := [0, 1/9] \cup [2/9, 3/9] \cup [6/9, 7/9] \cup [8/9, 1]$

and, continuing this process ad infinitum, we obtain a nested sequence of compact sets $\mathcal{C}_0 \supset \mathcal{C}_1 \supset \mathcal{C}_2 \supset \dots$. Note that $\mathcal{C}_n$ is the union of $2^n$ intervals, each of length $3^{-n}$. We refer to the following set

$\mathcal{C} := \displaystyle\bigcap_{n=0}^{\infty} \mathcal{C}_n$

as the standard (ternary) Cantor set. This set is remarkable: it is uncountable, yet it has Lebesgue measure zero [1].
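The measure-zero claim follows immediately from the monotonicity of Lebesgue measure: since $\mathcal{C} \subset \mathcal{C}_n$ for every $n$, and $\mathcal{C}_n$ is the union of $2^n$ intervals of length $3^{-n}$, we have

$\lambda(\mathcal{C}) \leq \lambda(\mathcal{C}_n) = 2^n \cdot 3^{-n} = (2/3)^n \to 0 \quad \text{as } n \to \infty$.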

__________

In Haskell, a closed interval $[a, b]$ can be represented by an ordered pair $(a,b)$. Each set $\mathcal{C}_n$ can be represented by a list of ordered pairs, where each pair represents a closed interval. We create a function $f$ that takes $\mathcal{C}_n$ and returns $\mathcal{C}_{n+1} = f ( \mathcal{C}_n )$. We will work with arbitrary-precision rational numbers, not floating-point numbers.

The following Haskell script lazily generates the sequence of sets:

```haskell
import Data.Ratio

type Interval = (Rational, Rational)

-- remove the middle third of an interval
removeMiddleThird :: Interval -> [Interval]
removeMiddleThird (a,b) = [(a,b'),(a',b)]
  where b' = (2%3)*a + (1%3)*b
        a' = (1%3)*a + (2%3)*b

-- define function f
f :: [Interval] -> [Interval]
f intervals = concat $ map removeMiddleThird intervals

-- create the list of sets
sets :: [[Interval]]
sets = iterate f [(0,1)]

-- define the Lebesgue measure of an interval
measure :: Interval -> Rational
measure (a,b) = b - a
```

Note that we used the function `iterate` to generate the sequence of sets. Here is a GHCi session:

```
*Main> -- take the first 4 sets
*Main> take 4 sets
[[(0 % 1,1 % 1)],[(0 % 1,1 % 3),(2 % 3,1 % 1)],
[(0 % 1,1 % 9),(2 % 9,1 % 3),(2 % 3,7 % 9),(8 % 9,1 % 1)],
[(0 % 1,1 % 27),(2 % 27,1 % 9),(2 % 9,7 % 27),(8 % 27,1 % 3),
(2 % 3,19 % 27),(20 % 27,7 % 9),(8 % 9,25 % 27),(26 % 27,1 % 1)]]
*Main> -- compute the measure of C_0
*Main> sum $ map measure (sets !! 0)
1 % 1
*Main> -- compute the measure of C_1
*Main> sum $ map measure (sets !! 1)
2 % 3
*Main> -- compute the measure of C_2
*Main> sum $ map measure (sets !! 2)
4 % 9
*Main> -- compute the measure of C_3
*Main> sum $ map measure (sets !! 3)
8 % 27
```

This is arguably the most useless Haskell script ever written.

__________

References

[1] Charles Chapman Pugh, Real Mathematical Analysis, Springer-Verlag, New York, 2002.