## Archive for March, 2012

### Fair play

March 31, 2012

Last February’s Ponder This challenge was the following:

To decide who would play first in a human vs. computer game, the human suggested to flip a coin it produced from its pocket, stating “heads you win, tails I win.” The computer replied, “Humans are very devious. I know that your coin is not fair, and that the probability of heads on that coin is slightly less than one half. To make this perfectly fair, flip the coin repeatedly, and I win if the coin first shows heads on toss number 1, 14, 15, 19, 20, or 23. You win if it first shows heads on any other toss.”

If the computer is correct and $p$ is the probability that the coin shows heads, give the best rational approximation to $p$ with a denominator having fewer than 10 digits.

Here is the solution.

__________

My solution #1:

Let $\mathbb{P} (H) = p$ be the probability that the coin will land with the heads side up. If the coin is biased in favor of the human player, then $p < 1/2$. The probability that the coin toss will first produce “heads” on the $n$-th trial (where $n \geq 1$) is $(1-p)^{n-1} p$. Let us define $\mathcal{I} := \{1, 14, 15, 19, 20, 23\}$. The probability that the computer will win is thus

$\displaystyle \sum_{n \in \mathcal{I}} (1-p)^{n-1} p$.

We now assume that the computer knows the exact value of $p$ and, thus, is able to compute its probability of winning. Since the computer is interested in making this game “perfectly fair”, then his probability of winning should equal the human’s probability of winning, i.e., it should equal $1/2$. Hence, we obtain the equation

$p + (1-p)^{13} p + (1-p)^{14} p + (1-p)^{18} p + (1-p)^{19} p +(1-p)^{22} p = \frac{1}{2}$

which we can solve using the following Python + SymPy script:

from sympy import *

# create symbolic variable
p = Symbol('p')

# solve polyomial equation
f = p + p*(1-p)**13+p*(1-p)**14+p*(1-p)**18+p*(1-p)**19+p*(1-p)**22
Solutions = solve(f-0.5,p)

# find (purely) real solutions
RealSols = filter(lambda z: im(z)==0, Solutions)

# print solutions
print "\nAll solutions:"
for sol in Solutions:
print sol

# print real solutions
print "\nReal solutions:"
for sol in RealSols:
print sol


This script outputs the following list of solutions:

All solutions:
0.334938061741437 - 1.03291350457235*I
0.334938061741437 + 1.03291350457235*I
1.97217279050061 + 0.143849563303172*I
1.50662666735084 - 0.797180194029411*I
1.92419089868505 + 0.454390584787364*I
0.0780115260218319 + 0.464383156442821*I
0.00197767511752256 - 0.101887310197153*I
1.27721993255011 + 0.951650156864179*I
1.27721993255184 - 0.951650156863701*I
0.247036637335883 - 0.624592516289942*I
0.625456509765409 - 0.845921918004056*I
0.0780115260218319 - 0.464383156442821*I
0.00197767511752256 + 0.101887310197153*I
0.991590970537258 - 0.923668626671437*I
1.79082570892033 - 0.599036471538201*I
1.92419089870934 - 0.454390584808249*I
1.79082570891289 + 0.599036471546724*I
0.499905242928390
0.6254565097654 + 0.845921918004058*I
0.991590970537348 + 0.923668626671824*I
0.247036637335883 + 0.624592516289942*I
1.50662666734992 + 0.797180194030292*I
1.97217279048996 - 0.14384956326771*I

Real solutions:
0.499905242928390

Hence, we have that the probability of heads is

$p = 0.499905242928390$.

We can convert this number to a ratio of integers in Haskell using function approxRational in the library Data.Ratio:

import Data.Ratio

-- probability of heads
p = 0.499905242928390

-- create (infinite) list of epsilons
epsilons = map (10**) [-1,-2..]

-- create (infinite) list of ratios
ratios = map (approxRational p) epsilons

-- list of ratios whose denominators have less than 10 digits
ratios' = takeWhile (\x -> denominator x < 10^10) ratios

Running the script above in GHCi, we obtain a list of ratios:

*Main Data.Ratio> ratios'
[1 % 2,1 % 2,1 % 2,1 % 2,2386 % 4773,2611 % 5223,2636 % 5273,
2638 % 5277,13189 % 26383,44843 % 89703,166183 % 332429,
393036 % 786221,1405961 % 2812455,6243733 % 12489833,
14906352 % 29818355,69693988 % 139414397]

and we finally obtain the following rational approximation

$p \approx \displaystyle\frac{69693988}{139414397}$

whose denominator has nine digits. According to the webmaster, the rational approximation I presented above is not the best one. Hence, let us now try to obtain a better one.

__________

My solution #2:

Most likely, the reason why I did not obtain the best rational approximation is that the real solution I obtained was not precise enough. Hence, let us use the following MATLAB command to solve the polynomial equation:

solve('p + p*(1-p)^13+p*(1-p)^14+p*(1-p)^18+p*(1-p)^19+p*(1-p)^22 = 0.5')


This command produces the following list of solutions:

ans =

.19776751175225606651411342151624e-2-.10188731019715301414082803578298*i
.19776751175225606651411342151624e-2+.10188731019715301414082803578298*i
.78011526021831885333432975572559e-1-.46438315644282071005250209738747*i
.78011526021831885333432975572559e-1+.46438315644282071005250209738747*i
.24703663733588276238861533742100-.62459251628994218281562354280746*i
.24703663733588276238861533742100+.62459251628994218281562354280746*i
.33493806174143698737516495548579-1.0329135045723528674007841220342*i
.33493806174143698737516495548579+1.0329135045723528674007841220342*i
.49990524292838950452962194522516
.62545650976540488339104798827398-.84592191800405765193777444233924*i
.62545650976540488339104798827398+.84592191800405765193777444233924*i
.99159097053730697830960408610308-.92366862667162608535284662817351*i
.99159097053730697830960408610308+.92366862667162608535284662817351*i
1.2772199325509563050241902871670-.95165015686389448554291677056666*i
1.2772199325509563050241902871670+.95165015686389448554291677056666*i
1.5066266673486985232104204448828-.79718019402868789409470957216463*i
1.5066266673486985232104204448828+.79718019402868789409470957216463*i
1.7908257089204728277307542114304-.59903647154154201043833983995659*i
1.7908257089204728277307542114304+.59903647154154201043833983995659*i
1.9241908987061350990877558129099-.45439058479360516453156947604463*i
1.9241908987061350990877558129099+.45439058479360516453156947604463*i
1.9721727904901564352190617939256-.14384956327517564484908725760533*i
1.9721727904901564352190617939256+.14384956327517564484908725760533*i

We now obtain a much more precise real solution

$p =0.49990524292838950452962194522516$.

We modify the Haskell script, using the new value of $p$ and a much finer list of epsilons (the precisions):

import Data.Ratio

-- probability of heads
p = 0.49990524292838950452962194522516

-- create (infinite) list of epsilons
epsilons = map ((1.005)**) [-1,-2..]

-- create (infinite) list of ratios
ratios = map (approxRational p) epsilons

-- list of ratios whose denominators have less than 10 digits
ratios' = takeWhile (\x -> denominator x < 10^10) ratios

Which yields the following rational approximation

$p \approx \displaystyle\frac{61031369}{122085875}$

whose denominator has nine digits. Although this rational approximation does not match the one in the official solution, I still got my name on the “people who answered correctly” list. Surprisingly, the value of $p$ which I obtained using MATLAB is quite different from the value of $p$ used by the webmaster

$p = 0.499905242928506776611191007553$.

Which value of $p$ is “more correct”? I do not know. Do you?

### Deciding the bijectivity of finite functions

March 31, 2012

Consider a function $f : \mathcal{X} \to \mathcal{Y}$. Function $f$ is bijective if and only if it is both injective and surjective. We say that $f$ is a finite function if and only if both $\mathcal{X}$ and $\mathcal{Y}$ are finite sets.

In this post, we will restrict our attention to finite functions. For finite functions, we already know how to decide injectivity and surjectivity. Hence, deciding the bijectivity of finite functions is now trivial. In Haskell, we create the following three predicates:

isInjective :: (Eq a, Eq b) => (a -> b) -> [a] -> Bool
isInjective f xs = all phi [(x1,x2) | x1 <- xs, x2 <- xs]
where phi (x1,x2) = (f x1 /= f x2) || (x1 == x2)

isSurjective :: Eq b => (a -> b) -> [a] -> [b] -> Bool
isSurjective f xs ys = all psi ys
where psi y = or [(f x == y) | x <- xs]

isBijective :: (Eq a, Eq b) => (a -> b) -> [a] -> [b] -> Bool
isBijective f xs ys = (isInjective f xs) && (isSurjective f xs ys)

Note that predicate isInjective takes as inputs a function and a list (the domain), whereas predicate isSurjective takes as inputs a function and two lists (the domain and the codomain). Both return either True or False. Predicate isBijective is defined using the conjunction of the other two predicates. In this implementation, we separated the decision procedure from the definition of $f$.

__________

Example #1

Let us define $\mathbb{Z}_4 := \{0, 1, 2, 3\}$. Consider the finite function

$\begin{array}{rl} f : \mathbb{Z}_4 &\to \mathbb{Z}_4\\ n &\mapsto 3 n \pmod{4}\end{array}$

where $\pmod{4}$ denotes modulo 4 arithmetic. We know that this function is both injective and surjective. Hence, it is bijective. Here’s a GHCi session (after running the Haskell script above):

*Main> let f n = (3 * n) mod 4
*Main> isInjective f [0..3]
True
*Main> isSurjective f [0..3] [0..3]
True
*Main> isBijective f [0..3] [0..3]
True

__________

Example #2

Consider now the following finite function

$\begin{array}{rl} g : \mathbb{Z}_4 &\to \mathbb{Z}_4\\ n &\mapsto 2 n \pmod{4}\end{array}$

We already know that function $g$ is neither injective nor surjective. Hence, it is not bijective. Here’s a GHCi session:

*Main> let g n = (2 * n) mod 4
*Main> isInjective g [0..3]
False
*Main> isSurjective g [0..3] [0..3]
False
*Main> isBijective g [0..3] [0..3]
False

### Deciding the surjectivity of finite functions

March 19, 2012

Consider a function $f : \mathcal{X} \to \mathcal{Y}$. We say that $f$ is a finite function if and only if both $\mathcal{X}$ and $\mathcal{Y}$ are finite sets. As mentioned two weeks ago, function $f$ is surjective if and only if

$\forall y \exists x \left( f (x) = y \right)$

where $x$ and $y$ range over sets $\mathcal{X}$ and $\mathcal{Y}$, respectively. We now introduce the surjectivity predicate $\varphi : \mathcal{X} \times \mathcal{Y} \to \{\text{True}, \text{False}\}$, defined by $\varphi (x, y) =\left( f (x) = y\right)$ so that function $f$ is surjective if and only if $\forall y \exists x \, \varphi (x, y)$ evaluates to $\text{True}$.

Let $m := |\mathcal{X}|$ and $n := |\mathcal{Y}|$ denote the cardinalities of sets $\mathcal{X}$ and $\mathcal{Y}$, respectively. Deciding $\forall y \exists x \, \varphi (x, y)$ should then require a total of $m n$ evaluations of predicate $\varphi$. Let us write $\mathcal{X} = \{x_1, x_2, \dots, x_m\}$ and $\mathcal{Y} = \{y_1, y_2, \dots, y_n\}$, and introduce $\Phi \in \{\text{True}, \text{False}\}^{m \times n}$, a Boolean matrix defined as follows

$\Phi = \left[\begin{array}{cccc} \varphi (x_1,y_1) & \varphi (x_1,y_2) & \ldots & \varphi (x_1,y_n)\\ \varphi (x_2,y_1) & \varphi (x_2,y_2) & \ldots & \varphi (x_2,y_n)\\ \vdots & \vdots & \ddots & \vdots\\ \varphi (x_m,y_1) & \varphi (x_m,y_2) & \ldots & \varphi (x_m,y_n)\\\end{array}\right]$

Stating that $f$ is surjective, i.e., $\forall y \exists x \, \varphi (x, y)$ evaluates to $\text{True}$, is equivalent to saying that every column of matrix $\Phi$ contains at least one entry that evaluates to $\text{True}$. We now introduce a new predicate, $\psi : \mathcal{Y} \to \{\text{True}, \text{False}\}$, defined by

$\psi (y) = \exists x \, \varphi (x,y)$

so that $\forall y \exists x \, \varphi (x, y)$ can be rewritten in the form $\forall y \, \psi (y)$, where $y$ ranges over set $\mathcal{Y}$. Note that $\psi (y)$ can be viewed as the disjunction of the $m$ atoms $\varphi (x_1, y), \varphi (x_2, y), \dots, \varphi (x_m, y)$, i.e.,

$\psi (y) = \displaystyle\bigvee_{i =1}^m \varphi (x_i, y)$.

Note also that $\forall y \, \psi (y)$ can be viewed as the conjunction of the $n$ atoms $\psi (y_1), \psi (y_2), \dots, \psi (y_n)$. Thus, we can decide surjectivity using the following procedure:

1. Compute $\psi (y) = \exists x \, \varphi (x,y)$ for all $y \in \mathcal{Y}$.
2. Compute the conjunction $\bigwedge_{j=1}^n \psi (y_j)$. If its truth value is $\text{True}$, then the function is surjective.

Let us now consider two simple examples and implement this decision procedure in Haskell.

__________

Example #1

Let us define $\mathbb{Z}_4 := \{0, 1, 2, 3\}$. Consider the finite function

$\begin{array}{rl} f : \mathbb{Z}_4 &\to \mathbb{Z}_4\\ n &\mapsto 3 n \pmod{4}\end{array}$

where $\pmod{4}$ denotes modulo 4 arithmetic. Note that $f (0) = 0$, $f (1) = 3$, $f (2) = 2$, and $f (3) = 1$. Hence, we easily conclude that $f$ is both injective and surjective (and, thus, is bijective). In Haskell:

Prelude> let f n = (3 * n) mod 4
Prelude> let xs = [0..3]
Prelude> let ys = [0..3]
Prelude> let psi y = or [(f x == y) | x <- xs]
Prelude> all psi ys
True

which allows us to conclude that $f$ is surjective. Note that we used function or to perform disjunction, and function all to perform universal quantification.

__________

Example #2

Consider now the following finite function

$\begin{array}{rl} g : \mathbb{Z}_4 &\to \mathbb{Z}_4\\ n &\mapsto 2 n \pmod{4}\end{array}$

Note that $g (0) = g (2) = 0$ and $g (1) = g (3) = 2$. Hence, function $g$ is neither injective nor surjective. In Haskell, we have the following:

Prelude> let g n = (2 * n) mod 4
Prelude> let xs = [0..3]
Prelude> let ys = [0..3]
Prelude> let psi y = or [(g x == y) | x <- xs]
Prelude> all psi ys
False

which allows us to conclude that $g$ is not surjective. Why? The following code returns a list of lists of Booleans where each sublist is a column of matrix $\Phi$ (where each entry is $\Phi_{ij} = (g(x_i) = y_j)$)

Prelude> [[(g x == y) | x <- xs] | y <- ys]
[[True,False,True,False],[False,False,False,False],
[False,True,False,True],[False,False,False,False]]

Note that the 2nd and 4th sublists contain only $\text{False}$ truth values. Hence, it is not the case that every column of matrix $\Phi$ contains at least one entry that evaluates to $\text{True}$. Thus, $g$ is not surjective.

### Manin on abbreviated notation

March 17, 2012

Yuri Manin (Юрий Манин) on abbreviated notation:

If written down, most of the interesting expressions and texts in a formal language either would be physically extremely long, or else would be psychologically difficult to decipher and learn in an acceptable amount of time, or both.

They are therefore replaced by “abbreviated notation” (which can sometimes turn out to be physically longer). The expression “$x x x x x x$” can be briefly written “$x \dots x$ (six times)” or “$x^6$.” The expression “$\forall z \left( z \in x \Leftrightarrow z \in y\right)$” can be briefly written “$x = y$.” Abbreviated notation can also be a way of denoting any expression of a definite type, not only a single such expression; (any expression $101010 \dots 10$ can be briefly written “the sequence of length $2 n$ with ones in odd places and zeros in even places” or “the binary expansion of $\frac{2}{3} (4^n - 1)$.”)

Ever since our tradition started with Vieta, Descartes, and Leibniz, abbreviated notation has served as an inexhaustible source of inspiration and errors. There is no sense in, or possibility of, trying to systematize its devices; they bear the indelible imprint of the fashion and spirit of the times, the artistry and pedantry of the authors. The symbols $\sum$, $\int$, $\in$ are classical models worthy of imitation. Frege’s notation, now forgotten, for $P$ and $Q$ (…) shows what should be avoided.

Translated from the original Russian by Neal Koblitz.

__________

Source:

Yuri Manin, A Course in Mathematical Logic, Springer, 1977.

### Deciding the injectivity of finite functions II

March 17, 2012

Last week we considered the following finite function

$\begin{array}{rl} f : \mathbb{Z}_4 &\to \mathbb{Z}_4\\ n &\mapsto 3 n \pmod{4}\end{array}$

where $\mathbb{Z}_4 := \{0, 1, 2, 3\}$ and $\pmod{4}$ denotes modulo 4 arithmetic. We know that function $f$ is injective if and only if

$\forall x_1 \forall x_2 \left( f (x_1) = f (x_2) \implies x_1 = x_2\right)$

where $x_1$ and $x_2$ range over set $\mathbb{Z}_4$. We introduce the injectivity predicate $\varphi : \mathbb{Z}_4 \times \mathbb{Z}_4 \to \{\text{True}, \text{False}\}$, defined by

$\varphi (x_1, x_2) =\left( f (x_1) = f (x_2) \implies x_1 = x_2\right)$

so that function $f$ is injective if and only if the universally-quantified sentence $\forall x_1 \forall x_2 \, \varphi (x_1, x_2)$ evaluates to $\text{True}$.

__________

A naive decision procedure

We can determine the truth value of $\forall x_1 \forall x_2 \, \varphi (x_1, x_2)$ by evaluating the predicate $\varphi$ at all $(x_1, x_2) \in \mathbb{Z}_4 \times \mathbb{Z}_4$, which will require a total of $|\mathbb{Z}_4|^2 = 4^2 = 16$ evaluations. We implemented this decision procedure in Haskell last week. Here’s a rather brief GHCi session:

Prelude> let f n = (3 * n) mod 4
Prelude> let phi (x1,x2) = (f x1 /= f x2) || (x1 == x2)
Prelude> all phi [(x1,x2) | x1 <- [0..3], x2 <- [0..3]]
True

which allows us to conclude that $f$ is injective. Note that we used the following equivalence

$\left( f (x_1) = f (x_2) \implies x_1 = x_2\right) \equiv \left( f (x_1) \neq f (x_2) \lor x_1 = x_2\right)$

to implement $\varphi$. Can we do better than this? In fact, we can.

__________

A better decision procedure

We introduce matrix $\Phi \in \{\text{True}, \text{False}\}^{4 \times 4}$, defined by

$\Phi = \left[\begin{array}{cccc} \varphi (0,0) & \varphi (0,1) & \varphi (0,2) & \varphi (0,3)\\ \varphi (1,0) & \varphi (1,1) & \varphi (1,2) & \varphi (1,3)\\ \varphi (2,0) & \varphi (2,1) & \varphi (2,2) & \varphi (2,3)\\ \varphi (3,0) & \varphi (3,1) & \varphi (3,2) & \varphi (3,3)\\\end{array}\right]$

Stating that $f$ is injective, i.e., $\forall x_1 \forall x_2 \, \varphi (x_1, x_2)$ evaluates to $\text{True}$, is equivalent to saying that all the $4^2 = 16$ entries of matrix $\Phi$ evaluate to $\text{True}$. Note, however, that the four entries on the main diagonal of $\Phi$ will always evaluate to $\text{True}$, as the implication

$f (x) = f (x) \implies x = x$

evaluates to $\text{True}$ for every $x$. Moreover, from the equivalence

$\left( f (x_1) = f (x_2) \implies x_1 = x_2\right) \equiv \left( f (x_2) = f (x_1) \implies x_2 = x_1\right)$

we can conclude that $\varphi (x_1, x_2) = \varphi (x_2, x_1)$ for all $x_1$ and $x_2$, which tells us that matrix $\Phi$ is symmetric (i.e., $\Phi = \Phi^T$). Therefore, we do not need to evaluate all entries of matrix $\Phi$, only the ones above the main diagonal, as illustrated below

$\left[\begin{array}{cccc} \ast & \varphi (0,1) & \varphi (0,2) & \varphi (0,3)\\ \ast & \ast & \varphi (1,2) & \varphi (1,3)\\ \ast & \ast & \ast & \varphi (2,3)\\ \ast & \ast & \ast & \ast\\\end{array}\right]$

where the symbol $\ast$ denotes “don’t care”, i.e., entries we do not need to evaluate. Hence, instead of evaluating the predicate $\varphi$ a total of $4^2 = 16$ times, we now only need to evaluate it ${4 \choose 2} = 6$ times, which is $37.5 \%$ of the original total. We can generate all pairs $(x_1, x_2)$ with $x_2 > x_1$ using the following Haskell code:

Prelude> [(x1,x2) | x1 <- [0..3], x2 <- [(x1+1)..3]]
[(0,1),(0,2),(0,3),(1,2),(1,3),(2,3)]

We thus implement a more efficient decision procedure as follows:

Prelude> let f n = (3 * n) mod 4
Prelude> let phi (x1,x2) = (f x1 /= f x2) || (x1 == x2)
Prelude> all phi [(x1,x2) | x1 <- [0..3], x2 <- [(x1+1)..3]]
True

which requires less than half of the number of evaluations required by the previous (naive) procedure.

__________

Conclusions

Given a finite function $f : \mathcal{X} \to \mathcal{Y}$, we introduce the injectivity predicate $\varphi : \mathcal{X} \times \mathcal{X} \to \{\text{True}, \text{False}\}$, defined by

$\varphi (x_1, x_2) =\left( f (x_1) = f (x_2) \implies x_1 = x_2\right)$

so that (finite) function $f$ is injective if and only if $\forall x_1 \forall x_2 \, \varphi (x_1, x_2)$ evaluates to $\text{True}$. Let $n := |\mathcal{X}|$ denote the cardinality of set $\mathcal{X}$. Deciding $\forall x_1 \forall x_2 \, \varphi (x_1, x_2)$ using the naive procedure will require a total of $n^2$  evaluations of the predicate $\varphi$. Since $\varphi (x,x) = \text{True}$ for every $x$, and taking into account that $\varphi (x_1,x_2) = \varphi (x_2,x_1)$, using the improved procedure we can decide injectivity at a cost of only ${n \choose 2} = \frac{n (n-1)}{2}$ evaluations of the predicate $\varphi$. As $n$ approaches infinity, the ratio ${n \choose 2} / n^2$ approaches $1/2$. Hence, for “large” problems, the improved procedure will require approximately half the number of evaluations required by the naive procedure.