Bounded Science: July 2014

[Edit: This post essentially does right what Ewert, Dembski, and Marks did wrong. I’ve since shown that the theorem follows from a very simple result in probability.]

In what appears to be a forthcoming journal article [published version], ID creationists Winston Ewert, William Dembski, and Robert J. Marks II state a formal result, and justify it merely by citing a three-page paper in the proceedings of a symposium. I was already annoyed with the reviewers and the editors for missing a huge defect, and decided to while away the sleepless hours by checking the putative proof in On the Improbability of Algorithmic Specified Complexity.

The formalism and the argument are atrocious. I eventually decided that it would be easier to reformulate what I thought the authors were trying to say, and to see if I could generate my own proof, than to penetrate the slop. It took me about 20 minutes, working directly in LaTeX. Then I decided to provide some explanation that is missing in the paper.

The theorem is correct. As Ewert, Marks, and Dembski put it, "The probability of obtaining an object exhibiting $\alpha$ bits of [algorithmic specified complexity] is less than or equal to $2^{-\alpha}$." It is something that they can establish a property like this. But algorithmic specified complexity is a sum of bits of Shannon self-information and bits of Kolmogorov complexity, which seem like apples and oranges to me.

I should mention that the result probably comes from Ewert's doctoral dissertation, Algorithmic Specified Complexity, still under wraps at Baylor. Evidently Dembski, a mathematician, did not edit the paper.

The following makes more sense if you read Sections I and II of the paper. Those of you with a bit of math under your belts will be amazed by the difference.

$\newcommand{\suppmu}{\mathrm{supp}(\mu)} \newcommand{\B}{\mathcal{B}} \newcommand{\X}{\mathcal{X}} \renewcommand{\S}{\mathcal{S}} \newcommand{\P}{\mathcal{P}} \newcommand{\Bprime}{\mathcal{B}^\prime}$The set of all strings (finite sequences) on $\B = \{0, 1\}$ is $\B^*.$ Assume that the binary encoding $e: \S \rightarrow \B^*$ of the set $\S$ of objects of interest is 1-to-1. This allows use of the set of codewords $e(\S)$ in place of $\S.$

The set of all programs $\P$ for universal computer $U$ is a prefix-free subset of $\B^*.$ That is, no program is a proper prefix of any other. The conditional Kolmogorov complexity $K(x|y)$ is the length $\ell(p)$ of the shortest program $p$ that outputs $x$ on input of $y,$ i.e., \[ K(x|y) = \hspace{-0.3em} \min_{p : U(p, y) = x} \hspace{-0.4em} \ell(p). \] The Kraft inequality \[ \sum_{x \in \X} 2^{-\ell(x)} \leq 1 \] holds for all prefix-free sets $\X \subset \B^*,$ including the prefix-free set of programs $\P.$ It follows that for all $\X \subseteq \B^*,$ \[ \sum_{x \in \X} 2^{-\!K(x|y)} \leq \sum_{p \in P} 2^{-\ell(p)} \leq 1, \] where $y$ is a string in $\B^*.$ In the first sum, all terms correspond to distinct programs, and each exponent $-\!K(x|y)$ is the negative length of a program that outputs $x$ on input of $y$.

Theorem 1. Let $\mu$ be a probability measure on encoded objects $e(\S).$ Also let \[ X = \{x \in \suppmu \mid -\!\log_2 \mu(x) - K(x | y) \geq \alpha\}, \] where $y$ is a string in $\B^*$ and $\alpha \geq 0.$ Then $\mu(X) \leq 2^{-\alpha}.$

Proof. Rewrite the property of string $x$ in $X$ to obtain a bound on $\mu(x):$ \begin{align*} -\!\log_2 \mu(x) - K(x | y) &\geq \alpha \\ \log_2 \mu(x) + K(x | y) &\leq -\alpha \\ \log_2 \mu(x) &\leq -\alpha - K(x | y) \\ \mu(x) &\leq 2^{-\alpha - K(x | y)}. \end{align*} Applying the bound, \begin{align*} \mu(X) &= \sum_{x \in X} \mu(x) \\ &\leq \sum_{x \in X} 2^{-\alpha -K(x | y)} \\ &= \sum_{x \in X} 2^{-\alpha} \cdot 2^{-K(x | y)} \\ &= {2^{-\alpha} \sum_{x \in X} 2^{-K(x | y)}} \\ &\leq 2^{-\alpha}. \end{align*} The last step follows by the Kraft inequality. Q.E.D.

Bounded Science

Thursday, July 31, 2014

Proving a theorem of Ewert, Marks, and Dembski