Thursday, December 10, 2009

Blunder in the new Dembski-Marks paper

The recently published conference paper of Dembski and Marks, Bernoulli’s Principle of Insufficient Reason and Conservation of Information in Computer Search, contains an enormous, indisputable, and embarrassing error in the argument regarding the so-called "search for a search."

Search algorithms explore a finite space of solutions Ω. An algorithm basically selects a solution in Ω, examines the solution, and decides which solution to select next. It repeats this until it has made Q selections. It is crucial to the arguments of Dembski and Marks that search algorithms have the capacity for randomizing their decisions. To randomize a decision is to base it, at least in part, on inputs from a random source. (Think of flipping a fair coin repeatedly and feeding the sequence of outcomes into a computer program. The inputs would permit the program to randomize its decisions.)

Dembski and Marks believe that people search for search algorithms in a higher-order space Ω2. They write,
Let Ω2 be the finite space of all search algorithms on the search space Ω.
The word "finite" is wrong, wrong, and wrong. The set of all randomized search algorithms is infinite, not finite. Section III.A and the "proof" of Equation (7) in Appendix B are therefore based on a false premise.

Consider making just one randomized decision between alternative solutions ω1 and ω2 in Ω, with the probability of choosing ω1 equal to p. The set of all possible values for p is infinite, and therefore the set of all randomized algorithms for making the decision is infinite.
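To make this concrete, here is a minimal Python sketch (the names `choose` and `flip` are mine, not from the paper) of a randomized decision procedure that selects ω1 with any rational probability p = p_num/p_den, using only fair coin flips as its random source. Every distinct rational in [0, 1] yields a distinct algorithm, so already this one family of decision procedures is countably infinite.

```python
import random

def choose(p_num, p_den, flip=lambda: random.randint(0, 1)):
    """Choose omega_1 with probability p_num/p_den, omega_2 otherwise,
    using only fair coin flips supplied by `flip`.

    Rejection sampling: read enough fair bits to cover the range
    [0, p_den); if the resulting integer falls outside it, try again."""
    k = (p_den - 1).bit_length()  # bits needed to represent 0 .. p_den - 1
    while True:
        r = 0
        for _ in range(k):
            r = (r << 1) | flip()  # append one fair bit
        if r < p_den:
            return "omega_1" if r < p_num else "omega_2"
```

Note that for p_den = 3 the loop has no a-priori bound on its running time, which foreshadows the point about unbounded algorithms below.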

Some technical details

For a fixed representation of randomized algorithms (e.g., probabilistic Turing machines reading i.i.d. uniform bits from an auxiliary tape), the set of probabilities p possible in a randomized dichotomous decision is countably infinite. There is no upper bound on the size and running time of the randomized decision algorithm.
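A quick illustration of why the bound on running time matters (a sketch of my own, not anything in the paper): a decision rule that reads exactly n fair bits and then commits, with no retries, can realize only the dyadic rationals k/2^n as its probability of choosing ω1. The union over all n is countably infinite, and a probability such as 1/3 lies in none of these sets, which is why realizing it requires a loop with no fixed bound on running time.

```python
from fractions import Fraction

def dyadic_probs(n_flips):
    """Head-probabilities realizable by a decision rule that reads exactly
    n_flips fair bits and then commits (no retries): the dyadic rationals
    k / 2**n_flips for k = 0, 1, ..., 2**n_flips."""
    return {Fraction(k, 2 ** n_flips) for k in range(2 ** n_flips + 1)}
```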

The set of probabilities [0, 1] is uncountably infinite. Thus, while every randomized search algorithm implements a random search process, vanishingly few random search processes correspond to randomized search algorithms. Relatively few randomized search algorithms are fast enough and small enough to solve a problem in practice. We humans necessarily prefer small randomized search algorithms that make decisions rapidly.



  2. Oops, a mistake. But can the proof be resurrected? I.e., does it make a substantive difference?

  3. I too noticed that their higher-order space is apparently infinite, but I rationalized it in two ways:

    1) This is the first time they've characterized the higher-order space as a set of algorithms. Perhaps they don't really mean algorithms, but rather vectors of visited points, à la Wolpert & Macready. If the number of queries is bounded, then the number of unique vectors is finite. I realize this is a pretty big interpretational stretch.

    2) If the space of algorithms is infinite, even uncountably so, can't a case still be made that randomly selected algorithms will, on average, not be biased toward any particular point in the lower-order space? If so, it seems that the logic of the proof still holds, even if the math must be modified.

    The proof in Appendix B came from me, so go easy on it. :-) (It was never intended for an infinite higher-order space.)

  4. Bob O'H,

    The set of all programs for a particular discrete computer is countably infinite. There is no uniform probability measure on a countably infinite set. Dembski and Marks might get somewhere, theoretically, by moving to quantum computation. I'm not sure.
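The standard argument that no uniform probability measure exists on a countably infinite set is short:

```latex
\text{Suppose } \mu \text{ is a uniform probability measure on } \mathbb{N},
\text{ so } \mu(\{n\}) = c \text{ for every } n.
\text{ Countable additivity gives }
\mu(\mathbb{N}) = \sum_{n=0}^{\infty} c =
\begin{cases}
0 & \text{if } c = 0,\\
\infty & \text{if } c > 0,
\end{cases}
\text{ and neither equals } 1.
```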

    When we humans choose programs that fit in the storage of our computers and that make (perhaps randomized) decisions quickly, we are not exploiting problem-specific knowledge. The knowledge we're exploiting is of computation and computers in general, not problems.

    If you drew search programs uniformly at random from those that fit in the storage of your computer, most of the successes you observed would come from fast programs. That is a tautology.

  5. Robert,

    Ah, I recall when "Atom" said you were a genius, and indicated that someone wanted to acknowledge your contribution.

    It seems to me that Dembski and Marks are committed to viewing the success-failure outcome of the run of an algorithm on a particular problem as random.

    Things would work out nicely for deterministic search algorithms - not necessarily permutations. A Q-query deterministic algorithm essentially implements a height-Q decision tree. There's a finite partition of the set of all deterministic search algorithms, with each block containing algorithms that implement a particular decision tree. You can distribute probability uniformly over the blocks.
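The blocks can even be counted. Here is a small sketch of my own (the function name and parameters are mine): the number of distinct height-Q decision trees, where each internal node queries one of the candidate solutions and each possible observed value selects a subtree, satisfies a simple recurrence and is finite for any fixed search space and query budget.

```python
def count_decision_trees(num_points, num_values, depth):
    """Number of distinct height-`depth` decision trees over a search space
    of `num_points` candidate solutions, where evaluating a solution can
    return any of `num_values` values. Each internal node queries one
    point; each of the `num_values` possible observations selects a
    subtree of the next level."""
    if depth == 0:
        return 1  # a leaf: no further queries
    subtrees = count_decision_trees(num_points, num_values, depth - 1)
    # choose which point to query, then one subtree per possible outcome
    return num_points * subtrees ** num_values
```

Because this count is finite, distributing probability uniformly over the blocks is well defined, which is exactly what fails for randomized algorithms.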

    There is no finite partition of randomized search algorithms into blocks of algorithms that "do the same thing."

    I think your second point holds only for randomized algorithms making a single selection of a solution in Ω. Off the top of my head, I believe that compact algorithms generally produce sequences over Ω that are algorithmically compressible (far from Kolmogorov random). Wish I could give this more consideration, but I have some pressing business.

  6. Actually, I called Atom a genius. It's nice to be able to talk with him across the fence with no hostility in either direction.

    There is no finite partition of randomized search algorithms into blocks of algorithms that "do the same thing."

    That makes sense. Apparently Marks and Dembski didn't put much thought into the implications of defining the higher-order space as a space of randomized algorithms.

  7. There is no uniform probability measure on a countably infinite set.
    Ah, I see. Oops.

    Is the only problem the normalising constant? I'm wondering if it's possible not to worry about it, as long as something down the line has a proper probability measure. A bit like the way a uniform prior can be improper as long as the posterior is proper.

    Sorry for pushing you on this, but I'm trying to see if this is an arcane technical point, and the overall argument is correct, or if it means that the main message of the paper is wrong.

  8. Bob O'H,

    It's good to push me on the point. I don't think there's anything analogous to an improper prior, if only because undefined results don't lead to defined results later. But I also don't have a satisfying explanation. I'll see if I can get a competent mathematician to comment.