Monday, June 28, 2010

Dembski haplessly admits to plagiarism?

In “Efficient Per Query Information Extraction from a Hamming Oracle,” Winston Ewert, George Montañez, William A. Dembski, and Robert J. Marks II do not cite the sources of the algorithms they analyze. Yet Dembski trumpeted at Uncommon Descent, “New Peer-Reviewed ID Paper — Deconstructing the Dawkins WEASEL.”

The algorithms (A) and (B) are implemented, respectively, by the programs WEASEL1 and WEASEL2 that Dembski said were supplied to him by the pseudonymous “Oxfordensis.” Dembski announced, “[W]e shall… henceforward treat the programs below as the originals,” i.e., as those used by Richard Dawkins. Any way you slice it, the programs and the algorithms they implement are due to someone other than Ewert et alia.

If you abstract an algorithm from a program, you are ethically obligated to cite the program, just as you are ethically obligated to cite a book from which you get an idea, even if you do not copy words from the book. Algorithms are intellectual property, and are in fact patentable.

Has Dembski not tagged himself and his colleagues as plagiarists?

Willie can’t stop whipping the Weasel

William Dembski recently projected his obsession with the Weasel program onto evolutionists. His rationalization? Origin-of-life researcher Michael Yarus reuses the 1986 pop-sci illustration of evolution in a new pop-sci book, Life from an RNA World.

Dembski has misunderstood Richard Dawkins’ description of the program for at least a decade. He and Bob Marks falsely attributed partitioned search — not only the algorithm, but also the term — to Dawkins in a September 2009 article. As soon as the article appeared, he vested faith in two programs implementing different algorithms, explaining vaguely that the pseudonymous “Oxfordensis” supplied them. The story goes that Dawkins used one in preparing The Blind Watchmaker, and the other as a demo in a television show. Dawkins no longer has his code, and does not recognize the apocrypha. Dembski announced that he would pin it on Dawkins anyway.

The kicker is that the programs share a bug. They attempt to mutate exactly one character in each offspring phrase, but instead create a perfect copy of the parent in 1 of 27 attempts, on average. The implemented mutation rule is bizarre, with relevance neither to biology nor to engineering. Yet Dembski, Marks, and IDCist cubs recently published analyses of algorithms abstracted from the buggy programs. (Of course, they went once again to the IEEE Systems, Man, and Cybernetics Society, where reviewers are unlikely to know much about evolutionary algorithms.)

Now the clown-scholar Willie has the temerity to write,

    Some internet critics have urged that we are beating a dead [weasel], that this example was never meant to be taken too seriously, and that if we were “serious scientists,” we would be directing our energies elsewhere. Let me suggest that these critics take up their concerns with Yarus.

Ironically, Yarus explains that pedagogy trumps biology:

    To fully appreciate this particular [example], you must be aware that it is an idealization, not a realistic model of evolution. But its power comes from its precise aim — directly at the heart of the difficulty many people have with the concept of evolution.

Dembski’s subconscious no doubt screams, “Deny! Deny! Deny!”

    Yarus sees this simulation as underwriting the power of evolutionary processes.

Underwriting? Yarus says that it “should give intelligent designers (and the rest of us) a reflective moment or two.”

Poor Willie. Imagine how conflicted he must be, to have a “serious scientist” like Yarus invoke his name, but put the Weasel in its place.

No more exegesis of apocrypha

Rob Knight kindly supplied me with the code of what Yarus calls the “Dobzhansky program.” Rob and I agree that it implements a (1, λ) evolutionary algorithm (EA). In each generation, one parent phrase reproduces λ = 200 times. The characters in the offspring mutate independently at a rate of .0096. One offspring of maximal fitness survives as the parent of the next generation, and all other phrases “die.” This matches Wikipedia’s algorithm, except that the mutation rate is lower. The consequence of the lower rate is that the program performs well with phrases of length considerably greater than 100.
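
For concreteness, here is a minimal Python sketch of a (1, 200) EA along these lines. It is my illustration, not the Dobzhansky program itself; the target phrase, the alphabet, and the detail that a “mutation” may redraw the same character are assumptions on my part.

    import random

    # Sketch of a (1, 200) EA: one parent, 200 offspring per generation,
    # per-character mutation rate 0.0096, fittest offspring survives.
    ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "
    TARGET = "METHINKS IT IS LIKE A WEASEL"
    LAMBDA, MU = 200, 0.0096

    def fitness(phrase):
        # Number of positions agreeing with the target.
        return sum(a == b for a, b in zip(phrase, TARGET))

    def mutate(parent):
        # Each character mutates independently at rate MU.
        return "".join(random.choice(ALPHABET) if random.random() < MU else c
                       for c in parent)

    parent = "".join(random.choice(ALPHABET) for _ in TARGET)
    generations = 0
    while fitness(parent) < len(TARGET):
        # The parent "dies": only the fittest of its offspring survives
        # (the comma in "(1, 200)").
        parent = max((mutate(parent) for _ in range(LAMBDA)), key=fitness)
        generations += 1
    print(generations, "generations")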

No more blogorrheaic Mullings

The (1, λ) EA is distinguished from the (1 + λ) EA, in which the parent competes with offspring for survival. With the PLUS algorithm, it is impossible for parental fitness to decrease from one generation to the next. Under certain conditions, it is nearly impossible for the COMMA algorithm. Some elementary calculations — not prolix, overwrought exhortations to the beloved onlookers [inside joke] — make this clear. For generality, let’s refer to the phrase length as L and the mutation rate as μ.

The probability of no mutation in a particular position of an offspring phrase is 1 − μ. The probability of no mutation in all L positions, i.e., no difference from the parent, is (1 − μ)^L. Subtract this quantity from 1, and you get the probability that the offspring differs from the parent,

m = 1 − (1 − μ)^L.

The expected number of mutants among the λ offspring of a generation is λm, and the probability that all of the offspring are mutants is m^λ.

For Yarus’ illustration, with L = 63 and μ ≈ .0096, m ≈ .4554. Of the λ = 200 offspring in a generation, only λm ≈ 91 are mutants, on average. The probability of generating 0, rather than the expected 109, copies of the parent is

m^λ ≈ .4554^200 ≈ 10^−68.

It follows that the (1, 200) EA performs identically to the (1 + 200) EA with overwhelming probability. Even for a phrase of length 300, which the Dobzhansky program will converge upon poorly — the ideal mutation rate is about 1 / 300, and μ is almost 3 times too large — only 1 in 100 thousand generations lacks a copy of the parent.
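
These figures are easy to check with a few lines of Python, using only the numbers given above:

    from math import log10

    def report(L, mu, lam=200):
        m = 1 - (1 - mu) ** L    # P(an offspring differs from its parent)
        print(f"L={L}: m = {m:.4f}, expected mutants = {lam * m:.0f}, "
              f"P(all {lam} offspring are mutants) = 10^{lam * log10(m):.0f}")

    report(63, 0.0096)     # m = 0.4554, 91 mutants, probability 10^-68
    report(300, 0.0096)    # probability 10^-5: 1 in 100 thousand generations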

Profligate evolution

The computational waste is striking. But I believe that it is entirely appropriate to illustrate rapid evolution on a generational basis, i.e., assuming simultaneous evaluation of offspring. It is progress on the generational time scale that matters in biological evolution. And we know that biological evolution is profligate in its own ways. Dembski, Marks, and cubs focus on how many fitness evaluations, rather than how many generations, are required to obtain the fittest phrase. They essentially force a sequential model onto a process characterized by parallelism.

That’s not a feature — it’s a bug

The apocryphal TV demo almost implements what theorists of evolutionary computation call randomized local search (RLS). You can think of RLS as a (1 + 1) EA modified to ensure that there is exactly one mutation in each offspring. This eliminates evaluations of copies of the parent (and also multiply mutated offspring).
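
Here is a minimal sketch of RLS on the Weasel problem, with exactly one character changed in each offspring. The alphabet, target, and character-match fitness are conventional choices assumed for illustration; this is not the apocryphal program itself.

    import random

    ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "
    TARGET = "METHINKS IT IS LIKE A WEASEL"

    def fitness(phrase):
        return sum(a == b for a, b in zip(phrase, TARGET))

    # Randomized local search: a (1 + 1) EA with a guaranteed single mutation.
    parent = [random.choice(ALPHABET) for _ in TARGET]
    while fitness(parent) < len(TARGET):
        child = parent[:]
        i = random.randrange(len(child))
        # Draw from the alphabet minus the current character: a sure mutation.
        child[i] = random.choice([c for c in ALPHABET if c != child[i]])
        if fitness(child) >= fitness(parent):
            parent = child    # the "+" in (1 + 1): parent competes to survive
    print("".join(parent))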

The apocryphal TV demo does not guarantee a mutation of the character in a randomly selected position; instead, it assigns a randomly drawn character to the position.

Right:  xi = random.choice([c for c in ALPHABET if c != xi])
Wrong:  xi = random.choice(ALPHABET)
The incorrect version fails with probability 1 / N, where N is the size of the alphabet. The de facto mutation rule is very weird: With probability (N - 1) / N, mutate a randomly selected position of the phrase. Ewert, Montañez, Dembski, and Marks consider alphabet sizes ranging from 1 to 100. For the important case of a binary alphabet, 1 in 2 offspring is a copy of its parent, and the bug doubles the expected running time. All the authors can say is that they’re analyzing an error they suspect Dawkins of making a quarter-century ago.
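
The failure rate is easy to confirm empirically. Here is a sketch of the buggy rule (my code, not Oxfordensis’s), with the copy rate estimated by sampling:

    import random

    # Under the wrong rule, redrawing one character from the full alphabet
    # leaves the phrase unchanged with probability 1/N.
    def buggy_mutate(phrase, alphabet):
        i = random.randrange(len(phrase))
        return phrase[:i] + random.choice(alphabet) + phrase[i + 1:]

    TRIALS = 100_000
    for alphabet in ("01", "ABCDEFGHIJKLMNOPQRSTUVWXYZ "):
        parent = "".join(random.choice(alphabet) for _ in range(28))
        copies = sum(buggy_mutate(parent, alphabet) == parent
                     for _ in range(TRIALS))
        print(f"N = {len(alphabet):2d}: copy rate = {copies / TRIALS:.3f}")
        # Expect about 0.500 for N = 2, and about 0.037 for N = 27.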

The apocryphal TBW program implements a (1, 100) EA modified to use the weird mutation rule. As I just explained, the probability of mutation in an offspring phrase is m = (N - 1) / N. Here, for several interesting alphabet sizes N, are the expected numbers of mutants in a generation (100m) and the probabilities that all offspring in a generation are mutants (m^100).


  N    100m    m^100
  2    50.0    1 / 10^30
 27    96.3    1 / 44
100    99.0    1 / 3
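
The entries follow directly from m = (N - 1) / N, as a quick check confirms:

    # Verifying the table above.
    for N in (2, 27, 100):
        m = (N - 1) / N
        print(f"N = {N:3d}: 100m = {100 * m:.1f}, m^100 = 1 / {1 / m**100:.3g}")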

It is simply crazy for alphabet size to govern mutation as it does here. For the Weasel-Dobzhansky case of N = 27, about 1 in 44 generations has no copy of the parent. Convergence usually requires more than 44 generations. So, with irony that can be appreciated only by folks who follow Uncommon Descent...
There is no latching or ratcheting or locking of any sort in the program that Dawkins putatively used in preparing The Blind Watchmaker.
And I repeat that there is no scholarly value in formal analysis of idiosyncrasies arising from faulty implementation of mutation. If Dembski, Marks, and cubs want to investigate two buggy programs as artifacts of possible “historical significance,” let them openly tell the story of Oxfordensis.