Thursday, July 29, 2010

Feeling charitable toward Baylor’s IDC cubs

The reason I come off as a nasty bastard on this blog is that I harbor quite a bit of anger toward the creationist bastards who duped me as a teenager. The earliest stage of overcoming my upbringing was the worst time of my life. I wanted to die. Consequently, I am deadly serious in my opposition to “science-done-right proves the Bible true” mythology. William A. Dembski provokes me especially with his prevarication and manipulation. He evidently believes that such behavior is moral if it serves higher ends in the “culture war.” My take is, shall we say, more traditional.

When Robert J. Marks II, Distinguished Professor of Engineering at Baylor University, and Fellow of the Institute of Electrical and Electronics Engineers (IEEE), began collaborating with Dembski, I did not rush to the conclusion that he was like Dembski. But it has become apparent that he is willing to play the system. For instance, Dembski was miraculously elevated to the rank of Senior Member of the IEEE, which only 5% of members ever reach, in the very year that he joined the organization. To be considered for elevation, a member must be nominated by a Fellow.

Although Marks was the founding president of the progenitor of the IEEE Computational Intelligence Society, which addresses evolutionary computation (EC), he and his IDCist collaborators go to the IEEE Systems, Man, and Cybernetics Society for publication. He is fully aware that reviewers there are unlikely to know much about EC, and are likely to give the benefit of the doubt to a paper bearing his name. I would love to see him impugn the integrity of his and my colleagues in the Computational Intelligence Society by claiming that they don’t review controversial work fairly. But it ain’t gonna happen.

I’ve come to see Marks as the quintessential late-career jerk, altogether too ready to claim expertise in an area he has never engaged vigorously. He is so cocksure as to publish a work of apologetics with the title Evolutionary Computation: A Perpetual Motion Machine for Design Information? (Chap. 17 of Evidence for God, M. Licona and W. A. Dembski, eds.). He states outright some misapprehensions that are implicit in his technical publications. Here’s the whopper: “A common structure in evolutionary search is an imposed fitness function, wherein the merit of a design for each set of parameters is assigned a number.” Who are you, Bob Marks, to say what is common and what is not in a literature you do not follow? Having scrutinized over a thousand papers in EC, and perused many more, I say that you are flat-out wrong. There’s usually a natural, not imposed, sense in which some solutions are better than others. Put up the references, Distinguished Professor Expert, or shut up.

Marks and coauthors cagily avoid scrutiny of their (few) EC sources by dumping on the reviewers references to entire books, i.e., with no mention of specific pages or chapters. This is because their EC veneer will not withstand a scratch. The chapter I just linked to may seem to contradict that, given its references to early work in EC by Barricelli (1962), Crosby (1967), and Bremmerman [sic] et al. (1966). [That's Hans-Joachim Bremermann.] First, note the superficiality of the references. Marks did not survey the literature to come by them. The papers appear in a collection of reprints edited by David Fogel, Evolutionary Computation: The Fossil Record (IEEE Press, 1998). Marks served as a technical editor of the volume, just as I did, and he should have cited it.

Although Marks is an electrical engineer, he has been working with two of Baylor’s graduate students in computer science, Winston Ewert and George Montañez. I would hazard a guess that there is some arrangement for the students to turn their research with Marks into master’s theses. I’ve been sitting on some errors in their most recent publication, Efficient Per Query Information Extraction from a Hamming Oracle, thinking that the IDC cubs would get what they deserved if they included the errors in their theses. Well, I’ve got a soft spot for students, and I’m feeling charitable today. But there’s no free lunch for Marks. He has no business directing research in EC, his reputation in computational intelligence notwithstanding, and I hope that the CS faculty at Baylor catch on to the fact.

On first reading the paper, I was deeply annoyed by the combination of a Chatty-Cathy, self-reference-laden introduction focusing on “oracles,” irrelevant to the majority of the paper, with a non-survey of the relevant literature in the theory of EC. Ewert et al. dump in three references to books, without discussion of their content, at the beginning of their 4-1/2 page section giving Markov-chain analyses of evolutionary algorithms. It turns out that one of the books does not treat EC at all — I contacted the author to make sure.

As I have discussed here and here, two of the algorithms they analyze are abstracted from defective Weasel programs that Dawkins supposedly used in the mid-1980s. It offends me to see these whirligigs passed off as objects worthy of analysis in the engineering literature.

Yet again, they express the so-called average active information per query as
$$I_\oplus = \frac{I_\Omega}{Q} = \frac{\log N^L}{Q} = \frac{L \log N}{Q},$$
where Q is not the simple random variable it appears to be, but is instead the expected number of trials (“queries”) a procedure requires to maximize the number of characters in a “test” string that match a “target” string. Strings are over an alphabet of size N, and are of length L. Unless you have something to hide, you write
$$I_\oplus = \frac{L \log N}{E[T]},$$
where T is the random number of trials required to obtain a perfect match of the target. This is a strange idea of an average, and it appears that a reviewer said as much. Rather than acknowledge the weirdness overtly, Ewert et al. added a cute “yeah, we know, but we do it consistently” footnote. Anyone without a prior commitment to advancing “intelligence creates active information” ideology would simply flip the fraction over to get the average number of trials per bit of endogenous information $I_\Omega$,
$$\frac{1}{I_\oplus} = E\left[\frac{T}{I_\Omega}\right] = \frac{E[T]}{L \log N}.$$
This has a clear interpretation as expected performance normalized by a measure of problem hardness. But when it’s “active information or bust,” you’re not free to take the sensible directions available to you. I have to add that I can’t make a sensible connection between average active information per query and active information. Given a bound K on the number of trials to match the target string, the active information is
$$I_+ = \log \Pr\{T \leq K\} + {L \log N}.$$
Do you see a relationship between $I_+$ and $I_\oplus$ that I’m missing?
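For concreteness, here is a small Python sketch of these quantities for a locking (“ratchet”) search against a Hamming oracle. The algorithm, alphabet, target, and seed are my own illustrative assumptions, not the paper’s setup, and logarithms are base 2.

```python
import math
import random

def locking_search(target, alphabet, rng):
    """One run of a locking search: each query resamples every unmatched
    position uniformly at random; matched positions stay fixed.
    Returns the number of queries to the Hamming oracle."""
    current = [rng.choice(alphabet) for _ in target]
    queries = 1
    while current != list(target):
        current = [c if c == t else rng.choice(alphabet)
                   for c, t in zip(current, target)]
        queries += 1
    return queries

rng = random.Random(1)
alphabet = "AB"       # N = 2
target = "A" * 8      # L = 8
runs = [locking_search(target, alphabet, rng) for _ in range(2000)]
E_T = sum(runs) / len(runs)            # estimate of E[T]

N, L = len(alphabet), len(target)
I_omega = L * math.log2(N)             # endogenous information, in bits
I_oplus = I_omega / E_T                # "average active information per query"
print(E_T, I_oplus, 1 / I_oplus)       # 1 / I_oplus: expected queries per bit
```

Note that the reciprocal falls out for free: it is just the estimated E[T] divided by the fixed quantity L log N.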

By the way, I happened upon prior work regarding the amount of information required to solve a problem. The scholarly lassitude of the IDC “maverick geniuses” glares out yet again.

On second reading, I bothered to do sanity checking of the plots. I saw immediately that the surfaces in Fig. 2 were falling off in the wrong directions. For fixed alphabet size N, the plots show the average active information per query increasing as the string length L increases, when it obviously should decrease. The problem is harder, not easier, when the target string is longer. Comparing Fig. 5 to Figs. 3 and 4, it’s easy to see that the subscripts for N and L are reversed somewhere. But what makes Fig. 3 cattywampus is not so simple. Ewert et al. plot
$$I_\oplus(L, N) = \frac{L \log N}{E[T_{N,L}]}$$
instead of
$$I_\oplus(L, N) = \frac{L \log N}{E[T_{L,N}]}.$$
That is, the matrix of expected numbers of trials to match the target string is transposed, but the matrix of endogenous information values is not.
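The symptom is easy to reproduce. In the sketch below (my own illustration, not the paper’s data), I stand in for the expected query counts with the asymptotic $E[T_{L,N}] \approx N L \ln L$; the particular grids of N and L values are assumptions chosen only to make the surfaces visible.

```python
import numpy as np

# Stand-in for E[T_{L,N}]: the asymptotic N * L * ln L (illustrative only).
Ls = np.array([2, 4, 8, 16, 32])    # string lengths
Ns = np.array([2, 3, 4, 5, 6])      # alphabet sizes
L, N = np.meshgrid(Ls, Ns, indexing="ij")   # rows indexed by L, columns by N

E_T = N * L * np.log(L)        # E[T_{L,N}], (L, N) indexing
I_omega = L * np.log2(N)       # endogenous information, same indexing

correct = I_omega / E_T        # matched indexing
botched = I_omega / E_T.T      # denominator transposed, numerator not

# With matched indexing, I_oplus falls as L grows (the problem gets harder);
# with the transposed denominator, it rises instead.
col = 2  # fix an alphabet size, vary L down the column
print(np.all(np.diff(correct[:, col]) < 0))   # True
print(np.all(np.diff(botched[:, col]) > 0))   # True
```

That is exactly the wrong-way falloff visible in the published surfaces.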

The embarrassment here is not that the cubs got confused about indexing of square matrices of values, but that a team of four, including Marks and Dembski, shipped out the paper for review, and then submitted the final copy for publication, with nary a sanity check of the plots. From where I sit, it appears that Ewert and Montañez are getting more in the way of indoctrination than advisement from Marks and Dembski. Considering that various folks have pointed out errors in every paper that Marks and Dembski have coauthored, you’d think the two would give their new papers thorough goings-over.

It is sad that Ewert and Montañez probably know more about analysis of algorithms than Marks and Dembski do, and evidently are forgetting it. The fact is that
$$E[T_{L,N}] = \Theta(N L \log L)$$
for all three of the evolutionary algorithms they consider, provided that parameters are set appropriately. It follows that
$$I_\oplus = \Theta\left(\frac{L \log N}{N L \log L}\right) = \Theta\left(\frac{\log N}{N \log L}\right).$$
In the case of (C), the (1, λ) evolutionary algorithm, setting the mutation rate to 1 / L and the number of offspring λ to N ln L does the trick. (Do a lit review, cubs — Marks and Dembski will not.) From the perspective of a computer scientist, the differences in expected numbers of trials for the algorithms are not worth detailed consideration. This is yet another reason why the study is silly.
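To see the parameter settings in action, here is a quick simulation of a (1, λ) evolutionary algorithm with mutation rate 1 / L and λ = ⌈N ln L⌉. The target, alphabet, and seed are my own illustrative choices, and resampling a mutated position uniformly (possibly to the same character) is a simplification.

```python
import math
import random

def one_comma_lambda(target, alphabet, rng):
    """(1, lambda) EA: each of lam offspring resamples every position
    independently with probability 1/L; the next parent is the best
    offspring (no elitism). Returns the number of fitness queries."""
    L, N = len(target), len(alphabet)
    lam = math.ceil(N * math.log(L))
    fitness = lambda s: sum(a == b for a, b in zip(s, target))
    parent = [rng.choice(alphabet) for _ in range(L)]
    queries = 0
    while fitness(parent) < L:
        offspring = [[rng.choice(alphabet) if rng.random() < 1.0 / L else c
                      for c in parent]
                     for _ in range(lam)]
        queries += lam
        parent = max(offspring, key=fitness)
    return queries

rng = random.Random(7)
runs = [one_comma_lambda("GATTACA", "ACGT", rng) for _ in range(20)]
mean_queries = sum(runs) / len(runs)
print(mean_queries)   # compare with the scaling term N * L * ln L (constants matter)
```

With N = 4 and L = 7 the scaling term N L ln L is about 54; the point is the growth rate, not the constant.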

Methinks it is like the OneMax problem

The optimization (not search) problem addressed by Ewert et al. (and the Weasel program) is a straightforward generalization of a problem that has been studied heavily by theorists in evolutionary computation, OneMax. In the OneMax problem, the alphabet is {0, 1}, and the fitness function is the number of 1's in the string. In other words, the target string is 11…1. If the cubs poke around in the literature, they’ll find that Dembski and Marks reinvented the wheel with some of their analysis. That’s the charitable conclusion, anyway.
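Since Weasel with a binary alphabet and target 11…1 is exactly OneMax, the textbook (1+1) EA makes the connection concrete. This sketch is mine, with the standard settings (mutation rate 1/n, elitist acceptance); the well-known result is that its expected number of iterations on OneMax is Θ(n log n).

```python
import random

def one_plus_one_ea_onemax(n, rng):
    """(1+1) EA on OneMax: flip each bit independently with probability
    1/n, keep the offspring iff it has at least as many 1s as the parent.
    Returns the number of iterations to reach the all-ones string."""
    x = [rng.randint(0, 1) for _ in range(n)]
    steps = 0
    while sum(x) < n:
        y = [b ^ (rng.random() < 1.0 / n) for b in x]  # bitwise mutation
        if sum(y) >= sum(x):                           # elitist acceptance
            x = y
        steps += 1
    return steps

rng = random.Random(42)
steps = one_plus_one_ea_onemax(32, rng)
print(steps)   # a single run; the expectation grows as Theta(n log n)
```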

Winston Ewert and George Montañez, don’t say the big, bad evilutionist never gave you anything.

1. Nice point at the end on how Weasel==OneMax. The irony is that while Dembski has spent too much of his life attacking Weasel, his own MESA model assumed the validity of the basic OneMax model. MESA builds a noisy channel into the passage of fitness scores out of the evaluation function, in effect an argument that the environment does not provide enough information for selection to operate effectively. As with Sanford's "Mendel's Accountant", if you turn up the randomness high enough the population will never converge near the optimum.

Thanks for that bit of news, Drs Dembski and Sanford. Now to prove that you understand that the map is not the territory, are the levels of noise in your models anything like the situation in nature?

2. I have informed Winston Ewert of this critique. I hope he'll demonstrate the academic integrity to address the issues raised.

(Winston and I go waaay back. All the way to my banishment from UD for pointing out flaws in their Weaseling.)

3. David,

Without "coupling," the MESA fitness function is indeed equivalent to a noisy count of 1's. And Dembski evidently understood in 2002 that individuals need not be optimal to be fit.

4. Out of morbid curiosity, what is MESA supposed to demonstrate? In the overview, Dembski says "JB [John Bracht] provided key biological insights and connections." but I don't see any connection to biology anywhere in the description or discussion.