Tuesday, December 21, 2010

A chuckle from William Lane Craig

Some of you know William Lane Craig as the evangelical Christian theologian from Biola University who often speaks favorably of “intelligent design” creationism. It happens that I just read one of his criticisms of the Jesus Seminar, a large group that gauged the historicity of the sayings and acts of Jesus reported in the four canonical gospels and the Gospel of Thomas:
Of the 74 [scholars] listed in their publication The Five Gospels, only 14 would be leading figures in the field of New Testament studies. More than half are basically unknowns, who have published only two or three articles. Eighteen of the fellows have published nothing at all in New Testament studies. Most have relatively undistinguished academic positions, for example, teaching at a community college.
Prof. Craig, you really should spend some time with Confucius. Not one leading figure in the field of biology has ever contributed to the “theory” of intelligent design.

Whatever the Jesus Seminar got wrong, something it got right was to place the burden of proof on those who would claim historicity of a gospel passage. Obviously, most scholars delving into the matter are Christians seeking to demonstrate the truth of what they believe. Sound familiar? This is not to suggest that traditional New Testament scholars are no better than ID creationists. The New Testament scholars truly know the subject matter, while ID creationists demonstrate time and again their execrable ignorance of the relevant science.

Friday, December 17, 2010

The three Isaac Newtons of Ypsilanti

Back on August 5, Clive Hayden, the moderator at Uncommon Descent, exclaimed, “Robert Jackson Marks II is THE CHARLES DARWIN OF INTELLIGENT DESIGN!” The occasion was the inclusion of Marks among The 20 Most Influential Christian Scholars at Superscholar.org:
Robert J. Marks II (b. 1950), Baylor University’s leading research professor, has emerged as the public face of intelligent design. As the movement’s premier scientist, he has been dubbed “the Charles Darwin of intelligent design.”
That hyperbole is old news. But the site’s list of 15 Professors Who Were Also Criminals just caught my eye:
Alan Turing was the Isaac Newton of the 20th century, and is seen by many to be the father of the Information Age.
Given that Superscholar.org is the absolute authority on such matters, what are we to make of the claim that Bill Dembski is the Isaac Newton of Information Theory?

The superabundance of Isaac Newtons has me thinking of The Three Christs of Ypsilanti:
To study the basis for delusional belief systems, Rokeach [a psychiatrist] brought together three men who each claimed to be Jesus Christ and confronted them with each other's conflicting claims, while encouraging them to interact personally as a support group. Rokeach also attempted to manipulate other aspects of their delusions by inventing messages from imaginary characters. He did not, as he had hoped, provoke any lessening of the patients' delusions, but did document a number of changes in their beliefs.
I suspect that, although he occasionally retreats from particular errors with “not-pologies,” Dembski will never give up the delusion that he is the Isaac Newton of his day.

Thursday, December 9, 2010

The talk that could have killed me

Presentations seem to be hit-or-miss affairs for me.

My first talk on evolutionary computation, which I gave during the opening session of the Third Annual Conference on Evolutionary Programming (1994), was a hit. Hans-Paul Schwefel, one of the inventors of the evolution strategy, joined me for lunch that day. Prior to the conference banquet, I was buttonholed by a guy whose name I did not recognize from the EC literature. He was obviously very bright, and he complimented me on my work. Furthermore, he seemed to know a lot about directed evolution. I kept stealing glances at his name tag, wondering, "Who is this guy?" To be honest, if I had known how to break away politely, we’d have spoken for 5 minutes instead of 20. In the end, one of the conference organizers approached him to say, “We’re about ready to start now.” And he walked to the table at the front of the room with the “reserved” placard. After dinner, he gave the most brilliant talk I’ve ever heard. His name was Gerry Joyce. Many of you know of the sensational result, “Self-Sustained Replication of an RNA Enzyme,” that he and his student Tracey Lincoln published last year (see PZ Myers' explanation). The conference got even better for me on the last day, when David Fogel, last year’s president of the IEEE Computational Intelligence Society, asked me to serve as co-chair, with Thomas Bäck and Pete Angeline, of the technical program for the following year’s meeting. (Unfortunately, I had to resign that position due to illness.)

The “disbelief discourse” I recently gave to the Oklahoma Atheists was a miss, no matter that I put a huge amount of time into preparing it. Driving to the venue, I took two wrong turns. I arrived at precisely the time I was supposed to begin, with my anxiety sky-high. It turned out that the guy with the projector and screen showed up just when I did, but that didn't make me feel any better. Then it turned out that my Apple laptop would not connect to the projector. So I converted my presentation to PDF and, with two tries, got it onto a thumb drive. I plugged the thumb drive into a backup laptop that was perched on a chair, rather than the podium where I was supposed to stand (and where there was a microphone, as well as a voice recorder for the planned podcast). At that point, I was totally discombobulated. I needed to stand in front of the podium to deal with the laptop. The screen, to which I wanted to point, was well behind me, and I caught myself talking over my shoulder several times. The microphone stand was directly behind my foot, and I bumped into it several times. Worst of all, I occasionally dared to look into the faces in front of me, and saw clearly that things were not going well. “Must press on” was all that I could think. The bright side of the experience was the Q&A. There were some good questions, and I had a lot of fun answering them. I hope that some of you who were there will believe that I’m usually the guy you saw in the end. It was an embarrassing experience for me.

So how could this have killed me? Well, the following day, I felt some pain behind my right knee. I thought I had sat wrong while finishing my slides. Indeed I had, but there was more to the story than that. Several days later, I was admitted to the hospital with extensive clotting in my leg, and with three pulmonary emboli. At present, it appears that an autoimmune response is making my blood sticky (i.e., antibodies are attaching to red blood cells).

Perhaps you can understand now why I started by reminiscing about a time when everything went well.

Monday, November 8, 2010

The identity of the Designer

The elusive Kilroy has left his mark in exceedingly improbable places, all around the world. The only reasonable inference is....

Image copyright: Patrick Tillery

Wednesday, November 3, 2010

What is the probability of life in the physical universe?

I’ll be giving a “Disbelief Discourse” to the Oklahoma Atheists and the Channing Unitarians on Tuesday, November 16, at 7:30 PM.
Channing Unitarian Universalist Church
2800 West 15th Street
Edmond OK 73012
The event is open to the public.

What Is the Probability of Life in the Physical Universe?

The question posed in the title has no objective answer, and creationists avoid confronting it directly. They instead give mathematically dandified arguments that particular features of life are objectively so unlikely to have arisen by natural processes that they must reflect supernatural intervention. But their conclusions imply that life itself is physically improbable, and because that proposition admits no objective assessment, their claims of objectivity are false. This simple rebuttal may seem unsatisfying to laypeople who believe incorrectly that scientists use math to prove the properties of nature. Consequently, the presentation will begin with an explanation of mathematical modeling in science.

Biosketch

Tom English flirted with creationism in his teens, and went so far as to deliver an anti-evolution talk to his biology classmates. What led him to abandon his naive beliefs about the truth and reconcilability of scripture and science, and to embrace methodological naturalism in scientific investigation, was a combination of studies in the Bible and the philosophy of science. After earning bachelor’s and master’s degrees in psychology and English, respectively, at Mississippi College, and master’s and doctoral degrees in computer science at Mississippi State University, he began investigating evolution in computational processes. He independently proved what came to be known as the “no free lunch” theorem for optimization, and subsequently published six papers related to it. In empirical research, he obtained by computational evolution a predictor of annual sunspot activity that was far more accurate than any previously reported. Tom is a senior member of the Institute of Electrical and Electronics Engineers, and has served as an associate editor of the IEEE Transactions on Evolutionary Computation. His most recent scholarly publication, coauthored by Garry Greenwood, is “Intelligent Design and Evolutionary Computation,” Chapter 1 of Design by Evolution.

Monday, November 1, 2010

“Dover II”: Forensic science is not engineering

It is no secret that William A. Dembski, who filed a brief before withdrawing as an expert witness for the defense in Kitzmiller et al. v. Dover Area School District, is looking ahead to “Dover II.” I just wrote the following in a response to “Conservation of Information in Search: Measuring the Cost of Success”:
The article is the first in a series of publications that makes no positive contribution to the design of search procedures, but instead develops and applies a formal approach to arguing that success in search evidences design. That is, Dembski and Marks have disguised as engineering what is actually an attempt at forensic science, appropriate to making cases in courts of law and public opinion.
I’d point out to anyone who might serve as an anti-IDC expert in Dover II that Dembski and Marks have not subjected their work to scrutiny as forensic science. They have done their damnedest to avoid calling attention to the fact that their engineering papers are really not about engineering. Thus they’ve evaded whatever appropriate scrutiny they might have gotten within an inappropriate community. You can count on it, nonetheless, that the defense will present their publications as peer-reviewed science when the next legal battle comes along.

My comments are also relevant to those of you who face in the court of public opinion the (few) IDC rhetoricians who have managed to write off complex specified information and move on to active information.

Wednesday, October 13, 2010

Of “bad boys” and insipid women

This morning I had breakfast in Midwest City, Oklahoma, home of Tinker Air Force Base. It seems that many military retirees live there. Sitting in the booth next to mine was a one-armed man who might well have been a Vietnam vet. After some time he was joined by a young couple and an infant child. The husband was a brawny guy with a buzzcut. He had a cowboy hat in one hand, the infant carrier in the other, and cowboy boots on his feet. His T-shirt, which looked brand new, was tucked ever-so-neatly into his bluejeans. On the back was an image of a bottle of beer emptying over the silhouette of a shapely woman. Here’s the text that went with it:

Why beer is better than women...
  • A beer is always wet.
  • Beer always looks the same in the morning.
  • Beer is always happy to ride in the trunk.
  • Beer always goes down easy.
  • A beer doesn't change its mind after you've gotten the top off.
  • When you change beers, you don't have to pay alimony.
  • You can enjoy beer all month long.
  • You can share a beer with friends.
Things turned utterly surreal when I heard the server ask, “How old is she?” Yes, she.

Grandpa did not seem thrilled to have the company. Eventually Daddy went out to his immaculate, humongous, white pickup truck and moved Baby Girl’s bag to Grandpa’s car. Then he left alone to do whatever manly stuff he was going to do in that T-shirt.

Perhaps I am making too much of this. For all I know, a buddy gave him the T as a tasteless gag when he was getting married, and he was off to visit the buddy… at 8 a.m. on a Wednesday.

Long ago, I’d have expressed disgust at the “bad boy” making a big display of his indomitability. I would have regarded both the mother and daughter as victims. But there’s nothing secret anymore about attraction to bad boys — I’ve seen first-hand that it runs high among feminist intellectuals, whom you would expect to be least susceptible — and I have to say that women are responsible for dealing with it. After all, we are talking here about guys who do not deceive women. The mother has allowed the sexually exciting male she snagged, and foolishly expects to change, to degrade and humiliate not only her, but also her daughter and women in general.

Monday, October 4, 2010

Archive critiques of ID creationism

It is easy to forget, here in our echo chamber, that most of the world’s researchers care little about “intelligent design” creationism. When Dembski and Marks submit a paper like “The Search for a Search” to online journals founded and edited by Poles and Japanese, there is a good chance that the reviewers are unsuspecting of shenanigans.

Much of the online criticism of IDC could, with little modification, be packaged as rough drafts of scholarly papers, and uploaded to archives such as arXiv.org. What makes this worthwhile is that Google Scholar, and not just Google, indexes these archives. A Scholar hit with a title such as “Errors in ‘The Search for a Search’ of Dembski and Marks” stands a good chance of catching the attention of reviewers of future work by the authors. Of course, archiving a critique does not preclude linking to it from blogs.

Rob recently gave a response [PDF] to “The Search for a Search” that strikes me as a prime example of what should be uploaded to arXiv.org. (In any case, I recommend it to those of you who can deal with math.)

I intend to follow my own advice. Provided that the lead author does not object, I’ll soon archive a brief paper, and notify you of it.

Monday, September 13, 2010

Perfectionism and writer's anxiety

I used to think that I could learn to toss off remarks on the Internet. This post is my 20th at Bounded Science. But I have 24 un-posted drafts.

I recall ever-so-clearly copying sentences from the board when I was in the first grade. The other students had finished all ten, and I was on my fifth. The anxiety and shame I felt were incredible. I’ve since studied psychology and had some counseling. But I cannot begin to explain what was going on with that little guy.

I took the Scholastic Aptitude Tests back when there was no penalty for wrong answers. Although I knew to stop agonizing and start guessing when time was running out, I could not bring myself to do it. I left fairly large numbers of questions unanswered, and ended up with high (“DaveScot”) scores anyway.

Please don’t take this as backhanded bragging. My birthday is coming soon, and how little I’ve produced is weighing heavily on me. “To those whom much is given, much is expected.” I might have another 20 or 30 years in which my brain works well, and I’m wondering how to turn my life around.

The greater the scrutiny I expect a piece of writing to receive, the greater the problems I have with it. Writing for a journal ties me in knots. No matter that I see mediocre stuff in journals all the time, I can’t let go the notion that my own submission has to be absolutely fabulous.

The more I struggle to make my work fabulous, the worse it gets. The best prose I’ve turned out is, unfortunately, my dissertation (1990). What made it different? After gathering data compulsively, I had only a month to do the writing. There was a job in the offing, and my wife and son were depending on me. So I forced out a certain number of pages each day, revising little. Every now and then, I flip the book open, and find myself asking, “Is that really my work?”

I get annoyed when people like Dembski weigh my c.v. I’m well aware of my horribly low output. But anybody who attempts to diminish me when I speak about the area of research I’ve focused upon for 15 years has problems greater than my own.

I’ve begun writing a special post — a highly accessible presentation of my current understanding of “no free lunch” in optimization. Dembski and Marks are pushing an interpretation of the classic NFL theorem that is precisely backwards, and my goal is to get most of you to understand it better than they do. But let’s plan on my post being far from perfect, and on its improving with feedback from you.

Friday, September 3, 2010

Too late to be a Sooner...

... but here I am, living in Oklahoma City (profile updated). Yes, every county in Oklahoma was red in the last presidential election. What can I say? I got a totally refurbished house at a fantastic price. My brother and his wife, with whom I get along famously, are just 5 miles away. The National Weather Center is about 20 miles away… for good reason. And, of course, there's Abbie, aka ERV. We have yet to meet, but I'm sure we will, Sooner or later.

Abbie and I disagree on the size of the hail that hit the area back in May. She says that it was baseball-sized (see the incredible video she posted), but I insist that it was softball-sized. Here’s my sister-in-law with a specimen of lesser diameter than some of the holes in her roof.


One hailstone not only passed through the roof, but also cratered the attic deck. My neighborhood was spared. But I am rethinking the deductible of my homeowner’s insurance.

Wednesday, August 18, 2010

Wasted days and wasted nights?

On his blog Deep Thoughts and Silliness, Bob O'Hara recently linked to this bulletin-board thread devoted to recording “banninations” at the intelligent-design blog Uncommon Descent. It happens that I've been reflecting on my “wasted days and wasted nights” at UD, and I've just gone through the 23 pages of comments on the thread. Now I'm in a confessional mood:
  1. Tom English
  2. Thom English
  3. Thomas English
  4. T M English
  5. austin english
  6. Turner Coates
  7. Cloud of Unknowing
  8. Semiotic 007
  9. Liz Lizard
  10. Sal Gal
  11. Mystic
  12. Oatmeal Stout
  13. Atticus Finch
  14. CEC09
  15. Hamlet
  16. Sooner Emeritus
Those are the UD identities I can remember. Most of what I wrote was good stuff. I often did a lot of reading and thinking before posting. My comments changed considerably over the years, and that was because I was learning. But I also indulged my anger at people who indoctrinate children with simplistic "the Bible tells the truth, and so does science" garbage. And I succumbed to the temptation to jerk the ever-so-accessible chains of Gil Dodgen and Gordon Mullings aka kairosfocus.

Three or four of my alter egos were undeservedly booted by Dembski for posting stuff that made him squirm. I feel good about getting one of the stars of Expelled to reveal himself as the censorial hypocrite that he is. And various of me were killed off capriciously by the notorious blog-czar David “DaveScot” Springer. Some others committed virtual suicide by butting heads with that egotistical jerk. There was some entertainment value in it, but I can’t say that it was a particularly good use of my energies. Some of my personae ranted, and some of them treated Gil and Gordo badly — definitely a waste.

In the end (?), I wish that I'd been a lot more like Bob O'Hara, Mark Frank, David vun Kannon, Allen MacNeill, Seversky, and R0b. (There have been others with a combination of brilliance and good blogosphere manners that I do not have, and I’ve just listed the ones that spring immediately to mind.) Josh Rosenau recently got me thinking with his post on the backfire effect in presenting people with information that contradicts their beliefs. “In your face” confrontation is really not the way to encourage independent thinking in someone leaning toward conservative acceptance of what they hear about science in religious contexts.

By the way, all of the admirable individuals I listed above have been banned under some name, if I’m not mistaken. I point this out not to rationalize my occasional online ugliness, but to emphasize the chronic unfairness of the moderation at Uncommon Descent.

I want to mention that I’m gentle, and perhaps effective, in face-to-face conversation with people who’ve heard that intelligent design is the latest, greatest thing in Bible-consistent science. I feel compassion especially for kids who are where I was 35-40 years ago. It comes to me quite naturally to find out what they believe and how they believe, and to proceed on their terms, rather than mine. I am not an ogre in the real world.

Thursday, July 29, 2010

Feeling charitable toward Baylor’s IDC cubs

The reason I come off as a nasty bastard on this blog is that I harbor quite a bit of anger toward the creationist bastards who duped me as a teenager. The earliest stage of overcoming my upbringing was the worst time of my life. I wanted to die. Consequently, I am deadly serious in my opposition to “science-done-right proves the Bible true” mythology. William A. Dembski provokes me especially with his prevarication and manipulation. He evidently believes that such behavior is moral if it serves higher ends in the “culture war.” My take is, shall we say, more traditional.

When Robert J. Marks II, Distinguished Professor of Engineering at Baylor University, and Fellow of the Institute of Electrical and Electronics Engineers (IEEE), began collaborating with Dembski, I did not rush to the conclusion that he was like Dembski. But it has become apparent that he is willing to play the system. For instance, Dembski was miraculously elevated to the rank of Senior Member of the IEEE, which only 5% of members ever reach, in the very year that he joined the organization. To be considered for elevation, a member must be nominated by a fellow.

Although Marks was the founding president of the progenitor of the IEEE Computational Intelligence Society, which addresses evolutionary computation (EC), he and his IDCist collaborators go to the IEEE Systems, Man, and Cybernetics Society for publication. He is fully aware that reviewers there are unlikely to know much about EC, and are likely to give the benefit of the doubt to a paper bearing his name. I would love to see him impugn the integrity of his and my colleagues in the Computational Intelligence Society by claiming that they don’t review controversial work fairly. But it ain’t gonna happen.

I’ve come to see Marks as the quintessential late-career jerk, altogether too ready to claim expertise in an area he has never engaged vigorously. He is so cocksure as to publish a work of apologetics with the title Evolutionary Computation: A Perpetual Motion Machine for Design Information? (Chap. 17 of Evidence for God, M. Licona and W. A. Dembski, eds.). He states outright some misapprehensions that are implicit in his technical publications. Here’s the whopper: “A common structure in evolutionary search is an imposed fitness function, wherein the merit of a design for each set of parameters is assigned a number.” Who are you, Bob Marks, to say what is common and what is not in a literature you do not follow? Having scrutinized over a thousand papers in EC, and perused many more, I say that you are flat-out wrong. There’s usually a natural, not imposed, sense in which some solutions are better than others. Put up the references, Distinguished Professor Expert, or shut up.

Marks and coauthors cagily avoid scrutiny of their (few) EC sources by dumping on the reviewers references to entire books, i.e., with no mention of specific pages or chapters. This is because their EC veneer will not withstand a scratch. The chapter I just linked to may seem to contradict that, given its references to early work in EC by Barricelli (1962), Crosby (1967), and Bremmerman [sic] et al. (1966). [That's Hans-Joachim Bremermann.] First, note the superficiality of the references. Marks did not survey the literature to come by them. The papers appear in a collection of reprints edited by David Fogel, Evolutionary Computation: The Fossil Record (IEEE Press, 1998). Marks served as a technical editor of the volume, just as I did, and he should have cited it.

Although Marks is an electrical engineer, he has been working with two of Baylor’s graduate students in computer science, Winston Ewert and George Montañez. I would hazard a guess that there is some arrangement for the students to turn their research with Marks into masters’ theses. I’ve been sitting on some errors in their most recent publication, Efficient Per Query Information Extraction from a Hamming Oracle, thinking that the IDC cubs would get what they deserved if they included the errors in their theses. Well, I’ve got a soft spot for students, and I’m feeling charitable today. But there’s no free lunch for Marks. He has no business directing research in EC, his reputation in computational intelligence notwithstanding, and I hope that the CS faculty at Baylor catch on to the fact.

First reading

On first reading the paper, I was deeply annoyed by the combination of a Chatty-Cathy, self-reference-laden introduction focusing on “oracles,” irrelevant to the majority of the paper, with a non-survey of the relevant literature in the theory of EC. Ewert et al. dump in three references to books, without discussion of their content, at the beginning of their 4-1/2 page section giving Markov-chain analyses of evolutionary algorithms. It turns out that one of the books does not treat EC at all — I contacted the author to make sure.

As I have discussed here and here, two of the algorithms they analyze are abstracted from defective Weasel programs that Dawkins supposedly used in the mid-1980's. It offends me to see these whirlygigs passed off as objects worthy of analysis in the engineering literature.

Yet again, they express the so-called average active information per query as $$I_\oplus = \frac{I_\Omega}{Q} = \frac{\log N^L}{Q} = \frac{L \log N}{Q},$$ where Q is not the simple random variable it appears to be, but is instead the expected number of trials (“queries”) a procedure requires to maximize the number of characters in a “test” string that match a “target” string. Strings are over an alphabet of size N, and are of length L. Unless you have something to hide, you write $$I_\oplus = \frac{L \log N}{E[T]},$$ where T is the random number of trials required to obtain a perfect match of the target. This is a strange idea of an average, and it appears that a reviewer said as much. Rather than acknowledge the weirdness overtly, Ewert et al. added a cute “yeah, we know, but we do it consistently” footnote. Anyone without a prior commitment to advancing “intelligence creates active information” ideology would simply flip the fraction over to get the average number of trials per bit of endogenous information IΩ, $$\frac{1}{I_\oplus} = E\left[\frac{T}{I_\Omega}\right] = \frac{E[T]}{L \log N}.$$ This has a clear interpretation as expected performance normalized by a measure of problem hardness. But when it’s “active information or bust,” you’re not free to go in any sensible direction available to you. I have to add that I can’t make a sensible connection between average active information per query and active information. Given a bound K on the number of trials to match the target string, the active information is $$I_+ = \log \Pr\{T \leq K\} + L \log N.$$ Do you see a relationship between I+ and I⊕ that I’m missing?
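
For concreteness, here is how these quantities compute in Python. This is a minimal sketch: E[T] and Pr{T ≤ K} are supplied as inputs, and none of the names come from the paper’s code.

from math import log

def endogenous_information(L, N):
    # I_Omega: the log (base 2) of the number of length-L strings
    # over an alphabet of N characters.
    return L * log(N, 2)

def avg_active_info_per_query(L, N, expected_trials):
    # The I_oplus of Ewert et al.: I_Omega divided by the expected
    # number of trials E[T] to match the target.
    return endogenous_information(L, N) / expected_trials

def active_information(L, N, p_success):
    # I_+ for a bound K on the number of trials, where
    # p_success = Pr{T <= K}.
    return log(p_success, 2) + endogenous_information(L, N)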

By the way, I happened upon prior work regarding the amount of information required to solve a problem. The scholarly lassitude of the IDC “maverick geniuses” glares out yet again.

Second reading

On second reading, I bothered to do sanity checking of the plots. I saw immediately that the surfaces in Fig. 2 were falling off in the wrong directions. For fixed alphabet size N, the plots show the average active information per query increasing as the string length L increases, when it obviously should decrease. The problem is harder, not easier, when the target string is longer. Comparing Fig. 5 to Figs. 3 and 4, it’s easy to see that the subscripts for N and L are reversed somewhere. But what makes Fig. 3 cattywampus is not so simple. Ewert et al. plot $$I_\oplus(L, N) = \frac{L \log N}{E[T_{N,L}]}$$ instead of $$I_\oplus(L, N) = \frac{L \log N}{E[T_{L,N}]}.$$ That is, the matrix of expected numbers of trials to match the target string is transposed, but the matrix of endogenous information values is not.

The embarrassment here is not that the cubs got confused about indexing of square matrices of values, but that a team of four, including Marks and Dembski, shipped out the paper for review, and then submitted the final copy for publication, with nary a sanity check of the plots. From where I sit, it appears that Ewert and Montañez are getting more in the way of indoctrination than advisement from Marks and Dembski. Considering that various folks have pointed out errors in every paper that Marks and Dembski have coauthored, you’d think the two would give their new papers thorough goings-over.

It is sad that Ewert and Montañez probably know more about analysis of algorithms than Marks and Dembski do, and evidently are forgetting it. The fact is that $$E[T_{L,N}] = \Theta(N L \log L)$$ for all three of the evolutionary algorithms they consider, provided that parameters are set appropriately. It follows that $$I_\oplus = \Theta\left(\frac{L \log N}{N L \log L}\right) = \Theta\left(\frac{\log N}{N \log L}\right).$$ In the case of (C), the (1, λ) evolutionary algorithm, setting the mutation rate to 1 / L and the number of offspring λ to N ln L does the trick. (Do a lit review, cubs — Marks and Dembski will not.) From the perspective of a computer scientist, the differences in expected numbers of trials for the algorithms are not worth detailed consideration. This is yet another reason why the study is silly.
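
Anyone who doubts the order of growth can estimate E[T] empirically. Here is a minimal simulation sketch of the (1, λ) EA with the parameter settings just described: sure mutation at rate 1 / L, and λ = N ln L offspring per generation. It is a sanity check of the claim, not the code analyzed in the paper.

import random
from math import ceil, log

def trials_to_match(L, N, seed=None):
    # Estimate T for a (1, lambda) EA on a Hamming oracle, with
    # mutation rate 1 / L, lambda = N ln L, and sure mutation (a
    # mutated position never keeps its old character).  Assumes
    # L >= 2 and N >= 2.
    rng = random.Random(seed)
    lam = max(1, int(ceil(N * log(L))))
    target = [rng.randrange(N) for _ in range(L)]
    parent = [rng.randrange(N) for _ in range(L)]
    def fitness(x):
        return sum(a == b for a, b in zip(x, target))
    trials = 0
    while fitness(parent) < L:
        offspring = []
        for _ in range(lam):
            child = list(parent)
            for i in range(L):
                if rng.random() < 1.0 / L:
                    # replace with one of the other N - 1 characters
                    child[i] = (child[i] + 1 + rng.randrange(N - 1)) % N
            offspring.append(child)
        trials += lam
        parent = max(offspring, key=fitness)  # comma selection: the parent dies
    return trials

Average trials_to_match over several runs at a few values of L and N, and the counts should track N L log L.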

Methinks it is like the OneMax problem

The optimization (not search) problem addressed by Ewert et al. (and the Weasel program) is a straightforward generalization of a problem that has been studied heavily by theorists in evolutionary computation, OneMax. In the OneMax problem, the alphabet is {0, 1}, and the fitness function is the number of 1's in the string. In other words, the target string is 11…1. If the cubs poke around in the literature, they’ll find that Dembski and Marks reinvented the wheel with some of their analysis. That’s the charitable conclusion, anyway.
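
In code, the relationship between the two problems is plain. A sketch:

def onemax(bits):
    # Classic OneMax: fitness is the number of 1s, i.e., the number
    # of matches against the implicit all-ones target.
    return sum(bits)

def hamming_fitness(phrase, target):
    # The generalization analyzed by Ewert et al.: matches against an
    # arbitrary target over an arbitrary alphabet.  OneMax is the
    # special case target = (1, ..., 1) over the alphabet {0, 1}.
    return sum(a == b for a, b in zip(phrase, target))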

Winston Ewert and George Montañez, don’t say the big, bad evilutionist never gave you anything.

Wednesday, July 28, 2010

Creeping elegance, or shameless hacking?

In my previous post, I did not feel great about handling processes in the following code for selection, but I did not see a good way around it.


from heapq import heapreplace

def selected(population, popSize, nSelect, best = None):
    if best is None:
        best = nSelect * [(None, None)]
    threshold = best[0][0]
    nSent = 0

    for process in processes:
        if nSent == popSize: break
        process.submit(population[nSent], threshold)
        nSent += 1

    for process, x, score in scored(popSize):
        if score > threshold:
            heapreplace(best, (score, x))
            threshold = best[0][0]
        if nSent < popSize:
            process.submit(population[nSent], threshold)
            nSent += 1

    return best


What I really want is for selected to know nothing about parallel processing, and for the generator of scored individuals to know nothing about selection. The problem is that threshold changes dynamically, and the generator needs a reference to it. As best I can tell, there are no scalar references in Python. Having taught LISP a gazillion times, I should have realized immediately that I could exploit the lexical scoping of Python, and pass to the generator a threshold-returning function defined within the scope of selected.


from heapq import heapreplace

def selected(population, nSelect, best = None):
    if best is None:
        best = nSelect * [(None, None)]

    threshold = lambda: best[0][0]

    # evaluator is an Evaluator instance (see below), assumed in scope
    for x, score in evaluator.eval(population, threshold):
        if score > best[0][0]:
            heapreplace(best, (score, x))

    return best


Perhaps I should not be blogging about my first Python program. Then again, I’m not the worst programmer on the planet, and some folks may learn from my discussion of code improvement. This go around, I need to show you the generator.


def eval(self, population, threshold):
    popSize = len(population)
    nSent = 0

    for process in self.processes:
        if nSent == popSize: break
        process.submit(population[nSent], threshold())
        nSent += 1

    for unused in population:
        process = self.processes[self.isReady.recv()]
        yield process.result()
        if nSent < popSize:
            process.submit(population[nSent], threshold())
            nSent += 1


This is a method in class Evaluator, which I plan to release. No knowledge of parallel processing is required to use Evaluator objects. The __init__ method starts up the indexed collection of processes, each of which knows its own index. It also opens a Pipe through which processes send their indexes when they have computed the fitness of individuals submitted to them. The Evaluator object’s Connection to the pipe is named isReady.

The first for loop comes from the original version of selected. Iteration over population in the second for loop is just a convenient way of making sure that a result is generated for each individual. In the first line of the loop body, a ready process is identified by receiving its index through the isReady connection. Then the generator yields the result of a fitness evaluation. The flow of control stops at this point, and resumes only when selected returns to the beginning of its for loop and requests the next result from the eval generator.

When execution of the generator resumes, the next unevaluated individual in the population, if any, is submitted to the ready process, along with the value of a call to the threshold function. The call gives the current value of best[0][0], the selection threshold.
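
To give a sense of how the pieces fit together, here is the sort of generation loop I have in mind. The names initial_population and reproduce are hypothetical placeholders, as is the constructor argument; only Evaluator and selected come from the code above.

evaluator = Evaluator(nProcesses=2)     # starts the worker processes
population = initial_population(100)    # hypothetical initializer
best = None
for generation in range(500):
    best = selected(population, nSelect=10, best=best)
    population = reproduce(best)        # hypothetical variation operator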

By the way, the Pipe should be a Queue, because only the “producer” processes, and not the “consumer” process, send messages through it. But Queue is presently not functioning correctly under the operating system I use, Mac OS X.

Monday, July 26, 2010

Efficient selection with fitness thresholds, heaps, and parallel processing — easier done than said

The obvious approach to selection in an evolutionary algorithm is to preserve the better individuals in the population and cull the others. This is known as truncation selection. The term hints at sorting a list of individuals in descending order of fitness, and then truncating it to length nSelect. But that is really not the way to do things. And doing selection well is really not that hard. After providing a gentle review of the considerations, I’ll prove my point with 18 lines of code.

A principle of computational problem solving is not to waste time determining just how bad a bad solution is. Suppose we’re selecting the 3 fittest of a population of 10 individuals, and that the first 3 fitness scores we obtain are 90, 97, and 93. This means that we’re no longer interested in individuals with fitness of 90 or lower. If it becomes clear in the course of evaluating the fourth individual that its fitness does not exceed the threshold of 90, we can immediately assign it fitness of, say, 0 and move on to the next individual.

Use of the threshold need not be so simple. For some fitness functions, a high threshold reduces work for all evaluations. An example is fitness based on the Levenshtein distance of a string of characters t from a reference string s. This distance is the minimum number of insertions, deletions, and substitutions of single characters required to make the strings identical. Fitness is inversely related to distance. Increasing the threshold reduces the number of possible alignments of the strings that must be considered in computing the distance. In limited experiments with an evolutionary computation involving the Levenshtein distance, I’ve halved the execution time by exploiting thresholds.
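
Here is a minimal sketch of the idea, not the code from my experiments: compute the standard dynamic-programming table for the Levenshtein distance row by row, and give up as soon as no alignment can beat the threshold.

def levenshtein_within(s, t, limit):
    # Levenshtein distance between s and t, or None if it must exceed
    # limit.  Values never decrease along any path through the table,
    # so if every entry in a row exceeds limit, we can give up early.
    previous = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        current = [i]
        for j, ct in enumerate(t, 1):
            current.append(min(previous[j] + 1,                # delete cs
                               current[j - 1] + 1,             # insert ct
                               previous[j - 1] + (cs != ct)))  # substitute
        if min(current) > limit:
            return None
        previous = current
    return previous[-1] if previous[-1] <= limit else None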

A natural choice of data structure for keeping track of the nSelect fittest individuals is a min heap. All you need to know about the heap is that it is stored in an indexed data structure, and that the least element has the least index. That is, the threshold element is always heap[0] when indexing is zero-based. The heap is initialized to contain nSelect dummy individuals of infinitely poor fitness. When an individual has super-threshold fitness, it replaces the threshold element, and the heap is readjusted.

Evolutionary computations cry out for parallel processing. It is cruel and immoral to run them sequentially on computers with multiple processors (cores). But I have made it seem as though providing the fitness function with the selection threshold depends upon sequential evaluation of individuals. There are important cases in which it does not. If parents compete with offspring for survival, then the heap is initialized at the beginning of the run, and is reinitialized only when the fitness function changes — never, in most applications. Also, if the number of fitness evaluations per generation exceeds the number of processors, as is common with present technology, then there remains a sequential component in processing.

The way I’ve approached parallel processing is to maintain throughout the evolutionary run a collection of processes dedicated to fitness evaluation. The processes exist when fitness evaluation cum selection begins. First an individual is submitted, along with the threshold, to each process. Then fitness scores are received one by one. For each score received, the heap and threshold are updated if necessary, and an unevaluated individual is submitted, along with the threshold, to the process that provided the score. In the experiments I mentioned above, I’ve nearly halved the execution time by running two cores instead of one. That is, the combined use of thresholds and two fitness-evaluation processes gives almost a factor-of-4 speedup.

The Python function

I’m going to provide an explanation that any programmer should be able to follow. But first look the code over, considering what I’ve said thus far. The heap, named best, is an optional parameter. The variable nSent registers the number of individuals that have been submitted to evaluation processes. It steps from 0 to popSize, the size of the population.


from heapq import heapreplace

def selected(population, popSize, nSelect, best = None):
    if best is None:
        best = nSelect * [(None, None)]
    threshold = best[0][0]
    nSent = 0

    for process in processes:
        if nSent == popSize: break
        process.submit(population[nSent], threshold)
        nSent += 1

    for process, x, score in scored(popSize):
        if score > threshold:
            heapreplace(best, (score, x))
            threshold = best[0][0]
        if nSent < popSize:
            process.submit(population[nSent], threshold)
            nSent += 1

    return best


If no heap is supplied, best is set to an indexed collection of nSelect (unfit, dummy) pairs represented as (None, None). This works because any (fitness, individual) pair is greater than (None, None). The expression best[0][0] yields the fitness of the least fit individual in the heap, i.e., the threshold fitness for selection.

The first for loop submits to each of the waiting processes an individual in population to evaluate, along with threshold. [My next post greatly improves selected by eliminating the direct manipulation of processes.] The loop exits early if there is a surplus of processes. The processes are instances of a subclass of multiprocessing.Process that I have defined, but am “hiding” from you. I am illustrating how to keep the logic of parallel processing simple through object-oriented design. You don’t need to see the code to understand perfectly well that process.submit() communicates the arguments to process.

The second for loop iterates popSize times, processing triples obtained from scored. Despite appearances, scored is not a function, but a generator. It does not return a collection of all of the triples. In each iteration, it yields just one (process, x, score) to indicate the process that most recently communicated an evaluation (x, score). This indicates not only that the fitness of individual x is score, but that process is waiting to evaluate another individual. If the new score exceeds the selection threshold, then (score, x) goes into the best heap, and threshold is updated. And then the next unevaluated individual in the population, if any, is submitted along with the threshold to the ready process.

When the loop is exited, each individual has had its chance to get into the best heap, which is returned to the caller. By the way, there’s an argument to be made that when the best heap is supplied to the function, an individual with fitness equal to that of the worst in the heap should replace the worst. Presumably the heap contains parents that are competing with offspring for survival. Replacing parents with offspring when they are no better than the offspring can enhance escape from fitness plateaus.

Tuesday, July 13, 2010

Sure mutation in Python

In Python, “lazy” mutation goes something like this:

from random import random, choice

for i in range(len(offspring)):
    if random() < mutation_rate:
        offspring[i] = choice(alphabet)


The random() value is uniform on [0, 1), and the choice function returns a character drawn uniformly at random from alphabet. It follows from my last post that this can be made right within the implementation of an evolutionary algorithm by defining

# the 1.0 guards against integer division
adjusted_rate = \
    mutation_rate * len(alphabet) / (len(alphabet) - 1.0)


and using it in place of mutation_rate. “And that’s all I have to say about that.”

If you want a mutation operator that surely mutates, the following code performs well:


from random import randint

alphabet = 'abcdefghijklmnopqrstuvwxyz '
alphaSize = len(alphabet)
alphaIndex = \
    dict([(alphabet[i], i) for i in range(alphaSize)])

def mutate(c):
    i = randint(0, alphaSize - 2)
    if i >= alphaIndex[c]:
        i += 1
    return alphabet[i]


Here alphaIndex is a dictionary associating each character in the alphabet with its index in the string alphabet. The first character of a string is indexed 0. Thus the expressions alphaIndex['a'] and alphaIndex['d'] evaluate to 0 and 3, respectively. For all characters c in alphabet,

alphaIndex[c] == alphabet.index(c).

Looking up an index in the dictionary alphaIndex is slightly faster than calling the function alphabet.index, which searches alphabet sequentially to locate the character. The performance advantage for the dictionary would be greater if the alphabet were larger.

The function mutate randomly selects an index other than that of character c, and returns the character in alphabet with the selected index. It starts by calling randint to get a random index i between “least index” (0) and “maximum index minus 1.” The trick is that if i is greater than or equal to the index of the character c that we want to exclude from selection, then it is bumped up by 1. This puts it in the range alphaIndex[c] + 1, …, alphaSize - 1 (the maximum index). All indexes other than that of c are equally likely to be selected.
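
A quick empirical check that mutate behaves as claimed: it never returns its argument, and it picks uniformly among the other 26 characters.

from collections import Counter

counts = Counter(mutate('e') for _ in range(260000))
assert 'e' not in counts      # the original character never comes back
print(counts.most_common(3))  # every other character lands near 10000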

Monday, July 12, 2010

The roly poly and the cockroach

You may have noted in my post on Dembski's Weasel-whipping that I gave .0096 as the mutation rate of the Dobzhansky program, a conventional (1, 200) evolutionary algorithm (EA), while Yarus indicates that “1 in 100 characters in each generation” are mutated. Well, the slight discrepancy is due to the fact that the program uses the “lazy” mutation operator that I slammed as a bug in the algorithms analyzed by Ewert, Montañez, Dembski, and Marks [here]. I should explain that what is a roly poly in one context is a big, fat, nasty cockroach in another.

To mutate is to cause or to undergo change. That is, mutation is actual change, not an attempt at change. The lazy mutation operator simply overwrites a character in a phrase with a character drawn randomly from the alphabet, and sometimes fails to change the character. For an alphabet of size N, the mutation rate is (N − 1) / N times the probability that the operator is invoked. For the Dobzhansky program, N = 27, and
26 / 27 × .01 ≈ .0096.
The difference between .01 and .0096 is irrelevant to what Yarus writes about the program.

Correcting an EA implementation that uses a lazy mutation operator is trivial. Set the rate at which the operation is performed to N / (N − 1) times the desired mutation rate. Goodbye, roly poly.

No such trick exterminates the cucaracha of Ewert et al. Their algorithms (A) and (B), abstracted from the apocryphal Weasel programs, apply the lazy mutation operator to exactly one randomly selected character in each offspring phrase. The alphabet size ranges from 1 to 100, so the probability that an offspring is a mutant ranges from 0 / 1 to 99 / 100. As I explained in my previous post, it is utterly bizarre for the alphabet size to govern mutation in this manner. The algorithms are whirlygigs, of no interest to biologists and engineers.

Ewert et al. also address as algorithm (C) what would be a (1, 200) EA, were its “mutation” always mutation. The rate of application of the lazy mutation operator is fixed at 1 / 20. It is important to know that the near-optimal mutation rate for phrases of length L is 1 / L. With 100 characters in the alphabet, the effective mutation rate is almost 1 / 20, and the algorithm is implicitly tuned to handle phrases of length 20. For a binary alphabet, the effective mutation rate is just 1 / 40, and the algorithm is implicitly tuned to handle phrases of length 40. This should give you a strong sense of why the mutation rate should be explicit in analysis — as it always is in the evolutionary computation literature that the “maverick geniuses” do not bother to survey.

Regarding the number of mutants

I may have seemed to criticize the Dobzhansky program for generating many non-mutant offspring. That was not my intent. I think it’s interesting that the program performs so well, behaving as it does.

With the near-optimal mutation rate of 1 / L, the probability of generating a copy of the parent, (1 − 1/L)^L, converges rapidly on e^−1 ≈ 0.368. Even for L as low as 25, an average of 72 in 200 offspring are exact copies of the parent.

It would not have been appropriate for Yarus to tune the mutation rate to the length of the Dobzhansky quote. That’s the sort of thing we do in computational problem solving. It’s not how nature works. I don’t make much of the fact that Yarus had an expected 109, rather than 73, non-mutant offspring per generation.

Edit: Yarus possibly changed the parameter settings of the program from those I’ve seen. I really don’t care if he did. I’m trying to share some fundamentals of how (not) to analyze evolutionary algorithms.

Monday, June 28, 2010

Dembski haplessly admits to plagiarism?

In “Efficient Per Query Information Extraction from a Hamming Oracle,” Winston Ewert, George Montañez, William A. Dembski, and Robert J. Marks II do not cite the sources of the algorithms they analyze. Yet Dembski trumpeted at Uncommon Descent, “New Peer-Reviewed ID Paper — Deconstructing the Dawkins WEASEL.”

The algorithms (A) and (B) are implemented, respectively, by the programs WEASEL1 and WEASEL2 that Dembski said were supplied to him by the pseudonymous “Oxfordensis.” Dembski announced, “[W]e shall… henceforward treat the programs below as the originals,” i.e., as those used by Richard Dawkins. Any way you slice it, the programs and the algorithms they implement are due to someone other than Ewert et alia.

If you abstract an algorithm from a program, you are ethically obligated to cite the program, just as you are ethically obligated to cite a book from which you get an idea, even if you do not copy words from the book. Algorithms are intellectual property, and are in fact patentable.

Has Dembski not tagged himself and his colleagues as plagiarists?

Willie can’t stop whipping the Weasel

William Dembski recently projected his obsession with the Weasel program onto evolutionists. His rationalization? Origin-of-life researcher Michael Yarus reuses the 1986 pop-sci illustration of evolution in a new pop-sci book, Life from an RNA World.

Dembski has misunderstood Richard Dawkins’ description of the program for at least a decade. He and Bob Marks falsely attributed partitioned search — not only the algorithm, but also the term — to Dawkins in a September 2009 article. As soon as the article appeared, he vested faith in two programs implementing different algorithms, explaining vaguely that the pseudonymous “Oxfordensis” supplied them. The story goes that Dawkins used one in preparing The Blind Watchmaker, and the other as a demo in a television show. Dawkins no longer has his code, and does not recognize the apocrypha. Dembski announced that he would pin it on Dawkins anyway.

The kicker is that the programs share a bug. They attempt to mutate exactly one character in each offspring phrase, but instead create a perfect copy of the parent in 1 of 27 attempts, on average. The implemented mutation rule is bizarre, with relevance neither to biology nor to engineering. Yet Dembski, Marks, and IDCist cubs recently published analyses of algorithms abstracted from the buggy programs. (Of course, they went once again to the IEEE Systems, Man, and Cybernetics Society, where reviewers are unlikely to know much about evolutionary algorithms.)

Now the clown-scholar Willie has the temerity to write,

Some internet critics have urged that we are beating a dead [weasel], that this example was never meant to be taken too seriously, and that if we were “serious scientists,” we would be directing our energies elsewhere. Let me suggest that these critics take up their concerns with Yarus.
Ironically, Yarus explains that pedagogy trumps biology:
To fully appreciate this particular [example], you must be aware that it is an idealization, not a realistic model of evolution. But its power comes from its precise aim — directly at the heart of the difficulty many people have with the concept of evolution.
Dembski’s subconscious no doubt screams, “Deny! Deny! Deny!”
Yarus sees this simulation as underwriting the power of evolutionary processes.
Underwriting? Yarus says that it “should give intelligent designers (and the rest of us) a reflective moment or two.”

Poor Willie. Imagine how conflicted he must be, to have a “serious scientist” like Yarus invoke his name, but put the Weasel in its place.

No more exegesis of apocrypha

Rob Knight kindly supplied me with the code of what Yarus calls the “Dobzhansky program.” Rob and I agree that it implements a (1, λ) evolutionary algorithm (EA). In each generation, one parent phrase reproduces λ = 200 times. The characters in the offspring mutate independently at a rate of .0096. One offspring of maximal fitness survives as the parent of the next generation, and all other phrases “die.” This matches Wikipedia’s algorithm, except that the mutation rate is lower. The consequence of the lower rate is that the program performs well with phrases of length considerably greater than 100.

No more blogorrheaic Mullings

The (1, λ) EA is distinguished from the (1 + λ) EA, in which the parent competes with offspring for survival. With the PLUS algorithm, it is impossible for parental fitness to decrease from one generation to the next. Under certain conditions, it is nearly impossible for the COMMA algorithm. Some elementary calculations — not prolix, overwrought exhortations to the beloved onlookers [inside joke] — make this clear. For generality, let’s refer to the phrase length as L and the mutation rate as μ.

The probability of no mutation in a particular position of an offspring phrase is 1 − μ. The probability of no mutation in all L positions, i.e., no difference from the parent, is (1 − μ)^L. Subtract this quantity from 1, and you get the probability that the offspring differs from the parent,

m = 1 − (1 − μ)^L.
The expected number of mutants among the λ offspring of a generation is λm, and the probability that all of the offspring are mutants is m^λ.

For Yarus’ illustration, with L = 63 and μ ≈ .0096, m ≈ .4554. Of the λ = 200 offspring in a generation, only λm ≈ 91 are mutants, on average. The probability of generating 0, rather than the expected 109, copies of the parent is

m^λ ≈ .4554^200 ≈ 10^−68.
It follows that the (1, 200) EA performs identically to the (1 + 200) EA with overwhelming probability. Even for a phrase of length 300, which the Dobzhansky program will converge upon poorly — the ideal mutation rate is about 1 / 300, and μ is almost 3 times too large — only 1 in 100 thousand generations lacks a copy of the parent.
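
The arithmetic is easy to check with a few lines of Python (which assume nothing about Yarus’ code):

from math import log10

L, mu, lam = 63, .0096, 200

copy = (1 - mu) ** L     # P(offspring is an exact copy): about .545
m = 1 - copy             # P(offspring is a mutant): about .4554
print(lam * m)           # expected mutants per generation: about 91
print(lam * copy)        # expected copies of the parent: about 109
print(lam * log10(m))    # log10 of m ** lam: about -68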

Profligate evolution

The computational waste is striking. But I believe that it is entirely appropriate to illustrate rapid evolution on a generational basis, i.e., assuming simultaneous evaluation of offspring. It is progress on the generational time scale that matters in biological evolution. And we know that biological evolution is profligate in its own ways. Dembski, Marks, and cubs focus on how many fitness evaluations, rather than how many generations, are required to obtain the fittest phrase. They essentially force a sequential model onto a process characterized by parallelism.

That’s not a feature — it’s a bug

The apocryphal TV demo almost implements what theorists of evolutionary computation call randomized local search (RLS). You can think of RLS as a (1 + 1) EA modified to ensure that there is exactly one mutation in each offspring. This eliminates evaluations of copies of the parent (and also multiply mutated offspring).
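
For reference, here is a minimal sketch of genuine RLS on a Hamming oracle (the textbook algorithm, not the apocryphal demo):

import random

def rls_trials(target, alphabet, seed=None):
    # Randomized local search: exactly one surely mutated position per
    # offspring, and the offspring replaces the parent when it is at
    # least as fit.  Returns the number of fitness evaluations.
    rng = random.Random(seed)
    L = len(target)
    x = [rng.choice(alphabet) for _ in range(L)]
    score = sum(a == b for a, b in zip(x, target))
    trials = 0
    while score < L:
        i = rng.randrange(L)
        new = rng.choice([c for c in alphabet if c != x[i]])  # sure mutation
        delta = (new == target[i]) - (x[i] == target[i])
        trials += 1
        if delta >= 0:    # accept ties as well as improvements
            x[i] = new
            score += delta
    return trials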

The apocryphal TV demo does not surely mutate the character in a randomly selected position, but instead assigns a randomly drawn character to the position.

Right:  x_i := Random(ALPHABET − {x_i})
Wrong:  x_i := Random(ALPHABET)
The incorrect version fails to change the character with probability 1 / N, where N is the size of the alphabet. The de facto mutation rule is very weird: With probability (N − 1) / N, mutate a randomly selected position of the phrase. Ewert, Montañez, Dembski, and Marks consider alphabet sizes ranging from 1 to 100. For the important case of a binary alphabet, 1 in 2 offspring is a copy of its parent, and the bug doubles the expected running time. All the authors can say is that they’re analyzing an error they suspect Dawkins of making a quarter-century ago.

The apocryphal TBW program implements a (1, 100) EA modified to use the weird mutation rule. As I just explained, the probability of mutation in an offspring phrase is m = (N − 1) / N. Here, for several interesting alphabet sizes N, are the expected numbers of mutants in a generation (100m) and the probabilities that all offspring in a generation are mutants (m^100).


N      100m     m^100
2      50.0     1 / 10^30
27     96.3     1 / 44
100    99.0     1 / 3
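
The table is one loop in Python:

for N in (2, 27, 100):
    m = (N - 1.0) / N     # probability that the lazy operator mutates
    print(N, 100 * m, m ** 100)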

It is simply crazy for alphabet size to govern mutation as it does here. For the Weasel-Dobzhansky case of N = 27, about 1 in 44 generations has no copy of the parent. Convergence usually requires more than 44 generations. So, with irony that can be appreciated only by folks who follow Uncommon Descent...
There is no latching or ratcheting or locking of any sort in the program that Dawkins putatively used in preparing The Blind Watchmaker.
And I repeat that there is no scholarly value in formal analysis of idiosyncrasies arising from faulty implementation of mutation. If Dembski, Marks, and cubs want to investigate two buggy programs as artifacts of possible “historical significance,” let them openly tell the story of Oxfordensis.

Wednesday, May 5, 2010

BIO-Complexity: Manufacturing the controversy

BIO-Complexity bills itself as a peer-reviewed journal of science. However, close inspection of its policies reveals it to be a website for slapdash dissemination and discussion of articles that have not been vetted by editors and reviewers. And very few scientists would agree that the scope is scientific.
It aims to be the leading forum for testing the scientific merit of the claim that intelligent design (ID) is a credible explanation for life.
The gist of intelligent design is that an immaterial, and hence unobservable, intelligence creates physical information for a purpose. Scientists are virtually unanimous that such godlike action has no place in scientific explanations.

As long as scientists reject ID as supernaturalism, there will be none of the "scientific controversy over ID" referred to at BIO-Complexity. It seems that the forum is designed to get scientists, identified by their real names, to engage in highly restricted exchanges on ID that create the impression of genuine controversy. This could aid the Discovery Institute in its "teach the controversy" campaign, the objective of which is to get ID into the science curricula of public schools.

I'm not spinning a silly conspiracy theory here. The Discovery Institute supports the Biologic Institute, which in turn supports BIO-Complexity. The editorial board of BIO-Complexity is dominated by affiliates of the Biologic Institute and fellows of the Discovery Institute. Douglas Axe, the director of the Biologic Institute, and an editor of the forum, tries to induce debate with a taunt in The Debate Over Design Gains Momentum with a New Peer-Reviewed Science Journal: BIO-Complexity:
[I]f you examine the way scientists on both sides of the ID debate are conducting themselves, which side would you say is generally doing a better job of inviting critical scrutiny? Which side is earnestly seeking the strongest critique that the other side can offer? The answer should be obvious. It has to be the side that is promoting the debate, right? Or conversely, which side has little tolerance for dissent? That’s equally obvious. It’s the conflicted side—the one that is constantly switching between denying that the debate exists, trying to win it, and trying to shut it down.
The Biologic Institute also offers "Our take on the ID controversy", cherry-picking the writing of ID adversaries to create the false impression that they regard ID as scientific. ID propagandists have a long history of turning scientists' remarks on ID into evidence of ID's scientific legitimacy. Scientists stirred to comment at BIO-Complexity should keep this in mind.

I cannot imagine how an adversary of ID could win by submitting a research article to BIO-Complexity. It would say implicitly that scientists believe that the claims of ID can be falsified on naturalistic grounds, contradicting the crucial fact that the claims are intrinsically supernatural. And it would suggest that science is advancing due to the "controversy," even if ID does not explain life on Earth.

Journal publishing without the tears (and blood and sweat)

The conveniently redefined "innovative" peer-review process of BIO-Complexity is much less stringent than is the norm in scientific journals.
The most significant form of peer review begins when a completed work is made publically available for examination and response. The goal of pre-publication peer review should therefore be to decide whether the work in question merits the attention of experts, rather than to predict the final result of that attention. BIO-Complexity uses an innovative approach to pre-publication peer-review in order to achieve this goal.

[...]

Two or more reviewers will be consulted for each reviewed manuscript. Authors are encouraged to suggest suitable reviewers, though the Editor may elect to use other reviewers.

Reviewers are asked to comment in fair terms on the work’s limitations, but also on whether they think the expert community would benefit from considering both the merits and the limitations. Taking into consideration the manuscript and the reviewers’ comments, the Editor will use this criterion of benefit to decide whether to take the manuscript forward.

[...]

BIO-Complexity aims to communicate decisions to authors within six weeks of submission.

In short, this is a slapdash approach to getting articles posted on a website for debate. There is no requirement that the editor act as advised by the reviewers. At the same time, the editor does not take personal responsibility for the quality of an accepted article.

R - E - S - P - E - C - T, oh, what it means to me

For each published article, the journal publishes one critique, accepted at the sole discretion of the editor of the article. The authors of the article may respond just once to the critique. There are also online comments on articles:
Respectful, open dialog is the most productive way to approach matters of controversy. [emphasis added]
(Watch to see if a double standard for "respect" emerges at yet another creationist website. Do you think Doug Axe will restrain himself any more than he did in the taunt I quoted above?) If an adversary comments, he or she does so under constraints that contribute to the impression that there is a legitimate scientific controversy:
  • Only people willing to use their real names are allowed to post comments.
  • Comments that fail to respect others will be removed (repeat or flagrant offenders being blocked).
  • Comments need to stay on point.

To be registered for posting comments, first register as a reader or author, then send an email from an institutional or corporate account (to establish your identity) with a brief description of your areas of interest to our support address

My point will never be on point according to BIO-Complexity. An intelligent designer is a god by another name, and is outside the scope of scientific investigation. There is no scientific controversy over intelligent design.

Monday, March 15, 2010

Errors in "Conservation of Information in Search"

William A. Dembski is a senior member, and Robert J. Marks II is a fellow, of the Institute of Electrical and Electronics Engineers (IEEE). The IEEE Code of Ethics requires members "to seek, accept, and offer honest criticism of technical work, to acknowledge and correct errors, and to credit properly the contributions of others."

A correspondent tells me that he has notified Dembski and Marks of obvious mathematical errors in Conservation of Information in Search: Measuring the Cost of Success. The errors are identified and explained here.

My opinion, as a senior member of the IEEE, is that it is unethical for Dembski and Marks to continue disseminating the article online without correcting its known errors. Some researchers, including me, emend online versions of their publications by adding footnotes. I do not know why Dembski and Marks would not follow suit. Of course, they must "credit properly" the source of the corrections.

Section III.E of the article begins, "Partitioned search [12] is a 'divide and conquer' procedure. . . ." The combination of emphasis and citation falsely indicates that the term "partitioned search" comes from [12], Richard Dawkins' The Blind Watchmaker (TBW). Furthermore, categorical attribution of the procedure itself to Dawkins is unwarranted.

TBW describes a program that models an aspect of biological evolution (pp. 47-48). The program searches for a target phrase by iteratively "'breeding' ... mutant 'progeny'" from a parent phrase. The parent of the next generation is the progeny that "most resembles the target." Partitioned search would require additional information as to where the parent matches the target, along with exemption of matching characters from "mutation" in "breeding." There is no mention in TBW of these necessary elements of partitioned search.
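
To make the contrast concrete, here is a minimal Python sketch of the procedure that TBW actually describes. TBW reports neither the number of progeny per generation nor the mutation rate, so the values below are my assumptions:

    import random

    ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "
    TARGET = "METHINKS IT IS LIKE A WEASEL"
    MU, PROGENY = 0.04, 100    # assumed values; TBW gives neither

    def resemblance(phrase):
        return sum(a == b for a, b in zip(phrase, TARGET))

    def breed(parent):
        # Every position is subject to mutation, matched or not; there
        # is no exemption of matching characters, which is what
        # distinguishes this procedure from partitioned search.
        return "".join(random.choice(ALPHABET.replace(c, ""))
                       if random.random() < MU else c
                       for c in parent)

    parent = "".join(random.choice(ALPHABET) for _ in TARGET)
    while parent != TARGET:
        # The parent of the next generation is the progeny that most
        # resembles the target.
        parent = max((breed(parent) for _ in range(PROGENY)),
                     key=resemblance)
    print(parent)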

The following now appears on the anonymously authored WeaselWare page of the website for Marks' Evolutionary Informatics Lab:
In an Evolutionary Search such as the one proposed by Dr. Dawkins [in TBW]*....
The footnote reads,
* Dr. Dawkins no longer possesses the original source code for his algorithm. Feedback and reflection have led the authors to conclude that an Evolutionary Search is the more likely interpretation for the type of search presented in TBW. Although Partitioned Search was the original interpretation, we have now expanded our analysis to include Evolutionary Strategies, thus covering all reasonable interpretations.
I'm calling on Dembski and Marks to acknowledge in the online version of the article that the term partitioned search does not appear in TBW, and that they "probably" interpreted TBW incorrectly.

Dembski and Marks should not dodge responsibility for the misinterpretation. As a computer scientist, I find the phrase "Dawkins no longer possesses the original source code for his algorithm" utterly bizarre. One starts with an algorithm, selects a programming language, and then expresses the algorithm in the particular programming language to obtain "source code." Dawkins need not have source code to tell us his algorithm. Dembski publicized a communication with Dawkins immediately after the publication of the article. Clearly he could have contacted Dawkins to ask about the algorithm while writing the article. If he doubted that the algorithm produced the results shown in TBW, all he had to do was implement the algorithm and check to see if it worked as advertised.

Please join me in asking Dembski and Marks (their email addresses are at the bottom of the first page of the article) to fulfill their obligations under the IEEE Code of Ethics. Feel free to link to this text, which appears both at my blog, Bounded Science, and on the Sidewiki here at the website of the Evolutionary Informatics Lab.

Monday, January 18, 2010

Granville Sewell discovers YouTube

Mathematician and creationist Granville Sewell mangles thermodynamics for the UncommonDescent crowd yet again in Can ANYTHING Happen in an Open System–Video. I really don't have time to respond at the moment, but I attached the following to the UD page as a Sidewiki comment, simply because it annoyed the hell out of me that he did not allow comments. I'd appreciate "helpful" votes from those of you set up to use Google's Sidewiki.

----------------------------------

Don't waste your time.

Believe it or not, the "unpolished" video is a sequence of still shots, mostly of text that Prof. Sewell is reading aloud. The only exceptions are images of 1) the cover of a book he reads from and 2) a computer motherboard. Sewell says nothing that he has not posted here on multiple occasions.

Professors who go to class unprepared and read from the textbook get fried in student evaluations. It amazes me that Prof. Sewell would do essentially that in a YouTube presentation. It should surprise no one that he allows comments neither at UncommonDescent nor at YouTube.

Ironically, Sewell's colleagues in the Evolutionary Informatics Lab, William A. Dembski and Robert J. Marks II, lionize a physicist and pioneer of information theory, Leon Brillouin, who answered Sewell more than 50 years ago in Science and Information Theory. Brillouin emphasized that large amounts of information can be gained through expenditure of small amounts of negentropy in physical observation, with overall increase in entropy of the physical system including the observer and the observed. He also emphasized that the negentropy costs of information processing would decline with advances in technology. There is no contradiction of thermodynamics in the growth of human knowledge and the attendant advances in complexity of artifacts.