Tuesday, August 5, 2014

ID creationist “scholarship” sinks to new depths

[Edit: I regret turning nasty here. See the comments section for something rather more kind I would say to Winston Ewert.]

I mentioned a couple posts ago that I was annoyed with the editors and reviewers of an article, apparently to be published in an IEEE journal, by Winston Ewert, William Dembski, and Robert J. Marks II. [Edit 9/18/2014: Marks has removed the paper from his website. The IEEE Transactions on Systems, Man, and Cybernetics presently offers "Algorithmic Specified Complexity in the Game of Life" as a preprint.] Although I focus on the authors here, it will be clear enough what the folks responsible for quality control, including those on Ewert’s dissertation committee at Baylor, should have done. The short version is that, Googling obvious keywords, it takes at most a minute to discover that prior work, of which the authors claim to have no knowledge, exists in abundance.

Before continuing, I want to emphasize that my response to ID creationism is anything but knee-jerk. Consider, if nothing else, that the way I dealt with a dubious theorem supplied by Ewert et al. was to attempt to prove it. I not only succeeded in doing so, but supplied the proof here. I have since proved a fundamental theorem that some might take as advancing ID theory, though I see it otherwise.

Back to the article. There is no way around concluding that the veterans, Dembski and Marks, are either incompetent or dishonest. (You can guess which way I lean from a subtitle I’ve considered for this blog: Truth Is the First Casualty of Culture War.) And Youngman Ewert is either lazy, dishonest, or intellectually hobbled. (I see many signs of hero worship and true-believerism.)

You need no clear idea of what information theory is to grasp the significance of Dembski allowing himself to be promoted as the “Isaac Newton of information theory.” Marks edited a volume on engineering applications of information theory. The two have repeatedly cited standard texts that I own, but have not studied thoroughly. I am not an information theorist. So how is it that I should be thunderstruck by a flatly incorrect statement in the second sentence of the abstract, and then be floored by an intensification of it in the second paragraph of the text?

You may think that you are unqualified to judge whether Ewert, Dembski, and Marks are right or wrong. But Google Scholar settles the matter unequivocally. That is what makes the falsehood so atrocious. The authors state in the introduction:

Both Shannon [1], [2] and Kolmogorov-Chaitin-Solomonoff (KCS)1 [2]-[9] measures of information are famous for not being able to measure meaning. [...] We propose an information theoretic method to measure meaning. To our knowledge, there exists no other general analytic model for numerically assigning functional meaning to an object.

1 Sometimes referred to as only Kolmogorov complexity or Kolmogorov information.

My immediate thoughts, incomprehensible to most of you, but eminently sensible to many who’ve had a course in information theory, were Kolmogorov structure function and Kolmogorov sufficient statistic. I first learned about Kolmogorov complexity from a chapter in reference [2]: Cover and Thomas, Elements of Information Theory. See Section 14.12, “Kolmogorov Sufficient Statistic.” If Dembski and Marks do not understand the relevance, then they are incompetent. If they do, then they are dishonest.

Back to Google. And to Ewert. And to Baylor. The article clearly draws on Ewert’s dissertation, which had better include a chapter with a title like “Literature Review.” Actually, his dissertation proposal should have surveyed the relevant literature. A scholar should comprehend a body of knowledge before trying to extend it — right? Let’s make the relatively charitable assumptions that Ewert, though presuming to make a fundamental contribution to information theory, never took a course in information theory, and failed to grasp some aspects of the introduction to Kolmogorov complexity in Cover and Thomas. He nonetheless had obvious keywords to enter into Google Scholar. There are 41,500 results for the unclever search

meaningful information Kolmogorov.
(The term meaning is too general, and meaningful is the last word of the first sentence in the abstract.) Close to the top are two articles by Paul Vitányi, both entitled “Meaningful Information.” With the search narrowed to
"meaningful information" Kolmogorov,
there are about 1,310 results. This is what I did first. I immediately went to the more recent of Vitányi's articles, because he’s a prominent researcher in Kolmogorov complexity, and also the coauthor of the standard text on the topic (reference [24] in Ewert et al.). I have highlighted key phrases in the abstract for those of you who don’t care to grapple with it.
Abstract—The information in an individual finite object (like a binary string) is commonly measured by its Kolmogorov complexity. One can divide that information into two parts: the information accounting for the useful regularity present in the object and the information accounting for the remaining accidental information. There can be several ways (model classes) in which the regularity is expressed. Kolmogorov has proposed the model class of finite sets, generalized later to computable probability mass functions. The resulting theory, known as Algorithmic Statistics, analyzes the algorithmic [Kolmogorov] sufficient statistic when the statistic is restricted to the given model class. However, the most general way to proceed is perhaps to express the useful information as a total recursive function. The resulting measure has been called the “sophistication” of the object. We develop the theory of recursive functions statistic, the maximum and minimum value, the existence of absolutely nonstochastic objects (that have maximal sophistication—all the information in them is meaningful and there is no residual randomness), determine its relation with the more restricted model classes of finite sets, and computable probability distributions, in particular with respect to the algorithmic (Kolmogorov) minimal sufficient statistic, the relation to the halting problem and further algorithmic properties.

So what are you up to in this paper, Dr. Ewert? When you feed us the “to our knowledge” guano, knowingly creating the misimpression that you tried to acquire knowledge, you are essentially lying. You have exaggerated the significance of your own work by failing to report on the prior work of others. I doubt highly that you are capable of understanding their work. Were you so inept as to not Google, or so deceitful as to sweep a big mound of inconvenient truth under the rug? Assuming the former, you need to catch on to the fact that your “maverick genius” coauthors should have known what was out there.


  1. Hi Tom. In the case of Winston Ewert, I lean toward the overly-simplistic philosophy of Mr. Miyagi: "No such thing as bad student -- only bad teacher."

    Regarding "algorithmic specified complexity", it's a variation on a long-playing theme. The nexus between parsimony and probability has been studied for decades, sometimes with ASC-like results.

    For example, in 2005 Dembski started defining specified complexity as:

    SC = –log2[10^120 * Phi_S(T) * P(T|H)]

    10^120*P(T|H) is the probability of T when you take into account the maximum number of possible trials. S is a "semiotic agent", and Phi_S(T) is the number of strings in S's vocabulary that are at least as simple as S's simplest description of T. So in terms of prefix codes and Kolmogorov complexity, Phi_S(T) is upper-bounded by 2^K(T). Plugging these in, we get:

    SC >= -log2[2^K(T) * P(T|H_alltrials)]

    = -log2[P(T|H_alltrials)] - K(T)

    = ASC(T)

    So ASC is a lower bound for Dembski's specified complexity if we interpret Phi_S(T) in terms of Kolmogorov complexity.

    Going back a little further to 2003, Elsberry and Shallit suggested a measure that's even more conspicuously similar to ASC. They defined the SAI ("specified anti-information") of string X as:

    SAI(X) = |X| - K(X)

    If X is selected from an equiprobable set of all strings of length |X|, then:

    P(X) = 2^-|X|

    |X| = -log2(P(X))

    SAI = -log2(P(X)) - K(X)

    = ASC(X).

    In other words, ASC is SAI sans the assumption of equiprobability.

    And we can go back even further, to Solomonoff's work on induction in the 60's. Solomonoff's universal distribution is:

    P(X|U) = 2^-K(X)

    Given another distribution H, we can express it as a weighted universal distribution:

    P(X|H) = 2^-K(X) * weight(X)

    Solving for the weight and expressing it in bits:

    -log2(weight(X)) = -log2(P(X|H)) - K(X)

    = ASC(X)

    So ASC is the factor by which a given distribution differs from the universal distribution, expressed in bits. And note that the universal distribution is a quantitative formulation of a much older principle that we call Ockham's razor.
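
    The SAI and ASC identities above can be checked numerically in a rough way. K(X) is uncomputable, so the sketch below assumes the zlib-compressed length of X as a stand-in. That is only a crude upper bound on K(X) (up to an additive constant), so the resulting ASC and SAI figures are lower estimates, not true values. The function names are hypothetical and do not come from Ewert et al. or from Elsberry and Shallit.

```python
import os
import zlib


def k_upper_bound_bits(data: bytes) -> int:
    """Bits in the zlib-compressed data: an assumed, crude proxy for K(X)."""
    return 8 * len(zlib.compress(data, 9))


def asc_bits(data: bytes, log2_prob: float) -> float:
    """ASC(X) = -log2 P(X|H) - K(X), with K(X) replaced by the proxy."""
    return -log2_prob - k_upper_bound_bits(data)


def sai_bits(data: bytes) -> int:
    """SAI(X) = |X| - K(X), with |X| in bits and K(X) replaced by the proxy."""
    return 8 * len(data) - k_upper_bound_bits(data)


def uniform_log2_prob(data: bytes) -> float:
    """log2 P(X) when X is drawn uniformly from all strings of length |X|."""
    return -8.0 * len(data)


regular = b"ab" * 500             # highly regular 1000-byte string
random_bytes = os.urandom(1000)   # incompressible with overwhelming probability

for name, x in [("regular", regular), ("random", random_bytes)]:
    # Under the equiprobable assumption the two measures coincide.
    print(name, sai_bits(x), asc_bits(x, uniform_log2_prob(x)))
```

    Under the equiprobable assumption the two printed numbers agree for each string, which is the SAI = ASC identity. For the random string the proxy exceeds |X| because of compressor overhead, so the estimate comes out negative.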

    1. Sorry to be slow responding to your excellent comment, R0b. I apologize also for not having indicated that MathJax works in comments as in posts, though the results are not visible in preview. I tested it only this morning.

      I’ve made some connections of the sort that you have. I’ll be posting a brief report at arXiv. The combination of your insights with mine would make for a nice submission to one of the journals of the IEEE Systems, Man, and Cybernetics Society. I’d be grateful if you would consider working with me. I need to serve as coauthor to some lead author or another, suffering as I do from attention deficit disorder. Left to my own devices, I write lots and lots of good fragments, but end up without a coherent whole.

      Buried in the article I linked to: “KCS complexity or variations thereof have been previously proposed as a way to measure specification [25]-[27].” The references are to Dembski, of course, going back to The Design Inference (1998). Ewert et al. generalize $K(x)$ to $K(x|y),$ referring to the binary string $y$ as the \emph{context.} (I cannot bring myself to denote the strings as $X$ and $C,$ as Ewert et al. do.) That has no impact, however, on your observations about the relation of algorithmic specified complexity to algorithmic probability. It is quite odd to maintain that an event with high algorithmic probability is low in chance. Scientists generally change their beliefs about chances of events when they discover simple explanations.

      I knew that I needed to review SAI. Thanks for showing me the payoff.

    2. As for my remarks about Winston Ewert, I wish that I hadn't made them. He and his colleagues fail on technical demerits. Personalizing only weakens my case. I was going to replace this post with a redacted version. There's no way to do that now.

    3. Tom, I'm flattered by the offer to write a paper. My problem is that I value my anonymity too much. That and I'm a lousy writer.

      But there are several mathematically-inclined former ID critics who moved on before Dembski and Marks launched their new wave of mathematical arguments. I'm thinking of Elsberry, Shallit, Richard Wein, and Erik Tellgren. There are also a few mathematicians, namely Peter Olofsson and Olle Häggström, who have commented on Marks and Dembski's new work. Maybe one of these people, all of whom are far better equipped than I am, can be persuaded to take up their pen again.

      My wish for Ewert is that he can see clearly enough to move on to better things before Marks and Dembski damage his career. Ewert strikes me as an honest guy. He has shown admirable integrity in saying some things that are sure to be unpopular among IDists. For example, here he acknowledges what critics have been saying for years, namely that IDists are confused about the definition of CSI, and that "under Dembski’s formulation, we do not know whether or not biology contains specified complexity." I'm very curious to read his dissertation -- I sincerely hope that it's better than his Master's thesis.

    4. I'd wondered... I'll drop that.

      If I had a face-to-face chat with Winston, I'd tell him that I really, really, really regret having gotten into the "no free lunch" thing. Rather than play to my strengths -- I'm a crackerjack computer scientist -- I wasted a lot of myself on obfuscated reinvention of a probabilist's wheel. What's dreadfully embarrassing is that it took me many years to realize how bad my work was. I had to acknowledge that Haggstrom was correct in his criticism of NFL. And Winston should acknowledge that he does not have a theoretical bone in his body.

      I just finished studying the five examples of Section 3 of "Algorithmic Specified Complexity." Every one of them is botched. Googling, I find that Winston loves programming. I suspect that he's good at it. So my utterly sincere advice to him would be to exploit what talent he has, and to let go the illusion that he can do theory. That doesn't write him out of the IDC story. Dembski and Marks can't program their way out of a wet paper bag.

  2. It seems that Vitanyi's definition of "meaningful information" is what most would regard as simple regularity (learnable structure) in data. (Based on a skim of http://link.springer.com/chapter/10.1007/3-540-36136-7_51). He contrasts it with random (accidental) aspects of an object, and gives the example of the laws of planetary motion being the "meaningful" component, and the initial parameters as the "accidental" component. This seems to be nothing other than seeking to separate signal from noise, a common pursuit in data mining and applied stats.

    It doesn't seem to mean semantic meaning or functional information, as Ewert et al. use it. So I can't see why all the fuss. Am I missing something?

    1. Yes. Your remarks apply equally to the work of Ewert et al.

  3. This comment has been removed by the author.

  4. "Rather than play to my strengths -- I'm a crackerjack computer scientist -- I wasted a lot of myself on obfuscated reinvention of a probabilist's wheel. What's dreadfully embarrassing is that it took me many years to realize how bad my work was. I had to acknowledge that Haggstrom was correct in his criticism of NFL."

    Would you be willing to elaborate on this? It might help others avoid the types of mistakes you want them to avoid, if you could explain specifically: 1) in what ways your work was poor, or a "reinvented wheel", 2) why you did not notice your work was bad, and 3) what work/areas should be avoided (for fear of reinvention). Thanks.

    1. The first two questions are entirely legitimate. The last borders on the fallacy of the complex question. I'm more than willing to elaborate. But I'm going to limit my response, because I'm struggling to get a couple other things written.

      1) I have toiled and toiled over a preface for my first NFL paper, discovering errors in my explanations of errors, and compounding my embarrassment. Dog willing, and the Creek don't rise, I will eventually post "Sampling Bias Is Not Information" here. You can arrange for Google Scholar to alert you when the title appears, if you don't want to follow my blog. For now, I'll say only that what I called "conservation of information" was nothing but statistical independence of the sampling ("search" or "optimization") process and the sample of random values of the objective function. I describe statistical independence in the introduction to the paper, but fail to identify it. That is indeed bad.

      2) For one thing, I had given my confusion a fancy name. But I would rather focus here on my runaway arrogance. I independently proved the main NFL theorem in 1994. Questioning a student during his thesis defense, I got him to state clearly that "future work" was supposed to lead to a generally superior optimizer. After a centisecond of thought, I sketched the simple argument that Häggström gives in "Intelligent Design and the NFL Theorems" (2007). I considered going for a publication, but, as obvious as the result was to me, I had to believe that it was already in the optimization literature. Twenty years ago, a lit review in an unfamiliar field was a big undertaking. And I had more than enough work to do at the time. When Wolpert and Macready disseminated "No Free Lunch Theorems for Search" through GA Digest (early 1995), stirring up a big controversy, I concluded that I was remarkably clever, rather than that the whole affair was silly. The accepted version of my first NFL paper included my simple proof, but a reviewer's suggestion led me to exchangeability (Häggström provided the term) as the "NFL condition." Rather than save the result for another paper, I dumped it into Sect. 3.2 about the time that camera-ready copy was due. I obsoleted what is known as the NFL theorem, 16 months before it was published. (I sent a copy of my paper to Bill Macready, who responded, "Nice work, Tom.") As the reputation of NFL grew, so did my arrogance. I forgot what I had suspected in the beginning, namely that something so simple was probably not a novel discovery. The more work I did with NFL, the more it became "my thing." In 2000 and 2001, I gave tutorials on the topic at conferences. I put NFL on the cover letters of my applications for academic positions. When Dembski came along with his book No Free Lunch, I was able to respond with "authority." Meanwhile, the citations of Wolpert and Macready (presently 3823) go up and up and up. "Fifty Million Frenchmen Can't Be Wrong," you know.

    2. 3) Do not presume to contribute to a field that you have not bothered to study in depth.
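
      The NFL result discussed in answers 1) and 2) above can be illustrated by brute force on a toy problem. The sketch below is not the proof from the paper or from Häggström's article. It is an assumed, minimal setup: a three-point domain, binary values, and two hypothetical non-revisiting search strategies. Enumerating all eight objective functions (equivalently, drawing one uniformly at random) shows that both strategies observe exactly the same collection of value sequences, so the values seen are independent of how the strategy chooses where to look.

```python
from collections import Counter
from itertools import product

DOMAIN = (0, 1, 2)   # assumed toy search space
VALUES = (0, 1)      # assumed set of objective-function values


def fixed_order(history):
    """Visit points in the order 0, 1, 2, ignoring what has been observed."""
    return len(history)


def adaptive(history):
    """Start at 2; next go to 0 if the first value seen was 1, else to 1."""
    if not history:
        return 2
    visited = [x for x, _ in history]
    preferred = 0 if history[0][1] == 1 else 1
    if preferred not in visited:
        return preferred
    # Only one unvisited point remains; never revisit.
    return next(x for x in DOMAIN if x not in visited)


def observed_values(strategy, f):
    """Run a strategy on function f (a tuple of values) and return the values seen."""
    history = []
    for _ in DOMAIN:
        x = strategy(history)
        history.append((x, f[x]))
    return tuple(y for _, y in history)


def value_sequence_counts(strategy):
    """Count each observed value sequence over all functions DOMAIN -> VALUES."""
    counts = Counter()
    for f in product(VALUES, repeat=len(DOMAIN)):
        counts[observed_values(strategy, f)] += 1
    return counts


print(value_sequence_counts(fixed_order) == value_sequence_counts(adaptive))
```

      Each strategy maps the eight functions one-to-one onto the eight possible value sequences, so any performance measure computed from the observed values has the same distribution for both.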