To mutate is to cause or to undergo change. That is, mutation is actual change, not an attempt at change. The lazy mutation operator simply overwrites a character in a phrase with a character drawn randomly from the alphabet, and sometimes fails to change the character. For an alphabet of size N, the mutation rate is (N − 1) / N times the probability that the operator is invoked. For the Dobzhansky program, N = 27, and
26 / 27 × .01 ≈ .0096.The difference between .01 and .0096 is irrelevant to what Yarus writes about the program.
Correcting an EA implementation that uses a lazy mutation operator is trivial. Set the rate at which the operation is performed to N / (N − 1) times the desired mutation rate. Goodbye, roly poly.
No such trick exterminates the cucaracha of Ewert et al. Their algorithms (A) and (B), abstracted from the apocryphal Weasel programs, apply the lazy mutation operator to exactly one randomly selected character in each offspring phrase. The alphabet size ranges from 1 to 100, so the probability that an offspring is a mutant ranges from 0 / 1 to 99 / 100. As I explained in my previous post, it is utterly bizarre for the alphabet size to govern mutation in this manner. The algorithms are whirlygigs, of no interest to biologists and engineers.
Ewert et al. also address as algorithm (C) what would be a (1, 200) EA, were its “mutation” always mutation. The rate of application of the lazy mutation operator is fixed at 1 / 20. It is important to know that the near-optimal mutation rate for phrases of length L is 1 / L. With 100 characters in the alphabet, the effective mutation rate is almost 1 / 20, and the algorithm is implicitly tuned to handle phrases of length 20. For a binary alphabet, the effective mutation rate is just 1 / 40, and the algorithm is implicitly tuned to handle phrases of length 40. This should give you a strong sense of why the mutation rate should be explicit in analysis — as it always is in the evolutionary computation literature that the “maverick geniuses” do not bother to survey.
Regarding the number of mutants
I may have seemed to criticize the Dobzhansky program for generating many non-mutant offspring. That was not my intent. I think it’s interesting that the program performs so well, behaving as it does.
With the near-optimal mutation rate of 1 / L, the probability of generating a copy of the parent, (1 − 1 / L)L, converges rapidly on e-1 ≈ 0.368. Even for L as low as 25, an average of 72 in 200 offspring are exact copies of the parent.
It would not have been appropriate for Yarus to tune the mutation rate to the length of the Dobzhansky quote. That’s the sort of thing we do in computational problem solving. It’s not how nature works. I don’t make much of the fact that Yarus had an expected 109, rather than 73, non-mutant offspring per generation.
Edit: Yarus possibly changed the parameter settings of the program from those I’ve seen. I really don’t care if he did. I’m trying to share some fundamentals of how (not) to analyze evolutionary algorithms.