Tuesday, July 13, 2010

Sure mutation in Python

In Python, “lazy” mutation goes something like this:

for i in range(len(offspring)):
    if random() < mutation_rate:
        offspring[i] = choice(alphabet)

The random() value is uniform on [0, 1), and the choice function returns a character drawn uniformly at random from alphabet. It follows from my last post that this can be made right within the implementation of an evolutionary algorithm by defining

adjusted_rate = \
    len(alphabet) / (len(alphabet) - 1) * mutation_rate

and using it in place of mutation_rate. “And that’s all I have to say about that.”

If you want a mutation operator that surely mutates, the following code performs well:

from random import randint

alphabet = 'abcdefghijklmnopqrstuvwxyz '
alphaSize = len(alphabet)
alphaIndex = \
    dict([(alphabet[i], i) for i in range(alphaSize)])

def mutate(c):
    i = randint(0, alphaSize - 2)
    if i >= alphaIndex[c]:
        i += 1
    return alphabet[i]

Here alphaIndex is a dictionary associating each character in the alphabet with its index in the string alphabet. The first character of a string is indexed 0. Thus the expressions alphaIndex['a'] and alphaIndex['d'] evaluate to 0 and 3, respectively. For all characters c in alphabet,

alphaIndex[c] == alphabet.index(c).

Looking up an index in the dictionary alphaIndex is slightly faster than calling the function alphabet.index, which searches alphabet sequentially to locate the character. The performance advantage for the dictionary would be greater if the alphabet were larger.

The function mutate randomly selects an index other than that of character c, and returns the character in alphabet with the selected index. It starts by calling randint to get a random index i between “least index” (0) and “maximum index minus 1.” The trick is that if i is greater than or equal to the index of the character c that we want to exclude from selection, then it is bumped up by 1. This puts it in the range alphaIndex[c] + 1, …, alphaSize - 1 (the maximum index). All indexes other than that of c are equally likely to be selected.

No comments :

Post a Comment