## Monday, July 26, 2010

### Efficient selection with fitness thresholds, heaps, and parallel processing — easier done than said

The obvious approach to selection in an evolutionary algorithm is to preserve the better individuals in the population and cull the others. This is known as truncation selection. The term hints at sorting a list of individuals in descending order of fitness, and then truncating it to length nSelect. But that is really not the way to do things. And doing selection well is really not that hard. After providing a gentle review of the considerations, I’ll prove my point with 18 lines of code.

A principle of computational problem solving is not to waste time determining just how bad a bad solution is. Suppose we’re selecting the 3 fittest of a population of 10 individuals, and that the first 3 fitness scores we obtain are 90, 97, and 93. This means that we’re no longer interested in individuals with fitness of 90 or lower. If it becomes clear in the course of evaluating the fourth individual that its fitness does not exceed the threshold of 90, we can immediately assign it fitness of, say, 0 and move on to the next individual.

Use of the threshold need not be so simple. For some fitness functions, a high threshold reduces work for all evaluations. An example is fitness based on the Levenshtein distance of a string of characters t from a reference string s. This distance is the minimum number of insertions, deletions, and substitutions of single characters required to make the strings identical. Fitness is inversely related to distance. Increasing the threshold reduces the number of possible alignments of the strings that must be considered in computing the distance. In limited experiments with an evolutionary computation involving the Levenshtein distance, I’ve halved the execution time by exploiting thresholds.

A natural choice of data structure for keeping track of the nSelect fittest individuals is a min heap. All you need to know about the heap is that it is stored in an indexed data structure, and that the least element has the least index. That is, the threshold element is always heap[0] when indexing is zero-based. The heap is initialized to contain nSelect dummy individuals of infinitely poor fitness. When an individual has super-threshold fitness, it replaces the threshold element, and the heap is readjusted.

Evolutionary computations cry out for parallel processing. It is cruel and immoral to run them sequentially on computers with multiple processors (cores). But I have made it seem as though providing the fitness function with the selection threshold depends upon sequential evaluation of individuals. There are important cases in which it does not. If parents compete with offspring for survival, then the heap is initialized at the beginning of the run, and is reinitialized only when the fitness function changes — never, in most applications. Also, if the number of fitness evaluations per generation exceeds the number of processors, as is common with present technology, then there remains a sequential component in processing.

The way I’ve approached parallel processing is to maintain throughout the evolutionary run a collection of processes dedicated to fitness evaluation. The processes exist when fitness evaluation cum selection begins. First an individual is submitted, along with the threshold, to each process. Then fitness scores are received one by one. For each score received, the heap and threshold are updated if necessary, and an unevaluated individual is submitted, along with the threshold, to the process that provided the score. In the experiments I mentioned above, I’ve nearly halved the execution time by running two cores instead of one. That is, the combined use of thresholds and two fitness-evaluation processes gives almost a factor-of-4 speedup.

The Python function

I’m going to provide an explanation that any programmer should be able to follow. But first look the code over, considering what I’ve said thus far. The heap, named best, is an optional parameter. The variable nSent registers the number of individuals that have been submitted to evaluation processes. It steps from 0 to popSize, the size of the population.

from heapq import heapreplace

def selected(population, popSize, nSelect, best = None):
if best == None:
best = nSelect * [(None, None)]
threshold = best[0][0]
nSent = 0

for process in processes:
if nSent == popSize: break
process.submit(population[nSent], threshold)
nSent += 1

for process, x, score in scored(popSize):
if score > threshold:
heapreplace(best, (score, x))
threshold = best[0][0]
if nSent < popSize:
process.submit(population[nSent], threshold)
nSent += 1

return best

If no heap is supplied, best is set to an indexed collection of nSelect (unfit, dummy) pairs represented as (None, None). This works because any (fitness, individual) pair is greater than (None, None). The expression best[0][0] yields the fitness of the least fit individual in the heap, i.e., the threshold fitness for selection.

The first for loop submits to each of the waiting processes an individual in population to evaluate, along with threshold. [My next post greatly improves selected by eliminating the direct manipulation of processes.] The loop exits early if there is a surplus of processes. The processes are instances of a subclass of multiprocessing.Process that I have defined, but am “hiding” from you. I am illustrating how to keep the logic of parallel processing simple through object-oriented design. You don’t need to see the code to understand perfectly well that process.submit() communicates the arguments to process.

The second for loop iterates popSize times, processing triples obtained from scored. Despite appearances, scored is not a function, but a generator. It does not return a collection of all of the triples. In each iteration, it yields just one (process, x, score) to indicate the process that most recently communicated an evaluation (x, score). This indicates not only that the fitness of individual x is score, but that process is waiting to evaluate another individual. If the new score exceeds the selection threshold, then (score, x) goes into the best heap, and threshold is updated. And then the next unevaluated individual in the population, if any, is submitted along with the threshold to the ready process.

When the loop is exited, each individual has had its chance to get into the best heap, which is returned to the caller. By the way, there’s an argument to be made that when the best heap is supplied to the function, an individual with fitness equal to that of the worst in the heap should replace the worst. Presumably the heap contains parents that are competing with offspring for survival. Replacing parents with offspring when they are no better than the offspring can enhance escape from fitness plateaus.