FGA documentation



Introduction to genetic algorithms


A genetic algorithm is a search heuristic used to solve optimization problems. It belongs to the paradigm of evolutionary computation and is inspired by biological models.

The possible solutions of a problem are encoded in chromosomes: first, a random population of chromosomes is created; then each chromosome is evaluated by the “fitness” function and the best chromosomes are selected for reproduction. The probability of a chromosome being chosen is proportional to its fitness:


p_i = f_i / (f_1 + f_2 + ... + f_N)

(where f_i is the fitness of chromosome i and N is the size of the population)
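As an illustration, fitness-proportional selection is commonly implemented with the “roulette wheel” technique. The following is a minimal sketch of the idea, assuming non-negative fitness values; it is not taken from fga.hpp:

#include <cstdlib>

// Return the index of a chromosome chosen with probability
// proportional to its fitness ("roulette wheel" selection).
int select_proportional(const float *fitness, int population_size)
{
    float total = 0.0f;
    for (int i = 0; i < population_size; i++)
        total += fitness[i];

    // spin the wheel: draw a point in [0, total)
    float r = total * ((float)rand() / ((float)RAND_MAX + 1.0f));

    float accumulated = 0.0f;
    for (int i = 0; i < population_size; i++) {
        accumulated += fitness[i];
        if (r < accumulated)
            return i;
    }
    return population_size - 1; // guard against floating-point rounding
}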


Two parent chromosomes generate two child chromosomes through the “crossover” operator; FGA implements different types of crossover and lets the user define custom ones. Here are two examples of crossover:






one-point crossover (cut after the third gene):

parent 1: 0 1 1 | 0 1 0        child 1: 0 1 1 1 0 1
parent 2: 1 0 0 | 1 0 1        child 2: 1 0 0 0 1 0

two-point crossover (cuts after the second and fourth genes):

parent 1: 0 1 | 1 0 | 1 0      child 1: 0 1 0 1 1 0
parent 2: 1 0 | 0 1 | 0 1      child 2: 1 0 1 0 0 1
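In code, a one-point crossover for the char chromosomes used throughout this document could look like the following sketch (not FGA's built-in implementation):

#include <algorithm>
#include <cstdlib>

// Swap the genes of two chromosomes after a random cut point,
// turning the two parents into the two children in place.
void one_point_crossover(char *chromosome1, char *chromosome2, int length)
{
    int cut = rand() % length;
    for (int i = cut; i < length; i++)
        std::swap(chromosome1[i], chromosome2[i]);
}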


Then comes the mutation stage, where each gene of a chromosome has a given probability of being changed (with FGA it is also possible to define different mutation routines).
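A minimal sketch of this stage, assuming a per-gene mutation rate in [0, 1] and a user-supplied gene mutation routine like the my_mutate_gene() of the examples below:

#include <cstdlib>

char my_mutate_gene(char gene); // user-defined, see the examples below

// Give each gene of the chromosome an independent chance of mutating.
void mutate_chromosome(char *chromosome, int length, float mutation_rate)
{
    for (int i = 0; i < length; i++) {
        float r = (float)rand() / (float)RAND_MAX;
        if (r < mutation_rate)
            chromosome[i] = my_mutate_gene(chromosome[i]);
    }
}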

These steps are repeated until the population's best chromosome satisfies some minimum criterion or until its fitness stabilizes at a certain value.



Using FGA


The header fga.hpp implements two template classes: Population and PopulationMT. The former implements the genetic algorithm itself, while the latter builds a parallel, multi-threaded version of it.

The user must define the data type of the genes, the size of the population, the number of threads to run (in the case of PopulationMT) and some functions that are used to generate, evaluate, combine and mutate chromosomes, the most important being the “fitness” function. All the functions are passed as function pointers.

Then run() or cycle() must be called to start the algorithm; there are, of course, functions that return the best chromosome of the population.

For example, let's assume chromosomes are strings of characters:


...

float my_fitness_function(char *chromosome)
{
    // compute fitness value
    return fitness_of_chromosome;
}

char my_mutate_gene(char gene)
{
    // compute mutated gene
    return mutated_gene;
}

char my_random_gene()
{
    // compute a random gene
    return random_gene;
}

int main()
{
    ...

    Population<char> my_population(number_of_chromosomes, length_of_chromosomes, my_fitness_function, my_mutate_gene, my_random_gene, NULL, NULL, NULL);

    my_population.run(minimum_tolerated_fitness, maximum_number_of_generations);

    float best_score = my_population.get_all_time_best_score();
    char *best_chromosome = my_population.get_all_time_best();

    ...
}


OR


int main()
{
    ...

    while (some_condition) {
        my_population.cycle();

        float best_score = my_population.get_best_score();
        char *best_chromosome = my_population.get_best();
    }

    ...
}


OR (for custom crossover operator and chromosome-wide routines)


...

float my_fitness_function(char *chromosome)
{
    // compute fitness value
    return fitness_of_chromosome;
}

void my_mutate_chromosome(char *chromosome)
{
    // mutate the chromosome in place
}

void my_random_chromosome(char *chromosome)
{
    // generate a chromosome in the given buffer
}

void my_crossover_operator(char *chromosome1, char *chromosome2)
{
    // cross over the two chromosomes
}

int main()
{
    ...

    Population<char> my_population(number_of_chromosomes, length_of_chromosomes, my_fitness_function, NULL, NULL, my_crossover_operator, my_mutate_chromosome, my_random_chromosome);

    ...
}


OR (for multi-threaded algorithm)


...

PopulationMT<char> my_population(number_of_threads, total_number_of_chromosomes, length_of_chromosomes, my_fitness_function, my_mutate_gene, my_random_gene, NULL, NULL, NULL);

...


For a complete list of the available functions refer to the file fga.hpp.


The maxbit_st and maxbit_mt examples show the usage of the FGA library on a very simple problem: maximizing the number of 1-bits in a binary string of fixed length.
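For instance, the user-supplied functions for maxbit could look like this sketch (not necessarily identical to the bundled examples), with genes encoded as the characters '0' and '1':

#include <cstdlib>

const int LENGTH = 32; // hypothetical chromosome length

// fitness is simply the number of '1' genes in the string
float maxbit_fitness(char *chromosome)
{
    float ones = 0.0f;
    for (int i = 0; i < LENGTH; i++)
        if (chromosome[i] == '1')
            ones += 1.0f;
    return ones;
}

// mutating a gene flips the bit
char maxbit_mutate_gene(char gene)
{
    return gene == '0' ? '1' : '0';
}

// a random gene is an unbiased coin toss
char maxbit_random_gene()
{
    return (rand() % 2) ? '1' : '0';
}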


The following is a less trivial application of genetic algorithms.



Solving the travelling salesman problem


The “travelling salesman problem” (often called TSP) consists in finding the cheapest Hamiltonian path (or cycle) that visits all the nodes (“cities”) of a graph.

The decision version of the TSP is proven to be NP-complete; finding an optimal solution takes O(n!) time or, using dynamic programming, O(2^n) time, which is in any case exponential.

The O(n!) bound comes from trying all the possible permutations of the sequence of nodes traversed in the path.

Using a heuristic based on genetic algorithms, a good solution to the problem can be found in a reasonable amount of time, even though it is not guaranteed to be the optimal one.

The main issue is choosing how to represent a path in a chromosome and how to implement the crossover operator: if a path is represented (as is natural) by a permutation of the nodes, the standard operators don't guarantee that the strings they produce represent valid Hamiltonian paths. Several approaches have been devised to solve this problem by changing the representation; for a brief review of them see [1].

My choice was to keep the simplest representation and define new crossover and mutation operators. Crossover now selects a random sub-string from parent 1 and joins it with the maximum compatible sub-string from parent 2 (and vice versa for child 2); then, if needed, it completes the path with the missing nodes. For example, consider a TSP with 5 cities:


parent 1: A – C – E – B – D

parent 2: B – E – A – D – C


random sub-string from parent 1: C – E –

maximum compatible sub-string from parent 2: – E – A – D –


sub-strings joined: C – E – A – D –

child 1 (completed with missing nodes): C – E – A – D – B


This new crossover operator is particularly good at preserving inherited sub-paths, and always returns well-formed Hamiltonian paths.
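A closely related standard operator with the same validity guarantee is “order crossover” (OX): copy a random slice from parent 1, then fill the remaining positions with the missing cities in the order they appear in parent 2. A minimal sketch (not the exact operator used by tsp), with each city encoded as one char:

#include <cstdlib>

void order_crossover(const char *p1, const char *p2, char *child, int n)
{
    // choose a random slice [start, end] of parent 1
    int start = rand() % n;
    int end = start + rand() % (n - start);

    bool used[256] = { false };
    for (int i = start; i <= end; i++) {
        child[i] = p1[i]; // copy the slice verbatim
        used[(unsigned char)p1[i]] = true;
    }

    // fill the remaining positions with the missing cities,
    // in the order in which they appear in parent 2
    int pos = 0;
    for (int i = 0; i < n; i++) {
        if (used[(unsigned char)p2[i]])
            continue;
        if (pos == start)
            pos = end + 1; // skip over the copied slice
        child[pos++] = p2[i];
    }
}

The second child is obtained in the same way with the roles of the parents swapped.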

The mutation operator simply swaps two random nodes:


child 1 (mutated): C – D – A – E – B


The mutation rate thus takes on a new meaning: it is the probability that a chromosome will be subjected to mutation.
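A sketch of this whole-chromosome mutation routine, matching the void-returning signature shown earlier (names are hypothetical):

#include <algorithm>
#include <cstdlib>

const int CITIES = 5; // hypothetical number of cities

// swap two random positions of the path: the result is still a
// valid permutation, so no repair step is needed
void tsp_mutate_chromosome(char *path)
{
    int i = rand() % CITIES;
    int j = rand() % CITIES;
    std::swap(path[i], path[j]);
}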


Comparing the algorithm with the classical brute-force approach shows the benefit of using genetic algorithms to solve problems that lack a polynomial-time solution.


$ ./graph_gen

L = 5


$ time ./tsp_bf

Brute force algorithm


Progress: 100% (120 of 120 paths generated)


Best cost: 10


Best path:

2 4 0 1 3


real 0m0.071s

user 0m0.040s

sys 0m0.024s


$ time ./tsp

Generations: 1512


Mutation rate: 0.7


Best cost: 10


Best path:

2 4 0 1 3


Best path was stable for 1500 generations. Algorithm halted.


real 0m2.024s

user 0m1.212s

sys 0m0.687s


With a graph of only 5 nodes, the brute-force algorithm found the optimal solution instantly; the genetic algorithm also found it, but had to wait a fixed number of generations before considering it the final result.

If we increase the size of the graph, the exponential growth of the brute-force solution becomes clear:


$ ./graph_gen

L = 25


$ time ./tsp

Generations: 14556


Mutation rate: 0.8


Best cost: 78


Best path:

19 24 9 17 11 12 13 7 0 2 23 1 3 4 6 5 8 10 14 15 16 18 20 21 22


Best path was stable for 7500 generations. Algorithm halted.


real 0m19.399s

user 0m10.824s

sys 0m5.089s


$ time ./tsp_bf

Brute force algorithm


Progress: 188700000 paths generated


Best cost: 98


Best path:

0 1 2 3 4 5 6 7 8 9 10 11 12 16 18 22 20 21 24 15 14 17 13 23 19


real 3m40.658s

user 2m26.597s

sys 1m0.451s


After more than 3 minutes we had to stop the process, and the algorithm was still at ~0% of its progress; moreover, the solution found by that time was visibly worse than the one returned by the genetic algorithm.

More precisely, the brute-force version had generated ~10^8 of ~10^25 (25! = 15511210043330985984000000 ≈ 10^25) total possible paths, so we could expect it to terminate on the order of 10^17 minutes, which corresponds to ~10^11 years.

The genetic algorithm took less than 20 seconds to return a good approximate solution.


[1] http://www.cs.uml.edu/~giam/91.510/Lectures/Lecture6.ppt


Alessandro Presta