Genetic Scala is a basic genetic algorithm implementation in Scala. The solution to achieve is a 64 bit number which is entered as the candidate solution. The fitness of each organism is it’s closeness to this 64-bit number and the process completes when an organism’s chromosome matches this solution.

The initial generation is generated stochastically and then improves toward the solution each generation by creating children from the fittest individuals (the closest match to the solution) of a generation and mutating individuals to allow their genome to evolve.

Running

sbt run

Process:

1. Initialisation

Creates a random population of 50 organisms each with a 64-bit chromosome, such as:

001100101011101010101101001010110101010110101010110101101001010

2. Evaluation

Fitness is calculated for each organism in the population by comparing each bit of the candidate solution with each bit in the organism’s chromosome. This gives a score which is then re-oriented to be a float between 0.0 and 1.0, using:

1.0 / ((m - s) / 100)

…where m is the maximum score and s is the organism’s score.

3. Selection

Organisms are selected based on stochastic universal sampling which can avoid issues with approaches such as roulette wheel selection when dealing with organisms that have very high fitness scores saturating the candidate space.

10 rounds are performed, each picking a random organism from the population. Then the fittest organism is selected from the tournament. This picks a strong organism but also allows for a diverse population as there is a potential for a weaker organism to be selected which could contain genetic information useful for later recombination.

By evolving the population in ‘elitist’ mode, the strongest organism is kept from the previous population generation and added to the new generation. This ensures the next generation is at least as fit as the fittest in the previous generation.

4. Crossover

To create a new organism, two parents are selected using tournament selection then a child is created using uniform crossover which uses a mixing ratio of 0.5, thereby taking an roughly equal number of genes from each organism.

In the below partial chromosome, z is created using alternate genes from parents x and y.

 
x = 0010|11|10|10|1101|011|010|10|...
y = 1011|00|01|01|0010|100|101|01|...
z = 0010|00|10|01|1101|100|010|01|...
 

5. Mutation

Randomness is added to the genetics of the population by making small changes to an organism’s chromosome. A mutation rate of 0.015 is used to change a low enough proportion of the genes to not adversely affect already fit organisms but high enough to introduce some healthy change in the population’s genome to move closer to a solution.

6. Repeat

The process is repeated by evolving the population and checking the fittest in that next generation.

7. Termination

The process is repeated until we find an organism with a fitness of 1.0 (as we have an understandable solution) so our fitness function can find a definite score rather than just a best fit.

Results

Generally a result is found within 10 - 20 generations. Output shows the generation, the chromosome of the fittest organism in that generation and the fitness. Finally the candidate solution is displayed along with the solution found from the fittest organism.

generation candidate fitness
01 1011111111000001110110000100001101100101011101101110010011010101 0.75
02 1001011111010101110111000100011101110101011110101110010111011101 0.80
03 0000010111010101000101010101000001100101000101011100010001111101 0.86
04 0001010101010101110101010101000101100101010101011100010011110101 0.90
05 0001010101010100100101010100010101010101010101010111010101110101 0.93
06 0001010101010100100101010100010101010101010101010111010101110101 0.93
07 0001010101010101010101010101011101010101010101010100010101010101 0.97
08 0001010101010101010101010101011101010101010101010100010101010101 0.97
09 0101010101010101000101010101010101010101010101010101010101010101 0.99
10 0101010101010101010101010101010101010101010101010101010101010101 0.99
     

Thus the candidate solution and the achieved solution are the same:

candidate: 0101010101010101010101010101010101010101010101010101010101010101 solution: 0101010101010101010101010101010101010101010101010101010101010101