"\n",
"\n",
"* Uniform Crossover: This type of crossover randomly chooses which genes to take from each parent. Here genes 1, 2 and 5 were chosen from the first parent, so genes 3 and 4 are taken from the second parent.\n",
"\n",
""
]
},
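{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough illustration of uniform crossover, the sketch below takes each gene from one of the two parents with equal probability. The name `uniform_crossover` and the 50/50 choice per gene are assumptions made for this example; they are not taken from the `search` module.\n",
"\n",
"```python\n",
"import random\n",
"\n",
"def uniform_crossover(x, y):\n",
"    # Hypothetical helper: for each position, keep the gene of x or of y at random.\n",
"    assert len(x) == len(y), 'parents must have the same length'\n",
"    return ''.join(random.choice(pair) for pair in zip(x, y))\n",
"\n",
"child = uniform_crossover('aaaaa', 'bbbbb')\n",
"print(child)  # a mix of 'a' and 'b' genes; the result varies from run to run\n",
"```\n",
"\n",
"Unlike a single crossover point, no contiguous block of genes is preserved, so genes that only work well together are more likely to be separated."
]
},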
{
"cell_type": "markdown",
"### Mutation\n",
"\n",
"When an offspring is produced, there is a chance it will mutate, having one (or more, depending on the implementation) of its genes altered.\n",
"\n",
"For example, let's say the new individual to undergo mutation is \"abcde\". We randomly pick its third gene and change it to 'z'. The individual now becomes \"ab<font color='red'>z</font>de\" and is added to the population."
]
},
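{
"cell_type": "markdown",
"metadata": {},
"source": [
"The chance of mutation can be sketched as a simple probability check. The helper name `maybe_mutate` and its parameters are hypothetical and only serve this illustration; it changes a single gene, which is one of several possible designs.\n",
"\n",
"```python\n",
"import random\n",
"\n",
"def maybe_mutate(individual, gene_pool, pmut):\n",
"    # Hypothetical helper: leave the individual unchanged with probability 1 - pmut.\n",
"    if random.random() >= pmut:\n",
"        return individual\n",
"    # Otherwise replace one randomly chosen gene with a random gene from the pool.\n",
"    c = random.randrange(len(individual))\n",
"    return individual[:c] + random.choice(gene_pool) + individual[c + 1:]\n",
"\n",
"print(maybe_mutate('abcde', list('abcdez'), pmut=0.1))\n",
"```\n",
"\n",
"Note that the replacement gene may be the same as the old one, in which case the individual does not visibly change."
]
},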
{
"cell_type": "markdown",
"At each iteration, the fittest individuals are picked randomly to mate and produce offspring. We measure an individual's fitness with a *fitness function*. That function depends on the given problem and is used to score an individual; usually, the higher the score, the better.\n",
"\n",
"The selection process is as follows:\n",
"\n",
"1) Individuals are scored by the fitness function.\n",
"\n",
"2) Individuals are picked randomly, according to their score (higher score means higher chance to get picked). Usually the formula to calculate the chance to pick an individual is the following (for population *P* and individual *i*):\n",
"\n",
"$$ chance(i) = \\dfrac{fitness(i)}{\\sum_{k \\in P}{fitness(k)}} $$"
]
},
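{
"cell_type": "markdown",
"metadata": {},
"source": [
"The selection formula above corresponds to what is often called fitness-proportionate selection. A minimal sketch follows, assuming all fitness values are non-negative and at least one is positive; the helper name `select` and its parameters are chosen for this example only.\n",
"\n",
"```python\n",
"import random\n",
"\n",
"def select(population, fitness_fn, r):\n",
"    # Hypothetical helper: pick r individuals with probability proportional to fitness.\n",
"    # random.choices normalises the weights, which matches the formula above.\n",
"    weights = [fitness_fn(individual) for individual in population]\n",
"    return random.choices(population, weights=weights, k=r)\n",
"\n",
"population = ['0000', '0011', '1111']\n",
"parents = select(population, lambda s: s.count('1') + 1, 2)\n",
"print(parents)  # two parents; fitter individuals are picked more often\n",
"```\n",
"\n",
"Selection is done with replacement here, so the same individual may be picked as both parents."
]
},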
{
"cell_type": "markdown",
"### Implementation\n",
"\n",
"Below we look over the implementation of the algorithm in the `search` module.\n",
"\n",
"First the implementation of the main core of the algorithm:"
]
},
{
"cell_type": "code",
"outputs": [],
"source": [
"%psource genetic_algorithm"
]
},
{
"cell_type": "markdown",
"source": [
"The algorithm takes the following input:\n",
"\n",
"* `population`: The initial population.\n",
"\n",
"* `fitness_fn`: The problem's fitness function.\n",
"\n",
"* `gene_pool`: The gene pool of the states/individuals. Genes need to be chars. By default '0' and '1'.\n",
"\n",
"* `f_thres`: The fitness threshold. If an individual reaches that score, iteration stops. By default 'None', which means the algorithm will try and find the optimal solution.\n",
"\n",
"* `ngen`: The number of iterations/generations.\n",
"\n",
"* `pmut`: The probability of mutation.\n",
"\n",
"The algorithm gives as output the state with the largest score."
]
},
{
"cell_type": "markdown",
"For each generation, the algorithm updates the population. First it calculates the fitnesses of the individuals, then it selects the most fit ones and finally crosses them over to produce offspring. There is a chance that the offspring will be mutated, given by `pmut`. If at the end of the generation an individual meets the fitness threshold, the algorithm halts and returns that individual.\n",
"\n",
"Mating is carried out by the function `reproduce`:"
]
},
{
"cell_type": "code",
},
"outputs": [],
"source": [
"def reproduce(x, y):\n",
"    n = len(x)\n",
"    c = random.randrange(0, n)\n",
"    return x[:c] + y[c:]"
]
},
{
"cell_type": "markdown",
"source": [
"The function picks a random crossover point `c` and joins the first `c` genes of `x` with the remaining genes of `y`.\n",
"\n",
"Mutation is done in the function `mutate`:"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
},
"outputs": [],
"source": [
"def mutate(x, gene_pool):\n",
"    n = len(x)\n",
"    g = len(gene_pool)\n",
"    c = random.randrange(0, n)\n",
"    r = random.randrange(0, g)\n",
"    new_gene = gene_pool[r]\n",
"    return x[:c] + new_gene + x[c+1:]"
]
},
{
"cell_type": "markdown",
"We pick a gene in `x` to mutate and a gene from the gene pool to replace it with.\n",
"\n",
"To help initialize the population we have the helper function `init_population`:"
]
},
{
"cell_type": "code",
},
"outputs": [],
"source": [
"def init_population(pop_number, gene_pool, state_length):\n",
"    g = len(gene_pool)\n",
"    population = []\n",
"    for i in range(pop_number):\n",
"        new_individual = ''.join([gene_pool[random.randrange(0, g)]\n",
"                                  for j in range(state_length)])\n",
"        population.append(new_individual)\n",
"\n",
"    return population"
]
},
{
"cell_type": "markdown",
"source": [
"The function takes as input the number of individuals in the population, the gene pool and the length of each individual/state. It creates individuals with random genes and returns the population when done."
]
},
{
"cell_type": "markdown",
"source": [
"### Usage\n",
"\n",
"Below we give two example usages of the genetic algorithm: a graph coloring problem and the 8 queens problem.\n",
"\n",
"First we will take on the simpler problem of coloring a small graph with two colors. Before we do anything, let's imagine how a solution might look. We have only two colors, so we can represent them in binary notation: 0 for one color and 1 for the other. These make up our gene pool. What about the individual solutions, though? For that, we will look at our problem. A graph has nodes and edges, and we want to color the nodes, so naturally we want to store each node's color. If we have four nodes, we can store their colors in a string of genes, one for each node. A possible solution will then look like this: \"1100\". In the general case, we will represent each solution with a string of 1s and 0s whose length is the number of nodes.\n",
"\n",
"Next we need to come up with a fitness function that appropriately scores individuals. Again, we will look at the problem definition. For a solution to be optimal, no edge should connect two nodes of the same color. How can we use this information to score a solution? A naive (and ineffective) approach would be to count the different colors in the string, so \"1111\" has a score of 1 and \"1100\" has a score of 2. Why is that fitness function not ideal, though? Because we forgot the information about the edges! The edges are pivotal to the problem and the above function only deals with node colors. We didn't use all the information at hand and ended up with an ineffective answer. How, then, can we use that information to our advantage?\n",
"\n",
"We said that the optimal solution will have all the edges connecting nodes of different colors. So, to score a solution we can count how many edges are valid (that is, connect nodes of different colors). That is a great fitness function!\n",
"\n",
"Let's jump into solving this problem using the `genetic_algorithm` function."
]
},
{
"cell_type": "markdown",
"source": [
"First we need to represent the graph. Since we mostly need information about edges, we will just store the edges. We will denote edges with capital letters and nodes with integers:"
]
},
{
"cell_type": "code",
},
"outputs": [],
"source": [
"edges = {\n",
"    'A': [0, 1],\n",
"    'B': [0, 3],\n",
"    'C': [1, 2],\n",
"    'D': [2, 3]\n",
"}"
]
},
{
"cell_type": "markdown",
"Edge 'A' connects nodes 0 and 1, edge 'B' connects nodes 0 and 3, and so on.\n",
"\n",
"We already said our gene pool is 0 and 1, so we can jump right into initializing our population. Since we have only four nodes, `state_length` should be 4. For the number of individuals, we will try 8. We can increase this number if we need higher accuracy, but be careful! Larger populations need more computing power and take longer. You need to strike that sweet balance between accuracy and cost (the ultimate dilemma of the programmer!)."
]
},
{
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['0011', '1111', '0000', '1010', '0111', '1010', '0111', '0011']\n"
]
}
],
"population = init_population(8, ['0', '1'], 4)\n",
"print(population)"
]
},
{
"cell_type": "markdown",
"We created and printed the population. You can see that the genes in the individuals are random and there are 8 individuals, each with 4 genes.\n",
"\n",
"Next we need to write our fitness function. We previously said we want the function to count how many edges are valid. So, given a coloring/individual `c`, we will do just that:"
]
},
{
"cell_type": "code",
},
"outputs": [],
"source": [
"def fitness(c):\n",
"    return sum(c[n1] != c[n2] for (n1, n2) in edges.values())"
]
},
{
"cell_type": "markdown",
"Great! Now we will run the genetic algorithm and see what solution it gives."
]
},
{
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1010\n"
]
}
],
"solution = genetic_algorithm(population, fitness)\n",
"print(solution)"
]
},
{
"cell_type": "markdown",
"source": [
"The algorithm converged to a solution. Let's check its score:"
]
},
{
"cell_type": "code",
"execution_count": 24,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"4\n"
]
}
],
"source": [
"print(fitness(solution))"
]
},
{
"cell_type": "markdown",
"source": [
"The solution has a score of 4, which means it is optimal: we have exactly 4 edges in our graph, so all of them are valid!\n",
"\n",
"*NOTE: Because the algorithm is non-deterministic, there is a chance a different solution is given. It might even be wrong, if we are very unlucky!*"
]
},
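{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because this graph is tiny, the claim that a score of 4 is optimal can be checked by brute force. The sketch below restates the `edges` dictionary and the edge-counting `fitness` function from above so that it is self-contained, then scores all 16 possible colorings; the enumeration itself is an addition for illustration and is not part of the genetic algorithm.\n",
"\n",
"```python\n",
"from itertools import product\n",
"\n",
"edges = {'A': [0, 1], 'B': [0, 3], 'C': [1, 2], 'D': [2, 3]}\n",
"\n",
"def fitness(c):\n",
"    # Count the edges whose two endpoints have different colors.\n",
"    return sum(c[n1] != c[n2] for (n1, n2) in edges.values())\n",
"\n",
"# Enumerate all 2**4 colorings and keep the ones with the highest score.\n",
"colorings = [''.join(genes) for genes in product('01', repeat=4)]\n",
"best_score = max(fitness(c) for c in colorings)\n",
"optimal = [c for c in colorings if fitness(c) == best_score]\n",
"print(best_score, optimal)  # 4 ['0101', '1010']\n",
"```\n",
"\n",
"Exhaustive search like this only works for very small state spaces, which is exactly where a genetic algorithm is least needed."
]
},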
{
"cell_type": "markdown",
"#### Eight Queens\n",
"\n",
"Let's take a look at a more complicated problem.\n",
"\n",
"In the *Eight Queens* problem, we are tasked with placing eight queens on an 8x8 chessboard without any queen threatening the others (that is, no two queens may share a row, column or diagonal). In its general form the problem is defined as placing *N* queens on an NxN chessboard without any conflicts.\n",
"\n",
"First we need to think about the representation of each solution. We can go the naive route of representing the whole chessboard with the queens' placements on it. That is definitely one way to go about it, but for the purpose of this tutorial we will do something different. We have eight queens, so we will have a gene for each of them. The gene pool will be numbers from 0 to 7, for the different columns. The *position* of the gene in the state will denote the row the particular queen is placed in.\n",
"\n",
"For example, we can have the state \"03304577\". Here the first gene with a value of 0 means \"the queen at row 0 is placed at column 0\", for the second gene \"the queen at row 1 is placed at column 3\" and so forth.\n",
"\n",
"We now need to think about the fitness function. In the graph coloring problem we counted the valid edges. The same thought process can be applied here. Instead of edges though, we have positionings between queens. If two queens are not threatening each other, we say they are in a \"non-attacking\" positioning. We can, therefore, count how many such positionings there are.\n",
"\n",
"Let's dive right in and initialize our population:"
]
},
{
"cell_type": "code",
"execution_count": 25,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['16144650', '15257744', '25105035', '45153531', '02333213']\n"
]
}
],
"source": [
"population = init_population(100, [str(i) for i in range(8)], 8)\n",
"print(population[:5])"
]
},
{
"cell_type": "markdown",
"We have a population of 100 and each individual has 8 genes. The gene pool is the integers from 0 to 7, in string form. Above you can see the first five individuals.\n",
"\n",
"Next we need to write our fitness function. Remember, queens threaten each other if they are in the same row, column or diagonal.\n",
"\n",
"Since positionings are mutual, we must take care not to count them twice. Therefore, for each queen we will only check for conflicts with the queens after her.\n",
"\n",
"A gene's value in an individual `q` denotes the queen's column, and the position of the gene denotes its row. Two queens can never share a row in this representation, so we only need to check whether their columns are the same. We also need to check for diagonals. A queen *a* is in the diagonal of another queen, *b*, if the difference of the rows between them is equal to either their difference in columns (for the diagonal on the right of *a*) or equal to the negative difference of their columns (for the left diagonal of *a*). The fitness function is given below."
]
},
{
"cell_type": "code",
},
"outputs": [],
"source": [
"def fitness(q):\n",
"    non_attacking = 0\n",
"    for row1 in range(len(q)):\n",
"        for row2 in range(row1+1, len(q)):\n",
"            col1 = int(q[row1])\n",
"            col2 = int(q[row2])\n",
"            row_diff = row1 - row2\n",
"            col_diff = col1 - col2\n",
"            if col1 != col2 and row_diff != col_diff and row_diff != -col_diff:\n",
"                non_attacking += 1\n",
"    return non_attacking"
]
},
{
"cell_type": "markdown",
"Note that the best score achievable is 28. That is because for each queen we only check the queens after her: for the first queen we check 7 other queens, for the second queen 6 others, and so on. In short, the number of checks we make is the sum 7+6+5+...+1, which is equal to 7\\*(7+1)/2 = 28.\n",
"\n",
"Because finding a perfect solution is hard and can take a long time, we will set the fitness threshold at 25. If we find an individual with a score greater than or equal to that, we will halt. Let's see how the genetic algorithm will fare."
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"43506172\n",
"26\n"
]
}
],
"solution = genetic_algorithm(population, fitness, f_thres=25)\n",
"print(solution)\n",
"print(fitness(solution))"
]
},
{
"cell_type": "markdown",
"Above you can see the solution and its fitness score, which should be no less than 25."
]
},
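{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the maximum score of 28 concrete, the sketch below restates the non-attacking pair count in a slightly more compact form (using `abs`, which is equivalent to the two diagonal checks) and applies it to two hand-picked states. The state `04752613` is a well-known conflict-free placement chosen for this illustration; it is not an output of the genetic algorithm.\n",
"\n",
"```python\n",
"from math import comb\n",
"\n",
"def fitness(q):\n",
"    # Count the pairs of queens that do not attack each other.\n",
"    non_attacking = 0\n",
"    for row1 in range(len(q)):\n",
"        for row2 in range(row1 + 1, len(q)):\n",
"            col_diff = int(q[row1]) - int(q[row2])\n",
"            # Same column: col_diff == 0. Same diagonal: |col_diff| == row distance.\n",
"            if col_diff != 0 and abs(col_diff) != row2 - row1:\n",
"                non_attacking += 1\n",
"    return non_attacking\n",
"\n",
"print(comb(8, 2))           # 28 pairs of queens in total\n",
"print(fitness('04752613'))  # 28: no pair of queens is in conflict\n",
"print(fitness('00000000'))  # 0: every pair shares a column\n",
"```\n",
"\n",
"A threshold of 25 therefore accepts states with up to three attacking pairs."
]
},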
{
"cell_type": "markdown",
"source": [
"With that, this tutorial on the genetic algorithm comes to an end. We hope you found this guide helpful!"
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"state": {
"013d8df0a2ab4899b09f83aa70ce5d50": {
"views": []
},
"01ee7dc2239c4b0095710436453b362d": {
"views": []
},
"04d594ae6a704fc4b16895e6a7b85270": {
"views": []
},
"052ea3e7259346a4b022ec4fef1fda28": {
"views": [
{
}
]
},
"0ade4328785545c2b66d77e599a3e9da": {
"views": [
{
}
]
},
"0b94d8de6b4e47f89b0382b60b775cbd": {
"views": []
},
"0c63dcc0d11a451ead31a4c0c34d7b43": {
"views": []
},
"0d91be53b6474cdeac3239fdffeab908": {
"views": [
{
}
]
},
"0fe9c3b9b1264d4abd22aef40a9c1ab9": {
"views": []
},
"10fd06131b05455d9f0a98072d7cebc6": {
"views": []
},
"1193eaa60bb64cb790236d95bf11f358": {
"views": [
{
}
]
},
"11b596cbf81a47aabccae723684ac3a5": {
"views": []
},
"127ae5faa86f41f986c39afb320f2298": {
"views": []
},
"16a9167ec7b4479e864b2a32e40825a1": {
"views": [
{
}
]
},
"170e2e101180413f953a192a41ecbfcc": {
"views": []
},
"181efcbccf89478792f0e38a25500e51": {
"views": []
},
"1894a28092604d69b0d7d465a3b165b1": {
"views": []
},
"1a56cc2ab5ae49ea8bf2a3f6ca2b1c36": {
"views": []
},
"1cfd8f392548467696d8cd4fc534a6b4": {
"views": []
},
"1e395e67fdec406f8698aa5922764510": {
"views": []
},
"23509c6536404e96985220736d286183": {
"views": []
},
"23bffaca1206421fb9ea589126e35438": {
"views": []
},
"25330d0b799e4f02af5e510bc70494cf": {
"views": []
},
"2ab8bf4795ac4240b70e1a94e14d1dd6": {
"views": [
{
}
]
},
"2bd48f1234e4422aaedecc5815064181": {
"views": []
},
"2d3a082066304c8ebf2d5003012596b4": {
"views": []
},
"2dc962f16fd143c1851aaed0909f3963": {
"views": [
{
}
]
},
"2f659054242a453da5ea0884de996008": {
"views": []
},
"30a214881db545729c1b883878227e95": {
"views": []
},
"3275b81616424947be98bf8fd3cd7b82": {
"views": []
},
"330b52bc309d4b6a9b188fd9df621180": {
"views": []
},
"3320648123f44125bcfda3b7c68febcf": {
"views": []
},
"338e3b1562e747f197ab3ceae91e371f": {
"views": []
},
"34658e2de2894f01b16cf89905760f14": {
"views": [
{
}
]
},
"352f5fd9f698460ea372c6af57c5b478": {
"views": []
},
"35dc16b828a74356b56cd01ff9ddfc09": {
"views": []
},
"3805ce2994364bd1b259373d8798cc7a": {
"views": []
},
"3d1f1f899cfe49aaba203288c61686ac": {
"views": []
},
"3d7e943e19794e29b7058eb6bbe23c66": {
"views": []
},
"3f6652b3f85740949b7711fbcaa509ba": {
"views": []
},
"43e48664a76342c991caeeb2d5b17a49": {
"views": [
{
}
]
},
"4662dec8595f45fb9ae061b2bdf44427": {
"views": []
},
"47ae3d2269d94a95a567be21064eb98a": {
"views": []
},
"49c49d665ba44746a1e1e9dc598bc411": {
"views": [
{
}
]
},
"4a1c43b035f644699fd905d5155ad61f": {
"views": [
{
}
]
},
"4eb88b6f6b4241f7b755f69b9e851872": {
"views": []
},
"4fbb3861e50f41c688e9883da40334d4": {
"views": []
},
"52d76de4ee8f4487b335a4a11726fbce": {
"views": []
},
"53eccc8fc0ad461cb8277596b666f32a": {
"views": [
{
}
]
},
"54d3a6067b594ad08907ce059d9f4a41": {
"views": []
},
"612530d3edf8443786b3093ab612f88b": {
"views": []
},
"613a133b6d1f45e0ac9c5c270bc408e0": {
"views": []
},
"636caa7780614389a7f52ad89ea1c6e8": {
"views": [
{
}
]
},
"63aa621196294629b884c896b6a034d8": {
"views": []
},
"66d1d894cc7942c6a91f0630fc4321f9": {
"views": []
},
"6775928a174b43ecbe12608772f1cb05": {
"views": []
},
"6bce621c90d543bca50afbe0c489a191": {
"views": []
},
"6ebbb8c7ec174c15a6ee79a3c5b36312": {
"views": []
},
"743219b9d37e4f47a5f777bb41ad0a96": {
"views": [
{
}
]
},
"774f464794cc409ca6d1106bcaac0cf1": {
"views": []
},
"7ba3da40fb26490697fc64b3248c5952": {
"views": []
},
"7e79fea4654f4bedb5969db265736c25": {
"views": []
},
"85c82ed0844f4ae08a14fd750e55fc15": {
"views": []
},
"86e8f92c1d584cdeb13b36af1b6ad695": {
"views": [
{
}
]
},
"88485e72d2ec447ba7e238b0a6de2839": {
"views": []
},
"892d7b895d3840f99504101062ba0f65": {
"views": []
},
"89be4167713e488696a20b9b5ddac9bd": {
"views": []
},
"8a24a07d166b45498b7d8b3f97c131eb": {
"views": []
},
"8e7c7f3284ee45b38d95fe9070d5772f": {
"views": []
},
"98985eefab414365991ed6844898677f": {
"views": []
},
"98df98e5af87474d8b139cb5bcbc9792": {
"views": []
},
"99f11243d387409bbad286dd5ecb1725": {
"views": []
},
"9ab2d641b0be4cf8950be5ba72e5039f": {
"views": []
},
"9b1ffbd1e7404cb4881380a99c7d11bc": {
"views": []
},
"9c07ec6555cb4d0ba8b59007085d5692": {
"views": []
},
"9cc80f47249b4609b98223ce71594a3d": {
"views": []
},
"9d79bfd34d3640a3b7156a370d2aabae": {
"views": []
},
"a015f138cbbe4a0cad4d72184762ed75": {
"views": []
},
"a27d2f1eb3834c38baf1181b0de93176": {
"views": []
},
"a29b90d050f3442a89895fc7615ccfee": {
"views": [
{
}
]
},
"a725622cfc5b43b4ae14c74bc2ad7ad0": {
"views": []
},
"ac2e05d7d7e945bf99862a2d9d1fa685": {
"views": []
},
"b0bb2ca65caa47579a4d3adddd94504b": {
"views": []
},
"b8995c40625d465489e1b7ec8014b678": {
"views": []
},
"ba83da1373fe45d19b3c96a875f2f4fb": {
"views": []
},
"baa0040d35c64604858c529418c22797": {
"views": []
},
"badc9fd7b56346d6b6aea68bfa6d2699": {
"views": [
{
}
]
},
"bdb41c7654e54c83a91452abc59141bd": {
"views": []
},
"c2399056ef4a4aa7aa4e23a0f381d64a": {
"views": [
{
}
]
},
"c73b47b242b4485fb1462abcd92dc7c9": {
"views": []
},
"ce3f28a8aeee4be28362d068426a71f6": {
"views": [
{
}
]
},
"d3067a6bb84544bba5f1abd241a72e55": {
"views": []
},
"db13a2b94de34ce9bea721aaf971c049": {
"views": []
},
"db468d80cb6e43b6b88455670b036618": {
"views": []
},
"e2cb458522b4438ea3f9873b6e411acb": {
"views": []
},
"e77dca31f1d94d4dadd3f95d2cdbf10e": {
"views": []
},
"e7bffb1fed664dea90f749ea79dcc4f1": {
"views": [
{
}
]
},
"e80abb145fce4e888072b969ba8f455a": {
"views": []
},
"e839d0cf348c4c1b832fc1fc3b0bd3c9": {
"views": []
},
"e948c6baadde46f69f105649555b84eb": {
"views": []
},
"eb16e9da25bf4bef91a34b1d0565c774": {
"views": []
},
"ec82b64048834eafa3e53733bb54a713": {
"views": []
},
"edbb3a621c87445e9df4773cc60ec8d2": {
"views": []
},
"ef6c99705936425a975e49b9e18ac267": {
"views": []
},
"f1b494f025dd48d1ae58ae8e3e2ebf46": {
"views": []
},
"f435b108c59c42989bf209a625a3a5b5": {
"views": [
{
}
]
},
"f71ed7e15a314c28973943046c4529d6": {
"views": []
},
"f81f726f001c4fb999851df532ed39f2": {
"views": []
}
},
}
},
"nbformat": 4,