" \n",
" node_colors[node.state] = \"red\"\n",
" iterations += 1\n",
" all_node_colors.append(dict(node_colors))\n",
" \n",
" if problem.goal_test(node.state):\n",
" node_colors[node.state] = \"green\"\n",
" iterations += 1\n",
" all_node_colors.append(dict(node_colors))\n",
" return(iterations, all_node_colors, node)\n",
" \n",
" frontier = PriorityQueue(min, f)\n",
" frontier.append(node)\n",
" \n",
" node_colors[node.state] = \"orange\"\n",
" iterations += 1\n",
" all_node_colors.append(dict(node_colors))\n",
" \n",
" explored = set()\n",
" while frontier:\n",
" node = frontier.pop()\n",
" \n",
" node_colors[node.state] = \"red\"\n",
" iterations += 1\n",
" all_node_colors.append(dict(node_colors))\n",
" \n",
" if problem.goal_test(node.state):\n",
" node_colors[node.state] = \"green\"\n",
" iterations += 1\n",
" all_node_colors.append(dict(node_colors))\n",
" return(iterations, all_node_colors, node)\n",
" \n",
" explored.add(node.state)\n",
" for child in node.expand(problem):\n",
" if child.state not in explored and child not in frontier:\n",
" frontier.append(child)\n",
" node_colors[child.state] = \"orange\"\n",
" iterations += 1\n",
" all_node_colors.append(dict(node_colors))\n",
" elif child in frontier:\n",
" incumbent = frontier[child]\n",
" if f(child) < f(incumbent):\n",
" del frontier[incumbent]\n",
" frontier.append(child)\n",
" node_colors[child.state] = \"orange\"\n",
" iterations += 1\n",
" all_node_colors.append(dict(node_colors))\n",
"\n",
" node_colors[node.state] = \"gray\"\n",
" iterations += 1\n",
" all_node_colors.append(dict(node_colors))\n",
" return None\n",
"\n",
"def uniform_cost_search(problem):\n",
" \"[Figure 3.14]\"\n",
" iterations, all_node_colors, node = best_first_graph_search(problem, lambda node: node.path_cost)\n",
" return(iterations, all_node_colors, node)"
"cell_type": "code",
"execution_count": 23,
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "a667c668001e4e598478ba4a870c6aec"
}
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "135c6bd739de4aab8fc7b2fcb6b90954"
}
},
"metadata": {},
"output_type": "display_data"
}
],
"all_node_colors = []\n",
"romania_problem = GraphProblem('Arad', 'Bucharest', romania_map)\n",
"display_visual(user_input = False, algorithm = uniform_cost_search, problem = romania_problem)"
"cell_type": "markdown",
"metadata": {},
"## A\\* SEARCH\n",
"\n",
"Let's change all the node_colors to starting position and define a different problem statement."
]
},
{
"cell_type": "code",
"execution_count": 24,
},
"outputs": [],
"source": [
"def best_first_graph_search(problem, f):\n",
" \"\"\"Search the nodes with the lowest f scores first.\n",
" You specify the function f(node) that you want to minimize; for example,\n",
" if f is a heuristic estimate to the goal, then we have greedy best\n",
" first search; if f is node.depth then we have breadth-first search.\n",
" There is a subtlety: the line \"f = memoize(f, 'f')\" means that the f\n",
" values will be cached on the nodes as they are computed. So after doing\n",
" a best first search you can examine the f values of the path returned.\"\"\"\n",
" \n",
" # we use these two variables at the time of visualisations\n",
" iterations = 0\n",
" all_node_colors = []\n",
" node_colors = dict(initial_node_colors)\n",
" \n",
" f = memoize(f, 'f')\n",
" node = Node(problem.initial)\n",
" \n",
" node_colors[node.state] = \"red\"\n",
" iterations += 1\n",
" all_node_colors.append(dict(node_colors))\n",
" \n",
" if problem.goal_test(node.state):\n",
" node_colors[node.state] = \"green\"\n",
" iterations += 1\n",
" all_node_colors.append(dict(node_colors))\n",
" return(iterations, all_node_colors, node)\n",
" \n",
" frontier = PriorityQueue(min, f)\n",
" frontier.append(node)\n",
" \n",
" node_colors[node.state] = \"orange\"\n",
" iterations += 1\n",
" all_node_colors.append(dict(node_colors))\n",
" \n",
" explored = set()\n",
" while frontier:\n",
" node = frontier.pop()\n",
" \n",
" node_colors[node.state] = \"red\"\n",
" iterations += 1\n",
" all_node_colors.append(dict(node_colors))\n",
" \n",
" if problem.goal_test(node.state):\n",
" node_colors[node.state] = \"green\"\n",
" iterations += 1\n",
" all_node_colors.append(dict(node_colors))\n",
" return(iterations, all_node_colors, node)\n",
" \n",
" explored.add(node.state)\n",
" for child in node.expand(problem):\n",
" if child.state not in explored and child not in frontier:\n",
" frontier.append(child)\n",
" node_colors[child.state] = \"orange\"\n",
" iterations += 1\n",
" all_node_colors.append(dict(node_colors))\n",
" elif child in frontier:\n",
" incumbent = frontier[child]\n",
" if f(child) < f(incumbent):\n",
" del frontier[incumbent]\n",
" frontier.append(child)\n",
" node_colors[child.state] = \"orange\"\n",
" iterations += 1\n",
" all_node_colors.append(dict(node_colors))\n",
"\n",
" node_colors[node.state] = \"gray\"\n",
" iterations += 1\n",
" all_node_colors.append(dict(node_colors))\n",
" return None\n",
"\n",
"def astar_search(problem, h=None):\n",
" \"\"\"A* search is best-first graph search with f(n) = g(n)+h(n).\n",
" You need to specify the h function when you call astar_search, or\n",
" else in your Problem subclass.\"\"\"\n",
" h = memoize(h or problem.h, 'h')\n",
" iterations, all_node_colors, node = best_first_graph_search(problem, lambda n: n.path_cost + h(n))\n",
" return(iterations, all_node_colors, node)"
]
},
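To see the rule f(n) = g(n) + h(n) in isolation, without the visualisation bookkeeping, here is a minimal standalone sketch. It is not the notebook's `astar_search`: it uses `heapq` instead of `PriorityQueue`, and the function name `astar`, the `graph` dictionary and the `h` table are assumptions made for this illustration. The road distances and straight-line distances are the familiar Romania values, restricted to a small one-directional fragment of the map.

```python
import heapq

# Fragment of the Romania map (one-directional, for brevity); costs are road distances.
graph = {
    "Arad": {"Sibiu": 140, "Timisoara": 118, "Zerind": 75},
    "Sibiu": {"Fagaras": 99, "Rimnicu Vilcea": 80},
    "Fagaras": {"Bucharest": 211},
    "Rimnicu Vilcea": {"Pitesti": 97},
    "Pitesti": {"Bucharest": 101},
    "Timisoara": {},
    "Zerind": {},
    "Bucharest": {},
}

# Straight-line distance to Bucharest, used as the heuristic h(n).
h = {
    "Arad": 366, "Sibiu": 253, "Fagaras": 176, "Rimnicu Vilcea": 193,
    "Pitesti": 100, "Timisoara": 329, "Zerind": 374, "Bucharest": 0,
}


def astar(graph, h, start, goal):
    """Return (cost, path) for a cheapest path found, or None if the goal is unreachable."""
    # Each frontier entry is (f, g, state, path) with f = g + h(state).
    frontier = [(h[start], 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return g, path
        if g > best_g.get(state, float("inf")):
            continue  # a cheaper route to this state was already queued
        for neighbor, step_cost in graph[state].items():
            new_g = g + step_cost
            if new_g < best_g.get(neighbor, float("inf")):
                best_g[neighbor] = new_g
                heapq.heappush(
                    frontier,
                    (new_g + h[neighbor], new_g, neighbor, path + [neighbor]),
                )
    return None


cost, path = astar(graph, h, "Arad", "Bucharest")
print(cost, path)
```

On this fragment the route through Rimnicu Vilcea and Pitesti (cost 418) wins over the route through Fagaras (cost 450), even though Fagaras reaches Bucharest in fewer steps.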
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "3e62c492a82044e4813ad5d84e698874"
}
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "b661fd0c0c8d495db2672aedc25b9a44"
}
},
"metadata": {},
"output_type": "display_data"
}
],
"all_node_colors = []\n",
"romania_problem = GraphProblem('Arad', 'Bucharest', romania_map)\n",
"display_visual(user_input = False, algorithm = astar_search, problem = romania_problem)"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "7f1ffa858c92429bb28f74c23c0c939c"
}
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "7a98e98ffec14520b93ce542f5169bcc"
}
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "094beb8cf34c4a5b87f8368539d24091"
}
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "a8f89c87de964ee69004902763e68a54"
}
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "2ccdb4aba3ee4371a78306755e5642ad"
}
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"all_node_colors = []\n",
"# display_visual(user_input = True, algorithm = breadth_first_tree_search)\n",
"display_visual(user_input = True)"
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## A* Search Heuristics Comparison\n",
"\n",
"Different Heuristics have different efficiency in solving a particular problem via A* search which is generally defined by the node of explored nodes as well as the branching factor. With the help of the Classic 8* Puzzle we can effectively visualize the difference in performance of these heuristics. \n",
"\n",
"### 8-Puzzle Problem\n",
"\n",
"*8-Puzzle Problem* is another problem that is classified as NP hard for which genetic algorithms provide a better solution than any pre-existing ones.\n",
"\n",
"The *8-Puzzle Problem* consists of a *3x3 tray* in which 8 tiles numbered 1-8 are placed and the 9<sup>th</sup> tile is uncovered. The aim of the game is that given a initial placement of the tiles, we have to reach the goal state on the constraint that a tile adjacent to be the blank space can be slid into that space.\n",
"\n",
"*example:*\n",
" Initial State Goal State\n",
"\n",
" | 7 | 2 | 4 | | | 1 | 2 |\n",
" | 5 | | 6 | ----> | 3 | 4 | 5 |\n",
" | 8 | 3 | 1 | | 6 | 7 | 8 |\n",
"\n",
"We have a total of 8+1(blank) tiles giving us total of 9! initial configurations but of all these configurations only 9!/2 can lead to a solution.The solvability can be checked by calculating the *Permutation Inversion* of each tile and then summing it up.\n",
"Inversion is defined as when a tile preceeds another tile with lower number.\n",
"Let's calculate the Permutation Inversion of the example shown above -\n",
" \n",
" Tile 7 -> 6 Inversions (for tile 2, 4, 5, 6, 3, 1)\n",
" Tile 2 -> 1 Inversions\n",
" Tile 4 -> 2 Inversions\n",
" Tile 5 -> 2 Inversions\n",
" Tile 6 -> 2 Inversions\n",
" Tile 8 -> 2 Inversions\n",
" Tile 3 -> 1 Inversions\n",
" Tile 1 -> 0 Inversions\n",
"Total Inversions = 16 Inversions, \n",
"Is total Inversions are even then the initial configuration is solvable else the configuration is impossible to solve.\n",
"\n",
"For example we can have a state \"724506831\" where 0 represents the empty tile.\n",
"\n",
"#### Heuristics:-\n",
"1.) Manhattan Distance:- For the 8 Puzzle problem \"Manhattan distance is defined as the distance of a tile from its \n",
" goal. In the example shown above the manhattan distance for the 'numbered tile 1' is 4\n",
" (2 unit left and 2 unit up).\n",
"\n",
"2.) No. of Misplaced Tiles:- This heuristics calculates the number of misplaced tile in the state from the goal \n",
" state.\n",
"\n",
"3.) Sqrt of Manhattan Distance:- Uses the sqaure root of the Manhattan distance\n",
"\n",
"4.) Max Heuristic :- Score on the basis of max of Manhattan Distance and No. of Misplced tiles."
]
},
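The inversion count described above can be checked with a few lines of code. This is a minimal sketch, separate from the notebook's own `checkSolvability`: the helper names `count_inversions` and `is_solvable` are assumptions, states are written as 9-character strings with `0` for the blank, and the even-parity rule is taken as stated in the text (it applies to a goal whose own inversion count is even, such as the goal in the example).

```python
def count_inversions(state):
    """Count pairs of tiles where a higher-numbered tile precedes a lower one."""
    tiles = [int(ch) for ch in state if ch != "0"]  # the blank is not a tile
    return sum(
        1
        for i in range(len(tiles))
        for j in range(i + 1, len(tiles))
        if tiles[i] > tiles[j]
    )


def is_solvable(state):
    """Even inversion parity means the configuration can reach the goal."""
    return count_inversions(state) % 2 == 0


example = "724506831"
print(count_inversions(example))  # 16, matching the tile-by-tile tally
print(is_solvable(example))       # True
```

Swapping two adjacent tiles in a row, as in `"123456870"`, changes the count by exactly one and therefore flips the parity.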
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# define heuristics\n",
"def linear(state,goal):\n",
" return sum([1 if state[i] != goal[i] else 0 for i in range(8)])\n",
"\n",
"def manhanttan(state,goal):\n",
" index_goal = {0:[2,2], 1:[0,0], 2:[0,1], 3:[0,2], 4:[1,0], 5:[1,1], 6:[1,2], 7:[2,0], 8:[2,1]}\n",
" index_state = {}\n",
" index = [[0,0], [0,1], [0,2], [1,0], [1,1], [1,2], [2,0], [2,1], [2,2]]\n",
" x=0\n",
" y=0\n",
" for i in range(len(state)):\n",
" index_state[state[i]] = index[i]\n",
" mhd = 0\n",
" for i in range(8):\n",
" for j in range(2):\n",
" mhd = abs(index_goal[i][j] - index_state[i][j]) + mhd\n",
" return mhd\n",
"\n",
"def sqrt_manhanttan(state,goal):\n",
" index_goal = {0:[2,2], 1:[0,0], 2:[0,1], 3:[0,2], 4:[1,0], 5:[1,1], 6:[1,2], 7:[2,0], 8:[2,1]}\n",
" index_state = {}\n",
" index = [[0,0], [0,1], [0,2], [1,0], [1,1], [1,2], [2,0], [2,1], [2,2]]\n",
" x=0\n",
" y=0\n",
" for i in range(len(state)):\n",
" index_state[state[i]] = index[i]\n",
" mhd = 0\n",
" for i in range(8):\n",
" for j in range(2):\n",
" mhd = (index_goal[i][j] - index_state[i][j])**2 + mhd\n",
" return math.sqrt(mhd)\n",
"\n",
"def max_heuristic(state,goal):\n",
" score1 = manhanttan(state, goal)\n",
" score2 = linear(state, goal)\n",
" return max(score1, score2)"
]
},
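As a worked check of the first two heuristics, the sketch below evaluates them on the example state `724506831` against the example goal with the blank in the top-left corner. The helper names `misplaced_tiles` and `manhattan_distance` and the string encoding are assumptions for this illustration; unlike the functions in the cell above, these helpers read tile positions from whatever goal is passed in.

```python
def misplaced_tiles(state, goal):
    """Number of tiles (blank excluded) that are not on their goal square."""
    return sum(1 for s, g in zip(state, goal) if s != "0" and s != g)


def manhattan_distance(state, goal):
    """Sum over tiles of row offset plus column offset to the goal square."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == "0":
            continue  # the blank does not contribute
        goal_idx = goal.index(tile)
        total += abs(idx // 3 - goal_idx // 3) + abs(idx % 3 - goal_idx % 3)
    return total


state = "724506831"  # initial state from the example, read row by row
goal = "012345678"   # goal state from the example, blank in the top-left corner

print(misplaced_tiles(state, goal))     # 8
print(manhattan_distance(state, goal))  # 18
```

The Manhattan value is never smaller than the misplaced-tile count, because every misplaced tile is at least one move away from its goal square. For the same reason, taking the maximum of the two returns the Manhattan value.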
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# Algorithm for 8 Puzzle problem\n",
"\n",
"def checkSolvability(state):\n",
" inversion = 0\n",
" for i in range(len(state)):\n",
" for j in range(i,len(state)):\n",
" if (state[i]>state[j] and state[j]!=0):\n",
" inversion += 1\n",
" check = True\n",
" if inversion%2 != 0:\n",
" check = False\n",
" print(check)\n",
" return check\n",
"\n",
"def getPossibleMoves(state,heuristic,goal,moves):\n",
" move = {0:[1,3], 1:[0,2,4], 2:[1,5], 3:[0,6,4], 4:[1,3,5,7], 5:[2,4,8], 6:[3,7], 7:[6,8], 8:[7,5]} # create a dictionary of moves\n",
" index = state[0].index(0)\n",
" possible_moves = []\n",
" for i in range(len(move[index])):\n",
" conf = list(state[0][:])\n",
" a = conf[index]\n",
" b = conf[move[index][i]]\n",
" conf[move[index][i]] = a\n",
" conf[index] = b\n",
" possible_moves.append(conf)\n",
" scores = []\n",
" for i in possible_moves:\n",
" scores.append(heuristic(i,goal))\n",
" scores = [x+moves for x in scores]\n",
" allowed_state = []\n",
" for i in range(len(possible_moves)):\n",
" node = []\n",
" node.append(possible_moves[i])\n",
" node.append(scores[i])\n",
" node.append(state[0])\n",
" allowed_state.append(node) \n",
" return allowed_state\n",
"\n",
"path = []\n",
"final = []\n",
"def create_path(goal,initial):\n",
" node = goal[0]\n",
" final.append(goal[0])\n",
" if goal[2] == initial:\n",
" return reversed(final)\n",
" else:\n",
" parent = goal[2]\n",
" for i in path:\n",
" if i[0] == parent:\n",
" parent = i\n",
" create_path(parent,initial)\t\n",
"\n",
"def show_path(initial):\n",
" move = []\n",
" for i in range(0,len(path)):\n",
" move.append(''.join(str(x) for x in path[i][0]))\n",
" print(\"Number of explored nodes by the following heuristic are: \", len(set(move)))\t\n",
" print(initial)\n",
" for i in reversed(final):\n",
" print(i)\n",
" return\n",
"\n",
"def solve(initial,goal,heuristic):\n",
" root = [initial,heuristic(initial,goal),'']\n",
" nodes = [] # nodes is a priority Queue based on the state score \n",
" nodes.append(root)\n",
" moves = 0\n",
" while len(nodes) != 0:\n",
" node = nodes[0]\n",
" del nodes[0]\n",
" path.append(node)\n",
" if node[0] == goal:\n",
" soln = create_path(path[-1],initial )\n",
" show_path(initial)\n",
" return \n",
" moves +=1\n",
" opened_nodes = getPossibleMoves(node,heuristic,goal,moves)\n",
" nodes = sorted(opened_nodes+nodes, key=itemgetter(1))\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Heuristics is max_heuristic\n",
"True\n",
"Number of explored nodes by the following heuristic are: 126\n",
"[2, 4, 3, 1, 5, 6, 7, 8, 0]\n",
"[2, 4, 3, 1, 5, 0, 7, 8, 6]\n",
"[2, 4, 3, 1, 0, 5, 7, 8, 6]\n",
"[2, 0, 3, 1, 4, 5, 7, 8, 6]\n",
"[0, 2, 3, 1, 4, 5, 7, 8, 6]\n",
"[1, 2, 3, 0, 4, 5, 7, 8, 6]\n",
"[1, 2, 3, 4, 0, 5, 7, 8, 6]\n",
"[1, 2, 3, 4, 5, 0, 7, 8, 6]\n",
"[1, 2, 3, 4, 5, 6, 7, 8, 0]\n"
]
}
],
"source": [
"goal_state = [1,2,3,4,5,6,7,8,0] # define the goal state\n",
"initial_state = [2,4,3,1,5,6,7,8,0] # define the initial state\n",
"print(\"Heuristics is max_heuristic\")\n",
"checkSolvability(initial_state)\n",
"solve(initial_state,goal_state,max_heuristic) # to check the different heuristics change the function name in solve"
]
},
{
"cell_type": "markdown",
"\n",
"Genetic algorithms (or GA) are inspired by natural evolution and are particularly useful in optimization and search problems with large state spaces.\n",
"\n",
"Given a problem, algorithms in the domain make use of a *population* of solutions (also called *states*), where each solution/state represents a feasible solution. At each iteration (often called *generation*), the population gets updated using methods inspired by biology and evolution, like *crossover*, *mutation* and *natural selection*."
]
},
{
"cell_type": "markdown",
"source": [
"### Overview\n",
"\n",
"A genetic algorithm works in the following way:\n",
"\n",
"1) Initialize random population.\n",
"\n",
"2) Calculate population fitness.\n",
"\n",
"3) Select individuals for mating.\n",
"\n",
"4) Mate selected individuals to produce new population.\n",
"\n",
" * Random chance to mutate individuals.\n",
"\n",
"5) Repeat from step 2) until an individual is fit enough or the maximum number of iterations was reached."
]
},
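The five steps above can be sketched as one short loop. This is an illustrative sketch only, not the `genetic_algorithm` function of the `search` module: the name `run_ga`, its defaults and the `rng` argument are assumptions, and fitness values are assumed to be non-negative so they can be used as selection weights.

```python
import random


def run_ga(fitness_fn, gene_pool, state_length, pop_size=20, ngen=50,
           pmut=0.1, f_thres=None, rng=None):
    """Minimal genetic algorithm loop following the five steps listed above."""
    rng = rng or random.Random()
    # 1) Initialize a random population.
    population = [[rng.choice(gene_pool) for _ in range(state_length)]
                  for _ in range(pop_size)]
    for _ in range(ngen):
        # 2) Calculate population fitness.
        weights = [fitness_fn(ind) for ind in population]
        if all(w == 0 for w in weights):
            weights = [1] * len(population)  # fall back to uniform selection
        new_population = []
        for _ in range(pop_size):
            # 3) Select two parents, with chance proportional to fitness.
            x, y = rng.choices(population, weights=weights, k=2)
            # 4) Mate them with a single-point crossover, then maybe mutate.
            point = rng.randrange(state_length + 1)
            child = x[:point] + y[point:]
            if rng.random() < pmut:
                child[rng.randrange(state_length)] = rng.choice(gene_pool)
            new_population.append(child)
        population = new_population
        # 5) Stop early once an individual is fit enough.
        best = max(population, key=fitness_fn)
        if f_thres is not None and fitness_fn(best) >= f_thres:
            return best
    return max(population, key=fitness_fn)


# Toy usage: maximize the number of 1s in an 8-gene individual.
best = run_ga(sum, [0, 1], 8, f_thres=8, rng=random.Random(0))
print(best, sum(best))
```

The result varies with the random seed, so the printed individual is not guaranteed to be the all-ones state.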
{
"cell_type": "markdown",
"### Glossary\n",
"\n",
"Before we continue, we will lay the basic terminology of the algorithm.\n",
"\n",
"* Individual/State: A list of elements (called *genes*) that represent possible solutions.\n",
"* Population: The list of all the individuals/states.\n",
"\n",
"* Gene pool: The alphabet of possible values for an individual's genes.\n",
"\n",
"* Generation/Iteration: The number of times the population will be updated.\n",
"\n",
"* Fitness: An individual's score, calculated by a function specific to the problem."
]
},
{
"cell_type": "markdown",
"### Crossover\n",
"\n",
"Two individuals/states can \"mate\" and produce one child. This offspring bears characteristics from both of its parents. There are many ways we can implement this crossover. Here we will take a look at the most common ones. Most other methods are variations of those below.\n",
"\n",
"* Point Crossover: The crossover occurs around one (or more) point. The parents get \"split\" at the chosen point or points and then get merged. In the example below we see two parents get split and merged at the 3rd digit, producing the following offspring after the crossover.\n",
"\n",
"\n",
"\n",
"* Uniform Crossover: This type of crossover chooses randomly the genes to get merged. Here the genes 1, 2 and 5 were chosen from the first parent, so the genes 3, 4 were added by the second parent.\n",
"\n",
""
]
},
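Both crossover variants fit in a few lines. In this sketch the names `point_crossover` and `uniform_crossover`, and the `take_from_x` mask argument, are assumptions for illustration; the mask is passed explicitly so the example is reproducible, and it is drawn at random when omitted.

```python
import random


def point_crossover(x, y, point):
    """Take the genes of x before `point` and the genes of y from `point` on."""
    return x[:point] + y[point:]


def uniform_crossover(x, y, take_from_x=None, rng=random):
    """Pick each gene from x or y independently, according to a boolean mask."""
    if take_from_x is None:
        take_from_x = [rng.random() < 0.5 for _ in x]
    return [a if keep else b for a, b, keep in zip(x, y, take_from_x)]


parent1 = list("abcde")
parent2 = list("VWXYZ")

# Split after the 3rd gene.
print(point_crossover(parent1, parent2, 3))
# Genes 1, 2 and 5 from the first parent, genes 3 and 4 from the second.
print(uniform_crossover(parent1, parent2, [True, True, False, False, True]))
```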
{
"cell_type": "markdown",
"### Mutation\n",
"\n",
"When an offspring is produced, there is a chance it will mutate, having one (or more, depending on the implementation) of its genes altered.\n",
"\n",
"For example, let's say the new individual to undergo mutation is \"abcde\". Randomly we pick to change its third gene to 'z'. The individual now becomes \"abzde\" and is added to the population."
]
},
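A minimal sketch of single-gene mutation follows. The name `mutate_sketch` and its signature are assumptions made here, chosen to avoid confusion with the `mutate` function of the `search` module; the individual is a list of genes and is copied rather than modified in place.

```python
import random


def mutate_sketch(individual, gene_pool, pmut, rng=random):
    """With probability pmut, replace one randomly chosen gene."""
    if rng.random() >= pmut:
        return individual  # no mutation this time
    position = rng.randrange(len(individual))
    new_gene = rng.choice(gene_pool)
    return individual[:position] + [new_gene] + individual[position + 1:]


original = list("abcde")
mutated = mutate_sketch(original, list("xyz"), pmut=1.0)
print("".join(mutated))  # one gene replaced by 'x', 'y' or 'z'
```

Because the replacement gene is drawn from the whole gene pool, a mutation can pick the value the gene already had when the pool contains it.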
{
"cell_type": "markdown",
"At each iteration, the fittest individuals are picked randomly to mate and produce offsprings. We measure an individual's fitness with a *fitness function*. That function depends on the given problem and it is used to score an individual. Usually the higher the better.\n",
"The selection process is this:\n",
"1) Individuals are scored by the fitness function.\n",
"\n",
"2) Individuals are picked randomly, according to their score (higher score means higher chance to get picked). Usually the formula to calculate the chance to pick an individual is the following (for population *P* and individual *i*):\n",
"\n",
"$$ chance(i) = \\dfrac{fitness(i)}{\\sum_{k \\, in \\, P}{fitness(k)}} $$"
]
},
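The selection formula can be made concrete with a small sketch. The helper names `selection_chances` and `select_parents` are assumptions; the sketch relies on `random.choices`, which accepts relative weights, and assumes that fitness values are non-negative and not all zero.

```python
import random


def selection_chances(population, fitness_fn):
    """chance(i) = fitness(i) / sum of fitness(k) over the population."""
    scores = [fitness_fn(individual) for individual in population]
    total = sum(scores)
    return [score / total for score in scores]


def select_parents(population, fitness_fn, k=2, rng=random):
    """Draw k individuals, each with probability proportional to its fitness."""
    weights = [fitness_fn(individual) for individual in population]
    return rng.choices(population, weights=weights, k=k)


population = [[1, 1, 1], [1, 0, 0], [0, 0, 0]]
print(selection_chances(population, sum))  # [0.75, 0.25, 0.0]
print(select_parents(population, sum))
```

An individual with fitness 0 is never picked under this scheme, which is one reason some implementations add a small constant to every score.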
{
"cell_type": "markdown",
"### Implementation\n",
"\n",
"Below we look over the implementation of the algorithm in the `search` module.\n",
"\n",
"First the implementation of the main core of the algorithm:"
]
},
{
"cell_type": "code",
"outputs": [],
"source": [
"%psource genetic_algorithm"
]
},
{
"cell_type": "markdown",
"source": [
"The algorithm takes the following input:\n",
"\n",
"* `population`: The initial population.\n",
"\n",
"* `fitness_fn`: The problem's fitness function.\n",
"\n",
"* `gene_pool`: The gene pool of the states/individuals. By default 0 and 1.\n",
"* `f_thres`: The fitness threshold. If an individual reaches that score, iteration stops. By default 'None', which means the algorithm will not halt until the generations are ran.\n",
"\n",
"* `ngen`: The number of iterations/generations.\n",
"\n",
"* `pmut`: The probability of mutation.\n",
"\n",
"The algorithm gives as output the state with the largest score."
]
},
{
"cell_type": "markdown",
"For each generation, the algorithm updates the population. First it calculates the fitnesses of the individuals, then it selects the most fit ones and finally crosses them over to produce offsprings. There is a chance that the offspring will be mutated, given by `pmut`. If at the end of the generation an individual meets the fitness threshold, the algorithm halts and returns that individual.\n",
"\n",
"The function of mating is accomplished by the method `reproduce`:"
]
},
{
"cell_type": "code",
"outputs": [],
"source": [
]
},
{
"cell_type": "markdown",
"source": [
"The method picks at random a point and merges the parents (`x` and `y`) around it.\n",
"The mutation is done in the method `mutate`:"
]
},
{
"cell_type": "code",
"outputs": [],
"source": [
]
},
{
"cell_type": "markdown",
"We pick a gene in `x` to mutate and a gene from the gene pool to replace it with.\n",
"\n",
"To help initializing the population we have the helper function `init_population`\":"
]
},
{
"cell_type": "code",
"outputs": [],
"source": [
]
},
{
"cell_type": "markdown",
"source": [
"The function takes as input the number of individuals in the population, the gene pool and the length of each individual/state. It creates individuals with random genes and returns the population when done."
]
},
{
"cell_type": "markdown",
"source": [
"### Usage\n",
"Below we give two example usages for the genetic algorithm, for a graph coloring problem and the 8 queens problem.\n",
"First we will take on the simpler problem of coloring a small graph with two colors. Before we do anything, let's imagine how a solution might look. First, we have to represent our colors. Say, 'R' for red and 'G' for green. These make up our gene pool. What of the individual solutions though? For that, we will look at our problem. We stated we have a graph. A graph has nodes and edges, and we want to color the nodes. Naturally, we want to store each node's color. If we have four nodes, we can store their colors in a list of genes, one for each node. A possible solution will then look like this: ['R', 'R', 'G', 'R']. In the general case, we will represent each solution with a list of chars ('R' and 'G'), with length the number of nodes.\n",
"Next we need to come up with a fitness function that appropriately scores individuals. Again, we will look at the problem definition at hand. We want to color a graph. For a solution to be optimal, no edge should connect two nodes of the same color. How can we use this information to score a solution? A naive (and ineffective) approach would be to count the different colors in the string. So ['R', 'R', 'R', 'R'] has a score of 1 and ['R', 'R', 'G', 'G'] has a score of 2. Why that fitness function is not ideal though? Why, we forgot the information about the edges! The edges are pivotal to the problem and the above function only deals with node colors. We didn't use all the information at hand and ended up with an ineffective answer. How, then, can we use that information to our advantage?\n",
"We said that the optimal solution will have all the edges connecting nodes of different color. So, to score a solution we can count how many edges are valid (aka connecting nodes of different color). That is a great fitness function!\n",
"Let's jump into solving this problem using the `genetic_algorithm` function."
]
},
{
"cell_type": "markdown",
"source": [
"First we need to represent the graph. Since we mostly need information about edges, we will just store the edges. We will denote edges with capital letters and nodes with integers:"
]
},
{
"cell_type": "code",
"outputs": [],
"source": [
"edges = {\n",
" 'A': [0, 1],\n",
" 'B': [0, 3],\n",
" 'C': [1, 2],\n",
" 'D': [2, 3]\n",
"}"
]
},
{
"cell_type": "markdown",
"Edge 'A' connects nodes 0 and 1, edge 'B' connects nodes 0 and 3 etc.\n",
"\n",
"We already said our gene pool is 'R' and 'G', so we can jump right into initializing our population. Since we have only four nodes, `state_length` should be 4. For the number of individuals, we will try 8. We can increase this number if we need higher accuracy, but be careful! Larger populations need more computating power and take longer. You need to strike that sweet balance between accuracy and cost (the ultimate dilemma of the programmer!)."
]
},
{
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[['R', 'G', 'G', 'R'], ['G', 'R', 'G', 'G'], ['G', 'G', 'G', 'G'], ['R', 'G', 'G', 'G'], ['R', 'G', 'G', 'R'], ['G', 'R', 'G', 'R'], ['G', 'G', 'G', 'R'], ['G', 'R', 'G', 'R']]\n"
"population = init_population(8, ['R', 'G'], 4)\n",
]
},
{
"cell_type": "markdown",
"We created and printed the population. You can see that the genes in the individuals are random and there are 8 individuals each with 4 genes.\n",
"Next we need to write our fitness function. We previously said we want the function to count how many edges are valid. So, given a coloring/individual `c`, we will do just that:"
]
},
{
"cell_type": "code",
"outputs": [],
"source": [
"def fitness(c):\n",
" return sum(c[n1] != c[n2] for (n1, n2) in edges.values())"
]
},
{
"cell_type": "markdown",
"Great! Now we will run the genetic algorithm and see what solution it gives."
]
},
{
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"solution = genetic_algorithm(population, fitness, gene_pool=['R', 'G'])\n",
"print(solution)"
]
},
{
"cell_type": "markdown",
"source": [
"The algorithm converged to a solution. Let's check its score:"
]
},
{
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"4\n"
]
}
],
"source": [
"print(fitness(solution))"
]
},
{
"cell_type": "markdown",
"source": [
"The solution has a score of 4. Which means it is optimal, since we have exactly 4 edges in our graph, meaning all are valid!\n",
"*NOTE: Because the algorithm is non-deterministic, there is a chance a different solution is given. It might even be wrong, if we are very unlucky!*"
]
},
{
"cell_type": "markdown",
"#### Eight Queens\n",
"\n",
"Let's take a look at a more complicated problem.\n",
"\n",
"In the *Eight Queens* problem, we are tasked with placing eight queens on an 8x8 chessboard without any queen threatening the others (aka queens should not be in the same row, column or diagonal). In its general form the problem is defined as placing *N* queens in an NxN chessboard without any conflicts.\n",
"\n",
"First we need to think about the representation of each solution. We can go the naive route of representing the whole chessboard with the queens' placements on it. That is definitely one way to go about it, but for the purpose of this tutorial we will do something different. We have eight queens, so we will have a gene for each of them. The gene pool will be numbers from 0 to 7, for the different columns. The *position* of the gene in the state will denote the row the particular queen is placed in.\n",
"\n",
"For example, we can have the state \"03304577\". Here the first gene with a value of 0 means \"the queen at row 0 is placed at column 0\", for the second gene \"the queen at row 1 is placed at column 3\" and so forth.\n",
"\n",
"We now need to think about the fitness function. On the graph coloring problem we counted the valid edges. The same thought process can be applied here. Instead of edges though, we have positioning between queens. If two queens are not threatening each other, we say they are at a \"non-attacking\" positioning. We can, therefore, count how many such positionings are there.\n",
"\n",
"Let's dive right in and initialize our population:"
]
},
{
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[6, 7, 3, 6, 3, 0, 1, 4], [7, 1, 4, 1, 5, 2, 0, 0], [1, 4, 7, 0, 0, 2, 5, 2], [2, 0, 3, 7, 5, 7, 0, 0], [6, 3, 1, 7, 5, 6, 3, 0]]\n"
]
}
],
"source": [
"population = init_population(100, range(8), 8)\n",
]
},
{
"cell_type": "markdown",
"We have a population of 100 and each individual has 8 genes. The gene pool is the integers from 0 to 7, in string form. Above you can see the first five individuals.\n",
"\n",
"Next we need to write our fitness function. Remember, queens threaten each other if they are at the same row, column or diagonal.\n",
"Since positionings are mutual, we must take care not to count them twice. Therefore for each queen, we will only check for conflicts for the queens after her.\n",
"\n",
"A gene's value in an individual `q` denotes the queen's column, and the position of the gene denotes its row. We can check if the aforementioned values between two genes are the same. We also need to check for diagonals. A queen *a* is in the diagonal of another queen, *b*, if the difference of the rows between them is equal to either their difference in columns (for the diagonal on the right of *a*) or equal to the negative difference of their columns (for the left diagonal of *a*). Below is given the fitness function."
]
},
{
"cell_type": "code",
"outputs": [],
"source": [
"def fitness(q):\n",
" non_attacking = 0\n",
" for row1 in range(len(q)):\n",
" for row2 in range(row1+1, len(q)):\n",
" col1 = int(q[row1])\n",
" col2 = int(q[row2])\n",
" row_diff = row1 - row2\n",
" col_diff = col1 - col2\n",
" if col1 != col2 and row_diff != col_diff and row_diff != -col_diff:\n",
" non_attacking += 1\n",
]
},
{
"cell_type": "markdown",
"Note that the best score achievable is 28. That is because for each queen we only check for the queens after her. For the first queen we check 7 other queens, for the second queen 6 others and so on. In short, the number of checks we make is the sum 7+6+5+...+1. Which is equal to 7\\*(7+1)/2 = 28.\n",
"\n",
"Because it is very hard and will take long to find a perfect solution, we will set the fitness threshold at 25. If we find an individual with a score greater or equal to that, we will halt. Let's see how the genetic algorithm will fare."
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"solution = genetic_algorithm(population, fitness, f_thres=28, gene_pool=range(8))\n",
"print(solution)\n",
"print(fitness(solution))"
]
},
{
"cell_type": "markdown",
"Above you can see the solution and its fitness score, which should be no less than 25."
]
},
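To confirm that 28 really is the ceiling, the pair count can be evaluated on a known conflict-free placement. This sketch restates the fitness idea with `itertools.combinations`; the function name `non_attacking_pairs` is an assumption, and the sample board `[0, 4, 7, 5, 2, 6, 1, 3]` is one valid 8-queens solution, with the list index as the row and the value as the column.

```python
from itertools import combinations


def non_attacking_pairs(q):
    """Count pairs of queens that share neither a column nor a diagonal."""
    count = 0
    for (row1, col1), (row2, col2) in combinations(enumerate(q), 2):
        same_column = col1 == col2
        same_diagonal = abs(row1 - row2) == abs(col1 - col2)
        if not same_column and not same_diagonal:
            count += 1
    return count


print(non_attacking_pairs([0, 4, 7, 5, 2, 6, 1, 3]))  # 28: every pair is safe
print(non_attacking_pairs([0] * 8))                   # 0: all queens share a column
```

With 8 queens there are 8 choose 2 = 28 pairs in total, which is the same number as the sum 7+6+...+1.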
{
"cell_type": "markdown",
"source": [
"With that this tutorial on the genetic algorithm comes to an end. Hope you found this guide helpful!"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.4rc1"
"state": {
"013d8df0a2ab4899b09f83aa70ce5d50": {
"views": []
},
"01ee7dc2239c4b0095710436453b362d": {
"views": []
},
"04d594ae6a704fc4b16895e6a7b85270": {
"views": []
},
"052ea3e7259346a4b022ec4fef1fda28": {
"views": [
{
}
]
},
"0ade4328785545c2b66d77e599a3e9da": {
"views": [
{
}
]
},
"0b94d8de6b4e47f89b0382b60b775cbd": {
"views": []
},