search.ipynb

   "outputs": [],
   "source": [
    "population = init_population(8, ['R', 'G'], 4)\n",
    "print(population)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We created and printed the population. You can see that the genes in the individuals are random and there are 8 individuals each with 4 genes.\n",
    "\n",
    "Next we need to write our fitness function. We previously said we want the function to count how many edges are valid. So, given a coloring/individual `c`, we will do just that:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def fitness(c):\n",
    "    return sum(c[n1] != c[n2] for (n1, n2) in edges.values())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Great! Now we will run the genetic algorithm and see what solution it gives."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "solution = genetic_algorithm(population, fitness, gene_pool=['R', 'G'])\n",
    "print(solution)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The algorithm converged to a solution. Let's check its score:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "print(fitness(solution))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The solution has a score of 4. Which means it is optimal, since we have exactly 4 edges in our graph, meaning all are valid!\n",
    "\n",
    "*NOTE: Because the algorithm is non-deterministic, there is a chance a different solution is given. It might even be wrong, if we are very unlucky!*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Eight Queens\n",
    "\n",
    "Let's take a look at a more complicated problem.\n",
    "\n",
    "In the *Eight Queens* problem, we are tasked with placing eight queens on an 8x8 chessboard without any queen threatening the others (aka queens should not be in the same row, column or diagonal). In its general form the problem is defined as placing *N* queens in an NxN chessboard without any conflicts.\n",
    "\n",
    "First we need to think about the representation of each solution. We can go the naive route of representing the whole chessboard with the queens' placements on it. That is definitely one way to go about it, but for the purpose of this tutorial we will do something different. We have eight queens, so we will have a gene for each of them. The gene pool will be numbers from 0 to 7, for the different columns. The *position* of the gene in the state will denote the row the particular queen is placed in.\n",
    "\n",
    "For example, we can have the state \"03304577\". Here the first gene with a value of 0 means \"the queen at row 0 is placed at column 0\", for the second gene \"the queen at row 1 is placed at column 3\" and so forth.\n",
    "\n",
    "We now need to think about the fitness function. On the graph coloring problem we counted the valid edges. The same thought process can be applied here. Instead of edges though, we have positioning between queens. If two queens are not threatening each other, we say they are at a \"non-attacking\" positioning. We can, therefore, count how many such positionings are there.\n",
    "\n",
    "Let's dive right in and initialize our population:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "population = init_population(100, range(8), 8)\n",
    "print(population[:5])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have a population of 100 and each individual has 8 genes. The gene pool is the integers from 0 to 7, in string form. Above you can see the first five individuals.\n",
    "\n",
    "Next we need to write our fitness function. Remember, queens threaten each other if they are at the same row, column or diagonal.\n",
    "\n",
    "Since positionings are mutual, we must take care not to count them twice. Therefore for each queen, we will only check for conflicts for the queens after her.\n",
    "\n",
    "A gene's value in an individual `q` denotes the queen's column, and the position of the gene denotes its row. We can check if the aforementioned values between two genes are the same. We also need to check for diagonals. A queen *a* is in the diagonal of another queen, *b*, if the difference of the rows between them is equal to either their difference in columns (for the diagonal on the right of *a*) or equal to the negative difference of their columns (for the left diagonal of *a*). Below is given the fitness function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def fitness(q):\n",
    "    non_attacking = 0\n",
    "    for row1 in range(len(q)):\n",
    "        for row2 in range(row1+1, len(q)):\n",
    "            col1 = int(q[row1])\n",
    "            col2 = int(q[row2])\n",
    "            row_diff = row1 - row2\n",
    "            col_diff = col1 - col2\n",
    "\n",
    "            if col1 != col2 and row_diff != col_diff and row_diff != -col_diff:\n",
    "                non_attacking += 1\n",
    "\n",
    "    return non_attacking"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that the best score achievable is 28. That is because for each queen we only check for the queens after her. For the first queen we check 7 other queens, for the second queen 6 others and so on. In short, the number of checks we make is the sum 7+6+5+...+1. Which is equal to 7\\*(7+1)/2 = 28.\n",
    "\n",
    "Because it is very hard and will take long to find a perfect solution, we will set the fitness threshold at 25. If we find an individual with a score greater or equal to that, we will halt. Let's see how the genetic algorithm will fare."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "solution = genetic_algorithm(population, fitness, f_thres=25, gene_pool=range(8))\n",
    "print(solution)\n",
    "print(fitness(solution))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Above you can see the solution and its fitness score, which should be no less than 25."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With that this tutorial on the genetic algorithm comes to an end. Hope you found this guide helpful!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}