vacuum_world.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# THE VACUUM WORLD   \n",
    "\n",
    "In this notebook, we will be discussing **the structure of agents** through an example of the **vacuum agent**. The job of AI is to design an **agent program** that implements the agent function: the mapping from percepts to actions. We assume this program will run on some sort of computing device with physical sensors and actuators: we call this the **architecture**:\n",
    "\n",
    "<h3 align=\"center\">agent = architecture + program</h3>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before moving on, please review [<b>agents.ipynb</b>](https://github.com/aimacode/aima-python/blob/master/agents.ipynb)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## CONTENTS\n",
    "\n",
    "* Agent\n",
    "* Random Agent Program\n",
    "* Table-Driven Agent Program\n",
    "* Simple Reflex Agent Program\n",
    "* Model-Based Reflex Agent Program\n",
    "* Goal-Based Agent Program\n",
    "* Utility-Based Agent Program\n",
    "* Learning Agent"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## AGENT PROGRAMS\n",
    "\n",
    "An agent program takes the current percept as input from the sensors and returns an action to the actuators. There is a difference between an agent program and an agent function: an agent program takes the current percept as input whereas an agent function takes the entire percept history.\n",
    "\n",
    "The agent program takes just the current percept as input because nothing more is available from the environment; if the agent's actions depend on the entire percept sequence, the agent will have to remember the percept.\n",
    "\n",
    "We'll discuss the following agent programs here with the help of the vacuum world example:\n",
    "\n",
    "* Random Agent Program\n",
    "* Table-Driven Agent Program\n",
    "* Simple Reflex Agent Program\n",
    "* Model-Based Reflex Agent Program\n",
    "* Goal-Based Agent Program\n",
    "* Utility-Based Agent Program"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Random Agent Program\n",
    "\n",
    "A random agent program, as the name suggests, chooses an action at random, without taking into account the percepts.   \n",
    "Here, we will demonstrate a random vacuum agent for a trivial vacuum environment, that is, the two-state environment."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's begin by importing all the functions from the agents module:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [],
   "source": [
    "from agents import *\n",
    "from notebook import psource"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us first see how we define the TrivialVacuumEnvironment. Run the next cell to see how abstract class TrivialVacuumEnvironment is defined in agents module:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01//EN\"\n",
       "   \"http://www.w3.org/TR/html4/strict.dtd\">\n",
       "\n",
       "<html>\n",
       "<head>\n",
       "  <title></title>\n",
       "  <meta http-equiv=\"content-type\" content=\"text/html; charset=None\">\n",
       "  <style type=\"text/css\">\n",
       "td.linenos { background-color: #f0f0f0; padding-right: 10px; }\n",
       "span.lineno { background-color: #f0f0f0; padding: 0 5px 0 5px; }\n",
       "pre { line-height: 125%; }\n",
       "body .hll { background-color: #ffffcc }\n",
       "body  { background: #f8f8f8; }\n",
       "body .c { color: #408080; font-style: italic } /* Comment */\n",
       "body .err { border: 1px solid #FF0000 } /* Error */\n",
       "body .k { color: #008000; font-weight: bold } /* Keyword */\n",
       "body .o { color: #666666 } /* Operator */\n",
       "body .ch { color: #408080; font-style: italic } /* Comment.Hashbang */\n",
       "body .cm { color: #408080; font-style: italic } /* Comment.Multiline */\n",
       "body .cp { color: #BC7A00 } /* Comment.Preproc */\n",
       "body .cpf { color: #408080; font-style: italic } /* Comment.PreprocFile */\n",
       "body .c1 { color: #408080; font-style: italic } /* Comment.Single */\n",
       "body .cs { color: #408080; font-style: italic } /* Comment.Special */\n",
       "body .gd { color: #A00000 } /* Generic.Deleted */\n",
       "body .ge { font-style: italic } /* Generic.Emph */\n",
       "body .gr { color: #FF0000 } /* Generic.Error */\n",
       "body .gh { color: #000080; font-weight: bold } /* Generic.Heading */\n",
       "body .gi { color: #00A000 } /* Generic.Inserted */\n",
       "body .go { color: #888888 } /* Generic.Output */\n",
       "body .gp { color: #000080; font-weight: bold } /* Generic.Prompt */\n",
       "body .gs { font-weight: bold } /* Generic.Strong */\n",
       "body .gu { color: #800080; font-weight: bold } /* Generic.Subheading */\n",
       "body .gt { color: #0044DD } /* Generic.Traceback */\n",
       "body .kc { color: #008000; font-weight: bold } /* Keyword.Constant */\n",
       "body .kd { color: #008000; font-weight: bold } /* Keyword.Declaration */\n",
       "body .kn { color: #008000; font-weight: bold } /* Keyword.Namespace */\n",
       "body .kp { color: #008000 } /* Keyword.Pseudo */\n",
       "body .kr { color: #008000; font-weight: bold } /* Keyword.Reserved */\n",
       "body .kt { color: #B00040 } /* Keyword.Type */\n",
       "body .m { color: #666666 } /* Literal.Number */\n",
       "body .s { color: #BA2121 } /* Literal.String */\n",
       "body .na { color: #7D9029 } /* Name.Attribute */\n",
       "body .nb { color: #008000 } /* Name.Builtin */\n",
       "body .nc { color: #0000FF; font-weight: bold } /* Name.Class */\n",
       "body .no { color: #880000 } /* Name.Constant */\n",
       "body .nd { color: #AA22FF } /* Name.Decorator */\n",
       "body .ni { color: #999999; font-weight: bold } /* Name.Entity */\n",
       "body .ne { color: #D2413A; font-weight: bold } /* Name.Exception */\n",
       "body .nf { color: #0000FF } /* Name.Function */\n",
       "body .nl { color: #A0A000 } /* Name.Label */\n",
       "body .nn { color: #0000FF; font-weight: bold } /* Name.Namespace */\n",
       "body .nt { color: #008000; font-weight: bold } /* Name.Tag */\n",
       "body .nv { color: #19177C } /* Name.Variable */\n",
       "body .ow { color: #AA22FF; font-weight: bold } /* Operator.Word */\n",
       "body .w { color: #bbbbbb } /* Text.Whitespace */\n",
       "body .mb { color: #666666 } /* Literal.Number.Bin */\n",
       "body .mf { color: #666666 } /* Literal.Number.Float */\n",
       "body .mh { color: #666666 } /* Literal.Number.Hex */\n",
       "body .mi { color: #666666 } /* Literal.Number.Integer */\n",
       "body .mo { color: #666666 } /* Literal.Number.Oct */\n",
       "body .sa { color: #BA2121 } /* Literal.String.Affix */\n",
       "body .sb { color: #BA2121 } /* Literal.String.Backtick */\n",
       "body .sc { color: #BA2121 } /* Literal.String.Char */\n",
       "body .dl { color: #BA2121 } /* Literal.String.Delimiter */\n",
       "body .sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */\n",
       "body .s2 { color: #BA2121 } /* Literal.String.Double */\n",
       "body .se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */\n",
       "body .sh { color: #BA2121 } /* Literal.String.Heredoc */\n",
       "body .si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */\n",
       "body .sx { color: #008000 } /* Literal.String.Other */\n",
       "body .sr { color: #BB6688 } /* Literal.String.Regex */\n",
       "body .s1 { color: #BA2121 } /* Literal.String.Single */\n",
       "body .ss { color: #19177C } /* Literal.String.Symbol */\n",
       "body .bp { color: #008000 } /* Name.Builtin.Pseudo */\n",
       "body .fm { color: #0000FF } /* Name.Function.Magic */\n",
       "body .vc { color: #19177C } /* Name.Variable.Class */\n",
       "body .vg { color: #19177C } /* Name.Variable.Global */\n",
       "body .vi { color: #19177C } /* Name.Variable.Instance */\n",
       "body .vm { color: #19177C } /* Name.Variable.Magic */\n",
       "body .il { color: #666666 } /* Literal.Number.Integer.Long */\n",
       "\n",
       "  </style>\n",
       "</head>\n",
       "<body>\n",
       "<h2></h2>\n",
       "\n",
       "<div class=\"highlight\"><pre><span></span><span class=\"k\">class</span> <span class=\"nc\">TrivialVacuumEnvironment</span><span class=\"p\">(</span><span class=\"n\">Environment</span><span class=\"p\">):</span>\n",
       "\n",
       "    <span class=\"sd\">&quot;&quot;&quot;This environment has two locations, A and B. Each can be Dirty</span>\n",
       "<span class=\"sd\">    or Clean. The agent perceives its location and the location&#39;s</span>\n",
       "<span class=\"sd\">    status. This serves as an example of how to implement a simple</span>\n",
       "<span class=\"sd\">    Environment.&quot;&quot;&quot;</span>\n",
       "\n",
       "    <span class=\"k\">def</span> <span class=\"fm\">__init__</span><span class=\"p\">(</span><span class=\"bp\">self</span><span class=\"p\">):</span>\n",
       "        <span class=\"nb\">super</span><span class=\"p\">()</span><span class=\"o\">.</span><span class=\"fm\">__init__</span><span class=\"p\">()</span>\n",
       "        <span class=\"bp\">self</span><span class=\"o\">.</span><span class=\"n\">status</span> <span class=\"o\">=</span> <span class=\"p\">{</span><span class=\"n\">loc_A</span><span class=\"p\">:</span> <span class=\"n\">random</span><span class=\"o\">.</span><span class=\"n\">choice</span><span class=\"p\">([</span><span class=\"s1\">&#39;Clean&#39;</span><span class=\"p\">,</span> <span class=\"s1\">&#39;Dirty&#39;</span><span class=\"p\">]),</span>\n",
       "                       <span class=\"n\">loc_B</span><span class=\"p\">:</span> <span class=\"n\">random</span><span class=\"o\">.</span><span class=\"n\">choice</span><span class=\"p\">([</span><span class=\"s1\">&#39;Clean&#39;</span><span class=\"p\">,</span> <span class=\"s1\">&#39;Dirty&#39;</span><span class=\"p\">])}</span>\n",
       "\n",
       "    <span class=\"k\">def</span> <span class=\"nf\">thing_classes</span><span class=\"p\">(</span><span class=\"bp\">self</span><span class=\"p\">):</span>\n",
       "        <span class=\"k\">return</span> <span class=\"p\">[</span><span class=\"n\">Wall</span><span class=\"p\">,</span> <span class=\"n\">Dirt</span><span class=\"p\">,</span> <span class=\"n\">ReflexVacuumAgent</span><span class=\"p\">,</span> <span class=\"n\">RandomVacuumAgent</span><span class=\"p\">,</span>\n",
       "                <span class=\"n\">TableDrivenVacuumAgent</span><span class=\"p\">,</span> <span class=\"n\">ModelBasedVacuumAgent</span><span class=\"p\">]</span>\n",
       "\n",
       "    <span class=\"k\">def</span> <span class=\"nf\">percept</span><span class=\"p\">(</span><span class=\"bp\">self</span><span class=\"p\">,</span> <span class=\"n\">agent</span><span class=\"p\">):</span>\n",
       "        <span class=\"sd\">&quot;&quot;&quot;Returns the agent&#39;s location, and the location status (Dirty/Clean).&quot;&quot;&quot;</span>\n",
       "        <span class=\"k\">return</span> <span class=\"p\">(</span><span class=\"n\">agent</span><span class=\"o\">.</span><span class=\"n\">location</span><span class=\"p\">,</span> <span class=\"bp\">self</span><span class=\"o\">.</span><span class=\"n\">status</span><span class=\"p\">[</span><span class=\"n\">agent</span><span class=\"o\">.</span><span class=\"n\">location</span><span class=\"p\">])</span>\n",
       "\n",
       "    <span class=\"k\">def</span> <span class=\"nf\">execute_action</span><span class=\"p\">(</span><span class=\"bp\">self</span><span class=\"p\">,</span> <span class=\"n\">agent</span><span class=\"p\">,</span> <span class=\"n\">action</span><span class=\"p\">):</span>\n",
       "        <span class=\"sd\">&quot;&quot;&quot;Change agent&#39;s location and/or location&#39;s status; track performance.</span>\n",
       "<span class=\"sd\">        Score 10 for each dirt cleaned; -1 for each move.&quot;&quot;&quot;</span>\n",
       "        <span class=\"k\">if</span> <span class=\"n\">action</span> <span class=\"o\">==</span> <span class=\"s1\">&#39;Right&#39;</span><span class=\"p\">:</span>\n",
       "            <span class=\"n\">agent</span><span class=\"o\">.</span><span class=\"n\">location</span> <span class=\"o\">=</span> <span class=\"n\">loc_B</span>\n",
       "            <span class=\"n\">agent</span><span class=\"o\">.</span><span class=\"n\">performance</span> <span class=\"o\">-=</span> <span class=\"mi\">1</span>\n",
       "        <span class=\"k\">elif</span> <span class=\"n\">action</span> <span class=\"o\">==</span> <span class=\"s1\">&#39;Left&#39;</span><span class=\"p\">:</span>\n",
       "            <span class=\"n\">agent</span><span class=\"o\">.</span><span class=\"n\">location</span> <span class=\"o\">=</span> <span class=\"n\">loc_A</span>\n",
       "            <span class=\"n\">agent</span><span class=\"o\">.</span><span class=\"n\">performance</span> <span class=\"o\">-=</span> <span class=\"mi\">1</span>\n",
       "        <span class=\"k\">elif</span> <span class=\"n\">action</span> <span class=\"o\">==</span> <span class=\"s1\">&#39;Suck&#39;</span><span class=\"p\">:</span>\n",
       "            <span class=\"k\">if</span> <span class=\"bp\">self</span><span class=\"o\">.</span><span class=\"n\">status</span><span class=\"p\">[</span><span class=\"n\">agent</span><span class=\"o\">.</span><span class=\"n\">location</span><span class=\"p\">]</span> <span class=\"o\">==</span> <span class=\"s1\">&#39;Dirty&#39;</span><span class=\"p\">:</span>\n",
       "                <span class=\"n\">agent</span><span class=\"o\">.</span><span class=\"n\">performance</span> <span class=\"o\">+=</span> <span class=\"mi\">10</span>\n",
       "            <span class=\"bp\">self</span><span class=\"o\">.</span><span class=\"n\">status</span><span class=\"p\">[</span><span class=\"n\">agent</span><span class=\"o\">.</span><span class=\"n\">location</span><span class=\"p\">]</span> <span class=\"o\">=</span> <span class=\"s1\">&#39;Clean&#39;</span>\n",
       "\n",
       "    <span class=\"k\">def</span> <span class=\"nf\">default_location</span><span class=\"p\">(</span><span class=\"bp\">self</span><span class=\"p\">,</span> <span class=\"n\">thing</span><span class=\"p\">):</span>\n",
       "        <span class=\"sd\">&quot;&quot;&quot;Agents start in either location at random.&quot;&quot;&quot;</span>\n",
       "        <span class=\"k\">return</span> <span class=\"n\">random</span><span class=\"o\">.</span><span class=\"n\">choice</span><span class=\"p\">([</span><span class=\"n\">loc_A</span><span class=\"p\">,</span> <span class=\"n\">loc_B</span><span class=\"p\">])</span>\n",
       "</pre></div>\n",
       "</body>\n",
       "</html>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "psource(TrivialVacuumEnvironment)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "State of the Environment: {(0, 0): 'Clean', (1, 0): 'Dirty'}.\n"
     ]
    }
   ],
   "source": [
    "# These are the two locations for the two-state environment\n",
    "loc_A, loc_B = (0, 0), (1, 0)\n",
    "\n",
    "# Initialize the two-state environment\n",
    "trivial_vacuum_env = TrivialVacuumEnvironment()\n",
    "\n",
    "# Check the initial state of the environment\n",
    "print(\"State of the Environment: {}.\".format(trivial_vacuum_env.status))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's create our agent now. This agent will choose any of the actions from 'Right', 'Left', 'Suck' and 'NoOp' (No Operation) randomly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create the random agent\n",
    "random_agent = Agent(program=RandomAgentProgram(['Right', 'Left', 'Suck', 'NoOp']))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will now add our agent to the environment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "RandomVacuumAgent is located at (1, 0).\n"
     ]
    }
   ],
   "source": [
    "# Add agent to the environment\n",
    "trivial_vacuum_env.add_thing(random_agent)\n",
    "\n",
    "print(\"RandomVacuumAgent is located at {}.\".format(random_agent.location))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's run our environment now."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "State of the Environment: {(0, 0): 'Clean', (1, 0): 'Dirty'}.\n",
      "RandomVacuumAgent is located at (1, 0).\n"
     ]
    }
   ],
   "source": [
    "# Running the environment\n",
    "trivial_vacuum_env.step()\n",
    "\n",
    "# Check the current state of the environment\n",
    "print(\"State of the Environment: {}.\".format(trivial_vacuum_env.status))\n",
    "\n",
    "print(\"RandomVacuumAgent is located at {}.\".format(random_agent.location))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## TABLE-DRIVEN AGENT PROGRAM\n",
    "\n",
    "A table-driven agent program keeps track of the percept sequence and then uses it to index into a table of actions to decide what to do. The table represents explicitly the agent function that the agent program embodies.  \n",
    "In the two-state vacuum world, the table would consist of all the possible states of the agent."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [],
   "source": [
    "table = {((loc_A, 'Clean'),): 'Right',\n",
    "             ((loc_A, 'Dirty'),): 'Suck',\n",
    "             ((loc_B, 'Clean'),): 'Left',\n",
    "             ((loc_B, 'Dirty'),): 'Suck',\n",
    "             ((loc_A, 'Dirty'), (loc_A, 'Clean')): 'Right',\n",
    "             ((loc_A, 'Clean'), (loc_B, 'Dirty')): 'Suck',\n",
    "             ((loc_B, 'Clean'), (loc_A, 'Dirty')): 'Suck',\n",
    "             ((loc_B, 'Dirty'), (loc_B, 'Clean')): 'Left',\n",
    "             ((loc_A, 'Dirty'), (loc_A, 'Clean'), (loc_B, 'Dirty')): 'Suck',\n",
    "             ((loc_B, 'Dirty'), (loc_B, 'Clean'), (loc_A, 'Dirty')): 'Suck'\n",
    "        }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will now create a table-driven agent program for our two-state environment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create a table-driven agent\n",
    "table_driven_agent = Agent(program=TableDrivenAgentProgram(table=table))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since we are using the same environment, let's remove the previously added random agent from the environment to avoid confusion."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [],
   "source": [
    "trivial_vacuum_env.delete_thing(random_agent)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "TableDrivenVacuumAgent is located at (0, 0).\n"
     ]
    }
   ],
   "source": [
    "# Add the table-driven agent to the environment\n",
    "trivial_vacuum_env.add_thing(table_driven_agent)\n",
    "\n",
    "print(\"TableDrivenVacuumAgent is located at {}.\".format(table_driven_agent.location))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "State of the Environment: {(0, 0): 'Clean', (1, 0): 'Dirty'}.\n",
      "TableDrivenVacuumAgent is located at (1, 0).\n"
     ]
    }
   ],
   "source": [
    "# Run the environment\n",
    "trivial_vacuum_env.step()\n",
    "\n",
    "# Check the current state of the environment\n",
    "print(\"State of the Environment: {}.\".format(trivial_vacuum_env.status))\n",
    "\n",
    "print(\"TableDrivenVacuumAgent is located at {}.\".format(table_driven_agent.location))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## SIMPLE REFLEX AGENT PROGRAM\n",
    "\n",
    "A simple reflex agent program selects actions on the basis of the *current* percept, ignoring the rest of the percept history. These agents work on a **condition-action rule** (also called **situation-action rule**, **production** or **if-then rule**), which tells the agent the action to trigger when a particular situation is encountered.  \n",
    "\n",
    "The schematic diagram shown in **Figure 2.9** of the book will make this more clear:\n",
    "\n",
    "\"![simple reflex agent](images/simple_reflex_agent.jpg)\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us now create a simple reflex agent for the environment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Delete the previously added table-driven agent\n",
    "trivial_vacuum_env.delete_thing(table_driven_agent)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To create our agent, we need two functions: INTERPRET-INPUT function, which generates an abstracted description of the current state from the percerpt and the RULE-MATCH function, which returns the first rule in the set of rules that matches the given state description."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "loc_A = (0, 0)\n",
    "loc_B = (1, 0)\n",
    "\n",
    "\"\"\"We change the simpleReflexAgentProgram so that it doesn't make use of the Rule class\"\"\"\n",
    "def SimpleReflexAgentProgram():\n",
    "    \"\"\"This agent takes action based solely on the percept. [Figure 2.10]\"\"\"\n",
    "    \n",
    "    def program(percept):\n",
    "        loc, status = percept\n",
    "        return ('Suck' if status == 'Dirty' \n",
    "                else'Right' if loc == loc_A \n",
    "                            else'Left')\n",
    "    return program\n",
    "\n",
    "        \n",
    "# Create a simple reflex agent the two-state environment\n",
    "program = SimpleReflexAgentProgram()\n",
    "simple_reflex_agent = Agent(program)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now add the agent to the environment:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SimpleReflexVacuumAgent is located at (1, 0).\n"
     ]
    }
   ],
   "source": [
    "trivial_vacuum_env.add_thing(simple_reflex_agent)\n",
    "\n",
    "print(\"SimpleReflexVacuumAgent is located at {}.\".format(simple_reflex_agent.location))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "State of the Environment: {(0, 0): 'Clean', (1, 0): 'Clean'}.\n",
      "SimpleReflexVacuumAgent is located at (1, 0).\n"
     ]
    }
   ],
   "source": [
    "# Run the environment\n",
    "trivial_vacuum_env.step()\n",
    "\n",
    "# Check the current state of the environment\n",
    "print(\"State of the Environment: {}.\".format(trivial_vacuum_env.status))\n",
    "\n",
    "print(\"SimpleReflexVacuumAgent is located at {}.\".format(simple_reflex_agent.location))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## MODEL-BASED REFLEX AGENT PROGRAM\n",
    "\n",
    "A model-based reflex agent maintains some sort of **internal state** that depends on the percept history and thereby reflects at least some of the unobserved aspects of the current state. In addition to this, it also requires a **model** of the world, that is, knowledge about \"how the world works\".\n",
    "\n",
    "The schematic diagram shown in **Figure 2.11** of the book will make this more clear:\n",
    "<img src=\"files/images/model_based_reflex_agent.jpg\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will now create a model-based reflex agent for the environment:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Delete the previously added simple reflex agent\n",
    "trivial_vacuum_env.delete_thing(simple_reflex_agent)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We need another function UPDATE-STATE which will be responsible for creating a new state description."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ModelBasedVacuumAgent is located at (0, 0).\n"
     ]
    }
   ],
   "source": [
    "# TODO: Implement this function for the two-dimensional environment\n",
    "def update_state(state, action, percept, model):\n",
    "    pass\n",
    "\n",
    "# Create a model-based reflex agent\n",
    "model_based_reflex_agent = ModelBasedVacuumAgent()\n",
    "\n",
    "# Add the agent to the environment\n",
    "trivial_vacuum_env.add_thing(model_based_reflex_agent)\n",
    "\n",
    "print(\"ModelBasedVacuumAgent is located at {}.\".format(model_based_reflex_agent.location))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "State of the Environment: {(0, 0): 'Clean', (1, 0): 'Clean'}.\n",
      "ModelBasedVacuumAgent is located at (1, 0).\n"
     ]
    }
   ],
   "source": [
    "# Run the environment\n",
    "trivial_vacuum_env.step()\n",
    "\n",
    "# Check the current state of the environment\n",
    "print(\"State of the Environment: {}.\".format(trivial_vacuum_env.status))\n",
    "\n",
    "print(\"ModelBasedVacuumAgent is located at {}.\".format(model_based_reflex_agent.location))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## GOAL-BASED AGENT PROGRAM\n",
    "\n",
    "A goal-based agent needs some sort of **goal** information that describes situations that are desirable, apart from the current state description.\n",
    "\n",
    "**Figure 2.13** of the book shows a model-based, goal-based agent:\n",
    "<img src=\"files/images/model_goal_based_agent.jpg\">\n",
    "\n",
    "**Search** (Chapters 3 to 5) and **Planning** (Chapters 10 to 11) are the subfields of AI devoted to finding action sequences that achieve the agent's goals.\n",
    "\n",
    "## UTILITY-BASED AGENT PROGRAM\n",
    "\n",
    "A utility-based agent maximizes its **utility** using the agent's **utility function**, which is essentially an internalization of the agent's performance measure.\n",
    "\n",
    "**Figure 2.14** of the book shows a model-based, utility-based agent:\n",
    "<img src=\"files/images/model_utility_based_agent.jpg\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## LEARNING AGENT\n",
    "\n",
    "Learning allows the agent to operate in initially unknown environments and to become more competent than its initial knowledge alone might allow. Here, we will breifly introduce the main ideas of learning agents.  \n",
    "\n",
    "A learning agent can be divided into four conceptual components. The **learning element** is responsible for making improvements. It uses the feedback from the **critic** on how the agent is doing and determines how the performance element should be modified to do better in the future. The **performance element** is responsible for selecting external actions for the agent: it takes in percepts and decides on actions. The critic tells the learning element how well the agent is doing with respect to a fixed performance standard. It is necesaary because the percepts themselves provide no indication of the agent's success. The last component of the learning agent is the **problem generator**. It is responsible for suggesting actions that will lead to new and informative experiences.  \n",
    "\n",
    "**Figure 2.15** of the book sums up the components and their working:  \n",
    "<img src=\"files/images/general_learning_agent.jpg\">"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}