vacuum_world.ipynb 17 ko
Newer Older
Apurv Bajaj's avatar
Apurv Bajaj a validé
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# THE VACUUM WORLD   \n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "\n",
    "In this notebook, we will be discussing **the structure of agents** through an example of the **vacuum agent**. The job of AI is to design an **agent program** that implements the agent function: the mapping from percepts to actions. We assume this program will run on some sort of computing device with physical sensors and actuators: we call this the **architecture**:\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "\n",
    "<h3 align=\"center\">agent = architecture + program</h3>"
Apurv Bajaj's avatar
Apurv Bajaj a validé
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before moving on, please review [<b>agents.ipynb</b>](https://github.com/aimacode/aima-python/blob/master/agents.ipynb)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## CONTENTS\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "\n",
    "* Agent\n",
    "* Random Agent Program\n",
    "* Table-Driven Agent Program\n",
    "* Simple Reflex Agent Program\n",
    "* Model-Based Reflex Agent Program\n",
    "* Goal-Based Agent Program\n",
    "* Utility-Based Agent Program\n",
    "* Learning Agent"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## AGENT PROGRAMS\n",
    "\n",
    "An agent program takes the current percept as input from the sensors and returns an action to the actuators. There is a difference between an agent program and an agent function: an agent program takes the current percept as input whereas an agent function takes the entire percept history.\n",
    "\n",
    "The agent program takes just the current percept as input because nothing more is available from the environment; if the agent's actions depend on the entire percept sequence, the agent will have to remember the percept.\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "\n",
    "We'll discuss the following agent programs here with the help of the vacuum world example:\n",
    "\n",
    "* Random Agent Program\n",
    "* Table-Driven Agent Program\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "* Simple Reflex Agent Program\n",
    "* Model-Based Reflex Agent Program\n",
    "* Goal-Based Agent Program\n",
    "* Utility-Based Agent Program"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Random Agent Program\n",
    "\n",
    "A random agent program, as the name suggests, chooses an action at random, without taking into account the percepts.   \n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "Here, we will demonstrate a random vacuum agent for a trivial vacuum environment, that is, the two-state environment."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's begin by importing all the functions from the agents module:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "metadata": {},
   "outputs": [],
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "source": [
    "from agents import *\n",
    "from notebook import psource"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us first see how we define the TrivialVacuumEnvironment. Run the next cell to see how abstract class TrivialVacuumEnvironment is defined in agents module:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "metadata": {},
   "outputs": [],
   "source": [
    "psource(TrivialVacuumEnvironment)"
Apurv Bajaj's avatar
Apurv Bajaj a validé
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "State of the Environment: {(0, 0): 'Dirty', (1, 0): 'Clean'}.\n"
Apurv Bajaj's avatar
Apurv Bajaj a validé
     ]
    }
   ],
   "source": [
    "# These are the two locations for the two-state environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "loc_A, loc_B = (0, 0), (1, 0)\n",
    "\n",
    "# Initialize the two-state environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "trivial_vacuum_env = TrivialVacuumEnvironment()\n",
    "\n",
Robert Hönig's avatar
Robert Hönig a validé
    "# Check the initial state of the environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "print(\"State of the Environment: {}.\".format(trivial_vacuum_env.status))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's create our agent now. This agent will choose any of the actions from 'Right', 'Left', 'Suck' and 'NoOp' (No Operation) randomly."
Apurv Bajaj's avatar
Apurv Bajaj a validé
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "outputs": [],
   "source": [
    "# Create the random agent\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "random_agent = Agent(program=RandomAgentProgram(['Right', 'Left', 'Suck', 'NoOp']))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will now add our agent to the environment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "RandomVacuumAgent is located at (0, 0).\n"
     ]
    }
   ],
   "source": [
    "# Add agent to the environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "trivial_vacuum_env.add_thing(random_agent)\n",
    "\n",
    "print(\"RandomVacuumAgent is located at {}.\".format(random_agent.location))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's run our environment now."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "State of the Environment: {(0, 0): 'Dirty', (1, 0): 'Clean'}.\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
      "RandomVacuumAgent is located at (0, 0).\n"
     ]
    }
   ],
   "source": [
    "# Running the environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "trivial_vacuum_env.step()\n",
    "\n",
    "# Check the current state of the environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "print(\"State of the Environment: {}.\".format(trivial_vacuum_env.status))\n",
    "\n",
    "print(\"RandomVacuumAgent is located at {}.\".format(random_agent.location))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## TABLE-DRIVEN AGENT PROGRAM\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "\n",
    "A table-driven agent program keeps track of the percept sequence and then uses it to index into a table of actions to decide what to do. The table represents explicitly the agent function that the agent program embodies.  \n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "In the two-state vacuum world, the table would consist of all the possible states of the agent."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "outputs": [],
   "source": [
    "table = {((loc_A, 'Clean'),): 'Right',\n",
    "             ((loc_A, 'Dirty'),): 'Suck',\n",
    "             ((loc_B, 'Clean'),): 'Left',\n",
    "             ((loc_B, 'Dirty'),): 'Suck',\n",
    "             ((loc_A, 'Dirty'), (loc_A, 'Clean')): 'Right',\n",
    "             ((loc_A, 'Clean'), (loc_B, 'Dirty')): 'Suck',\n",
    "             ((loc_B, 'Clean'), (loc_A, 'Dirty')): 'Suck',\n",
    "             ((loc_B, 'Dirty'), (loc_B, 'Clean')): 'Left',\n",
    "             ((loc_A, 'Dirty'), (loc_A, 'Clean'), (loc_B, 'Dirty')): 'Suck',\n",
    "             ((loc_B, 'Dirty'), (loc_B, 'Clean'), (loc_A, 'Dirty')): 'Suck'\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "        }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will now create a table-driven agent program for our two-state environment."
Apurv Bajaj's avatar
Apurv Bajaj a validé
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "outputs": [],
   "source": [
    "# Create a table-driven agent\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "table_driven_agent = Agent(program=TableDrivenAgentProgram(table=table))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since we are using the same environment, let's remove the previously added random agent from the environment to avoid confusion."
Apurv Bajaj's avatar
Apurv Bajaj a validé
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "outputs": [],
   "source": [
    "trivial_vacuum_env.delete_thing(random_agent)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "TableDrivenVacuumAgent is located at (0, 0).\n"
     ]
    }
   ],
   "source": [
    "# Add the table-driven agent to the environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "trivial_vacuum_env.add_thing(table_driven_agent)\n",
    "\n",
    "print(\"TableDrivenVacuumAgent is located at {}.\".format(table_driven_agent.location))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "State of the Environment: {(0, 0): 'Clean', (1, 0): 'Clean'}.\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
      "TableDrivenVacuumAgent is located at (0, 0).\n"
     ]
    }
   ],
   "source": [
    "# Run the environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "trivial_vacuum_env.step()\n",
    "\n",
    "# Check the current state of the environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "print(\"State of the Environment: {}.\".format(trivial_vacuum_env.status))\n",
    "\n",
    "print(\"TableDrivenVacuumAgent is located at {}.\".format(table_driven_agent.location))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## SIMPLE REFLEX AGENT PROGRAM\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "\n",
Robert Hönig's avatar
Robert Hönig a validé
    "A simple reflex agent program selects actions on the basis of the *current* percept, ignoring the rest of the percept history. These agents work on a **condition-action rule** (also called **situation-action rule**, **production** or **if-then rule**), which tells the agent the action to trigger when a particular situation is encountered.  \n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "\n",
    "The schematic diagram shown in **Figure 2.9** of the book will make this more clear:\n",
    "\n",
    "<img src=\"/files/images/simple_reflex_agent.jpg\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us now create a simple reflex agent for the environment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "outputs": [],
   "source": [
    "# Delete the previously added table-driven agent\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "trivial_vacuum_env.delete_thing(table_driven_agent)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To create our agent, we need two functions: INTERPRET-INPUT function, which generates an abstracted description of the current state from the percerpt and the RULE-MATCH function, which returns the first rule in the set of rules that matches the given state description."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "outputs": [],
   "source": [
    "# TODO: Implement these functions for two-dimensional environment\n",
    "# Interpret-input function for the two-state environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "def interpret_input(percept):\n",
    "    pass\n",
    "\n",
    "rules = None\n",
    "\n",
    "# Rule-match function for the two-state environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "def rule_match(state, rule):\n",
    "    for rule in rules:\n",
    "        if rule.matches(state):\n",
    "            return rule        \n",
    "        \n",
    "# Create a simple reflex agent the two-state environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "simple_reflex_agent = ReflexVacuumAgent()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now add the agent to the environment:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SimpleReflexVacuumAgent is located at (1, 0).\n"
Apurv Bajaj's avatar
Apurv Bajaj a validé
     ]
    }
   ],
   "source": [
    "trivial_vacuum_env.add_thing(simple_reflex_agent)\n",
    "\n",
    "print(\"SimpleReflexVacuumAgent is located at {}.\".format(simple_reflex_agent.location))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "State of the Environment: {(0, 0): 'Clean', (1, 0): 'Clean'}.\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
      "SimpleReflexVacuumAgent is located at (0, 0).\n"
     ]
    }
   ],
   "source": [
    "# Run the environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "trivial_vacuum_env.step()\n",
    "\n",
    "# Check the current state of the environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "print(\"State of the Environment: {}.\".format(trivial_vacuum_env.status))\n",
    "\n",
    "print(\"SimpleReflexVacuumAgent is located at {}.\".format(simple_reflex_agent.location))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## MODEL-BASED REFLEX AGENT PROGRAM\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "\n",
Robert Hönig's avatar
Robert Hönig a validé
    "A model-based reflex agent maintains some sort of **internal state** that depends on the percept history and thereby reflects at least some of the unobserved aspects of the current state. In addition to this, it also requires a **model** of the world, that is, knowledge about \"how the world works\".\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "\n",
    "The schematic diagram shown in **Figure 2.11** of the book will make this more clear:\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "<img src=\"files/images/model_based_reflex_agent.jpg\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will now create a model-based reflex agent for the environment:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "metadata": {},
   "outputs": [],
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "source": [
    "# Delete the previously added simple reflex agent\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "trivial_vacuum_env.delete_thing(simple_reflex_agent)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We need another function UPDATE-STATE which will be responsible for creating a new state description."
Apurv Bajaj's avatar
Apurv Bajaj a validé
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ModelBasedVacuumAgent is located at (0, 0).\n"
     ]
    }
   ],
   "source": [
    "# TODO: Implement this function for the two-dimensional environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "def update_state(state, action, percept, model):\n",
    "    pass\n",
    "\n",
    "# Create a model-based reflex agent\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "model_based_reflex_agent = ModelBasedVacuumAgent()\n",
    "\n",
    "# Add the agent to the environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "trivial_vacuum_env.add_thing(model_based_reflex_agent)\n",
    "\n",
    "print(\"ModelBasedVacuumAgent is located at {}.\".format(model_based_reflex_agent.location))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
Apurv Bajaj's avatar
Apurv Bajaj a validé
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "State of the Environment: {(0, 0): 'Clean', (1, 0): 'Clean'}.\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
      "ModelBasedVacuumAgent is located at (1, 0).\n"
     ]
    }
   ],
   "source": [
    "# Run the environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "trivial_vacuum_env.step()\n",
    "\n",
    "# Check the current state of the environment\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "print(\"State of the Environment: {}.\".format(trivial_vacuum_env.status))\n",
    "\n",
    "print(\"ModelBasedVacuumAgent is located at {}.\".format(model_based_reflex_agent.location))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## GOAL-BASED AGENT PROGRAM\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "\n",
    "A goal-based agent needs some sort of **goal** information that describes situations that are desirable, apart from the current state description.\n",
    "\n",
    "**Figure 2.13** of the book shows a model-based, goal-based agent:\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "<img src=\"files/images/model_goal_based_agent.jpg\">\n",
    "\n",
    "**Search** (Chapters 3 to 5) and **Planning** (Chapters 10 to 11) are the subfields of AI devoted to finding action sequences that achieve the agent's goals.\n",
    "\n",
    "## UTILITY-BASED AGENT PROGRAM\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "\n",
    "A utility-based agent maximizes its **utility** using the agent's **utility function**, which is essentially an internalization of the agent's performance measure.\n",
Apurv Bajaj's avatar
Apurv Bajaj a validé
    "\n",
    "**Figure 2.14** of the book shows a model-based, utility-based agent:\n",
    "<img src=\"files/images/model_utility_based_agent.jpg\">"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## LEARNING AGENT\n",
    "\n",
    "Learning allows the agent to operate in initially unknown environments and to become more competent than its initial knowledge alone might allow. Here, we will breifly introduce the main ideas of learning agents.  \n",
    "\n",
    "A learning agent can be divided into four conceptual components. The **learning element** is responsible for making improvements. It uses the feedback from the **critic** on how the agent is doing and determines how the performance element should be modified to do better in the future. The **performance element** is responsible for selecting external actions for the agent: it takes in percepts and decides on actions. The critic tells the learning element how well the agent is doing with respect to a fixed performance standard. It is necesaary because the percepts themselves provide no indication of the agent's success. The last component of the learning agent is the **problem generator**. It is responsible for suggesting actions that will lead to new and informative experiences.  \n",
    "\n",
    "**Figure 2.15** of the book sums up the components and their working:  \n",
    "<img src=\"files/images/general_learning_agent.jpg\">"
   ]
Apurv Bajaj's avatar
Apurv Bajaj a validé
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
Apurv Bajaj's avatar
Apurv Bajaj a validé
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}