   ],
   "source": [
    "from utils import print_table\n",
    "pi = best_policy(sequential_decision_environment, value_iteration(sequential_decision_environment, .001))\n",
    "print_table(sequential_decision_environment.to_arrows(pi))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is exactly the output we expected.\n",
    "![title](images/-4.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The living reward in every nonterminal state is now lower than the reward of the worst terminal state. Life is so painful that the agent heads for the nearest exit, since even the worst exit is less painful than staying in a nonterminal state."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Case 4\n",
    "---\n",
    "R(s) = 4 in all states except terminal states"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "sequential_decision_environment = GridMDP([[4, 4, 4, +1],\n",
    "                                           [4, None, 4, -1],\n",
    "                                           [4, 4, 4, 4]],\n",
    "                                          terminals=[(3, 2), (3, 1)])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ">   >      <   .\n",
      ">   None   <   .\n",
      ">   >      >   v\n"
     ]
    }
   ],
   "source": [
    "from utils import print_table\n",
    "pi = best_policy(sequential_decision_environment, value_iteration(sequential_decision_environment, .001))\n",
    "print_table(sequential_decision_environment.to_arrows(pi))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this case, the output we expect is\n",
    "![title](images/4.jpg)\n",
    "<br>\n",
    "As life is positively enjoyable and the agent avoids _both_ exits.\n",
    "Even though the output we get is not exactly what we want, it is definitely not wrong.\n",
    "The scenario here requires the agent to anything but reach a terminal state, as this is the only way the agent can maximize its reward (total reward tends to infinity), and the program does just that.\n",
    "<br>\n",
    "Currently, the GridMDP class doesn't support an explicit marker for a \"do whatever you like\" action or a \"don't care\" condition.\n",
    "You can however, extend the class to do so.\n",
    "<br>\n",
    "For in-depth knowledge about sequential decision problems, refer **Section 17.1** in the AIMA book."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "## Appendix\n",
    "\n",
    "Surprisingly, it turns out that there are six other optimal policies for various ranges of R(s).\n",
    "You can try to find them yourself.\n",
    "See **Exercise 17.5**.\n",
    "To help you with this, we have a GridMDP editor in `grid_mdp.py` in the `gui` folder.\n",
    "<br>\n",
    "Here's a brief tutorial on how to use it.\n",
    "<br>\n",
    "Let us use it to solve `Case 2` above:\n",
    "1. Run `python gui/grid_mdp.py` from the master directory.\n",
    "2. Enter the dimensions of the grid (3 x 4 in this case) and click on `Build a GridMDP`.\n",
    "3. Click on `Initialize` in the `Edit` menu.\n",
    "4. Set the reward to -0.4 and click `Apply`. Exit the dialog.\n",
    "![title](images/ge0.jpg)\n",
    "<br>\n",
    "5. Select cell (1, 1) and check the `Wall` radio button. Click `Apply` and exit the dialog.\n",
    "![title](images/ge1.jpg)\n",
    "<br>\n",
    "6. Select cells (4, 1) and (4, 2) and check the `Terminal` radio button for both. Set the rewards appropriately and click on `Apply`. Exit the dialog. Your window should look something like this.\n",
    "![title](images/ge2.jpg)\n",
    "<br>\n",
    "7. You are all set up now. Click on `Build and Run` in the `Build` menu and watch the heatmap update as the utility function is calculated.\n",
    "![title](images/ge4.jpg)\n",
    "<br>\n",
    "Green shades indicate positive utilities and brown shades indicate negative utilities.\n",
    "The values of the utility function and the arrow diagram will pop up in separate dialogs after the algorithm converges."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.3"
    "001e6c8ed3fc4eeeb6ab7901992314dd": {
     "views": []
    },
    "00f29880456846a8854ab515146ec55b": {
     "views": []
    },
    "010f52f7cde545cba25593839002049b": {
     "views": []
    },
    "01473ad99aa94acbaca856a7d980f2b9": {
     "views": []
    },
    "021a4a4f35da484db5c37c5c8d0dbcc2": {
     "views": []
    },
    "02229be5d3bc401fad55a0378977324a": {
     "views": []
    },
    "022a5fdfc8e44fb09b21c4bd5b67a0db": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "025c3b0250b94d4c8d9b33adfdba4c15": {
     "views": []
    },
    "028f96abfed644b8b042be1e4b16014d": {
     "views": []
    },
    "0303bad44d404a1b9ad2cc167e42fcb7": {
     "views": []
    },
    "031d2d17f32347ec83c43798e05418fe": {
     "views": []
    },
    "03de64f0c2fd43f1b3b5d84aa265aeb7": {
     "views": []
    },
    "03fdd484675b42ad84448f64c459b0e0": {
     "views": []
    },
    "044cf74f03fd44fd840e450e5ee0c161": {
     "views": []
    },
    "054ae5ba0a014a758de446f1980f1ba5": {
     "views": []
    },
    "0675230fb92f4539bc257b768fb4cd10": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "06c93b34e1f4424aba9a0b172c428260": {
     "views": []
    },
    "077a5ea324be46c3ad0110671a0c6a12": {
     "views": []
    },
    "0781138d150142a08775861a69beaec9": {
     "views": []
    },
    "0783e74a8c2b40cc9b0f5706271192f4": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "07c7678b73634e728085f19d7b5b84f7": {
     "views": []
    },
    "07febf1d15a140d8adb708847dd478ec": {
     "views": []
    },
    "08299b681cd9477f9b19a125e186ce44": {
    "083af89d82e445aab4abddfece61d700": {
    "08a1129a8bd8486bbfe2c9e49226f618": {
    "08a2f800c0d540fdb24015156c7ffc15": {
    "097d8d0feccc4c76b87bbcb3f1ecece7": {
    "098f12158d844cdf89b29a4cd568fda0": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "09e96f9d5d32453290af60fbd29ca155": {
     "views": []
    },
    "0a2ec7c49dcd4f768194483c4f2e8813": {
     "views": []
    },
    "0b1d6ed8fe4144b8a24228e1befe2084": {
     "views": []
    },
    "0b299f8157d24fa9830653a394ef806a": {
     "views": []
    },
    "0b2a4ac81a244ff1a7b313290465f8f4": {
     "views": []
    },
    "0b52cfc02d604bc2ae42f4ba8c7bca4f": {
     "views": []
    },
    "0b65fb781274495ab498ad518bc274d4": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "0b865813de0841c49b41f6ad5fb85c6a": {
     "views": []
    },
    "0c2070d20fb04864aeb2008a6f2b8b30": {
     "views": []
    },
    "0cf5319bcde84f65a1a91c5f9be3aa28": {
     "views": []
    },
    "0d721b5be85f4f8aafe26b3597242d60": {
     "views": []
    },
    "0d9f29e197ad45d6a04bbb6864d3be6d": {
    "0e03c7e2c0414936b206ed055e19acba": {
    "0e2265aa506a4778bfc480d5e48c388b": {
    "0e4e3d0b6afc413e86970ec4250df678": {
    "0e6a5fe6423542e6a13e30f8929a8b02": {
    "0e7b2f39c94343c3b0d3b6611351886e": {
    "0eb5005fa34440988bcf3be231d31511": {
    "104703ad808e41bc9106829bb0396ece": {
    "109c376b28774a78bf90d3da4587d834": {
    "10b24041718843da976ac616e77ea522": {
    "11516bb6db8b45ef866bd9be8bb59312": {
    "1203903354fa467a8f38dbbad79cbc81": {
    "124ecbe68ada40f68d6a1807ad6bcdf9": {
    "1264becdbb63455183aa75f236a3413e": {
    "13061cc21693480a8380346277c1b877": {
    "130dd4d2c9f04ad28d9a6ac40045a329": {
    "1350a087b5a9422386c3c5f04dd5d1c9": {
    "139bd19be4a4427a9e08f0be6080188e": {
    "13f9f589d36c477f9b597dda459efd16": {
    "140917b5c77348ec82ea45da139a3045": {
    "145419657bb1401ba934e6cea43d5fd1": {
    "15d748f1629d4da1982cd62cfbcb1725": {
    "17ad015dbc744ac6952d2a6da89f0289": {
    "17b6508f32e4425e9f43e5407eb55ed3": {
    "185598d8e5fc4dffae293f270a6e7328": {
    "196473b25f384f3895ee245e8b7874e9": {
    "19c0f87663a0431285a62d4ad6748046": {
    "1a00a7b7446d4ad8b08c9a2a9ea9c852": {
    "1a97f5b88cdc4ae0871578c06bbb9965": {
    "1a9a07777b0c4a45b33e25a70ebdc290": {
    "1af711fe8e4f43f084cef6c89eec40ae": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "1aff6a6e15b34bb89d7579d445071230": {
     "views": []
    },
    "1b1ea7e915d846aea9efeae4381b2c48": {
     "views": []
    },
    "1ba02ae1967740b0a69e07dbe95635cb": {
     "views": []
    },
    "1c5c913acbde4e87a163abb2e24e6e38": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "1cfca0b7ef754c459e1ad97c1f0ceb3b": {
     "views": []
    },
    "1d8f6a4910e649589863b781aab4c4d4": {
     "views": []
    },
    "1e64b8f5a1554a22992693c194f7b971": {
     "views": []
    },
    "1e8f0a2bf7614443a380e53ed27b48c0": {
    "1f4e6fa4bacc479e8cd997b26a5af733": {
    "1fdf09158eb44415a946f07c6aaba620": {
     "views": []
    },
    "200e3ebead3d4858a47e2f6d345ca395": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "2050d4b462474a059f9e6493ba06ac58": {
    "20b5c21a6e6a427ba3b9b55a0214f75e": {
    "20b99631feba4a9c98c9d5f74c620273": {
    "20bcff5082854ab89a7977ae56983e30": {
    "20d708bf9b7845fa946f5f37c7733fee": {
    "210b36ea9edf4ee49ae1ae3fe5005282": {
    "21415393cb2d4f72b5c3f5c058aeaf66": {
    "2186a18b6ed8405a8a720bae59de2ace": {
    "220dc13e9b6942a7b9ed9e37d5ede7ba": {
    "221a735fa6014a288543e6f8c7e4e2ef": {
    "2288929cec4d4c8faad411029f5e21fa": {
    "22b86e207ea6469d85d8333870851a86": {
    "23283ad662a140e3b5e8677499e91d64": {
    "23a7cc820b63454ca6be3dcfd2538ac1": {
    "240ed02d576546028af3edfab9ea8558": {
    "24678e52a0334cb9a9a56f92c29750be": {
    "247820f6d83f4dd9b68f5df77dbda4b7": {
    "24b6a837fbd942c9a68218fb8910dcd5": {
    "24ee3204f26348bca5e6a264973e5b56": {
    "262c7bb5bd7447f791509571fe74ae44": {
    "263595f22d0d45e2a850854bcefe4731": {
    "2640720aa6684c5da6d7870abcbc950b": {
    "265ca1ec7ad742f096bb8104d0cf1550": {
    "26bf66fba453464fac2f5cd362655083": {
    "29769879478f49e8b4afd5c0b4662e87": {
    "29a13bd6bc8d486ca648bf30c9e4c2a6": {
    "29c5df6267584654b76205fc5559c553": {
    "29ce25045e7248e5892e8aafc635c416": {
    "2a17207c43c9424394299a7b52461794": {
    "2a777941580945bc83ddb0c817ed4122": {
    "2ae1844e2afe416183658d7a602e5963": {
    "2afa2938b41944cf8c14e41a431e3969": {
    "2bdc5f9b161548e3aab8ea392b5af1a1": {
    "2c26b2bcfc96473584930a4b622d268e": {
    "2ca2a914a5f940b18df0b5cde2b79e4b": {
    "2ca2c532840548a9968d1c6b2f0acdd8": {
    "2d17c32bfea143babe2b114d8777b15d": {
    "2d3acd8872c342eab3484302cac2cb05": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "2dc514cc2f5547aeb97059a5070dc9e3": {
    "2e1351ad05384d058c90e594bc6143c1": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "2e9b80fa18984615933e41c1c1db2171": {
    "2ef17ee6b7c74a4bbbbbe9b1a93e4fb6": {
    "2f5438f1b34046a597a467effd43df11": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "2f8d22417f3e421f96027fca40e1554f": {
    "2fb0409cfb49469d89a32597dc3edba9": {
    "303ccef837984c97b7e71f2988c737a4": {
    "3058b0808dca48a0bba9a93682260491": {
    "306b65493c28411eb10ad786bbf85dc5": {
    "30f5d30cf2d84530b3199015c5ff00eb": {
    "310b1ac518bd4079bdb7ecaf523a6809": {
    "313eca81d9d24664bcc837db54d59618": {
    "31413caf78c14548baa61e3e3c9edc55": {
    "317fbd3cb6324b2fbdfd6aa46a8d1192": {
    "319425ba805346f5ba366c42e220f9c6": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "31fc8165275e473f8f75c6215b5184ff": {
    "329f12edaa0c44d2a619450f188e8777": {
    "32edf057582f4a6ca30ce3cb685bf971": {
    "330e74773ba148e18674cfa3e63cd6cc": {
    "332a89c03bfb49c2bb291051d172b735": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "3347dfda0aca450f89dd9b39ca1bec7d": {
    "336e8bcfd7cc4a85956674b0c7bffff2": {
    "3376228b3b614d4ab2a10b2fd0f484fd": {
    "3380a22bc67c4be99c61050800f93395": {
    "34b5c16cbea448809c2ccbce56f8d5a5": {
    "34bb050223504afc8053ce931103f52c": {
    "34c28187175d49198b536a1ab13668c4": {
    "3521f32644514ecf9a96ddfa5d80fb9b": {
    "36511bd77ed74f668053df749cc735d4": {
    "36541c3490bd4268b64daf20d8c24124": {
    "37aa1dd4d76a4bac98857b519b7b523a": {
    "37aa3cfa3f8f48989091ec46ac17ae48": {
    "386991b0b1424a9c816dac6a29e1206b": {
    "386cf43742234dda994e35b41890b4d8": {
    "388571e8e0314dfab8e935b7578ba7f9": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "3974e38e718547efaf0445da2be6a739": {
    "398490e0cc004d22ac9c4486abec61e1": {
    "399875994aba4c53afa8c49fae8d369e": {
    "39b64aa04b1d4a81953e43def0ef6e10": {
    "39ffc3dd42d94a27ba7240d10c11b565": {
    "3a21291c8e7249e3b04417d31b0447cf": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "3a377d9f46704d749c6879383c89f5d3": {
    "3a44a6f1f62742849e96d957033a0039": {
    "3b22d68709b046e09fe70f381a3944cd": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "3b329209c8f547acae1925dc3eb4af77": {
    "3c1b2ec10a9041be8a3fad9da78ff9f6": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "3c2be3c85c6d41268bb4f9d63a43e196": {
    "3c6796eff7c54238a7b7776e88721b08": {
    "3cbca3e11edf439fb7f8ba41693b4824": {
    "3d4b6b7c0b0c48ff8c4b8d78f58e0f1c": {
    "3de1faf0d2514f49a99b3d60ea211495": {
    "3df60d9ac82b42d9b885d895629e372e": {
    "3e5b9fd779574270bf58101002c152ce": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "3e80f34623c94659bfab5b3b56072d9a": {
    "3e8bb05434cb4a0291383144e4523840": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "3ea1c8e4f9b34161928260e1274ee048": {
    "3f32f0915bc6469aaaf7170eff1111e3": {
    "3fe69a26ae7a46fda78ae0cb519a0f8b": {
    "4000ecdd75d9467e9dffd457b35aa65f": {
    "402d346f8b68408faed2fd79395cf3fb": {
    "402f4116244242148fdc009bb399c3bd": {
    "4049e0d7c0d24668b7eae2bb7169376e": {
    "4088c9ed71b0467b9b9417d5b04eda0e": {
    "40d70faa07654b6cb13496c32ba274b3": {
    "4146be21b7614abe827976787ec570f1": {
    "4198c08edda440dd93d1f6ce3e4efa62": {
    "42023d7d3c264f9d933d4cee4362852b": {
    "421ad8c67f754ce2b24c4fa3a8e951cf": {
    "4263fe0cef42416f8d344c1672f591f9": {
    "428e42f04a1e4347a1f548379c68f91b": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "42a47243baf34773943a25df9cf23854": {
    "4343b72c91d04a7c9a6080f30fc63d7d": {
    "43488264fc924c01a30fa58604074b07": {
    "4379175239b34553bf45c8ef9443ac55": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "43859798809a4a289c58b4bd5e49d357": {
    "43ad406a61a34249b5622aba9450b23d": {
    "4421c121414d464bb3bf1b5f0e86c37b": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "445cc08b4da44c2386ac9379793e3506": {
    "447cff7e256c434e859bb7ce9e5d71c8": {
    "44af7da9d8304f07890ef7d11a9f95fe": {
    "45021b6f05db4c028a3b5572bc85217f": {
    "457768a474844556bf9b215439a2f2e9": {
    "45d5689de53646fe9042f3ce9e281acc": {
    "461aa21d57824526a6b61e3f9b5af523": {
    "472ca253aab34b098f53ed4854d35f23": {
    "4731208453424514b471f862804d9bb8": {
       "cell_index": 27.0
    "47dfef9eaf0e433cb4b3359575f39480": {
     "views": []
    },
    "48220a877d494a3ea0cc9dae19783a13": {
     "views": []
    },
    "4882c417949b4b6788a1c3ec208fb1ac": {
     "views": []
    },
    "49f5c38281984e3bad67fe3ea3eb6470": {
     "views": []
    },
    "4a0d39b43eee4e818d47d382d87d86d1": {
     "views": []
    },
    "4a470bf3037047f48f4547b594ac65fa": {
     "views": []
    },
    "4abab5bca8334dfbb0434be39eb550db": {
    "4b48e08fd383489faa72fc76921eac4e": {
    "4b9439e6445c4884bd1cde0e9fd2405e": {
    "4b9fa014f9904fcf9aceff00cc1ebf44": {
    "4bdc63256c3f4e31a8fa1d121f430518": {
    "4bebb097ddc64bbda2c475c3a0e92ab5": {
    "4c201df21ca34108a6e7b051aa58b7f6": {
    "4ced8c156fd941eca391016fc256ce40": {
    "4d281cda33fa489d86228370e627a5b0": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "4d85e68205d94965bdb437e5441b10a1": {
     "views": []
    },
    "4e0e6dd34ba7487ba2072d352fe91bf5": {
     "views": []
    },
    "4e82b1d731dd419480e865494f932f80": {
     "views": []
    },
    "4e9f52dea051415a83c4597c4f7a6c00": {
     "views": []
    },
    "4ec035cba73647358d416615cf4096ee": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "4f09442f99aa4a9e9f460f82a50317c4": {
     "views": []
    },
    "4f80b4e6b074475698efbec6062e3548": {
     "views": []
    },
    "4f905a287b4f4f0db64b9572432b0139": {
    "50a339306cd549de86fbe5fa2a0a3503": {
    "51068697643243e18621c888a6504434": {
    "51333b89f44b41aba813aef099bdbb42": {
     "views": []
    },
    "5141ae07149b46909426208a30e2861e": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "515606cb3b3a4fccad5056d55b262db4": {
     "views": []
    },
    "51aa6d9f5a90481db7e3dd00d77d4f09": {
     "views": []
    },
    "524091ea717d427db2383b46c33ef204": {
     "views": []
    },
    "524d1132c88f4d91b15344cc427a9565": {
     "views": []
    },
    "52f70e249adc4edb8dca28b883a5d4f4": {
     "views": []
    },
    "531c080221f64b8ca50d792bbaa6f31e": {
     "views": []
    },
    "53349c544b54450f8e2af9b8ba176d78": {
     "views": []
    },
    "53a8b8e7b7494d02852a0dc5ccca51a2": {
     "views": []
    },
    "53c963469eee41b59479753201626f18": {
     "views": []
    },
    "5436516c280a49828c1c2f4783d9cf0e": {
     "views": []
    },
    "55a1b0b794f44ac796bc75616f65a2a1": {
     "views": [
      {
       "cell_index": 27.0
      }
     ]
    },
    "55ebf735de4c4b5ba2f09bc51d3593fd": {
     "views": []
    },
    "56007830e925480e94a12356ff4fb6a4": {
     "views": []
    },
    "56def8b3867843f990439b33dab3da58": {
     "views": []
    },
    "5719bb596a5649f6af38c11c3daae6e9": {
     "views": []
    },
    "572245b145014b6e91a3b5fe55e4cf78": {