"metadata": {
"collapsed": true
},
"source": [
"sequential_decision_environment = GridMDP([[-0.4, -0.4, -0.4, +1],\n",
" [-0.4, None, -0.4, -1],\n",
" [-0.4, -0.4, -0.4, -0.4]],\n",
" terminals=[(3, 2), (3, 1)])"
]
},
{
"cell_type": "code",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"> > > .\n",
"^ None ^ .\n",
"^ > ^ <\n"
]
}
],
"source": [
"pi = best_policy(sequential_decision_environment, value_iteration(sequential_decision_environment, .001))\n",
"from utils import print_table\n",
"print_table(sequential_decision_environment.to_arrows(pi))"
]
},
{
"metadata": {},
"source": [
"This is exactly the output we expected.\n",
""
]
},
{
"metadata": {},
"source": [
"As the reward for each state is now more negative, life is certainly more unpleasant.\n",
"The agent takes the shortest route to the +1 state and is willing to risk falling into the -1 state by accident."
]
},
{
"metadata": {},
"source": [
"### Case 3\n",
"---\n",
"R(s) = -4 in all states except terminal states"
]
},
{
"cell_type": "code",
"metadata": {
"collapsed": true
},
"source": [
"sequential_decision_environment = GridMDP([[-4, -4, -4, +1],\n",
" [-4, None, -4, -1],\n",
" [-4, -4, -4, -4]],\n",
" terminals=[(3, 2), (3, 1)])"
]
},
{
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"> > > .\n",
"^ None > .\n",
"> > > ^\n"
]
}
],
"source": [
"pi = best_policy(sequential_decision_environment, value_iteration(sequential_decision_environment, .001))\n",
"from utils import print_table\n",
"print_table(sequential_decision_environment.to_arrows(pi))"
]
},
{
"metadata": {},
"source": [
"This is exactly the output we expected.\n",
""
]
},
{
"metadata": {},
"source": [
"The living reward for each state is now lower than the least rewarding terminal. Life is so _painful_ that the agent heads for the nearest exit as even the worst exit is less painful than any living state."
]
},
{
"metadata": {},
"source": [
"### Case 4\n",
"---\n",
"R(s) = 4 in all states except terminal states"
]
},
{
"cell_type": "code",
"metadata": {
"collapsed": true
},
"source": [
"sequential_decision_environment = GridMDP([[4, 4, 4, +1],\n",
" [4, None, 4, -1],\n",
" [4, 4, 4, 4]],\n",
" terminals=[(3, 2), (3, 1)])"
]
},
{
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"> > < .\n",
"> None < .\n",
"> > > v\n"
]
}
],
"source": [
"pi = best_policy(sequential_decision_environment, value_iteration(sequential_decision_environment, .001))\n",
"from utils import print_table\n",
"print_table(sequential_decision_environment.to_arrows(pi))"
]
},
{
"metadata": {},
"source": [
"In this case, we expect the agent to avoid _both_ exits, as life is positively enjoyable.\n",
"Even though the output we get is not exactly what we want, it is definitely not wrong.\n",
"The scenario here requires the agent to do anything but reach a terminal state, as this is the only way the agent can maximize its reward (total reward tends to infinity), and the program does just that.\n",
"<br>\n",
"Currently, the GridMDP class doesn't support an explicit marker for a \"do whatever you like\" action or a \"don't care\" condition.\n",
"You can, however, extend the class to do so.\n",
"<br>\n",
"For in-depth knowledge about sequential decision problems, refer to **Section 17.1** in the AIMA book."
]
},
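{
"cell_type": "markdown",
"metadata": {},
"source": [
"Cases 2 to 4 differ only in the living reward, so the experiment can be wrapped in a small helper. The cell below is a minimal sketch and not part of the original notebook: `policy_for_living_reward` is a hypothetical name, and the sketch assumes that `GridMDP`, `value_iteration` and `best_policy` can be imported from `mdp.py` and `print_table` from `utils.py`, as used in the cells above. Trying other values of `r` is one way to explore how the policy changes with R(s)."
]
},
{
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": [
"from mdp import GridMDP, value_iteration, best_policy\n",
"from utils import print_table\n",
"\n",
"# Hypothetical helper (not in the original notebook): solve the 4x3 grid\n",
"# for a given living reward r and return the policy as a table of arrows.\n",
"def policy_for_living_reward(r, epsilon=.001):\n",
"    env = GridMDP([[r, r, r, +1],\n",
"                   [r, None, r, -1],\n",
"                   [r, r, r, r]],\n",
"                  terminals=[(3, 2), (3, 1)])\n",
"    pi = best_policy(env, value_iteration(env, epsilon))\n",
"    return env.to_arrows(pi)\n",
"\n",
"# Re-run the living rewards used in Cases 2, 3 and 4.\n",
"for r in (-0.4, -4, 4):\n",
"    print('R(s) =', r)\n",
"    print_table(policy_for_living_reward(r))"
]
},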
{
"metadata": {},
"source": [
"---\n",
"## Appendix\n",
"\n",
"Surprisingly, it turns out that there are six other optimal policies for various ranges of R(s). \n",
"You can try to find them out for yourself.\n",
"See **Exercise 17.5**.\n",
"To help you with this, we have a GridMDP editor in `grid_mdp.py` in the GUI folder. \n",
"<br>\n",
"Here's a brief tutorial about how to use it\n",
"<br>\n",
"Let us use it to solve `Case 2` above\n",
"1. Run `python gui/grid_mdp.py` from the master directory.\n",
"2. Enter the dimensions of the grid (3 x 4 in this case), and click on `'Build a GridMDP'`\n",
"3. Click on `Initialize` in the `Edit` menu.\n",
"4. Set the reward as -0.4 and click `Apply`. Exit the dialog. \n",
"\n",
"<br>\n",
"5. Select cell (1, 1) and check the `Wall` radio button. `Apply` and exit the dialog.\n",
"\n",
"<br>\n",
"6. Select cells (4, 1) and (4, 2) and check the `Terminal` radio button for both. Set the rewards appropriately and click on `Apply`. Exit the dialog. Your window should look something like this.\n",
"\n",
"<br>\n",
"7. You are all set up now. Click on `Build and Run` in the `Build` menu and watch the heatmap calculate the utility function.\n",
"\n",
"<br>\n",
"Green shades indicate positive utilities and brown shades indicate negative utilities. \n",
"The values of the utility function and arrow diagram will pop up in separate dialogs after the algorithm converges."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
},
"widgets": {
"state": {
"001e6c8ed3fc4eeeb6ab7901992314dd": {
"views": []
},
"00f29880456846a8854ab515146ec55b": {
"views": []
},
"010f52f7cde545cba25593839002049b": {
"views": []
},
"01473ad99aa94acbaca856a7d980f2b9": {
"views": []
},
"021a4a4f35da484db5c37c5c8d0dbcc2": {
"views": []
},
"02229be5d3bc401fad55a0378977324a": {
"views": []
},
"022a5fdfc8e44fb09b21c4bd5b67a0db": {
"views": [
{
"cell_index": 27
}
]
},
"025c3b0250b94d4c8d9b33adfdba4c15": {
"views": []
},
"028f96abfed644b8b042be1e4b16014d": {
"views": []
},
"0303bad44d404a1b9ad2cc167e42fcb7": {
"views": []
},
"031d2d17f32347ec83c43798e05418fe": {
"views": []
},
"03de64f0c2fd43f1b3b5d84aa265aeb7": {
"views": []
},
"03fdd484675b42ad84448f64c459b0e0": {
"views": []
},
"044cf74f03fd44fd840e450e5ee0c161": {
"views": []
},
"054ae5ba0a014a758de446f1980f1ba5": {
"views": []
},
"0675230fb92f4539bc257b768fb4cd10": {
"views": [
{
"cell_index": 27
}
]
},
"06c93b34e1f4424aba9a0b172c428260": {
"views": []
},
"077a5ea324be46c3ad0110671a0c6a12": {
"views": []
},
"0781138d150142a08775861a69beaec9": {
"views": []
},
"0783e74a8c2b40cc9b0f5706271192f4": {
"views": [
{
"cell_index": 27
}
]
},
"07c7678b73634e728085f19d7b5b84f7": {
"views": []
},
"07febf1d15a140d8adb708847dd478ec": {
"views": []
},
"08299b681cd9477f9b19a125e186ce44": {
"views": []
},
"083af89d82e445aab4abddfece61d700": {
"views": []
},
"08a1129a8bd8486bbfe2c9e49226f618": {
"views": []
},
"08a2f800c0d540fdb24015156c7ffc15": {
"views": []
},
"097d8d0feccc4c76b87bbcb3f1ecece7": {
"views": []
},
"098f12158d844cdf89b29a4cd568fda0": {
"views": [
{
"cell_index": 27
}
]
},
"09e96f9d5d32453290af60fbd29ca155": {
"views": []
},
"0a2ec7c49dcd4f768194483c4f2e8813": {
"views": []
},
"0b1d6ed8fe4144b8a24228e1befe2084": {
"views": []
},
"0b299f8157d24fa9830653a394ef806a": {
"views": []
},
"0b2a4ac81a244ff1a7b313290465f8f4": {
"views": []
},
"0b52cfc02d604bc2ae42f4ba8c7bca4f": {
"views": []
},
"0b65fb781274495ab498ad518bc274d4": {
"views": [
{
"cell_index": 27
}
]
},
"0b865813de0841c49b41f6ad5fb85c6a": {
"views": []
},
"0c2070d20fb04864aeb2008a6f2b8b30": {
"views": []
},
"0cf5319bcde84f65a1a91c5f9be3aa28": {
"views": []
},
"0d721b5be85f4f8aafe26b3597242d60": {
"views": []
},
"0d9f29e197ad45d6a04bbb6864d3be6d": {
"views": []
},
"0e03c7e2c0414936b206ed055e19acba": {
"views": []
},
"0e2265aa506a4778bfc480d5e48c388b": {
"views": []
},
"0e4e3d0b6afc413e86970ec4250df678": {
"views": []
},
"0e6a5fe6423542e6a13e30f8929a8b02": {
"views": []
},
"0e7b2f39c94343c3b0d3b6611351886e": {
"views": []
},
"0eb5005fa34440988bcf3be231d31511": {
"views": []
},
"104703ad808e41bc9106829bb0396ece": {
"views": []
},
"109c376b28774a78bf90d3da4587d834": {
"views": []
},
"10b24041718843da976ac616e77ea522": {
"views": []
},
"11516bb6db8b45ef866bd9be8bb59312": {
"views": []
},
"1203903354fa467a8f38dbbad79cbc81": {
"views": []
},
"124ecbe68ada40f68d6a1807ad6bcdf9": {
"views": []
},
"1264becdbb63455183aa75f236a3413e": {
"views": []
},
"13061cc21693480a8380346277c1b877": {
"views": []
},
"130dd4d2c9f04ad28d9a6ac40045a329": {
"views": []
},
"1350a087b5a9422386c3c5f04dd5d1c9": {
"views": []
},
"139bd19be4a4427a9e08f0be6080188e": {
"views": []
},
"13f9f589d36c477f9b597dda459efd16": {
"views": []
},
"140917b5c77348ec82ea45da139a3045": {
"views": []
},
"145419657bb1401ba934e6cea43d5fd1": {
"views": []
},
"15d748f1629d4da1982cd62cfbcb1725": {
"views": []
},
"17ad015dbc744ac6952d2a6da89f0289": {
"views": []
},
"17b6508f32e4425e9f43e5407eb55ed3": {
"views": []
},
"185598d8e5fc4dffae293f270a6e7328": {
"views": []
},
"196473b25f384f3895ee245e8b7874e9": {
"views": []
},
"19c0f87663a0431285a62d4ad6748046": {
"views": []
},
"1a00a7b7446d4ad8b08c9a2a9ea9c852": {
"views": []
},
"1a97f5b88cdc4ae0871578c06bbb9965": {
"views": []
},
"1a9a07777b0c4a45b33e25a70ebdc290": {
"views": []
},
"1af711fe8e4f43f084cef6c89eec40ae": {
"views": [
{
"cell_index": 27
}
]
},
"1aff6a6e15b34bb89d7579d445071230": {
"views": []
},
"1b1ea7e915d846aea9efeae4381b2c48": {
"views": []
},
"1ba02ae1967740b0a69e07dbe95635cb": {
"views": []
},
"1c5c913acbde4e87a163abb2e24e6e38": {
"views": [
{
"cell_index": 27
}
]
},
"1cfca0b7ef754c459e1ad97c1f0ceb3b": {
"views": []
},
"1d8f6a4910e649589863b781aab4c4d4": {
"views": []
},
"1e64b8f5a1554a22992693c194f7b971": {
"views": []
},
"1e8f0a2bf7614443a380e53ed27b48c0": {
"views": []
},
"1f4e6fa4bacc479e8cd997b26a5af733": {
"views": []
},
"1fdf09158eb44415a946f07c6aaba620": {
"views": []
},
"200e3ebead3d4858a47e2f6d345ca395": {
"views": [
{
"cell_index": 27
}
]
},
"2050d4b462474a059f9e6493ba06ac58": {
"views": []
},
"20b5c21a6e6a427ba3b9b55a0214f75e": {
"views": []
},
"20b99631feba4a9c98c9d5f74c620273": {
"views": []
},
"20bcff5082854ab89a7977ae56983e30": {
"views": []
},
"20d708bf9b7845fa946f5f37c7733fee": {
"views": []
},
"210b36ea9edf4ee49ae1ae3fe5005282": {
"views": []
},
"21415393cb2d4f72b5c3f5c058aeaf66": {
"views": []
},
"2186a18b6ed8405a8a720bae59de2ace": {
"views": []
},
"220dc13e9b6942a7b9ed9e37d5ede7ba": {
"views": []
},
"221a735fa6014a288543e6f8c7e4e2ef": {
"views": []
},
"2288929cec4d4c8faad411029f5e21fa": {
"views": []
},
"22b86e207ea6469d85d8333870851a86": {
"views": []
},
"23283ad662a140e3b5e8677499e91d64": {
"views": []
},
"23a7cc820b63454ca6be3dcfd2538ac1": {
"views": []
},
"240ed02d576546028af3edfab9ea8558": {
"views": []
},
"24678e52a0334cb9a9a56f92c29750be": {
"views": []
},
"247820f6d83f4dd9b68f5df77dbda4b7": {
"views": []
},
"24b6a837fbd942c9a68218fb8910dcd5": {
"views": []
},
"24ee3204f26348bca5e6a264973e5b56": {
"views": []
},
"262c7bb5bd7447f791509571fe74ae44": {
"views": []
},
"263595f22d0d45e2a850854bcefe4731": {
"views": []
},
"2640720aa6684c5da6d7870abcbc950b": {
"views": []
},
"265ca1ec7ad742f096bb8104d0cf1550": {
"views": []
},
"26bf66fba453464fac2f5cd362655083": {
"views": []
},
"29769879478f49e8b4afd5c0b4662e87": {
"views": []
},
"29a13bd6bc8d486ca648bf30c9e4c2a6": {
"views": []
},
"29c5df6267584654b76205fc5559c553": {
"views": []
},
"29ce25045e7248e5892e8aafc635c416": {
"views": []
},
"2a17207c43c9424394299a7b52461794": {
"views": []
},
"2a777941580945bc83ddb0c817ed4122": {
"views": []
},
"2ae1844e2afe416183658d7a602e5963": {
"views": []
},
"2afa2938b41944cf8c14e41a431e3969": {
"views": []
},
"2bdc5f9b161548e3aab8ea392b5af1a1": {
"views": []
},
"2c26b2bcfc96473584930a4b622d268e": {
"views": []
},
"2ca2a914a5f940b18df0b5cde2b79e4b": {
"views": []
},
"2ca2c532840548a9968d1c6b2f0acdd8": {
"views": []
},
"2d17c32bfea143babe2b114d8777b15d": {
"views": []
},
"2d3acd8872c342eab3484302cac2cb05": {
"views": [
{
"cell_index": 27
}
]
},
"2dc514cc2f5547aeb97059a5070dc9e3": {
"views": []
},
"2e1351ad05384d058c90e594bc6143c1": {
"views": [
{
"cell_index": 27
}
]
},
"2e9b80fa18984615933e41c1c1db2171": {
"views": []
},
"2ef17ee6b7c74a4bbbbbe9b1a93e4fb6": {
"views": []
},
"2f5438f1b34046a597a467effd43df11": {
"views": [
{
"cell_index": 27
}
]
},
"2f8d22417f3e421f96027fca40e1554f": {
"views": []
},
"2fb0409cfb49469d89a32597dc3edba9": {
"views": []
},
"303ccef837984c97b7e71f2988c737a4": {
"views": []
},
"3058b0808dca48a0bba9a93682260491": {
"views": []
},
"306b65493c28411eb10ad786bbf85dc5": {
"views": []
},
"30f5d30cf2d84530b3199015c5ff00eb": {
"views": []
},
"310b1ac518bd4079bdb7ecaf523a6809": {
"views": []
},
"313eca81d9d24664bcc837db54d59618": {
"views": []
},
"31413caf78c14548baa61e3e3c9edc55": {
"views": []
},
"317fbd3cb6324b2fbdfd6aa46a8d1192": {
"views": []
},
"319425ba805346f5ba366c42e220f9c6": {
"views": [
{
"cell_index": 27
}
]
},
"31fc8165275e473f8f75c6215b5184ff": {
"views": []
},
"329f12edaa0c44d2a619450f188e8777": {
"views": []
},
"32edf057582f4a6ca30ce3cb685bf971": {
"views": []
},
"330e74773ba148e18674cfa3e63cd6cc": {
"views": []
},
"332a89c03bfb49c2bb291051d172b735": {
"views": [
{
"cell_index": 27
}
]
},
"3347dfda0aca450f89dd9b39ca1bec7d": {
"views": []
},
"336e8bcfd7cc4a85956674b0c7bffff2": {
"views": []
},
"3376228b3b614d4ab2a10b2fd0f484fd": {
"views": []
},
"3380a22bc67c4be99c61050800f93395": {
"views": []
},
"34b5c16cbea448809c2ccbce56f8d5a5": {
"views": []
},
"34bb050223504afc8053ce931103f52c": {
"views": []
},
"34c28187175d49198b536a1ab13668c4": {
"views": []
},
"3521f32644514ecf9a96ddfa5d80fb9b": {
"views": []
},
"36511bd77ed74f668053df749cc735d4": {
"views": []
},
"36541c3490bd4268b64daf20d8c24124": {
"views": []
},
"37aa1dd4d76a4bac98857b519b7b523a": {
"views": []
},
"37aa3cfa3f8f48989091ec46ac17ae48": {
"views": []
},
"386991b0b1424a9c816dac6a29e1206b": {
"views": []
},
"386cf43742234dda994e35b41890b4d8": {
"views": []
},
"388571e8e0314dfab8e935b7578ba7f9": {
"views": [
{
"cell_index": 27
}
]
},
"3974e38e718547efaf0445da2be6a739": {
"views": []
},
"398490e0cc004d22ac9c4486abec61e1": {
"views": []
},
"399875994aba4c53afa8c49fae8d369e": {
"views": []
},
"39b64aa04b1d4a81953e43def0ef6e10": {
"views": []
},
"39ffc3dd42d94a27ba7240d10c11b565": {
"views": []
},
"3a21291c8e7249e3b04417d31b0447cf": {
"views": [
{
"cell_index": 27
}
]
},
"3a377d9f46704d749c6879383c89f5d3": {
"views": []
},
"3a44a6f1f62742849e96d957033a0039": {
"views": []
},
"3b22d68709b046e09fe70f381a3944cd": {
"views": [
{
"cell_index": 27
}
]
},
"3b329209c8f547acae1925dc3eb4af77": {
"views": []
},
"3c1b2ec10a9041be8a3fad9da78ff9f6": {
"views": [
{
"cell_index": 27
}
]
},
"3c2be3c85c6d41268bb4f9d63a43e196": {
"views": []
},
"3c6796eff7c54238a7b7776e88721b08": {
"views": []
},
"3cbca3e11edf439fb7f8ba41693b4824": {
"views": []
},
"3d4b6b7c0b0c48ff8c4b8d78f58e0f1c": {
"views": []
},
"3de1faf0d2514f49a99b3d60ea211495": {
"views": []
},
"3df60d9ac82b42d9b885d895629e372e": {
"views": []
},
"3e5b9fd779574270bf58101002c152ce": {
"views": [
{
"cell_index": 27
}
]
},
"3e80f34623c94659bfab5b3b56072d9a": {
"views": []
},
"3e8bb05434cb4a0291383144e4523840": {
"views": [
{
"cell_index": 27
}
]
},
"3ea1c8e4f9b34161928260e1274ee048": {
"views": []
},
"3f32f0915bc6469aaaf7170eff1111e3": {
"views": []
},
"3fe69a26ae7a46fda78ae0cb519a0f8b": {
"views": []
},
"4000ecdd75d9467e9dffd457b35aa65f": {
"views": []
},
"402d346f8b68408faed2fd79395cf3fb": {
"views": []
},
"402f4116244242148fdc009bb399c3bd": {
"views": []
},
"4049e0d7c0d24668b7eae2bb7169376e": {
"views": []
},
"4088c9ed71b0467b9b9417d5b04eda0e": {
"views": []
},
"40d70faa07654b6cb13496c32ba274b3": {
"views": []
},
"4146be21b7614abe827976787ec570f1": {
"views": []
},
"4198c08edda440dd93d1f6ce3e4efa62": {
"views": []
},
"42023d7d3c264f9d933d4cee4362852b": {
"views": []
},
"421ad8c67f754ce2b24c4fa3a8e951cf": {
"views": []
},
"4263fe0cef42416f8d344c1672f591f9": {
"views": []
},
"428e42f04a1e4347a1f548379c68f91b": {
"views": [
{
"cell_index": 27
}
]
},
"42a47243baf34773943a25df9cf23854": {
"views": []
},
"4343b72c91d04a7c9a6080f30fc63d7d": {
"views": []
},
"43488264fc924c01a30fa58604074b07": {
"views": []
},
"4379175239b34553bf45c8ef9443ac55": {
"views": [
{
"cell_index": 27
}
]
},
"43859798809a4a289c58b4bd5e49d357": {
"views": []
},
"43ad406a61a34249b5622aba9450b23d": {
"views": []
},
"4421c121414d464bb3bf1b5f0e86c37b": {
"views": [
{
"cell_index": 27
}
]
},
"445cc08b4da44c2386ac9379793e3506": {
"views": []
},
"447cff7e256c434e859bb7ce9e5d71c8": {
"views": []
},
"44af7da9d8304f07890ef7d11a9f95fe": {
"views": []
},
"45021b6f05db4c028a3b5572bc85217f": {
"views": []
},
"457768a474844556bf9b215439a2f2e9": {
"views": []
},
"45d5689de53646fe9042f3ce9e281acc": {
"views": []
},
"461aa21d57824526a6b61e3f9b5af523": {
"views": []
},
"472ca253aab34b098f53ed4854d35f23": {
"views": []
},
"4731208453424514b471f862804d9bb8": {
"views": [
{
"cell_index": 27
}
]
},
"47dfef9eaf0e433cb4b3359575f39480": {
"views": []
},
"48220a877d494a3ea0cc9dae19783a13": {
"views": []
},
"4882c417949b4b6788a1c3ec208fb1ac": {
"views": []
},
"49f5c38281984e3bad67fe3ea3eb6470": {
"views": []
},
"4a0d39b43eee4e818d47d382d87d86d1": {
"views": []
},
"4a470bf3037047f48f4547b594ac65fa": {
"views": []
},
"4abab5bca8334dfbb0434be39eb550db": {
"views": []
},
"4b48e08fd383489faa72fc76921eac4e": {
"views": []
},
"4b9439e6445c4884bd1cde0e9fd2405e": {
"views": []
},
"4b9fa014f9904fcf9aceff00cc1ebf44": {
"views": []
},
"4bdc63256c3f4e31a8fa1d121f430518": {
"views": []
},
"4bebb097ddc64bbda2c475c3a0e92ab5": {
"views": []
},
"4c201df21ca34108a6e7b051aa58b7f6": {
"views": []
},
"4ced8c156fd941eca391016fc256ce40": {
"views": []
},
"4d281cda33fa489d86228370e627a5b0": {
"views": [
{
"cell_index": 27
}
]
},
"4d85e68205d94965bdb437e5441b10a1": {
"views": []
},
"4e0e6dd34ba7487ba2072d352fe91bf5": {
"views": []
},
"4e82b1d731dd419480e865494f932f80": {
"views": []
},
"4e9f52dea051415a83c4597c4f7a6c00": {
"views": []
},
"4ec035cba73647358d416615cf4096ee": {
"views": [