RL01

Category: Artificial Intelligence / Neural Networks / Deep Learning
Development tool: Python
File size: 4233 KB
Downloads: 3
Upload date: 2018-12-01 22:23:09
Uploader: Haydi

Description: This repository provides code, exercises and solutions for popular Reinforcement Learning algorithms.

File list:

reinforcement-learning-master\DP\.ipynb_checkpoints\Gamblers Problem Solution-checkpoint.ipynb (37704, 2018-07-27)
reinforcement-learning-master\DP\.ipynb_checkpoints\Policy Evaluation Solution-checkpoint.ipynb (5434, 2018-07-27)
reinforcement-learning-master\DP\.ipynb_checkpoints\Policy Iteration Solution-checkpoint.ipynb (8821, 2018-07-27)
reinforcement-learning-master\DP\.ipynb_checkpoints\Value Iteration Solution-checkpoint.ipynb (6610, 2018-07-27)
reinforcement-learning-master\DP\.ipynb_checkpoints\Value Iteration-checkpoint.ipynb (9329, 2018-07-27)
reinforcement-learning-master\DP\Gamblers Problem Solution.ipynb (37704, 2018-07-27)
reinforcement-learning-master\DP\Gamblers Problem.ipynb (4402, 2018-05-28)
reinforcement-learning-master\DP\Policy Evaluation Solution.ipynb (5434, 2018-07-27)
reinforcement-learning-master\DP\Policy Evaluation.ipynb (7611, 2018-05-28)
reinforcement-learning-master\DP\Policy Iteration Solution.ipynb (8821, 2018-07-27)
reinforcement-learning-master\DP\Policy Iteration.ipynb (11307, 2018-05-28)
reinforcement-learning-master\DP\Value Iteration Solution.ipynb (6610, 2018-07-27)
reinforcement-learning-master\DP\Value Iteration.ipynb (10424, 2018-07-27)
reinforcement-learning-master\DQN\.ipynb_checkpoints\Breakout Playground-checkpoint.ipynb (21302, 2018-07-29)
reinforcement-learning-master\DQN\.ipynb_checkpoints\Deep Q Learning Solution-checkpoint.ipynb (24501, 2018-07-29)
reinforcement-learning-master\DQN\.ipynb_checkpoints\Double DQN Solution-checkpoint.ipynb (22213, 2018-07-29)
reinforcement-learning-master\DQN\Breakout Playground.ipynb (21397, 2018-07-30)
reinforcement-learning-master\DQN\Deep Q Learning Solution.ipynb (24501, 2018-07-30)
reinforcement-learning-master\DQN\Deep Q Learning.ipynb (20968, 2018-05-28)
reinforcement-learning-master\DQN\Double DQN Solution.ipynb (22213, 2018-07-30)
reinforcement-learning-master\DQN\dqn.py (16648, 2018-05-28)
reinforcement-learning-master\FA\.ipynb_checkpoints\MountainCar Playground-checkpoint.ipynb (30565, 2018-07-29)
reinforcement-learning-master\FA\.ipynb_checkpoints\Q-Learning with Value Function Approximation Solution-checkpoint.ipynb (192533, 2018-07-29)
reinforcement-learning-master\FA\MountainCar Playground.ipynb (22087, 2018-07-30)
reinforcement-learning-master\FA\Q-Learning with Value Function Approximation Solution.ipynb (192533, 2018-07-30)
reinforcement-learning-master\FA\Q-Learning with Value Function Approximation.ipynb (131946, 2018-05-28)
reinforcement-learning-master\lib\atari\helpers.py (829, 2018-05-28)
reinforcement-learning-master\lib\atari\state_processor.py (1077, 2018-05-28)
reinforcement-learning-master\lib\atari\__init__.py (1, 2018-05-28)
reinforcement-learning-master\lib\envs\blackjack.py (4251, 2018-05-28)
reinforcement-learning-master\lib\envs\cliff_walking.py (2685, 2018-05-28)
reinforcement-learning-master\lib\envs\gridworld.py (3488, 2018-05-28)
reinforcement-learning-master\lib\envs\windy_gridworld.py (2594, 2018-05-28)
reinforcement-learning-master\lib\envs\__init__.py (0, 2018-05-28)
... ...

### Overview

This repository provides code, exercises and solutions for popular Reinforcement Learning algorithms. These are meant to serve as a learning tool to complement the theoretical materials from

- [Reinforcement Learning: An Introduction (2nd Edition)](http://incompleteideas.net/book/bookdraft2018jan1.pdf)
- [David Silver's Reinforcement Learning Course](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html)

Each folder corresponds to one or more chapters of the above textbook and/or course. In addition to exercises and solutions, each folder also contains a list of learning goals, a brief concept summary, and links to the relevant readings.

All code is written in Python 3 and uses RL environments from [OpenAI Gym](https://gym.openai.com/). Advanced techniques use [TensorFlow](https://www.tensorflow.org/) for neural network implementations.

### Table of Contents

- [Introduction to RL problems & OpenAI Gym](Introduction/)
- [MDPs and Bellman Equations](MDP/)
- [Dynamic Programming: Model-Based RL, Policy Iteration and Value Iteration](DP/)
- [Monte Carlo Model-Free Prediction & Control](MC/)
- [Temporal Difference Model-Free Prediction & Control](TD/)
- [Function Approximation](FA/)
- [Deep Q Learning](DQN/) (WIP)
- [Policy Gradient Methods](PolicyGradient/) (WIP)
- Learning and Planning (WIP)
- Exploration and Exploitation (WIP)

### List of Implemented Algorithms

- [Dynamic Programming Policy Evaluation](DP/Policy%20Evaluation%20Solution.ipynb)
- [Dynamic Programming Policy Iteration](DP/Policy%20Iteration%20Solution.ipynb)
- [Dynamic Programming Value Iteration](DP/Value%20Iteration%20Solution.ipynb)
- [Monte Carlo Prediction](MC/MC%20Prediction%20Solution.ipynb)
- [Monte Carlo Control with Epsilon-Greedy Policies](MC/MC%20Control%20with%20Epsilon-Greedy%20Policies%20Solution.ipynb)
- [Monte Carlo Off-Policy Control with Importance Sampling](MC/Off-Policy%20MC%20Control%20with%20Weighted%20Importance%20Sampling%20Solution.ipynb)
- [SARSA (On Policy TD Learning)](TD/SARSA%20Solution.ipynb)
- [Q-Learning (Off Policy TD Learning)](TD/Q-Learning%20Solution.ipynb)
- [Q-Learning with Linear Function Approximation](FA/Q-Learning%20with%20Value%20Function%20Approximation%20Solution.ipynb)
- [Deep Q-Learning for Atari Games](DQN/Deep%20Q%20Learning%20Solution.ipynb)
- [Double Deep-Q Learning for Atari Games](DQN/Double%20DQN%20Solution.ipynb)
- Deep Q-Learning with Prioritized Experience Replay (WIP)
- [Policy Gradient: REINFORCE with Baseline](PolicyGradient/CliffWalk%20REINFORCE%20with%20Baseline%20Solution.ipynb)
- [Policy Gradient: Actor Critic with Baseline](PolicyGradient/CliffWalk%20Actor%20Critic%20Solution.ipynb)
- [Policy Gradient: Actor Critic with Baseline for Continuous Action Spaces](PolicyGradient/Continuous%20MountainCar%20Actor%20Critic%20Solution.ipynb)
- Deterministic Policy Gradients for Continuous Action Spaces (WIP)
- Deep Deterministic Policy Gradients (DDPG) (WIP)
- [Asynchronous Advantage Actor Critic (A3C)](PolicyGradient/a3c)

### Resources

Textbooks:

- [Reinforcement Learning: An Introduction (2nd Edition)](http://incompleteideas.net/book/bookdraft2018jan1.pdf)

Classes:

- [David Silver's Reinforcement Learning Course (UCL, 2015)](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html)
- [CS294 - Deep Reinforcement Learning (Berkeley, Fall 2015)](http://rll.berkeley.edu/deeprlcourse/)
- [CS 8803 - Reinforcement Learning (Georgia Tech)](https://www.udacity.com/course/reinforcement-learning--ud600)

Talks/Tutorials:

- [Introduction to Reinforcement Learning (Joelle Pineau @ Deep Learning Summer School 2016)](http://videolectures.net/deeplearning2016_pineau_reinforcement_learning/)
- [Deep Reinforcement Learning (Pieter Abbeel @ Deep Learning Summer School 2016)](http://videolectures.net/deeplearning2016_abbeel_deep_reinforcement/)
- [Deep Reinforcement Learning ICML 2016 Tutorial (David Silver)](http://techtalks.tv/talks/deep-reinforcement-learning/62360/)
- [Tutorial: Introduction to Reinforcement Learning with Function Approximation](https://www.youtube.com/watch?v=ggqnxyjaKe4)
- [John Schulman - Deep Reinforcement Learning (4 Lectures)](https://www.youtube.com/playlist?list=PLjKEIQlKCTZYN3CYBlj8r58SbNorobqcp)
- [Deep Reinforcement Learning Slides @ NIPS 2016](http://people.eecs.berkeley.edu/~pabbeel/nips-tutorial-policy-optimization-Schulman-Abbeel.pdf)

Other Projects:

- [carpedm20/deep-rl-tensorflow](https://github.com/carpedm20/deep-rl-tensorflow)
- [matthiasplappert/keras-rl](https://github.com/matthiasplappert/keras-rl)

Selected Papers:

- [Human-Level Control through Deep Reinforcement Learning (2015-02)](http://www.readcube.com/articles/10.1038/nature14236)
- [Deep Reinforcement Learning with Double Q-learning (2015-09)](http://arxiv.org/abs/1509.0***61)
- [Continuous control with deep reinforcement learning (2015-09)](https://arxiv.org/abs/1509.02971)
- [Prioritized Experience Replay (2015-11)](http://arxiv.org/abs/1511.05952)
- [Dueling Network Architectures for Deep Reinforcement Learning (2015-11)](http://arxiv.org/abs/1511.06581)
- [Asynchronous Methods for Deep Reinforcement Learning (2016-02)](http://arxiv.org/abs/1602.01783)
- [Deep Reinforcement Learning from Self-Play in Imperfect-Information Games (2016-03)](http://arxiv.org/abs/1603.01121)
- [Mastering the game of Go with deep neural networks and tree search](https://gogameguru.com/i/2016/03/deepmind-mastering-go.pdf)
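
To give a flavor of the algorithms listed under "List of Implemented Algorithms", here is a minimal sketch of dynamic-programming value iteration on a small deterministic gridworld. It is not taken from the repository: the function names `build_gridworld` and `value_iteration`, the 4x4 layout with terminal states in two opposite corners, and the reward of -1 per step are all assumptions made for illustration, and the sketch uses only the standard library rather than OpenAI Gym.

```python
def build_gridworld(size=4):
    """Build a hypothetical deterministic gridworld model.

    Returns P where P[s][a] = (next_state, reward, done).
    States are numbered row by row; the first and last states are terminal.
    """
    n_states = size * size
    terminals = {0, n_states - 1}
    moves = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left
    P = {}
    for s in range(n_states):
        row, col = divmod(s, size)
        P[s] = {}
        for a, (dr, dc) in enumerate(moves):
            if s in terminals:
                # Terminal states absorb: no movement, no further reward.
                P[s][a] = (s, 0.0, True)
                continue
            # Moves that would leave the grid keep the agent in place.
            r2 = min(max(row + dr, 0), size - 1)
            c2 = min(max(col + dc, 0), size - 1)
            s2 = r2 * size + c2
            P[s][a] = (s2, -1.0, s2 in terminals)
    return P


def value_iteration(P, gamma=1.0, theta=1e-8):
    """Apply the Bellman optimality backup until values change by < theta."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # Deterministic model, so the backup is a max over actions.
            best = max(r + gamma * V[s2] for (s2, r, _done) in P[s].values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    # Greedy policy with respect to the converged value function.
    policy = {
        s: max(P[s], key=lambda a: P[s][a][1] + gamma * V[P[s][a][0]])
        for s in P
    }
    return V, policy


if __name__ == "__main__":
    size = 4
    V, policy = value_iteration(build_gridworld(size))
    for row in range(size):
        print(" ".join(f"{V[row * size + col]:5.1f}" for col in range(size)))
```

With a discount factor of 1 and a reward of -1 per step, each converged state value is the negated number of steps to the nearest terminal corner, which makes the result easy to check by hand.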
