RL02
Category: Artificial Intelligence / Neural Networks / Deep Learning
Development tool: Python
File size: 5979KB
Downloads: 2
Upload date: 2018-12-01 22:25:17
Uploader: Haydi
Description: Reinforcement Learning Trained Agents for Robotics
File list:
deep-reinforcement-learning-master\cheatsheet\cheatsheet.pdf (176759, 2018-07-07)
deep-reinforcement-learning-master\cheatsheet\cheatsheet.tex (16413, 2018-07-07)
deep-reinforcement-learning-master\cheatsheet\LICENSE.txt (1057, 2018-07-07)
deep-reinforcement-learning-master\cheatsheet\udacity-logo.png (11689, 2018-07-07)
deep-reinforcement-learning-master\cross-entropy\CEM.ipynb (32486, 2018-07-07)
deep-reinforcement-learning-master\cross-entropy\checkpoint.pth (1034, 2018-07-07)
deep-reinforcement-learning-master\ddpg-bipedal\checkpoint_actor.pth (30493, 2018-07-07)
deep-reinforcement-learning-master\ddpg-bipedal\checkpoint_critic.pth (426220, 2018-07-07)
deep-reinforcement-learning-master\ddpg-bipedal\DDPG.ipynb (31407, 2018-07-07)
deep-reinforcement-learning-master\ddpg-bipedal\ddpg_agent.py (7813, 2018-07-07)
deep-reinforcement-learning-master\ddpg-bipedal\model.py (2708, 2018-07-07)
deep-reinforcement-learning-master\ddpg-pendulum\checkpoint_actor.pth (489825, 2018-07-07)
deep-reinforcement-learning-master\ddpg-pendulum\checkpoint_critic.pth (491028, 2018-07-07)
deep-reinforcement-learning-master\ddpg-pendulum\DDPG.ipynb (33617, 2018-07-07)
deep-reinforcement-learning-master\ddpg-pendulum\ddpg_agent.py (7800, 2018-07-07)
deep-reinforcement-learning-master\ddpg-pendulum\model.py (2693, 2018-07-07)
deep-reinforcement-learning-master\discretization\Discretization.ipynb (24445, 2018-07-07)
deep-reinforcement-learning-master\discretization\Discretization_Solution.ipynb (338873, 2018-07-07)
deep-reinforcement-learning-master\dqn\exercise\Deep_Q_Network.ipynb (7901, 2018-07-07)
deep-reinforcement-learning-master\dqn\exercise\dqn_agent.py (5773, 2018-07-07)
deep-reinforcement-learning-master\dqn\exercise\model.py (647, 2018-07-07)
deep-reinforcement-learning-master\dqn\solution\checkpoint.pth (20993, 2018-07-07)
deep-reinforcement-learning-master\dqn\solution\Deep_Q_Network_Solution.ipynb (30260, 2018-07-07)
deep-reinforcement-learning-master\dqn\solution\dqn_agent.py (6282, 2018-07-07)
deep-reinforcement-learning-master\dqn\solution\model.py (1015, 2018-07-07)
deep-reinforcement-learning-master\dynamic-programming\.idea\dynamic-programming.iml (488, 2018-07-30)
deep-reinforcement-learning-master\dynamic-programming\.idea\misc.xml (185, 2018-07-30)
deep-reinforcement-learning-master\dynamic-programming\.idea\modules.xml (290, 2018-07-30)
deep-reinforcement-learning-master\dynamic-programming\.idea\workspace.xml (7043, 2018-07-30)
deep-reinforcement-learning-master\dynamic-programming\.ipynb_checkpoints\Dynamic_Programming_Solution-checkpoint.ipynb (107804, 2018-07-30)
deep-reinforcement-learning-master\dynamic-programming\check_test.py (3441, 2018-07-07)
deep-reinforcement-learning-master\dynamic-programming\Dynamic_Programming.ipynb (24749, 2018-07-30)
deep-reinforcement-learning-master\dynamic-programming\Dynamic_Programming_Solution.ipynb (107806, 2018-07-30)
... ...
[//]: # (Image References)
[image1]: https://user-images.githubusercontent.com/10624937/42135602-b0335606-7d12-11e8-8689-dd1cf9fa11a9.gif "Trained Agents"
[image2]: https://user-images.githubusercontent.com/10624937/42386929-76f671f0-8106-11e8-9376-f17da2ae852e.png "Kernel"
# Deep Reinforcement Learning Nanodegree
![Trained Agents][image1]
This repository contains material related to Udacity's [Deep Reinforcement Learning Nanodegree](https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893) program.
## Table of Contents
### Tutorials
The tutorials lead you through implementing various algorithms in reinforcement learning. All of the code is in PyTorch (v0.4) and Python 3.
* [Dynamic Programming](https://github.com/udacity/deep-reinforcement-learning/tree/master/dynamic-programming): Implement Dynamic Programming algorithms such as Policy Evaluation, Policy Improvement, Policy Iteration, and Value Iteration.
* [Monte Carlo](https://github.com/udacity/deep-reinforcement-learning/tree/master/monte-carlo): Implement Monte Carlo methods for prediction and control.
* [Temporal-Difference](https://github.com/udacity/deep-reinforcement-learning/tree/master/temporal-difference): Implement Temporal-Difference methods such as Sarsa, Q-Learning, and Expected Sarsa.
* [Discretization](https://github.com/udacity/deep-reinforcement-learning/tree/master/discretization): Learn how to discretize continuous state spaces, and solve the Mountain Car environment.
* [Tile Coding](https://github.com/udacity/deep-reinforcement-learning/tree/master/tile-coding): Implement a method for discretizing continuous state spaces that enables better generalization.
* [Deep Q-Network](https://github.com/udacity/deep-reinforcement-learning/tree/master/dqn): Explore how to use a Deep Q-Network (DQN) to navigate a space vehicle without crashing.
* [Robotics](https://github.com/dusty-nv/jetson-reinforcement): Use a C++ API to train reinforcement learning agents from virtual robotic simulation in 3D. (_External link_)
* [Hill Climbing](https://github.com/udacity/deep-reinforcement-learning/tree/master/hill-climbing): Use hill climbing with adaptive noise scaling to balance a pole on a moving cart.
* [Cross-Entropy Method](https://github.com/udacity/deep-reinforcement-learning/tree/master/cross-entropy): Use the cross-entropy method to train a car to navigate a steep hill.
* [REINFORCE](https://github.com/udacity/deep-reinforcement-learning/tree/master/reinforce): Learn how to use Monte Carlo Policy Gradients to solve a classic control task.
* **Proximal Policy Optimization**: Explore how to use Proximal Policy Optimization (PPO) to solve a classic reinforcement learning task. (_Coming soon!_)
* **Deep Deterministic Policy Gradients**: Explore how to use Deep Deterministic Policy Gradients (DDPG) with OpenAI Gym environments.
    * [Pendulum](https://github.com/udacity/deep-reinforcement-learning/tree/master/ddpg-pendulum): Use OpenAI Gym's Pendulum environment.
    * [BipedalWalker](https://github.com/udacity/deep-reinforcement-learning/tree/master/ddpg-bipedal): Use OpenAI Gym's BipedalWalker environment.
* **Finance**: Train an agent to discover optimal trading strategies. (_Coming soon!_)
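
As an illustration of the idea behind the Discretization tutorial listed above, here is a minimal sketch of uniform-grid discretization using only NumPy. It is not taken from the tutorial notebook: the function names `create_uniform_grid` and `discretize`, and the state bounds used in the example, are assumptions chosen for illustration.

```python
import numpy as np

def create_uniform_grid(low, high, bins):
    """Return, for each dimension, the interior split points of an evenly spaced grid."""
    return [np.linspace(lo, hi, n + 1)[1:-1] for lo, hi, n in zip(low, high, bins)]

def discretize(sample, grid):
    """Map a continuous sample to a tuple of integer bin indices, one per dimension."""
    return tuple(int(np.digitize(x, splits)) for x, splits in zip(sample, grid))

# Hypothetical bounds for a 2-D (position, velocity) state; not read from any environment.
low, high = [-1.2, -0.07], [0.6, 0.07]
grid = create_uniform_grid(low, high, bins=(10, 10))

# The resulting tuple can be used directly as an index into a Q-table of shape (10, 10, n_actions).
state = discretize([-0.5, 0.03], grid)
print(state)
```

Samples that fall outside the bounds are clipped into the first or last bin, so every observation maps to a valid table index.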
### Labs / Projects
The labs and projects can be found below. All of the projects use rich simulation environments from [Unity ML-Agents](https://github.com/Unity-Technologies/ml-agents). In the [Deep Reinforcement Learning Nanodegree](https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893) program, the projects are reviewed by Udacity experts. These reviews are meant to give you personalized feedback and to tell you what can be improved in your code.
* [The Taxi Problem](https://github.com/udacity/deep-reinforcement-learning/tree/master/lab-taxi): In this lab, you will train a taxi to pick up and drop off passengers.
* [Navigation](https://github.com/udacity/deep-reinforcement-learning/tree/master/p1_navigation): In the first project, you will train an agent to collect yellow bananas while avoiding blue bananas.
* **Continuous Control**: In the second project, you will train a robotic arm to reach target locations. (_Coming soon!_)
* **Collaboration and Competition**: In the third project, you will train a pair of agents to play tennis! (_Coming soon!_)
### Resources
* [Cheatsheet](https://github.com/udacity/deep-reinforcement-learning/blob/master/cheatsheet): You are encouraged to use [this PDF file](https://github.com/udacity/deep-reinforcement-learning/blob/master/cheatsheet/cheatsheet.pdf) to guide your study of reinforcement learning.
## OpenAI Gym Benchmarks
### Classic Control
- `Acrobot-v1` with [Tile Coding](https://github.com/udacity/deep-reinforcement-learning/blob/master/tile-coding/Tile_Coding_Solution.ipynb) and Q-Learning
- `CartPole-v0` with [Hill Climbing](https://github.com/udacity/deep-reinforcement-learning/blob/master/hill-climbing/Hill_Climbing.ipynb) | solved in 13 episodes
- `CartPole-v0` with [REINFORCE](https://github.com/udacity/deep-reinforcement-learning/blob/master/reinforce/REINFORCE.ipynb) | solved in 691 episodes
- `MountainCarContinuous-v0` with [Cross-Entropy Method](https://github.com/udacity/deep-reinforcement-learning/blob/master/cross-entropy/CEM.ipynb) | solved in 47 iterations
- `MountainCar-v0` with [Uniform-Grid Discretization](https://github.com/udacity/deep-reinforcement-learning/blob/master/discretization/Discretization_Solution.ipynb) and Q-Learning | solved in <50000 episodes
- `Pendulum-v0` with [Deep Deterministic Policy Gradients (DDPG)](https://github.com/udacity/deep-reinforcement-learning/blob/master/ddpg-pendulum/DDPG.ipynb)
### Box2d
- `BipedalWalker-v2` with [Deep Deterministic Policy Gradients (DDPG)](https://github.com/udacity/deep-reinforcement-learning/blob/master/ddpg-bipedal/DDPG.ipynb)
- `CarRacing-v0` with **Deep Q-Networks (DQN)** | _Coming soon!_
- `LunarLander-v2` with [Deep Q-Networks (DQN)](https://github.com/udacity/deep-reinforcement-learning/blob/master/dqn/solution/Deep_Q_Network_Solution.ipynb) | solved in 1504 episodes
### Toy Text
- `FrozenLake-v0` with [Dynamic Programming](https://github.com/udacity/deep-reinforcement-learning/blob/master/dynamic-programming/Dynamic_Programming_Solution.ipynb)
- `Blackjack-v0` with [Monte Carlo Methods](https://github.com/udacity/deep-reinforcement-learning/blob/master/monte-carlo/Monte_Carlo_Solution.ipynb)
- `CliffWalking-v0` with [Temporal-Difference Methods](https://github.com/udacity/deep-reinforcement-learning/blob/master/temporal-difference/Temporal_Difference_Solution.ipynb)
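
The "solved in N episodes" figures above depend on how "solved" is defined. A common convention, assumed here rather than confirmed by this README, is that an environment counts as solved once the average score over a window of consecutive episodes reaches a threshold. The sketch below shows one way to compute that; `first_solved_episode` is a hypothetical helper, and the threshold, window size, and scores are illustrative values, not results from this repository.

```python
from collections import deque

def first_solved_episode(scores, threshold, window=100):
    """Return the 1-based episode at which the mean of the last `window`
    scores first reaches `threshold`, or None if it never does."""
    recent = deque(maxlen=window)  # keeps only the most recent `window` scores
    for episode, score in enumerate(scores, start=1):
        recent.append(score)
        if len(recent) == window and sum(recent) / window >= threshold:
            return episode
    return None

# Synthetic scores for illustration only: 50 poor episodes followed by 150 good ones.
scores = [0.0] * 50 + [200.0] * 150
print(first_solved_episode(scores, threshold=195.0))
```

Each environment defines its own threshold, and reports differ on whether the window itself is counted in the episode total, so episode counts are only comparable when the same criterion is used.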
## Dependencies
To set up your Python environment to run the code in this repository, follow the instructions below.
1. Create (and activate) a new environment with Python 3.6.
    - __Linux__ or __Mac__:
    ```bash
    conda create --name drlnd python=3.6
    source activate drlnd
    ```
    - __Windows__:
    ```bash
    conda create --name drlnd python=3.6
    activate drlnd
    ```
2. Follow the instructions in [this repository](https://github.com/openai/gym) to perform a minimal install of OpenAI Gym.
    - Next, install the **classic control** environment group by following the instructions [here](https://github.com/openai/gym#classic-control).
    - Then, install the **box2d** environment group by following the instructions [here](https://github.com/openai/gym#box2d).
3. Clone the repository (if you haven't already!), and navigate to the `python/` folder. Then, install several dependencies.
    ```bash
    git clone https://github.com/udacity/deep-reinforcement-learning.git
    cd deep-reinforcement-learning/python
    pip install .
    ```
4. Create an [IPython kernel](http://ipython.readthedocs.io/en/stable/install/kernel_install.html) for the `drlnd` environment.
    ```bash
    python -m ipykernel install --user --name drlnd --display-name "drlnd"
    ```
5. Before running code in a notebook, change the kernel to match the `drlnd` environment by using the drop-down `Kernel` menu.
![Kernel][image2]
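
Once the steps above are complete, a quick sanity check can confirm that the kernel sees the expected interpreter and packages. The following is a minimal sketch, not part of the repository: `check_environment` is a hypothetical helper, and the package list (`numpy`, `gym`, `torch`) is an assumption based on the tools this README mentions.

```python
import importlib.util
import sys

def check_environment(packages=("numpy", "gym", "torch")):
    """Report the Python version and whether each package can be located."""
    found = {name: importlib.util.find_spec(name) is not None for name in packages}
    return {"python": sys.version_info[:2], "packages": found}

report = check_environment()
print("Python %d.%d" % report["python"])
for name, ok in report["packages"].items():
    print(f"{name}: {'found' if ok else 'MISSING'}")
```

Running this in a notebook cell with the `drlnd` kernel selected should report Python 3.6 if the environment was created as described; a missing package usually means the install ran in a different environment.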
## Want to learn more?
Come learn with us in the Deep Reinforcement Learning Nanodegree program at Udacity!