RL02

Category: Artificial Intelligence / Neural Networks / Deep Learning
Development tool: Python
File size: 5979KB
Downloads: 2
Upload date: 2018-12-01 22:25:17
Uploader: Haydi
Description: Reinforcement Learning Trained Agents for Robotics

File list:
deep-reinforcement-learning-master\cheatsheet\cheatsheet.pdf (176759, 2018-07-07)
deep-reinforcement-learning-master\cheatsheet\cheatsheet.tex (16413, 2018-07-07)
deep-reinforcement-learning-master\cheatsheet\LICENSE.txt (1057, 2018-07-07)
deep-reinforcement-learning-master\cheatsheet\udacity-logo.png (11689, 2018-07-07)
deep-reinforcement-learning-master\cross-entropy\CEM.ipynb (32486, 2018-07-07)
deep-reinforcement-learning-master\cross-entropy\checkpoint.pth (1034, 2018-07-07)
deep-reinforcement-learning-master\ddpg-bipedal\checkpoint_actor.pth (30493, 2018-07-07)
deep-reinforcement-learning-master\ddpg-bipedal\checkpoint_critic.pth (426220, 2018-07-07)
deep-reinforcement-learning-master\ddpg-bipedal\DDPG.ipynb (31407, 2018-07-07)
deep-reinforcement-learning-master\ddpg-bipedal\ddpg_agent.py (7813, 2018-07-07)
deep-reinforcement-learning-master\ddpg-bipedal\model.py (2708, 2018-07-07)
deep-reinforcement-learning-master\ddpg-pendulum\checkpoint_actor.pth (489825, 2018-07-07)
deep-reinforcement-learning-master\ddpg-pendulum\checkpoint_critic.pth (491028, 2018-07-07)
deep-reinforcement-learning-master\ddpg-pendulum\DDPG.ipynb (33617, 2018-07-07)
deep-reinforcement-learning-master\ddpg-pendulum\ddpg_agent.py (7800, 2018-07-07)
deep-reinforcement-learning-master\ddpg-pendulum\model.py (2693, 2018-07-07)
deep-reinforcement-learning-master\discretization\Discretization.ipynb (24445, 2018-07-07)
deep-reinforcement-learning-master\discretization\Discretization_Solution.ipynb (338873, 2018-07-07)
deep-reinforcement-learning-master\dqn\exercise\Deep_Q_Network.ipynb (7901, 2018-07-07)
deep-reinforcement-learning-master\dqn\exercise\dqn_agent.py (5773, 2018-07-07)
deep-reinforcement-learning-master\dqn\exercise\model.py (647, 2018-07-07)
deep-reinforcement-learning-master\dqn\solution\checkpoint.pth (20993, 2018-07-07)
deep-reinforcement-learning-master\dqn\solution\Deep_Q_Network_Solution.ipynb (30260, 2018-07-07)
deep-reinforcement-learning-master\dqn\solution\dqn_agent.py (6282, 2018-07-07)
deep-reinforcement-learning-master\dqn\solution\model.py (1015, 2018-07-07)
deep-reinforcement-learning-master\dynamic-programming\.idea\dynamic-programming.iml (488, 2018-07-30)
deep-reinforcement-learning-master\dynamic-programming\.idea\misc.xml (185, 2018-07-30)
deep-reinforcement-learning-master\dynamic-programming\.idea\modules.xml (290, 2018-07-30)
deep-reinforcement-learning-master\dynamic-programming\.idea\workspace.xml (7043, 2018-07-30)
deep-reinforcement-learning-master\dynamic-programming\.ipynb_checkpoints\Dynamic_Programming_Solution-checkpoint.ipynb (107804, 2018-07-30)
deep-reinforcement-learning-master\dynamic-programming\check_test.py (3441, 2018-07-07)
deep-reinforcement-learning-master\dynamic-programming\Dynamic_Programming.ipynb (24749, 2018-07-30)
deep-reinforcement-learning-master\dynamic-programming\Dynamic_Programming_Solution.ipynb (107806, 2018-07-30)
... ...

[//]: # (Image References)

[image1]: https://user-images.githubusercontent.com/10624937/42135602-b0335606-7d12-11e8-8689-dd1cf9fa11a9.gif "Trained Agents"
[image2]: https://user-images.githubusercontent.com/10624937/42386929-76f671f0-8106-11e8-9376-f17da2ae852e.png "Kernel"

# Deep Reinforcement Learning Nanodegree

![Trained Agents][image1]

This repository contains material related to Udacity's [Deep Reinforcement Learning Nanodegree](https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893) program.

## Table of Contents

### Tutorials

The tutorials lead you through implementing various algorithms in reinforcement learning. All of the code is in PyTorch (v0.4) and Python 3.

* [Dynamic Programming](https://github.com/udacity/deep-reinforcement-learning/tree/master/dynamic-programming): Implement Dynamic Programming algorithms such as Policy Evaluation, Policy Improvement, Policy Iteration, and Value Iteration.
* [Monte Carlo](https://github.com/udacity/deep-reinforcement-learning/tree/master/monte-carlo): Implement Monte Carlo methods for prediction and control.
* [Temporal-Difference](https://github.com/udacity/deep-reinforcement-learning/tree/master/temporal-difference): Implement Temporal-Difference methods such as Sarsa, Q-Learning, and Expected Sarsa.
* [Discretization](https://github.com/udacity/deep-reinforcement-learning/tree/master/discretization): Learn how to discretize continuous state spaces, and solve the Mountain Car environment.
* [Tile Coding](https://github.com/udacity/deep-reinforcement-learning/tree/master/tile-coding): Implement a method for discretizing continuous state spaces that enables better generalization.
* [Deep Q-Network](https://github.com/udacity/deep-reinforcement-learning/tree/master/dqn): Explore how to use a Deep Q-Network (DQN) to navigate a space vehicle without crashing.
* [Robotics](https://github.com/dusty-nv/jetson-reinforcement): Use a C++ API to train reinforcement learning agents from virtual robotic simulation in 3D. (_External link_)
* [Hill Climbing](https://github.com/udacity/deep-reinforcement-learning/tree/master/hill-climbing): Use hill climbing with adaptive noise scaling to balance a pole on a moving cart.
* [Cross-Entropy Method](https://github.com/udacity/deep-reinforcement-learning/tree/master/cross-entropy): Use the cross-entropy method to train a car to navigate a steep hill.
* [REINFORCE](https://github.com/udacity/deep-reinforcement-learning/tree/master/reinforce): Learn how to use Monte Carlo Policy Gradients to solve a classic control task.
* **Proximal Policy Optimization**: Explore how to use Proximal Policy Optimization (PPO) to solve a classic reinforcement learning task. (_Coming soon!_)
* **Deep Deterministic Policy Gradients**: Explore how to use Deep Deterministic Policy Gradients (DDPG) with OpenAI Gym environments.
    * [Pendulum](https://github.com/udacity/deep-reinforcement-learning/tree/master/ddpg-pendulum): Use OpenAI Gym's Pendulum environment.
    * [BipedalWalker](https://github.com/udacity/deep-reinforcement-learning/tree/master/ddpg-bipedal): Use OpenAI Gym's BipedalWalker environment.
* **Finance**: Train an agent to discover optimal trading strategies. (_Coming soon!_)

### Labs / Projects

The labs and projects can be found below. All of the projects use rich simulation environments from [Unity ML-Agents](https://github.com/Unity-Technologies/ml-agents). In the [Deep Reinforcement Learning Nanodegree](https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893) program, the projects are reviewed by Udacity experts. These reviews are meant to give you personalized feedback and to tell you what can be improved in your code.

* [The Taxi Problem](https://github.com/udacity/deep-reinforcement-learning/tree/master/lab-taxi): In this lab, you will train a taxi to pick up and drop off passengers.
* [Navigation](https://github.com/udacity/deep-reinforcement-learning/tree/master/p1_navigation): In the first project, you will train an agent to collect yellow bananas while avoiding blue bananas.
* **Continuous Control**: In the second project, you will train a robotic arm to reach target locations. (_Coming soon!_)
* **Collaboration and Competition**: In the third project, you will train a pair of agents to play tennis! (_Coming soon!_)

### Resources

* [Cheatsheet](https://github.com/udacity/deep-reinforcement-learning/blob/master/cheatsheet): You are encouraged to use [this PDF file](https://github.com/udacity/deep-reinforcement-learning/blob/master/cheatsheet/cheatsheet.pdf) to guide your study of reinforcement learning.

## OpenAI Gym Benchmarks

### Classic Control

- `Acrobot-v1` with [Tile Coding](https://github.com/udacity/deep-reinforcement-learning/blob/master/tile-coding/Tile_Coding_Solution.ipynb) and Q-Learning
- `CartPole-v0` with [Hill Climbing](https://github.com/udacity/deep-reinforcement-learning/blob/master/hill-climbing/Hill_Climbing.ipynb) | solved in 13 episodes
- `CartPole-v0` with [REINFORCE](https://github.com/udacity/deep-reinforcement-learning/blob/master/reinforce/REINFORCE.ipynb) | solved in 691 episodes
- `MountainCarContinuous-v0` with [Cross-Entropy Method](https://github.com/udacity/deep-reinforcement-learning/blob/master/cross-entropy/CEM.ipynb) | solved in 47 iterations
- `MountainCar-v0` with [Uniform-Grid Discretization](https://github.com/udacity/deep-reinforcement-learning/blob/master/discretization/Discretization_Solution.ipynb) and Q-Learning | solved in <50000 episodes
- `Pendulum-v0` with [Deep Deterministic Policy Gradients (DDPG)](https://github.com/udacity/deep-reinforcement-learning/blob/master/ddpg-pendulum/DDPG.ipynb)

### Box2d

- `BipedalWalker-v2` with [Deep Deterministic Policy Gradients (DDPG)](https://github.com/udacity/deep-reinforcement-learning/blob/master/ddpg-bipedal/DDPG.ipynb)
- `CarRacing-v0` with **Deep Q-Networks (DQN)** | _Coming soon!_
- `LunarLander-v2` with [Deep Q-Networks (DQN)](https://github.com/udacity/deep-reinforcement-learning/blob/master/dqn/solution/Deep_Q_Network_Solution.ipynb) | solved in 1504 episodes

### Toy Text

- `FrozenLake-v0` with [Dynamic Programming](https://github.com/udacity/deep-reinforcement-learning/blob/master/dynamic-programming/Dynamic_Programming_Solution.ipynb)
- `Blackjack-v0` with [Monte Carlo Methods](https://github.com/udacity/deep-reinforcement-learning/blob/master/monte-carlo/Monte_Carlo_Solution.ipynb)
- `CliffWalking-v0` with [Temporal-Difference Methods](https://github.com/udacity/deep-reinforcement-learning/blob/master/temporal-difference/Temporal_Difference_Solution.ipynb)

## Dependencies

To set up your Python environment to run the code in this repository, follow the instructions below.

1. Create (and activate) a new environment with Python 3.6.

    - __Linux__ or __Mac__:
    ```bash
    conda create --name drlnd python=3.6
    source activate drlnd
    ```
    - __Windows__:
    ```bash
    conda create --name drlnd python=3.6
    activate drlnd
    ```

2. Follow the instructions in [this repository](https://github.com/openai/gym) to perform a minimal install of OpenAI gym.
    - Next, install the **classic control** environment group by following the instructions [here](https://github.com/openai/gym#classic-control).
    - Then, install the **box2d** environment group by following the instructions [here](https://github.com/openai/gym#box2d).

3. Clone the repository (if you haven't already!), and navigate to the `python/` folder. Then, install several dependencies.
    ```bash
    git clone https://github.com/udacity/deep-reinforcement-learning.git
    cd deep-reinforcement-learning/python
    pip install .
    ```

4. Create an [IPython kernel](http://ipython.readthedocs.io/en/stable/install/kernel_install.html) for the `drlnd` environment.
    ```bash
    python -m ipykernel install --user --name drlnd --display-name "drlnd"
    ```

5. Before running code in a notebook, change the kernel to match the `drlnd` environment by using the drop-down `Kernel` menu.

![Kernel][image2]

## Want to learn more?

Come learn with us in the Deep Reinforcement Learning Nanodegree program at Udacity!
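
Before opening any of the notebooks, it can help to confirm that the `drlnd` environment from the Dependencies section actually contains what the setup steps were meant to install. The snippet below is a minimal sketch and not part of the repository: the function name `check_environment`, the `REQUIRED_MODULES` tuple, and the choice of import names (`gym`, `torch` for PyTorch, `ipykernel`) are assumptions made for illustration. It only checks that the modules can be located and that the interpreter is at least Python 3.6; it does not verify package versions.

```python
import importlib.util
import sys

# Assumed import names for the packages the README mentions
# (PyTorch is imported as "torch"); adjust to match your setup.
REQUIRED_MODULES = ("gym", "torch", "ipykernel")


def check_environment(modules=REQUIRED_MODULES):
    """Return a dict mapping each check name to True or False."""
    # The README asks for Python 3.6; this sketch accepts 3.6 or newer.
    report = {"python>=3.6": sys.version_info >= (3, 6)}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for check, ok in check_environment().items():
        status = "ok" if ok else "MISSING"
        print(f"{status:8} {check}")
```

Run it with the `drlnd` environment activated. Any line marked `MISSING` suggests the corresponding install step is worth repeating.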

