noisy-mappo

Category: Artificial Intelligence / Neural Networks / Deep Learning
Development tool: Python
File size: 167KB
Downloads: 0
Upload date: 2023-05-11 04:03:23
Uploader: sh-1993
Description: Multi-agent PPO with noise (97% win rates on Hard scenarios of SMAC)

File list:
LICENSE (1077, 2022-07-12)
clean.sh (32, 2022-07-12)
environment.yaml (5127, 2022-07-12)
onpolicy (0, 2022-07-12)
onpolicy\__init__.py (194, 2022-07-12)
onpolicy\algorithms (0, 2022-07-12)
onpolicy\algorithms\__init__.py (0, 2022-07-12)
onpolicy\algorithms\r_mappo (0, 2022-07-12)
onpolicy\algorithms\r_mappo\__init__.py (0, 2022-07-12)
onpolicy\algorithms\r_mappo\algorithm (0, 2022-07-12)
onpolicy\algorithms\r_mappo\algorithm\rMAPPOPolicy.py (7369, 2022-07-12)
onpolicy\algorithms\r_mappo\algorithm\r_actor_critic.py (8651, 2022-07-12)
onpolicy\algorithms\r_mappo\r_mappo.py (11017, 2022-07-12)
onpolicy\algorithms\utils (0, 2022-07-12)
onpolicy\algorithms\utils\act.py (7870, 2022-07-12)
onpolicy\algorithms\utils\cnn.py (1852, 2022-07-12)
onpolicy\algorithms\utils\distributions.py (3474, 2022-07-12)
onpolicy\algorithms\utils\mlp.py (1892, 2022-07-12)
onpolicy\algorithms\utils\popart.py (3796, 2022-07-12)
onpolicy\algorithms\utils\rnn.py (2849, 2022-07-12)
onpolicy\algorithms\utils\util.py (425, 2022-07-12)
onpolicy\config.py (16657, 2022-07-12)
onpolicy\envs (0, 2022-07-12)
onpolicy\envs\__init__.py (90, 2022-07-12)
onpolicy\envs\env_wrappers.py (28209, 2022-07-12)
onpolicy\envs\hanabi (0, 2022-07-12)
onpolicy\envs\hanabi\CMakeLists.txt (382, 2022-07-12)
onpolicy\envs\hanabi\Hanabi_Env.py (37159, 2022-07-12)
onpolicy\envs\hanabi\__init__.py (574, 2022-07-12)
onpolicy\envs\hanabi\clean_all.sh (882, 2022-07-12)
onpolicy\envs\hanabi\hanabi_lib (0, 2022-07-12)
onpolicy\envs\hanabi\hanabi_lib\CMakeLists.txt (242, 2022-07-12)
onpolicy\envs\hanabi\hanabi_lib\canonical_encoders.cc (21779, 2022-07-12)
onpolicy\envs\hanabi\hanabi_lib\canonical_encoders.h (1621, 2022-07-12)
onpolicy\envs\hanabi\hanabi_lib\hanabi_card.cc (1020, 2022-07-12)
onpolicy\envs\hanabi\hanabi_lib\hanabi_card.h (1213, 2022-07-12)
onpolicy\envs\hanabi\hanabi_lib\hanabi_game.cc (7037, 2022-07-12)
... ...

# Noisy-MAPPO

Codes for [Policy Regularization via Noisy Advantage Values for Cooperative Multi-agent Actor-Critic methods](https://arxiv.org/abs/2106.14334). This repository is heavily based on https://github.com/marlbenchmark/on-policy. In this study we find that noise perturbation of the advantage function can effectively improve the performance of MAPPO in SMAC.

## Environments supported:

- [StarCraftII (SMAC)](https://github.com/oxwhirl/smac)

**StarCraft 2 version: SC2.4.10. Difficulty: 7.**

## 1. Usage

**WARNING: by default, all experiments assume a policy shared by all agents, i.e. one neural network is shared by all agents.**

All core code is located within the `onpolicy` folder. The `algorithms/` subfolder contains the code for MAPPO.

* The `config.py` file contains the relevant hyperparameter and env settings. Most hyperparameters default to the values used in the paper; please refer to the appendix for the full list of hyperparameters used.

## 2. Installation

Here we give an example installation for CUDA == 11.1 (matching the install command below). For non-GPU or other CUDA versions, please refer to the [PyTorch website](https://pytorch.org/get-started/locally/).

``` Bash
# create conda environment
conda create -n marl python==3.7
conda activate marl
conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch-lts -c nvidia
pip install -r requirements.txt
```

```
# install on-policy package
cd on-policy
pip install -e .
```

Even though we provide requirements.txt, it may contain redundant packages. We recommend installing any remaining required packages as you run the code and discover which ones are still missing.

### 2.1 Install StarCraftII [4.10](https://blzdistsc2-a.akamaihd.net/Linux/SC2.4.10.zip)

``` Bash
cd ~
wget https://blzdistsc2-a.akamaihd.net/Linux/SC2.4.10.zip
unzip -P iagreetotheeula SC2.4.10.zip
rm -rf SC2.4.10.zip
echo "export SC2PATH=~/StarCraftII/" >> ~/.bashrc
```

* Download the [SMAC Maps](https://github.com/oxwhirl/smac/releases/download/v1/SMAC_Maps_V1.tar.gz) and move them to `~/StarCraftII/Maps/`:

```
wget https://github.com/oxwhirl/smac/releases/download/v0.1-beta1/SMAC_Maps.zip
unzip SMAC_Maps.zip
mv ./SMAC_Maps ~/StarCraftII/Maps/
```

* To use a stable ID, copy `stableid.json` from https://github.com/Blizzard/s2client-proto.git to `~/StarCraftII/`.

## 3. Train

**Please modify the hyperparameters in the shell scripts according to the Appendix of the paper.**

**Noisy-Value MAPPO (NV-MAPPO)**
```
./train_smac_value.sh 3s5z_vs_3s6z 3
```

**Noisy-Advantage MAPPO (NA-MAPPO)**
```
./train_smac_adv.sh 3s5z_vs_3s6z 3
```

**Noisy-Value IPPO (NV-IPPO)**
```
./train_smac_value_ippo.sh 3s5z_vs_3s6z 3
```

**Vanilla MAPPO (MAPPO)**
```
./train_smac_vanilla.sh 3s5z_vs_3s6z 3
```

Local results are stored in the subfolder `scripts/results`. Note that we use TensorBoard as the default visualization platform.

## Citation

```
@article{hu2021policy,
  title={Policy Regularization via Noisy Advantage Values for Cooperative Multi-agent Actor-Critic methods},
  author={Jian Hu and Siyue Hu and Shih-wei Liao},
  year={2021},
  eprint={2106.14334},
  archivePrefix={arXiv},
  primaryClass={cs.MA}
}
```
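As a supplement to the description above, the core idea of noise perturbation of the advantage function fits in a few lines of PyTorch. The sketch below is a minimal, hypothetical illustration of a clipped PPO policy loss with Gaussian-perturbed advantages; it is not the repository's implementation (see `onpolicy/algorithms/r_mappo/r_mappo.py` for that), and the function name `noisy_advantage_ppo_loss` and the default `noise_std` value are illustrative assumptions.

``` Python
# Hypothetical sketch of a noisy-advantage PPO loss (NOT the repo's exact code).
# Assumes advantages have already been computed (e.g. via GAE) and normalized;
# zero-mean Gaussian noise is added to them before the clipped surrogate loss.
import torch


def noisy_advantage_ppo_loss(new_log_probs, old_log_probs, advantages,
                             noise_std=0.5, clip_ratio=0.2):
    """Clipped PPO policy loss with Gaussian-perturbed advantage values."""
    # Perturb the advantages; noise_std is an assumed, tunable hyperparameter.
    noisy_adv = advantages + noise_std * torch.randn_like(advantages)
    ratio = torch.exp(new_log_probs - old_log_probs)  # importance weights
    unclipped = ratio * noisy_adv
    clipped = torch.clamp(ratio, 1.0 - clip_ratio, 1.0 + clip_ratio) * noisy_adv
    # PPO maximizes the surrogate objective, so return its negation as a loss.
    return -torch.min(unclipped, clipped).mean()


# Toy usage with random tensors standing in for a real rollout batch.
if __name__ == "__main__":
    old_lp = torch.randn(8)
    new_lp = old_lp + 0.1 * torch.randn(8)
    adv = torch.randn(8)
    print(noisy_advantage_ppo_loss(new_lp, old_lp, adv))
```

Because the injected noise is zero-mean, it does not systematically shift the advantage estimates; it perturbs the per-sample policy update, which is the policy-regularization effect the paper attributes the SMAC gains to. In the repository itself, the corresponding noise settings are presumably among the hyperparameters configured in the shell scripts listed in Section 3.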
