SDN

Description: [NeurIPS 2019] Why Can't I Dance in the Mall? Learning to Mitigate Scene Bias in Action Recognition

File list:
LICENSE (1068, 2020-01-30)
datasets/ (0, 2020-01-30)
datasets/activitynet.py (10100, 2020-01-30)
datasets/dataset.py (9617, 2020-01-30)
datasets/diving48.py (6426, 2020-01-30)
datasets/hmdb51.py (6606, 2020-01-30)
datasets/kinetics.py (30714, 2020-01-30)
datasets/ucf101.py (6606, 2020-01-30)
libs/ (0, 2020-01-30)
libs/mean.py (635, 2020-01-30)
libs/opts.py (12011, 2020-01-30)
libs/spatial_transforms.py (11273, 2020-01-30)
libs/target_transforms.py (446, 2020-01-30)
libs/temporal_transforms.py (2716, 2020-01-30)
libs/test.py (3200, 2020-01-30)
libs/train_epoch.py (20181, 2020-01-30)
libs/utils.py (1634, 2020-01-30)
libs/validation_epoch.py (16608, 2020-01-30)
loss/ (0, 2020-01-30)
loss/hloss.py (789, 2020-01-30)
loss/soft_cross_entropy.py (632, 2020-01-30)
models/ (0, 2020-01-30)
models/densenet.py (7291, 2020-01-30)
models/grad_reversal.py (503, 2020-01-30)
models/model.py (14418, 2020-01-30)
models/pre_act_resnet.py (7473, 2020-01-30)
models/resnet.py (11650, 2020-01-30)
models/resnext.py (6418, 2020-01-30)
models/vgg.py (7503, 2020-01-30)
models/wide_resnet.py (5670, 2020-01-30)
sdn_packages.txt (9439, 2020-01-30)
train.py (23920, 2020-01-30)
utils/ (0, 2020-01-30)
utils/eval_diving48.py (7066, 2020-01-30)
utils/eval_hmdb51.py (6712, 2020-01-30)
utils/eval_kinetics.py (7648, 2020-01-30)
utils/eval_ucf101.py (6699, 2020-01-30)
utils/fps.py (1251, 2020-01-30)
... ...

# SDN: Scene Debiasing Network for Action Recognition in PyTorch

We release the code of the paper "Why Can't I Dance in the Mall? Learning to Mitigate Scene Bias in Action Recognition". The code is built upon the [3D-ResNets-PyTorch codebase](https://github.com/kenshohara/3D-ResNets-PyTorch). For details, visit our [project website](http://chengao.vision/SDN/) or see our [full paper](https://papers.nips.cc/paper/8372-why-cant-i-dance-in-the-mall-learning-to-mitigate-scene-bias-in-action-recognition.pdf).

## Reference

[Jinwoo Choi](https://sites.google.com/site/jchoivision/), [Chen Gao](https://gaochen315.github.io/), [Joseph C. E. Messou](https://josephcmessou.weebly.com/about.html), [Jia-Bin Huang](https://filebox.ece.vt.edu/~jbhuang/index.html). Why Can't I Dance in the Mall? Learning to Mitigate Scene Bias in Action Recognition. Neural Information Processing Systems (NeurIPS) 2019.

```
@inproceedings{choi2019sdn,
  title     = {Why Can't I Dance in the Mall? Learning to Mitigate Scene Bias in Action Recognition},
  author    = {Choi, Jinwoo and Gao, Chen and Messou, C. E. Joseph and Huang, Jia-Bin},
  booktitle = {NeurIPS},
  year      = {2019}
}
```

## Requirements

This codebase was developed and tested with:

- Python 3.6
- PyTorch 0.4.1
- torchvision 0.2.1
- CUDA 9.0
- CUDNN 7.1
- GPU: 2x P100

Dependencies are listed in `sdn_packages.txt`. Install them with:

```
pip install -r sdn_packages.txt
```

## Datasets

### Prepare your dataset

**1. Download and pre-process data**

- Follow the [3D-ResNets-PyTorch instructions](https://github.com/kenshohara/3D-ResNets-PyTorch#preparation).

**2. Download scene and human detection data numpy files**

- [Download the Mini-Kinetics scene pseudo labels](https://filebox.ece.vt.edu/~jinchoi/files/sdn/places_data.zip)
- [Download the Mini-Kinetics human detections](https://filebox.ece.vt.edu/~jinchoi/files/sdn/detections.zip)

## Train

### Training on a source dataset (mini-Kinetics)

**- Baseline model without any debiasing**

```
python train.py \
--video_path \
--annotation_path /kinetics.json \
--result_path \
--root_path \
--dataset kinetics \
--n_classes 200 \
--n_finetune_classes 200 \
--model resnet \
--model_depth 18 \
--resnet_shortcut A \
--batch_size 32 \
--val_batch_size 16 \
--n_threads 16 \
--checkpoint 1 \
--ft_begin_index 0 \
--is_mask_adv \
--learning_rate 0.0001 \
--weight_decay 1e-5 \
--n_epochs 100 \
--pretrain_path
```

**- SDN model with scene adversarial loss only**

```
python train.py \
--video_path \
--annotation_path /kinetics.json \
--result_path \
--root_path \
--dataset kinetics_adv \
--n_classes 200 \
--n_finetune_classes 200 \
--model resnet \
--model_depth 18 \
--resnet_shortcut A \
--batch_size 32 \
--val_batch_size 16 \
--n_threads 16 \
--checkpoint 1 \
--ft_begin_index 0 \
--num_place_hidden_layers 3 \
--new_layer_lr 1e-2 \
--learning_rate 1e-4 \
--warm_up_epochs 5 \
--weight_decay 1e-5 \
--n_epochs 100 \
--place_pred_path \
--is_place_adv \
--is_place_soft \
--alpha 1.0 \
--is_mask_adv \
--num_places_classes 365 \
--pretrain_path
```

**- Full SDN model with 1) scene adversarial loss and 2) human mask confusion loss**

```
python train.py \
--video_path \
--annotation_path /kinetics.json \
--result_path \
--root_path \
--dataset kinetics_adv_msk \
--n_classes 200 \
--n_finetune_classes 200 \
--model resnet \
--model_depth 18 \
--resnet_shortcut A \
--batch_size 32 \
--val_batch_size 16 \
--n_threads 16 \
--checkpoint 1 \
--ft_begin_index 0 \
--num_place_hidden_layers 3 \
--num_human_mask_adv_hidden_layers 1 \
--new_layer_lr 1e-4 \
--learning_rate 1e-4 \
--warm_up_epochs 0 \
--weight_decay 1e-5 \
--n_epochs 100 \
--place_pred_path \
--is_place_adv \
--is_place_soft \
--is_mask_entropy \
--alpha 0.5 \
--mask_ratio 1.0 \
--slower_place_mlp \
--not_replace_last_fc \
--num_places_classes 365 \
--human_dets_path \
--pretrain_path
```
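For reference, the scene adversarial loss enabled by `--is_place_adv` above passes the shared video features through a gradient reversal layer before the scene classification head (see `models/grad_reversal.py` in the file list). The sketch below shows the standard gradient reversal construction; the names `GradReverse` and `grad_reverse` are illustrative, and the repository's own implementation may differ in detail.

```
import torch

class GradReverse(torch.autograd.Function):
    """Identity map in the forward pass; scales gradients by -alpha in the backward pass."""

    @staticmethod
    def forward(ctx, x, alpha):
        ctx.alpha = alpha
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Flip the sign of the gradient flowing back into the feature extractor,
        # so minimizing the scene loss w.r.t. the scene head pushes the shared
        # features towards being uninformative about the scene.
        return grad_output.neg() * ctx.alpha, None

def grad_reverse(x, alpha=1.0):
    return GradReverse.apply(x, alpha)

# Usage sketch: scene_logits = scene_head(grad_reverse(features, alpha))
```

The `--alpha` flag in the commands above presumably weights this adversarial term relative to the action classification loss.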
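The `--is_place_soft` flag trains the scene head against soft scene pseudo labels (Places365 prediction scores loaded via `--place_pred_path`) rather than hard labels. Below is a minimal sketch of a soft-label cross entropy in the spirit of `loss/soft_cross_entropy.py`, assuming targets are given as a distribution over the 365 scene classes; the function name is illustrative.

```
import torch.nn.functional as F

def soft_cross_entropy(logits, soft_targets):
    """Cross entropy against a soft target distribution.

    logits:       (batch, num_classes) raw scores from the scene head
    soft_targets: (batch, num_classes) pseudo-label distribution, rows sum to 1
    """
    log_probs = F.log_softmax(logits, dim=1)
    return -(soft_targets * log_probs).sum(dim=1).mean()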
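The human mask confusion loss enabled by `--is_mask_entropy` feeds clips whose human regions are masked out (using the detections from `--human_dets_path`) through the network and pushes the action prediction towards maximum uncertainty, so the model cannot classify an action from scene evidence alone. A minimal sketch of such a negative-entropy loss, in the spirit of `loss/hloss.py`; the name `entropy_loss` is illustrative.

```
import torch.nn.functional as F

def entropy_loss(logits):
    """Negative entropy of the predicted action distribution.

    Minimizing this loss maximizes entropy, pushing the prediction on a
    human-masked clip towards the uniform distribution over actions.
    """
    probs = F.softmax(logits, dim=1)
    log_probs = F.log_softmax(logits, dim=1)
    return (probs * log_probs).sum(dim=1).mean()
```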
### Finetuning on target datasets

#### [Diving48](http://www.svcl.ucsd.edu/projects/resound/dataset.html) as an example

```
python train.py \
--dataset diving48 \
--root_path \
--video_path \
--n_classes 200 \
--n_finetune_classes 48 \
--model resnet \
--model_depth 18 \
--resnet_shortcut A \
--ft_begin_index 0 \
--batch_size 32 \
--val_batch_size 16 \
--n_threads 4 \
--checkpoint 1 \
--learning_rate 0.005 \
--weight_decay 1e-5 \
--n_epochs $epoch_ft \
--is_mask_adv \
--annotation_path $anno_path \
--result_path \
--pretrain_path
```

## Test

```
python train.py \
--dataset diving48 \
--root_path \
--video_path \
--n_finetune_classes 48 \
--n_classes 48 \
--model resnet \
--model_depth 18 \
--resnet_shortcut A \
--batch_size 32 \
--val_batch_size 16 \
--n_threads 4 \
--test \
--test_subset val \
--no_train \
--no_val \
--is_mask_adv \
--annotation_path $anno_path \
--result_path \
--resume_path
```

This step generates a `val.json` file under `$result_path`.

## Evaluation

```
python utils/eval_diving48.py \
--annotation_path $anno_path \
--prediction_path
```

## Pre-trained model weights

[Download the pre-trained weights](https://drive.google.com/file/d/1gkyL80fDXmFCBjtgKlFNKVqb4OHNhrDL/view?usp=sharing)

## Acknowledgments

This code is built upon the [3D-ResNets-PyTorch codebase](https://github.com/kenshohara/3D-ResNets-PyTorch). We thank Kensho Hara.
