ResidualAttentionNetwork

Category: Numerical algorithms / Artificial intelligence
Development tool: Python
File size: 45468KB
Downloads: 0
Upload date: 2019-06-12 03:31:15
Uploader: sh-1993
Description: ResidualAttentionNetwork, a Gluon implementation of the Residual Attention Network. Best accuracy on CIFAR-10: 97.78%.

File list:
Attention92_cifar10_train.log (73580, 2019-06-12)
LICENSE (1067, 2019-06-12)
_config.yml (26, 2019-06-12)
cifar_param (0, 2019-06-12)
cifar_param\cifar_att92 (0, 2019-06-12)
cifar_param\cifar_att92\test_epoch215_0.97140.param (48994728, 2019-06-12)
imgs (0, 2019-06-12)
imgs\Figure1.png (736011, 2019-06-12)
imgs\Figure2.png (125667, 2019-06-12)
imgs\Figure3.png (144718, 2019-06-12)
kaggle (0, 2019-06-12)
kaggle\0.9778.png (25687, 2019-06-12)
kaggle\gen_submission.py (2208, 2019-06-12)
kaggle\test_ems.py (1369, 2019-06-12)
kaggle\utils.py (298, 2019-06-12)
lib (0, 2019-06-12)
lib\piston_util.py (1819, 2019-06-12)
logs (0, 2019-06-12)
logs\Attention92_cifar10_train.log (20067, 2019-06-12)
logs\Attention92_cifar10_train_mixup.log (30690, 2019-06-12)
model (0, 2019-06-12)
model\attention_module.py (16244, 2019-06-12)
model\basic_layer.py (1491, 2019-06-12)
model\residual_attention_network.py (9097, 2019-06-12)
test_module (0, 2019-06-12)
test_module\__init__.py (88, 2019-06-12)
test_module\test_model.py (1248, 2019-06-12)
train_cifar.py (6601, 2019-06-12)
train_imagenet.py (5910, 2019-06-12)

# Residual Attention Network

[![GitHub](https://img.shields.io/github/license/PistonY/ResidualAttentionNetwork.svg)](./LICENSE) ![Status](https://img.shields.io/badge/status-%E8%90%8C-orange.svg) [![996.icu](https://img.shields.io/badge/link-996.icu-red.svg)](https://996.icu) ![ToDo](http://progressed.io/bar/100?title=ToDo)

A Gluon implementation of the Residual Attention Network.

This code is based on this project: https://github.com/tengshaofeng/ResidualAttentionNetwork-pytorch

## Cifar-10 Kaggle

![4](kaggle/0.9778.png)

## [GluonCV](http://gluon-cv.mxnet.io)

Project site: https://github.com/dmlc/gluon-cv

I have contributed this project to GluonCV. The pre-trained models should be easy to use from there in a few days.

Usage:

```python
from gluoncv.model_zoo.residual_attentionnet import *
```

The module exports the following names:

```python
__all__ = ['ResidualAttentionModel', 'cifar_ResidualAttentionModel',
           'residualattentionnet56', 'cifar_residualattentionnet56',
           'residualattentionnet92', 'cifar_residualattentionnet92',
           'residualattentionnet128', 'cifar_residualattentionnet452',
           'residualattentionnet1***', 'residualattentionnet200',
           'residualattentionnet236', 'residualattentionnet452']
```

## Prerequisites

Python 3.6, NumPy, MXNet

- I use `mxnet-cu90 --pre`, but other builds work as well.
- For training you need a recent NVIDIA GPU.

## Results

- [x] cifar-10: Acc-95.41 (**Top-1 err 4.59**) with Attention-92 (better than the paper's top-1 err of 4.99)
- [x] cifar-10: Acc-95.68 (**Top-1 err 4.32**) with Attention-92 (using MSRAPrelu init)
- [x] cifar-10: Acc-97.14 (**Top-1 err 2.86**) with Attention-92, using [gluoncv-tricks](https://arxiv.org/pdf/1812.01187.pdf).
  - BS 256
  - +mixup
  - +LR warmup
  - +No bias decay
  - +Cosine decay
  - +Cutout
- [x] cifar-10: Acc-97.57 (**Top-1 err 2.43**) with Attention-452, using [gluoncv-tricks](https://arxiv.org/pdf/1812.01187.pdf).
  - BS 128
  - +mixup
  - +LR warmup
  - +No bias decay
  - +Cosine decay
  - +Cutout
- [x] Network scale control: I added 'p,t,r,m' to control the network scale. (Gluon-CV)
  - These are the 'p,t,r,m' controls that the original paper proposed. You can now use Attention 56/92/128/1***/200/236/452 in Gluon-CV. I won't port this to this project, because I can't train those models, and adding it would make the parameters I have already trained unusable.
- [x] ImageNet: Attention56 achieves (21.03, 5.47) top-1/top-5 error on ImageNet, better than the paper's (21.76, 5.9). (Gluon-CV)

## How to train & test

To train on cifar10, just run `train_cifar.py`.

To only test on cifar10, you can run the script below.

```python
import numpy as np
import mxnet as mx
from mxnet import gluon, image, nd
from train_cifar import test
from model.residual_attention_network import ResidualAttentionModel_92_32input_update


def trans_test(data, label):
    # Scale pixels to [0, 1], then normalize with the given mean/std.
    im = data.astype(np.float32) / 255.
    auglist = image.CreateAugmenter(data_shape=(3, 32, 32),
                                    mean=mx.nd.array([0.485, 0.456, 0.406]),
                                    std=mx.nd.array([0.229, 0.224, 0.225]))
    for aug in auglist:
        im = aug(im)
    # Convert HWC layout to CHW.
    im = nd.transpose(im, (2, 0, 1))
    return im, label


ctx = mx.gpu()
val_data = gluon.data.DataLoader(
    gluon.data.vision.CIFAR10(train=False, transform=trans_test),
    batch_size=***)  # the value is elided in the source; set your own batch size
net = ResidualAttentionModel_92_32input_update()
net.hybridize()
# Path as given in the README; point it at the .param file you actually have.
net.load_parameters('cifar_param/test_iter225999_0.95410.param', ctx=ctx)
test(net, ctx, val_data, 0)
```

## Paper referenced

Residual Attention Network for Image Classification (CVPR-2017 Spotlight)
By Fei Wang, Mengqing Jiang, Chen Qian, Shuo Yang, Chen Li, Honggang Zhang, Xiaogang Wang, Xiaoou Tang (https://arxiv.org/pdf/1704.06904.pdf)

![1](imgs/Figure1.png)

**Left**: an example showing the interaction between features and attention masks. **Right**: example images illustrating that different features have different corresponding attention masks in our network. The sky mask diminishes low-level background blue color features. The balloon instance mask highlights high-level balloon bottom part features.

![2](imgs/Figure2.png) Attention Network architecture.
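
As a rough illustration of how the trunk and mask branches in this architecture are combined, the referenced paper's attention residual learning computes `(1 + M(x)) * T(x)`, where `T(x)` is the trunk output and `M(x)` is a soft mask in (0, 1). The sketch below uses plain NumPy and not the Gluon code from this repository; the function names and the toy tensors are assumptions made for illustration only.

```python
import numpy as np


def sigmoid(x):
    """Squash mask-branch outputs into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-x))


def attention_residual(trunk_features, mask_logits):
    """Combine trunk features T(x) with a soft mask M(x) as (1 + M(x)) * T(x).

    Hypothetical helper: both arguments are arrays of the same shape,
    e.g. (channels, height, width).
    """
    mask = sigmoid(mask_logits)
    # The "1 +" term keeps the trunk features intact when the mask is near 0,
    # so stacking many attention modules does not wash the features out.
    return (1.0 + mask) * trunk_features


# Toy feature map; the values are illustrative only.
trunk = np.ones((2, 4, 4), dtype=np.float32)
logits = np.zeros((2, 4, 4), dtype=np.float32)  # sigmoid(0) = 0.5
out = attention_residual(trunk, logits)
print(out.shape, float(out[0, 0, 0]))
```

With a mask of 0.5 everywhere, every trunk value is scaled by 1.5; a mask near 0 would leave the trunk features almost unchanged.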
![3](imgs/Figure3.png) The Attention-56 network outperforms ResNet-152 by a large margin with a 0.4% reduction on top-1 error and a 0.26% reduction on top-5 error. More importantly **Attention-56 network achieves better performance with only 52% parameters and 56% FLOPs compared with ResNet-152**, which suggests that the proposed attention mechanism can significantly improve network performance while reducing the model complexity.
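
The "LR warmup" and "Cosine decay" entries in the Results list are commonly combined into a single learning-rate schedule. The following is a minimal sketch of that idea in plain Python. The function name, step counts and base learning rate are assumptions for illustration and are not the values used in `train_cifar.py`.

```python
import math


def warmup_cosine_lr(step, total_steps, warmup_steps, base_lr):
    """Learning rate at a 0-based step: linear warmup, then cosine decay to 0."""
    if step < warmup_steps:
        # Linear ramp that reaches base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the post-warmup phase that has been completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


# Hypothetical settings: 100 steps in total, 10 of them warmup, peak LR 0.1.
schedule = [warmup_cosine_lr(s, total_steps=100, warmup_steps=10, base_lr=0.1)
            for s in range(100)]
print(schedule[0], max(schedule), schedule[-1])
```

The warmup phase avoids large updates while the weights are still random, which matters most with large batch sizes such as the BS 256 run listed above.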
