ShuffleNet-master

Category: Artificial Intelligence / Neural Networks / Deep Learning
Development tool: Python
File size: 159KB
Downloads: 5
Upload date: 2018-09-26 19:04:15
Uploader: 123xy
Description: ShuffleNet is a paper from Face++ on reducing the computational cost of deep networks; it is billed as a deep network that can run on mobile devices. It is best read alongside MobileNet, Xception, and ResNeXt, which share similar ideas. The group convolution operation dates back to AlexNet, where it was mainly used to split the model's training across two GPUs. ResNeXt borrowed this group operation to improve the original ResNet. MobileNet replaces standard convolutions with depthwise separable convolutions, greatly reducing computation with almost no loss of accuracy (see MobileNets, on accelerating deep learning models). Xception likewise uses depthwise separable convolutions to improve the Inception v3 architecture.

File list (size in bytes, date):
LICENSE (11357, 2018-03-23)
config (0, 2018-03-23)
config\test.json (362, 2018-03-23)
data (0, 2018-03-23)
data\0.jpg (7081, 2018-03-23)
data\1.jpg (36601, 2018-03-23)
data\2.jpg (34211, 2018-03-23)
data_loader.py (3680, 2018-03-23)
figures (0, 2018-03-23)
figures\shuffle.PNG (39412, 2018-03-23)
figures\unit.PNG (33766, 2018-03-23)
layers.py (20058, 2018-03-23)
main.py (2285, 2018-03-23)
model.py (7957, 2018-03-23)
summarizer.py (1875, 2018-03-23)
train.py (7374, 2018-03-23)
utils.py (2914, 2018-03-23)

# ShuffleNet

An implementation of `ShuffleNet` in TensorFlow. According to the authors, `ShuffleNet` is a computationally efficient CNN architecture designed specifically for mobile devices with very limited computing power. It outperforms `Google MobileNet` by a small error percentage at much lower FLOPs.

Link to the original paper: [ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices](https://arxiv.org/abs/1707.01083)

## ShuffleNet Unit


### Group Convolutions

The paper uses the group convolution operator. However, that operator is not implemented in the TensorFlow backend, so I implemented it using graph operations. This issue was discussed here: [Support Channel groups in convolutional layers #10482](https://github.com/tensorflow/tensorflow/pull/10482)

## Channel Shuffling


Channel shuffling can be achieved by applying three operations:

1. Reshape the input tensor from (N, H, W, C) into (N, H, W, G, C').
2. Perform a matrix transpose operation on the two dimensions (G, C').
3. Reshape the tensor back into (N, H, W, C).

Where N: batch size, H: feature map height, W: feature map width, C: number of channels, G: number of groups, C': number of channels divided by number of groups.

Note that the number of channels must be divisible by the number of groups.

## Usage

### Main Dependencies

```
Python 3 or above
tensorflow 1.3.0
numpy 1.13.1
tqdm 4.15.0
easydict 1.7
matplotlib 2.0.2
```

### Train and Test

1. Prepare your data, and modify the data_loader.py/DataLoader/load_data() method.
2. Modify config/test.json to meet your needs.

### Run

```
python main.py --config config/test.json
```

## Results

The model has successfully overfitted TinyImageNet-200, which was presented in [CS231n - Convolutional Neural Networks for Visual Recognition](https://tiny-imagenet.herokuapp.com/). ImageNet training is in progress.

## Benchmarking

The paper reports 140 MFLOPs for the vanilla version. Using the group convolution operator implemented in TensorFlow, I measured approximately 270 MFLOPs. The paper counts one multiplication plus one addition as a single unit, so roughly dividing 270 by two, I have achieved what the paper proposes.

To calculate the FLOPs in TensorFlow, make sure to set the batch size equal to 1, and execute the following line once the model is loaded into memory:

```
tf.profiler.profile(
    tf.get_default_graph(),
    options=tf.profiler.ProfileOptionBuilder.float_operation(),
    cmd='scope')
```

## TODO

* Training on the ImageNet dataset. In progress...

## Updates

* Inference and training are working properly.

## License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
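For concreteness, the graph-operation workaround described under Group Convolutions (split the input channels into groups, convolve each group independently, concatenate the results) can be sketched for the 1×1 case. This is a minimal NumPy illustration, not code from `layers.py`; a TensorFlow version of the same pattern would use `tf.split`, `tf.nn.conv2d` per group, and `tf.concat`:

```python
import numpy as np

def group_conv_1x1(x, weights):
    """1x1 group convolution via split -> per-group conv -> concat.

    x: input of shape (N, H, W, C_in).
    weights: list of G arrays, each of shape (C_in // G, C_out // G).
    """
    groups = len(weights)
    xs = np.split(x, groups, axis=-1)            # split channels into G groups
    ys = [xg @ w for xg, w in zip(xs, weights)]  # a 1x1 conv is a matmul over channels
    return np.concatenate(ys, axis=-1)           # concatenate the group outputs
```

With identity weight matrices for every group, the operation reproduces its input, which is a quick sanity check that the split/concat bookkeeping is correct.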
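The three reshape → transpose → reshape steps listed under Channel Shuffling can likewise be sketched in a few lines. NumPy is used here purely for illustration; the same axis manipulations map directly onto `tf.reshape` and `tf.transpose`:

```python
import numpy as np

def channel_shuffle(x, groups):
    """Shuffle the channels of an NHWC tensor across groups.

    (N, H, W, C) -> (N, H, W, G, C') -> (N, H, W, C', G) -> (N, H, W, C)
    """
    n, h, w, c = x.shape
    assert c % groups == 0, "number of channels must be divisible by groups"
    x = x.reshape(n, h, w, groups, c // groups)  # split C into (G, C')
    x = x.transpose(0, 1, 2, 4, 3)               # swap the (G, C') axes
    return x.reshape(n, h, w, c)                 # merge back into C channels

# Example: 6 channels in 2 groups are interleaved across groups.
x = np.arange(6).reshape(1, 1, 1, 6)
print(channel_shuffle(x, 2).ravel())  # -> [0 3 1 4 2 5]
```

Note that with C = 6, shuffling with G = 2 followed by G = 3 restores the original channel order, so consecutive units with different group counts keep mixing information rather than undoing it.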
## Acknowledgments

Thanks to everyone who helped me with this work, and special thanks to my colleagues: [Mo'men Abdelrazek](https://github.com/moemen95) and [Mohamed Zahran](https://github.com/moh3th1).
