assignment4
Category: Artificial Intelligence / Neural Networks / Deep Learning
Development tool: Python
File size: 278KB
Downloads: 0
Upload date: 2020-06-30 07:41:40
Uploader: shanshiva
Description: deep learning exercise-2, description etc
File list:
assignment4\' (0, 2020-06-30)
assignment4\'\tmp (0, 2020-06-30)
assignment4\'\tmp\mymodel' (0, 2020-06-30)
assignment4\'\tmp\mymodel'\log (0, 2020-06-30)
assignment4\'\tmp\mymodel'\log\events.out.tfevents.1544184097.THUSHAN (333631, 2018-12-07)
assignment4\__pycache__ (0, 2020-06-30)
assignment4\__pycache__\mymodel.cpython-36.pyc (4290, 2018-12-07)
assignment4\datasets (0, 2020-06-30)
assignment4\datasets\__init__.py (1, 2018-12-07)
assignment4\datasets\__pycache__ (0, 2020-06-30)
assignment4\datasets\__pycache__\__init__.cpython-36.pyc (197, 2018-12-07)
assignment4\datasets\__pycache__\cifar10.cpython-36.pyc (2320, 2018-12-07)
assignment4\datasets\__pycache__\dataset_utils.cpython-36.pyc (4430, 2018-12-07)
assignment4\datasets\cifar10.py (3237, 2018-12-07)
assignment4\datasets\dataset_utils.py (4680, 2018-12-07)
assignment4\images (0, 2020-06-30)
assignment4\images\graph.jpg (88921, 2018-12-07)
assignment4\images\histograms.jpg (53111, 2018-12-07)
assignment4\images\scale1.jpg (96819, 2018-12-07)
assignment4\images\scale2.jpg (29919, 2018-12-07)
assignment4\model_eval.py (5200, 2018-12-07)
assignment4\model_train.py (6187, 2018-12-08)
assignment4\mymodel.py (5376, 2018-12-07)
# Build your own model using the TensorFlow API
Let's say the structure of the model is:
| Input (32 x 32 RGB images) | Layers|
|:----------:|:-------:|
| Conv3-8 | Layer-1 |
| Conv3-8 | Layer-2 |
| Conv3-8 | Layer-3 |
| maxpool | Layer-4 |
| Conv3-*** | Layer-5 |
| Conv3-*** | Layer-6 |
| Conv3-*** | Layer-7 |
| maxpool | Layer-8 |
| Conv3-*** | Layer-9 |
| Conv3-*** | Layer-10 |
| Conv3-*** | Layer-11 |
| maxpool | Layer-12 |
| FC-1024 | Layer-13 |
| FC-10 | Layer-14 |
| softmax | Layer-15 |
* Conv3-8 means the convolutional kernel is 3 x 3 and the number of output channels is 8; the padding style is **SAME** rather than **VALID** (see `tf.nn.conv2d`)
* FC-1024 means the output size of the FC layer is 1024
* stride of Conv layers is 1, stride of pooling layers is 2
* kernel size of maxpool is 2 x 2
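Given these conventions, the spatial output size of each layer follows the standard formulas TensorFlow uses for **SAME** and **VALID** padding. A minimal pure-Python sketch (the helper names here are ours, just for illustration):

```python
import math

def out_size_same(n, stride):
    # SAME padding: output size depends only on input size and stride
    return math.ceil(n / stride)

def out_size_valid(n, kernel, stride):
    # VALID padding: no padding, the kernel must fit entirely inside
    return math.ceil((n - kernel + 1) / stride)

# A 3x3 conv with stride 1 and SAME padding keeps 32 -> 32,
# while VALID would shrink it to 30.
print(out_size_same(32, 1))       # 32
print(out_size_valid(32, 3, 1))   # 30

# A 2x2 maxpool with stride 2 halves the size: 32 -> 16.
print(out_size_same(32, 2))       # 16
```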
## Dirs used
Here, we specify the following paths for the model for convenience. Please change them to your own paths when doing the exercise yourself.
**Path where you download cifar10 dataset:** */mymodel/cifar10-data*
**Path where you save your model to when training and load your model from when testing or finetuning:** */mymodel/model*
**Path where you save your log information:** */mymodel/log*
# Prepare cifar10 dataset
As in assignment 3, we follow the instructions in [Downloading and converting to TFRecord format](https://github.com/tensorflow/models/tree/master/research/slim), changing `DATA_DIR` to '/mymodel/cifar10-data' and `dataset_name` to `cifar10`
Now, in /mymodel/cifar10-data, there should be the following files: *cifar10_test.tfrecord cifar10_train.tfrecord labels.txt*
# Build your own model with the above structure
Create a file with name "mymodel.py" and build your model there.
First, the input of the model should definitely be the `images`, which have the shape [batch_size, height, width, channels], i.e., [***, 32, 32, 3], since we now just set the batch size to ***
We then begin to feed the `images` into the structure. The first thing we come across is the first Conv3-8 layer, which has a kernel size of 3 x 3, a stride of 1, and 8 output channels. So when we type:
```python
conv_l_1 = conv_layer(images, 3, 8, "layer1")
```
we get the output of the first Conv3-8 layer, `conv_l_1`, which has the shape [batch_size, height_1, width_1, output_channels], i.e., [***, 32, 32, 8]. Since the stride of the conv layer is 1 and the padding strategy is **SAME**, the height and width do not change
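To see how the spatial size evolves through the whole structure, here is a quick pure-Python trace (our own sketch, tracking only height/width): each conv with SAME padding and stride 1 keeps the size, and each 2x2 maxpool with stride 2 halves it.

```python
size = 32  # input height (and width) of the CIFAR-10 images
for layer in range(1, 13):
    if layer % 4 == 0:   # layers 4, 8 and 12 are the maxpool layers
        size //= 2       # 2x2 maxpool with stride 2 halves the size
    # conv layers (SAME padding, stride 1) leave the size unchanged
print(size)  # 4 -> the FC-1024 layer sees a 4 x 4 x channels feature map
```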
Now, take a look at the function `conv_layer`
```python
def conv_layer(bottom, in_channels, out_channels, name):
    with tf.variable_scope(name):
        filt, conv_biases = get_conv_var(3, in_channels, out_channels, name)
        conv = tf.nn.conv2d(bottom, filt, [1, 1, 1, 1], padding='SAME')
        bias = tf.nn.bias_add(conv, conv_biases)
        relu = tf.nn.relu(bias)
        return relu

def get_conv_var(filter_size, in_channels, out_channels, name):
    filters = tf.get_variable('filters',
                              shape=[filter_size, filter_size, in_channels, out_channels],
                              initializer=tf.truncated_normal_initializer(mean=0.0, stddev=5e-2),
                              trainable=trainable,  # trainable here is a global variable
                              regularizer=None)
    biases = tf.get_variable('biases',
                             shape=[out_channels],
                             initializer=tf.constant_initializer(value=0.0),
                             trainable=trainable,
                             regularizer=None)
    return filters, biases
```
where `filters` and `biases` are the parameters of the conv layer whose values will be learned during training. In fact, the training process of the model is to find the optimal values of these parameters, i.e., the values that minimize the final loss
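As a concrete check of how many learnable parameters `get_conv_var` creates for layer 1 (3x3 kernel, 3 input channels from the RGB image, 8 output channels), a quick pure-Python calculation:

```python
filter_size, in_channels, out_channels = 3, 3, 8

# the 'filters' variable has shape [3, 3, 3, 8]
n_filter_params = filter_size * filter_size * in_channels * out_channels  # 216
# the 'biases' variable has shape [8]
n_bias_params = out_channels  # 8

print(n_filter_params + n_bias_params)  # 224 parameters in total for layer 1
```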
Similar to layer 1, we add two more conv layers, followed by a maxpool layer.
```python
conv_l_2 = conv_layer(conv_l_1, 8, 8, "layer2")
conv_l_3 = conv_layer(conv_l_2, 8, 8, "layer3")
pool_l_4 = max_pool(conv_l_3, 'layer4')
```
The function `max_pool` looks like
```python
def max_pool(bottom, name):
return tf.nn.max_pool(bottom, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name=name)
```
Like the `conv_layer` we just mentioned, it again uses a TensorFlow API, here `tf.nn.max_pool`. These are the basic APIs in TensorFlow for CNNs. By following this post, you will get to know many of these basic APIs, cheer up!
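To make the pooling operation itself concrete, here is a toy pure-Python 2x2 max-pool with stride 2 on a single-channel feature map (our own illustration, not the assignment code):

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a 2-D list of numbers."""
    h, w = len(feature_map), len(feature_map[0])
    return [[max(feature_map[i][j], feature_map[i][j + 1],
                 feature_map[i + 1][j], feature_map[i + 1][j + 1])
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 1],
        [0, 5, 6, 2],
        [1, 2, 3, 8]]
print(max_pool_2x2(fmap))  # [[4, 2], [5, 8]] -- a 4x4 map becomes 2x2
```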
OK, now repeat the above process. After some blocks of convs-pools, we get the output of the last max-pooling layer:
```python
pool_l_12 = max_pool(conv_l_11, 'layer12')
```
A voice says: maybe it's time to connect some FC layers. So we just follow what it says.
```python
fc_l_13 = fc_layer(pool_l_12, 1024, "layer13")
fc_l_14 = fc_layer(fc_l_13, 10, "layer14")
```
and fc_layer
```python
def fc_layer(bottom, out_size, name):
    with tf.variable_scope(name):
        batch_size = bottom.get_shape()[0]
        x = tf.reshape(bottom, [batch_size, -1])
        in_size = x.get_shape()[1]
        weights, biases = get_fc_var(in_size, out_size, name)
        fc = tf.nn.bias_add(tf.matmul(x, weights), biases)
        tf.summary.histogram('weights', weights)
        return fc

def get_fc_var(in_size, out_size, name):
    # please set a regularizer for the fc weights
    weights = tf.get_variable('weights',
                              shape=[in_size, out_size],
                              initializer=tf.truncated_normal_initializer(mean=0.0, stddev=5e-2),
                              trainable=trainable,
                              regularizer=regularizer)  # regularizer here is a global variable
    biases = tf.get_variable('biases',
                             shape=[out_size],
                             initializer=tf.constant_initializer(value=0.0),
                             trainable=trainable,
                             regularizer=None)
    return weights, biases
```
One can see that there is an extra `regularizer` for the `weights`, which is used to prevent the weight values from becoming too large. In CNNs, it is important to introduce such regularizers for the parameters on some occasions, since the values of the parameters can easily get out of control
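The idea behind the regularizer can be sketched in plain Python: an L2 regularizer adds `scale * sum(w**2)` over the weights to the loss, so large weights are penalized (the `scale` value and numbers below are just an illustration, not the ones used in the assignment):

```python
def l2_reg_loss(weights, scale):
    # L2 regularization: scale times the sum of squared weights
    return scale * sum(w * w for w in weights)

weights = [0.5, -1.0, 2.0]
data_loss = 1.2
total_loss = data_loss + l2_reg_loss(weights, scale=0.01)
print(round(total_loss, 4))  # 1.2525
```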
Now, we need to calculate the loss and accuracy of the model.
```python
softmax_l_15 = tf.nn.softmax(fc_l_14, name="layer15")
accuracy = tf.metrics.accuracy(labels=tf.argmax(labels, 1), predictions=tf.argmax(softmax_l_15, 1))
loss = tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=fc_l_14)
loss = tf.reduce_mean(loss)
reg_loss = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
# the reg_loss collection will be empty if no regularizer was added
if len(reg_loss) > 0:
    reg_loss = tf.add_n(reg_loss)
    tf.summary.scalar('reg_loss', reg_loss)
    total_loss = loss + reg_loss
else:
    total_loss = loss
```
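To see what `tf.nn.softmax` and the cross-entropy loss compute for a single example, here is a pure-Python version (our own illustration, with made-up logits):

```python
import math

def softmax(logits):
    # subtract the max for numerical stability, then normalize exponentials
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(labels, probs):
    # -sum(label * log(prob)); for a one-hot label this picks out one term
    return -sum(l * math.log(p) for l, p in zip(labels, probs))

logits = [2.0, 1.0, 0.1]
labels = [1, 0, 0]          # one-hot: the true class is class 0
probs = softmax(logits)
print(round(sum(probs), 6))  # 1.0 -- probabilities sum to one
print(cross_entropy(labels, probs))
```

The loss is small when the probability assigned to the true class is close to 1, and grows without bound as it approaches 0, which is exactly what the training process then minimizes.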
The above code is just to help you understand how to build a model with your own structure. For training and evaluation, see `model_train.py` and `model_eval.py`, where you will see how to save a model to and restore it from a path, and how to write summaries and track the values of variables.
Here is an example of training and evaluation:
```bash
python model_train.py \
--train_dir='/mymodel/model' \
--data_dir='/mymodel/cifar10-data'
```
```bash
python model_eval.py \
--train_dir='/mymodel/model' \
--data_dir='/mymodel/cifar10-data'
```
* Please at least understand the following code:
* mymodel.py
* model_train.py
* model_eval.py
*Note:* The performance of this model may depend on the initialization; sometimes the accuracy gets stuck at 0.1 and does not increase. In that case, just kill the run and restart it. A solution to this problem may be to introduce other tricks, such as dropout, loss averaging and batch normalization, or to modify the regularization of the weights. Have a try.
Assuming that you have trained the model and saved the log files to /mymodel/model/log, for visualization, use
```bash
tensorboard --logdir='/mymodel/model/log'
```
then open the printed link in your browser.
The results in TensorBoard look like this:
![](https://github.com/SuZhuo/DeepLearning-course/raw/master/assignment4/images/graph.jpg)
![](https://github.com/suzhuo/DeepLearning-course/raw/master/assignment4/images/scale1.jpg)
![](https://github.com/suzhuo/DeepLearning-course/raw/master/assignment4/images/scale2.jpg)
![](https://github.com/suzhuo/DeepLearning-course/raw/master/assignment4/images/histograms.jpg)
Now, bonus:
Can you modify the `model_train.py` so that it can load the saved checkpoint and continue to train?
Any problem: email zhuo.su@oulu.fi