# In depth -- CNN convolutional neural networks: a TF CNN demonstration project for MNIST handwritten digit recognition


### This section walks through a demonstration project that recognizes MNIST handwritten digits with a TensorFlow CNN

GitHub code for this project: https://github.com/wandaoyi/tf_cnn_mnist_pro

## 5. TF CNN MNIST handwritten digit code demonstration

### (1) Preface

In the earlier series In depth -- neural network, we covered ANN and DNN, and now we have covered CNN. To put that knowledge into practice, we will build a convolutional neural network.

### (2) Define requirements

The requirement of the project is to recognize pictures of handwritten Arabic numerals from 0 to 9, for example the numbers on an invoice. (Handwritten digit recognition with convolutional networks dates back to 1989, and the LeNet family that grew out of that work, with LeNet-5 as its best-known member, was used by banks in the United States to read the handwritten numbers on checks.) In In depth -- neural network (7), the detailed DNN handwritten digit code demonstration, we used a DNN for this case study; now let's use LeNet-5.

Training a network is of course inseparable from data, so first download the data, which has been uploaded to Baidu cloud disk: link: https://pan.baidu.com/s/13OokGc0h3F5rGrxuSLYj9Q extraction code: qfj6.

### (3) Build the project

The project structure is as follows:

[Figure: project structure]

The saved model in the structure above comes from a casual 10-epoch training run that reached an accuracy of 0.984000. With the DNN we used before, 10 epochs only reached a little over 0.96, so the accuracy has increased by about 2 percentage points. Some people may find that unimpressive. Looked at the other way around, however, the error rate has dropped from roughly 4% to 1.6%, which is more than halved, so the gain is considerable. When working on projects in a company, being able to do the work is important, but being able to present the result well is important too.
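
The error-rate comparison above can be checked with a few lines of arithmetic. This is only a sketch: `relative_error_reduction` is a hypothetical helper written for this illustration, and 0.96 stands in for the "0.96 +" DNN accuracy quoted in the text.

```python
def relative_error_reduction(acc_before, acc_after):
    """Fraction of the old error rate that the new model removes."""
    err_before = 1.0 - acc_before
    err_after = 1.0 - acc_after
    return (err_before - err_after) / err_before


dnn_acc = 0.96    # approximate DNN accuracy after 10 epochs, as quoted above
cnn_acc = 0.984   # CNN accuracy after 10 epochs, as quoted above

reduction = relative_error_reduction(dnn_acc, cnn_acc)
print("error rate: {:.3f} -> {:.3f}".format(1 - dnn_acc, 1 - cnn_acc))
print("relative reduction: {:.0%}".format(reduction))
```

With these two numbers the error rate falls from about 4% to about 1.6%, a relative reduction of about 60%.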

### (4) Environment dependencies

Environment dependencies:

```shell
pip install numpy==1.16
pip install easydict
conda install tensorflow-gpu==1.13.1  # TF 2.0 is not recommended here; it has many pitfalls
```

Installing TensorFlow is explained in detail in my earlier blog post, Fragmented part -- Installation of tensorflow gpu version. If you have not installed it yet, see that post for the steps.

```markdown
# tf_cnn_mnist_pro
tf_cnn handwritten digit prediction, 2020-02-09
- Data link: https://pan.baidu.com/s/13OokGc0h3F5rGrxuSLYj9Q extraction code: qfj6

## Parameter settings
- Before training or predicting, set the parameters.
- Open config.py; the parameters and paths are set there.

## Model
- Model code: model_net.py
- The LeNet-5 network model is used here to extract features.

## Training the model
- Run cnn_mnist_train.py. It is simple to use: just right-click and run.
- The training results look like this:
- acc_train: 1.0
- epoch: 10, acc_test: 0.984000
- This is the result of a casual training run. For better results, train for more epochs.
- You can also add early stopping yourself; that is not a problem.

## Prediction
- Run cnn_mnist_test.py. It is simple to use: just right-click and run.
- After it runs, some prediction results are printed to the console.
- The prediction results look like this:
- predicted value: [7 2 1 0 4]
- True value: [7 2 1 0 4]

## TensorBoard log
- The advantage of the TensorBoard log is that it is real-time: you can watch the charts while training.
- In a cmd command window, enter the following command:
- tensorboard --logdir=G:\work_space\python_space\pro2018_space\wandao\mnist_pro\logs\mnist_log_train --host=localhost
- --logdir= is followed by the folder path of the log.
- --host= specifies the IP. If you leave it out, you can only use the computer's address, not localhost.
- Open the TensorBoard log in Google Chrome: http://localhost:6006/

- Model acc
![image](./docs/images/acc.png)
- Model structure
![image](./docs/images/graphs.png)
```

The files below contain the code, with explanatory comments inline.

### (5) config.py

```python
#!/usr/bin/env python
# _*_ coding:utf-8 _*_
# ============================================
# @Time     : 2020/02/08 19:23
# @Author   : WanDaoYi
# @FileName : config.py
# ============================================

from easydict import EasyDict as edict
import os

__C = edict()

cfg = __C

# Common options
__C.COMMON = edict()
# Absolute path of this file's directory; convenient when running the project from a Windows command prompt
__C.COMMON.BASE_PATH = os.path.abspath(os.path.dirname(__file__))
# # Current working directory. Switch to this on Linux, otherwise an error is reported (it also works on Windows)
# __C.COMMON.BASE_PATH = os.getcwd()

__C.COMMON.DATA_PATH = os.path.join(__C.COMMON.BASE_PATH, "dataset")

# Shape the flat input is reshaped to: [batch, height, width, channels]
__C.COMMON.DATA_RESHAPE = [-1, 28, 28, 1]
# Size the image is resized to before entering the network
__C.COMMON.DATA_RESIZE = (32, 32)

# Training configuration
__C.TRAIN = edict()

# Learning rate
__C.TRAIN.LEARNING_RATE = 0.01
# batch_size
__C.TRAIN.BATCH_SIZE = 32
# Number of epochs
__C.TRAIN.N_EPOCH = 10

# Model save path; a relative path makes the project easy to move
__C.TRAIN.MODEL_SAVE_PATH = "./checkpoint/model_"
# Dropout keep probability: 0.7 means 70% of the nodes are kept
__C.TRAIN.KEEP_PROB_DROPOUT = 0.7

# Test configuration
__C.TEST = edict()

# Path of the saved model used for testing
__C.TEST.CKPT_MODEL_SAVE_PATH = "./checkpoint/model_acc=0.984000.ckpt-10"

# Log configuration
__C.LOG = edict()
# Log save path; "train" or "test" is appended later, for example mnist_log_train
__C.LOG.LOG_SAVE_PATH = "./logs/mnist_log_"

```
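
To see how these settings turn into the paths used later, here is a minimal sketch. It uses `types.SimpleNamespace` from the standard library as a stand-in for the EasyDict tree (an assumption made so the snippet runs without extra packages) and copies only a few of the fields. The `-10` suffix reflects how `tf.train.Saver.save` appends the `global_step` value to the checkpoint prefix.

```python
import os
from types import SimpleNamespace

# Stand-in for the EasyDict tree in config.py (only the fields used below).
cfg = SimpleNamespace(
    COMMON=SimpleNamespace(BASE_PATH=os.path.abspath(".")),
    TRAIN=SimpleNamespace(MODEL_SAVE_PATH="./checkpoint/model_", N_EPOCH=10),
    LOG=SimpleNamespace(LOG_SAVE_PATH="./logs/mnist_log_"),
)

# The dataset folder lives next to the project root.
data_path = os.path.join(cfg.COMMON.BASE_PATH, "dataset")

# The training and test scripts append a suffix to the shared log prefix.
train_log_dir = cfg.LOG.LOG_SAVE_PATH + "train"
test_log_dir = cfg.LOG.LOG_SAVE_PATH + "test"

# The checkpoint name embeds the test accuracy; the saver adds "-<global_step>".
ckpt_prefix = cfg.TRAIN.MODEL_SAVE_PATH + "acc={:.6f}".format(0.984) + ".ckpt"
ckpt_name = "{}-{}".format(ckpt_prefix, cfg.TRAIN.N_EPOCH)

print(train_log_dir)
print(test_log_dir)
print(ckpt_name)
```

The last line reproduces the value stored in `cfg.TEST.CKPT_MODEL_SAVE_PATH`, which is why that setting has to be updated by hand whenever a new training run ends with a different accuracy.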

### (6) common.py

```python
#!/usr/bin/env python
# _*_ coding:utf-8 _*_
# ============================================
# @Time     : 2020/02/08 19:26
# @Author   : WanDaoYi
# @FileName : common.py
# ============================================

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
from config import cfg
import numpy as np


class Common(object):

    def __init__(self):
        # Data path
        self.data_file_path = cfg.COMMON.DATA_PATH

        pass

    # Read the data.
    # NOTE: the "def" line and the loading call are missing from the source listing;
    # the method name read_data and the one_hot=True loader call are reconstructed assumptions.
    def read_data(self):
        mnist_data = input_data.read_data_sets(self.data_file_path, one_hot=True)
        train_image = mnist_data.train.images
        train_label = mnist_data.train.labels
        _, n_feature = train_image.shape
        _, n_label = train_label.shape

        return mnist_data, n_feature, n_label

    # Batch normalization
    def deal_bn(self, input_data, train_flag=True):
        bn_info = tf.layers.batch_normalization(input_data, beta_initializer=tf.zeros_initializer(),
                                                gamma_initializer=tf.ones_initializer(),
                                                moving_mean_initializer=tf.zeros_initializer(),
                                                moving_variance_initializer=tf.ones_initializer(),
                                                training=train_flag)
        return bn_info
        pass

    # Pooling
    # NOTE: the end of this signature and the strides/padding arguments are truncated in the source;
    # padding="VALID" and the default name are reconstructed assumptions.
    def deal_pool(self, input_data, ksize=(1, 2, 2, 1), strides=(1, 2, 2, 1),
                  padding="VALID", name="avg_pool"):
        pool_info = tf.nn.avg_pool(value=input_data, ksize=ksize,
                                   strides=strides, padding=padding,
                                   name=name)
        tf.summary.histogram('pooling', pool_info)
        return pool_info
        pass

    # Dropout
    def deal_dropout(self, hidden_layer, keep_prob):
        with tf.name_scope("dropout"):
            tf.summary.scalar('dropout_keep_probability', keep_prob)
            dropped = tf.nn.dropout(hidden_layer, keep_prob)
            tf.summary.histogram('dropped', dropped)
            return dropped
        pass

    # Record summary statistics of a parameter tensor
    def variable_summaries(self, param):
        with tf.name_scope('summaries'):
            mean = tf.reduce_mean(param)
            tf.summary.scalar('mean', mean)
            with tf.name_scope('stddev'):
                stddev = tf.sqrt(tf.reduce_mean(tf.square(param - mean)))
            tf.summary.scalar('stddev', stddev)
            tf.summary.scalar('max', tf.reduce_max(param))
            tf.summary.scalar('min', tf.reduce_min(param))
            tf.summary.histogram('histogram', param)

    # Fully connected layer
    def neural_layer(self, x, n_neuron, name="fc"):
        # Groups all the nodes of this layer; the name scope is optional
        with tf.name_scope(name=name):
            # Number of input features (second dimension of x)
            n_input = int(x.get_shape()[1])
            stddev = 2 / np.sqrt(n_input)

            # w in this layer is a 2-D array: each neuron has its own set of weights.
            # A truncated normal distribution yields smaller values than a regular normal
            # distribution, so no large weights appear and training stays slow and steady.
            # Using this standard deviation makes convergence faster.
            # w must be random rather than 0; otherwise the output is 0 and the updates carry little signal.
            with tf.name_scope("weights"):
                init_w = tf.truncated_normal((n_input, n_neuron), stddev=stddev)
                w = tf.Variable(init_w, name="weight")
                self.variable_summaries(w)

            with tf.name_scope("biases"):
                b = tf.Variable(tf.zeros([n_neuron]), name="bias")
                self.variable_summaries(b)
            with tf.name_scope("wx_plus_b"):
                z = tf.matmul(x, w) + b
                tf.summary.histogram('pre_activations', z)

            return z

    # Convolution followed by batch normalization
    # NOTE: the end of this signature and the strides/padding arguments are truncated in the source;
    # padding="VALID", train_flag=True and the default name are reconstructed assumptions.
    def conv2d(self, input_data, filter_shape, strides_shape=(1, 1, 1, 1),
               padding="VALID", train_flag=True, name="conv2d"):
        with tf.variable_scope(name):
            weight = tf.get_variable(name="weight", dtype=tf.float32,
                                     trainable=train_flag,
                                     shape=filter_shape,
                                     initializer=tf.random_normal_initializer(stddev=0.01))

            conv = tf.nn.conv2d(input=input_data, filter=weight,
                                strides=strides_shape, padding=padding)

            conv_2_bn = self.deal_bn(conv, train_flag=train_flag)

            return conv_2_bn
        pass
    pass
```
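
To make the pooling step concrete, here is a minimal pure-Python sketch of what a 2 x 2 average pool with stride 2 computes on a single channel. It is an illustration only, not the TensorFlow implementation; `avg_pool_2x2` and the sample feature map are made up for this example.

```python
def avg_pool_2x2(image):
    """Average-pool a 2-D list with a 2 x 2 window and stride 2 (no padding)."""
    out_h, out_w = len(image) // 2, len(image[0]) // 2
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the four values covered by the window at output cell (i, j).
            window = [image[2 * i + di][2 * j + dj] for di in (0, 1) for dj in (0, 1)]
            row.append(sum(window) / 4.0)
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [0, 2, 1, 3],
    [4, 6, 5, 7],
]
pooled = avg_pool_2x2(feature_map)
print(pooled)  # [[4.0, 5.0], [3.0, 4.0]]
```

Each output value is the mean of one non-overlapping 2 x 2 block, so the height and width are halved while the number of channels stays the same.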

### (7) Model code

```python
#!/usr/bin/env python
# _*_ coding:utf-8 _*_
# ============================================
# @Time     : 2020/02/08 22:26
# @Author   : WanDaoYi
# @FileName : model_net.py
# ============================================

import tensorflow as tf
from core.common import Common


class ModelNet(object):

    def __init__(self):
        self.common = Common()
        pass

    def lenet_5(self, input_data, n_label=10, keep_prob=1.0, train_flag=True):
        with tf.variable_scope("lenet-5"):
            conv_1 = self.common.conv2d(input_data, (5, 5, 1, 6), name="conv_1")
            tanh_1 = tf.nn.tanh(conv_1, name="tanh_1")
            avg_pool_1 = self.common.deal_pool(tanh_1, name="avg_pool_1")

            conv_2 = self.common.conv2d(avg_pool_1, (5, 5, 6, 16), name="conv_2")
            tanh_2 = tf.nn.tanh(conv_2, name="tanh_2")
            avg_pool_2 = self.common.deal_pool(tanh_2, name="avg_pool_2")

            conv_3 = self.common.conv2d(avg_pool_2, (5, 5, 16, 120), name="conv_3")
            tanh_3 = tf.nn.tanh(conv_3, name="tanh_3")

            reshape_data = tf.reshape(tanh_3, [-1, 120])

            dropout_1 = self.common.deal_dropout(reshape_data, keep_prob)

            fc_1 = self.common.neural_layer(dropout_1, 84, name="fc_1")
            tanh_4 = tf.nn.tanh(fc_1, name="tanh_4")

            dropout_2 = self.common.deal_dropout(tanh_4, keep_prob)

            fc_2 = self.common.neural_layer(dropout_2, n_label, name="fc_2")
            scale_2 = self.common.deal_bn(fc_2, train_flag=train_flag)
            result_info = tf.nn.softmax(scale_2, name="result_info")

            return result_info

    pass

```

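The model used here is LeNet-5; swapping in another model later is of course also fine. LeNet-5 expects a 32 x 32 input image: the 5 x 5 convolutions without padding and the 2 x 2 poolings shrink the side length from 32 to 28, 14, 10, 5 and finally 1, which is what allows the output of conv_3 to be reshaped to [-1, 120]. With a smaller input the shapes no longer fit and the model reports an error, so the 28 x 28 MNIST images are resized to 32 x 32.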
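
The size requirement can be traced with plain arithmetic. The sketch below assumes unpadded ("VALID") 5 x 5 convolutions with stride 1 and 2 x 2 pooling with stride 2; the padding argument is not visible in the listing above, so that is an assumption, and the helper names are hypothetical.

```python
def conv_out(size, kernel=5, stride=1):
    """Output side length of an unpadded ('VALID') convolution."""
    if size < kernel:
        raise ValueError("input side {} is smaller than kernel {}".format(size, kernel))
    return (size - kernel) // stride + 1


def pool_out(size, window=2, stride=2):
    """Output side length of an unpadded pooling layer."""
    return (size - window) // stride + 1


def lenet5_trace(size):
    """Side length after each of conv_1, pool_1, conv_2, pool_2, conv_3."""
    trace = [size]
    for layer in (conv_out, pool_out, conv_out, pool_out, conv_out):
        size = layer(size)
        trace.append(size)
    return trace


print(lenet5_trace(32))  # [32, 28, 14, 10, 5, 1]

try:
    lenet5_trace(28)
except ValueError as err:
    # 28 -> 24 -> 12 -> 8 -> 4, and a 5 x 5 kernel no longer fits on a 4 x 4 map
    print("28 x 28 input fails:", err)
```

A 32 x 32 input ends at a 1 x 1 map with 120 channels, which matches the reshape to `[-1, 120]`, while a raw 28 x 28 input runs out of pixels before the third convolution.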

### (8) Training code

```python
#!/usr/bin/env python
# _*_ coding:utf-8 _*_
# ============================================
# @Time     : 2020/02/08 19:24
# @Author   : WanDaoYi
# @FileName : cnn_mnist_train.py
# ============================================

from datetime import datetime
import tensorflow as tf
from config import cfg
from core.common import Common
from core.model_net import ModelNet


class CnnMnistTrain(object):

    def __init__(self):
        # Model save path
        self.model_save_path = cfg.TRAIN.MODEL_SAVE_PATH
        self.log_path = cfg.LOG.LOG_SAVE_PATH

        self.learning_rate = cfg.TRAIN.LEARNING_RATE
        self.batch_size = cfg.TRAIN.BATCH_SIZE
        self.n_epoch = cfg.TRAIN.N_EPOCH

        self.data_shape = cfg.COMMON.DATA_RESHAPE
        self.data_resize = cfg.COMMON.DATA_RESIZE

        self.common = Common()
        self.model_net = ModelNet()

        # Read the data. NOTE: this line is missing from the source listing;
        # it is reconstructed from the attributes used below.
        self.mnist_data, self.n_feature, self.n_label = self.common.read_data()

        # Build the graph
        with tf.name_scope(name="input_data"):
            self.x = tf.placeholder(dtype=tf.float32, shape=(None, self.n_feature), name="input_data")
            self.y = tf.placeholder(dtype=tf.float32, shape=(None, self.n_label), name="input_labels")

        with tf.name_scope(name="input_shape"):
            # Turn the 784-dimensional vectors back into images.
            # -1 is the number of incoming images, 28 x 28 is the height and width, 1 is the color channel
            image_shaped_input = tf.reshape(self.x, self.data_shape)
            # Resize the input image to the size required by the network
            image_resize = tf.image.resize_images(image_shaped_input, self.data_resize)
            tf.summary.image('input', image_resize, self.n_label)

        self.keep_prob_dropout = cfg.TRAIN.KEEP_PROB_DROPOUT
        self.keep_prob = tf.placeholder(tf.float32)

        # Output of the last layer of LeNet-5.
        # The keep_prob placeholder is passed in (the source passed the constant),
        # so that feed_dict can switch dropout off during evaluation.
        self.result_info = self.model_net.lenet_5(image_resize, n_label=self.n_label,
                                                  keep_prob=self.keep_prob)

        # Compute the loss
        with tf.name_scope(name="train_loss"):
            # Cross-entropy: sum over the class axis, then mean over the batch
            self.cross_entropy = tf.reduce_mean(-tf.reduce_sum(self.y * tf.log(self.result_info),
                                                               reduction_indices=[1]))
            tf.summary.scalar("train_loss", self.cross_entropy)
            pass

        with tf.name_scope(name="optimizer"):
            # NOTE: the optimizer definition is missing from the source listing;
            # Adam is an assumption made here so that the code runs.
            self.optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate)
            self.train_op = self.optimizer.minimize(self.cross_entropy)
            pass

        with tf.name_scope(name="accuracy"):
            self.correct_pred = tf.equal(tf.argmax(self.result_info, 1), tf.argmax(self.y, 1))
            self.acc = tf.reduce_mean(tf.cast(self.correct_pred, tf.float32))
            tf.summary.scalar("accuracy", self.acc)
            pass

        # Many tf.summary operations were defined above, and running them one by one is tedious,
        # so tf.summary.merge_all() collects all of them into one op for later execution
        self.merged = tf.summary.merge_all()

        self.sess = tf.InteractiveSession()
        # Saver for the trained model
        self.saver = tf.train.Saver()

        # Two tf.summary.FileWriter recorders with different subdirectories store the training and test logs.
        # The session graph sess.graph is passed to the training writer so that TensorBoard can display it
        self.train_writer = tf.summary.FileWriter(self.log_path + 'train', self.sess.graph)
        self.test_writer = tf.summary.FileWriter(self.log_path + 'test')

        pass

    # Build the feed dictionary
    def feed_dict(self, train_flag=True):
        # Training samples
        if train_flag:
            # Get the next batch of samples
            x_data, y_data = self.mnist_data.train.next_batch(self.batch_size)
            keep_prob = self.keep_prob_dropout
            pass
        # Validation samples
        else:
            x_data, y_data = self.mnist_data.test.images, self.mnist_data.test.labels
            keep_prob = 1.0
            pass
        return {self.x: x_data, self.y: y_data, self.keep_prob: keep_prob}
        pass

    def do_train(self):
        # Initialize the variables
        init = tf.global_variables_initializer()
        self.sess.run(init)

        test_acc = None
        for epoch in range(self.n_epoch):
            # Total number of training samples
            batch_number = self.mnist_data.train.num_examples
            # Number of batches per epoch
            size_number = int(batch_number / self.batch_size)
            for number in range(size_number):
                summary, _ = self.sess.run([self.merged, self.train_op], feed_dict=self.feed_dict())

                # Global step number
                i = epoch * size_number + number + 1
                # Write the training summary (this call is not in the source listing)
                self.train_writer.add_summary(summary, i)

                if number == size_number - 1:
                    # Get the next batch of samples
                    x_batch, y_batch = self.mnist_data.train.next_batch(self.batch_size)
                    acc_train = self.acc.eval(feed_dict={self.x: x_batch, self.y: y_batch,
                                                         self.keep_prob: 1.0})
                    print("acc_train: {}".format(acc_train))

            # Evaluation, method two (either acc.eval or sess.run can be used)
            test_summary, acc_test = self.sess.run([self.merged, self.acc], feed_dict=self.feed_dict(False))
            print("epoch: {}, acc_test: {}".format(epoch + 1, acc_test))
            # Write the test summary (this call is not in the source listing)
            self.test_writer.add_summary(test_summary, epoch + 1)

            test_acc = acc_test
            pass

        save_path = self.model_save_path + "acc={:.6f}".format(test_acc) + ".ckpt"
        # Save the model
        self.saver.save(self.sess, save_path, global_step=self.n_epoch)

        self.train_writer.close()
        self.test_writer.close()

        pass


if __name__ == "__main__":

    # Start time
    start_time = datetime.now()
    print("start time: {}".format(start_time))

    demo = CnnMnistTrain()
    demo.do_train()

    # End time
    end_time = datetime.now()
    print("End time: {}, training time elapsed: {}".format(end_time, end_time - start_time))
```
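
The loss in the training code is the categorical cross-entropy: for each sample, sum `-y * log(p)` over the classes, then average over the batch. A minimal pure-Python sketch of the same arithmetic follows. The `eps` clamp is an addition made here to avoid `log(0)` and is not in the original code; the labels and probabilities are made up for illustration.

```python
import math


def cross_entropy(labels, probs, eps=1e-10):
    """Mean over the batch of -sum(y * log(p)) taken over the class axis."""
    total = 0.0
    for y_row, p_row in zip(labels, probs):
        # eps keeps log() finite when a predicted probability is exactly 0
        total += -sum(y * math.log(max(p, eps)) for y, p in zip(y_row, p_row))
    return total / len(labels)


# Two samples, three classes, one-hot labels (made-up numbers)
labels = [[0, 1, 0], [1, 0, 0]]
probs = [[0.1, 0.8, 0.1], [0.5, 0.25, 0.25]]

loss = cross_entropy(labels, probs)
print("loss: {:.4f}".format(loss))
```

Only the probability assigned to the true class contributes to each sample's term, so here the loss is the average of `-log(0.8)` and `-log(0.5)`.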

The training code is only a rough training run: there is no grid search and no fine-tuning. There is no early stopping either; if you are interested, you can add it yourself.
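
As one possible starting point for early stopping, here is a minimal framework-independent sketch. The `EarlyStopping` class, its `patience` and `min_delta` parameters, and the accuracy curve are all hypothetical and are not part of the original project.

```python
class EarlyStopping(object):
    """Stop when the monitored accuracy has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_acc = None
        self.best_epoch = None
        self.bad_epochs = 0

    def update(self, epoch, acc):
        """Record one epoch; return True when training should stop."""
        if self.best_acc is None or acc > self.best_acc + self.min_delta:
            # New best result: remember it and reset the counter
            self.best_acc = acc
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up test accuracies per epoch, for illustration only
fake_acc_per_epoch = [0.950, 0.970, 0.981, 0.984, 0.983, 0.984, 0.982, 0.985]

stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, acc in enumerate(fake_acc_per_epoch, start=1):
    if stopper.update(epoch, acc):
        stopped_at = epoch
        break

print("stopped at epoch {}, best acc {} at epoch {}".format(
    stopped_at, stopper.best_acc, stopper.best_epoch))
```

In `do_train`, the same idea would mean calling `update(epoch + 1, acc_test)` after each evaluation, saving a checkpoint whenever the best accuracy improves, and breaking out of the epoch loop when it returns True.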

### (9) Test code

```python
#!/usr/bin/env python
# _*_ coding:utf-8 _*_
# ============================================
# @Time     : 2020/02/08 19:24
# @Author   : WanDaoYi
# @FileName : cnn_mnist_test.py
# ============================================

from datetime import datetime
import tensorflow as tf
import numpy as np
from config import cfg
from core.common import Common
from core.model_net import ModelNet


class CnnMnistTest(object):

    def __init__(self):
        self.common = Common()
        self.model_net = ModelNet()

        # Read the data. NOTE: this line is missing from the source listing;
        # it is reconstructed from the attributes used below.
        self.mnist_data, self.n_feature, self.n_label = self.common.read_data()

        # ckpt model
        self.test_ckpt_model = cfg.TEST.CKPT_MODEL_SAVE_PATH
        print("test_ckpt_model: {}".format(self.test_ckpt_model))

        # tf.reset_default_graph()
        # Build the graph
        with tf.name_scope(name="input"):
            self.x = tf.placeholder(dtype=tf.float32, shape=(None, self.n_feature), name="input_data")
            self.y = tf.placeholder(dtype=tf.float32, shape=(None, self.n_label), name="input_labels")

        self.data_shape = cfg.COMMON.DATA_RESHAPE
        self.data_resize = cfg.COMMON.DATA_RESIZE
        with tf.name_scope(name="input_shape"):
            # Turn the 784-dimensional vectors back into images.
            # -1 is the number of incoming images, 28 x 28 is the height and width, 1 is the color channel
            self.image_shaped_input = tf.reshape(self.x, self.data_shape)
            # Resize the input image to the 32 x 32 size required by the network
            self.image_resize = tf.image.resize_images(self.image_shaped_input, self.data_resize)

        # Output of the last layer of LeNet-5
        self.result_info = self.model_net.lenet_5(self.image_resize, n_label=self.n_label)

        pass

    # Prediction
    def do_ckpt_test(self):

        saver = tf.train.Saver()

        with tf.Session() as sess:
            saver.restore(sess, self.test_ckpt_model)

            # Predict
            output = self.result_info.eval(feed_dict={self.x: self.mnist_data.test.images})

            # Convert the one-hot style predictions to digits
            y_pred = np.argmax(output, axis=1)
            print("predicted value: {}".format(y_pred[: 5]))

            # True values
            y_true = np.argmax(self.mnist_data.test.labels, axis=1)
            print("True value: {}".format(y_true[: 5]))
            pass

        pass


if __name__ == "__main__":
    # Start time
    start_time = datetime.now()
    print("start time: {}".format(start_time))

    demo = CnnMnistTest()
    # Test with the ckpt model
    demo.do_ckpt_test()

    # End time
    end_time = datetime.now()
    print("End time: {}, test time elapsed: {}".format(end_time, end_time - start_time))
```
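
The conversion from network output to digits in the test script is an argmax over the class axis. Here is a minimal pure-Python sketch of that step, with an accuracy count added for illustration. The helper names and the three-class sample data are made up; the real script uses `np.argmax` on ten-class outputs.

```python
def argmax(row):
    """Index of the largest value; the first index wins on ties."""
    return max(range(len(row)), key=lambda k: row[k])


def decode(rows):
    """Convert rows of class scores (or one-hot labels) to class indices."""
    return [argmax(row) for row in rows]


def accuracy(pred, true):
    """Fraction of positions where prediction and label agree."""
    hits = sum(1 for p, t in zip(pred, true) if p == t)
    return hits / float(len(true))


# Made-up softmax outputs and one-hot labels for three samples and three classes
outputs = [[0.05, 0.05, 0.90], [0.70, 0.20, 0.10], [0.20, 0.50, 0.30]]
labels = [[0, 0, 1], [1, 0, 0], [0, 0, 1]]

y_pred = decode(outputs)
y_true = decode(labels)
print("predicted value: {}".format(y_pred))  # [2, 0, 1]
print("True value: {}".format(y_true))       # [2, 0, 2]
print("accuracy: {:.3f}".format(accuracy(y_pred, y_true)))
```

Comparing the two decoded lists element by element gives the same accuracy measure that the training script computes with `tf.equal` and `tf.reduce_mean`.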

### (10) View the log results

[Figure: accuracy (acc) curve]

[Figure: graphs view]

In the TensorBoard log, you can double-click lenet-5. The model structure is as follows:

[Figure: LeNet-5 model structure]

After opening the graphs in the log, you can zoom in to see a clear image.

From training both the DNN and the CNN, it is easy to see that the CNN does better at predicting from image data: after 10 epochs the DNN reaches only about 96% accuracy, while the CNN reaches about 98%. Of course, this is only early-stage training, so it is not conclusive on its own. Still, given enough training, the CNN tends to come out somewhat ahead, which is why image processing mostly uses CNNs rather than plain DNNs.