Implementation of Generative Adversarial Networks (GAN) and a Variant (DCGAN), Based on TensorFlow

Preface

To understand what a generative adversarial network is, first consider supervised and unsupervised learning:
Supervised learning: machine learning based on large labeled training and test sets. For example, an image classifier requires a series of images and their corresponding labels ("cat", "dog", ...). The MNIST handwritten digit set is a labeled training and test set.
Unsupervised learning: solves pattern-recognition problems from training samples whose categories are unknown (unlabeled). This is the biggest difference from supervised learning: no large labeling effort is needed, since the model learns the structure of the training set on its own. The k-means clustering algorithm is one of the better-known examples.

Concepts - GAN

Ian J. Goodfellow et al. proposed a new framework for estimating generative models via an adversarial process in the 2014 paper Generative Adversarial Networks. Two models are trained simultaneously in the framework: a generative model G, which captures the data distribution, and a discriminative model D, which estimates the probability that a sample came from the training data rather than from G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. It can be shown that in the space of arbitrary functions G and D, a unique solution exists in which G recovers the training data distribution and D equals 1/2 everywhere. When G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. No Markov chains or unrolled approximate inference networks are needed during either training or sample generation.
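
Formally, the two-player minimax game described in the paper has the value function

$$\min_G \max_D V(D,G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))],$$

where D is trained to assign high probability to training samples and low probability to samples from G, while G is trained to make D misclassify its samples.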

This framework is the Generative Adversarial Network (GAN). Put simply, a GAN contains two models, a generator and a discriminator, which are trained to compete with each other. At the end of training we obtain a generator that can produce near-real images and a discriminator that can tell real images from generated ones.

GAN Variant - DCGAN

DCGAN is a substantial improvement on GAN, mainly in the network architecture; it greatly improves the stability of GAN training and the quality of the generated results.
The original GAN used multilayer perceptrons as the basic model of the generator and discriminator (in theory, any model that can perform the task will do), while DCGAN uses convolutional neural networks for both. The convolutional networks used here differ from the usual ones in the following ways:

1. All pooling layers are removed; up-sampling in G is done with transposed convolutions, and down-sampling in D with strided convolutions (see the shape check after this list).
2. Batch normalization is used in both the generator and the discriminator;
3. The fully connected hidden layers are removed;
4. G uses ReLU as its activation function, except for tanh at the output;
5. The discriminator uses LeakyReLU as its activation function.
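
As a minimal shape check of point 1 (an illustrative sketch, not code from this post), the snippet below shows that a strided Conv2D halves the spatial size on the discriminator side, while a Conv2DTranspose with the same stride doubles it back on the generator side:

import numpy as np
from tensorflow.keras.layers import Conv2D, Conv2DTranspose

x = np.zeros((1, 28, 28, 1), dtype='float32')
down = Conv2D(8, (5, 5), strides=2, padding='same')(x)            #strided convolution halves the size: (1, 14, 14, 8)
up = Conv2DTranspose(1, (5, 5), strides=2, padding='same')(down)  #transposed convolution doubles it: (1, 28, 28, 1)
print(down.shape, up.shape)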

Training process


As shown in the figure, when training the discriminator, the D model is trained directly, with different labels for the real and the generated data. When training the generator, training runs through the whole adversarial network, but the D model is frozen at that point, and the labels are set to 1, because we want the discriminator to classify the generated images as real.
The two models are not trained by fully training D first and then training G separately. Instead they alternate: D trains on a batch, then G trains on a batch, then D again, and so on.

This is only a brief overview of the training process; for details see the post Detailed Explanation of Generative Adversarial Networks (GAN) linked at the end.

Code Implementation

GAN

First, import the required libraries

import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential,Model
from tensorflow.keras.layers import Input,Dense,Dropout,Activation,Flatten
from tensorflow.keras.optimizers import Adam,RMSprop
import numpy as np
import matplotlib.pyplot as plt

Then, import the data and preprocess it, using the MNIST dataset directly here

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000,784)
x_test = x_test.reshape(10000,784)
x_train = x_train.astype('float32')/255
x_test = x_test.astype('float32')/255
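
A quick optional sanity check (an addition, not in the original post) that the data has the expected shape and value range:

print(x_train.shape, x_train.min(), x_train.max())   #(60000, 784) 0.0 1.0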

Set the dimension of the noise vector

z_dim = 100

Set up the optimizer, using the parameters given in the DCGAN paper

adam = Adam(lr=0.0002,beta_1=0.5)

Build Generator Model

g = Sequential()
g.add(Dense(256,input_dim=z_dim,activation='relu'))
g.add(Dense(512,activation='relu'))
g.add(Dense(1024,activation='relu'))
g.add(Dense(2048,activation='relu'))
g.add(Dense(784,activation='sigmoid'))
g.compile(loss='binary_crossentropy',optimizer=adam,metrics=['accuracy'])
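
As a small optional check (again, not in the original post), the generator should map one noise vector to one flattened 28×28 image:

sample = g.predict(np.random.normal(0, 1, size=(1, z_dim)))
print(sample.shape)   #(1, 784)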

Build the discriminator model. D is compiled first, and only afterwards is d.trainable set to False; the reason is explained after the combined model below.

d = Sequential()
d.add(Dense(1024,input_dim=784,activation='relu'))
d.add(Dropout(0.3))
d.add(Dense(512,activation='relu'))
d.add(Dropout(0.3))
d.add(Dense(256,activation='relu'))
d.add(Dropout(0.3))
d.add(Dense(64,activation='relu'))
d.add(Dropout(0.3))
d.add(Dense(1,activation='sigmoid'))
d.compile(loss='binary_crossentropy',optimizer=adam,metrics=['accuracy'])
d.trainable = False

Connect the two models to form the adversarial network

inputs = Input(shape=(z_dim,))
hidden = g(inputs)
output = d(hidden)
gan = Model(inputs,output)
gan.compile(loss='binary_crossentropy',optimizer=adam,metrics=['accuracy'])
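
A detail that is easy to get wrong: Keras reads the trainable flag when a model is compiled. d was compiled while still trainable, so d.train_on_batch in the training loop below updates D's weights; gan was compiled after d.trainable = False, so training the stacked model updates only G. A small optional check of this (not in the original post):

d.trainable = True
print(len(d.trainable_weights))    #non-empty: these weights move when d.train_on_batch runs
d.trainable = False
print(len(gan.trainable_weights))  #only the generator's weights remain trainable in the stacked model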

Write two helper functions: one to plot the loss curves and one to display images produced by the generator

def plot_loss(losses):
    d_loss = [v[0] for v in losses["D"]]
    g_loss = [v[0] for v in losses["G"]]

    plt.figure(figsize=(10,8))
    plt.plot(d_loss,label="Discriminator_loss")
    plt.plot(g_loss,label="Generator_loss")
    plt.legend()
    plt.show()

def plot_generatored(n_ex=10,dim=(1,10),figsize=(12,2)):
    noise = np.random.normal(0,1,size=(n_ex,z_dim))
    generatored_images = g.predict(noise)
    generatored_images = generatored_images.reshape(n_ex,28,28)

    plt.figure(figsize = figsize)
    for i in range(generatored_images.shape[0]):
        plt.subplot(dim[0],dim[1],i+1)
        plt.imshow(generatored_images[i],interpolation='nearest',cmap='gray_r')
        plt.axis('off')
    plt.tight_layout()
    plt.show()

Set up a dictionary with two lists to record the loss values

losses = {"D":[],"G":[]}

Here is the train function

def train(epochs=1,plt_frq=1,BATCH_SIZE=128):
    batchCount = int(x_train.shape[0]/BATCH_SIZE)
    print("Epochs:",epochs)
    print("Batch size:",BATCH_SIZE)
    print("Batches per epoch:",batchCount)

    for e in range(1,epochs+1):
        if e == 1 or e%plt_frq == 0:
            print('-'*15,'Epoch %d' %e,'-'*15)
        for _ in range(batchCount):
            image_batch = x_train[np.random.randint(0,x_train.shape[0],size=BATCH_SIZE)]
            noise = np.random.normal(0,1,size=(BATCH_SIZE,z_dim))
            generatored_images = g.predict(noise)
            #train d
            #the training batch is half real images, half generated images
            x = np.concatenate((image_batch, generatored_images))
            #labels: 0.9 for real (one-sided label smoothing), 0 for generated
            y = np.zeros(2*BATCH_SIZE)
            y[:BATCH_SIZE] = 0.9

            d.trainable = True
            d_loss = d.train_on_batch(x,y)

            #train g
            #set up data set
            noise = np.random.normal(0,1,size=(BATCH_SIZE,z_dim))
            y2 = np.ones(BATCH_SIZE)
            d.trainable = False
            g_loss = gan.train_on_batch(noise,y2)
        losses["D"].append(d_loss)
        losses["G"].append(g_loss)
        if e==1 or e%plt_frq==0:
            plot_generatored()
    plot_loss(losses)

Then you can start training

train(200,20,128)

Since my earlier training data was lost, the GAN results are not shown here.

DCGAN

I do not reproduce the paper's results exactly here; I simply take the code above, apply the changes listed in the paper, and adapt the convolutional network I had used before.
The first step is again to import the required libraries, this time including the convolution-related layers:

import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential,Model
from tensorflow.keras.layers import Input,Dense,Dropout,Activation,Flatten,Conv2D,Conv2DTranspose,BatchNormalization,LeakyReLU
from tensorflow.keras.optimizers import Adam,RMSprop
import numpy as np
import matplotlib.pyplot as plt

Then import the data and preprocess it. Because this generator ends with a tanh activation, the images are scaled to [-1, 1], matching the practice in the DCGAN paper:

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000,784)
x_test = x_test.reshape(10000,784)
#scale to [-1,1] to match the generator's tanh output
x_train = (x_train.astype('float32')-127.5)/127.5
x_test = (x_test.astype('float32')-127.5)/127.5

The noise dimension here differs from the one above. When I studied autoencoders earlier, I found that a 4×4×8 latent code reconstructs the handwritten digit images in this dataset very well, so the same shape is used here.

z_dim = (4,4,8)

Set up the optimizer, same as above

adam = Adam(lr=0.0002,beta_1=0.5)

Build Generator Model

g = Sequential()
g.add(Conv2DTranspose(8,(5,5),activation='relu',input_shape=z_dim))          #output: (8,8,8)
g.add(BatchNormalization())
g.add(Conv2DTranspose(8,(5,5),activation='relu',strides=2,padding='same'))   #output: (16,16,8)
g.add(BatchNormalization())
g.add(Conv2DTranspose(32,(5,5),activation='relu'))                           #output: (20,20,32)
g.add(BatchNormalization())
g.add(Conv2DTranspose(64,(5,5),activation='relu'))                           #output: (24,24,64)
g.add(Conv2DTranspose(1,(5,5),activation='tanh'))                            #output: (28,28,1)
g.compile(loss='binary_crossentropy',optimizer=adam,metrics=['accuracy'])
g.summary()

The summary output above shows the structure of the G model.
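
To see why this stack ends at 28×28: with 'valid' padding, a transposed convolution with kernel size $k$ and stride $s$ maps an input of width $n$ to $(n-1)s + k$; with 'same' padding the output width is $ns$. The widths therefore grow as $4 \to 8 \to 16 \to 20 \to 24 \to 28$, giving the required 28×28×1 image.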

Build Discriminator

d = Sequential()
d.add(Conv2D(32, (5,5),input_shape=(28, 28, 1)))
d.add(LeakyReLU(alpha=0.2))
d.add(BatchNormalization())

d.add(Conv2D(64, (5,5)))
d.add(LeakyReLU(alpha=0.2))

d.add(Flatten())

d.add(Dense(128))
d.add(LeakyReLU(alpha=0.2))

d.add(BatchNormalization())
d.add(Dense(1, activation='sigmoid'))   #a single-unit softmax would always output 1, so sigmoid is used for real/fake
d.compile(loss='binary_crossentropy', optimizer=adam, metrics = ['accuracy'])   #binary crossentropy matches the single sigmoid output
d.trainable=False
d.summary()

The summary output above shows the structure of the discriminator.
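
Here the sizes shrink instead: a 'valid' convolution maps width $n$ to $n - k + 1$, so $28 \to 24 \to 20$; after flattening, $20 \times 20 \times 64 = 25600$ units feed the Dense(128) layer.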

Similarly, combine the two models into the adversarial network

inputs = Input(shape=z_dim)
hidden = g(inputs)
output = d(hidden)
gan = Model(inputs,output)
gan.compile(loss='binary_crossentropy',optimizer=adam,metrics=['accuracy'])
gan.summary()

The two functions for plotting loss values and generated images are the same as above, apart from the noise shape

def plot_loss(losses):
    d_loss = [v[0] for v in losses["D"]]
    g_loss = [v[0] for v in losses["G"]]

    plt.figure(figsize=(10,8))
    plt.plot(d_loss,label="Discriminator_loss")
    plt.plot(g_loss,label="Generator_loss")
    plt.legend()
    plt.show()

def plot_generatored(n_ex=10,dim=(1,10),figsize=(12,2)):

    noise = np.random.normal(0,1,size=(n_ex,z_dim[0],z_dim[1],z_dim[2]))
    generatored_images = g.predict(noise)
    generatored_images = generatored_images.reshape(n_ex,28,28)

    plt.figure(figsize = figsize)
    for i in range(generatored_images.shape[0]):
        plt.subplot(dim[0],dim[1],i+1)
        plt.imshow(generatored_images[i],interpolation='nearest',cmap='gray_r')
        plt.axis('off')
    plt.tight_layout()
    plt.show()

Set up a dictionary, as above

losses = {"D":[],"G":[]}

The train function is the same as above, apart from the noise shape and reshaping the image batch

def train(epochs=1,plt_frq=1,BATCH_SIZE=128):
    batchCount = int(x_train.shape[0]/BATCH_SIZE)
    print("Epochs:",epochs)
    print("Batch size:",BATCH_SIZE)
    print("Batches per epoch:",batchCount)

    for e in range(1,epochs+1):
        if e == 1 or e%plt_frq == 0:
            print('-'*15,'Epoch %d' %e,'-'*15)
        for _ in range(batchCount):
            image_batch = x_train[np.random.randint(0,x_train.shape[0],size=BATCH_SIZE)]
            noise = np.random.normal(0,1,size=(BATCH_SIZE,z_dim[0],z_dim[1],z_dim[2]))
            generatored_images = g.predict(noise)
            #train d
            #the training batch is half real images, half generated images
            x = np.concatenate((np.reshape(image_batch,(-1,28,28,1)), generatored_images))
            #labels: 0.9 for real (one-sided label smoothing), 0 for generated
            y = np.zeros(2*BATCH_SIZE)
            y[:BATCH_SIZE] = 0.9

            d.trainable = True
            d_loss = d.train_on_batch(x,y)

            #train g
            #set up data set
            noise = np.random.normal(0,1,size=(BATCH_SIZE,z_dim[0],z_dim[1],z_dim[2]))
            y2 = np.ones(BATCH_SIZE)
            d.trainable = False
            g_loss = gan.train_on_batch(noise,y2)
        losses["D"].append(d_loss)
        losses["G"].append(g_loss)
        if e==1 or e%plt_frq==0:
            plot_generatored()
    plot_loss(losses)

Then you can start training

train(100,10,128)

Result Display

This is the first image generated:

It is so messy that nothing can be made out; perhaps my network model is not set up well.
After 10 epochs

You can see something now, but it's still messy

After 100 epochs

Now you can basically make out the digits; 9, 7, 6 and 5 are quite recognizable.

Summary

When I first trained DCGAN, I directly used an ordinary convolutional network, including pooling layers and so on, and the generator used plain up-sampling instead of transposed convolution. Training took a very long time, my computer crashed, and the results were very poor.
All of the code has been published in the Whale community, where it can be run directly.
While writing this post I referred to many other people's blogs; these, together with the original GAN and DCGAN papers, are listed below. The figures in this post also come from these sources.
Detailed Explanation of Generative Adversarial Networks (GAN)
GAN (Generative Adversarial Network)
Papers:
GAN
DCGAN
