Python + DCGAN: generate captchas, train a CNN model, and test its accuracy

Preface

  • Long time no see, friends. This article has been "in the works" for quite a while; I finally found some time to sit down and write it properly.
  • While reading books on deep learning I wrote the earlier post "Python generates captchas → processes captchas → builds and trains a CNN model → tests model accuracy → recognizes captchas". At that time I was particularly interested in CNN and GAN models and always wanted to build a practical application, so the protagonist of this post is the GAN.
  • Before writing it, I read many articles about GANs and found that there are many variants. This post mainly uses DCGAN, the deep convolutional generative adversarial network.
  • My understanding of GANs is still only skin-deep, so if there are any mistakes in this article, please point them out~

Abstract

In this post, 51,000 captcha images are generated: 50,000 as the training set and 1,000 as the test set. First, the DCGAN model is trained on the training set, and then captchas are generated from the trained DCGAN. A generated captcha must meet two requirements: the discriminator's score must be greater than 0.95, and its 4 characters must not appear in the training set. Finally, these generated captchas are used to train a CNN model, and the 1,000 real test images are used to measure its accuracy. The final accuracy is about 80%.

The problem this article addresses

Use a GAN to generate captcha images for specified characters: the generator must not only produce realistic captchas but produce captchas for the specified characters, and likewise the discriminator must not only tell real from fake but also take the captcha's characters into account.

1. Generate real captchas

This step is omitted because it is essentially the same as in my previous post; the only difference is that many more captchas are generated, i.e. self.train_num is changed to 50000.
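
For reference, here is a minimal sketch of what that step might look like, assuming the captcha library's ImageCaptcha is used to draw 132 × 40 images and each file is named after its 4 characters (which is what the dataset class below expects); whether this matches the previous post's generator exactly is an assumption, the actual code is in that post.

import os
import random
import string
from captcha.image import ImageCaptcha  # pip install captcha

charset = string.digits + string.ascii_uppercase
train_num = 50000                 # size of the training set; use 1000 for the test set
out_dir = 'images/train/'
os.makedirs(out_dir, exist_ok=True)

image = ImageCaptcha(width=132, height=40)
for _ in range(train_num):
    text = ''.join(random.choices(charset, k=4))
    # the file name carries the label: the dataset class below reads image_path[-8:-4]
    # note: duplicate 4-character strings overwrite each other, so slightly fewer unique files may result
    image.write(text, os.path.join(out_dir, text + '.png'))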

2. Define the DCGAN model

Generator
  • Input: a batch_size × 100 × 1 × 1 noise tensor + the one-hot encoding vectors of 4 random characters
  • Intermediate processing: the noise and the label are each upsampled with transposed convolutions into feature maps of the same spatial size, then concatenated and fed through the rest of the generator
  • Output: a batch of 132 × 40 captcha images matching the input 4 characters (internally the networks work on images resized to 64 × 64)
Discriminator
  • Input: a batch of real or generated 132 × 40 captcha images + the one-hot encoding vectors of their characters
  • Intermediate processing: the image and the label are each downsampled with convolutions into feature maps of the same spatial size, then concatenated and fed through the rest of the discriminator
  • Output: for each image, the probability (0 ~ 1) that it is a real captcha, i.e. the score; the closer the score is to 1, the more realistic the image
Code
import string
import torch
import torch.nn as nn

word2num = {v:k for k,v in enumerate(list(string.digits+string.ascii_uppercase))}  # char -> index, 36 classes
captcha_number = 4          # characters per captcha
nc = 3                      # image channels
image_size = 64             # images are resized to 64 x 64 for the GAN
latent_space_size = 100     # dimension of the noise vector z
ngf = 128                   # base number of generator feature maps
ndf = 128                   # base number of discriminator feature maps

class Generator(nn.Module):
    '''generator '''
    def __init__(self):
        super(Generator,self).__init__()
        self.deconv_x = nn.Sequential(nn.ConvTranspose2d(latent_space_size, ngf//2, 4, 1, 0),nn.ReLU(True))
        self.deconv_y = nn.Sequential(nn.ConvTranspose2d(captcha_number*len(word2num), ngf//2, 4, 1, 0),nn.ReLU(True))
        self.model = nn.Sequential(
            nn.ConvTranspose2d(ngf,ngf*8,4,1,0,bias=False),
            nn.BatchNorm2d(ngf*8),
            nn.ReLU(True),

            nn.ConvTranspose2d(ngf*8,ngf*4,4,1,1,bias=False),
            nn.BatchNorm2d(ngf*4),
            nn.ReLU(True),

            nn.ConvTranspose2d(ngf*4,ngf*2,4,2,1,bias=False),
            nn.BatchNorm2d(ngf*2),
            nn.ReLU(True),

            nn.ConvTranspose2d(ngf*2,ngf,4,2,1,bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),

            nn.ConvTranspose2d(ngf,nc,4,2,1,bias=False),
            nn.Tanh())
    def forward(self,x,y):
        x = self.deconv_x(x)
        y = self.deconv_y(y.unsqueeze(2).unsqueeze(3))
        out = torch.cat([x,y],1)
        output = self.model(out)
        return output
    
class Discriminator(nn.Module):
    '''Discriminator'''
    def __init__(self):
        super(Discriminator,self).__init__()
        self.conv1_x = nn.Sequential(nn.Conv2d(nc, ndf//2, 4, 2, 1),nn.LeakyReLU(0.2,inplace=True))
        self.conv1_y = nn.Sequential(nn.Conv2d(captcha_number*len(word2num), ndf//2, 4, 2, 1),nn.LeakyReLU(0.2,inplace=True))
        self.model = nn.Sequential(
            nn.Conv2d(ndf,ndf,4,2,1,bias=False),
            nn.LeakyReLU(0.2,inplace=True),

            nn.Conv2d(ndf,ndf*2,4,2,1,bias=False),
            nn.BatchNorm2d(ndf*2),
            nn.LeakyReLU(0.2,inplace=True),

            nn.Conv2d(ndf*2,ndf*4,4,2,1,bias=False),
            nn.BatchNorm2d(ndf*4),
            nn.LeakyReLU(0.2,inplace=True),

            nn.Conv2d(ndf*4,ndf*8,4,2,1,bias=False),
            nn.BatchNorm2d(ndf*8),
            nn.LeakyReLU(0.2,inplace=True),

            nn.Conv2d(ndf*8,1,2,1,0,bias=False),
            nn.Sigmoid())
        
    def forward(self,x,y):
        x = self.conv1_x(x)
        y = self.conv1_y(y.view(x.size(0), captcha_number*len(word2num), 1, 1).expand(-1,-1,image_size,image_size))
        out = torch.cat([x,y],1)
        output = self.model(out)
        return output
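
As a quick sanity check, the shapes described above can be verified with a dummy forward pass (a minimal sketch, assuming the two classes above have just been defined):

g, d = Generator(), Discriminator()
z = torch.randn(2, latent_space_size, 1, 1)          # a batch of 2 noise vectors
y = torch.zeros(2, captcha_number * len(word2num))   # one-hot label placeholders (all zeros here)
fake = g(z, y)
print(fake.shape)        # torch.Size([2, 3, 64, 64])  -> nc x image_size x image_size
print(d(fake, y).shape)  # torch.Size([2, 1, 1, 1])    -> one realness score per image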

3. Train the DCGAN model

Parameter definition
  1. Both the generator and the discriminator use the Adam optimizer, with learning_rate set to 1e-05 and beta1 set to 0.5
  2. BCELoss is used as the loss function
  3. Training epochs: 100
  4. batch_size: 64
Adversarial training process
  • Discriminator
  1. Receive a batch of real captcha images + their 4-character labels, with the loss target y_true set to all 1s; back-propagate and step the optimizer;
  2. The generator receives random noise z + 4-character labels; detach() is called on the resulting fake images so that the gradient does not flow back into the generator;
  3. Receive the fake captcha images produced by the generator + the corresponding 4-character labels, with the loss target y_true set to all 0s; back-propagate and step the optimizer.
  • Generator
  1. Receive random noise z + 4-character labels, with the loss target y_true set to all 1s; back-propagate and step the optimizer.
Code
import numpy as np
import string
import os
from PIL import Image
import torch
import torch.nn as nn
from torch.optim import Adam
from torch.autograd import Variable
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision.utils import save_image, make_grid
import matplotlib.pyplot as plt
%matplotlib inline

device = 'cuda' if torch.cuda.is_available() else 'cpu'
image_path = 'images/train/'
latent_space_size = 100
nc = 3              # channels of the image
image_size = 64     # images are resized to 64 x 64
captcha_number = 4
word2num = {v:k for k,v in enumerate(list(string.digits+string.ascii_uppercase))}
batch_size = 64
epochs = 100
learning_rate = 0.00001
beta1 = 0.5
workers = 2
# Note: the Generator and Discriminator classes from the previous section must also be defined (or imported) here

def one_hot_encode(value):
    order = []
    shape = captcha_number * len(word2num)
    vector = np.zeros(shape, dtype=float)
    for k, v in enumerate(value):
        index = k * len(word2num) + word2num.get(v)
        vector[index] = 1.0
        order.append(index)
    return vector, order

def one_hot_decode(value):
    res = []
    for ik, iv in enumerate(value):
        val = iv - ik * len(word2num) if ik else iv
        for k, v in word2num.items():
            if val == int(v):
                res.append(k)
                break
    return "".join(res)

class ImageDataSet(Dataset):
    def __init__(self, folder):
        self.transform=transforms.Compose([
                transforms.Resize((image_size,image_size)),
                transforms.ToTensor(),
                transforms.Normalize([0.5]*nc,[0.5]*nc)
                ])
        self.images = [os.path.join(folder,i) for i in os.listdir(folder)]

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image_path = self.images[idx]
        captcha_str=image_path[-8:-4]
        vector,order = one_hot_encode(captcha_str)
        vector=torch.FloatTensor(vector)
        image = self.transform(Image.open(image_path))
        return image,vector,order

def loader(image_path,batch_size):
    imgdataset=ImageDataSet(image_path)
    return DataLoader(imgdataset,batch_size=batch_size,shuffle=True,num_workers=workers)

if __name__ == '__main__':
    netd=Discriminator().to(device=device)  # discriminator
    netg=Generator().to(device=device)      # generator

    # Adam optimizer
    optimizerD = Adam(netd.parameters(),lr=learning_rate,betas=(beta1,0.999))
    optimizerG = Adam(netg.parameters(),lr=learning_rate,betas=(beta1,0.999))

    # BCELoss loss function
    criterion = nn.BCELoss().to(device=device)

    # Generate a batch of fixed noise z and character labels for viewing the effect of model fitting
    fix_z = Variable(torch.FloatTensor(10,latent_space_size,1,1).normal_(0,1)).to(device=device)
    fix_y=[]
    random_strs=[np.random.choice(os.listdir(image_path))[:4] for _ in range(10)]
    print(random_strs)
    print()
    for i in random_strs:
        fix_y.append(one_hot_encode(i)[0])
    fix_y=Variable(torch.FloatTensor(fix_y)).to(device=device)

    G_LOSS=[]
    D_LOSS=[]
    dataloader=loader(image_path,batch_size)
    for epoch in range(epochs):
        mean_G=[]
        mean_D=[]
        for ii,(img,vector,order) in enumerate(dataloader):
            img=Variable(img).to(device=device)
            vector=Variable(vector).to(device=device)

            is_real = Variable(torch.ones(img.size(0))).to(device=device) # 1 for real
            is_fake = Variable(torch.zeros(img.size(0))).to(device=device) # 0 for fake

            # Training discriminator
            netd.zero_grad()
            output=netd(img,vector)
            errD_real = criterion(output.view(-1), is_real)
            errD_real.backward()

            z = Variable(torch.randn(img.size(0),latent_space_size,1,1).normal_(0,1)).to(device=device)
            fake_pic=netg(z,vector).detach()
            output=netd(fake_pic,vector)
            errD_fake = criterion(output.view(-1), is_fake)
            errD_fake.backward()
            optimizerD.step()

            # Training generator
            netg.zero_grad()
            fake_pic=netg(z,vector)
            output=netd(fake_pic,vector)
            errG = criterion(output.view(-1), is_real)
            errG.backward()
            optimizerG.step()

            mean_G.append(errG.item())
            mean_D.append(errD_real.item()+errD_fake.item())
        print(f'epoch:{epoch}         D_LOSS:{np.mean(mean_D)}           G_LOSS:{np.mean(mean_G)}')
        G_LOSS.append(np.mean(mean_G))
        D_LOSS.append(np.mean(mean_D))
        if epoch%20==0:
            fake_u=netg(fix_z,fix_y)
            imgs = make_grid(fake_u.data*0.5+0.5,nrow=5).cpu()
            plt.imshow(imgs.permute(1,2,0).numpy())  # make_grid returns CHW, imshow expects HWC
            plt.show()

    plt.plot(list(range(len(G_LOSS))),G_LOSS)
    plt.plot(list(range(len(G_LOSS))),D_LOSS)
    plt.show() 

    # Save model
    torch.save(netd.state_dict(),'dcgan_netd.pth')
    torch.save(netg.state_dict(),'dcgan_netg.pth')

After about three hours the training is complete. Let's look at the results. A comparison between captchas generated by the DCGAN and real captcha images is shown below:

(Figure: captcha generated by DCGAN vs. real captcha)

The results are actually pretty good, though there is still a small gap compared with the real images.

4. Generate captchas with the DCGAN model

A generated captcha must meet two conditions:

  1. Its score from the discriminator is greater than 0.95
  2. Its 4-character string does not appear in the training set, i.e. it is a new captcha image
import torch
from torch.autograd import Variable
from torchvision import transforms
from torchvision.utils import save_image, make_grid
from PIL import Image
from tqdm import tqdm
import os
import string
# Note: assumes Generator, Discriminator, one_hot_encode and captcha_number from the previous sections are defined

word2num={v:k for k,v in enumerate(list(string.digits+string.ascii_uppercase))}
image_path='images/train/'
device = 'cuda' if torch.cuda.is_available() else 'cpu'
latent_space_size = 100
image_height=40
image_width=132
if not os.path.exists('Generated picture/'):
    os.makedirs('Generated picture/')
temp=list(word2num.keys())
all_a=[] # Store all possible 4-character verification codes
for i in temp:
    for j in temp:
        for k in temp:
            for l in temp:
                what=i+j+k+l
                all_a.append(what)

netd=Discriminator().to(device=device)
netg=Generator().to(device=device)
netd.load_state_dict(torch.load('dcgan_netd.pth'))
netg.load_state_dict(torch.load('dcgan_netg.pth'))

all_aa=[i[:4] for i in os.listdir(image_path)]
for a in tqdm(all_a):
    z = Variable(torch.randn(1,latent_space_size,1,1).normal_(0,1)).to(device=device)
    fake_pic=netg(z,Variable(torch.FloatTensor([one_hot_encode(a)[0]])).to(device=device))
    score=netd(fake_pic,Variable(torch.FloatTensor([one_hot_encode(a)[0]])).to(device=device)).view(-1).data.cpu().numpy()[0]
    if (score>0.95)&(a not in all_aa):
        imgs = make_grid(fake_pic.data*0.5+0.5).cpu() # CHW
        save_image(imgs,f'Generated picture/{a}.png')
        imgs=Image.open(f'Generated picture/{a}.png')
        imgs=transforms.Resize((image_height,image_width))(imgs)
        imgs.save(f'Generated picture/{a}.png')

Run the code above; generating the captcha images takes about three hours, and if all goes well more than 10,000 captchas are produced.

5. Build and train the CNN model

The code is omitted for brevity because it is essentially the same as in my previous post, except that the training set is now the captchas generated by DCGAN, the binarization threshold in the process_img function is changed to 120, epochs is changed to 10, and batch_size is changed to 32.
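
Purely as an illustration, a minimal sketch of a multi-label CNN of this kind might look as follows; the layer sizes and the single-channel 40 × 132 binarized input are assumptions, not the exact architecture from the previous post.

import torch.nn as nn

class CaptchaCNN(nn.Module):
    '''illustrative multi-label CNN: 4 characters x 36 classes = 144 outputs'''
    def __init__(self):
        super(CaptchaCNN, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(True), nn.MaxPool2d(2),   # 40x132 -> 20x66
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(True), nn.MaxPool2d(2),  # 20x66 -> 10x33
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(True), nn.MaxPool2d(2)) # 10x33 -> 5x16
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 5 * 16, 1024), nn.ReLU(True),
            nn.Linear(1024, captcha_number * len(word2num)))  # 4 x 36 = 144 outputs
    def forward(self, x):
        return self.classifier(self.features(x))

# trained with a multi-label loss against the 144-dim one-hot vector, e.g.:
# criterion = nn.MultiLabelSoftMarginLoss()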

The following is the printout of the training process:

Train start
Iteration is 383
epoch:1, step:100, loss:0.11870528757572174
epoch:1, step:200, loss:0.09803573042154312
epoch:1, step:300, loss:0.07167644798755646
epoch:2, step:100, loss:0.060339584946632385
epoch:2, step:200, loss:0.0454578697681427
epoch:2, step:300, loss:0.045735735446214676
epoch:3, step:100, loss:0.03509911149740219
epoch:3, step:200, loss:0.03168116882443428
epoch:3, step:300, loss:0.03217519074678421
epoch:4, step:100, loss:0.029901988804340363
epoch:4, step:200, loss:0.032566048204898834
epoch:4, step:300, loss:0.028481818735599518
epoch:5, step:100, loss:0.022674065083265305
epoch:5, step:200, loss:0.019393315538764
epoch:5, step:300, loss:0.023355185985565186
epoch:6, step:100, loss:0.027277015149593353
epoch:6, step:200, loss:0.018431685864925385
epoch:6, step:300, loss:0.01690380461513996
epoch:7, step:100, loss:0.022878311574459076
epoch:7, step:200, loss:0.02011089399456978
epoch:7, step:300, loss:0.020655091851949692
epoch:8, step:100, loss:0.013621113263070583
epoch:8, step:200, loss:0.015619204379618168
epoch:8, step:300, loss:0.024786094203591347
epoch:9, step:100, loss:0.016219446435570717
epoch:9, step:200, loss:0.015738267451524734
epoch:9, step:300, loss:0.016928061842918396
epoch:10, step:100, loss:0.01601400598883629
epoch:10, step:200, loss:0.015124175697565079
epoch:10, step:300, loss:0.01665317639708519
Train done

6. Test model accuracy

This step is omitted because it is exactly the same as the accuracy-testing code in my previous post.
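
Purely as an illustration, a rough sketch of such a test loop might look like this; the CaptchaCNN class (from the sketch in section 5), the preprocess helper, the model file name and the test folder layout are all assumptions, not the code from the previous post.

import os
import torch

test_dir = 'images/test/'                                  # the 1,000 real test captchas
cnn = CaptchaCNN()
cnn.load_state_dict(torch.load('cnn_model.pth'))           # hypothetical file name
cnn.eval()

correct = 0
files = os.listdir(test_dir)
with torch.no_grad():
    for name in files:
        label = name[:4]
        img = preprocess(os.path.join(test_dir, name))      # hypothetical helper: binarize + to tensor, shape (1, 1, 40, 132)
        logits = cnn(img).view(captcha_number, len(word2num))  # one 36-way score block per character
        pred = ''.join(list(word2num)[int(i)] for i in logits.argmax(dim=1))
        if pred == label:
            correct += 1
        else:
            print(f'Fail, captcha:{label}->{pred}')
print(f'Done. The total number of predicted pictures is {len(files)}, and the accuracy is {correct/len(files):.0%}')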

The following is the test printout:

load cnn model
Fail, captcha:NTJI->NTJ1
Fail, captcha:E57N->E5Z6
Fail, captcha:BKI6->BKT6
Fail, captcha:U0IQ->UCIQ
Fail, captcha:GEQI->GEQ1
Fail, captcha:KIC4->K1C4
Fail, captcha:PCSO->PCS0
Fail, captcha:XW4O->XW40
Fail, captcha:TQXU->TQYU
Fail, captcha:4KCY->4K0Y
Fail, captcha:COG1->CCG1
Fail, captcha:CZX7->CZY7
Fail, captcha:Q508->Q5D8
Fail, captcha:79GR->798R
Fail, captcha:DNBT->DNBI
......
Fail, captcha:V043->VO43
Fail, captcha:G1XF->G1YF
 Done. The total number of predicted pictures is 1000, and the accuracy is 81%

As the output shows, the accuracy is 81%.

7. Summary

  • Without using any real images, the CNN model is trained only on the fake images generated by DCGAN and then evaluated on real images; the result is still decent, but the number of captchas needed to train the DCGAN is rather large.
  • I trained the DCGAN with training sets growing from 500 up to 50,000 images, and the accuracy of the resulting CNN model gradually increased, which is understandable since the amount of data increased. Along the way the hyperparameters also had to be adjusted repeatedly to suit the growing dataset.
  • I also tried training the CNN model on a mix of DCGAN-generated fake captchas and some real captchas; the accuracy was slightly better than training on the real captchas alone.
  • If you want to train a GAN on very few real captcha images and still get good results, I think you would have to modify the current DCGAN model or use a different deep learning model.

That is the whole article; all the main core code is above. If you want the complete source code, follow the official account "Python King's road" and reply with the keyword 20210601 to get it.

Closing words

After half a year, I have finally picked up my CSDN again. Half a year... do you know how I spent the past six months...

One reason is that I have been too busy at work and had no time to calm down and write. Even though I get weekends off now, and I always fantasize that I will study hard on the weekend, when the weekend actually comes... the bed is just more comfortable~

The other reason is that I got stuck on a point I could not get past and could not make perfect. At first I thought that with only 500 real captcha images I could train a GAN model capable of outputting realistic captchas, which would solve all character captchas once and for all. The dream was beautiful, but reality was harsh..

Still, I have now finished the blog post I wanted to write six months ago. Just recording the process of exploration gives me a full sense of achievement; I can finally relax and move on toward the next goal!!

It also happens that today is Children's Day (June 1st). I wish all children, big and small, a happy holiday~
