Python + DCGAN: generating verification codes, training a CNN model, and testing model accuracy
Preface
- It has been a long time, my friends. This article has been "premeditated" for quite a while, but I never had time to write it. Today I finally squeezed out some time to write it properly.
- When I wrote the blog post "python generates verification code → processes verification code → establishes CNN model training → tests model accuracy → identifies verification code", I was reading books on deep learning. I was particularly interested in the CNN and GAN models and always wanted to build a practical application with them. The protagonist of this post is therefore the GAN.
- Before writing, I read many articles about GAN and found that it has many variants. This article mainly uses DCGAN, the deep convolutional generative adversarial network.
- My understanding is limited and I only have a superficial grasp of GAN. If there are any mistakes in this article, please correct them~
Abstract
In this article I generate 51,000 verification code images myself: 50,000 as the training set and 1,000 as the test set. First, the DCGAN model is trained on the training set, and then new verification codes are generated with the trained DCGAN model. A generated verification code must meet two requirements: its discriminator score is greater than 0.95, and its 4 characters do not appear in the training set. Finally, the verification codes generated by the model are used to train a CNN model, and the 1,000-image test set set aside at the start is used to verify the accuracy of the model. The final accuracy is about 80%.
The main problem this article solves
Use a GAN to generate verification code images for specified characters. That is, the generator must not only generate realistic verification codes but also generate them for the specified characters. Likewise, the discriminator must not only judge authenticity but also recognize the characters of the verification code.
1, Generate real verification code
This step is omitted because it is basically the same as in my last blog post, except that a much larger number of verification codes is generated: self.train_num is changed to 50000.
2, Define DCGAN model
Generator
- Input: batchsize × 100 × 1 × 1 noise + batchsize × one-hot encoded vectors of 4 random characters
- Intermediate processing: use transposed convolution (deconvolution) to upsample the noise and the label separately into feature maps of the same dimensions, concatenate them, and feed the result to the rest of the generator
- Output: batchsize verification code images of size 132 × 40, corresponding to the 4 random input characters (in the code below the network itself works on 64 × 64 images, which are resized to 132 × 40 when saved)
Discriminator
- Input: batchsize × real or generator-produced 132 × 40 verification code images + batchsize × one-hot encoded vectors of the verification code characters
- Intermediate processing: use convolution to downsample the image and the label separately into feature maps of the same dimensions, concatenate them, and feed the result to the rest of the discriminator
- Output: batchsize × probability (0 to 1) that the image is a real verification code, i.e. the score. The closer the score is to 1, the more real the image is judged to be
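To make the label input concrete, here is a minimal sketch of how a 4-character code maps onto a one-hot vector. It assumes the same 36-character alphabet (digits followed by uppercase letters) that the code in this post uses. The names `encode_label` and `decode_label` are illustrative and are not taken from the original source.

```python
import string

# Assumption: same alphabet as the post's word2num (10 digits + 26 uppercase letters).
ALPHABET = string.digits + string.ascii_uppercase
CAPTCHA_LEN = 4

def encode_label(code):
    """Return a flat one-hot list of length CAPTCHA_LEN * len(ALPHABET)."""
    vector = [0.0] * (CAPTCHA_LEN * len(ALPHABET))
    for pos, ch in enumerate(code):
        # Each character position owns its own 36-wide slot.
        vector[pos * len(ALPHABET) + ALPHABET.index(ch)] = 1.0
    return vector

def decode_label(vector):
    """Invert encode_label by taking the largest entry in each 36-wide slot."""
    chars = []
    for pos in range(CAPTCHA_LEN):
        slot = vector[pos * len(ALPHABET):(pos + 1) * len(ALPHABET)]
        chars.append(ALPHABET[slot.index(max(slot))])
    return "".join(chars)

label = encode_label("A1B2")
print(len(label), sum(label))   # 144 entries, exactly 4 of them set
print(decode_label(label))
```

The vector has 4 × 36 = 144 entries, which is why the first label layer in both networks takes captcha_number*len(word2num) input channels.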
Code
```python
import string

import torch
import torch.nn as nn

# Character -> index lookup for the 36 possible characters (0-9, A-Z)
word2num = {v: k for k, v in enumerate(list(string.digits + string.ascii_uppercase))}
captcha_number = 4        # characters per verification code
nc = 3                    # image channels
image_size = 64           # images are resized to image_size x image_size
latent_space_size = 100   # length of the noise vector z
ngf = 128                 # base channel count of the generator
ndf = 128                 # base channel count of the discriminator


class Generator(nn.Module):
    '''Generator'''

    def __init__(self):
        super(Generator, self).__init__()
        # Noise and label are upsampled separately from 1x1 to 4x4
        self.deconv_x = nn.Sequential(
            nn.ConvTranspose2d(latent_space_size, ngf // 2, 4, 1, 0),
            nn.ReLU(True))
        self.deconv_y = nn.Sequential(
            nn.ConvTranspose2d(captcha_number * len(word2num), ngf // 2, 4, 1, 0),
            nn.ReLU(True))
        self.model = nn.Sequential(
            nn.ConvTranspose2d(ngf, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 1, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh())

    def forward(self, x, y):
        x = self.deconv_x(x)
        y = self.deconv_y(y.unsqueeze(2).unsqueeze(3))  # (B, 144) -> (B, 144, 1, 1)
        out = torch.cat([x, y], 1)                      # concatenate along channels
        output = self.model(out)
        return output


class Discriminator(nn.Module):
    '''Discriminator'''

    def __init__(self):
        super(Discriminator, self).__init__()
        # Image and label are downsampled separately from 64x64 to 32x32
        self.conv1_x = nn.Sequential(
            nn.Conv2d(nc, ndf // 2, 4, 2, 1),
            nn.LeakyReLU(0.2, inplace=True))
        self.conv1_y = nn.Sequential(
            nn.Conv2d(captcha_number * len(word2num), ndf // 2, 4, 2, 1),
            nn.LeakyReLU(0.2, inplace=True))
        self.model = nn.Sequential(
            nn.Conv2d(ndf, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 8, 1, 2, 1, 0, bias=False),
            nn.Sigmoid())

    def forward(self, x, y):
        x = self.conv1_x(x)
        # Broadcast the one-hot label over every pixel position before the convolution
        y = self.conv1_y(y.view(x.size(0), captcha_number * len(word2num), 1, 1)
                         .expand(-1, -1, image_size, image_size))
        out = torch.cat([x, y], 1)
        output = self.model(out)
        return output
```
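The spatial sizes produced by the layers above can be checked by hand. The sketch below applies the standard output-size formulas for convolution and transposed convolution (dilation 1, no output padding) to the kernel, stride and padding values copied from the Generator and Discriminator. The helper names `conv_out` and `deconv_out` are my own and only serve this illustration.

```python
def conv_out(size, kernel, stride, padding):
    """Output side length of a convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1

def deconv_out(size, kernel, stride, padding):
    """Output side length of a transposed convolution (dilation 1, output_padding 0)."""
    return (size - 1) * stride - 2 * padding + kernel

# (kernel, stride, padding) per layer, copied from the Generator:
# deconv_x / deconv_y first, then the five layers of self.model.
gen_layers = [(4, 1, 0), (4, 1, 0), (4, 1, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1)]
size, gen_sizes = 1, [1]
for k, s, p in gen_layers:
    size = deconv_out(size, k, s, p)
    gen_sizes.append(size)

# conv1_x / conv1_y first, then the five layers of the Discriminator's self.model.
dis_layers = [(4, 2, 1)] * 5 + [(2, 1, 0)]
size, dis_sizes = 64, [64]
for k, s, p in dis_layers:
    size = conv_out(size, k, s, p)
    dis_sizes.append(size)

print(gen_sizes)  # [1, 4, 7, 8, 16, 32, 64]
print(dis_sizes)  # [64, 32, 16, 8, 4, 2, 1]
```

So the generator turns a 1 × 1 input into a 64 × 64 image, and the discriminator reduces a 64 × 64 image to a single score, which matches image_size = 64 in the code.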
3, Training DCGAN model
Parameter definition
- Both the generator and the discriminator use the Adam optimizer, with learning_rate set to 1e-05 and the beta1 coefficient set to 0.5
- Use BCELoss as the loss function
- Training epochs: 100
- batch_size: 64
Adversarial training process
- Discriminator
  - Receive real verification code images + the corresponding 4 characters, with y_true of the loss function set to all 1s, and perform backpropagation;
  - The generator receives random noise z + 4 characters, and the resulting fake images must be detached to cut off the gradient flow of backpropagation;
  - Receive the fake verification code images produced by the generator + the corresponding 4 characters, with y_true of the loss function set to all 0s, perform backpropagation and update the optimizer.
- Generator
  - Receive random noise z + 4 characters, with y_true of the loss function set to all 1s, perform backpropagation and update the optimizer.
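The choice of targets (1 for real, 0 for fake, and 1 again when training the generator) can be illustrated with a small worked example. This is a plain-Python sketch of mean binary cross-entropy, the quantity BCELoss computes with its default mean reduction. The discriminator scores are made-up numbers for illustration only.

```python
import math

def bce(scores, targets):
    """Mean binary cross-entropy over a batch of probabilities."""
    total = 0.0
    for p, y in zip(scores, targets):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(scores)

# Hypothetical discriminator scores for one small batch.
scores_on_real = [0.9, 0.8]
scores_on_fake = [0.2, 0.1]

# Discriminator: real images should score 1, fake images should score 0.
d_loss = bce(scores_on_real, [1, 1]) + bce(scores_on_fake, [0, 0])
# Generator: it wants the same fake images to be scored as real (target 1).
g_loss = bce(scores_on_fake, [1, 1])

print(round(d_loss, 4), round(g_loss, 4))
```

With these numbers the discriminator loss is small because it separates real from fake well, while the generator loss is large for exactly the same reason. That tension is what drives the adversarial training.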
Code
```python
import os

import numpy as np
from PIL import Image
import torch
import torch.nn as nn
from torch.optim import Adam
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from torchvision.utils import make_grid
import matplotlib.pyplot as plt
# %matplotlib inline   # Jupyter magic: uncomment when running in a notebook

# Generator, Discriminator, word2num, captcha_number and image_size
# come from the model code above
device = 'cuda' if torch.cuda.is_available() else 'cpu'
image_path = 'images/train/'
latent_space_size = 100
nc = 3  # channels of img
batch_size = 64
epochs = 100
learning_rate = 0.00001
beta1 = 0.5
workers = 2


def one_hot_encode(value):
    '''Encode a 4-character string as a one-hot vector plus the hot indices.'''
    order = []
    shape = captcha_number * len(word2num)
    vector = np.zeros(shape, dtype=float)
    for k, v in enumerate(value):
        index = k * len(word2num) + word2num.get(v)
        vector[index] = 1.0
        order.append(index)
    return vector, order


def one_hot_decode(value):
    '''Turn a list of hot indices back into the character string.'''
    res = []
    for ik, iv in enumerate(value):
        val = iv - ik * len(word2num) if ik else iv
        for k, v in word2num.items():
            if val == int(v):
                res.append(k)
                break
    return "".join(res)


class ImageDataSet(Dataset):
    def __init__(self, folder):
        self.transform = transforms.Compose([
            transforms.Resize((image_size, image_size)),
            transforms.ToTensor(),
            transforms.Normalize([0.5] * nc, [0.5] * nc)
        ])
        self.images = [os.path.join(folder, i) for i in os.listdir(folder)]

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image_path = self.images[idx]
        captcha_str = image_path[-8:-4]  # the 4 characters before the file extension
        vector, order = one_hot_encode(captcha_str)
        vector = torch.FloatTensor(vector)
        # convert('RGB') guarantees nc = 3 channels whatever the file's mode
        image = self.transform(Image.open(image_path).convert('RGB'))
        return image, vector, order


def loader(image_path, batch_size):
    imgdataset = ImageDataSet(image_path)
    return DataLoader(imgdataset, batch_size=batch_size, shuffle=True, num_workers=workers)


if __name__ == '__main__':
    netd = Discriminator().to(device=device)  # discriminator
    netg = Generator().to(device=device)      # generator
    # Adam optimizer
    optimizerD = Adam(netd.parameters(), lr=learning_rate, betas=(beta1, 0.999))
    optimizerG = Adam(netg.parameters(), lr=learning_rate, betas=(beta1, 0.999))
    # BCELoss loss function
    criterion = nn.BCELoss().to(device=device)

    # A fixed batch of noise z and character labels for viewing how well the model fits
    fix_z = torch.randn(10, latent_space_size, 1, 1, device=device)
    random_strs = [np.random.choice(os.listdir(image_path))[:4] for _ in range(10)]
    print(random_strs)
    print()
    fix_y = torch.tensor(np.array([one_hot_encode(i)[0] for i in random_strs]),
                         dtype=torch.float32, device=device)

    G_LOSS = []
    D_LOSS = []
    dataloader = loader(image_path, batch_size)
    for epoch in range(epochs):
        mean_G = []
        mean_D = []
        for ii, (img, vector, order) in enumerate(dataloader):
            img = img.to(device=device)
            vector = vector.to(device=device)
            is_real = torch.ones(img.size(0), device=device)   # 1 for real
            is_fake = torch.zeros(img.size(0), device=device)  # 0 for fake

            # Train the discriminator
            netd.zero_grad()
            output = netd(img, vector)
            errD_real = criterion(output.view(-1), is_real)
            errD_real.backward()
            z = torch.randn(img.size(0), latent_space_size, 1, 1, device=device)
            fake_pic = netg(z, vector).detach()  # cut the gradient flow into the generator
            output = netd(fake_pic, vector)
            errD_fake = criterion(output.view(-1), is_fake)
            errD_fake.backward()
            optimizerD.step()

            # Train the generator
            netg.zero_grad()
            fake_pic = netg(z, vector)
            output = netd(fake_pic, vector)
            errG = criterion(output.view(-1), is_real)
            errG.backward()
            optimizerG.step()

            mean_G.append(errG.item())
            mean_D.append(errD_real.item() + errD_fake.item())

        print(f'epoch:{epoch} D_LOSS:{np.mean(mean_D)} G_LOSS:{np.mean(mean_G)}')
        G_LOSS.append(np.mean(mean_G))
        D_LOSS.append(np.mean(mean_D))
        if epoch % 20 == 0:
            with torch.no_grad():
                fake_u = netg(fix_z, fix_y)
            imgs = make_grid(fake_u * 0.5 + 0.5, nrow=5).cpu()
            plt.imshow(imgs.permute(1, 2, 0).numpy())  # CHW -> HWC for matplotlib
            plt.show()

    plt.plot(list(range(len(G_LOSS))), G_LOSS)
    plt.plot(list(range(len(G_LOSS))), D_LOSS)
    plt.show()
    # Save model
    torch.save(netd.state_dict(), 'dcgan_netd.pth')
    torch.save(netg.state_dict(), 'dcgan_netg.pth')
```
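One detail in the training code is worth spelling out: Normalize with mean 0.5 and std 0.5 maps pixel values from [0, 1] to [-1, 1], which is the output range of the generator's final Tanh, and the expression *0.5+0.5 used before plotting undoes that mapping. The sketch below shows the round trip on single values. The function names are illustrative only.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Per-channel operation of a mean/std normalization: (x - mean) / std."""
    return (pixel - mean) / std

def denormalize(value):
    """Inverse used before plotting or saving: x * 0.5 + 0.5."""
    return value * 0.5 + 0.5

for pixel in (0.0, 0.25, 1.0):          # ToTensor() yields values in [0, 1]
    scaled = normalize(pixel)
    print(pixel, "->", scaled, "->", denormalize(scaled))
```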
Training finishes after about three hours. Let's look at the result. The verification codes generated by the DCGAN model compare with the real ones as follows:
Verification code generated by DCGAN | Real verification code |
---|---|
The result is actually pretty good, although there is still a small gap compared with the real images.
4, Generating verification code with DCGAN model
A generated verification code is kept only if it meets two conditions:
- When the generated image is fed into the discriminator, the score is greater than 0.95
- The 4 characters of the generated verification code do not appear in the training set, i.e. it is a new verification code image
```python
import os
import string
from itertools import product

import torch
from torchvision import transforms
from torchvision.utils import save_image, make_grid
from PIL import Image
from tqdm import tqdm

# Generator, Discriminator and one_hot_encode are defined in the code above
word2num = {v: k for k, v in enumerate(list(string.digits + string.ascii_uppercase))}
image_path = 'images/train/'
device = 'cuda' if torch.cuda.is_available() else 'cpu'
latent_space_size = 100
image_height = 40
image_width = 132
out_dir = 'Generated picture/'
os.makedirs(out_dir, exist_ok=True)

# All possible 4-character verification codes
temp = list(word2num.keys())
all_a = [''.join(p) for p in product(temp, repeat=4)]

netd = Discriminator().to(device=device)
netg = Generator().to(device=device)
netd.load_state_dict(torch.load('dcgan_netd.pth', map_location=device))
netg.load_state_dict(torch.load('dcgan_netg.pth', map_location=device))

# Character strings that already appear in the training set (a set for fast lookup)
all_aa = set(i[:4] for i in os.listdir(image_path))

for a in tqdm(all_a):
    z = torch.randn(1, latent_space_size, 1, 1, device=device)
    y = torch.FloatTensor(one_hot_encode(a)[0]).unsqueeze(0).to(device=device)
    with torch.no_grad():
        fake_pic = netg(z, y)
        score = netd(fake_pic, y).view(-1).item()
    # Keep only high-scoring images whose characters are not in the training set
    if score > 0.95 and a not in all_aa:
        imgs = make_grid(fake_pic * 0.5 + 0.5).cpu()  # CHW
        save_image(imgs, f'{out_dir}{a}.png')
        # Reopen the saved 64 x 64 image and resize it to the real captcha size
        imgs = Image.open(f'{out_dir}{a}.png')
        imgs = transforms.Resize((image_height, image_width))(imgs)
        imgs.save(f'{out_dir}{a}.png')
```
Running the code above takes about three hours to finish generating the verification code images. If all goes well, more than 10,000 verification codes are generated.
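As a small illustration of the two conditions and of how large the candidate space is, here is a sketch with made-up scores and training codes. The function `keep` and the sample data are assumptions for illustration, while the 0.95 threshold and the 36-character alphabet come from the text and code above.

```python
import string

ALPHABET = string.digits + string.ascii_uppercase   # 36 characters, as in word2num
SCORE_THRESHOLD = 0.95

def keep(code, score, train_codes):
    """Apply both conditions: high discriminator score and unseen character string."""
    return score > SCORE_THRESHOLD and code not in train_codes

# Size of the full candidate space that the generation loop walks through.
total_candidates = len(ALPHABET) ** 4

# Hypothetical training codes and discriminator scores, for illustration only.
train_codes = {"AB12", "ZZ99"}
examples = [("AB12", 0.99), ("CD34", 0.97), ("EF56", 0.60)]
kept = [code for code, score in examples if keep(code, score, train_codes)]

print(total_candidates)   # 1679616
print(kept)               # ['CD34']
```

"AB12" is rejected because it is already in the training set even though its score is high, and "EF56" is rejected because its score is too low.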
5, Establish and train CNN model
The code is omitted to save space, because it is basically the same as in my last blog post. The differences are that the training set must be the verification codes generated by DCGAN, the binarization threshold in the process_img function is changed to 120, epochs is changed to 10, and batch_size is changed to 32.
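Since the process_img function itself is not shown here, the following is only a generic sketch of what thresholded binarization does, not the author's implementation. The threshold 120 comes from the text. The direction of the mapping (values above the threshold become white) and the sample patch are assumptions.

```python
THRESHOLD = 120  # value quoted in the text; everything else here is assumed

def binarize(gray_rows, threshold=THRESHOLD):
    """Map each grayscale value (0-255) to 255 if above the threshold, else 0."""
    return [[255 if value > threshold else 0 for value in row] for row in gray_rows]

# A tiny hypothetical 2 x 4 grayscale patch.
patch = [
    [12, 119, 120, 121],
    [200, 90, 130, 255],
]
print(binarize(patch))
```

With Pillow, a comparable effect can usually be obtained by converting the image to grayscale and applying a per-pixel function with Image.point, but whether the original code does it that way is not stated.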
The following is the printout of the training process:
```
Train start
Iteration is 383
epoch:1, step:100, loss:0.11870528757572174
epoch:1, step:200, loss:0.09803573042154312
epoch:1, step:300, loss:0.07167644798755646
epoch:2, step:100, loss:0.060339584946632385
epoch:2, step:200, loss:0.0454578697681427
epoch:2, step:300, loss:0.045735735446214676
epoch:3, step:100, loss:0.03509911149740219
epoch:3, step:200, loss:0.03168116882443428
epoch:3, step:300, loss:0.03217519074678421
epoch:4, step:100, loss:0.029901988804340363
epoch:4, step:200, loss:0.032566048204898834
epoch:4, step:300, loss:0.028481818735599518
epoch:5, step:100, loss:0.022674065083265305
epoch:5, step:200, loss:0.019393315538764
epoch:5, step:300, loss:0.023355185985565186
epoch:6, step:100, loss:0.027277015149593353
epoch:6, step:200, loss:0.018431685864925385
epoch:6, step:300, loss:0.01690380461513996
epoch:7, step:100, loss:0.022878311574459076
epoch:7, step:200, loss:0.02011089399456978
epoch:7, step:300, loss:0.020655091851949692
epoch:8, step:100, loss:0.013621113263070583
epoch:8, step:200, loss:0.015619204379618168
epoch:8, step:300, loss:0.024786094203591347
epoch:9, step:100, loss:0.016219446435570717
epoch:9, step:200, loss:0.015738267451524734
epoch:9, step:300, loss:0.016928061842918396
epoch:10, step:100, loss:0.01601400598883629
epoch:10, step:200, loss:0.015124175697565079
epoch:10, step:300, loss:0.01665317639708519
Train done
```
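The line "Iteration is 383" together with batch_size 32 gives a rough idea of how many generated images were used for training. The sketch below assumes that the number of iterations per epoch equals the dataset size divided by the batch size, rounded up (i.e. the last partial batch is kept). That assumption is mine, since the training code is not shown.

```python
import math

BATCH_SIZE = 32      # from the text above
ITERATIONS = 383     # "Iteration is 383" in the log

# Assumption: iterations per epoch = ceil(num_images / batch_size).
smallest = (ITERATIONS - 1) * BATCH_SIZE + 1
largest = ITERATIONS * BATCH_SIZE

# Both ends of the range must reproduce the logged iteration count.
assert math.ceil(smallest / BATCH_SIZE) == ITERATIONS
assert math.ceil(largest / BATCH_SIZE) == ITERATIONS
print(smallest, largest)   # 12225 12256
```

Under that assumption the training set holds roughly 12,200 images, which is consistent with "more than 10,000" generated verification codes.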
6, Test model accuracy
This step is omitted because the code is exactly the same as the accuracy test in my last blog post.
The following is the test printout:
```
load cnn model
Fail, captcha:NTJI->NTJ1
Fail, captcha:E57N->E5Z6
Fail, captcha:BKI6->BKT6
Fail, captcha:U0IQ->UCIQ
Fail, captcha:GEQI->GEQ1
Fail, captcha:KIC4->K1C4
Fail, captcha:PCSO->PCS0
Fail, captcha:XW4O->XW40
Fail, captcha:TQXU->TQYU
Fail, captcha:4KCY->4K0Y
Fail, captcha:COG1->CCG1
Fail, captcha:CZX7->CZY7
Fail, captcha:Q508->Q5D8
Fail, captcha:79GR->798R
Fail, captcha:DNBT->DNBI
......
Fail, captcha:V043->VO43
Fail, captcha:G1XF->G1YF
Done. The total number of predicted pictures is 1000, and the accuracy is 81%
```
It can be seen that the accuracy is 81%
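The failures in the printout are not random: many of them swap visually similar characters. As a rough illustration, the sketch below tallies the differing character pairs in a few "Fail" lines copied from the output above. It only compares the left-hand and right-hand strings position by position and makes no claim about the rest of the log, which is elided.

```python
from collections import Counter

# A few "Fail" lines copied from the test output above.
log_lines = [
    "Fail, captcha:NTJI->NTJ1",
    "Fail, captcha:GEQI->GEQ1",
    "Fail, captcha:KIC4->K1C4",
    "Fail, captcha:PCSO->PCS0",
    "Fail, captcha:XW4O->XW40",
    "Fail, captcha:TQXU->TQYU",
    "Fail, captcha:CZX7->CZY7",
    "Fail, captcha:G1XF->G1YF",
]

confusions = Counter()
for line in log_lines:
    left, right = line.split("captcha:")[1].split("->")
    for a, b in zip(left, right):
        if a != b:
            confusions[(a, b)] += 1   # one differing position = one confusion

for pair, count in confusions.most_common():
    print(pair, count)
```

In this sample the differing pairs are I/1, O/0 and X/Y, which suggests that most of the remaining errors come from characters that look alike in the captcha font.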
7, Summary
- The CNN model is trained only on fake images generated by DCGAN, without any real images, and is then validated on real images. The result is still good, but training the DCGAN itself requires a fairly large number of real verification codes.
- I trained the DCGAN model with training sets ranging from 500 up to 50,000 images. The accuracy of the final CNN model rose gradually as the amount of data grew, which is understandable. Along the way the hyperparameters also had to be adjusted repeatedly to suit the growing amount of data.
- During this period, I also tried training the CNN model on a mix of fake verification code images generated by the DCGAN model and a portion of real verification code images. The accuracy was a little better than training only on that portion of real images.
- If you want to train a GAN on very few real verification code images and still get good results, I think the only options are to modify the current DCGAN model or use other deep learning models.
That is all for this article; all of the core code is shown above. If you want the complete source code, follow the official account "Python King's road" and reply with the keyword 20210601.
Final words
Half a year later, I have picked up my CSDN blog again. It has been half a year. Do you know how I spent it...
One reason is that work has been too busy, leaving me no time to calm down and write. I do get two days off each week and always imagine that I will study hard at the weekend, but when the weekend arrives I discover that the bed is more comfortable~
Another reason is that I was stuck on one point that I could not get past or make perfect. At first I thought that with only 500 real verification code images I could train a GAN model that outputs realistic verification code images, which would solve verification codes for every character combination perfectly. The idea was beautiful, but reality was harsh..
However, I have now finally finished the blog post I wanted to write six months ago. Having recorded the whole process of exploration, I feel a real sense of achievement. I can finally put my mind at ease and keep moving towards the next goal!!
Today happens to be June 1, Children's Day. I wish all the kids, big and small, a happy holiday~