⚡ Crawler advanced ⚡ Recognize CAPTCHAs in five lines of code: ddddocr ("bring little brother along" OCR)

📣 The beginner training column is suited to newcomers who are just starting out. Welcome to subscribe: Programming Xiaobai advanced

📣 The fun Python practice projects column includes entertaining articles such as an awkward-chat robot and prank programs, so you can enjoy learning Python: Training project column

📣 Students who want to learn Java Web can take a look at this column: Teleporters

📣 This one is algorithm practice for interviews and the postgraduate entrance exam. Let's keep going together: The way ashore

💠 Reading guide

Some time ago, while working on a school project, I came into contact with deep-learning image recognition. As a beginner, though, it is very hard to reach high accuracy at the start unless you call a ready-made API. While hunting around for the project, I came across a practice project on CAPTCHA recognition. Back then people used Pillow and pytesseract. They are fairly convenient and the accuracy is not bad, but I (Feixue) was still not satisfied. After a lot of searching, I found a library that gets the job done in a few lines of code: ddddocr, whose name is said to be a pun on a Chinese phrase meaning roughly "bring little brother along". It is a handy way to crack CAPTCHAs in a crawler.

💠 Start the verification journey

💠 Brief introduction

GitHub address: Portal

Environment requirements:

Python >= 3.8

Windows / Linux / macOS

If you want to try the library but are stuck on a Python version below 3.8, you can install Anaconda and create a separate environment for it. Otherwise this step is not necessary.
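
Before installing, it can help to confirm which interpreter is active. A minimal sketch, assuming the 3.8 minimum stated above; the helper name `meets_min_version` is made up for illustration:

```python
import sys


def meets_min_version(version=None, minimum=(3, 8)):
    """Return True if `version` (default: the running interpreter) is at least `minimum`."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor)
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if meets_min_version() else "too old for ddddocr"
    print("Python %d.%d: %s" % (sys.version_info[0], sys.version_info[1], status))
```

If this reports a version that is too old, a separate environment (for example one created with Anaconda) is the easy way out.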
🌸 🌸 🌸
Parameter description:

The DdddOcr constructor accepts two parameters:

| Parameter name | Default value | Description |
| --- | --- | --- |
| use_gpu | False | bool. Whether to use the GPU for inference. If False, device_id has no effect. |
| device_id | 0 | int. CUDA device number. Currently only a single graphics card is supported. |

The classification method:

| Parameter name | Default value | Description |
| --- | --- | --- |
| img | 0 | bytes. The picture in bytes format. |
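
To show how these parameters fit together, here is a hedged sketch. `make_ocr` and `read_captcha` are hypothetical helper names, not part of ddddocr; only `DdddOcr`, `use_gpu`, `device_id`, and `classification` come from the parameter description above.

```python
def make_ocr(use_gpu=False, device_id=0):
    """Create a recognizer; device_id only matters when use_gpu is True."""
    import ddddocr  # imported lazily, so this file loads even without ddddocr installed
    return ddddocr.DdddOcr(use_gpu=use_gpu, device_id=device_id)


def read_captcha(ocr, path):
    """Read an image file as raw bytes and pass it to ocr.classification."""
    with open(path, "rb") as f:
        return ocr.classification(f.read())


# Usage (assumes ddddocr is installed and test.png exists):
# ocr = make_ocr()                               # CPU inference
# ocr = make_ocr(use_gpu=True, device_id=0)      # first CUDA device
# print(read_captcha(ocr, "test.png"))
```

Creating the recognizer once and reusing it for many images avoids reloading the model each time.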

🌸 🌸 🌸

💠 First test

Install the library:

pip install ddddocr

test.jpg

import time

import ddddocr

begin = time.time()
ocr = ddddocr.DdddOcr()  # create the recognizer (this loads the model)

# classification() expects the image as raw bytes
with open('test.png', 'rb') as f:
    img_bytes = f.read()

res = ocr.classification(img_bytes)
finish = time.time()
print("result:")
print(res)
print("Time: %s second" % (finish - begin))

result:
7364
Time: 0.10026359558105469 second
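
Note that the timer in the script starts before `ddddocr.DdddOcr()` is created, so the figure includes building the recognizer as well as the recognition itself. To time only one call, a small wrapper like the following could be used. This is a sketch; `timed` is a hypothetical helper, not something ddddocr provides.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Usage (assumes `ocr` and `img_bytes` from the script above):
# res, seconds = timed(ocr.classification, img_bytes)
# print(res, "recognized in %.4f s" % seconds)
```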

How about that? Pretty simple, isn't it? But that CAPTCHA was a bit too easy. Let's try something a little more complex.

💠 Upgrade





That still wasn't fun enough, so I also made a CAPTCHA of my own to try it on. Let's have a look.

import time

import ddddocr

begin = time.time()
arr = []
ocr = ddddocr.DdddOcr()
# Recognize 1.png through 4.png in turn
for i in range(1, 5):
    with open('%d.png' % i, 'rb') as f:
        img_bytes = f.read()
    res = ocr.classification(img_bytes)
    arr.append(res)
finish = time.time()
print(arr)
print("Time: %s second" % (finish - begin))

['uwv6', '7482', 'DWSe', 'feixue']
Time: 0.13045329758155361 second

From the results, apart from the lowercase w in the third image being recognized as an uppercase W, everything else looks correct. You can try it yourself.
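
Because the only slip was a w/W case mix-up, it may be worth scoring results both strictly and case-insensitively. A small sketch follows; `accuracy` is a hypothetical helper, and the `labels` list is an assumption reconstructed from the remark about the third image, not data from the original test.

```python
def accuracy(predictions, labels, ignore_case=False):
    """Fraction of predictions equal to their label, optionally ignoring case."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    if ignore_case:
        pairs = [(p.lower(), t.lower()) for p, t in zip(predictions, labels)]
    else:
        pairs = list(zip(predictions, labels))
    return sum(p == t for p, t in pairs) / len(pairs)


predictions = ['uwv6', '7482', 'DWSe', 'feixue']  # output reported above
labels = ['uwv6', '7482', 'DwSe', 'feixue']       # assumed ground truth (hypothetical)

print(accuracy(predictions, labels))                    # 0.75
print(accuracy(predictions, labels, ignore_case=True))  # 1.0
```

If the site you are crawling checks its CAPTCHAs case-insensitively, a slip like this would not matter in practice.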

💠 Make a CAPTCHA

Just cracking them isn't enough. Let's go all the way this time and generate one ourselves. The principle is very simple: use the random module together with the PIL library.

import random

from PIL import Image, ImageDraw, ImageFont

width = 120
height = 40


def getRandomColor():
    """Return a random RGB color."""
    r = random.randint(0, 255)
    g = random.randint(0, 255)
    b = random.randint(0, 255)
    return (r, g, b)


def getRandomStr():
    """Return one random character: a digit 1-9 or an uppercase letter A-Z."""
    num_random = str(random.randint(1, 9))
    random_upper_alpha = chr(random.randint(65, 90))
    return random.choice([num_random, random_upper_alpha])


# White background canvas
image = Image.new('RGB', (width, height), (255, 255, 255))
draw = ImageDraw.Draw(image)
# Font path on the author's machine; point it at a font file on your own system
font = ImageFont.truetype(r'K:\msyh.ttc', size=24)

# Draw four random characters, 30 px apart
for i in range(4):
    draw.text((10 + i * 30, 10), getRandomStr(), getRandomColor(), font=font)

# Interference lines; the coordinate order is (x1, y1, x2, y2)
for i in range(5):
    x1 = random.randint(0, width)
    x2 = random.randint(0, width)
    y1 = random.randint(0, height)
    y2 = random.randint(0, height)
    draw.line((x1, y1, x2, y2), fill=getRandomColor())

# Noise points and small arcs
for i in range(20):
    draw.point([random.randint(0, width), random.randint(0, height)], fill=getRandomColor())
    x = random.randint(0, width)
    y = random.randint(0, height)
    draw.arc((x, y, x + 5, y + 5), 0, 90, fill=getRandomColor())

image.save('feixue.jpg')
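
The script draws four random characters but does not keep them, so there is nothing to compare the OCR result against afterwards. One possible variation, sketched on the assumption that you also want the answer string: generate the text first, then draw it character by character. `random_captcha_text` and `ALPHABET` are hypothetical names, and unlike `getRandomStr` this version picks uniformly from one combined alphabet.

```python
import random
import string

# Digits 1-9 plus uppercase letters: the same characters getRandomStr can produce
ALPHABET = "123456789" + string.ascii_uppercase


def random_captcha_text(length=4, rng=random):
    """Return `length` random characters drawn from ALPHABET."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


text = random_captcha_text()
print(text)  # keep this as the expected answer for the image

# Then draw it as in the script above, for example:
# for i, ch in enumerate(text):
#     draw.text((10 + i * 30, 10), ch, getRandomColor(), font=font)
```

Passing a seeded `random.Random` instance as `rng` makes the generated text reproducible, which is handy when building a small test set.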

💠 Ending

One more thing: can you guess whether it can recognize Chinese characters? I won't reveal the answer here for now. Experts are welcome to tell us the answer in the comment area.
That's all for today's sharing. Thank you for watching. If you can, please leave a 🌸 🌸 🌸 triple 🌸 🌸 🌸 before you go!!


Keywords: Python crawler image identification OCR

Added by msimonds on Thu, 23 Dec 2021 02:38:41 +0200