Preface
1. This article builds on the previous one, "Some optimizations on improving the accuracy of OCR recognition (2)", and adds further optimizations that raise the accuracy of picture direction recognition to 96%.
2. Before reading this article, it is recommended to read the previous one for better understanding.
1, Optimization ideas
1. In the last article, we ran the PaddleOCR direction classifier directly on the whole picture. The results were poor and it was slow, taking an average of 2 s per picture.
2. In view of the above problems, a new optimization scheme is proposed:
- Use PaddleOCR's text detection to get the coordinates of all text rectangles
- Keep only the rectangles whose aspect ratio (width / height) is between 5 and 25 or between 0.04 and 0.2, that is, long, thin text lines, either horizontal or vertical
- Pick one of the remaining rectangles, either at random or by sorting on aspect ratio and taking the middle one (here, for convenience, the 0th rectangle is taken directly)
- Crop the original image with the selected rectangle
- Use the cropped image as the input to the PaddleOCR direction classifier
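The filtering and selection steps above can be sketched in plain Python. This is only an illustration, not the code used in this article: the helper names (`aspect_ratio`, `filter_by_aspect_ratio`, `pick_median_rect`, `bounding_box`) are hypothetical, the boxes are assumed to be four points ordered top-left, top-right, bottom-right, bottom-left, and the ranges are applied to width / height exactly as stated in the list.

```python
def aspect_ratio(rect):
    """Width / height of a 4-point box (assumed order: TL, TR, BR, BL)."""
    p0, p1, _, p3 = rect
    width = abs(p1[0] - p0[0])
    height = abs(p3[1] - p0[1])
    # A zero-height box is degenerate; report it as None instead of dividing.
    return width / height if height else None


def filter_by_aspect_ratio(rects, wide=(5, 25), tall=(0.04, 0.2)):
    """Keep (ratio, rect) pairs for long horizontal or long vertical boxes."""
    kept = []
    for rect in rects:
        ratio = aspect_ratio(rect)
        if ratio is None:
            continue
        if wide[0] <= ratio <= wide[1] or tall[0] <= ratio <= tall[1]:
            kept.append((ratio, rect))
    return kept


def pick_median_rect(kept):
    """Return the box with the middle (lower median) aspect ratio, or None."""
    if not kept:
        return None
    ordered = sorted(kept, key=lambda pair: pair[0])
    return ordered[(len(ordered) - 1) // 2][1]


def bounding_box(rect):
    """Axis-aligned crop window (x0, y0, x1, y1) covering all four points."""
    xs = [int(p[0]) for p in rect]
    ys = [int(p[1]) for p in rect]
    return min(xs), min(ys), max(xs), max(ys)


# Made-up boxes, for illustration only.
rects = [
    [[0, 0], [100, 0], [100, 10], [0, 10]],    # ratio 10, kept
    [[0, 20], [30, 20], [30, 50], [0, 50]],    # ratio 1, dropped
    [[0, 60], [200, 60], [200, 70], [0, 70]],  # ratio 20, kept
    [[0, 80], [60, 80], [60, 90], [0, 90]],    # ratio 6, kept
]
kept = filter_by_aspect_ratio(rects)
chosen = pick_median_rect(kept)
print(bounding_box(chosen))
```

With an image loaded by `cv2.imread`, the crop itself would then be `image[y0:y1, x0:x1]`. Taking the median ratio instead of the 0th rectangle is the alternative mentioned in the list; whether it changes the accuracy has not been measured here.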
2, Complete code
import cv2
import os
import time
import numpy as np
from PIL import Image
from paddleocr import PaddleOCR


class GetImageRotation(object):

    def __init__(self):
        # Two instances: one for text detection, one for direction classification (see the Q&A below)
        self.ocr = PaddleOCR(use_angle_cls=True)
        self.ocr_angle = PaddleOCR(use_angle_cls=True)

    def get_real_rotation_when_null_rect(self, rect_list):
        # Fallback: average width / height over all rectangles, skipping near-square ones
        w_div_h_sum = 0
        count = 0
        for rect in rect_list:
            p0 = rect[0]
            p1 = rect[1]
            p2 = rect[2]
            p3 = rect[3]
            width = abs(p1[0] - p0[0])
            height = abs(p3[1] - p0[1])
            w_div_h = width / height
            if abs(w_div_h - 1.0) < 0.5:
                count += 1
                continue
            w_div_h_sum += w_div_h
        length = len(rect_list) - count
        if length == 0:
            length = 1
        # 1: text lines are mostly wider than tall, 0: mostly taller than wide
        if w_div_h_sum / length >= 1.5:
            return 1
        else:
            return 0

    def get_real_rotation_flag(self, rect_list):
        ret_rect = []
        w_div_h_list = []
        w_div_h_sum = 0
        for rect in rect_list:
            p0 = rect[0]
            p1 = rect[1]
            p2 = rect[2]
            p3 = rect[3]
            width = abs(p1[0] - p0[0])
            height = abs(p3[1] - p0[1])
            w_div_h = width / height
            # w_div_h_list.append(w_div_h)
            # print(w_div_h)
            # Note: the first test is on |w/h - 1|, so it actually keeps ratios from 6 to 26
            if 5 <= abs(w_div_h - 1.0) <= 25 or 0.04 <= abs(w_div_h) <= 0.2:
                ret_rect.append(rect)
                w_div_h_sum += w_div_h
        # Raises ZeroDivisionError when no rectangle was kept; the caller falls back in that case
        if w_div_h_sum / len(ret_rect) >= 1.5:
            return 1, ret_rect
        else:
            return 0, ret_rect

    def crop_image(self, rect, image):
        p0 = rect[0]
        p1 = rect[1]
        p2 = rect[2]
        p3 = rect[3]
        # Crop from the top-left point (p0) to the bottom-right point (p2)
        crop = image[int(p0[1]):int(p2[1]), int(p0[0]):int(p2[0])]
        # crop_image = Image.fromarray(crop)
        return crop

    def get_img_real_angle(self, img_path):
        ret_angle = 0
        image = cv2.imread(img_path)
        # ocr = PaddleOCR(use_angle_cls=True)
        # angle_cls = ocr.ocr(img_path, det=False, rec=False, cls=True)
        rect_list = self.ocr.ocr(image, rec=False)
        # print(rect_list)
        if rect_list != [[]]:
            try:
                # The original called these methods without self., which raises NameError
                real_angle_flag, rect_good = self.get_real_rotation_flag(rect_list)
                # rect_crop = choice(rect_good)
                rect_crop = rect_good[0]
                image_crop = self.crop_image(rect_crop, image)
                # ocr_angle = PaddleOCR(use_angle_cls=True)
                angle_cls = self.ocr_angle.ocr(image_crop, det=False, rec=False, cls=True)
                print(angle_cls)
            except Exception:
                # No usable rectangle: classify the whole image instead
                real_angle_flag = self.get_real_rotation_when_null_rect(rect_list)
                # ocr_angle = PaddleOCR(use_angle_cls=True)
                angle_cls = self.ocr_angle.ocr(image, det=False, rec=False, cls=True)
                print(angle_cls)
        else:
            return 0
        print('real_angle_flag: {}'.format(real_angle_flag))
        if angle_cls[0][0] == '0':
            if real_angle_flag:
                ret_angle = 0
            else:
                ret_angle = 270
        if angle_cls[0][0] == '180':
            if real_angle_flag:
                ret_angle = 180
            else:
                ret_angle = 90
        return ret_angle


def get_files_path_2(file_dir):
    '''Gets the paths of all files under the specified folder, including subfolders'''
    files_path = []
    # label = file_dir.split('/')[-1]
    for root, dirs, files in os.walk(file_dir):
        for file in files:
            path = os.path.join(root, file)
            files_path.append(path)
    return files_path
Q: Why instantiate two PaddleOCR objects?
A: When only one PaddleOCR instance is used for both detection and classification, the following warning appears and the direction cannot be detected:
[2021/07/03 12:51:32] root WARNING: Since the angle classifier is not initialized, the angle classifier will not be uesd during the forward process
This is probably an internal PaddleOCR issue. You can dig into it if you have time.
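The last step of get_img_real_angle combines two signals: the classifier label ('0' or '180') and the aspect-ratio flag. As I read the code, the classifier alone only separates upright from upside-down, and the flag tells whether the text lines run horizontally or vertically. A minimal sketch of that mapping as a lookup table follows; the names `ANGLE_TABLE` and `resolve_angle` are hypothetical, and the angle values are copied from the code above.

```python
# (classifier label, text lines are horizontal) -> returned angle in degrees
ANGLE_TABLE = {
    ("0", True): 0,
    ("0", False): 270,
    ("180", True): 180,
    ("180", False): 90,
}


def resolve_angle(cls_label, real_angle_flag):
    """Map the classifier label and the aspect-ratio flag (1 or 0) to an angle.

    Any label other than '0' or '180' falls back to 0, which matches the
    default value of ret_angle in the code above.
    """
    return ANGLE_TABLE.get((cls_label, bool(real_angle_flag)), 0)


print(resolve_angle("180", 0))
```

Writing the rule as a table makes the four cases easy to check against a labelled test set.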
Test
from time import time

get_image_rotation = GetImageRotation()
image_path = get_files_path_2('/Users/zhangzc/Desktop/workplace/ocrtest/test')
count = 0       # number of pictures whose detected angle is not 0
time_list = []
for path in image_path:
    # Skip macOS metadata files
    if os.path.basename(path) == '.DS_Store':
        continue
    t1 = time()
    angle = get_image_rotation.get_img_real_angle(path)
    t2 = time()
    print('----'*10)
    print(angle)
    print('cost time: {} s'.format(t2-t1))
    time_list.append(t2-t1)
    print('----'*10)
    # All test pictures are at 0 degrees, so any other angle is a detection error
    if angle != 0:
        print('****'*10)
        print(path)
        print('****'*10)
        count += 1
print('average cost time : {} s'.format(np.mean(time_list)))
Test results: 200 0-degree pictures, only 8 detection errors, 96% accuracy
Average time: 1.25s
Summary
1. Accuracy rose from the previous 60% to 96%.
2. The average time per picture fell from the previous 2 s to 1.25 s.
3. So far this has only been tested on 0-degree pictures. Interested readers can test rotated pictures themselves.
4. If you have a better optimization scheme, you are welcome to send me a private message at any time. Thank you very much.
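For the rotated-picture test mentioned in point 3, a small harness that reports accuracy per expected angle might look like the following. It is a sketch under assumptions: `accuracy_by_angle` is a hypothetical helper, the file names are made up, and `fake_predict` is a stand-in that would be replaced by `GetImageRotation().get_img_real_angle` in a real run.

```python
from collections import defaultdict


def accuracy_by_angle(samples, predict):
    """samples: iterable of (path, expected_angle); predict: path -> angle."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for path, expected in samples:
        totals[expected] += 1
        if predict(path) == expected:
            hits[expected] += 1
    return {angle: hits[angle] / totals[angle] for angle in totals}


def fake_predict(path):
    # Stand-in predictor for illustration only; not a real detector.
    return 90 if "rot90" in path else 0


samples = [
    ("a_rot0.jpg", 0),
    ("b_rot90.jpg", 90),
    ("c_rot180.jpg", 180),
    ("d_rot0.jpg", 0),
]
report = accuracy_by_angle(samples, fake_predict)
for angle in sorted(report):
    print("{} degrees: {:.0%}".format(angle, report[angle]))
```

Reporting per angle rather than one overall number shows which rotations the method handles poorly.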
Related articles:
Some optimizations on improving the accuracy of OCR recognition (1)
Some optimizations on improving the accuracy of OCR recognition (2)