Using Python + Selenium with Chaojiying (Super Eagle) to crack image recognition verification codes

When you write a crawler, you will certainly run into many kinds of verification codes, such as the image recognition type covered in this article:

  When I was crawling Lagou, I was redirected to the verification system far too often. It was really annoying.

URL (Secure access verification - Lagou): https://sec.lagou.com/verify.html?e=2&f=https://www.lagou.com/jobs/list_python?labelWords=&fromSearch=true&suginput=

 

  We need to click the verification button here first, and then the image verification code pops up. Once the recognition is completed, clicking OK takes us straight to the job listing page.

 

OK, let's write the setup code first:

from selenium import webdriver
from selenium.webdriver import ChromeOptions   # Used to hide the automation flag and lower the risk of being detected
import time  # Delays
from chaojiying import Chaojiying_Client  # Chaojiying (Super Eagle) client
from selenium.webdriver import ActionChains  # Action chain

# These two lines help evade automation detection
option = ChromeOptions()
option.add_experimental_option('excludeSwitches', ['enable-automation'])
driver_path = r'C:\Users\Godzilla\AppData\Local\Google\Chrome\Application\chromedriver.exe'  # Path to chromedriver

driver = webdriver.Chrome(executable_path=driver_path, options=option)  # Start Chrome with the driver path and the evasion option

def register():
        driver.get('https://sec.lagou.com/verify.html?e=2&f=https://www.lagou.com/jobs/list_python?labelWords=&fromSearch=true&suginput=')

Here we first locate the element and then simulate a click with Selenium. You can see that the button sits under the //a[@class="btn"] tag:

        driver.find_element_by_xpath('//a[@class="btn"]').click()  # Click the verification button
        time.sleep(2)  # Wait for the image verification code to pop up

After the simulated click, locate the area where the verification code is shown and take a screenshot of that area. This is written as two separate steps; the reason is explained later.

  We also need to import Chaojiying (Super Eagle) here. If you are not familiar with it, you can read another article I wrote:

How to use Chaojiying (Super Eagle), m0_59874815's blog on CSDN: https://blog.csdn.net/m0_59874815/article/details/121007373?spm=1001.2014.3001.5501

A supplementary note: because computer screens differ in size and display scaling, an offset normally has to be applied when taking the screenshot (otherwise the screenshot is incomplete). I'm too lazy to calculate the offset here, so I simply set the display scaling to 100% in the system settings.
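
If you would rather keep the system scaling as it is, one option is to convert the coordinates instead. The sketch below assumes that the screenshot is captured in physical pixels while Selenium offsets are measured in CSS pixels, so each coordinate is divided by the scale factor. The name to_css_offset and the scale parameter are my own and are not part of Selenium or Chaojiying. I have not checked this on every setup, so treat it as a starting point only.

```python
def to_css_offset(x, y, scale=1.0):
    """Convert screenshot pixel coordinates to CSS pixel offsets.

    scale is the display scaling factor, e.g. 1.25 for 125%.
    Hypothetical helper: assumes screenshot pixels = CSS pixels * scale.
    """
    if scale <= 0:
        raise ValueError('scale must be positive')
    return round(x / scale), round(y / scale)


# A point at (250, 125) in a screenshot taken at 125% scaling
print(to_css_offset(250, 125, scale=1.25))  # (200, 100)
```

In the browser the factor can usually be read with driver.execute_script('return window.devicePixelRatio').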

 

        code_img = driver.find_element_by_xpath('//div[@class="geetest_widget"]')  # Locate the verification code element
        code_imgs = driver.find_element_by_xpath('//div[@class="geetest_widget"]').screenshot_as_png  # Screenshot of the verification code area (PNG bytes)
        chaojiying = Chaojiying_Client('Super Eagle account', 'Super Eagle password', 'Software id')  # Chaojiying (Super Eagle) credentials
        result = chaojiying.PostPic(code_imgs, 9008)['pic_str']  # Recognized click coordinates, returned as a string
        kk = driver.find_element_by_xpath('//div[@class="geetest_commit_tip"]')  # Locate the confirm button
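
Indexing ['pic_str'] directly hides any error reported by the service. Below is a minimal sketch of a check, assuming the response dictionary carries the err_no, err_str, pic_id and pic_str fields used by the Chaojiying sample client, with err_no equal to 0 meaning success. extract_pic_str is a hypothetical helper name, and the sample dictionary is made up for illustration.

```python
def extract_pic_str(response):
    """Return (pic_id, pic_str) from a Chaojiying-style response dict.

    Assumes err_no == 0 means success; raises RuntimeError otherwise.
    """
    if response.get('err_no', -1) != 0:
        raise RuntimeError('recognition failed: %s' % response.get('err_str', 'unknown error'))
    pic_str = response.get('pic_str', '')
    if not pic_str:
        raise RuntimeError('empty recognition result')
    return response.get('pic_id'), pic_str


# Made-up response for illustration only, not real service output
sample = {'err_no': 0, 'err_str': 'OK', 'pic_id': '1', 'pic_str': '108,133|227,143'}
print(extract_pic_str(sample))  # ('1', '108,133|227,143')
```

Keeping pic_id around is useful if you later want to report a wrong recognition result back to the service.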

From Chaojiying's recognition result we can see that the returned coordinates are separated by |, so we extract them and put them into a list:

        all_list = []   # Store the coordinates of the points to be clicked
        # Several points come back as 'x1,y1|x2,y2', a single point as 'x,y'.
        # split('|') covers both cases, so no separate else branch is needed
        # (the original else branch referenced list_1 before it was defined).
        for point in result.split('|'):
                x, y = point.split(',')
                all_list.append([int(x), int(y)])
        #Traverse the list and click on the position specified by x and y corresponding to each list element using the action chain
        print(all_list)
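
To make the format concrete, here is a small self-contained sketch of the same parsing step written as a function. parse_points is a name I made up, and the sample strings are invented for illustration rather than real Chaojiying output.

```python
def parse_points(pic_str):
    """Parse 'x1,y1|x2,y2' (or a single 'x,y') into a list of [x, y] pairs."""
    points = []
    for chunk in pic_str.split('|'):
        x_text, y_text = chunk.split(',')
        points.append([int(x_text), int(y_text)])
    return points


print(parse_points('108,133|227,143'))  # [[108, 133], [227, 143]]
print(parse_points('57,98'))            # [[57, 98]]
```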

Next, we need to traverse the list and click on the x and y corresponding to each list element using the action chain

        for x, y in all_list:
                # code_img is the reference element; x and y are the offsets of the click point within it
                ActionChains(driver).move_to_element_with_offset(code_img, x, y).click().perform()
                time.sleep(0.5)  # Pause 0.5 seconds after each click
        time.sleep(2)
        kk.click()   # Click the confirm button

As you can see, the element passed into move_to_element_with_offset() is code_img. If we had located the element and taken the screenshot in a single statement at the start, code_img would hold the screenshot data rather than the element, and an error would be reported here. That is why locating the element and taking the screenshot have to be written as two separate steps. In later tests this code did get past the verification system, although the recognition rate is on the low side (there is no way around it: the pictures are too blurry, and sometimes I can't make them out myself).
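
One more caveat on the offsets: the code in this article treats x and y as measured from the top-left corner of code_img, which matches the Selenium 3 behaviour it was written against. To my knowledge, Selenium 4.3 and later measure move_to_element_with_offset from the center of the element instead, so the coordinates would need to be shifted. A minimal sketch under that assumption follows; to_center_offset is a hypothetical helper, and width and height would come from code_img.size.

```python
def to_center_offset(x, y, width, height):
    """Shift top-left based (x, y) so it is relative to the element's center.

    Hypothetical helper: assumes the offset origin is the center of the element.
    """
    return x - width // 2, y - height // 2


# A point at (108, 133) inside a 300 x 400 element
print(to_center_offset(108, 133, 300, 400))  # (-42, -67)
```

If you are still on Selenium 3, pass x and y through unchanged.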

The complete code is as follows:

from selenium import webdriver
from selenium.webdriver import ChromeOptions   # Used to hide the automation flag and lower the risk of being detected
import time  # Delays
from chaojiying import Chaojiying_Client  # Chaojiying (Super Eagle) client
from selenium.webdriver import ActionChains  # Action chain

# These two lines help evade automation detection
option = ChromeOptions()
option.add_experimental_option('excludeSwitches', ['enable-automation'])
driver_path = r'C:\Users\Godzilla\AppData\Local\Google\Chrome\Application\chromedriver.exe'  # Path to chromedriver

driver = webdriver.Chrome(executable_path=driver_path, options=option)  # Start Chrome with the driver path and the evasion option

def register():
        driver.get('https://sec.lagou.com/verify.html?e=2&f=https://www.lagou.com/jobs/list_python?labelWords=&fromSearch=true&suginput=')
        driver.find_element_by_xpath('//a[@class="btn"]').click()  # Click the verification button
        time.sleep(2)  # Wait for the image verification code to pop up
        code_img = driver.find_element_by_xpath('//div[@class="geetest_widget"]')  # Locate the verification code element
        code_imgs = driver.find_element_by_xpath('//div[@class="geetest_widget"]').screenshot_as_png  # Screenshot of the verification code area (PNG bytes)
        chaojiying = Chaojiying_Client('Super Eagle account', 'Super Eagle password', 'Software ID')  # Chaojiying (Super Eagle) credentials
        result = chaojiying.PostPic(code_imgs, 9008)['pic_str']  # Recognized click coordinates, returned as a string
        kk = driver.find_element_by_xpath('//div[@class="geetest_commit_tip"]')  # Locate the confirm button
        print(result)
        all_list = []   # Store the coordinates of the points to be clicked
        # Several points come back as 'x1,y1|x2,y2', a single point as 'x,y'.
        # split('|') covers both cases, so no separate else branch is needed
        # (the original else branch referenced list_1 before it was defined).
        for point in result.split('|'):
                x, y = point.split(',')
                all_list.append([int(x), int(y)])
        #Traverse the list and click on the position specified by x and y corresponding to each list element using the action chain
        print(all_list)
        for x, y in all_list:
                # code_img is the reference element; x and y are the offsets of the click point within it
                ActionChains(driver).move_to_element_with_offset(code_img, x, y).click().perform()
                time.sleep(0.5)  # Pause 0.5 seconds after each click
        time.sleep(2)
        kk.click()  # Click the confirm button



if __name__ == '__main__':
    register()

 

This article is limited to technical exchange and learning. Please do not use it for any illegal purpose!

 

Keywords: Python Selenium crawler

Added by djdon11 on Sat, 30 Oct 2021 10:04:03 +0300