Knowledge points:
Master switching between tabs with selenium;
Master switching into iframes with selenium;
Master obtaining cookies with selenium;
Master manual page waiting;
Master having selenium make the browser execute js code;
Master running selenium in headless (no-interface) mode;
Understand how selenium uses a proxy ip;
Understand how selenium replaces the user agent;
1. Switching between selenium tabs
When selenium controls a browser with multiple tabs open, how do we switch between them? There are two steps:
Get the window handles of all tabs;
Use a window handle to switch to the tab that the handle points to;
A window handle here is an identifier that points to a tab object.
Specific methods:
# 1. Get the list of handles for all currently open tabs
current_windows = driver.window_handles
# 2. Switch to a tab by indexing into the handle list
driver.switch_to.window(current_windows[0])
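from selenium import webdriver
from selenium.webdriver.common.by import By

url = 'https://jn.58.cn'
driver = webdriver.Chrome()
driver.get(url)

print(driver.current_url)
print(driver.window_handles)

# Locate and click the rent button, which opens a new tab
el = driver.find_element(By.XPATH, '/html/body/div[3]/div[1]/div[1]/div/div[1]/div[1]/span[1]/a')
el.click()

# The driver still points at the original tab until we switch
print(driver.current_url)
print(driver.window_handles)

# Switch to the last handle in the list, here the newly opened tab
driver.switch_to.window(driver.window_handles[-1])

# find_elements (plural) returns a list, so len() works on the result
el_list = driver.find_elements(By.XPATH, '/html/body/div[7]/div[2]/ul/li/div[2]/h2/a')
print(len(el_list))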
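The example above picks the new tab with `window_handles[-1]`, which relies on the new handle being last in the list. A slightly more defensive sketch is to remember the handles before the click and switch to whichever handle is new afterwards. The helper name `switch_to_new_tab` is hypothetical and not part of selenium; it only uses `driver.window_handles` and `driver.switch_to.window()`.

```python
def switch_to_new_tab(driver, known_handles):
    """Switch to the first tab whose handle is not in known_handles.

    Returns the new handle, or None if no new tab has appeared yet.
    """
    for handle in driver.window_handles:
        if handle not in known_handles:
            driver.switch_to.window(handle)
            return handle
    return None

# Typical use (assumes `driver` is a live WebDriver and `el` opens a new tab):
#   known = list(driver.window_handles)
#   el.click()
#   switch_to_new_tab(driver, known)
```

If the function returns None, the new tab has probably not opened yet, and the call can simply be repeated after a short wait.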
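from selenium import webdriver
from selenium.webdriver.common.by import By

url = 'https://qzone.qq.com/'
driver = webdriver.Chrome()
driver.get(url)

# Option 1: switch into the iframe by its id or name
driver.switch_to.frame('login_frame')
# Option 2: locate the iframe element first, then switch using the element
# el_frame = driver.find_element(By.XPATH, '//*[@id="login_frame"]')
# driver.switch_to.frame(el_frame)

# Elements inside the iframe can only be located after switching into it
driver.find_element(By.ID, 'switcher_plogin').click()
driver.find_element(By.ID, 'u').send_keys('your_qq_number')  # placeholder account
driver.find_element(By.ID, 'p').send_keys('your_password')   # placeholder password
driver.find_element(By.ID, 'login_button').click()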
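After working inside an iframe, the driver stays in that frame until it is told to leave, so later lookups in the main page fail. One possible way to make this harder to forget is a small context manager that switches in and always switches back with `driver.switch_to.default_content()`. The name `inside_frame` is hypothetical and not part of selenium.

```python
from contextlib import contextmanager

@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame for the duration of a with-block, then switch back."""
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Return to the top-level document even if the with-block raised
        driver.switch_to.default_content()

# Typical use (assumes `driver` is a live WebDriver on a page with this iframe):
#   with inside_frame(driver, 'login_frame'):
#       ...locate and fill in the login form here...
#   ...lookups here run against the main page again...
```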
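from selenium import webdriver

url = 'http://www.baidu.com'
driver = webdriver.Chrome()
driver.get(url)

# get_cookies() returns a list of dicts, one per cookie
print(driver.get_cookies())

# Convert the list into a simple {name: value} dict
cookies = {data['name']: data['value'] for data in driver.get_cookies()}
print(cookies)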
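The cookie list returned by `driver.get_cookies()` is often reused outside the browser, for example in another HTTP client. The sketch below shows two conversions on a made-up sample list: a `{name: value}` dict and a `Cookie` request-header string. The helper names and the sample values are hypothetical; only the `name` and `value` keys are assumed from the cookie dicts.

```python
def cookies_to_dict(cookie_list):
    """Reduce selenium-style cookie dicts to a {name: value} mapping."""
    return {c['name']: c['value'] for c in cookie_list}

def cookies_to_header(cookie_list):
    """Join cookies into the 'name=value; name=value' form of a Cookie header."""
    return '; '.join('{}={}'.format(c['name'], c['value']) for c in cookie_list)

# Made-up sample in the shape of a get_cookies() result
sample = [
    {'name': 'sessionid', 'value': 'abc123', 'domain': '.example.com', 'path': '/'},
    {'name': 'lang', 'value': 'en', 'domain': '.example.com', 'path': '/'},
]

print(cookies_to_dict(sample))    # {'sessionid': 'abc123', 'lang': 'en'}
print(cookies_to_header(sample))  # sessionid=abc123; lang=en
```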
1.2 Deleting cookies
# Delete a single cookie by name
driver.delete_cookie("CookieName")
# Delete all cookies
driver.delete_all_cookies()
2. Executing js code in the browser with selenium
selenium can make the browser execute js code that we specify. Run the following code to see the effect.
from selenium import webdriver
from selenium.webdriver.common.by import By

url = 'https://jn.lianjia.com'
driver = webdriver.Chrome()
driver.get(url)

# js statement that scrolls the page to 500 pixels from the top
js = 'scrollTo(0,500)'
# Execute the js in the browser
driver.execute_script(js)

el_button = driver.find_element(By.XPATH, 'html/body/div[2]/ul/li[2]/a')
el_button.click()
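`execute_script` also accepts extra positional arguments, which the script can read as `arguments[0]`, `arguments[1]`, and so on. This avoids building js strings by hand. The following is a minimal sketch of scrolling down in several steps; the helper name `scroll_in_steps` and its parameters are hypothetical.

```python
import time

def scroll_in_steps(driver, steps=5, step_px=500, pause=0.0):
    """Scroll down `steps` times, `step_px` pixels further each time.

    Returns the list of vertical offsets that were requested.
    """
    positions = []
    for i in range(1, steps + 1):
        y = i * step_px
        # The second argument is passed to the script as arguments[0]
        driver.execute_script('window.scrollTo(0, arguments[0]);', y)
        positions.append(y)
        if pause:
            time.sleep(pause)  # optional pause so lazy content can load
    return positions

# Typical use (assumes `driver` is a live WebDriver with a page loaded):
#   scroll_in_steps(driver, steps=3, step_px=500, pause=1.0)
```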
2. Page waiting
While a page loads, time is spent waiting for the website's server to respond. During this time a tag element may not yet be loaded or visible. How do we handle this?
1) Classification of page waiting;
2) Introduction to forced waiting;
3) Introduction to explicit waiting;
4) Introduction to implicit waiting;
5) Implementing page waiting manually;
2.1 Classification of page waiting
First, the categories of page waiting in selenium:
1) Forced waiting;
2) Implicit waiting;
3) Explicit waiting;
2.2 Forced waiting (understand)
This is simply time.sleep().
Its drawback is that it is not intelligent: if the time is set too short, the element may not have loaded yet; if it is set too long, time is wasted.
2.3 Implicit waiting
Implicit waiting applies to element location. It sets a maximum time during which selenium keeps trying to locate the element. If the element is located within that time, execution moves on to the next step;
if it is still not located when the time runs out, a NoSuchElementException is raised.
Example code:
from selenium import webdriver
from selenium.webdriver.common.by import By

url = 'http://www.baidu.com'
driver = webdriver.Chrome()

# After this call, every element lookup waits up to 10 seconds. selenium keeps
# retrying the lookup during that time and raises an error if it never succeeds.
driver.implicitly_wait(10)

driver.get(url)
# img[10000] does not exist, so this lookup fails after about 10 seconds
el = driver.find_element(By.XPATH, '//*[@id="lg"]/img[10000]')
print(el)
3. Manually implement page waiting
After learning about forced, implicit, and explicit waiting, we find that none of them is a general solution to page waiting, for example when the page must be scrolled to trigger ajax asynchronous loading. Next, we take the Taobao home page as an example and implement page waiting manually.
Principle:
Combine the ideas of forced waiting and explicit waiting in a manual implementation;
Repeatedly check, up to a limited number of times, whether a tag object has been loaded (whether it exists).
The implementation code is as follows:
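import time
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

# Selenium 3 style driver path; in Selenium 4 pass the path via Service(...) instead
driver = webdriver.Chrome(r'C:\Program Files\Python3.9.4\Scripts/chromedriver')
driver.get('https://www.taobao.com/')
time.sleep(1)

for i in range(1, 11):  # try at most 10 times
    try:
        time.sleep(3)  # give the page time to load
        element = driver.find_element(By.XPATH, '//div[@class="shop-inner"]/h3[1]/a')
        print(element.get_attribute('href'))
        break
    except NoSuchElementException:
        # Not loaded yet: scroll further down to trigger ajax loading
        js = 'window.scrollTo(0,{})'.format(i * 500)  # js statement
        driver.execute_script(js)  # execute the js

driver.quit()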
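The retry loop above can be factored into a reusable helper. The sketch below is one possible shape, not a selenium API: `wait_for` and its parameters are hypothetical names. It calls a lookup function until it stops raising, runs an optional callback after each failure (for example, scrolling further), and re-raises the last error if every attempt fails. selenium's own explicit wait, `WebDriverWait` in `selenium.webdriver.support.ui`, covers the polling part but not the scroll-on-failure step.

```python
import time

def wait_for(find, attempts=10, delay=0.5, on_retry=None):
    """Call find() until it returns without raising, at most `attempts` times.

    on_retry(i) is called after the i-th failed attempt (i starts at 1).
    Re-raises the last exception if every attempt fails.
    """
    if attempts < 1:
        raise ValueError('attempts must be at least 1')
    last_error = None
    for i in range(1, attempts + 1):
        try:
            return find()
        except Exception as exc:  # with selenium, typically NoSuchElementException
            last_error = exc
            if on_retry is not None:
                on_retry(i)
            time.sleep(delay)
    raise last_error

# Typical use (assumes `driver`, `By`, and an `xpath` string already exist):
#   element = wait_for(
#       lambda: driver.find_element(By.XPATH, xpath),
#       attempts=10,
#       delay=3,
#       on_retry=lambda i: driver.execute_script(
#           'window.scrollTo(0, arguments[0]);', i * 500),
#   )
```

Keeping the lookup and the retry action as separate callables means the same helper works for scrolling pages, clicking a "load more" button, or plain polling with no action at all.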