Selenium basic usage

Selenium+Python environment construction and configuration

selenium+Python environment configuration

Prerequisite: a Python development environment is already installed (Python 3.5 or later is recommended).
Installation steps:
Install selenium
Win: pip install selenium
Mac: pip3 install selenium

Install webdriver
For the webdriver addresses of major browsers, see: https://docs.seleniumhq.org/download/
Firefox: https://github.com/mozilla/geckodriver/releases/
Chrome: https://sites.google.com/a/chromium.org/chromedriver/ or http://chromedriver.storage.googleapis.com/index.html
IE: http://selenium-release.storage.googleapis.com/index.html
Note: the webdriver version must match both the browser version and the selenium version.

webdriver installation path
Win: copy the webdriver executable to the Python installation directory
Mac: copy the webdriver executable to the /usr/local/bin directory
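
One way to confirm that the driver was copied somewhere the system can find it is to look it up on PATH. The sketch below uses only the standard library. The helper name find_driver is made up for illustration, and chromedriver and geckodriver are assumed to be the file names of the downloaded executables.

```python
import shutil


def find_driver(name):
    """Return the full path of a driver executable found on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    # Executable names are assumptions; adjust them to the files you downloaded.
    for driver_name in ("chromedriver", "geckodriver"):
        path = find_driver(driver_name)
        print(driver_name, "->", path if path else "not found on PATH")
```

If a driver is reported as not found, selenium will most likely fail to start the matching browser until the executable is moved to a directory on PATH.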

Element positioning and browser basic operation

  • Launch browser
  • Normal mode start
    Launch Chrome browser:
from selenium import webdriver

browser = webdriver.Chrome()
browser.get('http://www.baidu.com/')
    Launch Firefox browser:

from selenium import webdriver

browser = webdriver.Firefox()
browser.get('http://www.baidu.com/')
    Launch IE browser:

from selenium import webdriver

browser = webdriver.Ie()
browser.get('http://www.baidu.com/')
  • Headless mode startup
    Headless Chrome is the Chrome browser running without a user interface. It lets a program use every feature Chrome supports without opening a browser window. Compared with a regular browser window, headless Chrome is more convenient for testing web applications, taking screenshots of websites, and crawling pages for information. Compared with the earlier PhantomJS and SlimerJS, headless Chrome is closer to a real browser environment.

Headless Chrome version requirements:
According to the official documentation, Mac and Linux require Chrome 59+, Windows requires Chrome 60+, and chromedriver must be version 2.30+.
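
The version rules above can be written down as a small check. This is only a sketch of the stated minimums. The function names are hypothetical, and the platform argument is assumed to look like the values of sys.platform ("win32", "darwin", "linux").

```python
def supports_headless(platform, chrome_major):
    """Return True if the Chrome major version meets the stated minimum."""
    # Windows needs Chrome 60+, Mac and Linux need Chrome 59+.
    minimum = 60 if platform.lower().startswith("win") else 59
    return chrome_major >= minimum


def chromedriver_ok(version):
    """Return True if a chromedriver version string is 2.30 or newer."""
    # Compare (major, minor) as integers so that "2.9" sorts below "2.30".
    parts = tuple(int(p) for p in version.split(".")[:2])
    return parts >= (2, 30)


if __name__ == "__main__":
    print(supports_headless("win32", 59))
    print(chromedriver_ok("2.46"))
```

Comparing the parts as integers rather than as text avoids treating "2.9" as newer than "2.30".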

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys

chrome_options = webdriver.ChromeOptions()

# Use headless (no interface) browser mode

chrome_options.add_argument('--headless')  # Add the no-interface option
chrome_options.add_argument('--disable-gpu')  # Without this option, positioning problems sometimes occur

# Start the browser and get the web page source code

browser = webdriver.Chrome(options=chrome_options)  # the older chrome_options= keyword is deprecated
mainUrl = "https://www.taobao.com/"
browser.get(mainUrl)
print(f"browser text = {browser.page_source}")
browser.quit()
  • Launch the browser with a loaded configuration
    By default, Selenium starts the browser without loading any user configuration. The following shows how to load a Chrome configuration:

Enter chrome://version/ in the Chrome address bar to view your "Profile Path", then pass that profile to the browser at startup. The code is as follows:

#coding=utf-8
from selenium import webdriver
option = webdriver.ChromeOptions()
# Set to the user's own data directory; the raw string keeps the backslashes from being read as escapes
option.add_argument(r'--user-data-dir=C:\Users\Administrator\AppData\Local\Google\Chrome\User Data')
driver = webdriver.Chrome(options=option)
Loading a Firefox configuration works somewhat differently:

# coding=utf-8
from selenium import webdriver
# Profile address
profile_directory = r'C:\Users\xxx\AppData\Roaming\Mozilla\Firefox\Profiles\1x41j9of.default'
# Load configuration
profile = webdriver.FirefoxProfile(profile_directory)
# Launch browser configuration
driver = webdriver.Firefox(profile)

Element positioning

id location: find_element_by_id()
name location: find_element_by_name()
class location: find_element_by_class_name()
link location: find_element_by_link_text()
partial link location: find_element_by_partial_link_text()
tag location: find_element_by_tag_name()
xpath location: find_element_by_xpath()
css location: find_element_by_css_selector()
#coding=utf-8
import time
from selenium import webdriver
browser=webdriver.Firefox()
browser.get("http://www.baidu.com")
#########Positioning method of Baidu input box##########
#Locate by id
browser.find_element_by_id("kw").send_keys("selenium")
#Locate by name
browser.find_element_by_name("wd").send_keys("selenium")
#Locate by tag name
browser.find_element_by_tag_name("input").send_keys("selenium")
#Locate by class name
browser.find_element_by_class_name("s_ipt").send_keys("selenium")
#Positioning by CSS
browser.find_element_by_css_selector("#kw").send_keys("selenium")
#Locate by xpath
browser.find_element_by_xpath("//input[@id='kw']").send_keys("selenium")
############################################
browser.find_element_by_id("su").click()
time.sleep(3)
browser.quit()

Solution when class contains spaces:

When locating elements in practice, the class attribute is often a compound of several class names separated by spaces. Passing the whole value to find_element_by_class_name raises an error. It can be handled in the following ways:

  • The class attribute is unique but contains spaces: use any single class name from it that is unique on the page
  • If no single class name is unique, locate the group and select by index
    self.driver.find_elements_by_class_name('table-dragColumn')[0].click()
  • Locate with a CSS selector (each space is replaced by '.')
    #Prefix the first class name with a dot (.) and replace each space with a dot (.)
    self.driver.find_element_by_css_selector('.dtb-style-1.table-dragColumns').click()
    #Match the entire class attribute value
    self.driver.find_element_by_css_selector('[class="dtb-style-1 table-dragColumns"]').click()
    Reference code:
from selenium import webdriver
driver = webdriver.Firefox()
driver.get("http://mail.126.com/")
driver.implicitly_wait(20)
 
driver.switch_to.frame("x-URS-iframe")

# Method 1: take a single class attribute
driver.find_element_by_class_name("dlemail").send_keys("yoyo")
driver.find_element_by_class_name("dlpwd").send_keys("12333")
 
# Method 2: locate the whole group and pick by index (the least robust approach)
driver.find_elements_by_class_name("j-inputtext")[0].send_keys("yoyo")
driver.find_elements_by_class_name("j-inputtext")[1].send_keys("12333")
 
# Method 3: css positioning
driver.find_element_by_css_selector(".j-inputtext.dlemail").send_keys("yoyo")
driver.find_element_by_css_selector(".j-inputtext.dlpwd").send_keys("123")
 
# Method 4: it is also possible to take a single class attribute
driver.find_element_by_css_selector(".dlemail").send_keys("yoyo")
driver.find_element_by_css_selector(".dlpwd").send_keys("123")
 
# Method 5: CSS attribute positioning method directly containing spaces
driver.find_element_by_css_selector("[class='j-inputtext dlemail']").send_keys("yoyo")
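
The dot-joining rule used in Method 3 can be automated. The helper below is a hypothetical convenience function, not part of selenium: it turns the raw value of a class attribute into a CSS selector that requires every listed class.

```python
def class_attr_to_css(class_attr, tag=""):
    """Build a CSS selector matching elements that carry every class in class_attr."""
    names = class_attr.split()  # split() also tolerates repeated or leading spaces
    if not names:
        raise ValueError("class attribute is empty")
    return tag + "".join("." + name for name in names)


if __name__ == "__main__":
    selector = class_attr_to_css("j-inputtext dlemail")
    print(selector)
    # With a real driver (assumed to exist) this would be used as:
    # driver.find_element_by_css_selector(selector).send_keys("yoyo")
```

Unlike the attribute form [class='j-inputtext dlemail'], a selector built this way still matches if the page later adds another class or changes the order of the class names.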

Three waiting modes in selenium

Sometimes, to keep a script stable, waiting time needs to be added to it.

  • Forced waiting
    The first and simplest way is a forced wait with sleep(xx), which requires importing the time module. Whether or not the browser has finished loading, the program waits the full 3 seconds and only then continues with the following code. This is very useful for debugging and is occasionally acceptable in real code, but it should not be the default waiting method: it is too rigid and seriously slows the program down.
# -*- coding: utf-8 -*-
from selenium import webdriver
import time

driver = webdriver.Firefox()
driver.get('http://baidu.com')

time.sleep(3)  # Force a wait of 3 seconds before proceeding to the next step

print(driver.current_url)
driver.quit()
  • Implicit waiting
    The second method is implicit waiting. Adding a call to implicitly_wait() gives a simple form of intelligent waiting: implicitly_wait(30) is smarter than time.sleep(), because sleep always waits a fixed time, while an implicit wait continues as soon as the condition is met within the time limit.
# -*- coding: utf-8 -*-
from selenium import webdriver

driver = webdriver.Firefox()
driver.implicitly_wait(30)  # Wait implicitly for up to 30 seconds
driver.get('http://baidu.com')

print(driver.current_url)
driver.quit()

Implicit waiting sets a maximum waiting time for locating elements. If the element is found within that time, the next step runs at once; otherwise the lookup keeps retrying until the time expires and then raises NoSuchElementException. One drawback is that it can only wait for an element to exist in the DOM. An element may already be present but not yet visible or clickable because some slow JavaScript is still running, and an implicit wait cannot express that. To wait for exactly the condition you need, use the other waiting method selenium provides: explicit waiting.
Note that an implicit wait applies for the entire lifetime of the driver, so it only needs to be set once. Some people treat it like sleep and call it before every step, which is unnecessary.

  • Explicit waiting
    The third method is explicit waiting with WebDriverWait. Combined with the class's until() and until_not() methods, it waits flexibly on a condition. In short: the program checks every xx seconds; if the condition holds, it moves on to the next step, otherwise it keeps waiting until the maximum time is exceeded and then throws TimeoutException.

The WebDriverWait class in the wait module is the explicit wait class. First look at its parameters and methods:

selenium.webdriver.support.wait.WebDriverWait (class)

__init__

driver: the WebDriver instance, that is, the driver in the examples above
timeout: the maximum waiting time (keep the implicit wait time in mind as well)
poll_frequency: the interval between calls to the method passed to until or until_not. The default is 0.5 seconds
ignored_exceptions: exceptions to ignore. If an exception in this tuple is thrown while until or until_not is running, the wait continues without interruption; if any other exception is thrown, the wait is interrupted and the exception propagates. By default the tuple contains only NoSuchElementException.
until

method: the callable invoked at regular intervals (poll_frequency) during the wait, until its return value is no longer False
message: if the wait times out, TimeoutException is thrown with message as its text
until_not

The opposite of until: until continues when an element appears or a condition becomes true,
while until_not continues when an element disappears or a condition stops being true. The parameters are the same and are not repeated here.
From the above, the calling form is basically clear:

WebDriverWait(driver, timeout, poll_frequency, ignored_exceptions).until(method, message)
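
To make the roles of these parameters concrete, here is a minimal pure-Python sketch of the polling loop that until() performs. It is an illustration of the idea rather than selenium's actual source: wait_until and WaitTimeout are made-up names, and WaitTimeout stands in for selenium's TimeoutException.

```python
import time


class WaitTimeout(Exception):
    """Stand-in for selenium's TimeoutException in this sketch."""


def wait_until(driver, method, timeout, poll_frequency=0.5,
               ignored_exceptions=(), message=""):
    """Call method(driver) repeatedly until it returns a truthy value or time runs out."""
    end_time = time.monotonic() + timeout
    while True:
        try:
            value = method(driver)
            if value:
                return value  # the condition holds: hand back whatever it returned
        except ignored_exceptions:
            pass  # treated as "not ready yet", keep polling
        if time.monotonic() > end_time:
            raise WaitTimeout(message)
        time.sleep(poll_frequency)


if __name__ == "__main__":
    attempts = []

    def ready_on_third_call(_driver):
        attempts.append(1)
        return len(attempts) >= 3

    print(wait_until(None, ready_on_third_call, timeout=2, poll_frequency=0.1))
```

The loop returns the condition's own return value, which is why a condition that returns an element lets the caller chain .click() directly onto until().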

Pay special attention to the method argument of until or until_not. Many people pass in a WebElement object, as follows:

WebDriverWait(driver, 10).until(driver.find_element_by_id('kw'))  # error

This usage is incorrect. The argument must be callable, that is, the object must have a __call__() method; otherwise an exception is thrown:

TypeError: 'xxx' object is not callable

Here you can use the conditions provided by selenium's expected_conditions module, the is_displayed(), is_enabled() and is_selected() methods of WebElement, or a method you wrap yourself.
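
A self-written condition only has to be a callable that accepts the driver. The sketch below defines one as a class with __call__(). The name title_matches is hypothetical, and a SimpleNamespace with a title attribute stands in for a real driver so that the example runs without a browser.

```python
import re
from types import SimpleNamespace


class title_matches:
    """Custom condition: true when the page title matches a regular expression."""

    def __init__(self, pattern):
        self.pattern = re.compile(pattern)

    def __call__(self, driver):
        # until() invokes the condition with the driver as its only argument
        return self.pattern.search(driver.title) is not None


if __name__ == "__main__":
    fake_driver = SimpleNamespace(title="selenium - search results")  # stand-in for a WebDriver
    condition = title_matches(r"^selenium\b")
    print(condition(fake_driver))
```

With a real driver the same object would be passed as WebDriverWait(driver, 10).until(title_matches(r"^selenium\b")). For a one-off check, a lambda such as lambda d: d.find_element_by_id('kw') is also callable and works the same way.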

#coding=utf-8
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

base_url = "http://www.baidu.com"
driver = webdriver.Firefox()
driver.implicitly_wait(5)
'''When both an implicit wait and an explicit wait are set, the timeout is the greater of the two'''
locator = (By.ID,'kw')
driver.get(base_url)

WebDriverWait(driver,10).until(EC.title_is(u"Baidu once, you know"))
'''Check whether the title equals the expected value exactly; returns a Boolean'''

WebDriverWait(driver,10).until(EC.title_contains(u"use Baidu Search"))
'''Check whether the title contains the expected string; returns a Boolean'''

WebDriverWait(driver,10).until(EC.presence_of_element_located((By.ID,'kw')))
'''Check whether an element has been added to the DOM tree; it is not necessarily visible. Returns the WebElement once located'''

WebDriverWait(driver,10).until(EC.visibility_of_element_located((By.ID,'su')))
'''Check whether an element is in the DOM and visible; visible means it is displayed and both width and height are greater than 0'''

WebDriverWait(driver,10).until(EC.visibility_of(driver.find_element(by=By.ID,value='kw')))
'''Check whether an already located element is visible; returns the element if it is'''

WebDriverWait(driver,10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR,'.mnav')))
'''Check whether at least one matching element exists in the DOM tree; returns the list of located elements'''

WebDriverWait(driver,10).until(EC.visibility_of_any_elements_located((By.CSS_SELECTOR,'.mnav')))
'''Check whether at least one matching element is visible on the page; returns the list of visible elements'''

WebDriverWait(driver,10).until(EC.text_to_be_present_in_element((By.XPATH,"//*[@id='u1']/a[8]"),u' setting '))
'''Check whether the text of the specified element contains the expected string; returns a Boolean'''

WebDriverWait(driver,10).until(EC.text_to_be_present_in_element_value((By.CSS_SELECTOR,'#su'),u' Baidu once '))
'''Check whether the value attribute of the specified element contains the expected string; returns a Boolean'''

#WebDriverWait(driver,10).until(EC.frame_to_be_available_and_switch_to_it(locator))
'''Check whether the frame can be switched into; if so, switch into it and return True, otherwise return False'''
#Note: this page has no frame that can be switched into

WebDriverWait(driver,10).until(EC.invisibility_of_element_located((By.CSS_SELECTOR,'#swfEveryCookieWrap')))
'''Check whether an element is absent from the DOM or invisible; returns False if it is visible, or the element if it is invisible'''
#Note: #swfEveryCookieWrap is a hidden element on this page

WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//*[@id='u1']/a[8]"))).click()
'''Check whether an element is visible and enabled, which means it can be clicked'''
driver.find_element_by_xpath("//*[@id='wrapper']/div[6]/a[1]").click()
#WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//*[@id='wrapper']/div[6]/a[1]"))).click()

#WebDriverWait(driver,10).until(EC.staleness_of(driver.find_element(By.ID,'su')))
'''Wait for an element to be removed from the DOM tree'''
#There is no suitable example here

WebDriverWait(driver,10).until(EC.element_to_be_selected(driver.find_element(By.XPATH,"//*[@id='nr']/option[1]")))
'''Determine whether an element is selected,Generally used in drop-down lists'''

WebDriverWait(driver,10).until(EC.element_selection_state_to_be(driver.find_element(By.XPATH,"//*[@id='nr']/option[1]"),True))
'''Judge whether the selected state of an element meets the expectation'''

WebDriverWait(driver,10).until(EC.element_located_selection_state_to_be((By.XPATH,"//*[@id='nr']/option[1]"),True))
'''Judge whether the selected state of an element meets the expectation'''
driver.find_element_by_xpath(".//*[@id='gxszButton']/a[1]").click()

instance = WebDriverWait(driver,10).until(EC.alert_is_present())
'''Check whether an alert is present on the page; if so, switch to it and return the alert object'''
print(instance.text)
instance.accept()

driver.close()

Browser operation

Browser maximization and minimization

Maximize the browser window
browser.maximize_window()

Minimize the browser window
browser.minimize_window()

Setting the browser window size

Set the browser window to a width of 480 and a height of 800
browser.set_window_size(480, 800)

Browser forward and back

Go forward
browser.forward()

Go back
browser.back()

Operation test object

Generally speaking, the following methods are commonly used to manipulate objects in webdriver:

click -- click the object
send_keys -- simulate keyboard input on the object
clear -- clear the contents of the object, if possible
submit -- submit the contents of the object, if possible
text -- get the text of the element

Keyboard events

To send keyboard keys, import the Keys class:
from selenium.webdriver.common.keys import Keys
Pass a key to send_keys():
send_keys(Keys.TAB) # TAB
send_keys(Keys.ENTER) # enter

Reference code:

#coding=utf-8 
from selenium import webdriver 
from selenium.webdriver.common.keys import Keys #The keys package needs to be introduced
import os,time

driver = webdriver.Firefox() 
driver.get("http://passport.kuaibo.com/login/?referrer=http%3A%2F%2Fwebcloud.kuaibo.com%2F")

time.sleep(3) 
driver.maximize_window() # Maximize the browser window

driver.find_element_by_id("user_name").clear() 
driver.find_element_by_id("user_name").send_keys("fnngj")

#Pressing TAB moves focus to the password box and clears its default prompt, which has the same effect as clear() above
driver.find_element_by_id("user_name").send_keys(Keys.TAB) 
time.sleep(3) 
driver.find_element_by_id("user_pwd").send_keys("123456")

#Press ENTER in the password box instead of clicking the login button
driver.find_element_by_id("user_pwd").send_keys(Keys.ENTER)

#Alternatively, locate the login button and press ENTER on it instead of calling click() 
driver.find_element_by_id("login").send_keys(Keys.ENTER) 
time.sleep(3)

driver.quit()
Usage of keyboard combination keys:

#ctrl+a select all 
driver.find_element_by_id("kw").send_keys(Keys.CONTROL,'a')
#ctrl+x cuts the contents of the input box 
driver.find_element_by_id("kw").send_keys(Keys.CONTROL,'x')

Mouse events

Mouse events generally include right clicking, double clicking, dragging, moving the mouse over an element, and so on.
They require the ActionChains class.
Import it with:
from selenium.webdriver.common.action_chains import ActionChains

Common methods of ActionChains:
perform() executes all actions stored in the ActionChains;
context_click() right click;
double_click() double click;
drag_and_drop() drag;
move_to_element() mouse over.
Double click mouse example:

#Locate the element you want to double-click
qqq = driver.find_element_by_xpath("xxx")
#Double-click the located element
ActionChains(driver).double_click(qqq).perform()
Mouse drag and drop example:

#Locate the original location of the element 
element = driver.find_element_by_name("source") 
#Locate the target location to which the element is to be moved 
target = driver.find_element_by_name("target")
#Perform the move operation of the element 
ActionChains(driver).drag_and_drop(element, target).perform()

Multi-level frame / window positioning

When locating elements, we often run into elements that cannot be found. This is generally caused by one of the following:

Incorrect element positioning method
The page has an iframe or an embedded window
Page timeout
webdriver provides the switch_to.frame() method, which easily solves the iframe case.
Usage:

#First switch into iframe1 (id = f1)
browser.switch_to.frame("f1")
Similarly, if it is an embedded window:
browser.switch_to.window("f1")

Expected Conditions resolution

There are two usage scenarios for Expected Conditions:

Use directly in assertions
Use with WebDriverWait to dynamically wait for elements on the page to appear or disappear
Relevant methods:

title_is: determines whether the title of the current page exactly equals the expected value
title_contains: determines whether the title of the current page contains the expected string
presence_of_element_located: determines whether an element has been added to the DOM tree; the element is not necessarily visible
visibility_of_element_located: determines whether an element is visible. Visible means the element is not hidden and its width and height are not 0
visibility_of: does the same as the method above, except that the method above takes a locator while this one takes an already located element
presence_of_all_elements_located: determines whether at least one element exists in the DOM tree. For example, if n elements on the page have the class 'column-md-3', this condition is satisfied as long as one of them exists
text_to_be_present_in_element: determines whether the text of an element contains the expected string
text_to_be_present_in_element_value: determines whether the value attribute of an element contains the expected string
frame_to_be_available_and_switch_to_it: determines whether the frame can be switched into. If so, switches into it and returns True; otherwise returns False
invisibility_of_element_located: determines whether an element is absent from the DOM tree or invisible
element_to_be_clickable: determines whether an element is visible and enabled; such an element is called clickable
staleness_of: waits for an element to be removed from the DOM tree. Note that this method also returns True or False
element_to_be_selected: determines whether an element is selected. It is generally used with drop-down lists
element_selection_state_to_be: determines whether the selected state of an element matches the expectation
element_located_selection_state_to_be: does the same as the method above, except that the method above takes a located element while this one takes a locator
alert_is_present: determines whether an alert is present on the page
Example:
Checking the title: title_is(), title_contains()

First import the expected_conditions module
Because the module name is long, it is aliased as EC for later calls (a bit like aliasing tables when querying several of them in a database)
After opening the home page, check the title; the result is True or False

# coding:utf-8
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Firefox()
driver.get("http://baidu.com")
# Check whether the title is exactly equal to the expected value
title = EC.title_is(u'Baidu')
print(title(driver))

# Check whether the title contains the expected string
title1 = EC.title_contains(u'Baidu')
print(title1(driver))

# Another way to write it
r1 = EC.title_is(u'Baidu')(driver)
r2 = EC.title_contains(u'Baidu')(driver)
print(r1)
print(r2)

Selenium checklist

Python Webdriver Exception quick lookup table

Various exceptions may occur while using webdriver. We need to understand each exception and know how to handle it.

Exception: description
WebDriverException: the base class of all webdriver exceptions. Thrown when an error occurs that does not belong to any of the exceptions below
InvalidSwitchToTargetException: the parent class of the following two exceptions. Thrown when the target to switch to does not exist
NoSuchFrameException: thrown when switch_to.frame() is used to switch into a nonexistent frame
NoSuchWindowException: thrown when switch_to.window() is used to switch to a nonexistent window
NoSuchElementException: the element does not exist. Usually thrown by find_element (find_elements returns an empty list instead)
NoSuchAttributeException: generally thrown when you read an attribute the element does not have. Note that some attributes have different names in different browsers
StaleElementReferenceException: the referenced element is out of date and no longer in the current DOM tree. It may have been deleted, or the page or iframe may have been refreshed
UnexpectedAlertPresentException: thrown when an unexpected alert appears and blocks execution of a command
NoAlertPresentException: thrown when you try to get an alert but none is present
InvalidElementStateException: the parent class of the following two exceptions. Thrown when the element's state does not allow the requested operation
ElementNotVisibleException: the element exists but is not visible, so it cannot be interacted with
ElementNotSelectableException: thrown when you try to select an element that cannot be selected
InvalidSelectorException: usually thrown when your xpath syntax is wrong
InvalidCookieDomainException: thrown when you try to add a cookie for a domain other than that of the current url
UnableToSetCookieException: thrown when the driver cannot add a cookie
TimeoutException: thrown when a command does not complete in time
MoveTargetOutOfBoundsException: thrown when the move operation of ActionChains targets a position outside the window
UnexpectedTagNameException: thrown when the tag of the element obtained is not the expected one, for example when Select is instantiated with an element whose tag is not select
ImeNotAvailableException: thrown when the input method is not supported. This exception and the next one are uncommon; the ime engine is reportedly only used for Chinese / Japanese input under linux
ImeActivationFailedException: thrown when activation of the input method fails
ErrorInResponseException: uncommon. May be thrown when an error occurs on the server side
RemoteDriverServerException: uncommon. Reportedly raised in some cases when the driver fails to start the browser
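
Some of these exceptions, StaleElementReferenceException in particular, are often transient, and one common way to handle them is to repeat the action a few times. The helper below is a generic sketch with a hypothetical name (retry). It takes the exception types as an argument, so it runs without selenium installed.

```python
import time


def retry(action, exceptions, attempts=3, delay=0.5):
    """Run action(); if it raises one of the given exceptions, wait and try again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the original exception
            time.sleep(delay)


if __name__ == "__main__":
    # With selenium installed and a driver available (both assumed), usage might look like:
    # from selenium.common.exceptions import StaleElementReferenceException
    # retry(lambda: driver.find_element_by_id("su").click(),
    #       (StaleElementReferenceException,))
    outcomes = iter([ValueError("first try fails"), "ok"])

    def sometimes_fails():
        result = next(outcomes)
        if isinstance(result, Exception):
            raise result
        return result

    print(retry(sometimes_fails, (ValueError,), delay=0))
```

The element lookup belongs inside the retried action, because a stale reference cannot be reused; it has to be located again.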

Quick lookup table of XPath & CSS positioning method

Description | XPath | CSS
Direct child element | //div/a | div > a
Child or descendant element | //div//a | div a
Locate by id | //div[@id='idValue']//a | div#idValue a
Locate by class | //div[@class='classValue']//a | div.classValue a
Following sibling element | //ul/li[@class='first']/following-sibling::li | ul > li.first + li
Attribute | //form/input[@name='username'] | form input[name='username']
Multiple attributes | //input[@name='continue' and @type='button'] | input[name='continue'][type='button']
4th child element | //ul[@id='list']/li[4] | ul#list li:nth-child(4)
First child element | //ul[@id='list']/li[1] | ul#list li:first-child
Last child element | //ul[@id='list']/li[last()] | ul#list li:last-child
Attribute contains a string | //div[contains(@title,'title')] | div[title*="title"]
Attribute starts with a string | //input[starts-with(@name,'user')] | input[name^="user"]
Attribute ends with a string | //input[ends-with(@name,'name')] | input[name$="name"]
Text contains a string | //div[contains(text(),'text')] | cannot be located
Element has an attribute | //div[@title] | div[title]
Parent node | //div/.. | cannot be located
Preceding sibling node | //li/preceding-sibling::div[1] | cannot be located
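
A few of the XPath rows can be tried without a browser using Python's standard xml.etree.ElementTree module. Keep in mind that ElementTree implements only a subset of XPath and is not what selenium uses (the browser evaluates the expressions there), so this is just a way to see the difference between / and // and the positional predicates. The sample markup is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample page, kept well-formed so an XML parser accepts it.
PAGE = """
<html><body>
  <div id="idValue">
    <a>direct</a>
    <p><a>nested</a></p>
  </div>
  <ul id="list">
    <li>one</li><li>two</li><li>three</li><li>four</li><li>five</li>
  </ul>
</body></html>
"""

root = ET.fromstring(PAGE.strip())

# ElementTree paths are relative to the root element, hence the leading "."
direct = [a.text for a in root.findall(".//div/a")]             # direct children only
descendants = [a.text for a in root.findall(".//div//a")]       # children and deeper levels
by_id = [a.text for a in root.findall(".//div[@id='idValue']//a")]
fourth = root.find(".//ul[@id='list']/li[4]").text              # positions start at 1
last = root.find(".//ul[@id='list']/li[last()]").text

if __name__ == "__main__":
    print(direct, descendants, by_id, fourth, last)
```

The single slash returns only the link directly under the div, while the double slash also returns the link nested inside the paragraph.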
Tips
Here is an online code beautification tool:
https://carbon.now.sh

If you use VS Code, you can also install the corresponding plug-in to beautify code quickly.

Open VS Code and enter carbon-now-sh in the extensions search box
Click Install
Click Reload to finish the installation
Press the shortcut Alt + Cmd + A (on Windows: Alt + Win + A)

Added by feckless on Mon, 17 Jan 2022 22:17:29 +0200