Python in practice: capturing data from the Douyin (TikTok) mobile app
Environment preparation
- fiddler
- appium
- mitmproxy(mitmdump)
- python3.6
- Android virtual machine with root
- Android SDK
The Android emulator needs the Xposed framework with the JustTrustMe module installed. Douyin verifies SSL certificates, so once its traffic is routed through our packet capture tools the app can no longer connect to the network. JustTrustMe turns that SSL verification off.
mitmproxy and the Android SDK must be added to the environment variables; that step is not described here
Data collection interface SDK is required, please Click to view the interface document
Project preparation
First, we need to install the Fiddler and mitmproxy certificates on the virtual machine.
1) Fiddler. The main settings are as follows
The port can be any free port; this article uses 8889. On the host computer, open a command line and run ipconfig to view the local IP
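As an alternative to reading the ipconfig output by hand, the host's LAN address can also be looked up from Python. This is a minimal sketch, not part of the original project; the function name local_ip is made up for illustration, and 8.8.8.8 is just an arbitrary routable address used to make the OS pick an outgoing interface.

```python
import socket


def local_ip() -> str:
    """Return the LAN address this host would use for outbound traffic."""
    # Connecting a UDP socket sends no packets; it only makes the OS choose
    # the outgoing interface. 8.8.8.8 is an arbitrary routable address.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect(("8.8.8.8", 80))
        return sock.getsockname()[0]
    except OSError:
        # No route available (for example, offline): fall back to loopback.
        return "127.0.0.1"
    finally:
        sock.close()


if __name__ == "__main__":
    print(local_ip())
```

The printed address is the one to enter as the proxy host on the emulator, together with port 8889.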
Set the proxy on the emulator
Next, open the browser on the emulator and enter ip:port, such as 117.90.211.134:8889, then download and install the Fiddler certificate as follows
Next, we can use fiddler to correctly capture mobile phone packets
2) mitmproxy certificate. On Windows we normally use mitmdump, which ships with mitmproxy. Open cmd and enter mitmdump -p followed by the port number to start the service
For convenience, we also set the port number to 8889. Fiddler must be closed first, otherwise the two will conflict over the port. Then open the browser on the emulator to see whether it can receive data
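Before starting mitmdump it can help to confirm that Fiddler has really released the port. The following is a rough sketch under the assumption that the proxy listens on all interfaces; port_is_free is a hypothetical helper name, not something provided by Fiddler or mitmproxy.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if this process can bind host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Another program (for example Fiddler) already holds the port.
            return False
        return True


if __name__ == "__main__":
    print("8889 free:", port_is_free(8889))
```

If this prints False, close whichever tool is still listening on 8889 before running mitmdump.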
There is a problem: the browser shows an empty response instead of a certificate prompt, and mitmdump reports that the connection was killed by block_global. Why? This is a self-protection measure in mitmdump. It blocks connections coming from public (global) network addresses, while clients on a local LAN are let through. There are two solutions. The first is to set the emulator's network connection to bridge mode, after which the problem does not occur. The second is to add an option when starting mitmdump: mitmdump -p 8889 --set block_global=false, as shown in the figure below
Next, install the mitmproxy certificate: open the browser on the emulator, go to mitm.it, and install the certificate from there
Now web pages open without a certificate warning. At this point the certificates of the two packet capture tools we need are installed. Note that we may need to reinstall the certificates each time we test, because our host IP may change
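To get a feeling for which addresses count as "global", Python's standard ipaddress module offers a similar public/private classification. This is only an illustration of the distinction; it is not claimed to be the exact check mitmdump performs, and is_global_address is a name chosen for this sketch.

```python
import ipaddress


def is_global_address(address: str) -> bool:
    """Return True if the address is publicly routable."""
    return ipaddress.ip_address(address).is_global


if __name__ == "__main__":
    # Two private LAN addresses, loopback, and the example IP used earlier.
    for addr in ("192.168.1.5", "10.0.2.15", "127.0.0.1", "117.90.211.134"):
        print(addr, is_global_address(addr))
```

Private ranges such as 192.168.x.x and 10.x.x.x are not global, which matches the observation that a client on the local LAN is not blocked.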
App analysis
First, confirm that the JustTrustMe module of the Xposed framework is enabled in the emulator; otherwise the Douyin app cannot access the network normally.
Note: the following steps only work in network bridge mode (or with a real phone on the same network as the computer). Otherwise Fiddler cannot capture all the packets. We start the analysis with Fiddler: open Fiddler, then open Douyin. This project crawls the follower data of the user cxk, so we first go to his Douyin home page and open the follower list.
There are many fans. Let's slide down to see what data fiddler will grab when sliding down to refresh more fans
You will find that every time you slide down, a request appears whose URL contains aweme/v1/user/follower/list/. We suspect that this is the follower data interface. Paste the data returned by this URL into json.cn to inspect it
It's really the fan data. It will be found that 20 fans are updated each time, so we've found the right direction. Let's analyze this request
GET https://aweme-hl.snssdk.com/aweme/v1/user/follower/list/?user_id=103313639528&sec_user_id=MS4wLjABAAAAxj2Cuu75g3I2pGOs7jtw5XN6WMiCKbA-jfIjlONRRvM&max_time=1570336550&count=20&offset=0&source_type=1&address_book_access=1&gps_access=1&openudid=3ca06768d1f58615&version_name=8.1.1&ts=1570336895&device_type=OPPO%20R11&ssmix=a&iid=87664447665&app_type=normal&os_api=19&mcc_mnc=46007&device_id=68799320259&resolution=720*1280&device_brand=OPPO&aid=1128&manifest_version_code=811&app_name=aweme&_rticket=1570336895512&os_version=4.4.2&device_platform=android&version_code=811&update_version_code=8112&ac=wifi&dpi=240&uuid=866174010601603&language=zh&channel=tengxun_new HTTP/1.1
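A request URL this long is easier to read once it is split into its parameters. The sketch below uses only the standard library on a shortened copy of the captured URL (most device parameters are left out for brevity); query_params is a helper name made up for this example.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the captured request URL (most device parameters omitted).
url = ("https://aweme-hl.snssdk.com/aweme/v1/user/follower/list/"
       "?user_id=103313639528&max_time=1570336550&count=20&offset=0"
       "&device_type=OPPO%20R11&version_name=8.1.1&aid=1128")


def query_params(request_url: str) -> dict:
    """Return the query string as a flat {name: value} dict."""
    parts = urlsplit(request_url)
    return {name: values[0] for name, values in parse_qs(parts.query).items()}


params = query_params(url)
print(urlsplit(url).path)
for name, value in params.items():
    print(f"{name} = {value}")
```

The count=20 parameter lines up with the 20 followers returned per request.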
The URL of this request has a great many parameters. This is Douyin's own encryption scheme, and cracking it is very hard, so we cannot simply construct the request with requests to get the data. How do we get the data, then? We use mitmdump. The biggest advantage of mitmdump is that it can interact with Python files: we write the handling logic in Python and let mitmdump capture the packets. For now we trigger the requests by sliding manually and parse the data in a script. The code of douyin_mitmdump.py is as follows:
import json

# The function name must be written exactly like this; it is a mitmdump rule
def response(flow):
    # The URL below was found with Fiddler, but we cannot decrypt some of the
    # parameters, so we capture the packets with mitmdump and parse them here
    if 'aweme-hl.snssdk.com/aweme/v1/user/follower/list' in flow.request.url:
        for user in json.loads(flow.response.text)['followers']:
            user_info = {}
            user_info['nickname'] = user['nickname']
            user_info['share_id'] = user['uid']
            user_info['douyin_id'] = user['short_id']
            # Some users have changed their Douyin ID; for them short_id is '0'
            if user_info['douyin_id'] == '0':
                user_info['douyin_id'] = user['unique_id']
            print(user_info)
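The field mapping in this script can be tried offline, without mitmdump or the app running, by moving it into a plain function. This is a sketch only: extract_followers is a hypothetical helper, and the sample payload is made up for illustration and is not real captured data.

```python
import json


def extract_followers(body: str) -> list:
    """Apply the same field mapping as douyin_mitmdump.py to a JSON string."""
    result = []
    for user in json.loads(body).get("followers", []):
        douyin_id = user["short_id"]
        # short_id is '0' when the user has set a custom Douyin ID.
        if douyin_id == "0":
            douyin_id = user["unique_id"]
        result.append({
            "nickname": user["nickname"],
            "share_id": user["uid"],
            "douyin_id": douyin_id,
        })
    return result


# Made-up sample payload for illustration only; not real captured data.
sample = json.dumps({"followers": [
    {"nickname": "user_a", "uid": "1001", "short_id": "12345", "unique_id": ""},
    {"nickname": "user_b", "uid": "1002", "short_id": "0", "unique_id": "custom_b"},
]})

for info in extract_followers(sample):
    print(info)
```

Inside the mitmdump hook, the same function could be called with flow.response.text as its argument.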
Now open cmd and switch to the project directory and execute the command mitmdump -p 8889 -s douyin_mitmdump.py
Next, manually slide the interface to see if the data will be parsed
OK, now we have successfully analyzed the fan data, but we can't slide the mouse all the time, can we? So we now need to use Appium for automated testing to simulate sliding
Simulating swipes with Appium automated testing
Configuration information
Appium is an open source test automation framework that can be used for native, hybrid and mobile Web application testing. It uses the WebDriver protocol to drive iOS, Android and Windows applications. In this article, we use Appium for every operation from launching the app to simulating swipes. First, we need to install Appium on the computer
This is equivalent to the appium server. We need to open the server on the computer before executing the automation test. Then we use the program to connect the virtual machine or the real machine to execute the script for the automation test. Click start server to open the server
First, we use appium's own test program to try how to operate. Click the magnifying glass symbol in the upper right corner to enter the configuration options interface and start filling in the option information
Let me explain how each parameter is obtained.
1) platformName is the platform name. No further explanation is needed; fill in Android.
2) platformVersion is the platform version. Fill it in according to your own phone; this article uses Android 4.4.2.
3) deviceName is the name of the device. How do we get it? Here we use the adb tool from the Android SDK. adb is a tool for connecting the computer to the phone. On the phone, go to the developer options and turn on USB debugging, then open a command line and run adb devices to see whether there is any output. The returned 127.0.0.1:52001 is the device name. That is the name of the emulator; a real phone will show something different. (If nothing is returned, turn developer mode off and re-enable USB debugging a few times.)
4) appPackage and appActivity are very important parameters: they specify which app our automated test drives. They take a little more effort to obtain, so here is a detailed explanation. First open the app on the phone (Douyin in this article), then run adb shell on the computer's command line to enter the interactive shell, and enter the command dumpsys activity | grep mFocusedActivity. In the output, the first part is the package name and the second is the activity name. Write them down for later (the activity name must be prefixed with the package name). That gives the package name com.ss.android.ugc.aweme and the activity name com.ss.android.ugc.aweme.main.MainActivity.
5) noReset, unicodeKeyboard and resetKeyboard will be explained shortly. Then click the button at the bottom right to save the configuration and start the session. If the phone automatically opens Douyin after you click start session, the configuration is correct. This configuration will be used in our python script later, so it must be filled in correctly
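The two adb lookups above can be turned into values for the configuration with a little text parsing. The sketch below works on sample output strings; the exact output shape (a tab between serial and state, and an ActivityRecord{...} line) is an assumption based on typical adb output for this Android version, and both function names are made up for this example.

```python
import re


def parse_adb_devices(output: str) -> list:
    """Return the serials that `adb devices` lists in the 'device' state."""
    serials = []
    for line in output.splitlines():
        fields = line.split()
        # Device lines look like "<serial>\tdevice"; header lines do not match.
        if len(fields) == 2 and fields[1] == "device":
            serials.append(fields[0])
    return serials


def parse_focused_activity(line: str):
    """Return (package, activity) from an mFocusedActivity line, or None."""
    match = re.search(r"([\w.]+)/([\w.$]+)", line)
    if match is None:
        return None
    package, activity = match.groups()
    # adb abbreviates the activity as ".main.MainActivity"; Appium wants
    # it prefixed with the package name.
    if activity.startswith("."):
        activity = package + activity
    return package, activity


# Assumed sample outputs, shaped like typical adb output.
devices_out = "List of devices attached\n127.0.0.1:52001\tdevice\n\n"
focus_line = ("  mFocusedActivity: ActivityRecord{4258a5b8 u0 "
              "com.ss.android.ugc.aweme/.main.MainActivity t12}")

print(parse_adb_devices(devices_out))
print(parse_focused_activity(focus_line))
```

The first result supplies deviceName, and the tuple supplies appPackage and appActivity.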
Content analysis
Automating an app is actually similar to writing a web crawler. First we work out which steps have to be performed. Opening Douyin no longer needs to be done by us, since Appium launches it. After that we need to click the magnifying glass button in the upper left corner, click the search box, enter the Douyin ID, click search, click the user tab, open the user's home page, click the follower count, and swipe up.
How do we locate buttons and enter text? Again we use a tool from the Android SDK, this time Android SDK\tools\monitor.bat. Some readers may ask why we do not use uiautomatorviewer, or the modified version of it that can show xpath. The reason is that during testing I found that some Douyin pages are dynamic. uiautomatorviewer cannot get data from dynamic pages, while monitor can for some of them, so we use monitor.
First, click the blue circle on the left to capture the current phone screen, then click the control we need, and its information appears on the right. We can find a specific control through its resource ID. Now let's write a douyin_appium.py file to test whether it can automatically open Douyin and click the magnifying glass button in the upper left corner.
from appium import webdriver
# WebDriverWait adds a timed wait; some controls take a while to appear
from selenium.webdriver.support.ui import WebDriverWait
import time

# Configuration information
option = {
    "platformName": "Android",
    "platformVersion": "4.4.2",
    "deviceName": "127.0.0.1:52001",
    # Package name of the app under test
    "appPackage": "com.ss.android.ugc.aweme",
    # Activity of the app under test
    "appActivity": "com.ss.android.ugc.aweme.main.MainActivity",
    # Do not reset the app between runs, so nothing has to be reinstalled
    "noReset": True,
    # Unicode keyboard, so that we can enter Chinese
    "unicodeKeyboard": True,
    # Restore the original input method after the run
    "resetKeyboard": True
}
# 4723 is the port number the appium server listens on
driver = webdriver.Remote("http://localhost:4723/wd/hub", option)

# Magnifying glass button
try:
    # Use the resource ID to find the button
    if WebDriverWait(driver, 5).until(lambda x: x.find_element_by_id('com.ss.android.ugc.aweme:id/b3o')):
        # Click the button
        driver.find_element_by_id('com.ss.android.ugc.aweme:id/b3o').click()
except Exception:
    # The button did not appear in time; carry on
    pass
Before running, first open the appium server, that is, start server, and then run the python file
OK, we found that we can automatically click the magnifying glass button. Then we just need to continue writing files to complete the automation operation. The code is as follows
from appium import webdriver
# WebDriverWait adds a timed wait; some controls take a while to appear
from selenium.webdriver.support.ui import WebDriverWait
import time

# Configuration information
option = {
    "platformName": "Android",
    "platformVersion": "4.4.2",
    "deviceName": "127.0.0.1:52001",
    # Package name of the app under test
    "appPackage": "com.ss.android.ugc.aweme",
    # Activity of the app under test
    "appActivity": "com.ss.android.ugc.aweme.main.MainActivity",
    # Do not reset the app between runs, so nothing has to be reinstalled
    "noReset": True,
    # Unicode keyboard, so that we can enter Chinese
    "unicodeKeyboard": True,
    # Restore the original input method after the run
    "resetKeyboard": True
}
# 4723 is the port number the appium server listens on
driver = webdriver.Remote("http://localhost:4723/wd/hub", option)

# Magnifying glass button
try:
    # Use the resource ID to find the button
    if WebDriverWait(driver, 5).until(lambda x: x.find_element_by_id('com.ss.android.ugc.aweme:id/b3o')):
        # Click the button
        driver.find_element_by_id('com.ss.android.ugc.aweme:id/b3o').click()
except Exception:
    pass

# Get the window size
def get_size():
    x = driver.get_window_size()['width']
    y = driver.get_window_size()['height']
    return (x, y)

# Search box
try:
    # Locate the search box
    if WebDriverWait(driver, 3).until(lambda x: x.find_element_by_id('com.ss.android.ugc.aweme:id/ad1')):
        # Click the search box
        driver.find_element_by_id('com.ss.android.ugc.aweme:id/ad1').click()
        # Enter the Douyin ID and search
        driver.find_element_by_id('com.ss.android.ugc.aweme:id/ad1').send_keys("1307311292")
        driver.find_element_by_id('com.ss.android.ugc.aweme:id/dy8').click()
        # Click the user tab. Note how this is written: the resource ID and
        # xpath of this control cannot be obtained, so it can only be found
        # through its text, which must match the label shown in the app
        driver.find_element_by_android_uiautomator("text(\"user\")").click()
except Exception:
    pass

# Open the user's page
try:
    if WebDriverWait(driver, 5).until(lambda x: x.find_element_by_id('com.ss.android.ugc.aweme:id/bck')):
        # Enter the user information screen
        driver.find_element_by_id('com.ss.android.ugc.aweme:id/bck').click()
        time.sleep(2)
        # Click the follower count
        driver.find_element_by_id('com.ss.android.ugc.aweme:id/akf').click()
except Exception:
    pass

# Get the screen size
size = get_size()
# Define the swipe
x1 = int(size[0] * 0.5)
x2 = int(size[0] * 0.7)
y1 = int(size[1] * 0.9)
y2 = int(size[1] * 0.2)
while True:
    time.sleep(0.5)
    # Simulate a swipe
    driver.swipe(x1, y1, x2, y2)
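The script above swipes forever and has to be stopped by hand. One possible variation is to compute the swipe points in a small function and swipe a fixed number of times. This is a sketch under the assumption that the driver object offers get_window_size() and swipe() as used above; swipe_points and swipe_n_times are names made up for this example.

```python
import time


def swipe_points(width: int, height: int):
    """Return (x1, y1, x2, y2) using the same ratios as douyin_appium.py."""
    x1 = int(width * 0.5)
    x2 = int(width * 0.7)
    y1 = int(height * 0.9)  # start near the bottom of the screen
    y2 = int(height * 0.2)  # end near the top, so the list scrolls down
    return x1, y1, x2, y2


def swipe_n_times(driver, count: int, pause: float = 0.5) -> None:
    """Swipe a fixed number of times instead of looping forever."""
    size = driver.get_window_size()
    x1, y1, x2, y2 = swipe_points(size["width"], size["height"])
    for _ in range(count):
        time.sleep(pause)
        driver.swipe(x1, y1, x2, y2)
```

Since each request returns 20 followers, count can be chosen roughly as the number of followers wanted divided by 20, for example swipe_n_times(driver, 50).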
Disclaimer: this content is only for learning and communication. If it infringes the rights and interests of your company, contact the author to delete it