Python Notes: Timed, Automatic Submission of Wenjuanxing (Questionnaire Star) Questionnaires with Python 3

The general steps are as follows:

  1. Use Fiddler to capture packets and analyze the data sent when Submit is clicked (the key step);
  2. Crawl the IP addresses published by a free proxy IP website (e.g. the Xici proxy site) and build an IP address pool;
  3. Use the UserAgent class from the fake_useragent library to get a random User-Agent.
    (Points 2 and 3 both serve to construct HTTP headers that cope with the website's anti-crawler mechanism. In practice, however, posting too frequently still raises errors such as the remote host actively closing the connection.)

Small details:

Because the POST address uses HTTPS, calling requests.post directly to pass the parameters raises an error (probably a certificate-related problem, which we will not go into here). The following approach works around it:

import requests
requests.packages.urllib3.disable_warnings()  # Without this line a warning is printed, though it does not seem to affect script execution
r = requests.post(url, headers=headers, data=data, verify=False)  # verify is True by default, so it is set to False manually

Using Fiddler to capture and analyze the packets sent when Submit is clicked

Think of the submission as sending a letter: the browser is the sender, the server is the recipient, and the URL is the address. The letter itself carries basic information plus the actual content (the header and the body).

Record the destination URL to which the browser sends its parameters.

Look at what information the header contains and collect it to flesh out the request header we build ourselves.
The capture shows that the parameters are passed with the POST method, and it also shows the URL, the request headers, and the (encoded) submitdata content.
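To make the letter analogy concrete, here is a minimal sketch that builds a POST request object without sending it and prints its parts. The URL, header value, and body are placeholders (assumptions), not values captured with Fiddler; substitute your own.

```python
import urllib.request

# Placeholder values (assumptions); substitute what you captured yourself.
url = "https://www.wjx.cn/joinnew/processjq.ashx?curid=XXXXX"
headers = {"Referer": "https://www.wjx.cn/m/XXXXX.aspx"}
body = "submitdata=1$1}2$3".encode("utf-8")

# Build the request object only; nothing is sent over the network.
req = urllib.request.Request(url, data=body, headers=headers, method="POST")

print(req.get_method())    # the HTTP method
print(req.full_url)        # the "address" on the envelope
print(req.header_items())  # the header: basic information
print(req.data)            # the body: the content of the letter
```

Comparing these fields with what Fiddler shows for a real submission is a quick way to spot a missing header or a mistyped parameter.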

Look at the decoded parameters and analyze submitdata, which appears to consist of each question number followed by the serial number(s) of the selected option(s).
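Judging from the captured string, each answer looks like `question$option`, answers are joined with `}`, and multiple selected options are joined with `|`. The sketch below assembles such a string under that assumption; `build_submitdata` is a hypothetical helper, not part of the original script, and the format is inferred from the capture rather than documented.

```python
def build_submitdata(answers):
    """Join {question number: [option numbers]} into a submitdata string.

    Assumed format (inferred from the capture): question$opt|opt}question$opt
    """
    parts = []
    for question in sorted(answers):
        options = "|".join(str(o) for o in answers[question])
        parts.append(f"{question}${options}")
    return "}".join(parts)


answers = {1: [1], 2: [3], 28: [2, 10, 13, 19]}
data = "submitdata=" + build_submitdata(answers)
print(data)  # submitdata=1$1}2$3}28$2|10|13|19
```

Keeping the answers in a dictionary makes it easier to change a single question than editing one long string by hand.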

(If the questionnaire contains open-ended questions, that is, if the submitted parameters include Chinese text, the data must be encoded before submission.)

data = 'submit=1$2}2$3}3$python Great law.'.encode("utf-8").decode("latin1")
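Why this round trip helps, as far as I understand it: when a request body is passed as a str, Python's http.client encodes it as ISO-8859-1 (Latin-1), which cannot represent Chinese characters. Encoding to UTF-8 and decoding as Latin-1 produces a str that turns back into exactly the UTF-8 bytes on the wire. The sample text below is a hypothetical answer, used only to check that property.

```python
text = "submitdata=1$2}2$3}3$你好"  # hypothetical answer containing Chinese

# Re-label the UTF-8 bytes as Latin-1 characters, one character per byte.
wire = text.encode("utf-8").decode("latin1")

# Encoding the result as Latin-1 gives back exactly the UTF-8 bytes.
assert wire.encode("latin1") == text.encode("utf-8")

# The original text cannot be encoded as Latin-1 directly.
try:
    text.encode("latin1")
    direct_ok = True
except UnicodeEncodeError:
    direct_ok = False
print(direct_ok)
```

An alternative that avoids the trick is to pass bytes directly, e.g. data=text.encode("utf-8"), since requests also accepts a bytes body.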

Finally, look at what the server returns when a submission succeeds (tested personally: a successful submission returns 10, otherwise 22).
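A small sketch of how the returned text could be checked. `is_success` is a hypothetical helper, and the sample bodies are made up for illustration; only the leading codes 10 and 22 come from the test described above.

```python
def is_success(body):
    """Return True when the response text starts with the success code '10'."""
    return body[:2] == "10"


# Made-up sample responses for illustration.
for sample in ["10", "22", "<html>blocked</html>"]:
    print(repr(sample), is_success(sample))
```

Comparing the first two characters as text, rather than converting them with int(), means an unexpected response such as an HTML error page is simply treated as a failure instead of raising ValueError.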

Once the data has been collected and analyzed, coding can begin.

  1. Crawl the free proxy IP website, use a regular expression to extract the free proxy IPs it publishes to fill our IP address pool, and then call random.choice(address list) to pick a random IP from the pool.
def Get_IP():
    headers = {     # A simple request header for visiting the Xici proxy website
        'User-Agent': UserAgent().random
    }
    html = urllib.request.Request(url='https://www.xicidaili.com/nn/', headers=headers)
    html = urllib.request.urlopen(html).read().decode('utf-8')
    reg = r'<td>(.+?)</td>'  # Inspecting the page with the browser's F12 tools shows that every field sits in a td tag, in the order IP address, port, protocol, location, time.
    reg = re.compile(reg)
    # The regular expression returns all matched fields, in order, in a single list; this is not yet the final result.
    pools = re.findall(reg, html)[0:499:5]  # Take every fifth element, i.e. the IP addresses, to form the address pool.
    Random_IP = random.choice(pools)  # Randomly pick one IP address from the pool
    return Random_IP
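The slicing step in Get_IP can be tried offline. The sketch below runs the same regular expression on a made-up HTML fragment (the addresses and fields are invented, and the five-cells-per-row layout is the assumption stated in the comment above) and then takes every fifth match.

```python
import re

# Made-up table rows: five <td> cells per row, IP address first.
sample_html = """
<tr>
  <td>10.0.0.1</td>
  <td>8080</td>
  <td>HTTPS</td>
  <td>Somewhere</td>
  <td>19-04-21 18:00</td>
</tr>
<tr>
  <td>10.0.0.2</td>
  <td>3128</td>
  <td>HTTP</td>
  <td>Elsewhere</td>
  <td>19-04-21 18:05</td>
</tr>
"""

cells = re.findall(r"<td>(.+?)</td>", sample_html)  # all cells, in page order
pool = cells[0::5]                                   # every fifth cell is an IP
print(pool)  # ['10.0.0.1', '10.0.0.2']
```

The stride of 5 only works while each row contributes exactly five matching cells; if the page layout changes, the slice silently picks up the wrong fields, so it is worth printing the pool once before relying on it.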
  2. Build the request headers (you need to capture your own packets to collect the data, such as the Cookie, for your own request headers).
    If submission still does not succeed, capture the packets again and replace the cookie.
def Get_Headers():
    headers = {  
        'Host':'www.wjx.cn',
        'User-Agent': UserAgent().random,  # Random User-Agent; requires the UserAgent class from the fake_useragent library
        'Content-Type':'application/x-www-form-urlencoded; charset=UTF-8',  # Submit the data as a form
        'Referer':'https://www.wjx.cn/m/XXXXX.aspx',  # Link to your questionnaire
        'Cookie':'XXXXX',  # Obtained by packet capture
        'X-Forwarded-For':Get_IP()  # Call the function to get a proxy IP address
    }
    return headers
  3. Build the function that submits the parameters
def Auto_WjX():
    url = 'objective url'
    #data is the submission parameter.
    #If you include Chinese parameters, you need to specify the encoding, for example: data = submit = 1 $2} 2 $3} 3 $python method is good'. encode("utf-8").decode("latin1")
    data = "submitdata=1$1}2$3}3$1}4$2}5$1}6$2}7$2}8$1}9$1}10$1}11$1}12$1}13$1}14$1}15$1}16$1}17$1}18$1}19$1}20$1}21$1}22$1}23$1}24$1}25$2}26$3}27$3}28$2|10|13|19}29$4|10}30$3|7}31$2}32$3}33$4}34$1}35$1}36$1}37$2}38$2}39$2}40$2}41$1}42$2}43$1}44$2}45$1}46$1}47$4}48$4}49$4}50$4}51$3}52$3}53$1}54$1}55$1}56$3}57$3}58$3}59$1}60$3}61$3"
    r = requests.post(url, headers=Get_Headers(), data=data, verify=False)
    # The key value ('10' or '22') that indicates success or failure sits at the start of the returned text, so only the first two characters need to be extracted.
    result = r.text[0:2]
    return result
  4. All that remains is to write your own main function (not covered in detail here; just set a somewhat longer sleep interval)
def main():
    global PostNum
    for i in range(10):  # Loop 10 times, calling Auto_WjX each time (in testing, 5 of 10 submissions succeeded, a 50% success rate)
        result = Auto_WjX()
        if result == '10':  # Compare as text so that an unexpected response does not raise ValueError in int()
            print('[ Response : %s ]  ===> Successful submission!!!' % result)
            PostNum += 1
        else:
            print('[ Response : %s ]  ===> Submission failure!!!!' % result)
        time.sleep(30)  # Sleep between submissions; make the interval long enough.
    print('Script finished: %s questionnaires submitted successfully' % PostNum)  # Print the total number of successful submissions

if __name__ == '__main__':
    main()
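The counting loop can also be written without a global variable, which makes it testable without network access. In this sketch `run_batch` is a hypothetical helper and the lambda is a stand-in for Auto_WjX that returns canned results; neither is part of the original script.

```python
import time


def run_batch(submit, attempts, delay_seconds):
    """Call submit() `attempts` times and return how many calls returned '10'."""
    successes = 0
    for i in range(attempts):
        if submit() == "10":
            successes += 1
        if i < attempts - 1:  # no need to sleep after the last attempt
            time.sleep(delay_seconds)
    return successes


# Stand-in for Auto_WjX so the sketch runs offline.
fake_results = iter(["10", "22", "10"])
count = run_batch(lambda: next(fake_results), attempts=3, delay_seconds=0)
print(count)  # 2
```

With the real function the call would look like run_batch(Auto_WjX, attempts=10, delay_seconds=30), matching the loop above.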

The complete code:

import requests
import urllib.request
from fake_useragent import UserAgent
import re
import random
import time

requests.packages.urllib3.disable_warnings()
PostNum = 0

def Get_Headers():
    headers = {  
        'Host':'www.wjx.cn',
        'User-Agent': UserAgent().random,
        'Content-Type':'application/x-www-form-urlencoded; charset=UTF-8',
        'Referer':'https://www.wjx.cn/m/38072076.aspx',
        'Cookie':'UM_distinctid=169ced4487c381-0eb21ff10e540a-784a5037-144000-169ced4487e128f; CNZZDATA4478442=cnzz_eid%3D198455657-1553952914-%26ntime%3D1555839771; .ASPXANONYMOUS=b8bj7o8d1QEkAAAAZjU3NjRkOWEtYzZjNC00ZDg4LTkxZmQtODdkMWZmZmYzM2EyPkFcM46KG2F_Bo62rCi-B5EyW9M1; acw_tc=2f624a1d15539532106003838e1bce9d7d440f74597e79e5b0c885288baa35; jac38072076=19904652; Hm_lvt_21be24c80829bd7a683b2c536fcf520b=1553953297,1553953308,1553953311,1555841361; Hm_lpvt_21be24c80829bd7a683b2c536fcf520b=1555841361',
        'X-Forwarded-For':Get_IP()
    }
    return headers

def Get_IP():
    headers = {
        'User-Agent': UserAgent().random
    }
    html = urllib.request.Request(url='https://www.xicidaili.com/nn/', headers=headers)
    html = urllib.request.urlopen(html).read().decode('utf-8')
    reg = r'<td>(.+?)</td>'
    reg = re.compile(reg)
    pools = re.findall(reg, html)[0:499:5]
    Random_IP = random.choice(pools)
    return Random_IP

def Auto_WjX():
    url = 'https://www.wjx.cn/joinnew/processjq.ashx?curid=38072076&starttime=2019%2F4%2F21%2018%3A09%3A16&source=directphone&submittype=1&ktimes=482&hlv=1&rn=3661365232.19904652&t=1555841474155&jqnonce=b3931762-39e9-4136-a3a8-087e62f3497d&jqsign=%601%3B13540%2F1%3Bg%3B%2F6314%2Fc1c%3A%2F2%3A5g40d16%3B5f'
    data = "submitdata=1$1}2$3}3$1}4$2}5$1}6$2}7$2}8$1}9$1}10$1}11$1}12$1}13$1}14$1}15$1}16$1}17$1}18$1}19$1}20$1}21$1}22$1}23$1}24$1}25$2}26$3}27$3}28$2|10|13|19}29$4|10}30$3|7}31$2}32$3}33$4}34$1}35$1}36$1}37$2}38$2}39$2}40$2}41$1}42$2}43$1}44$2}45$1}46$1}47$4}48$4}49$4}50$4}51$3}52$3}53$1}54$1}55$1}56$3}57$3}58$3}59$1}60$3}61$3"
    r = requests.post(url, headers=Get_Headers(), data=data, verify=False)
    result = r.text[0:2]
    return result

def main():
    global PostNum
    for i in range(10):
        result = Auto_WjX()
        if result == '10':  # Compare as text so that an unexpected response does not raise ValueError in int()
            print('[ Response : %s ]  ===> Successful submission!!!' % result)
            PostNum += 1
        else:
            print('[ Response : %s ]  ===> Submission failure!!!!' % result)
        time.sleep(30)  # Sleep between submissions; make the interval long enough.
    print('Script finished: %s questionnaires submitted successfully' % PostNum)  # Print the total number of successful submissions

if __name__ == '__main__':
    main()

For research and study only; do not abuse it.

Keywords: Python JSON encoding

Added by lordgreg on Sun, 12 May 2019 09:36:57 +0300