python implementation of huluxian brush comment script

As we all know (no), I mentioned before that I have a huluxian account. Huluxian is also a platform that has been with me for a long time. From the earliest search for cracked games to various miscellaneous technology sharing, entertainment and making friends, it can be barely completed. And I managed to get a lot of contribution value after mixing for so long. By the way, there is a certificate of merit (as for this, I didn't expect), more or less emotional. But in the twinkling of an eye, there are less than 10000 comments. It's more or less a shame to be an old user, but I don't have much time to comment next to each other. Suddenly, I thought I saw so many robot comments before. I just wrote one in python and started working (water article)

Show the wave chart at the beginning

First of all, to realize the automatic comment of scripts, we first understand what links and data are used in the process of a comment through the package capturing software

It is found here that a comment generates a total of three pieces of data

After judgment, the post data in the middle is parsed into the comment data just submitted

Next, let's analyze this packet in detail. First, we see the header information

Briefly, from the url parameters in the post packet, you can clearly see such as key and device_code and other words, it can be inferred that the url parameters are generated by huluxia App based on the device code and cookie s. There is no need to take time to solve them. After logging in, you can directly capture the package and obtain them. After testing, as long as you don't manually logout, the key and device generated in the first article_ Code sustainable use, let's next look at the body of the post request

Format it

We can see that there are six parameters to analyze next to each other

The first parameter is post_id, which is found to be the id of the comment post after filtering the data (key points, which will be used in the later comments)

The second parameter is comment_id, which is determined to be the comment floor after calculation and test. Count from 0. This parameter has little impact on the packet structure. Skip it

The third parameter is text. From the screenshot of the post data body, we can see that the source data is the comment content after URL coding, which needs to be used to brush the comment later

The fourth parameter, patcha, seems to be that if the comment frequency is too high, it will require the input of verification code (I haven't encountered it at present, it seems that it's because I haven't commented much? Skip it for the time being)

The fifth parameter, images, is not much to say. It is obviously a picture in the comment. You should not encounter it when you brush the comment. Skip it for the moment. You can study it if you need it

The sixth parameter remindUsers is determined to be the user name of the comment reply after testing. It is not very useful to brush the comment. It may be used if the automatic reply function is used. It is reserved for the time being

After the main body is roughly analyzed, we begin to construct python code

First, realize the basic comment function. The module we need to use is the requests module, which is written based on the python 3 structure. Students who use Python 2 remember to change it

The requests module should not be self-contained. Remember to install pip first

First import the requests module

import requests

This is a reference for constructing the header information of headers, because huluxian is often updated, and the specific content shall be subject to the data obtained from packet capturing

headers = {
    'Connection': 'close',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Content-Length': '106',
    'Host': 'floor.huluxia.com',
    'Accept-Encoding': 'gzip',
    'User-Agent': 'okhttp/3.8.1'
}

Next, set up comment links and post data

comment_url = "http://floor.huluxia.com/comment/create/ANDROID/2.0?platform=2&gkey=000000&app_version=4.1.1.4&versioncode=324&market_id=tool_web&_key=075B357C3926F30BCC22DF6DDF35BF834AEDEC0CEF7DF73857CBC415C5A4B60A3437D077C741C925CB867319372E116A265F603385488BA9&device_code=%5Bd%5Dd113ce76-27bb-469a-a8b5-0d704ea275a7&phone_brand_type=MI"
post_data = "post_id=47723375&comment_id=0&text=%E6%B5%8B%E8%AF%95%5B%E6%BB%91%E7%A8%BD%5D&patcha=&images=&remindUsers="

Finally, use the post method of the requests module to submit the data and get the return value The final code is as follows

import requests
#Set header information
headers = {
    'Connection': 'close',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Content-Length': '106',
    'Host': 'floor.huluxia.com',
    'Accept-Encoding': 'gzip',
    'User-Agent': 'okhttp/3.8.1'
}

#Set reply url and post data
post_data = "post_id=47723375&comment_id=0&text=%E6%B5%8B%E8%AF%95%5B%E6%BB%91%E7%A8%BD%5D&patcha=&images=&remindUsers="
comment_url = "http://floor.huluxia.com/comment/create/ANDROID/2.0?platform=2&gkey=000000&app_version=4.1.1.4&versioncode=324&market_id=tool_web&_key=075B357C3926F30BCC22DF6DDF35BF834AEDEC0CEF7DF73857CBC415C5A4B60A3437D077C741C925CB867319372E116A265F603385488BA9&device_code=%5Bd%5Dd113ce76-27bb-469a-a8b5-0d704ea275a7&phone_brand_type=MI"

#post Submit Comment data
response = requests.post(url=comment_url, data=post_data, headers=headers)
print(response.text)

Run it and return successfully

Comment success

After the basic functions are written, it's almost the same. Just after the implementation, send a comment To achieve the effect of brushing comments, we need to solve the following problems: continuous comments, too fast comment frequency, unable to comment on the same content continuously, etc The idea given here is to poll or loop statements, appropriately adjust the comment interval, random characters or call the robot api So we look forward to adding 100 million details. The code after adding details is like this

import requests
import random
import urllib.parse
import json
import time

# Set header information
headers = {
    'Connection': 'close',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Content-Length': '106',
    'Host': 'floor.huluxia.com',
    'Accept-Encoding': 'gzip',
    'User-Agent': 'okhttp/3.8.1'
}


# Generate random comments
def random_comment():
    post_id = random.randint(40000000, 50000000)
    comment_id = random.randint(1, 10)
    comment_api = "http://api.btstu.cn/yan/api.php?charset=utf-8"
    comment = urllib.parse.quote("%s" % comment_id + "[sleep][teacup]")
    # comment_info = requests.get(comment_api, verify=False)
    # comment = urllib.parse.quote("daily poisonous chicken soup time:" + comment_info.text + "[funny]")
    comment_data = "post_id=%d" % post_id + "&comment_id=0&text=%s" % comment + "&patcha=&images=&remindUsers="
    return comment_data, post_id, urllib.parse.unquote(comment)


def main():
    # Set reply url and post
    post_data = random_comment()
    comment_url = "http://floor.huluxia.com/comment/create/ANDROID/2.0?platform=2&gkey=000000&app_version=4.1.1.4&versioncode=324&market_id=tool_web&_key=6862A43A7126C3C5D7AB0BFE61BAE16FA5852495DFCFB4DCD5D1D34F3B9678F18C11DF30A4EAC58ED693FD5A1E4E7DCA522FA2837AB0F714&device_code=%5Bd%5Dd113ce76-27bb-469a-a8b5-0d704ea275a7&phone_brand_type=MI"
    post_url = "http://floor.huluxia.com/post/detail/ANDROID/4.1?post_id=%d" % post_data[1]

    # Get post information (1)
    post_response = requests.get(post_url)
    post_json_data = json.loads(post_response.text)

    # post Submit Comment data
    print(comment_url)
    print(post_data[0])
    print(headers)
    response = requests.post(url=comment_url, data=post_data[0], headers=headers)
    json_data = json.loads(response.text)
    if json_data['code'] == 104:
        print("----------------------------")
        print("Topic does not exist, skip this ID")
    else:
        # Get post information (2)
        post_info = post_json_data['post']
        post_category = post_info['category']
        # Return status information
        print("----------------------------")
        print("Comment status:" + json_data['msg'])
        print("Comments:" + post_data[2])
        print("Post name:" + post_info['title'])
        print("Section:" + post_category['title'])


for i in range(1, 99999):
    main()
    time.sleep(5)

It can be regarded as version 1.0 of this script However, after the script test a few days ago, I found that many id posts have the situation that the topic has been deleted or the topic does not exist, which greatly affects the efficiency in the process of executing the code Therefore, a script is needed to collect val id post IDs

First of all, we get the subject information of the post by grabbing the package. Finally, we get such a url

http://floor.huluxia.com/post/detail/ANDROID/4.1?post_id=1

This url returns a string of json structure data. Let's format it to see the structure

Let's look at a deleted post link

http://floor.huluxia.com/post/detail/ANDROID/4.1?post_id=112333

Finally, let's look at a non-existent post link

http://floor.huluxia.com/post/detail/ANDROID/4.1?post_id=112333231221321

Observe their features. The features returned by deleted and non-existent posts are easier to be captured by crawlers. We take these two cases as the judgment conditions Start to construct python code, first import the module we want to use and define the url

import requests
import json

url = "http://floor.huluxia.com/post/detail/ANDROID/4.1?post_id=%s"

Then we write a for loop and assign values from 1 to traverse the post id, and use json Loads parses the returned json data

for post_id in range(1, 50000000):
    post_data = requests.get(url % post_id)
    post_json_data = json.loads(post_data.text)

The following features can be clearly seen from the just pictures: If the post is deleted, the returned title is / topic has been deleted/ The post does not exist. One key value pair in the returned json is "code":104 Except for these two states, the rest is normal Thus, you can write an if elif loop to match the string

if '"code":104' in post_data.text:
    print("Posts ID: %d" % post_id + " Status: topic does not exist")
elif "/* Topic deleted */" in post_data.text:
    print("Posts ID: %d" % post_id + " Status: topic deleted")
else:
    print("Posts ID: %d" % post_id + " Status: this topic is normal")

When the "this topic is normal" condition is met, our write function writes the valid id to the text file By the way, another text file is continuously updated with the id of the loop to facilitate the next backtracking of the crawl progress The complete code is as follows:

import requests
import json

url = "http://floor.huluxia.com/post/detail/ANDROID/4.1?post_id=%s"

for post_id in range(29042, 50000000):
    post_data = requests.get(url % post_id)
    post_json_data = json.loads(post_data.text)
    # print(post_json_data)
    if '"code":104' in post_data.text:
        print("Posts ID: %d" % post_id + " Status: topic does not exist")
    elif "/* Topic deleted */" in post_data.text:
        print("Posts ID: %d" % post_id + " Status: topic deleted")
    else:
        print("Posts ID: %d" % post_id + " Status: this topic is normal")
        valid_id = open("valid_id.txt", "a")
        valid_id.write(str(post_id))
        valid_id.write("\n")
        valid_id.close()
        end_id = open("end_id.txt", "w")
        end_id.write(str(post_id))
        end_id.close()

After running the script, the valid ID will be output to the file valid_id.txt

The format is like this

Next, we can use the file generated by this script to automatically obtain the valid id, which improves the efficiency a lot After improvement, the following functions are added: Automatically get comments by reading the contents of the file. You can change the comment configuration file and add comment statements at any time Automatically identify the section of the post and select different comment configuration files (stupid method, write repeatedly, the code is a little miscellaneous, because it's too delicious, I hope someone can help me optimize, whining)

The complete code is as follows:

import requests
import random
import urllib.parse
import json
import time
import linecache

# Set account key link
comment_url = "http://floor.huluxia.com/comment/create/ANDROID/2.0?platform=2&gkey=000000&app_version=4.1.1.4&versioncode=324&market_id=tool_web&_key=6862A43A7126C3C5D7AB0BFE61BAE16FA5852495DFCFB4DCD5D1D34F3B9678F18C11DF30A4EAC58ED693FD5A1E4E7DCA522FA2837AB0F714&device_code=%5Bd%5Dd113ce76-27bb-469a-a8b5-0d704ea275a7&phone_brand_type=MI"

# Set header information
headers = {
    'Connection': 'close',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Content-Length': '106',
    'Host': 'floor.huluxia.com',
    'Accept-Encoding': 'gzip',
    'User-Agent': 'okhttp/3.8.1'
}


# Gets the number of text lines for random comments
def get_lines(filept):  # filept pass file path
    f = open(file=filept, encoding='utf-8')  # Adjust encoding code according to file content
    n = len(f.readlines())
    return n


# Get post ID
def get_post_id():
    lines = get_lines(filept="valid_id.txt")  # Cooperate with post_scan.py to read the collected valid post id
    id_2 = random.randint(1, lines)  # Random number of rows
    post_id = linecache.getline(filename="valid_id.txt", lineno=id_2)  # One post id per line
    return int(post_id)


# Get post information
def get_post_info(post_id):
    post_url = "http://floor.huluxia.com/post/detail/ANDROID/4.1?post_id=%d" % post_id
    post_response = requests.get(post_url)
    post_json_data = json.loads(post_response.text)
    return post_json_data


# Determine the section of the post
def category_decide(post_info):
    post = post_info['post']
    category = post['category']
    return category['title']


# Practical software comments are generated with the main intention of "thank you for sharing and take away"
def comment_syrj(post_id):
    post_id = post_id
    lines = get_lines(filept="comment/syrj.txt")  # Get the number of reply profile lines
    flag = random.randint(1, lines)  # With the next statement, randomly extract the number of lines from the reply configuration file
    comment = linecache.getline(filename="comment/syrj.txt", lineno=flag)
    comment_data = "post_id=%d" % post_id + "&comment_id=0&text=%s" % comment + "&patcha=&images=&remindUsers="  # Comment post data body
    return comment_data, post_id, urllib.parse.unquote(comment)


# Xinfan source comments are generated, mainly comments with the main intention of "very good-looking, thanks for sharing"
def comment_xfy(post_id):
    post_id = post_id
    lines = get_lines(filept="comment/xfy.txt")  # Get the number of reply profile lines
    flag = random.randint(1, lines)  # With the next statement, randomly extract the number of lines from the reply configuration file
    comment = linecache.getline(filename="comment/xfy.txt", lineno=flag)
    comment_data = "post_id=%d" % post_id + "&comment_id=0&text=%s" % comment + "&patcha=&images=&remindUsers="  # Comment post data body
    return comment_data, post_id, urllib.parse.unquote(comment)


# Pool comments are generated, mainly those with water
def comment_yc(post_id):
    post_id = post_id
    lines = get_lines(filept="comment/yc.txt")  # Get the number of reply profile lines
    flag = random.randint(1, lines)  # With the next statement, randomly extract the number of lines from the reply configuration file
    comment = linecache.getline(filename="comment/yc.txt", lineno=flag)
    comment_data = "post_id=%d" % post_id + "&comment_id=0&text=%s" % comment + "&patcha=&images=&remindUsers="  # Comment post data body
    return comment_data, post_id, urllib.parse.unquote(comment)


# Wishing comments are generated, mainly with the main intention of "thank you for sharing, wait for big guys to share"
def comment_xy(post_id):
    post_id = post_id
    lines = get_lines(filept="comment/xy.txt")  # Get the number of reply profile lines
    flag = random.randint(1, lines)  # With the next statement, randomly extract the number of lines from the reply configuration file
    comment = linecache.getline(filename="comment/xy.txt", lineno=flag)
    comment_data = "post_id=%d" % post_id + "&comment_id=0&text=%s" % comment + "&patcha=&images=&remindUsers="  # Comment post data body
    return comment_data, post_id, urllib.parse.unquote(comment)


# Comments on beautiful legs are generated, mainly praising and praising comments
def comment_mt(post_id):
    post_id = post_id
    lines = get_lines(filept="comment/mt.txt")  # Get the number of reply profile lines
    flag = random.randint(1, lines)  # With the next statement, randomly extract the number of lines from the reply configuration file
    comment = linecache.getline(filename="comment/mt.txt", lineno=flag)
    comment_data = "post_id=%d" % post_id + "&comment_id=0&text=%s" % comment + "&patcha=&images=&remindUsers="  # Comment post data body
    return comment_data, post_id, urllib.parse.unquote(comment)


# Game comments are generated, mainly comments with the main intention of "whether to play and position"
def comment_yx(post_id):
    post_id = post_id
    lines = get_lines(filept="comment/yx.txt")  # Get the number of reply profile lines
    flag = random.randint(1, lines)  # With the next statement, randomly extract the number of lines from the reply configuration file
    comment = linecache.getline(filename="comment/yx.txt", lineno=flag)
    comment_data = "post_id=%d" % post_id + "&comment_id=0&text=%s" % comment + "&patcha=&images=&remindUsers="  # Comment post data body
    return comment_data, post_id, urllib.parse.unquote(comment)


# Main function to send comment packet body
def main():
    post_id = get_post_id()
    post_info = get_post_info(post_id)
    # print(str(post_info))
    # print(type(post_info))
    if "'code': 104" in str(post_info):  # Determine whether the post is valid
        print("----------------------------")
        print("Topic does not exist, skip this ID")
    elif "/* Topic deleted */" in str(post_info):
        print("----------------------------")
        print("Topic deleted, skip this ID")
    else:  # If the above two special conditions are not met, the post content is normal. Proceed to the next step to construct the data package

        category = category_decide(post_info)  # Get the section to which the post belongs, and use it to select the reply profile of different sections
        post_data = post_info['post']  # Get post information for subsequent printing and return status

        if category == "Practical software" or category == "Mobile phone beautification" or category == "Game tutorial" or category == "Technology sharing" or category == "Original technology" or category == "Welfare activities" or category == "Third floor College" or category == "Avatar signature":  # Similar sections dominated by practical software sections
            comment_data = comment_syrj(post_id)  # Comment data
            response = requests.post(url=comment_url, data=comment_data[0].encode('utf-8'),
                                     headers=headers)  # Receive comment return packet
            response_json = json.loads(response.text)  # The comment return packet is converted to json/dict format for subsequent reading of key values
            # Return comment status information
            print("----------------------------")
            print("Comment status:" + response_json['msg'])
            print("Comments:" + comment_data[2])
            print("Post name:" + post_data['title'])
            print("Section:" + category)
        elif category == "Xinfanyuan" or category == "Three two shadows":
            comment_data = comment_xfy(post_id)  # Comment data
            response = requests.post(url=comment_url, data=comment_data[0].encode('utf-8'),
                                     headers=headers)  # Receive comment return packet
            response_json = json.loads(response.text)  # The comment return packet is converted to json/dict format for subsequent reading of key values
            # Return comment status information
            print("----------------------------")
            print("Comment status:" + response_json['msg'])
            print("Comments:" + comment_data[2])
            print("Post name:" + post_data['title'])
            print("Section:" + category)
        elif category == "Swimming pool" or category == "Entertainment world" or category == "Spoof" or category == "Model toy" or category == "Playstation":
            comment_data = comment_yc(post_id)  # Comment data
            response = requests.post(url=comment_url, data=comment_data[0].encode('utf-8'),
                                     headers=headers)  # Receive comment return packet
            response_json = json.loads(response.text)  # The comment return packet is converted to json/dict format for subsequent reading of key values
            # Return comment status information
            print("----------------------------")
            print("Comment status:" + response_json['msg'])
            print("Comments:" + comment_data[2])
            print("Post name:" + post_data['title'])
            print("Section:" + category)
        elif category == "Make a wish" or category == "Drawing workshop":
            comment_data = comment_xy(post_id)  # Comment data
            response = requests.post(url=comment_url, data=comment_data[0].encode('utf-8'),
                                     headers=headers)  # Receive comment return packet
            response_json = json.loads(response.text)  # The comment return packet is converted to json/dict format for subsequent reading of key values
            # Return comment status information
            print("----------------------------")
            print("Comment status:" + response_json['msg'])
            print("Comments:" + comment_data[2])
            print("Post name:" + post_data['title'])
            print("Section:" + category)
        elif category == "charming legs" or category == "Dimensional Pavilion" or category == "selfie":
            comment_data = comment_mt(post_id)  # Comment data
            response = requests.post(url=comment_url, data=comment_data[0].encode('utf-8'),
                                     headers=headers)  # Receive comment return packet
            response_json = json.loads(response.text)  # The comment return packet is converted to json/dict format for subsequent reading of key values
            # Return comment status information
            print("----------------------------")
            print("Comment status:" + response_json['msg'])
            print("Comments:" + comment_data[2])
            print("Post name:" + post_data['title'])
            print("Section:" + category)
        elif category == "game" or category == "Steam" or category == "LOL Mobile Games" or category == "Glory of Kings" or category == "DNF Mobile Games" or category == "Game for Peace" or category == "Call of duty mobile game" or category == "Cross Fire" or category == "Original God" or category == "Crazyracing Kartrider" or category == "QQ Flying car" or category == "Big ball battle" or category == "Minecraft" or category == "League of Heroes" or category == "DungeonFighter" or category == "After tomorrow" or category == "Cool running every day" or category == "Xinyou recommendation":
            comment_data = comment_yx(post_id)  # Comment data
            response = requests.post(url=comment_url, data=comment_data[0].encode('utf-8'),
                                     headers=headers)  # Receive comment return packet
            response_json = json.loads(response.text)  # The comment return packet is converted to json/dict format for subsequent reading of key values
            # Return comment status information
            print("----------------------------")
            print("Comment status:" + response_json['msg'])
            print("Comments:" + comment_data[2])
            print("Post name:" + post_data['title'])
            print("Section:" + category)
        else:
            print("----------------------------")
            print("No predefined sections:" + category + ",skip")


for i in range(1000):
    main()
    time.sleep(6)

The specific function descriptions are in the notes, so I won't explain more The comment configuration file is mainly the file name read by the comment parameter and placed in the comment directory under the same directory. The configuration file needs to be stored in utf-8 code, otherwise the script may report an error

I will put all the code in github, and there may be updates in the future. github project link: https://github.com/Y5neKO/huluxia_auto_comment

Added by bonekrusher on Thu, 13 Jan 2022 07:44:28 +0200