Python crawler (15)
These are notes on the experience and key points I collected while learning to write Python crawlers. They are mainly for my own reference, but I also hope they are useful for discussion with others.
Threading practice: continuous output and fast port scanning
1. Continuous output
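We use the subprocess library to run a command against a target and print its output. Here is method 1:

    # coding: utf-8
    import subprocess

    def test1(ip='127.0.0.1'):
        # Return the output all at once, after the subprocess has finished
        # ("-c 4" is the Unix form of ping; Windows ping uses "-n 4")
        p = subprocess.Popen(["ping", "-c", "4", ip],
                             stdout=subprocess.PIPE,
                             stderr=subprocess.STDOUT)
        # communicate() reads all output and waits for exit, avoiding the
        # deadlock that wait() can cause when the pipe buffer fills up
        stdoutdata, _ = p.communicate()
        # Output is bytes in Python 3, so decode it before printing
        print(stdoutdata.decode(errors='replace'))
        return p.returncode

    if __name__ == "__main__":
        test1()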
The first method is the one used in the earlier application examples: it prints the results only after the subprocess has finished.
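For comparison, the same "collect everything, then print" pattern can be sketched with subprocess.run, which waits for the child and returns the captured output in one call. This is only a minimal sketch, not the method used above: the helper name run_and_collect is hypothetical, and the child command is the Python interpreter itself rather than ping so that the example runs on any platform.

```python
import subprocess
import sys

def run_and_collect(args):
    """Run a command, wait for it to finish, and return (returncode, output text)."""
    # stderr is merged into stdout, mirroring stderr=subprocess.STDOUT above
    completed = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,  # decode bytes to str (Python 3.7+)
    )
    return completed.returncode, completed.stdout

if __name__ == "__main__":
    code, output = run_and_collect([sys.executable, "-c", "print('hello')"])
    print(code)
    print(output)
```

Nothing is printed until the child has exited, which is exactly the limitation that method 2 addresses.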
Let's look at method 2:
    import subprocess

    def test2(ip='127.0.0.1'):
        # Read and print the output in real time
        p = subprocess.Popen(["ping", "-c", "4", ip],
                             stdout=subprocess.PIPE,
                             stderr=subprocess.STDOUT)
        returncode = p.poll()
        while returncode is None:
            # readline() blocks until the subprocess writes a full line
            line = p.stdout.readline()
            returncode = p.poll()
            print(line.decode(errors='replace').strip())
        # Print any lines still buffered after the subprocess exited
        for line in p.stdout:
            print(line.decode(errors='replace').strip())
        print(returncode)

    if __name__ == "__main__":
        test2()
The difference between the two methods is that method 2 prints each line as soon as the subprocess produces it, instead of waiting for the subprocess to finish before printing anything. With a short command like the one above the difference is hard to see, but when a command produces a lot of output or runs for a long time, the second method is recommended: you get partial results quickly and can terminate early if something goes wrong.
The second method checks on every loop iteration whether the subprocess is still running. While it is, the loop reads and prints one line, and it continues until the subprocess has finished.
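The real-time pattern can also be wrapped in a generator, so that the caller decides what to do with each line and can stop early. The following is a minimal sketch under a few assumptions: stream_lines is a hypothetical helper name, and the child process is a small Python one-liner instead of ping so that the example is portable.

```python
import subprocess
import sys

def stream_lines(args):
    """Yield each output line of a command as soon as it is available."""
    p = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    try:
        # Iterating over p.stdout blocks until a line arrives and stops at EOF
        for line in p.stdout:
            yield line.rstrip("\n")
    finally:
        # If the consumer stops early, terminate the child before cleaning up
        if p.poll() is None:
            p.terminate()
        p.stdout.close()
        p.wait()

if __name__ == "__main__":
    child = "for i in range(3): print('line', i, flush=True)"
    for line in stream_lines([sys.executable, "-c", child]):
        print(line)
```

Breaking out of the for loop and closing the generator runs the finally block, which is one way to get the "terminate in time if there is a problem" behavior described above.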
2. Fast scan port
This time we combine Python 3's subprocess library with nmap to scan ports quickly. For details on using nmap, see an nmap tutorial.
    import re
    import subprocess
    import threading
    import time

    PORTS = "22,53,445,3306,8080"
    IPS = ['127.0.0.1', '192.168.99.1']
    # Matches nmap result lines such as "22/tcp open ssh"
    OPEN_PORT = re.compile(r'\d+/\w+\s+open\s+\S+')

    # Port scan for a single IP
    def getports(ip):
        print(ip)
        cmd = ["nmap", "-v", "--open", "-T4", "-p", PORTS, ip]
        p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        out, err = p.communicate()
        # communicate() returns bytes in Python 3, so decode before splitting
        for line in out.decode(errors='replace').splitlines():
            r = OPEN_PORT.match(line.strip())
            if r:
                print(r.group())

    # Main process
    def main():
        threads = [threading.Thread(target=getports, args=(ip,)) for ip in IPS]
        n = 0
        while n < len(threads):
            # active_count() includes the main thread, so "< 2" allows one
            # scanning thread at a time; raise the limit to scan in parallel
            if threading.active_count() < 2:
                print('started: ' + str(threads[n]))
                threads[n].start()
                n = n + 1
            print(n)
            time.sleep(5)
        for t in threads:
            t.join()

    if __name__ == "__main__":
        main()
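As an alternative to counting active threads by hand, the standard library's concurrent.futures.ThreadPoolExecutor can cap the number of concurrent scans. The sketch below is an assumption-laden illustration rather than the approach used above: parse_open_ports and scan_all are hypothetical helper names, the scanning function is passed in as a parameter, and the sample text is made up in the shape the regular expression expects instead of being real nmap output.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Same pattern idea as above, with groups for port, protocol and service
OPEN_PORT = re.compile(r"^(\d+)/(\w+)\s+open\s+(\S+)")

def parse_open_ports(nmap_output):
    """Return (port, protocol, service) tuples for lines reporting an open port."""
    results = []
    for line in nmap_output.splitlines():
        m = OPEN_PORT.match(line.strip())
        if m:
            results.append((int(m.group(1)), m.group(2), m.group(3)))
    return results

def scan_all(ips, scan_one, max_workers=4):
    """Call scan_one(ip) for every IP, using at most max_workers threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as ips
        return dict(zip(ips, pool.map(scan_one, ips)))

if __name__ == "__main__":
    # Hypothetical sample text, not real scan output
    sample = "PORT     STATE SERVICE\n22/tcp   open  ssh\n3306/tcp open  mysql\n"
    fake_scan = lambda ip: parse_open_ports(sample)
    print(scan_all(["127.0.0.1", "192.168.99.1"], fake_scan))
```

In a real run, scan_one would call nmap through subprocess for the given IP and pass the decoded output to parse_open_ports; max_workers then plays the role of the thread limit in main().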