Learn a few very useful Python libraries in a minute

Python is often described as "batteries included" because it ships with many useful built-in modules that can be used directly, without any extra installation or configuration.

In addition to the built-in modules, Python also has a large number of third-party modules that can be installed with pip. Below is a brief introduction to several practical Python built-in and third-party libraries.

Built-in libraries

1, datetime

datetime is Python's standard library for processing dates and times.

1. Get current date and time

>>> from datetime import datetime
>>> now = datetime.now()
>>> print(now)
2021-06-14 09:33:10.460192
>>> print(type(now))
<class 'datetime.datetime'>

2. Create a specified date and time

>>> from datetime import datetime
>>> dt = datetime(2021, 6, 10, 12, 0)
>>> print(dt)
2021-06-10 12:00:00

3. datetime to timestamp

In a computer, time is actually represented as a number. The moment 00:00:00 on January 1, 1970, in the UTC+00:00 time zone is called the epoch and is recorded as 0 (timestamps for times before 1970 are negative). A timestamp is the number of seconds elapsed since the epoch.

>>> from datetime import datetime
>>> now = datetime.now()
>>> now
datetime.datetime(2021, 6, 14, 9, 38, 34, 969006)
>>> now.timestamp()  # Convert datetime to timestamp
1623634714.969006

4. timestamp to datetime

>>> from datetime import datetime
>>> timestamp = 1623634714.969006
>>> print(datetime.fromtimestamp(timestamp))
2021-06-14 09:38:34.969006
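Both datetime.now() and fromtimestamp() work in the machine's local time zone unless told otherwise, which is why the sample output above corresponds to a UTC+8 machine. The following is a minimal sketch, assuming you want results that do not depend on the local time zone; it passes timezone.utc explicitly and uses a whole-second timestamp for clarity.

```python
from datetime import datetime, timezone

# The epoch itself maps to timestamp 0 when expressed in UTC.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
print(epoch.timestamp())  # 0.0

# Interpret a timestamp in UTC instead of the local time zone.
ts = 1623634714
utc_dt = datetime.fromtimestamp(ts, tz=timezone.utc)
print(utc_dt)  # 2021-06-14 01:38:34+00:00

# A time-zone-aware datetime converts back to the same timestamp on any machine.
print(utc_dt.timestamp() == ts)  # True
```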

5. str to datetime

>>> from datetime import datetime
>>> day = datetime.strptime('2021-6-10 12:12:12', '%Y-%m-%d %H:%M:%S')
>>> print(day)
2021-06-10 12:12:12

6. datetime to str

>>> from datetime import datetime
>>> now = datetime.now()
>>> print(now)
2021-06-14 09:49:02.281820
>>> print(type(now))
<class 'datetime.datetime'>
>>> str_day = now.strftime('%Y-%m-%d %H:%M:%S')
>>> print(str_day)
2021-06-14 09:49:02
>>> print(type(str_day))
<class 'str'>
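Combining strptime() and strftime() gives a simple way to convert a date string from one layout to another. The sketch below assumes a helper named reformat_date, which is hypothetical and introduced only for illustration.

```python
from datetime import datetime


def reformat_date(text, in_fmt, out_fmt):
    """Parse text using in_fmt, then render the result using out_fmt."""
    return datetime.strptime(text, in_fmt).strftime(out_fmt)


result = reformat_date('2021-6-10 12:12:12', '%Y-%m-%d %H:%M:%S', '%d/%m/%Y %H:%M')
print(result)  # 10/06/2021 12:12
```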

2, collections

collections is Python's built-in module of container types. It provides many useful collection classes, of which the counting class, Counter, is especially practical.

Counter

Counter is a simple counter. For example, it can count how many times each character appears in a string:

>>> from collections import Counter
>>> c = Counter()
>>> s = 'jdkjefwnewnfjqbefbqbefqbferbb28934`83278784727'  # not named str, which would shadow the built-in
>>> c.update(s)
>>> c
Counter({'b': 6, 'e': 5, 'f': 5, '8': 4, '7': 4, 'j': 3, 'q': 3, '2': 3, 'w': 2, 'n': 2, '3': 2, '4': 2, 'd': 1, 'k': 1, 'r': 1, '9': 1, '`': 1})
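Once the counts exist, Counter offers a few conveniences for reading them back. This short sketch reuses the same sample string and shows most_common() together with the fact that a missing key counts as zero; the variable names are chosen for illustration only.

```python
from collections import Counter

text = 'jdkjefwnewnfjqbefbqbefqbferbb28934`83278784727'
counts = Counter(text)  # passing the iterable directly is equivalent to update()

top_three = counts.most_common(3)
print(top_three)             # [('b', 6), ('e', 5), ('f', 5)]
print(counts['z'])           # 0, because a missing key counts as zero
print(sum(counts.values()))  # total number of characters counted
```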

3, base64

Base64 is a method of representing arbitrary binary data using 64 printable characters.

When we open an exe, jpg or pdf file in Notepad, we see a lot of garbled text, because binary files contain many bytes that cannot be displayed or printed. For text-oriented software such as Notepad to handle binary data, we need a way to convert binary data to a string. Base64 is one of the most common binary-to-text encodings.

>>> import base64
>>> base64.b64encode(b'binary\x00string')
b'YmluYXJ5AHN0cmluZw=='
>>> base64.b64decode(b'YmluYXJ5AHN0cmluZw==')
b'binary\x00string'
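Base64 turns every 3 bytes of input into 4 output characters, and pads the result with = when the input length is not a multiple of 3. The sketch below illustrates that padding rule on three short inputs and checks that decoding restores the original bytes.

```python
import base64

# 1, 2 and 3 input bytes all produce 4 output characters; '=' fills the gap.
samples = [b'a', b'ab', b'abc']
encoded = [base64.b64encode(s) for s in samples]
print(encoded)  # [b'YQ==', b'YWI=', b'YWJj']

# Decoding reverses the process exactly.
decoded = [base64.b64decode(e) for e in encoded]
print(decoded == samples)  # True
```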

4, hashlib

Python's hashlib provides common digest algorithms, such as MD5 and SHA1.

What is a digest algorithm? Also known as a hash algorithm, it uses a function to convert data of any length into a fixed-length value (usually represented as a hexadecimal string).

Taking the common digest algorithm MD5 as an example, we calculate the MD5 value of a string:

>>> import hashlib
>>> md5 = hashlib.md5()
>>> md5.update("Programmer Tang Ding".encode('utf-8'))
>>> print(md5.hexdigest())
05eb21a61d2cf0cf84e474d859c4c055
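The "fixed length" property is easy to check. This sketch hashes inputs of very different sizes and shows that the digest length depends only on the algorithm; it also shows that calling update() several times is equivalent to hashing the concatenated data in one go. The sample inputs are arbitrary.

```python
import hashlib

short = b'hello'
long_ = b'hello' * 10000

# The digest length depends on the algorithm, not on the input size.
md5_lengths = (len(hashlib.md5(short).hexdigest()), len(hashlib.md5(long_).hexdigest()))
sha256_length = len(hashlib.sha256(short).hexdigest())
print(md5_lengths, sha256_length)  # (32, 32) 64

# Feeding data piece by piece gives the same digest as hashing it all at once.
h = hashlib.md5()
h.update(b'hel')
h.update(b'lo')
print(h.hexdigest() == hashlib.md5(short).hexdigest())  # True
```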

Where can a digest algorithm be applied? For example:

Any website that allows users to log in has to store each user's name and password, usually in a database table. If passwords are saved in plaintext and the database is leaked, the passwords of all users fall into the hands of attackers. In addition, the operations staff who can access the database can read every user's password. The correct approach is to store not the plaintext password but a digest of it, such as MD5. When the user logs in, first calculate the MD5 of the plaintext password entered, then compare it with the MD5 stored in the database. If they match, the password is correct; if they do not, the password is wrong. Note that MD5 serves here only to illustrate the idea: it is fast to compute and has long been considered too weak for password storage, so real systems use a salted, deliberately slow function such as hashlib.pbkdf2_hmac or hashlib.scrypt.
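The store-a-digest-and-compare flow described above can be sketched with the standard library. This is not the article's MD5 approach: it swaps in hashlib.pbkdf2_hmac with a random salt, a common hardening. The function names and the iteration count are assumptions made for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# At registration, store the salt and the digest, never the plaintext password.
salt, stored = hash_password('123456')
print(verify_password('123456', salt, stored))  # True
print(verify_password('654321', salt, stored))  # False
```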

Third-party libraries

1, requests

requests is a third-party Python library that makes working with URL resources particularly convenient. We got a first look at it in the earlier "Introduction to web crawlers" article.

1. Install requests

If Anaconda is installed, requests is already available. Otherwise, you need to install through pip on the command line:

$ pip install requests

If the installation fails with a Permission denied error, add sudo and try again.

2. Fetching the Douban homepage with GET takes only a few lines of code:

>>> import requests
>>> r = requests.get('https://www.douban.com/')  # Douban homepage
>>> r.status_code
200
>>> r.text
'<!DOCTYPE HTML>\n<html>\n<head>\n<meta name="description" content="Provide recommendations, comments and suggestions on books, films and music records...'

3. For URLs with parameters, pass in a dict as the params parameter:

>>> r = requests.get('https://www.douban.com/search', params={'q': 'python', 'cat': '1001'})
>>> r.url # Actual requested URL
'https://www.douban.com/search?q=python&cat=1001'
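To see what the params dict turns into, the query string can be built by hand with the standard library. This is only a rough approximation of what requests does internally, not its actual implementation; the second example uses a made-up search term to show how special characters are escaped.

```python
from urllib.parse import urlencode

base = 'https://www.douban.com/search'
params = {'q': 'python', 'cat': '1001'}

# Join the base URL and the encoded query string.
url = base + '?' + urlencode(params)
print(url)  # https://www.douban.com/search?q=python&cat=1001

# Spaces and reserved characters are escaped automatically.
escaped = urlencode({'q': 'python 3 & requests'})
print(escaped)  # q=python+3+%26+requests
```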

4. requests automatically detects the encoding, which you can check through the encoding attribute:

>>> r.encoding
'utf-8'

5. Whether the response is text or binary content, we can use the content attribute to obtain the bytes object:

>>> r.content
b'<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n...'

6. Another convenience of requests is that specific response types, such as JSON, can be parsed directly:

>>> r = requests.get('https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20weather.forecast%20where%20woeid%20%3D%202151330&format=json')
>>> r.json()
{'query': {'count': 1, 'created': '2017-11-17T07:14:12Z', ...

7. To send HTTP headers, pass in a dict as the headers parameter:

>>> r = requests.get('https://www.douban.com/', headers={'User-Agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 11_0 like Mac OS X) AppleWebKit'})
>>> r.text
'<!DOCTYPE html>\n<html>\n<head>\n<meta charset="UTF-8">\n <title>Douban(Mobile version)</title>...'

8. To send a POST request, you only need to change the get() method to post(), and then pass in the data parameter as the data of the POST request:

>>> r = requests.post('https://accounts.douban.com/login', data={'form_email': 'abc@example.com', 'form_password': '123456'})

9. requests encodes POST data as application/x-www-form-urlencoded by default. To send JSON data instead, pass it through the json parameter:

params = {'key': 'value'}
r = requests.post(url, json=params) # Internal automatic serialization to JSON
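The difference between the two body formats can be shown with the standard library alone. The sketch below approximates the request bodies that data= and json= would produce for the same dict; it does not use requests itself, and the payload is made up for illustration.

```python
import json
from urllib.parse import urlencode

payload = {'key': 'value', 'n': 1}

# Roughly what data=payload sends: form-encoded text, where every value becomes a string.
form_body = urlencode(payload)
print(form_body)  # key=value&n=1

# Roughly what json=payload sends: a JSON document that keeps the value types.
json_body = json.dumps(payload)
print(json_body)  # {"key": "value", "n": 1}
```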

10. Similarly, uploading files requires a more complex encoding format, but requests simplifies it to the files parameter:

>>> upload_files = {'file': open('report.xls', 'rb')}
>>> r = requests.post(url, files=upload_files)

When reading a file, be sure to use 'rb', that is, binary mode, so that the length of bytes obtained is the length of the file.

Replace the post() method with put(), delete() and so on to request resources in the form of PUT or DELETE.

11. Besides the response content, requests also makes it easy to obtain other information about the HTTP response. For example, to get the response headers:

>>> r.headers
{'Content-Type': 'text/html; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Content-Encoding': 'gzip', ...}
>>> r.headers['Content-Type']
'text/html; charset=utf-8'

12. requests handles cookies specially, so we can read a specific cookie without parsing the Cookie header ourselves:

>>> r.cookies['ts']
'example_cookie_12345'

13. To send cookies with a request, prepare a dict and pass it as the cookies parameter:

>>> cs = {'token': '12345', 'status': 'working'}
>>> r = requests.get(url, cookies=cs)

14. Finally, to specify the timeout, pass in the timeout parameter in seconds:

>>> r = requests.get(url, timeout=2.5) # Timeout after 2.5 seconds

2, chardet

String encoding has always been a headache, especially when dealing with non-standard third-party web pages. Python provides two data types, str (Unicode text) and bytes (raw bytes), and converts between them with the encode() and decode() methods, but it is not easy to decode() bytes without knowing their encoding.

To convert bytes of unknown encoding into str, you first need to "guess" the encoding. The guess works by collecting the characteristic byte patterns of various encodings and judging by them, which gives a high probability of guessing right.
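To make the problem concrete, here is a deliberately naive sketch of guessing: try a fixed list of candidate encodings and keep the first one that decodes without error. This is not how chardet works and is far less reliable; the function name and the candidate list are assumptions chosen for illustration.

```python
def guess_decode(data, candidates=('utf-8', 'gbk', 'latin-1')):
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for name in candidates:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            continue  # this candidate does not fit, try the next one
    raise ValueError('none of the candidate encodings fit')


gbk_result = guess_decode('中文'.encode('gbk'))
utf8_result = guess_decode('中文'.encode('utf-8'))
print(gbk_result)   # ('中文', 'gbk')
print(utf8_result)  # ('中文', 'utf-8')
```

The order of the candidates matters: latin-1 accepts any byte sequence, so it only makes sense as the last resort.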

Of course, writing encoding detection from scratch would be time-consuming and laborious. This is where the third-party library chardet comes in handy: it makes encoding detection simple.

1. Install chardet

If Anaconda is installed, chardet is already available. Otherwise, you need to install through pip on the command line:

$ pip install chardet

If the installation fails with a Permission denied error, add sudo and try again.

2. Once we have a bytes object, we can detect its encoding. With chardet, detection takes a single call:

>>> import chardet
>>> chardet.detect(b'Hello, world!')
{'encoding': 'ascii', 'confidence': 1.0, 'language': ''}

3, psutil

Writing scripts to simplify daily operations work is an important use of Python. Linux has many system commands for monitoring the running state of the system, such as ps, top and free. To get this information, Python can call these commands through the subprocess module and read the results, but doing so is cumbersome, mainly because it requires a lot of parsing code.

A better way to obtain system information in Python is psutil, a third-party module. It implements system monitoring in one or two lines of code and works across platforms, supporting Linux, UNIX, OSX, Windows and others. It is an indispensable module for system administrators and operations engineers.

1. Install psutil

If Anaconda is installed, psutil is already available. Otherwise, you need to install through pip on the command line:

$ pip install psutil

If the installation fails with a Permission denied error, add sudo and try again.

2. Get CPU Information

Let's get the CPU information first:

>>> import psutil
>>> psutil.cpu_count()  # Number of logical CPUs
4
>>> psutil.cpu_count(logical=False)  # Number of physical cores
2
# 2 physical cores with 4 logical CPUs means a dual-core CPU with hyper-threading; 4 physical cores with 4 logical CPUs would mean a quad-core CPU without hyper-threading

3. Statistics of CPU user / system / idle time:

>>> psutil.cpu_times()
scputimes(user=10963.31, nice=0.0, system=5138.67, idle=356102.45)

4. Get memory information

Use psutil to obtain physical memory and swap memory information, respectively:

>>> psutil.virtual_memory()
svmem(total=8589934592, available=2866520064, percent=66.6, used=7201386496, free=216178688, active=3342192640, inactive=2650341376, wired=1208852480)
>>> psutil.swap_memory()
sswap(total=1073741824, used=150732800, free=923009024, percent=14.0, sin=10705981440, sout=40353792)

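The values returned are integers in bytes. The total memory size is 8589934592 bytes = 8 GB, of which 7201386496 bytes = 6.7 GB is used. The percent field, 66.6%, is calculated from the available memory as (total - available) / total, not from the used field.

The swap size is 1073741824 bytes = 1 GB.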
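The figures quoted for the sample output can be reproduced with a little arithmetic. The sketch below copies the numbers from the virtual_memory() result above; it relies on the documented psutil definition of percent as (total - available) / total * 100, and the constant name GIB is introduced here for illustration.

```python
GIB = 1024 ** 3  # bytes per gibibyte

# Values copied from the sample virtual_memory() output.
total = 8589934592
available = 2866520064
used = 7201386496

total_gib = total / GIB
used_gib = round(used / GIB, 1)
percent = round((total - available) / total * 100, 1)

print(total_gib)  # 8.0
print(used_gib)   # 6.7
print(percent)    # 66.6
```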

5. Get disk information

You can obtain disk partition, disk utilization and disk IO information through psutil:

>>> psutil.disk_partitions() # Disk partition information
[sdiskpart(device='/dev/disk1', mountpoint='/', fstype='hfs', opts='rw,local,rootfs,dovolfs,journaled,multilabel')]
>>> psutil.disk_usage('/') # Disk usage
sdiskusage(total=998982549504, used=390880133120, free=607840272384, percent=39.1)
>>> psutil.disk_io_counters() # Disk IO
sdiskio(read_count=988513, write_count=274457, read_bytes=14856830464, write_bytes=17509420032, read_time=2228966, write_time=1618405)
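When only the basic disk usage figures are needed and psutil is not installed, the standard library offers shutil.disk_usage() as a lighter alternative. This sketch is a stand-in for psutil.disk_usage(), not part of psutil; the percentage here is computed simply as used / total, which may differ slightly from the value psutil reports.

```python
import shutil

# Query the filesystem that contains the root directory.
usage = shutil.disk_usage('/')

# Simple utilization figure; psutil may compute its percent field differently.
percent = round(usage.used / usage.total * 100, 1)
print(usage.total, usage.used, usage.free, percent)
```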

That's all for now. Tang Ding will introduce more practical Python libraries one by one in later articles.

Keywords: Python

Added by blyz on Sun, 30 Jan 2022 06:18:08 +0200