1. Confusing operations
This section compares some of Python's confusing operations.
1.1 Random sampling with and without replacement
import random
random.choices(seq, k=1)  # Returns a list of length k, sampled with replacement
random.sample(seq, k)     # Returns a list of length k, sampled without replacement
1.2 Parameters of lambda functions
func = lambda y: x + y       # x is looked up when the function is called
func = lambda y, x=x: x + y  # x is bound when the function is defined
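The difference shows up most clearly when lambdas are created in a loop. The following is a minimal sketch; the names late and early are illustrative and not part of the original text.

```python
# Late binding: each lambda looks up x when it is called,
# so all of them see the final value of x.
late = [lambda y: x + y for x in range(3)]
late_results = [f(10) for f in late]

# Early binding: the default argument captures x at definition time.
early = [lambda y, x=x: x + y for x in range(3)]
early_results = [f(10) for f in early]

print(late_results)   # [12, 12, 12]
print(early_results)  # [10, 11, 12]
```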
1.3 copy and deepcopy
import copy
y = copy.copy(x)      # Copy only the top level
y = copy.deepcopy(x)  # Recursively copy all nested parts
Copying is easy to confuse with aliasing:
import copy
a = [1, 2, [3, 4]]
# Alias.
b_alias = a
assert b_alias == a and b_alias is a
# Shallow copy.
b_shallow_copy = a[:]
assert b_shallow_copy == a and b_shallow_copy is not a and b_shallow_copy[2] is a[2]
# Deep copy.
b_deep_copy = copy.deepcopy(a)
assert b_deep_copy == a and b_deep_copy is not a and b_deep_copy[2] is not a[2]
Modifying an alias modifies the original object. The elements of a shallow copy are aliases of the elements of the original list, so changes to a nested mutable element are visible through both. A deep copy is made recursively, so modifying it does not affect the original.
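To see these rules in action, the sketch below mutates the original list after the three variables have been created and then prints what each one observes. It reuses the variable names from the example above.

```python
import copy

a = [1, 2, [3, 4]]
b_alias = a
b_shallow_copy = a[:]
b_deep_copy = copy.deepcopy(a)

# Mutating the nested list is visible through the alias and the shallow copy,
# because both still reference the same inner list object.
a[2].append(5)
# Appending to the outer list is visible only through the alias.
a.append(6)

print(b_alias)         # [1, 2, [3, 4, 5], 6]
print(b_shallow_copy)  # [1, 2, [3, 4, 5]]
print(b_deep_copy)     # [1, 2, [3, 4]]
```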
1.4 == and is
x == y  # Whether the two referenced objects have the same value
x is y  # Whether the two references point to the same object
1.5 Type checking
type(a) == int      # Exact type match only; subclasses (polymorphism) are ignored
isinstance(a, int)  # Also true for instances of subclasses, so polymorphism is respected
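A short sketch of the difference, using two hypothetical classes (Animal and Dog are illustrative names only) plus the built-in fact that bool is a subclass of int:

```python
class Animal:
    pass

class Dog(Animal):
    pass

d = Dog()
exact_match = type(d) == Animal          # False: the exact type is Dog
subclass_aware = isinstance(d, Animal)   # True: Dog is a subclass of Animal

bool_is_int = isinstance(True, int)      # True: bool is a subclass of int
bool_type_is_int = type(True) == int     # False: the exact type is bool

print(exact_match, subclass_aware, bool_is_int, bool_type_is_int)
```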
1.6 String search
str.find(sub, start=None, end=None); str.rfind(...)    # Return -1 if not found
str.index(sub, start=None, end=None); str.rindex(...)  # Raise ValueError if not found
1.7 Indexing a list from the end
This is only a matter of habit. Forward indices start at 0, while negative indices start at -1. To count from the end starting at 0 as well, use the ~ operator:
print(a[-1], a[-2], a[-3])
print(a[~0], a[~1], a[~2])  # Same elements as the line above
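This works because ~i equals -i - 1 for any integer i, so a[~i] mirrors a[i] from the other end of the list. A small sketch, with illustrative data:

```python
a = ['w', 'x', 'y', 'z']

# ~i is the bitwise complement of i, which equals -i - 1.
assert all(~i == -i - 1 for i in range(len(a)))

# Pair each element in the first half with its mirror image.
mirrored = [(a[i], a[~i]) for i in range(len(a) // 2)]
print(mirrored)  # [('w', 'z'), ('x', 'y')]
```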
2. A guide for C/C++ users
Many Python users have migrated from C/C++. The two languages differ somewhat in syntax and code style, which this section briefly covers.
2.1 Very large and very small numbers
The C/C++ habit is to define a very large constant. Python has inf and -inf:
a = float('inf')
b = float('-inf')
2.2 Boolean
The C/C++ habit is to use 0 and nonzero values to represent false and true. Python recommends using True and False directly for Boolean values.
a = True
b = False
2.3 Checking for null
The C/C++ habit for checking null pointers is if (a) and if (!a). In Python, check for None with:
if x is None:
    pass
If "if not x" is used instead, other objects (such as empty strings, lists, tuples, and dictionaries) are also treated as False.
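The sketch below compares the two checks on a few sample values (the list of values is illustrative). Only None passes the "is None" test, while every empty container and 0 pass the "not" test.

```python
values = [None, '', [], (), {}, 0, 'text']

is_none = [v is None for v in values]
is_falsy = [not v for v in values]

print(is_none)   # [True, False, False, False, False, False, False]
print(is_falsy)  # [True, True, True, True, True, True, False]
```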
2.4 Swapping values
The C/C++ habit is to define a temporary variable to swap two values. With Python's tuple packing and unpacking, it can be done in one step.
a, b = b, a
2.5 Comparison
The C/C++ habit is to combine two conditions with &&. Python can chain the comparisons in a single expression.
if 0 < a < 5:
    pass
2.6 Getters and setters of class members
The C/C++ habit is to make class members private and access them through a series of set and get functions. Python can define the corresponding accessors with @property, @name.setter, and @name.deleter, but unnecessary abstraction should be avoided: it can be about 4-5 times slower than direct access.
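For the cases where an accessor is genuinely needed, for example to validate a value, a property can be sketched as follows. The Circle class and its radius attribute are hypothetical and only serve as an illustration.

```python
class Circle:
    def __init__(self, radius):
        self.radius = radius  # Goes through the setter below

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        if value < 0:
            raise ValueError('radius must be non-negative')
        self._radius = value

    @radius.deleter
    def radius(self):
        del self._radius


c = Circle(1.0)
c.radius = 2.0   # Looks like plain attribute access, but runs the setter
print(c.radius)  # 2.0
```

When no validation or computation is needed, a plain attribute (self.radius = radius with no property) is simpler and avoids the overhead.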
2.7 Input and output parameters of functions
The C/C++ habit is to list both inputs and outputs as function parameters, modify the outputs through pointers, and use the return value as an execution status that the caller checks to see whether the call succeeded. In Python, the caller does not need to check a return status: when something goes wrong inside the function, it raises an exception directly.
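A minimal sketch of this style, assuming a hypothetical helper parse_port that is not part of the original text: the function returns its result directly and signals failure by raising, so the caller handles errors in one place.

```python
def parse_port(text):
    """Return the port number encoded in text, or raise ValueError."""
    port = int(text)  # int() itself raises ValueError on malformed input
    if not 0 < port < 65536:
        raise ValueError('port out of range: %d' % port)
    return port


try:
    port = parse_port('8080')   # Succeeds
    bad = parse_port('99999')   # Raises, so the next line is never reached
except ValueError as err:
    message = str(err)

print(port)     # 8080
print(message)  # port out of range: 99999
```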
2.8 Reading files
Reading files is much easier in Python than in C/C++. An opened file is an iterable object that yields one line at a time.
with open(file_path, 'rt', encoding='utf-8') as f:
    for line in f:
        print(line)  # The trailing \n is retained
2.9 Joining file paths
The C/C++ habit is usually to join paths directly with +, which is error-prone. In Python, os.path.join automatically inserts the separator appropriate for the operating system:
import os
os.path.join('usr', 'lib', 'local')
2.10 Parsing command-line options
Although Python can parse command-line options directly from sys.argv as in C/C++, the ArgumentParser class in argparse is more convenient and powerful.
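A minimal sketch of ArgumentParser follows. The option names (input, --num, --verbose) and the helper build_parser are made up for illustration; an explicit argument list is passed so that the example runs anywhere.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description='Example command-line tool.')
    parser.add_argument('input', help='path of the input file')
    parser.add_argument('-n', '--num', type=int, default=1, help='repeat count')
    parser.add_argument('-v', '--verbose', action='store_true', help='print more detail')
    return parser


# parse_args() with no argument reads sys.argv[1:];
# an explicit list is used here to keep the sketch self-contained.
args = build_parser().parse_args(['data.txt', '-n', '3', '--verbose'])
print(args.input, args.num, args.verbose)  # data.txt 3 True
```

Type conversion, default values, and the -h/--help message come for free, which is most of what hand-written sys.argv parsing would otherwise have to implement.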
2.11 Calling external commands
Although Python can call external commands directly with os.system as in C/C++, subprocess.check_output lets you choose whether to run the command through a shell and also captures the command's output.
import subprocess
# Raises subprocess.CalledProcessError if the external command returns a nonzero exit status
result = subprocess.check_output(['cmd', 'arg1', 'arg2']).decode('utf-8')
# Collect standard output and standard error together
result = subprocess.check_output(['cmd', 'arg1', 'arg2'], stderr=subprocess.STDOUT).decode('utf-8')
# Run a shell command (pipes, redirection, etc.); shlex.quote() can be used to quote arguments safely
result = subprocess.check_output('grep python | wc > out', shell=True).decode('utf-8')
2.12 Don't reinvent the wheel
Don't reinvent the wheel. Python is described as "batteries included", meaning that it already provides solutions to many common problems.
3. Common tools
3.1 Reading and writing CSV files
import csv
# Read and write without a header
with open(name, 'rt', encoding='utf-8', newline='') as f:  # newline='' leaves newline handling to the csv module
    for row in csv.reader(f):
        print(row[0], row[1])  # Data read from CSV is of type str
with open(name, mode='wt', newline='') as f:
    f_csv = csv.writer(f)
    f_csv.writerow(['symbol', 'change'])
# Read and write with a header
with open(name, mode='rt', newline='') as f:
    for row in csv.DictReader(f):
        print(row['symbol'], row['change'])
with open(name, mode='wt', newline='') as f:
    header = ['symbol', 'change']
    f_csv = csv.DictWriter(f, header)
    f_csv.writeheader()
    f_csv.writerow({'symbol': xx, 'change': xx})  # xx stands for the values of the row
Note that when a field in the CSV file is too large, an error is raised: _csv.Error: field larger than field limit (131072). Raising the limit solves it:
import csv
import sys
csv.field_size_limit(sys.maxsize)  # May raise OverflowError where a C long is 32 bits (e.g. Windows); use a smaller value there
csv can also read tab-delimited data:
f = csv.reader(f, delimiter='\t')
3.2 Iterator tools
itertools defines many iterator tools, such as tools for taking subsequences:
import itertools
itertools.islice(iterable, start, stop[, step])  # Can also be called as islice(iterable, stop)
# islice('ABCDEF', 2, None) -> C, D, E, F
itertools.filterfalse(predicate, iterable)  # Keep only the elements for which predicate is False
# filterfalse(lambda x: x < 5, [1, 4, 6, 4, 1]) -> 6
itertools.takewhile(predicate, iterable)  # Stop iterating as soon as predicate is False
# takewhile(lambda x: x < 5, [1, 4, 6, 4, 1]) -> 1, 4
itertools.dropwhile(predicate, iterable)  # Start iterating as soon as predicate is False
# dropwhile(lambda x: x < 5, [1, 4, 6, 4, 1]) -> 6, 4, 1
itertools.compress(iterable, selectors)  # Keep the elements whose corresponding selector is true
# compress('ABCDEF', [1, 0, 1, 0, 1, 1]) -> A, C, E, F
Sorting, grouping, permutations, and combinations:
sorted(iterable, key=None, reverse=False)
itertools.groupby(iterable, key=None)  # Groups consecutive equal values, so the iterable needs to be sorted first
# groupby(sorted([1, 4, 6, 4, 1])) -> (1, iter1), (4, iter4), (6, iter6)
itertools.permutations(iterable, r=None)  # Permutations; each result is a tuple
# permutations('ABCD', 2) -> AB, AC, AD, BA, BC, BD, CA, CB, CD, DA, DB, DC
itertools.combinations(iterable, r)  # Combinations; each result is a tuple
# combinations('ABCD', 2) -> AB, AC, AD, BC, BD, CD
itertools.combinations_with_replacement(...)
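The reason groupby needs sorted input is that it only groups adjacent equal values. The sketch below runs it on the same data with and without sorting; the variable names are illustrative.

```python
import itertools

data = [1, 4, 6, 4, 1]

# Without sorting, equal values that are not adjacent end up in separate groups.
unsorted_groups = [(k, list(g)) for k, g in itertools.groupby(data)]
print(unsorted_groups)  # [(1, [1]), (4, [4]), (6, [6]), (4, [4]), (1, [1])]

# After sorting, each distinct value forms exactly one group.
sorted_groups = [(k, list(g)) for k, g in itertools.groupby(sorted(data))]
print(sorted_groups)    # [(1, [1, 1]), (4, [4, 4]), (6, [6])]
```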
Merge multiple sequences:
itertools.chain(*iterables)  # Concatenate multiple sequences end to end
# chain('ABC', 'DEF') -> A, B, C, D, E, F
import heapq
heapq.merge(*iterables, key=None, reverse=False)  # Merge multiple sorted sequences into one sorted sequence
# merge('ABF', 'CDE') -> A, B, C, D, E, F
zip(*iterables)  # Stops when the shortest sequence is exhausted; the result can be consumed only once
itertools.zip_longest(*iterables, fillvalue=None)  # Stops when the longest sequence is exhausted; the result can be consumed only once
3.3 Counter
A Counter counts the number of occurrences of each element in an iterable object.
import collections
# Create
counter = collections.Counter(iterable)
# Frequency
counter[key]  # Number of occurrences of key
# Return the n most frequent elements and their counts; if n is None, return all elements
counter.most_common(n=None)
# Insert / update
counter.update(iterable)
counter1 + counter2; counter1 - counter2  # Counter addition and subtraction
# Check whether two sequences (for example, two strings) consist of the same elements
collections.Counter(list1) == collections.Counter(list2)
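A concrete run of these operations, as a small sketch with made-up data:

```python
import collections

words = ['a', 'b', 'a', 'c', 'b', 'a']
counter = collections.Counter(words)

count_a = counter['a']              # 3
count_missing = counter['missing']  # 0: a missing key counts as zero, no KeyError
top_two = counter.most_common(2)    # [('a', 3), ('b', 2)]

counter.update(['c', 'c'])          # Adds to the existing counts
count_c = counter['c']              # 3

# Two strings are anagrams exactly when their character counts match.
is_anagram = collections.Counter('listen') == collections.Counter('silent')

print(count_a, count_missing, top_two, count_c, is_anagram)
```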
3.4 Dict with default value
When a key that does not exist is accessed, defaultdict sets it to a default value.
import collections
collections.defaultdict(type)  # On the first access to dict[key], type is called with no arguments to provide the initial value of dict[key]
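A typical use is grouping values by key without first checking whether the key exists. The data and names below are illustrative.

```python
import collections

pairs = [('fruit', 'apple'), ('veg', 'carrot'), ('fruit', 'pear')]

groups = collections.defaultdict(list)
for kind, name in pairs:
    groups[kind].append(name)  # list() is called on the first access to each key

print(dict(groups))  # {'fruit': ['apple', 'pear'], 'veg': ['carrot']}

# Note that merely reading a missing key also inserts it.
had_key_before = 'grain' in groups
_ = groups['grain']
has_key_after = 'grain' in groups
print(had_key_before, has_key_after)  # False True
```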
3.5 Ordered dict
import collections
collections.OrderedDict(items=None)  # Preserves insertion order during iteration (since Python 3.7 the built-in dict does too)
4. High performance programming and debugging
4.1 Printing error and warning messages
Write a message to standard error:
import sys
sys.stderr.write('')
Issue a warning:
import warnings
warnings.warn(message, category=UserWarning)
# Possible values of category include DeprecationWarning, SyntaxWarning, RuntimeWarning, ResourceWarning, and FutureWarning
Control the output of warning messages:
$ python -W all     # Show all warnings; equivalent to warnings.simplefilter('always')
$ python -W ignore  # Ignore all warnings; equivalent to warnings.simplefilter('ignore')
$ python -W error   # Turn all warnings into exceptions; equivalent to warnings.simplefilter('error')
4.2 Debug-only code
Sometimes, for debugging, we want to add some extra code, usually print statements. It can be written as:
# Debugging part of the code
if __debug__:
    pass
Once debugging is finished, running the program with the -O option on the command line causes this part of the code to be skipped:
$ python -O main.py
4.3 Code style check
pylint checks many code style and syntax issues and finds some errors before the code runs:
$ pylint main.py
4.4 Measuring running time
Profile a whole program:
$ python -m cProfile main.py
Time a block of code:
# Define the timing context manager
from contextlib import contextmanager
from time import perf_counter
@contextmanager
def timeblock(label):
    tic = perf_counter()
    try:
        yield
    finally:
        toc = perf_counter()
        print('%s : %s' % (label, toc - tic))
# Time a block of code
with timeblock('counting'):
    pass
Some principles for optimizing running time:
- Focus on optimizing the actual performance bottlenecks, not all of the code.
- Avoid global variables. Local variables are looked up faster than global ones; moving code that runs at global scope into a function usually makes it 15%-30% faster.
- Avoid attribute access with the . operator. Using from module import name is faster, and a frequently accessed member such as self.member can be stored in a local variable.
- Prefer built-in data structures. str, list, set, and dict are implemented in C and run very fast.
- Avoid creating unnecessary intermediate variables and making unnecessary copy.deepcopy() calls.
- String concatenation such as a + ':' + b + ':' + c creates many useless intermediate objects; ':'.join([a, b, c]) is much more efficient. Also consider whether the concatenation is needed at all: for example, print(':'.join([a, b, c])) is less efficient than print(a, b, c, sep=':').
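Two of these principles can be sketched as follows. The function names are hypothetical, and no timings are claimed here; the actual gain depends on the workload and Python version, so it is worth measuring with the timeit module before and after.

```python
import math


def norms_attribute(points):
    # Looks up the global name math and then its attribute sqrt on every iteration.
    result = []
    for x, y in points:
        result.append(math.sqrt(x * x + y * y))
    return result


def norms_local(points):
    # Bind the attribute lookups to local names once, outside the loop.
    sqrt = math.sqrt
    result = []
    append = result.append
    for x, y in points:
        append(sqrt(x * x + y * y))
    return result


points = [(3, 4), (6, 8)]
print(norms_attribute(points), norms_local(points))  # [5.0, 10.0] [5.0, 10.0]

# join builds the result in one pass instead of creating intermediate strings.
a, b, c = 'x', 'y', 'z'
joined = ':'.join([a, b, c])
print(joined)  # x:y:z
```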
5. Other tips
5.1 argmin and argmax
items = [2, 1, 3, 4]
argmin = min(range(len(items)), key=items.__getitem__)
The same works for argmax, using max instead of min.
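Both variants side by side, as a short sketch on the same list:

```python
items = [2, 1, 3, 4]

# Compare the indices 0..len-1 by the value stored at each index.
argmin = min(range(len(items)), key=items.__getitem__)
argmax = max(range(len(items)), key=items.__getitem__)

print(argmin, argmax)  # 1 3
```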
5.2 Transpose a 2D list
A = [['a11', 'a12'], ['a21', 'a22'], ['a31', 'a32']]
A_transpose = list(zip(*A))  # List of tuples
A_transpose = [list(col) for col in zip(*A)]  # List of lists
5.3 Reshape a one-dimensional list into a two-dimensional list
A = [1, 2, 3, 4, 5, 6]
# Preferred.
list(zip(*[iter(A)] * 2))
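This idiom can look opaque, so the sketch below spells out why it works and notes one caveat. The variable names are illustrative.

```python
A = [1, 2, 3, 4, 5, 6]

# [iter(A)] * 2 is a list holding the SAME iterator twice, so zip pulls
# two consecutive elements from it for every output tuple.
it = iter(A)
pairs = list(zip(it, it))
print(pairs)  # [(1, 2), (3, 4), (5, 6)]

# Caveat: leftover elements are silently dropped when the length
# is not a multiple of the group size.
odd = list(zip(*[iter([1, 2, 3, 4, 5])] * 2))
print(odd)    # [(1, 2), (3, 4)]
```

If the trailing elements must be kept, itertools.zip_longest can be used in place of zip, at the cost of padding the last group with a fill value.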
Author: Zhang Hao https://zhuanlan.zhihu.com/p/48293468