Python learning notes: 6 bad habits that hurt code performance

1, Background

There are often several ways to implement the same requirement, and they can differ noticeably in speed.

2, Bad habits

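1. Do not import the whole module

Whether it is a built-in module or a third-party module, it must be imported before use.

If we only need a few of its functions, we can import those directly. Calling math.sqrt looks up the sqrt attribute on the module every time, while a directly imported sqrt skips that lookup.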

## slower 
import math
%%timeit
math.sqrt(100)
# 171 ns ± 16.5 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

## Faster
from math import sqrt
%%timeit
sqrt(100)
# 120 ns ± 6.34 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
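Outside a notebook, the same comparison can be sketched with the standard timeit module. The function names via_module and via_name are made up for this illustration, and the absolute timings will differ from machine to machine.

```python
import math
import timeit
from math import sqrt


def via_module(n=1000):
    # Looks up the name `math` and then its attribute `sqrt` on every iteration.
    total = 0.0
    for i in range(n):
        total += math.sqrt(i)
    return total


def via_name(n=1000):
    # Looks up only the name `sqrt` on every iteration.
    total = 0.0
    for i in range(n):
        total += sqrt(i)
    return total


if __name__ == "__main__":
    for fn in (via_module, via_name):
        seconds = timeit.timeit(fn, number=200)
        print(f"{fn.__name__}: {seconds:.4f} s for 200 calls")
```

Both functions return the same value, so only the lookup cost differs between them.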

2. Avoid dots / dot chains

Using a dot to access an object's attributes or methods is very intuitive, but every dot is a lookup at run time, so avoiding repeated lookups may give better performance.

# Append an element, and then delete the element
my_list = [1,2,3]

## slower 
%%timeit
my_list.append(4)
my_list.remove(4)
# 306 ns ± 38.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

## Faster
append = my_list.append
remove = my_list.remove
%%timeit
append(4)
remove(4)
# 189 ns ± 12.9 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

The syntax is a little unintuitive, but if the operation is repeated millions of times, this technique is worth considering.

You need to balance the performance and the readability of your code.
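Where this trick usually pays off is inside a hot loop. A minimal sketch, with the hypothetical helpers squares_dotted and squares_bound:

```python
def squares_dotted(values):
    out = []
    for v in values:
        out.append(v * v)  # looks up `append` on `out` in every iteration
    return out


def squares_bound(values):
    out = []
    append = out.append    # look up the bound method once, before the loop
    for v in values:
        append(v * v)
    return out


print(squares_bound([1, 2, 3]))  # [1, 4, 9]
```

The two functions produce the same list; whether the second one is measurably faster depends on the Python version and the loop size, so it is worth timing before sacrificing readability.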

3. Do not use "+" to concatenate strings

Strings are immutable in Python.

When strings are concatenated with "+", each step is handled separately and produces a new string object.

strs = ['Life', 'is', 'short', ',', 'I', 'FUCK']

## slower 
def join_strs(strs):
    result = ''
    for s in strs:
        result += ' ' + s  # result = result + ' ' + s
    return result[1:] # Delete first space
%%timeit
join_strs(strs) # 'Life is short , I FUCK'
# 1.23 µs ± 59.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

## Faster
def join_strs_better(strs):
    return ' '.join(strs)
%%timeit
join_strs_better(strs)
# 327 ns ± 55.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

With "+", each concatenation may allocate memory for a new string and copy the existing contents into it, which becomes a very expensive overhead as the number of substrings grows.

The join() method knows all the substrings in advance, so it can work out the final length and allocate the memory once.
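One practical detail: str.join only accepts strings, so mixed data has to be converted first. A small sketch, where join_words is a hypothetical helper and not part of the original benchmark:

```python
def join_words(words, sep=" "):
    # str.join raises TypeError on non-str items, so convert each one first.
    return sep.join(str(w) for w in words)


print(join_words(["Python", 3, "is", "fast"]))  # Python 3 is fast
print(join_words(["a", "b", "c"], sep=","))      # a,b,c
```

Because the separator is only placed between items, there is no leading space to strip afterwards.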

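4. Do not use temporary variables to swap values

Python's built-in syntax supports swapping values directly. The measured gain in the timings for this tip is small and falls within the reported deviation, so the main benefit is shorter, clearer code.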

## slower 
%%timeit
a = 1
b = 2
temp = a
a = b
b = temp
# 72.1 ns ± 7.31 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

## Faster
%%timeit
a = 1
b = 2
a, b = b, a
# 65.8 ns ± 8.37 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
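The same tuple-assignment syntax also works for list elements and for more than two values, because the right-hand side is evaluated in full before anything is assigned. A short sketch, with swap_items as a made-up helper name:

```python
def swap_items(items, i, j):
    # Both items[j] and items[i] are read before either slot is written.
    items[i], items[j] = items[j], items[i]
    return items


print(swap_items([10, 20, 30], 0, 2))  # [30, 20, 10]

a, b, c = 1, 2, 3
a, b, c = b, c, a  # rotate three values without any temporary variable
print(a, b, c)     # 2 3 1
```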

5. Take advantage of short-circuit evaluation in if conditions

"Short-circuit" evaluation exists in many programming languages, and Python is no exception.

It refers to the behavior of the Boolean operators and / or: the second operand is evaluated only if the first operand is not enough to determine the value of the whole expression.

# Filter list: name starts with C, age 30 or older
my_dict = [{'name': 'Alice', 'age': 28},
           {'name': 'Bob', 'age': 23},
           {'name': 'Chris', 'age': 33},
           {'name': 'Chelsea', 'age': 2},
           {'name': 'Carol', 'age': 24}]

## slower 
%%timeit
filtered_list = []
for person in my_dict:
    if person['name'].startswith('C') and person['age'] >= 30:
        filtered_list.append(person) 
# 1.68 µs ± 45.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

## Faster
%%timeit
filtered_list = []
for person in my_dict:
    if person['age'] >= 30 and person['name'].startswith('C'):
        filtered_list.append(person)
# 973 ns ± 57.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each) 

When you know the approximate distribution of the data, you can speed up filtering by putting first the condition that rejects the most items, so the remaining conditions are rarely evaluated.
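To see the short circuit at work, the sketch below counts how often each condition runs. The helper names old_enough and starts_with_c and the calls counter are illustrative only; the data is the same list used in the benchmark.

```python
calls = {"old_enough": 0, "starts_with_c": 0}


def old_enough(person):
    calls["old_enough"] += 1
    return person["age"] >= 30


def starts_with_c(person):
    calls["starts_with_c"] += 1
    return person["name"].startswith("C")


people = [{'name': 'Alice', 'age': 28},
          {'name': 'Bob', 'age': 23},
          {'name': 'Chris', 'age': 33},
          {'name': 'Chelsea', 'age': 2},
          {'name': 'Carol', 'age': 24}]

# The age check rejects four of the five people, so the name check runs once.
matches = [p for p in people if old_enough(p) and starts_with_c(p)]
print(matches)  # [{'name': 'Chris', 'age': 33}]
print(calls)    # {'old_enough': 5, 'starts_with_c': 1}
```

With the conditions in the opposite order, the name check would run five times and the age check three times, once for each name that starts with C.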

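6. Prefer the for loop to the while loop

CPython, the reference implementation of Python, is written in C and does a lot of its work in C to improve performance.

A for loop over range() advances its counter in C, whereas a while loop performs the comparison and the increment as separate Python-level steps, so the for loop has fewer steps and better performance.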

## slower 
%%timeit
result = 0
max_number = 10

i = 0
while i < max_number:
    result += i
    i += 1
# 1.51 µs ± 63.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

## Faster
%%timeit
result = 0
max_number = 10

for i in range(max_number):
    result += i
# 1.04 µs ± 48.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
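For readers who are not working in a notebook, here is a rough equivalent of the two cells above using the timeit module. The function names sum_while and sum_for are invented for this sketch, and the timings it prints will vary by machine and Python version.

```python
import timeit


def sum_while(max_number):
    result = 0
    i = 0
    while i < max_number:  # comparison and increment run as Python bytecode
        result += i
        i += 1
    return result


def sum_for(max_number):
    result = 0
    for i in range(max_number):  # range produces the next value in C
        result += i
    return result


if __name__ == "__main__":
    for fn in (sum_while, sum_for):
        seconds = timeit.timeit(lambda: fn(10), number=100_000)
        print(f"{fn.__name__}: {seconds:.4f} s for 100000 calls")
```

Both functions return 45 for max_number = 10, so the comparison measures only the loop mechanics.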

3, Summary

When reviewing code, weigh performance, readability and brevity together rather than optimizing for speed alone.

Reference link: Shot? Six bad habits that slow down your Python program

Keywords: Python

Added by jaz529 on Mon, 10 Jan 2022 06:06:49 +0200