[Liao Xuefeng Python tutorial study notes] Advanced features: list comprehensions, generators and iterators

Continuing from the previous notes on advanced features (slicing and iteration).

1, List comprehensions

A list comprehension is a simple but powerful built-in Python construct for creating lists.

1. Basic use of list comprehensions

1. Use list(range(1, 11)) to generate the list [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]:

>>> list(range(1,11))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

2. Generate [1x1, 2x2, 3x3, ..., 10x10] with a list comprehension:

>>> [x * x for x in range(1,11)]
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

An if condition can also be added after the for loop, for example to keep only the squares of even numbers:

>>> [x * x for x in range(1,11) if x % 2 == 0]
[4, 16, 36, 64, 100]

Use two nested loops to generate every pairwise combination (the Cartesian product) of the characters of two strings:

>>> [m + n for m in 'ABC' for n in 'XYZ']
['AX', 'AY', 'AZ', 'BX', 'BY', 'BZ', 'CX', 'CY', 'CZ']
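
To make the order of the two for clauses easier to see, the sketch below spells out the equivalent nested loops. The names pairs_comprehension and pairs_loops are made up for this illustration.

```python
# A comprehension with two for clauses...
pairs_comprehension = [m + n for m in 'ABC' for n in 'XYZ']

# ...behaves like nested loops, with the leftmost for as the outer loop.
pairs_loops = []
for m in 'ABC':
    for n in 'XYZ':
        pairs_loops.append(m + n)

print(pairs_comprehension == pairs_loops)  # True
```

Comprehensions with more than two levels are legal but quickly become hard to read, so plain loops are usually the better choice there.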

List comprehensions make for very concise code. For example, listing all file and directory names in the current directory takes a single line:

>>> import os # Import os module. The concept of module will be described later
>>> [d for d in os.listdir('.')] # os.listdir can list files and directories
['.emacs.d', '.ssh', '.Trash', 'Adlm', 'Applications', 'Desktop', 'Documents', 'Downloads', 'Library', 'Movies', 'Music', 'Pictures', 'Public', 'VirtualBox VMs', 'Workspace', 'XCode']

A for loop can use two or more variables at the same time, for example when iterating over the items() of a dict, so a list comprehension can also use two variables to generate a list:

>>> d = {'x':'A','y':'S', 'z':'C'}
>>> [k + '=' + v for k, v in d.items()]
['x=A', 'y=S', 'z=C']

Make all strings in a list lowercase

>>> L = ['Hello', 'World', 'IBM', 'Apple']
>>> [s.lower() for s in L]
['hello', 'world', 'ibm', 'apple']

2. if ... else in a list comprehension

An if placed after the for is a filter condition and cannot take an else:

>>> [x for x in range(1, 11) if x % 2 == 0 else 0]
  File "<stdin>", line 1
    [x for x in range(1, 11) if x % 2 == 0 else 0]
                                           ^
SyntaxError: invalid syntax

An if written before the for must have an else; otherwise an error is reported:

>>> [x if x % 2 == 0 for x in range(1, 11)]
  File "<stdin>", line 1
    [x if x % 2 == 0 for x in range(1, 11)]
                       ^
SyntaxError: invalid syntax

This is because the part before the for is an expression that must produce a result for every x, so the else is required:

>>> [x if x % 2 == 0 else -x for x in range(1, 11)]
[-1, 2, -3, 4, -5, 6, -7, 8, -9, 10]

As you can see, in a list comprehension the if ... else before the for is an expression, while the if after the for is a filter condition and cannot take an else.
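
The two forms can also appear in the same comprehension. The following is a small sketch (the variable name result is made up for this illustration): the condition after the for decides which items are kept, and the conditional expression before the for decides what value each kept item produces.

```python
# Filter (after for): keep only multiples of 3 in 1..10, i.e. 3, 6, 9.
# Conditional expression (before for): keep even numbers, negate odd ones.
result = [x if x % 2 == 0 else -x for x in range(1, 11) if x % 3 == 0]
print(result)  # [-3, 6, -9]
```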

3. Exercise: make the strings in a list lowercase

If the list contains non-string elements such as integers or None, the list comprehension raises an error because non-string types have no lower() method:

>>> L = ['Hello', 'World', 18, 'Apple', None]
>>> [s.lower() for s in L]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
AttributeError: 'int' object has no attribute 'lower'

Use the built-in isinstance function to check whether a variable is a string:

>>> x = 'abc'
>>> y = 123
>>> isinstance(x, str)
True
>>> isinstance(y, str)
False

Modify the list comprehension by adding an if condition so that it executes correctly:

L1 = ['Hello', 'World', 18, 'Apple', None]
# First approach: build a new list that contains only the string elements, then lowercase them
L3 = []
for i in L1:
    if isinstance(i, str):
        L3.append(i)
L2 = [s.lower() for s in L3]
# Second approach: filter directly in the list comprehension with an if after the for
L2 = [s.lower() for s in L1 if isinstance(s, str)]

2, Generators

With a list comprehension we can create a list directly. However, memory is limited, so the size of a list is limited too. Moreover, a list of 1 million elements takes up a lot of memory, and if we only need the first few elements, the space occupied by all the remaining ones is wasted.

So, if the list elements can be computed by some algorithm, can we compute each subsequent element as the loop proceeds? That removes the need to build the complete list and saves a lot of space. In Python, this compute-as-you-loop mechanism is called a generator.
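
The memory argument can be checked with a quick, informal measurement. The sketch below uses the generator-expression syntax introduced under Method 1 in the next subsection. The variable names are made up for this illustration, and the sizes reported by sys.getsizeof depend on the Python build, so no exact numbers are claimed.

```python
import sys

squares_list = [x * x for x in range(1_000_000)]  # all elements stored at once
squares_gen = (x * x for x in range(1_000_000))   # elements computed on demand

# sys.getsizeof reports the size of the container object itself
# (not the integers it refers to), which is enough to show the gap.
print(sys.getsizeof(squares_list))  # millions of bytes; exact value varies
print(sys.getsizeof(squares_gen))   # tiny by comparison; exact value varies
```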

1. Creating a generator

Method 1:
Just change the [] of a list comprehension to () to create a generator:

>>> L = [x * x for x in range(10)]
>>> L # As a list, L can print each element directly
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> g = (x * x for x in range(10)) #g is a generator
>>> g
<generator object <genexpr> at 0x1022ef630>

Because a generator is also an iterable object, a for loop can be used to print each of its elements:

>>> g = (x * x for x in range(10))
>>> for n in g:
...     print(n)
... 
0
1
4
9
16
25
36
49
64
81
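
A generator can also be advanced by hand with the built-in next() function, and it can be consumed only once. The sketch below illustrates both points on a shorter generator; the name leftover is made up for this illustration.

```python
g = (x * x for x in range(3))

print(next(g))  # 0
print(next(g))  # 1
print(next(g))  # 4

# The generator is now exhausted: a further next() raises StopIteration.
try:
    next(g)
except StopIteration:
    print('no more elements')

# Iterating over the exhausted generator again produces nothing.
leftover = list(g)
print(leftover)  # []
```

In practice a for loop is almost always preferable to calling next() by hand, because the loop handles StopIteration automatically.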

2. Implementing a generator with a function

If the algorithm is complex and cannot be expressed with a for loop in the style of a list comprehension, it can be implemented with a function instead.

For example, in the famous Fibonacci sequence, every number after the first two is the sum of the two numbers before it:
1, 1, 2, 3, 5, 8, 13, 21, 34, ...

# Print the Fibonacci sequence with a function:
# fib(max) prints the first max numbers of the sequence
def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        print(b)
        a, b = b, a + b  # equivalent to: t = (b, a + b); a = t[0]; b = t[1]
        n = n + 1
    return 'done'

Method 2:
To turn the fib function into a generator function, just change print(b) to yield b:

def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1
    return 'done'

If a function definition contains the yield keyword, it is no longer an ordinary function but a generator function. Calling a generator function returns a generator:

>>> f = fib(6)
>>> f
<generator object fib at 0x104feaaa0>

A generator function executes differently from an ordinary function. An ordinary function runs sequentially and returns when it reaches a return statement or the last line of the function body.
A generator function runs each time next() is called, pauses and hands back a value when it reaches a yield statement, and resumes from that yield statement on the next call to next().

Calling a generator function creates a generator object, and calling it several times creates several independent generators. The correct approach is to create one generator object and then call next() on that same object repeatedly.
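
The pause-and-resume behaviour and the "independent generators" pitfall can be seen in a small sketch. The odd() function and the variable names are hypothetical and exist only for this demonstration.

```python
def odd():
    # Hypothetical demo generator: each yield is a pause point.
    print('step 1')
    yield 1
    print('step 2')
    yield 3
    print('step 3')
    yield 5

o = odd()         # creates the generator; nothing in the body has run yet
first = next(o)   # prints 'step 1', pauses at yield 1
second = next(o)  # resumes after yield 1, prints 'step 2', pauses at yield 3

# Pitfall: every call to odd() creates a brand-new generator,
# so next(odd()) always restarts from the top.
restart_a = next(odd())  # prints 'step 1'
restart_b = next(odd())  # prints 'step 1' again

print(first, second, restart_a, restart_b)  # 1 3 1 1
```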

3. Getting the return value of a generator

When a for loop iterates over a generator, the value of the generator's return statement is not available: the loop only receives the values produced by yield, and it ends silently once the generator finishes.

def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1
    return 'done'
    
>>> for n in fib(6):
...     print(n)
...
1
1
2
3
5
8

To get the return value, you must catch the StopIteration exception; the return value is stored in its value attribute:

>>> g = fib(6)
>>> while True:
...     try:
...         x = next(g)
...         print('g:', x)
...     except StopIteration as e:
...         print('Generator return value:', e.value)
...         break
...
g: 1
g: 1
g: 2
g: 3
g: 5
g: 8
Generator return value: done
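
Another way to reach the return value, not covered in the text above, is the yield from expression: when one generator delegates to another, yield from evaluates to the inner generator's return value. The sketch below is only an illustration; fib_with_report and report are hypothetical names.

```python
def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1
    return 'done'

def fib_with_report(max, report):
    # Hypothetical wrapper: yield from re-yields every value of fib(max)
    # and evaluates to fib's return value once fib finishes.
    result = yield from fib(max)
    report.append(result)

report = []
values = list(fib_with_report(6, report))
print(values)  # [1, 1, 2, 3, 5, 8]
print(report)  # ['done']
```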

4. Exercise: Yang Hui's triangle

Use a generator to solve the Yang Hui's triangle (Pascal's triangle) problem: treat each row as a list, and write a generator that keeps yielding the list for the next row.

The solution below follows a concise approach shared by a more experienced programmer.
# -*- coding: utf-8 -*-
# The i-th number in row n+1 equals the sum of the (i-1)-th and i-th numbers in row n, which is also a property of binomial coefficients:
# C(n+1, i) = C(n, i) + C(n, i-1)

def triangles():
    # Initialize a list as the first row of output
    L = [1]
    while True:
        yield L
        # Lists can be concatenated with +. Padding L with a zero on the left (X) and on the right (Y)
        # gives two lists with one more element than L; adding them element by element produces the next row
        X = [0] + L
        Y = L + [0]
        L = [X[i] + Y[i] for i in range(len(X))]  # list comprehension

Test:

# Expected output:
# [1]
# [1, 1]
# [1, 2, 1]
# [1, 3, 3, 1]
# [1, 4, 6, 4, 1]
# [1, 5, 10, 10, 5, 1]
# [1, 6, 15, 20, 15, 6, 1]
# [1, 7, 21, 35, 35, 21, 7, 1]
# [1, 8, 28, 56, 70, 56, 28, 8, 1]
# [1, 9, 36, 84, 126, 126, 84, 36, 9, 1]
n = 0
results = []
for t in triangles():
    results.append(t)
    n = n + 1
    if n == 10:
        break

for t in results:
    print(t)

if results == [
    [1],
    [1, 1],
    [1, 2, 1],
    [1, 3, 3, 1],
    [1, 4, 6, 4, 1],
    [1, 5, 10, 10, 5, 1],
    [1, 6, 15, 20, 15, 6, 1],
    [1, 7, 21, 35, 35, 21, 7, 1],
    [1, 8, 28, 56, 70, 56, 28, 8, 1],
    [1, 9, 36, 84, 126, 126, 84, 36, 9, 1]
]:
    print('Test passed!')
else:
    print('Test failed!')
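
As a variation on the same idea, the padded lists can be paired with zip, and itertools.islice can take the first few rows from the infinite generator without a manual counter. This is only a sketch of an alternative style, not the solution the exercise expects; the name first_rows is made up for this illustration.

```python
from itertools import islice

def triangles():
    row = [1]
    while True:
        yield row
        # Pair each element with its right-hand neighbour after zero-padding.
        row = [a + b for a, b in zip([0] + row, row + [0])]

# islice stops the infinite generator after 5 rows.
first_rows = list(islice(triangles(), 5))
for r in first_rows:
    print(r)
```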

3, Iterator

Several kinds of objects can be used directly in a for loop:
One kind is the collection data types, such as list, tuple, dict, set, str, etc.;
The other kind is generators, created either by generator expressions or by generator functions that use yield.

Objects that can be used directly in a for loop are collectively called iterable objects: Iterable.
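
Whether an object is iterable can be checked with isinstance() against collections.abc.Iterable. The sketch below collects a few checks in a dict; the name checks is made up for this illustration.

```python
from collections.abc import Iterable

checks = {
    'list': isinstance([], Iterable),
    'dict': isinstance({}, Iterable),
    'str': isinstance('abc', Iterable),
    'generator': isinstance((x for x in range(10)), Iterable),
    'int': isinstance(100, Iterable),
}
for name, ok in checks.items():
    print(name, ok)  # everything is True except int
```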

Generators and iterators

Generator: besides working in a for loop, it can be passed to next() repeatedly to return the next value, until a StopIteration exception is raised to signal that there are no more values.
Iterator: an object that can be passed to next() and keeps returning the next value is called an iterator.
You can use isinstance() to determine whether an object is an Iterator:

>>> from collections.abc import Iterator
>>> isinstance((x for x in range(10)), Iterator)  # generators are Iterator objects
True
>>> # list, dict and str are Iterable, but not Iterator
>>> isinstance([], Iterator)
False
>>> isinstance({}, Iterator)
False
>>> isinstance('abc', Iterator)
False

You may ask: why are list, dict, str and other data types not Iterators?
This is because a Python Iterator object represents a data stream. It can be passed to next() to keep returning the next item until a StopIteration exception is raised when no data is left.

This data stream can be viewed as an ordered sequence whose length we cannot know in advance; we can only compute the next item on demand by calling next(). The computation of an Iterator is therefore lazy: a value is computed only when the next item is requested.
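
Because values are computed only on demand, an Iterator can even represent an endless stream, which a list never could. The sketch below uses itertools.count from the standard library as one such stream; the variable names are made up for this illustration.

```python
from itertools import count, islice

naturals = count(1)                     # 1, 2, 3, ... with no end
first_five = list(islice(naturals, 5))  # take only what is needed
sixth = next(naturals)                  # the stream resumes where it stopped

print(first_five)  # [1, 2, 3, 4, 5]
print(sixth)       # 6
```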

The iter() function turns list, dict, str and other Iterable objects into Iterators:

>>> isinstance(iter([]),Iterator)
True
>>> isinstance(iter(['abc']),Iterator)
True
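
This is also roughly how a for loop works behind the scenes: it obtains an Iterator with iter() and calls next() until StopIteration is raised. The sketch below writes that out by hand as an approximation of the mechanism; the names collected_for and collected_manual are made up for this illustration.

```python
collected_for = []
for x in [1, 2, 3, 4, 5]:
    collected_for.append(x)

# Roughly what the for loop above does:
collected_manual = []
it = iter([1, 2, 3, 4, 5])  # obtain an Iterator from the list
while True:
    try:
        x = next(it)        # fetch the next value
    except StopIteration:
        break               # no more values: leave the loop
    collected_manual.append(x)

print(collected_for == collected_manual)  # True
```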

Summary

All objects that can be used in a for loop are of the Iterable type;
All objects that can be passed to next() are of the Iterator type, which represents a lazily computed sequence;
Collection data types such as list, dict and str are Iterable but not Iterator; an Iterator can be obtained from them with the iter() function.

Keywords: Python Back-end

Added by johncollins on Mon, 28 Feb 2022 14:47:00 +0200