Easily overlooked details of Python iterators and generators, and applications of lambda functions

brief introduction

The first half of this article covers iterators and generators in as much detail as possible and points out several details that are easy to overlook. The second half covers using iterators together with lambda functions; the part on lambda alone is simpler.
P.S. The summary contains some practical takeaways.

preface

Strange that these two topics are put together, isn't it? That's because I'm lazy.

Recently I ran into material on both topics and found I had forgotten it, so I reviewed the relevant documentation and am sharing what I learned. It also makes it easier for me to look things up later. After all, notes you write yourself are the best kind.

To get to the point, do you know the knowledge points and details involved in the following two lines of code?

def function(*args, **kwgs):
	yield from iter(lambda params: expression, sentinel)

If this is completely clear to you, feel free to move on. This article may not help you. Thank you for coming!

If you don't fully understand it, I believe you will gain something from reading on. The details are worth understanding.

I will work through the snippet above layer by layer and try to clarify each knowledge point in turn.

First, briefly explain the parameters:

  1. *args, **kwgs: used when the number of arguments is not known in advance. *args collects the extra positional arguments (no keys), and **kwgs collects the extra keyword arguments (key-value pairs).
  2. params: a set of parameters obtained from *args and **kwgs, or empty
  3. expression: a single expression
  4. sentinel: an object that acts as a guard. Its expected value depends on the first argument of iter(), as described further below.

These items are the main knowledge points covered in this article.
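
As a quick illustration of the first item, here is a minimal sketch of how *args and **kwgs collect arguments. The function name show_args is made up for this example and does not come from the original snippet.

```python
def show_args(*args, **kwgs):
    # args is a tuple of the positional arguments,
    # kwgs is a dict of the keyword arguments
    return args, kwgs

positional, keyword = show_args(1, 2, name='apple', count=3)
print(positional)  # (1, 2)
print(keyword)     # {'name': 'apple', 'count': 3}
```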

Iterators and generators

iterator

Let's start from the beginning. An iterator is an object that remembers its position during traversal. What does that mean? The iterator can only move forward, one element at a time, starting from the first element, and it always knows which element should be accessed next.

""" This paper is based on python 3.7 """

# First use the for loop
tmp_list = [2, 4, 6, 8, 10] 
for i in range(0, len(tmp_list)):
    print(tmp_list[i], end=' ')
# A little simpler
tmp_list = [2, 4, 6, 8, 10]  # Also works with a () tuple, a {} set, or a '' string
for i in tmp_list:
    print(i, end=' ')

# Both loops print: 2 4 6 8 10

# Next, use the iterator to achieve the same effect
tmp_iter = iter(tmp_list) # Generate an iterator object with iter()
while True:
    try:
        print (next(tmp_iter), end=' ') # A call to next() will traverse the next element
    except StopIteration:
        break
# The result is: 2 4 6 8 10

print(type(tmp_iter))
# <class 'list_iterator'>
tmp_set = {2, 4, 6, 8, 10}
tmp_iter = iter(tmp_set) 
print(type(tmp_iter))
# <class 'set_iterator'>
  • tmp_iter defined above is an iterator object. Note that "iterator" is a broad category with many concrete types under it, such as list_iterator and set_iterator. Iterators are driven by two built-in functions, iter() and next(): iter() returns an iterator object, and each call to next() returns the next element. In other words, iter() initializes an iterator, while next() walks through its elements one at a time, starting from the first.

  • StopIteration is caught here because once the last element has been returned, the iteration is complete and next() has nothing left to return, so it raises StopIteration. Without the try... except... handling, the program fails with that exception, as shown below.

    tmp_iter = iter(tmp_list) 
    while True:
        print (next(tmp_iter), end=' ')
        
    # Traceback (most recent call last):
    #  File "****.py", line 20, in <module>
    #    print (next(tmp_iter), end=' ')
    # StopIteration
    

Using a for loop instead of a while loop stops the iteration automatically, without a visible StopIteration exception. An example is given below. As for iter(), list, tuple, set and string objects can all be used to create iterators, that is, they can be passed as the argument of iter(). Objects that can be passed to iter() are called iterable objects. The example also shows that a for loop implicitly calls next() as it iterates.

tmp_list = [2, 4, 6, 8, 10, 12]
for i in iter(tmp_list):
  print(i, end=' ')
# The result is still: 2 4 6 8 10 12, and note that no StopIteration exception is raised here
print('\n=================')
iter1 = iter(tmp_list)
for i in iter1:
  print(i, 'and', next(iter1))
# =================
# 2 and 4
# 6 and 8
# 10 and 12
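
The implicit call to next() can be made explicit. The sketch below is only an approximation of what a for loop does, written as a hypothetical helper manual_for(); it is not how the interpreter is actually implemented, but it follows the same three steps.

```python
def manual_for(iterable, action):
    """Roughly what `for item in iterable: action(item)` does."""
    iterator = iter(iterable)        # step 1: ask the iterable for an iterator
    while True:
        try:
            item = next(iterator)    # step 2: fetch the next element
        except StopIteration:        # step 3: the loop ends quietly on StopIteration
            break
        action(item)

collected = []
manual_for([2, 4, 6, 8, 10, 12], collected.append)
print(collected)  # [2, 4, 6, 8, 10, 12]
```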

generator

Now for generators. An iterator is an object that remembers its traversal position, while a generator is a function that returns an iterator. Any function that uses yield is a generator function. So what is yield?

From the Rookie Tutorial:
While a generator runs, each time it reaches a yield the function pauses, saves all of its current state, and returns the yielded value. It resumes from that position the next time next() is called.

For ease of understanding, we can think of each execution of yield as returning one value, while the generator function as a whole returns an iterator. Here is an example:

# Output the 1st to nth powers of 2: 2, 4, 8, 16, ...
def func(n: int):
    i, num = 1, 2
    while i <= n:
        yield num
        num = num * 2
        i += 1
       
for i in func(5):
    print(i, end=',')
# The results are: 2,4,8,16,32,

print(list(func(5)))
# [2, 4, 8, 16, 32]

Every time the program reaches yield, it pauses and saves its current state. yield returns the value of num at that moment, and on the next request the program resumes from where it was suspended. Strictly speaking, what the generator function returns is reported as a generator object rather than a generic iterator:

f = func(5)
print(f)
# <generator object func at 0x000001BF70F123C8>
print(type(f))
# <class 'generator'>

The return value is a generator object, and a generator object is itself a kind of iterator, which is why it behaves like one. It has a __next__() method, so let's compare the built-in next() with the method __next__():

while True:
    try:
        print(next(f), end=' ')
    except StopIteration:
        break
# 2 4 8 16 32 
f = func(5)  # f is exhausted after the first loop, so create a new generator
while True:
    try:
        print(f.__next__(), end=' ')
    except StopIteration:
        break
# 2 4 8 16 32 

next() is a Python 3 built-in function, and __next__() is a special method defined on the generator class that produces the next element during traversal. When next(f) is called, next() looks up the generator's __next__() method and executes it, so the two calls give the same result. This is also what lets a generator object act as an iterator.
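
To see the relationship between next() and __next__() side by side, here is a small sketch. powers_of_two() is a hypothetical helper in the same spirit as func() above. It also shows that next() accepts an optional second argument, which is returned instead of raising StopIteration.

```python
def powers_of_two(n):
    # Hypothetical helper: yields 2, 4, 8, ... (n values in total)
    num = 2
    for _ in range(n):
        yield num
        num *= 2

g = powers_of_two(3)
first = next(g)          # the built-in function
second = g.__next__()    # the special method it delegates to
third = next(g)
# With a default value, next() returns it instead of raising StopIteration
after_end = next(g, 'done')
print(first, second, third, after_end)  # 2 4 8 done
```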

  • There is another small detail worth noting

    f = func(5)
    print(list(f))
    print('=================')
    print(next(f))  # print(f.__next__())
    

    Guess what this code prints? Think about the expected result first.

    [2, 4, 8, 16, 32]
    =================
    Traceback (most recent call last):
      File "****.py", line 25, in <module>
      print(f.__next__())
    StopIteration
    

    The last print() does not output the hoped-for '2'. What happens if we remove print(list(f))?

    f = func(5)
    print(next(f))  # print(f.__next__())
     # The result is: 2
    
    print(next(f))  # print(f.__next__())
    # The result is: 4
    
  • This time the results match expectations. When we call list(), it traverses the generator and stores all of its elements in a list, in order. In the process, list() moves the traversal position to the end, so there is no next element left when we then call next(f). The generator is exhausted, so next() raises StopIteration.

  • In addition, the set {} mentioned earlier can be used to create iterators. Can the dict type, which is also delimited by curly braces {}, be used the same way?

    tmp_dict = {'a': '1 apple', 'b': '2 banana', 'c': '3 cabbage'}
    it = iter(tmp_dict)
    for k, v in it:
        print(v, end=' ')
    # Traceback (most recent call last):
    #   File "****.py", line 49, in <module>
    #     for k, v in it:
    # ValueError: not enough values to unpack (expected 2, got 1)
    

    Not enough values to unpack? Why? Let me quietly remove the k,

    for v in it:
        print(v, end=' ')
    # Results: a b c 
    
    # Why is only the key of each key-value pair output?
    print(type(it))
    # <class 'dict_keyiterator'>
    

    It turns out that the iterator object it belongs to the class 'dict_keyiterator', an iterator over the keys of the dict. It iterates only over the key of each key-value pair; the values are ruthlessly left behind by iter().
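
If the values or the key-value pairs are what you want, the dict's own view methods provide them. A minimal sketch using the same tmp_dict:

```python
tmp_dict = {'a': '1 apple', 'b': '2 banana', 'c': '3 cabbage'}

keys = list(iter(tmp_dict))        # same keys as tmp_dict.keys()
values = list(tmp_dict.values())   # values only
pairs = list(tmp_dict.items())     # (key, value) tuples

for k, v in tmp_dict.items():      # unpacking works: each item is a 2-tuple
    print(k, v)

print(type(iter(tmp_dict.items())))
# <class 'dict_itemiterator'>
```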

Class as an iterator

It is not only the familiar built-in types that can create iterator objects. User-defined classes can also serve as iterators, which removes the limitation that iterators come only in fixed types, widens where iterators can be used, and can be very convenient in some cases.

To use a class as an iterator, we need to implement two methods in the class: __iter__() and __next__(). The __iter__() method returns an iterator object, which must implement __next__() and signal the end of the iteration by raising StopIteration. The __next__() method returns the next element.

The generator class mentioned earlier behaves as an iterator precisely because it also implements __iter__() and __next__(). For a generator object gener1, calling iter(gener1) is equivalent to calling gener1.__iter__(), and calling next(gener1) is equivalent to calling gener1.__next__(), so the generator can be used as an iterator, and gener1 is an iterable object.

Here is a simple example. Note that __iter__() needs to return an iterator object that implements __next__(), and that object does not have to be self. (There is an example of __iter__() returning an instance of another class at the end of this article.)

class MyIter:
	def __init__(self):
		self.num = 1

	def __iter__(self):
		self.num += 1
		return self

	def __next__(self):
		self.num *= 2
		return self.num
		
a_iter = MyIter()  # self.num = 1
print(next(a_iter), end=' ')
print(next(a_iter), end=' ')
print(next(a_iter), end=' ')
# 2 4 8 
b_iter = iter(a_iter)  # __iter__() adds 1 to self.num, so it goes from 8 to 9
print(next(b_iter), end=' ')
print(next(b_iter), end=' ')
print(next(b_iter), end=' ')
# 18 36 72 
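
MyIter above never raises StopIteration, so a for loop over it would not end on its own. Below is a sketch of a bounded variant that signals completion the way the text describes. The class name PowersOfTwo and its limit parameter are assumptions made for this illustration.

```python
class PowersOfTwo:
    """Iterator yielding 2, 4, 8, ... while the value stays <= limit."""

    def __init__(self, limit):
        self.limit = limit
        self.num = 1

    def __iter__(self):
        # An iterator conventionally returns itself from __iter__()
        return self

    def __next__(self):
        if self.num * 2 > self.limit:
            raise StopIteration      # tells for/list() that the iteration is over
        self.num *= 2
        return self.num

print(list(PowersOfTwo(40)))  # [2, 4, 8, 16, 32]
```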

Summary (with practical takeaways!!!)

To sum up, lists, tuples, sets, strings, iterators, generators and more are iterable types, and their instances are iterable objects. The difference between an iterator and a generator is that a generator is a function that uses yield and can be understood as returning an iterator. An iterator works with iter() and next(): iter() returns an iterator, and next() returns the next element of the iterator object. The elements of an iterator can be collected with list(), set(), and so on, and then printed.

In addition, yield has more flexible uses. It can be written without a value, in which case it yields None. Sometimes, as if to show beginners how harsh the world can be, programs use the yield from construct to simplify a generator. The code is short, but it takes a little thought to read. yield from is used as follows:
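
One way to check these categories for any object is with the abstract base classes in collections.abc. This is a small sketch; classify() and gen() are hypothetical helpers for the demonstration.

```python
from collections.abc import Iterable, Iterator

def classify(obj):
    # Returns (is_iterable, is_iterator) for any object
    return isinstance(obj, Iterable), isinstance(obj, Iterator)

def gen():
    yield 1

print(classify([1, 2]))        # (True, False): iterable, but not an iterator
print(classify(iter([1, 2])))  # (True, True)
print(classify(gen()))         # (True, True): a generator object is an iterator
print(classify(42))            # (False, False)
```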

"""
Format:
def generator(iterable):
    yield from iterable
yield from is followed by an iterable object
It yields the elements of that iterable in order
"""
def fun():
    yield from [1, 12, 23, 34, 45]

for i in fun():
    print(i, end=',')
# 1,12,23,34,45,
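
For plain iteration, yield from behaves like an inner for loop that yields each element. The following sketch compares the two spellings; both function names are made up for this example.

```python
def flatten_with_loop(*iterables):
    for iterable in iterables:
        for item in iterable:
            yield item

def flatten_with_yield_from(*iterables):
    for iterable in iterables:
        yield from iterable      # delegates to each iterable in turn

data = ([1, 12], (23, 34), 'ab')
print(list(flatten_with_loop(*data)))        # [1, 12, 23, 34, 'a', 'b']
print(list(flatten_with_yield_from(*data)))  # [1, 12, 23, 34, 'a', 'b']
```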
  • By the way, I saw some people online say that iterators use less memory than lists, sets and other types, so I tried it with Python 3.7:

    list1 = [1, 12, 23, 34, 45]    # list
    iter1 = iter(list1)            # iterator 
    def fun(lists):                # generator 
        yield from iter(lists)  # yield from lists
    gener1 = fun(list1)
    
    print(f'list1: {type(list1)}, {len(list1)}')
    # list1: <class 'list'>, 5
    print(f'iter1: {type(iter1)}, {len(iter1)}')
    # TypeError: object of type 'list_iterator' has no len()
    print(f'gener1: {type(gener1)}, {len(gener1)}')
    # TypeError: object of type 'generator' has no len()
    

    Lost in thought... Does it really save space? I'm not well-read enough to be sure.

    This approach doesn't work, since len() counts elements rather than bytes. Let's try another way:

    def fun(list1):
        for i in list1:
            print(i,end=' ')
            yield
    
    list1 = [1, 12, 23, 34, 45]
    i = fun(list1)
    print(i)
    # The result is: <generator object fun at 0x0000021f048e23c8>
    

    The print() inside the for loop of fun() was not executed, that is, the function did not run the way we might expect. In fact, calling fun(list1) does not execute the body of fun at all; it returns a generator object! The body only runs once we start iterating:

    for ele in fun(list1):
        print(ele, end='; ')
    # The results were: 1 None; 12 None; 23 None; 34 None; 45 None; 
    
    • The two print() calls use different end values so that their execution order is easy to see. From the output, the print() inside fun() runs before the print() in the for loop, and because yield has no value after it, ele receives None and prints None. The two print() calls then keep alternating.

    • In other words, calling the generator function fun() does not run its body; the body runs only during iteration. The alternating output also shows that fun() does not hand over all elements of list1 at once. It executes print(i, end=' ') first, pauses at yield, and yield passes None to ele. The print(ele, end='; ') in the for loop then runs, and the next iteration begins. The two print() calls alternate in this way until the iteration ends. During this process the generator object fun(list1) itself takes up less space than list1. This may not be rigorous, because the generator object also occupies memory, and here it still refers to list1. However, the generator object does not grow with the number of elements, so when list1 is long enough (1000, 10000... elements), the generator object is certainly smaller than list1.

    I almost forgot about troublemakers such as list(): unlike an explicit for or while loop, they iterate implicitly when they run. Based on the code above, we make some adjustments:

    def fun(list1):
        for i in list1:
            print(i, end=' ')
            yield
    
    list1 = [1, 12, 23, 34, 45]
    i = fun(list1)
    i = list(i)  # i = set(i)
    print('\n+++++++++++++++')
    print(i)  
    # 1 12 23 34 45 
    # +++++++++++++++
    # [None, None, None, None, None]
    

    This clearly shows that list() and set() iterate "implicitly", and it also explains the earlier conflict between list() and next(). Interested readers can look up how these functions are implemented.
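
Returning to the memory question: len() cannot measure it, but sys.getsizeof() gives a rough idea. The sketch below is only an approximation, since sys.getsizeof() reports the size of the container object itself and not of the elements it refers to, and the exact numbers depend on the Python version. squares() is a hypothetical generator that computes values lazily instead of storing them.

```python
import sys

def squares(n):
    # Lazily yields n values without storing them anywhere
    for i in range(n):
        yield i * i

small_list = [i * i for i in range(10)]
big_list = [i * i for i in range(100_000)]
small_gen = squares(10)
big_gen = squares(100_000)

# The list object grows with its length; the generator object stays small
print(sys.getsizeof(small_list), sys.getsizeof(big_list))
print(sys.getsizeof(small_gen), sys.getsizeof(big_gen))
```

The saving only materializes when the values are produced on demand like this. Wrapping an already existing list in a generator does not free the list.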

lambda function

Use of simple lambda

Anonymous function (lambda): a function or subroutine that is defined without an identifier (function name).

A lambda function is an anonymous function written in the following format:

"""
lambda parameter_list: expression
 The parameter list can be empty; multiple parameters are separated by ','
 There must be exactly one expression, one that could be written on a single return line of an ordinary function definition
 A lambda expression evaluates to a function object.
"""
a = lambda : print("1")  # Define a lambda function and bind the resulting function object to the name a

# Pay attention to distinguish between a and a()
print(a)  # Output function object
# <function <lambda> at 0x000001EB61A680D8>
print(type(a)) 
# <class 'function'>
a()  # Call function
# 1
print(a()) # Call the function and return the value of the expression. The value of the expression print("1") is None
# 1
# None
print(type(a()))
# 1
# <class 'NoneType'>

It can be understood as follows: a is the name bound to the anonymous lambda function, and a() is the value of the expression returned when the lambda function is executed.

aa = lambda : 1  # The expression is always 1
print(aa)
print(aa())
# <function <lambda> at 0x0000014D7D6180D8>
# 1

aaa = lambda x: x + 1  # The parameter list is no longer empty, but requires an x to be passed in
print(aaa)
print(aaa(2)) # Pass in an argument
# <function <lambda> at 0x000001B55ACBA1F8>
# 3

print((lambda y: y * 2)(3)) # Pass 3 to lambda function and execute
# 6

In essence, a lambda function is just a function. Wherever an ordinary function can be used, a lambda function can be used too, and any lambda function can be rewritten as an ordinary function. For simple, readable, one-off single-line functions, a lambda replaces the boilerplate def...: return... and makes the code more compact. For example, when using the map() function:

aaa = lambda x: x + 1

def bbb(x):
    return x + 1

print(aaa)
print(lambda x: x + 1)
print(bbb)

i = map(aaa, [1, 2, 3])
j = map(lambda x: x + 1, [1, 2, 3])
k = map(bbb, [1, 2, 3])

print(i)
print(j)
print(k)
print(list(i))
print(list(j))
print(list(k))

# <function <lambda> at 0x000002C20A06E948>
# <function <lambda> at 0x000002C20A06E8B8>
# <function bbb at 0x000002C20A06E678>
# <map object at 0x000002C20A22BE88>
# <map object at 0x000002C20A2356C8>
# <map object at 0x000002C20A235748>
# [2, 3, 4]
# [2, 3, 4]
# [2, 3, 4]
  • Note that the first and second lines of output correspond to two different lambda functions. Why are they different? Aren't the two lambda functions identical, from parameters to expression? If you look carefully at the first two lines of output, you will see that the two functions live at different addresses. Each time a lambda expression is evaluated, a new function object is created. In this example the two lambda functions behave the same, but they are distinct objects.
  • Here, the first argument of map() is a function func, and the following arguments are one or more iterables. map() returns an iterator, that is, the map objects in output lines 4-6 can be iterated. map() takes each element e from the iterable in order, passes it to func, and executes func(e). The results are the elements produced by the returned map object.
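
Two properties of map() are easy to miss: it computes nothing until it is iterated, and it accepts several iterables at once. A small sketch follows; add() and the calls list are hypothetical, added only to record when the function actually runs.

```python
calls = []

def add(x, y):
    calls.append((x, y))       # record each call so we can see when it happens
    return x + y

lazy = map(add, [1, 2, 3], [10, 20, 30])
calls_before = list(calls)     # [] : nothing has been computed yet
first = next(lazy)             # 11
rest = list(lazy)              # [22, 33]
again = list(lazy)             # [] : the map object is exhausted

# The same thing with a two-parameter lambda
sums = list(map(lambda x, y: x + y, [1, 2, 3], [10, 20, 30]))
print(calls_before, first, rest, again, sums)
```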

lambda and iter() can be used like this

Finally, here is my original intention to write this article. I came across the following function definition by chance:

def function(*args, **kwgs):
	yield from iter(lambda params: expression, sentinel)

At first glance,
??? What is this

Usually, iter() only needs one iterable argument, such as a list or a set. Why is iter() different here? It turns out that iter() actually has this form:

iter(object[, sentinel]), where the part in [] is optional.

  • object – an object that supports iteration. If sentinel is given, object must instead be a callable object (for example, a function).
  • sentinel – if this second argument is passed, iter() creates an iterator that calls object each time its __next__() method is called. If the value returned by object equals sentinel, StopIteration is raised; otherwise the value is returned.

To be clear, iter() returns an iterator object either way; call it iter01. When we pass two arguments to iter(), the first must be a callable object. Functions can be called, so a function is a callable object. Let's assume the first argument is a function func(). The second argument, sentinel, is compared for equality with each return value of func(), so it is normally a value that func() can return. Sentinel comes into play when iter01 is iterated:

  1. Macro perspective: on every iteration, func() is called and its return value is compared with sentinel. If they are not equal, the return value of func() becomes the next element of iter01; otherwise, the iteration stops.
  2. From the perspective of the class implementation: each iteration implicitly calls the __next__() method defined in the class callable_iterator, to which iter01 belongs. Each time __next__() is called on this iterator object, func() is called. If the return value of func() is not equal to sentinel, __next__() returns that value; otherwise, it raises StopIteration.
x = 0
def bbb():
    global x  # Declare to use the global variable x
    x += 1
    return x

iter01 = iter(bbb, 10)
print(iter01)
for i in iter01:
    print(i, end=' ')
# <callable_iterator object at 0x000001A41A96E688>
# 1 2 3 4 5 6 7 8 9 
  • The object returned by iter(object, sentinel) belongs to the class callable_iterator
  • In the iter(object, sentinel) form, object is called with no arguments. So if object is a function, it must be callable without arguments; if object is a class, its __init__() must likewise not require arguments.
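
The behaviour described above can be approximated in pure Python with a generator. This is only a sketch to aid understanding, not the actual implementation of callable_iterator; iter_until(), bump() and counter are names invented for the example.

```python
def iter_until(func, sentinel):
    """Rough pure-Python equivalent of iter(func, sentinel)."""
    while True:
        value = func()           # called with no arguments on every step
        if value == sentinel:    # compared with ==
            return               # ends the generator (StopIteration for the caller)
        yield value

counter = {'x': 0}

def bump():
    counter['x'] += 1
    return counter['x']

print(list(iter_until(bump, 5)))  # [1, 2, 3, 4]
counter['x'] = 0
print(list(iter(bump, 5)))        # [1, 2, 3, 4]
```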

Now that you know another use of iter(), go back to the beginning of this section

def function(*args, **kwgs):
	yield from iter(lambda params: expression, sentinel)

That covers nearly all the knowledge points involved. Here are two examples to help consolidate them. If you can follow them easily, I believe you have a basic grasp of this material. Technology is as deep as the sea. Let's keep learning and keep going together!

"""EASY pattern"""
def easy_func():
	yield from iter(lambda :f.readline(), "")

f = open('hello.txt', encoding='utf-8')
for i in easy_func():
    print(i, end='')
f.close()
# This program prints the entire hello.txt file line by line
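
A related idiom handles callables that do need an argument: functools.partial can bind the argument first, producing a zero-argument callable for iter(). The sketch below reads a stream in fixed-size chunks; it uses io.StringIO in place of a real file so that it runs on its own, and read_chunks() is a hypothetical helper.

```python
import io
from functools import partial

def read_chunks(stream, size):
    # partial(stream.read, size) is a zero-argument callable;
    # read() returns '' at end of file, which serves as the sentinel
    yield from iter(partial(stream.read, size), '')

stream = io.StringIO('hello world!')
print(list(read_chunks(stream, 5)))  # ['hello', ' worl', 'd!']
```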

"""HARD pattern"""
class Iter1:
    def __iter__(self):
        self.num1 = 1
        return Iter2(self.num1)
        
class Iter2:
    def __init__(self, num):
        self.num2 = num

    def __next__(self):
        self.num2 *= 2
        return self.num2

def last_func(a):
    yield from iter(lambda : next(a), 16)  # use the parameter a rather than the global aa

aa = iter(Iter1())
# print(next(aa), end=' ')
# print(next(aa), end=' ')
# print(next(aa), end=' ')
for i in last_func(aa):
    if i <= 1024:  # Why add an if? Try lowering iter()'s second argument 16 and see
        print(i, end=' ')
    else:
        break
# 2 4 8 

If you found this useful, feel free to like and bookmark it. That motivates me to keep writing posts without filler~

Keywords: Python

Added by edawson003 on Wed, 19 Jan 2022 12:07:43 +0200