Python iterables, iterators and generators explained in detail

Can you tell an iterable, an iterator and a generator apart? What is each of them, how are they related, what are they used for, and how do you create them? This article explains them one by one.

Iterable

Any object that can be iterated over is an iterable. In short, it is any object that can be used in a for loop, for example:

  • All sequence types are iterable
    • String (str)
    • List (list)
    • Tuple (tuple)
    • Bytes object (bytes)
    • Array (array.array)
    • Memory view (memoryview)
    • Byte array (bytearray), etc.
  • Some non-sequence types are iterable
    • Dictionary (dict)
    • File object
    • Objects of a custom class, as long as the class implements the __iter__() or __getitem__() method
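As a minimal sketch of the last point, the hypothetical Playlist class below (the name and data are made up for illustration) defines only __getitem__, yet a for loop can still walk over it. The built-in types listed above behave the same way from the caller's point of view.

```python
class Playlist:
    """Hypothetical class: iterable only because it defines __getitem__."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __getitem__(self, index):
        # The underlying list raises IndexError past the end,
        # which tells the iteration machinery to stop.
        return self._songs[index]


playlist = Playlist(["intro", "verse", "outro"])

# Without __iter__, Python falls back to calling __getitem__ with 0, 1, 2, ...
titles = [song for song in playlist]
print(titles)

# Built-in iterables can be consumed in the same way.
print(list("abc"))             # a string yields its characters
print(list({"x": 1, "y": 2}))  # a dict yields its keys
```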

Iterator

An iterator is an object that represents a stream of data; it is the object that actually performs the iteration.

  • Every iterator is iterable
  • Calling iter() on an iterable returns an iterator
  • An iterator object must implement the __iter__() method
    • This method returns the iterator object itself
    • Because of this, an iterator is itself iterable
  • An iterator object must implement the __next__() method
    • Each call to next() on the iterator returns the next element in the data stream
    • Once all elements have been consumed, calling next() again raises a StopIteration exception. A for loop catches this exception automatically and exits the loop
# Quickly create an iterator
>>> a = [1, 2, 3]
>>> a_iterator = iter(a)
>>> a_iterator
<list_iterator object at 0x105e6eee0>
# It has both the __iter__ and __next__ methods
>>> '__iter__' in dir(a_iterator), '__next__' in dir(a_iterator)
(True, True)
>>> next(a_iterator)
1
>>> next(a_iterator)
2
>>> next(a_iterator)
3
>>> next(a_iterator)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
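To make the link with the for loop concrete, here is a rough sketch of what a for loop does with the iterator protocol. The helper name manual_for is hypothetical and exists only for this illustration; the real loop is implemented inside the interpreter.

```python
def manual_for(iterable, action):
    """Roughly what `for item in iterable: action(item)` does."""
    iterator = iter(iterable)          # step 1: obtain an iterator
    while True:
        try:
            item = next(iterator)      # step 2: ask for the next element
        except StopIteration:
            break                      # step 3: exhausted, leave the loop quietly
        action(item)


collected = []
manual_for([1, 2, 3], collected.append)
print(collected)
```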

Advantages of iterators

An iterator lets you start at the first element and call next() repeatedly until you reach the last one, one element at a time. The advantage is that the elements do not all have to be prepared in advance: each element is computed only when the traversal reaches it. This is lazy evaluation.
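The laziness can be observed directly. In this small sketch, the square function and the calls list are hypothetical names used only to record when each computation actually happens; map() is a built-in that returns an iterator.

```python
calls = []  # records which inputs have really been processed


def square(x):
    calls.append(x)
    return x * x


lazy_squares = map(square, [1, 2, 3])  # nothing is computed yet
calls_before = list(calls)

first = next(lazy_squares)             # computes the first element only
calls_after = list(calls)

print(calls_before)  # []
print(first)         # 1
print(calls_after)   # [1]
```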

  • Saves space

In the session below (a 64-bit CPython build), an iterator that yields 10**9 elements occupies only 48 bytes, while a list with the same number of elements needs about 8 GB. The exact numbers can vary with the Python version and platform.

>>> from itertools import repeat
>>> import sys
>>> iters = repeat(1, times=10**9)
>>> arry = [1] * 10**9
>>> sys.getsizeof(iters)
48
>>> sys.getsizeof(arry)
8000000056
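Building a list of 10**9 elements needs several gigabytes of RAM, so the comparison is easier to try at a smaller scale. The sketch below uses 10**6 elements, a size chosen here purely for convenience. Note that sys.getsizeof reports only the size of the object itself, not of the elements it refers to.

```python
import sys
from itertools import repeat

n = 10**6  # small enough to run comfortably on most machines

lazy = repeat(1, times=n)   # iterator: stores the value and a counter
eager = [1] * n             # list: stores n references

lazy_size = sys.getsizeof(lazy)
eager_size = sys.getsizeof(eager)

# The iterator's size does not depend on n; the list's size grows with n.
print(lazy_size, eager_size)
```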

How to create an iterator

Following the definition of an iterator, we can create one in an object-oriented way by writing a class.

Steps:

  1. Implement the __iter__() method, which returns the iterator itself
  2. Implement the __next__() method, which returns the next available item and raises a StopIteration exception once there are no more items
class IterableFile(object):
    files = ['input.txt', 'data.csv', 'test.csv']

    def __init__(self):
        self.idx = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.idx >= len(self.files):
            raise StopIteration()
        next_file = self.files[self.idx]
        self.idx += 1
        return next_file
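A short usage sketch may help here. It repeats the class so that the snippet runs on its own, and it shows one consequence of __iter__ returning self: the object can be traversed only once, and a fresh instance is needed for a second pass.

```python
class IterableFile:
    files = ['input.txt', 'data.csv', 'test.csv']

    def __init__(self):
        self.idx = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.idx >= len(self.files):
            raise StopIteration
        next_file = self.files[self.idx]
        self.idx += 1
        return next_file


it = IterableFile()
first_pass = list(it)            # consumes the iterator
second_pass = list(it)           # already exhausted, so nothing is left
fresh = list(IterableFile())     # a new instance starts from the beginning

print(first_pass)    # ['input.txt', 'data.csv', 'test.csv']
print(second_pass)   # []
print(fresh)         # ['input.txt', 'data.csv', 'test.csv']
```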

With the object-oriented approach you can write an iterator for a custom class. This looks neat, but it is not how iterators are usually created. The most common way is to write a generator.

Generator

A generator is the most convenient way to create an iterator. A generator is in fact a special kind of iterator, but you do not have to write the __iter__ and __next__ methods yourself as you do for a class-based iterator; Python provides them.
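For comparison, here is a sketch of a file-name iterator written as a generator function (the yield form described later in this article). The function name iter_files and the file names are illustrative only. The resulting object has both protocol methods even though neither was written by hand.

```python
def iter_files():
    """Generator version of a file-name iterator (names are illustrative)."""
    for name in ['input.txt', 'data.csv', 'test.csv']:
        yield name


gen = iter_files()
is_own_iterator = iter(gen) is gen     # __iter__ returns the generator itself
has_next = hasattr(gen, '__next__')    # __next__ is provided automatically

print(is_own_iterator, has_next)  # True True
print(next(gen))                  # input.txt
print(list(gen))                  # ['data.csv', 'test.csv']
```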

There are two ways to create a generator

Generator expression

A generator expression follows the iterator protocol, so it produces elements one at a time instead of building a complete list first. Its syntax is the same as that of a list comprehension, except that the square brackets are replaced by parentheses.

>>> mylist = [x*x for x in range(3)]
>>> mylist
[0, 1, 4]
>>> mygenerator = (x*x for x in range(3))
>>> mygenerator
<generator object <genexpr> at 0x102ebcf20>
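The following sketch shows how such a generator expression is consumed. Like any iterator it can be traversed only once, and it can be passed straight to a function such as sum() without building a list.

```python
mygenerator = (x * x for x in range(3))

first = next(mygenerator)    # 0
rest = list(mygenerator)     # the remaining elements: [1, 4]
again = list(mygenerator)    # exhausted, so this is []

# When a generator expression is the only argument of a call,
# the call's own parentheses are enough.
total = sum(x * x for x in range(3))

print(first, rest, again, total)
```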

yield expression

If a function contains a yield expression, it is a generator function; calling it does not run the function body but returns a special iterator called a generator.

>>> def count(start=0):
...     num = start
...     while True:
...         yield num
...         num += 1
...
>>> c = count()
>>> c
<generator object count at 0x10e04e870>
>>> next(c)
0
>>> next(c)
1
  • yield is a keyword similar to return. When execution reaches a yield, the generator hands the value to the right of yield back to the caller and pauses. The key point is that on the next iteration, execution resumes from the code right after the yield where it paused, with its local variables intact.
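The pause-and-resume behaviour can be traced with a small sketch. The steps function and the events list are hypothetical names used only to log the order in which the body runs.

```python
events = []  # log of what the generator body has executed so far


def steps():
    events.append("start")
    yield 1
    events.append("resumed after first yield")
    yield 2
    events.append("resumed after second yield")


g = steps()
events_at_creation = list(events)   # []: calling the function runs none of its body

first = next(g)    # runs up to the first yield
second = next(g)   # resumes after the first yield, runs up to the second

try:
    next(g)        # resumes after the second yield, then the body ends
except StopIteration:
    events.append("finished")

print(events_at_creation)
print(first, second)
print(events)
```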

Summary

This article covered three concepts: iterable, iterator and generator. Their relationship is one of nesting: every generator is an iterator, and every iterator is an iterable, but not the other way round.

Added by kelvin on Tue, 01 Feb 2022 02:06:34 +0200