Can you tell an iterable, an iterator and a generator apart? What are they, how are they related, what are they used for, and how do you create them? This article explains each in turn.
Iterable
Any object that can be iterated over is an iterable. In short, it is any object that can be used in a for loop, for example:
- All sequence types are iterable
  - String: str
  - List: list
  - Tuple: tuple
  - Bytes object: bytes
  - Array: array.array
  - Memory view: memoryview
  - Byte array: bytearray, etc.
- Some non-sequence types are iterable
  - Dictionary: dict
  - File objects
  - Objects of a custom class that implements the __iter__() or __getitem__() method
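To illustrate the last point, here is a minimal sketch of two custom classes. The names Countdown and Squares are hypothetical, made up for this example. The first is iterable through __iter__(), the second through __getitem__(), which Python calls with the indexes 0, 1, 2, ... until IndexError is raised.

```python
class Countdown:
    """Iterable via __iter__(): produces n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # Delegate to the built-in iterator of a range object.
        return iter(range(self.n, 0, -1))


class Squares:
    """Iterable via __getitem__(): indexed from 0 until IndexError."""

    def __init__(self, size):
        self.size = size

    def __getitem__(self, index):
        if index >= self.size:
            raise IndexError(index)
        return index * index


countdown_items = list(Countdown(3))
square_items = [value for value in Squares(4)]
print(countdown_items)  # [3, 2, 1]
print(square_items)     # [0, 1, 4, 9]
```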
Iterator
An iterator is an object that represents a stream of data; it is the object that actually performs the iteration.
- Every iterator is iterable
- Calling iter() on an iterable returns an iterator
- An iterator must implement the __iter__() method
  - This method returns the iterator object itself
  - Because of this, an iterator is also iterable
- An iterator must implement the __next__() method
  - Each call to next() on the iterator returns the next element in the data stream
  - Once all elements have been consumed, calling next() again raises a StopIteration exception. A for loop catches this exception automatically and exits the loop
# Quickly create an iterator
>>> a = [1, 2, 3]
>>> a_iterator = iter(a)
>>> a_iterator
<list_iterator object at 0x105e6eee0>
# It has both the __iter__ and __next__ methods
>>> [name for name in dir(a_iterator) if name in ('__iter__', '__next__')]
['__iter__', '__next__']
>>> next(a_iterator)
1
>>> next(a_iterator)
2
>>> next(a_iterator)
3
>>> next(a_iterator)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
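As a rough sketch of what a for loop does behind the scenes, the function below drives an iterator by hand with iter() and next() and stops on StopIteration. The name manual_for is hypothetical, and the real loop is implemented inside the interpreter, so treat this as a simplified model only.

```python
def manual_for(iterable, action):
    """Roughly equivalent to: for item in iterable: action(item)."""
    iterator = iter(iterable)      # step 1: get an iterator
    while True:
        try:
            item = next(iterator)  # step 2: ask for the next element
        except StopIteration:
            break                  # step 3: data exhausted, leave the loop
        action(item)


collected = []
manual_for([1, 2, 3], collected.append)
print(collected)  # [1, 2, 3]
```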
Advantages of iterators
An iterator lets you start at the first element and call next() repeatedly until the last element, one element at a time. The advantage is that the elements do not all have to be prepared in advance: each element is computed only when the traversal reaches it. This is lazy evaluation.
- Saves space
On a 64-bit CPython build, an iterator that produces 10**9 elements takes only 48 bytes, while a list of the same length takes about 8 GB.
>>> from itertools import repeat
>>> import sys
>>> iters = repeat(1, times=10**9)
>>> arry = [1] * 10**9
>>> sys.getsizeof(iters)
48
>>> sys.getsizeof(arry)
8000000056
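The lazy behaviour itself can be observed with a small experiment. In this sketch, the helper square and the list computed are hypothetical names introduced only to record when each element is really calculated. map() returns an iterator, so nothing runs until next() is called.

```python
computed = []

def square(x):
    computed.append(x)   # record when the computation really happens
    return x * x

lazy_squares = map(square, range(5))   # map() returns an iterator
before = list(computed)                # nothing has been computed yet

first = next(lazy_squares)             # computes only square(0)
second = next(lazy_squares)            # computes only square(1)
after = list(computed)

print(before)          # []
print(first, second)   # 0 1
print(after)           # [0, 1]
```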
How to create an Iterator
Following the definition of an iterator, we can create one in an object-oriented way by writing a class.
Steps:
- Implement the __iter__() method, which returns the iterator itself
- Implement the __next__() method, which returns the next available item and raises a StopIteration exception when there is no data left
class IterableFile(object):
    files = ['input.txt', 'data.csv', 'test.csv']

    def __init__(self):
        self.idx = 0  # position of the next file to return

    def __iter__(self):
        # An iterator returns itself
        return self

    def __next__(self):
        # No data left: signal the end of the iteration
        if self.idx >= len(self.files):
            raise StopIteration()
        next_file = self.files[self.idx]
        self.idx += 1
        return next_file
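One consequence of returning self from __iter__() is that such an object can be traversed only once. If repeated traversal matters, a common alternative is to let a container's __iter__() return a fresh iterator each time. The sketch below assumes two hypothetical classes, FileList and FileListIterator, and reuses the file names from the example above.

```python
class FileListIterator:
    """Iterator: keeps the position and implements __iter__ and __next__."""

    def __init__(self, files):
        self._files = files
        self._idx = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._idx >= len(self._files):
            raise StopIteration
        name = self._files[self._idx]
        self._idx += 1
        return name


class FileList:
    """Iterable: holds the data and hands out a new iterator per traversal."""

    def __init__(self, files):
        self._files = list(files)

    def __iter__(self):
        return FileListIterator(self._files)


file_list = FileList(['input.txt', 'data.csv', 'test.csv'])
first_pass = list(file_list)
second_pass = list(file_list)    # works again: a fresh iterator is created

one_shot = iter(file_list)
list(one_shot)                   # consume it completely
leftover = list(one_shot)        # an exhausted iterator yields nothing

print(first_pass == second_pass)  # True
print(leftover)                   # []
```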
With the object-oriented approach you can write an iterator for a custom class. It looks cool, but it is not how iterators are usually created. The most common way is to create a generator.
Generator
A generator is the most convenient way to create an iterator. A generator is in fact a special kind of iterator, but you do not need to write the __iter__ and __next__ methods yourself as you do for a class-based iterator.
There are two ways to create a generator:
Generator expression
A generator expression follows the iterator protocol under the hood, so it produces elements one by one instead of building a complete list first. The syntax is similar to a list comprehension, except that the square brackets are replaced by parentheses.
>>> mylist = [x*x for x in range(3)]
>>> mylist
[0, 1, 4]
>>> mygenerator = (x*x for x in range(3))
>>> mygenerator
<generator object <genexpr> at 0x102ebcf20>
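A short sketch of how such a generator expression behaves once it is consumed. The variable names here are illustrative only. Note that a generator can be traversed just once, and that a generator expression can be passed directly to a function such as sum().

```python
squares = (x * x for x in range(3))

first_value = next(squares)      # 0, computed on demand
remaining = list(squares)        # [1, 4], the rest of the stream
exhausted = list(squares)        # [] because a generator runs only once

# A generator expression can be passed straight to a function
# without building an intermediate list.
total = sum(x * x for x in range(3))

print(first_value, remaining, exhausted, total)  # 0 [1, 4] [] 5
```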
yield expression
If a function contains a yield expression, it is a generator function. Calling it returns a special iterator called a generator.
>>> def count(start=0):
...     num = start
...     while True:
...         yield num
...         num += 1
...
>>> c = count()
>>> c
<generator object count at 0x10e04e870>
>>> next(c)
0
>>> next(c)
1
- yield is a keyword similar to return. When execution reaches a yield, the generator returns the value to the right of yield and pauses. The key point: on the next call to next(), execution resumes from the code right after the yield where it paused last time
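The pause-and-resume behaviour can be made visible by recording each step. In this sketch, the generator function two_steps and the list trace are hypothetical names used only for the demonstration.

```python
trace = []

def two_steps():
    trace.append('start')
    yield 1
    trace.append('resumed after first yield')
    yield 2
    trace.append('resumed after second yield')


gen = two_steps()
trace.append('generator created')   # the function body has not run yet

value1 = next(gen)                   # runs up to the first yield
value2 = next(gen)                   # resumes, runs up to the second yield
try:
    next(gen)                        # resumes, reaches the end of the function
    finished = False
except StopIteration:
    finished = True

print(value1, value2, finished)      # 1 2 True
for line in trace:
    print(line)
```

The trace starts with 'generator created', which shows that calling a generator function does not execute its body; the body only starts running on the first next().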
Summary
This article covered three concepts: iterable, iterator and generator. How are they related? Every generator is an iterator, and every iterator is an iterable, but not the other way around.
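The relationship can also be checked in code. This is a minimal sketch that uses the abstract base classes in the standard collections.abc module; the helper describe is a hypothetical name introduced for the example.

```python
from collections.abc import Generator, Iterable, Iterator

def describe(obj):
    """Return (is_iterable, is_iterator, is_generator) for obj."""
    return (
        isinstance(obj, Iterable),
        isinstance(obj, Iterator),
        isinstance(obj, Generator),
    )

a_list = [1, 2, 3]
a_list_iterator = iter(a_list)
a_generator = (x for x in a_list)

print(describe(a_list))           # (True, False, False)
print(describe(a_list_iterator))  # (True, True, False)
print(describe(a_generator))      # (True, True, True)
```

One caveat: isinstance(obj, Iterable) only detects classes that define __iter__(). An object that is iterable solely through __getitem__() is not recognized this way, so calling iter(obj) is the more reliable test.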