[reading notes] Reading notes on Fluent Python 3.1-3.4

1, Generic mapping types (3.1)
The collections.abc module provides two abstract base classes, Mapping and MutableMapping, which formalize the interfaces of dict and similar types (from Python 2.6 through Python 3.2, these classes lived in the collections module rather than in collections.abc).
Concrete mapping types generally do not inherit from these abstract base classes directly; they extend dict or collections.UserDict instead. The main value of the ABCs is as formal documentation: they define the minimal interface a mapping type must provide. They can also be used with isinstance to test whether an object is a mapping in the broad sense:

from collections import abc
my_dict = {}
print(isinstance(my_dict, abc.Mapping))

Output: True
isinstance is used here rather than a check such as type(arg) is dict because the argument may not be a dict at all, but an alternative mapping type. All mapping types in the standard library use dict in their implementation, so they share its limitation: keys must be hashable (this applies to keys only; values need not be hashable).
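To see why isinstance is the better test, here is a small sketch that compares it with an exact type check on a few standard library objects. The candidates dictionary and its labels are made up for this illustration.

```python
from collections import OrderedDict, UserDict, abc
from types import MappingProxyType

# Hypothetical sample objects, chosen only for illustration.
candidates = {
    "dict": {},
    "OrderedDict": OrderedDict(),
    "UserDict": UserDict(),
    "MappingProxyType": MappingProxyType({}),
    "list of pairs": [("a", 1)],
}

# For each object: (exact dict?, is a Mapping?, is a MutableMapping?)
results = {
    name: (
        type(obj) is dict,
        isinstance(obj, abc.Mapping),
        isinstance(obj, abc.MutableMapping),
    )
    for name, obj in candidates.items()
}

for name, row in results.items():
    print(name, row)
```

Only the plain dict passes the exact type check, while every mapping passes the abc.Mapping test. MappingProxyType is a read-only view, so it is a Mapping but not a MutableMapping.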
What are hashable data types?
An object is hashable if it has a hash value that never changes during its lifetime (it needs a __hash__() method) and it can be compared with other objects (it needs an __eq__() method). Hashable objects that compare equal must have the same hash value.
The atomic immutable types (str, bytes and the numeric types) are all hashable. A frozenset is always hashable because, by definition, its elements must themselves be hashable. A tuple is hashable only if all of its elements are hashable.

	>>> tt = (1, 2, (30, 40))
	>>> hash(tt)
	8027212646858338501
	>>> tl = (1, 2, [30, 40])
	>>> hash(tl)
	Traceback (most recent call last):
	  File "<stdin>", line 1, in <module>
	TypeError: unhashable type: 'list'
	>>> tf = (1, 2, frozenset([30, 40]))
	>>> hash(tf)
	-4118419923444501110

User-defined types are hashable by default: the hash value is derived from id(), and each instance compares equal only to itself. If a class implements a custom __eq__ that depends on the object's internal state, its instances should be hashable only if all of that state is immutable.
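A minimal sketch of these rules follows. The classes Plain, Point and FrozenPoint are hypothetical names invented for this note. One detail the sketch relies on: in Python 3, a class that defines __eq__ without defining __hash__ gets its __hash__ set to None, so its instances are unhashable.

```python
class Plain:
    """No __eq__ or __hash__ defined: both are inherited from object."""


class Point:
    """Defines __eq__ only, so Python sets __hash__ to None."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


class FrozenPoint:
    """Defines __eq__ and a matching __hash__ over state treated as immutable."""

    def __init__(self, x, y):
        self._xy = (x, y)  # by convention, never reassigned after __init__

    def __eq__(self, other):
        return isinstance(other, FrozenPoint) and self._xy == other._xy

    def __hash__(self):
        return hash(self._xy)


a, b = Plain(), Plain()
plain_hashable = isinstance(hash(a), int)  # default hash works
plain_equal = a == b                       # distinct instances are not equal

try:
    hash(Point(1, 2))
    point_hashable = True
except TypeError:
    point_hashable = False

# Equal objects have equal hashes, so an equal key finds the entry.
same_hash = hash(FrozenPoint(1, 2)) == hash(FrozenPoint(1, 2))
lookup = {FrozenPoint(1, 2): "found"}[FrozenPoint(1, 2)]
```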
2, Dict comprehensions (3.2)
A dict comprehension (dictcomp) builds a dictionary from any iterable whose elements are key-value pairs.

DIAL_CODES = [  # A list of (code, country) pairs; it can be passed directly to the dict constructor.
    (86, 'China'),
    (91, 'India'),
    (1, 'United States'),
    (62, 'Indonesia'),
    (55, 'Brazil'),
    (92, 'Pakistan'),
    (880, 'Bangladesh'),
    (234, 'Nigeria'),
    (7, 'Russia'),
    (81, 'Japan'),
]
country_code = {country: code for code, country in DIAL_CODES}  # Swap each pair: the country name is the key, the dial code is the value.
print(country_code)
print({code: country.upper()  # Swap back: the dial code is the key and the country name is uppercased;
       for country, code in country_code.items()
       if code < 66})  # only entries with a code below 66 are kept.

Output:

{'China': 86, 'India': 91, 'United States': 1, 'Indonesia': 62, 'Brazil': 55, 'Pakistan': 92, 'Bangladesh': 880, 'Nigeria': 234, 'Russia': 7, 'Japan': 81}
{1: 'UNITED STATES', 62: 'INDONESIA', 55: 'BRAZIL', 7: 'RUSSIA'}
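The source of the pairs does not have to be a list. As a small sketch (the names and codes lists below are a cut-down sample chosen for this note), any iterable that yields two-item tuples works, for example zip:

```python
names = ["China", "India", "Brazil"]
codes = [86, 91, 55]

# zip yields (name, code) tuples lazily; the dictcomp consumes them one by one.
by_name = {name: code for name, code in zip(names, codes)}

# The dict constructor accepts the same kind of iterable directly.
same = dict(zip(names, codes))

# A dictcomp is the better fit when keys or values are transformed or filtered.
short = {name.lower(): code for name, code in zip(names, codes) if code > 60}

print(by_name)
print(short)
```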

3, Common mapping methods (3.3)
The table below lists the methods of dict, defaultdict and OrderedDict. The latter two are variations of dict defined in the collections module.

Unless marked otherwise, each entry is available on dict, defaultdict and OrderedDict.

Method                        Description
d.clear()                     Remove all items
d.__contains__(k)             k in d: check whether k is in d
d.copy()                      Shallow copy
d.__copy__()                  Support for copy.copy (defaultdict only)
d.default_factory             Callable invoked by __missing__ to set the value of a missing key (defaultdict only) [1]
d.__delitem__(k)              del d[k]: remove the item with key k
d.fromkeys(it, [initial])     New mapping with the items of the iterable it as keys; each key is mapped to initial if given (the default is None)
d.get(k, [default])           Return the value for key k; if k is missing, return default or None
d.__getitem__(k)              d[k]: return the value for key k
d.items()                     Return a view of all (key, value) pairs in d
d.__iter__()                  Get an iterator over the keys
d.keys()                      Return a view of all keys
d.__len__()                   len(d): the number of items
d.__missing__(k)              Called when __getitem__ cannot find the key (defaultdict only)
d.move_to_end(k, [last])      Move the item with key k to the last or first position; last defaults to True (OrderedDict only)
d.pop(k, [default])           Remove the item with key k and return its value; if k is missing, return default, or raise KeyError when no default is given
d.popitem()                   Remove and return a (key, value) pair; a plain dict removes the most recently inserted pair since Python 3.7 [2]
d.__reversed__()              Get an iterator over the keys in reverse order (OrderedDict; plain dict also supports reversed() since Python 3.8)
d.setdefault(k, [default])    If k is in d, return d[k]; otherwise set d[k] = default and return default
d.__setitem__(k, v)           d[k] = v: set the value for key k to v
d.update(m, [**kargs])        Update d with the items of m, which can be a mapping or an iterable of (key, value) pairs
d.values()                    Return a view of all values

[1] default_factory is not a method, but a callable attribute set by the user when the defaultdict is instantiated.
[2] OrderedDict.popitem() removes the last inserted item by default (last in, first out); it also accepts an optional last argument which, if set to False, removes the first inserted item (first in, first out).
In the table above, the way update handles its first argument m is a typical case of duck typing. It first checks whether m has a keys method; if so, m is treated as a mapping. Otherwise, update falls back to iterating over m, assuming its items are (key, value) pairs. The constructors of most Python mapping types use similar logic, so a mapping can be initialized either from another mapping or from an iterable of (key, value) pairs.
Among the mapping methods, setdefault is a subtle one. We do not need it all the time, but when we do, it avoids redundant key lookups and so makes the program more efficient.
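The duck typing in update can be sketched as follows. Celsius is a hypothetical class invented for this note: it does not inherit from dict or Mapping, it only offers keys() and __getitem__, which is what update and the dict constructor look for.

```python
class Celsius:
    """Hypothetical class: not a Mapping subclass, but it quacks like one."""

    _data = {"boil": 100, "freeze": 0}

    def keys(self):
        return self._data.keys()

    def __getitem__(self, key):
        return self._data[key]


d = {}
d.update(Celsius())        # has keys(): treated as a mapping
d.update([("body", 37)])   # no keys(): treated as an iterable of (key, value) pairs
d.update(room=20)          # keyword arguments are accepted too

from_obj = dict(Celsius())  # the constructor applies the same logic
print(d)
print(from_obj)
```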
Handling missing keys with setdefault
Python raises an exception when d[k] cannot find the key k, in line with its "fail fast" philosophy. Every Python programmer knows that d.get(k, default) is an alternative to d[k] when a default value is more convenient than handling KeyError. However, when the value has to be updated after it is retrieved, using either __getitem__ or get is awkward and inefficient.

"""Create a mapping from a word to its occurrence""" 
import sys
import re
WORD_RE = re.compile(r'\w+')
index = {}
with open(sys.argv[1], encoding='utf-8') as fp:
	for line_no, line in enumerate(fp, 1):
		for match in WORD_RE.finditer(line):
			word = match.group()
			column_no = match.start() + 1
			location = (line_no, column_no)
			# This is a deliberately poor implementation, written only to make a point
			occurrences = index.get(word, [])  # Get the list of occurrences for word, or [] if there is none yet.
			occurrences.append(location)  # Append the new location to the list.
			index[word] = occurrences  # Put the list back into the dictionary; this costs a second lookup.
# Print the results in alphabetical order
for word in sorted(index, key=str.upper):  # key= does not call str.upper here; a reference to the method is passed so that sorted can normalize the words while sorting.
	print(word, index[word])

The same program, rewritten with dict.setdefault:

"""Create a mapping from a word to its occurrence""" 
import sys
import re
WORD_RE = re.compile(r'\w+')
index = {}
with open(sys.argv[1], encoding='utf-8') as fp:
	for line_no, line in enumerate(fp, 1):
		for match in WORD_RE.finditer(line):
			word = match.group()
			column_no = match.start() + 1
			location = (line_no, column_no)
			index.setdefault(word, []).append(location)  # Get the list of occurrences for word, or set it to [] if it is missing; setdefault returns the list, so it can be updated without a second lookup.
# Print the results in alphabetical order
for word in sorted(index, key=str.upper):
	print(word, index[word])

In other words, this line:

my_dict.setdefault(key, []).append(new_value)

gives the same result as:

if key not in my_dict:
	my_dict[key] = []
my_dict[key].append(new_value)

The second version, however, performs at least two lookups of key (three if the key is missing), while setdefault does everything with a single lookup.
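One rough way to check the lookup counts is to use a key whose __hash__ counts its own calls, since every dict lookup has to hash the key. This is only a sketch: CountingKey, with_setdefault, with_membership_test and count_lookups are hypothetical helpers written for this note, and the exact numbers are an implementation detail of CPython rather than a language guarantee.

```python
class CountingKey:
    """Hypothetical key type that counts how many times it is hashed."""

    def __init__(self, name):
        self.name = name
        self.hash_calls = 0

    def __hash__(self):
        self.hash_calls += 1
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.name == other.name


def with_setdefault(d, key, value):
    d.setdefault(key, []).append(value)


def with_membership_test(d, key, value):
    if key not in d:
        d[key] = []
    d[key].append(value)


def count_lookups(func, key_present):
    """Return how many times the key is hashed while func runs."""
    key = CountingKey("spam")
    d = {}
    if key_present:
        d[key] = []
    key.hash_calls = 0  # ignore the hashing done during setup
    func(d, key, 1)
    return key.hash_calls


counts = {
    ("setdefault", "missing"): count_lookups(with_setdefault, False),
    ("setdefault", "present"): count_lookups(with_setdefault, True),
    ("membership", "missing"): count_lookups(with_membership_test, False),
    ("membership", "present"): count_lookups(with_membership_test, True),
}
for case, n in counts.items():
    print(case, n)
```

On CPython, the setdefault version hashes the key once in both cases, while the explicit version hashes it twice when the key is present and three times when it is missing.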
4, Mappings with flexible key lookup (3.4)
(1) defaultdict: another take on missing keys
When instantiating a defaultdict, you pass the constructor a callable. That callable is invoked whenever __getitem__ is given a key that cannot be found, and the value it produces is what __getitem__ returns.
For example, suppose we create dd = defaultdict(list). If the key 'new key' is not in dd, the expression dd['new key'] goes through the following steps:
(1) Call list() to create a new list.
(2) Insert the new list into dd with 'new key' as its key.
(3) Return a reference to that list.
The callable that produces the default values is stored in an instance attribute named default_factory.
The default_factory of a defaultdict is only invoked by __getitem__, not by other methods. For example, if dd is a defaultdict and k is a missing key, dd[k] calls default_factory to create a default value, whereas dd.get(k) still returns None.
The mechanism that makes this work is the special method __missing__: it is what calls default_factory when a defaultdict meets a missing key, and it is a feature that any mapping type may choose to support.
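The behavior described above can be checked with a short sketch. The variable names are chosen for this note only.

```python
from collections import defaultdict

dd = defaultdict(list)

first = dd["new key"]      # missing key: list() is called, stored in dd, and returned
first.append(1)
stored_is_same = dd["new key"] is first  # the very same list now lives in dd

got = dd.get("other")      # get does not trigger default_factory
has_other = "other" in dd  # so no entry was created for "other"

no_factory = defaultdict()  # default_factory is None
try:
    no_factory["x"]
    raised = False
except KeyError:
    raised = True           # without a factory, the usual KeyError is raised

print(dict(dd), got, has_other, raised)
```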
(2) The special method __missing__
The base dict class does not define __missing__, but it is aware of it: if a class inherits from dict and provides a __missing__ method, then __getitem__ calls it whenever a key cannot be found, instead of raising KeyError.

class StrKeyDict0(dict):  # StrKeyDict0 inherits from dict.
    def __missing__(self, key):
        if isinstance(key, str):  # If the missing key is already a str, raise KeyError.
            raise KeyError(key)
        return self[str(key)]  # Otherwise build a str from the key and look it up.

    def get(self, key, default=None):
        try:
            return self[key]  # Delegate to __getitem__ via self[key], which gives __missing__ a chance to act before the lookup is declared failed.
        except KeyError:
            return default  # A KeyError here means __missing__ also failed, so return default.

    def __contains__(self, key):
        # Search for the unmodified key first (the mapping may hold non-str keys); if it is not found, search for str(key).
        return key in self.keys() or str(key) in self.keys()

Why is the isinstance(key, str) test required in __missing__ above?
Without this test, __missing__ would work fine for any key k, str or not, as long as str(k) produced an existing key. But if str(k) is not an existing key, the code would recurse forever: self[str(key)] in the last line of __missing__ calls __getitem__, which cannot find str(key) either, so __missing__ is called again.
The __contains__ method is also needed for consistent behavior, because the k in d operation calls it, and the __contains__ inherited from dict does not fall back on __missing__ when the key cannot be found. A subtle detail in __contains__: we do not check for the key in the usual Pythonic way (k in my_dict), because str(key) in self would call __contains__ recursively. To avoid that, the lookup is done explicitly in self.keys().
The check on the unmodified key (key in self.keys()) is needed for correctness, because StrKeyDict0 does not force keys to be strings when items are created or added. The aim of this example is only to make lookups friendlier, not to enforce key types.
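To tie the pieces together, here is a short usage sketch. The class is repeated so that the snippet runs on its own; the sample keys and values are made up for this note.

```python
class StrKeyDict0(dict):
    def __missing__(self, key):
        if isinstance(key, str):
            raise KeyError(key)
        return self[str(key)]

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default

    def __contains__(self, key):
        return key in self.keys() or str(key) in self.keys()


d = StrKeyDict0([("2", "two"), ("4", "four")])

by_str = d["2"]   # found directly by __getitem__
by_int = d[4]     # missing as 4, so __missing__ retries with "4"

try:
    d[1]
    missing_raises = False
except KeyError:
    missing_raises = True  # "1" is missing too, so KeyError is raised

get_int = d.get(4)               # get goes through __getitem__ and __missing__
get_default = d.get(1, "N/A")    # both lookups fail, so the default is returned
contains = (2 in d, "2" in d, 1 in d)

print(by_str, by_int, missing_raises, get_int, get_default, contains)
```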

Keywords: Python

Added by vtolbert on Wed, 24 Nov 2021 05:53:52 +0200