Python serialization and deserialization
Serializing an object lets you store it in a variable or a file, preserving its state at that moment and extending its life cycle beyond the running program. The object can be read back whenever it is needed. Python has several common modules that implement this.
pickle module
Stored in variables
dumps(obj) returns the serialized object as bytes.
import pickle

dic = {'age': 23, 'job': 'student'}
byte_data = pickle.dumps(dic)
print(byte_data)  # out -> b'\x80\x03}q\x00(X\x03\x00\x00\...'
Read data
The data is held in the byte_data variable as bytes. When it needs to be used again, call the loads function.
obj = pickle.loads(byte_data)
print(obj)
Stored in files
Objects can also be written to files to make them persistent, using the dump and load functions. Note the difference from the previous pair: these names have no trailing s. Because pickle writes binary data, the file must be opened in wb and rb modes.
# Serialize
with open('abc.pkl', 'wb') as f:
    dic = {'age': 23, 'job': 'student'}
    pickle.dump(dic, f)

# Deserialize
with open('abc.pkl', 'rb') as f:
    aa = pickle.load(f)
    print(aa)
    print(type(aa))  # <class 'dict'>
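As a small self-contained sketch of the same round trip, the snippet below writes to a temporary directory instead of the working directory (the use of tempfile is my assumption, chosen only to keep the example tidy). It also shows that load gives back an equal but separate object.

```python
import os
import pickle
import tempfile

dic = {'age': 23, 'job': 'student'}

# Write the pickle into a throwaway directory that is removed afterwards.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'abc.pkl')

    # Binary write mode, because pickle produces bytes.
    with open(path, 'wb') as f:
        pickle.dump(dic, f)

    # Binary read mode to load it back.
    with open(path, 'rb') as f:
        restored = pickle.load(f)

print(restored == dic)   # True: same content
print(restored is dic)   # False: a new object was built
```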
Serialize user-defined objects
Suppose I write a class called Person.
class Person:
    def __init__(self, name, age, job):
        self.name = name
        self.age = age
        self.job = job

    def work(self):
        print(self.name, 'is working...')
pickle can serialize not only an instance of the class but also the class itself. Note that pickle stores a class by reference (its module and name), not its code, so the class definition must still be importable when the data is loaded.
# Store an instance in a variable (it could equally be stored in a file)
a_person = Person('abc', 22, 'waiter')
person_abc = pickle.dumps(a_person)
p = pickle.loads(person_abc)
p.work()

# Store the class itself in a variable; loads returns the class, not an instance of it
class_Person = pickle.dumps(Person)
Person = pickle.loads(class_Person)
p = Person('Bob', 23, 'Student')
p.work()

# The following example stores the class in a file
# Serialize
with open('person.pkl', 'wb') as f:
    pickle.dump(Person, f)

# Deserialize
with open('person.pkl', 'rb') as f:
    Person = pickle.load(f)
    aa = Person('gg', 23, '6')
    aa.work()
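One caveat worth checking for yourself: pickle handles most built-in and user-defined types, but a few kinds of object cannot be pickled at all. The sketch below uses a generator as the example (my choice of example, not from the original post) and shows the usual workaround of converting it to a plain list first.

```python
import pickle

gen = (i for i in range(3))

# Generators hold live execution state, so pickle refuses them.
try:
    pickle.dumps(gen)
    picklable = True
except TypeError as exc:
    picklable = False
    print('cannot pickle:', exc)

# Workaround: materialize the values into a list, which pickles fine.
data = pickle.dumps(list(gen))
print(pickle.loads(data))  # [0, 1, 2]
```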
json module
pickle can easily serialize most Python objects. However, json is a more standard format: it is human-readable (pickle output is binary) and works across platforms and languages, which makes it a good choice for exchanging data.
json uses the same four function names as pickle: dumps, loads, dump and load.
Serialization into strings
import json

dic = {'age': 23, 'job': 'student'}
dic_str = json.dumps(dic)
print(type(dic_str), dic_str)
# out: <class 'str'> {"age": 23, "job": "student"}

dic_obj = json.loads(dic_str)
print(type(dic_obj), dic_obj)
# out: <class 'dict'> {'age': 23, 'job': 'student'}
As you can see, the dumps function converts objects into strings. The loads function restores it to a dictionary.
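Unlike pickle, the round trip through json does not always give back exactly the same Python types, because JSON only has a handful of types of its own. A short sketch (the sample data is made up for illustration):

```python
import json

# An int key and a tuple value: neither exists as such in JSON.
data = {1: ('a', 'b'), 'ok': True, 'none': None}

text = json.dumps(data)
print(text)      # {"1": ["a", "b"], "ok": true, "none": null}

restored = json.loads(text)
print(restored)  # {'1': ['a', 'b'], 'ok': True, 'none': None}

# The key came back as a string and the tuple as a list.
print(restored == data)  # False
```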
Store as json file
Objects can also be stored in json files.
dic = {'age': 23, 'job': 'student'}
with open('abc.json', 'w', encoding='utf-8') as f:
    json.dump(dic, f)

with open('abc.json', encoding='utf-8') as f:
    obj = json.load(f)
    print(obj)
Storing custom objects
Take the Person object from above again. Serializing it directly raises an error.
aa = Person('Bob', 23, 'Student')
with open('abc.json', 'w', encoding='utf-8') as f:
    json.dump(aa, f)  # raises TypeError
The error is TypeError: Object of type 'Person' is not JSON serializable. To fix it, pass a default parameter to the dump function; it accepts a function that converts an object into a dictionary.
Let's write one.
def person2dict(person):
    return {'name': person.name, 'age': person.age, 'job': person.job}
This returns a dictionary. Object instances offer a shortcut for this: read the instance's __dict__ attribute directly. For example:
print(aa.__dict__)  # {'name': 'Bob', 'age': 23, 'job': 'Student'}
Very convenient.
Likewise, load returns a dictionary when reading. To turn it back into an object you need the object_hook parameter, which takes a function that converts a dictionary into an object.
def dict2person(dic):
    return Person(dic['name'], dic['age'], dic['job'])
So the complete program should be written as follows:
with open('abc.json', 'w', encoding='utf-8') as f:
    json.dump(aa, f, default=person2dict)

with open('abc.json', encoding='utf-8') as f:
    obj = json.load(f, object_hook=dict2person)
    print(obj.name, obj.age, obj.job)
    obj.work()
Since __dict__ can replace the person2dict function, a lambda can simplify the call.
with open('abc.json', 'w', encoding='utf-8') as f:
    json.dump(aa, f, default=lambda obj: obj.__dict__)
The same default and object_hook parameters work whether you store to a file (dump and load) or to a string (dumps and loads).
As far as I know, though, json cannot serialize custom classes as effortlessly as pickle does; perhaps other extension libraries can. I'll talk about that later.
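One way to get a little closer to pickle's convenience is to tag each dictionary with the type it came from, so that a single pair of hook functions can handle nested data. This is only a sketch: the '__type__' key and the names to_tagged_dict and from_tagged_dict are my own inventions, not part of the json module.

```python
import json

class Person:
    def __init__(self, name, age, job):
        self.name = name
        self.age = age
        self.job = job

def to_tagged_dict(obj):
    """default= hook: json calls it only for objects it cannot encode itself."""
    if isinstance(obj, Person):
        # '__type__' is a made-up marker key used to recognize the dict later.
        return {'__type__': 'Person', **obj.__dict__}
    raise TypeError('Object of type %s is not JSON serializable'
                    % type(obj).__name__)

def from_tagged_dict(dic):
    """object_hook= hook: json calls it for every decoded JSON object."""
    if dic.get('__type__') == 'Person':
        return Person(dic['name'], dic['age'], dic['job'])
    return dic  # leave ordinary dictionaries untouched

people = {'team': [Person('Bob', 23, 'Student')], 'size': 1}
text = json.dumps(people, default=to_tagged_dict)
restored = json.loads(text, object_hook=from_tagged_dict)

member = restored['team'][0]
print(type(member).__name__, member.name, member.age, member.job)
# Person Bob 23 Student
```

Because object_hook is called for every dictionary in the data, the tag check is what keeps the outer {'team': ..., 'size': ...} dictionary from being mistaken for a Person.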
shelve module
There is one more module, shelve. It is not very common, and in most cases a single open call is all you need. shelve stores data as key-value pairs.
import shelve

with shelve.open('aa') as f:
    f['person'] = {'age': 23, 'job': 'student'}
    f['person']['age'] = 44  # an attempt to change the old age of 23
    f['numbers'] = [i for i in range(10)]

with shelve.open('aa') as f:
    person = f['person']
    print(person)  # {'age': 23, 'job': 'student'}
    nums = f['numbers']
    print(nums)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
The file name is given without an extension. On Windows, three files are generated (a few too many): aa.bak, aa.dat and aa.dir. The .bak and .dir files can be viewed as text (their contents appear to be identical), and for the example above they contain the following data.
'person', (0, 44)
'numbers', (512, 28)
Allow writeback
One detail: when we read back the key person, age is still 23; f['person']['age'] = 44 did not change it to 44. The following does change it:
with shelve.open('aa', writeback=True) as f:
    dic = {'age': 23, 'job': 'student'}
    f['person'] = dic
    dic['age'] = 44
    f['person'] = dic
This amounts to two assignments: after the dictionary is modified, the whole value is assigned to f['person'] again, so the new value is stored.
By default, changing a value in place through f['person'] does not update the stored value; that is, the change is not written back to the file, even after the file is closed. To have such changes saved, pass writeback=True to the open function. Run the earlier example again with that parameter and check that the age has changed.
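The difference can be checked side by side with a small experiment. The helper name age_after_update and the use of a temporary directory are assumptions of this sketch, added so that the two runs do not share a file.

```python
import os
import shelve
import tempfile

def age_after_update(writeback):
    """Change a stored dict in place and return the age that was persisted."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'aa')

        with shelve.open(path, writeback=writeback) as f:
            f['person'] = {'age': 23, 'job': 'student'}
            f['person']['age'] = 44  # in-place change, no reassignment

        # Reopen the shelf to see what actually reached the file.
        with shelve.open(path) as f:
            return f['person']['age']

print(age_after_update(False))  # 23: the change was lost
print(age_after_update(True))   # 44: the change was written back on close
```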
Write custom objects
We again use the Person class from above.
with shelve.open('aa') as f:
    f['class'] = Person  # write the class itself

with shelve.open('aa') as f:
    Person = f['class']
    a = Person('Bob', 23, 'Student')
    a.work()
The example above shows that shelve can also serialize the class itself. Serializing instances works too, of course.
with shelve.open('aa') as f:
    a = Person('God', 100, 'watch')
    f['class'] = a

with shelve.open('aa') as f:
    god = f['class']
    god.work()
Note that because we open the shelf in a with statement, we don't need to call close explicitly. The shelf object does have a close method; if you don't open it with with, remember to close it yourself.
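For completeness, here is a rough sketch of the version without with, where close is called by hand. Wrapping the work in try/finally (my suggestion, not something the module requires) makes sure the shelf is closed even if an error occurs in between; the temporary directory is again only there to keep the example self-contained.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    f = shelve.open(os.path.join(tmp, 'aa'))
    try:
        f['numbers'] = [i for i in range(10)]
        nums = f['numbers']
    finally:
        f.close()  # always close the shelf, even if an exception was raised

print(nums)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```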
by @sunhaiyu
2017.6.27