Python serialization and deserialization

Serializing an object converts it into a form that can be stored in a variable or a file. This preserves the object's state at that moment and extends its lifetime beyond the running program, and the object can be read back whenever it is needed. Python has several common modules for this.

pickle module

Stored in variables

dumps(obj) returns the serialized object as bytes.

import pickle

dic = {'age': 23, 'job': 'student'}
byte_data = pickle.dumps(dic)
# out -> b'\x80\x03}q\x00(X\x03\x00\x00\...'
print(byte_data)

Read data

The data is stored in the byte_data variable as bytes. Use the loads function to turn it back into an object.

obj = pickle.loads(byte_data)
print(obj)
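
To make the round trip concrete, here is a minimal sketch. The variable names are my own, and passing protocol=pickle.HIGHEST_PROTOCOL is optional. It suggests that loads gives back an equal but independent copy, not the original object.

```python
import pickle

original = {'age': 23, 'job': 'student', 'scores': [90, 85]}

# Serialize with the newest protocol this interpreter supports (optional).
byte_data = pickle.dumps(original, protocol=pickle.HIGHEST_PROTOCOL)

# Deserialize: this builds a brand-new object from the bytes.
restored = pickle.loads(byte_data)

print(type(byte_data))                           # <class 'bytes'>
print(restored == original)                      # True: same contents
print(restored is original)                      # False: a separate object
print(restored['scores'] is original['scores'])  # False: nested data is copied too
```

Because the copy is independent, changing restored afterwards does not affect original.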

Stored in files

Objects can also be written to files to make them persistent, using the dump and load functions. Note the difference in the names: there is no trailing s. Because pickle writes binary data, the file must be opened in wb and rb modes.

# serialize
with open('abc.pkl', 'wb') as f:
    dic = {'age': 23, 'job': 'student'}
    pickle.dump(dic, f)
# Deserialize
with open('abc.pkl', 'rb') as f:
    aa = pickle.load(f)
    print(aa)
    print(type(aa))  # <class 'dict'>
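
One file can hold more than one pickled object. The sketch below assumes a temporary directory and a made-up file name, several.pkl. It dumps two objects one after another and loads them back in the same order.

```python
import os
import pickle
import tempfile

settings = {'age': 23, 'job': 'student'}
numbers = list(range(5))

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'several.pkl')  # hypothetical file name

    # Each dump call appends one pickled object to the file.
    with open(path, 'wb') as f:
        pickle.dump(settings, f)
        pickle.dump(numbers, f)

    # Each load call reads exactly one object, in the order they were written.
    with open(path, 'rb') as f:
        first = pickle.load(f)
        second = pickle.load(f)

print(first)   # {'age': 23, 'job': 'student'}
print(second)  # [0, 1, 2, 3, 4]
```

A further pickle.load(f) call at the end of the file would raise EOFError, which is one way to detect that nothing is left to read.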

Serialize user-defined objects

Suppose I write a class called Person.

class Person:
    def __init__(self, name, age, job):
        self.name = name
        self.age = age
        self.job = job

    def work(self):
        print(self.name, 'is working...')

pickle can serialize not only an instance of the class but also the class itself.

# Store an instance in a variable (it could equally be written to a file)
a_person = Person('abc', 22, 'waiter')
person_abc = pickle.dumps(a_person)
p = pickle.loads(person_abc)
p.work()
# Store the class itself in a variable; loads returns the class, not an instance of it.
class_Person = pickle.dumps(Person)
Person = pickle.loads(class_Person)
p = Person('Bob', 23, 'Student')
p.work()

# The following example demonstrates storing classes in files
# serialize
with open('person.pkl', 'wb') as f:
    pickle.dump(Person, f)
# Deserialize
with open('person.pkl', 'rb') as f:
    Person = pickle.load(f)
    aa = Person('gg', 23, '6')
    aa.work()
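
One thing worth knowing about pickling a class: pickle stores it by reference (module name plus class name), not by copying its code. The class definition must therefore still be importable when the data is loaded. The sketch below uses a standard-library class, collections.OrderedDict, instead of Person, purely so that it runs on its own.

```python
import collections
import pickle

class_bytes = pickle.dumps(collections.OrderedDict)

# The pickled data is essentially the module name and the class name.
print(b'collections' in class_bytes)  # True
print(b'OrderedDict' in class_bytes)  # True

# Loading looks the class up again by name and returns the very same object.
restored_class = pickle.loads(class_bytes)
print(restored_class is collections.OrderedDict)  # True
```

The same lookup happens for instances, so a pickled Person can only be loaded in a program where the Person class is defined.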

json module

pickle can easily serialize most Python objects. However, JSON is a more standard format: it is human-readable (pickle output is binary) and works across platforms and languages, which makes it a good choice.

json uses the same four function names as pickle: dumps, loads, dump and load.

Serialization into strings

import json

dic = {'age': 23, 'job': 'student'}
dic_str = json.dumps(dic)
print(type(dic_str), dic_str)
# out: <class 'str'> {"age": 23, "job": "student"}

dic_obj = json.loads(dic_str)
print(type(dic_obj), dic_obj)
# out: <class 'dict'> {'age': 23, 'job': 'student'}

As you can see, the dumps function converts the object into a string, and the loads function restores it to a dictionary.
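
The round trip is not always lossless, because JSON has fewer types than Python. A small sketch with made-up data shows two common surprises: non-string keys become strings, and tuples come back as lists.

```python
import json

data = {1: 'one', 'point': (3, 4), 'active': True, 'note': None}

text = json.dumps(data)
print(text)
# {"1": "one", "point": [3, 4], "active": true, "note": null}

restored = json.loads(text)
print(restored)
# {'1': 'one', 'point': [3, 4], 'active': True, 'note': None}

# The integer key is now the string '1' and the tuple is now a list.
print(restored == data)  # False
```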

Store as json file

The data can also be stored in a JSON file.

dic = {'age': 23, 'job': 'student'}
with open('abc.json', 'w', encoding='utf-8') as f:
    json.dump(dic, f)

with open('abc.json', encoding='utf-8') as f:
    obj = json.load(f)
    print(obj)
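
Since the file is opened with encoding='utf-8', it may be worth showing two optional dump parameters, ensure_ascii and indent. The sketch below uses a temporary directory and an invented record, so treat the names as placeholders.

```python
import json
import os
import tempfile

record = {'name': 'Zoë', 'job': 'student'}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'record.json')  # hypothetical file name

    with open(path, 'w', encoding='utf-8') as f:
        # ensure_ascii=False writes 'ë' as-is instead of the escape \u00eb;
        # indent=2 spreads the output over several lines for readability.
        json.dump(record, f, ensure_ascii=False, indent=2)

    with open(path, encoding='utf-8') as f:
        file_text = f.read()

loaded = json.loads(file_text)
print(file_text)
print(loaded == record)  # True
```

With the default ensure_ascii=True the file would still load correctly; it would just be harder to read by eye.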

Storing custom objects

Take the Person class from above again. Serializing an instance directly raises an error.

aa = Person('Bob', 23, 'Student')
with open('abc.json', 'w', encoding='utf-8') as f:
    json.dump(aa, f)  # raises TypeError

The error message is: Object of type 'Person' is not JSON serializable. To fix this, pass the default parameter to the dump function. It accepts a function that converts an object into a dictionary.

Writing one is straightforward.

def person2dict(person):
    return {'name': person.name,
            'age': person.age,
            'job': person.job}

This returns a dictionary. Instances offer a shortcut for this: read the instance's __dict__ attribute directly. For example:

print(aa.__dict__)  # {'name': 'Bob', 'age': 23, 'job': 'Student'}

Very convenient.

When reading, load returns a dictionary. To turn it back into an object, pass the object_hook parameter, which accepts a function that converts a dictionary into an object.

def dict2person(dic):
    return Person(dic['name'], dic['age'], dic['job'])

So the complete program should be written as follows

with open('abc.json', 'w', encoding='utf-8') as f:
    json.dump(aa, f, default=person2dict)

with open('abc.json', encoding='utf-8') as f:
    obj = json.load(f, object_hook=dict2person)
    print(obj.name, obj.age, obj.job)
    obj.work()
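
One caveat with object_hook: it is called for every JSON object that gets decoded, including nested ones, so dict2person would raise a KeyError on any dictionary that is not a person. A common workaround is to tag the dictionary with a type marker. The sketch below assumes a marker key named __type__; that key and the encode and decode helpers are my own invention, not part of the json module.

```python
import json


class Person:
    def __init__(self, name, age, job):
        self.name = name
        self.age = age
        self.job = job


TYPE_KEY = '__type__'  # hypothetical marker key, chosen for this sketch


def encode(obj):
    """default hook: called only for objects json cannot serialize itself."""
    if isinstance(obj, Person):
        return {TYPE_KEY: 'Person', **vars(obj)}
    raise TypeError(f'Object of type {type(obj).__name__} is not JSON serializable')


def decode(dic):
    """object_hook: called for every decoded JSON object, nested ones first."""
    if dic.get(TYPE_KEY) == 'Person':
        return Person(dic['name'], dic['age'], dic['job'])
    return dic  # leave ordinary dictionaries untouched


team = {'lead': Person('Bob', 23, 'Student'), 'meta': {'size': 1}}

text = json.dumps(team, default=encode)
restored = json.loads(text, object_hook=decode)

print(type(restored['lead']).__name__, restored['lead'].name)  # Person Bob
print(restored['meta'])                                        # {'size': 1}
```

Only dictionaries carrying the marker are turned into Person objects, so plain dictionaries such as meta survive the round trip unchanged.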

Since __dict__ can be used instead of the person2dict function, a lambda can simplify the call.

with open('abc.json', 'w', encoding='utf-8') as f:
    json.dump(aa, f, default=lambda obj: obj.__dict__)

The same approach works for both files (dump and load) and variables (dumps and loads).

That said, I don't know of a way to serialize custom classes with json as easily as with pickle; perhaps some extension library does it. I'll come back to this later.

shelve module

There is one more module, shelve. It is less common, since a plain open is usually enough. shelve stores data as key-value pairs.

import shelve

with shelve.open('aa') as f:
    f['person'] = {'age': 23, 'job': 'student'}
    f['person']['age'] = 44  # Attempt to change the age from 23 to 44
    f['numbers'] = [i for i in range(10)]

with shelve.open('aa') as f:
    person = f['person']
    print(person) # {'age': 23, 'job': 'student'}
    nums = f['numbers']
    print(nums) # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

The file name has no extension. On Windows, three files are generated (a few too many): aa.bak, aa.dat and aa.dir. The .bak and .dir files are readable text (the two appear to have the same contents). For the example above they contain the following data.

'person', (0, 44)
'numbers', (512, 28)

Allow writeback

One detail: when we read the key person back, age is still 23; f['person']['age'] = 44 did not change it to 44. Compare this with the following, which assigns the modified dictionary back to the key:

with shelve.open('aa', writeback=True) as f:
    dic = {'age': 23, 'job': 'student'}
    f['person'] = dic
    dic['age'] = 44
    f['person'] = dic

This is equivalent to assigning twice, and this approach does change the stored value.

By default, changing a value in place through f['person'] does not update the stored value: the change is not written back to the file, even after the file is closed. To have such changes saved, pass writeback=True to the open function. Run the first example again with it and check whether the age changes.
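
The difference can be checked directly. The following sketch works in a temporary directory with a made-up shelf name, demo, and tries the same in-place change once without writeback and once with it.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo')  # hypothetical shelf name

    # Default mode: f['person'] returns a fresh copy, so the change is lost.
    with shelve.open(path) as f:
        f['person'] = {'age': 23, 'job': 'student'}
        f['person']['age'] = 44

    with shelve.open(path) as f:
        age_without_writeback = f['person']['age']

    # writeback=True: accessed entries are cached and written back on close.
    with shelve.open(path, writeback=True) as f:
        f['person']['age'] = 44

    with shelve.open(path) as f:
        age_with_writeback = f['person']['age']

print(age_without_writeback)  # 23
print(age_with_writeback)     # 44
```

The trade-off is that writeback=True keeps every accessed entry in memory and writes them all back on close, which can be slow for large shelves.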

Write custom objects

Again using the Person class from above:

# Write the class itself
with shelve.open('aa') as f:
    f['class'] = Person

# Read the class back and create an instance from it
with shelve.open('aa') as f:
    Person = f['class']
    a = Person('Bob', 23, 'Student')
    a.work()

The example above shows that shelve can also serialize the class itself. Serializing instances works too, of course.

with shelve.open('aa') as f:
    a = Person('God', 100, 'watch')
    f['class'] = a

with shelve.open('aa') as f:
    god = f['class']
    god.work()

Note that because we open the shelf in a with statement, we don't need to call close ourselves. The shelf object has a close method; if you don't open it in a with statement, remember to close it explicitly.
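
For completeness, here is a minimal sketch of the explicit style without a with statement. The shelf name manual and the temporary directory are placeholders; try/finally makes sure close runs even if an error occurs in between.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'manual')  # hypothetical shelf name

    f = shelve.open(path)
    try:
        f['numbers'] = [1, 2, 3]
    finally:
        f.close()  # flushes the data and releases the underlying file

    f = shelve.open(path)
    try:
        numbers = f['numbers']
    finally:
        f.close()

print(numbers)  # [1, 2, 3]
```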

by @sunhaiyu

2017.6.27

Added by pazzy on Tue, 18 Jun 2019 21:17:39 +0300