Serialization and deserialization in Python (json and pickle)

1, What are serialization and deserialization

1. Serialization

  • Convert a variable in a programming language into a JSON-format string
  • Data structure in memory ----> converted into an intermediate format (string) ----> saved to a file

2. Deserialization

  • Convert a JSON-format string back into a variable in a programming language
  • File ----> read the intermediate format (string) ----> parsed (conceptually like eval()) into a data structure in memory

3. JSON standard format string

  • {"name" : "zb", "age" : 18, "local" : true, "xx" : null}
  • Strings in JSON (keys and string values) must be enclosed in double quotes
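
As a quick check of the format above, the following sketch parses the example string with Python's json module and shows that a single-quoted variant is rejected. The variable names are illustrative only.

```python
import json

# The example string from above: keys and string values use double quotes.
text = '{"name" : "zb", "age" : 18, "local" : true, "xx" : null}'
data = json.loads(text)
print(data)  # {'name': 'zb', 'age': 18, 'local': True, 'xx': None}

# The same kind of content written with single quotes is not valid JSON.
bad_text = "{'name' : 'zb'}"
try:
    json.loads(bad_text)
    rejected = False
except json.JSONDecodeError:
    rejected = True
print(rejected)  # True
```

Note how true and null in the JSON text come back as True and None in Python.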

2, Why serialize

Two uses of serialization

1. Persistent state

  • Running a program amounts to handling a series of state changes. In a programming language, "state" is held in memory as structured data types (variables)
  • Memory cannot store data permanently. When the power goes off or the program restarts, the structured data in memory is lost
  • Serialization saves the data currently in memory to a file before the machine powers off or the program restarts. The next run can load the previous data and continue from there (like a save file in a single-player game)
# Stored data follows a standard format, so one program can write it and another can read it (the two programs can be written in different languages)
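
The "save file" idea can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical game state dictionary and a temporary directory standing in for the real save location; neither comes from the original text.

```python
import json
import os
import tempfile

# Hypothetical game state held in memory.
state = {"level": 3, "hp": 87, "items": ["sword", "potion"]}

# A temporary directory stands in for wherever the save file would live.
save_path = os.path.join(tempfile.mkdtemp(), "save.json")

# Before shutdown: write the state to disk.
with open(save_path, "wt", encoding="utf-8") as f:
    json.dump(state, f)

# After restart: load the state back and continue from it.
with open(save_path, "rt", encoding="utf-8") as f:
    restored = json.load(f)

print(restored == state)  # True
```

Because the file contains plain JSON, a program written in another language could read the same file.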

2. Cross-platform data interaction

  • Serialized content can be stored on disk, and it can also be transmitted to other machines over the network
  • If the sender and the receiver agree on the same serialization format, differences between platforms and languages are hidden, which makes cross-platform data interaction possible
  • Conversely, reading the contents of the data (variables) from the serialized form back into memory is called deserialization
# For example, the data a back end sends to a front end is typically a JSON-format string
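
A rough sketch of the sender and receiver sides follows. No real network is involved here: the bytes object stands in for what would travel over a socket or an HTTP response, and the function names encode_message and decode_message are made up for this illustration.

```python
import json

def encode_message(payload):
    """Sender side: Python object -> JSON text -> UTF-8 bytes."""
    return json.dumps(payload).encode("utf-8")

def decode_message(raw):
    """Receiver side: UTF-8 bytes -> JSON text -> Python object."""
    return json.loads(raw.decode("utf-8"))

wire_bytes = encode_message({"user": "zb", "age": 18, "online": True})
print(wire_bytes)  # b'{"user": "zb", "age": 18, "online": true}'

received = decode_message(wire_bytes)
print(received)  # {'user': 'zb', 'age': 18, 'online': True}
```

The receiver could just as well be a JavaScript front end calling JSON.parse on the same bytes, which is the point of agreeing on the format.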

3, json serializable types

  • The json module can serialize: dict, list, tuple, str, int, float, bool, and None
  • The correspondence between JSON data types and Python data types is:

JSON type       Python type
object          dict
array           list (a tuple is also written as an array)
string          str
number (int)    int
number (real)   float
true / false    True / False
null            None

  • Differences between the JSON-serialized form and the original data types:

[Figure unavailable: comparison of JSON-serialized output with the original Python data]
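
Some of these differences can be observed directly. The sketch below is not a reconstruction of the missing figure; it only shows behavior of the standard json module on an illustrative dictionary: tuples become arrays, non-string keys become strings, and True/None are written as true/null.

```python
import json

original = {1: "one", "point": (3, 4), "flag": True, "nothing": None}

text = json.dumps(original)
print(text)  # {"1": "one", "point": [3, 4], "flag": true, "nothing": null}

restored = json.loads(text)
print(restored)  # {'1': 'one', 'point': [3, 4], 'flag': True, 'nothing': None}

# The round trip is lossy here: the int key and the tuple did not survive.
print(restored == original)  # False
```

So data that passes through JSON is not always identical to what went in, even though no error is raised.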

4, Use of json serialization and deserialization

1. Basic use of JSON serialization and deserialization

import json

# 1. Serialization: note that json.dumps() returns a string.
res = json.dumps([1, 1.1, True, False, None, {'name': 'jack'}])
print(res, type(res))  # [1, 1.1, true, false, null, {"name": "jack"}] <class 'str'>

# 2. Deserialization: json.loads() turns the string back into the original type.
res = json.loads(res)
print(res, type(res))  # [1, 1.1, True, False, None, {'name': 'jack'}] <class 'list'>

2. Write the result of JSON serialization to a file, and read a JSON-format string from a file for deserialization

The result of serialization is written to the file

  • dump() and dumps()
import json

# Writing the serialized result to a file
# Method 1: json.dumps()
info = [1, 1.1, True, False, None, {'name': 'jack'}]
res = json.dumps(info)  # dumps() returns a str, so it can be written to the file directly.
with open('a.json', 'wt', encoding='utf-8') as f:
    f.write(res)

# Method 2: json.dump(). This method is recommended for writing files.
info = [1, 1.1, True, False, None, {'name': 'jack'}]
with open('a.json', 'wt', encoding='utf-8') as f:
    json.dump(info, f)  # Serializes info to JSON and writes it to the file object f; no explicit f.write() call is needed.

Read the string in json format from the file for deserialization

  • load() and loads()
# Method 1: json.loads()
with open('a.json', 'rt', encoding='utf-8') as f:
    res = json.loads(f.read())
    print(res, type(res))  # [1, 1.1, True, False, None, {'name': 'jack'}] <class 'list'>

# Method 2: json.load() reads the JSON-format string from the file itself. This method is recommended.
with open('a.json', 'rt', encoding='utf-8') as f:
    print(json.load(f))  # Calls f.read() internally.

5, Serialization summary

  • The json module can serialize: dict, list, tuple, str, int, float, bool, and None
  • json.dumps() is the general-purpose serializer and json.loads() is the general-purpose deserializer; both work with strings
  • load and loads both deserialize; dump and dumps both serialize
  • load and dump both take a file object. They call read() and write() internally, so you do not need to call those yourself
  • load reads the file content, deserializes it, and returns the result
  • dump serializes the value passed in to JSON format and writes it to the file object
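
The summary can be checked with one short script. In this sketch, io.StringIO stands in for a real file so that nothing is written to disk; the sample data is illustrative.

```python
import io
import json

data = {"name": "jack", "scores": [1, 1.1]}

# dumps/loads work with strings.
text = json.dumps(data)
from_text = json.loads(text)

# dump/load work with a file-like object; io.StringIO stands in for a real file.
buffer = io.StringIO()
json.dump(data, buffer)
buffer.seek(0)  # rewind so load() reads from the start
from_file = json.load(buffer)

print(buffer.getvalue() == text)       # True
print(from_text == from_file == data)  # True
```

Both pairs produce the same JSON text; the only difference is whether you handle the string yourself or hand over a file object.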

Four notes on using json

import json

# Note 1: JSON only supports data types common to all languages; language-specific types are not recognized. Python's set is one such type, so json cannot serialize it.
try:
    json.dumps({1, 2, 3})
except TypeError as e:
    print(e)  # Object of type set is not JSON serializable

# Note 2: strings in JSON must use double quotes. JSON does not recognize single quotes.
try:
    json.loads("['logging_test', 'config_test']")
except json.JSONDecodeError as e:
    print(e)  # Expecting value: line 1 column 2 (char 1)

# Note 3: no matter how the data was created, as long as it is valid JSON it can be parsed with json.loads(). The data does not have to come from json.dumps() (see the type correspondence in section 3 for how JSON values are written).

# Note 4: serializing and deserializing Chinese text. By default, Chinese characters are converted to \uXXXX escape sequences, so the result is not as directly readable as an English string.
res = json.dumps('我是中文')
print(res, type(res))  # "\u6211\u662f\u4e2d\u6587" <class 'str'>

res = json.loads(res)
print(res, type(res))  # 我是中文 <class 'str'>
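
One common way around the limitation on language-specific types is the default parameter of json.dumps(), which is called for objects the encoder does not know. The sketch below is only one possible approach; the helper name encode_set is made up for this example, and the set comes back as a list after loading.

```python
import json

def encode_set(obj):
    """Fallback for json.dumps(): represent a set as a sorted list."""
    if isinstance(obj, set):
        return sorted(obj)
    # Anything else is still unsupported, so report it the usual way.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

text = json.dumps({"ids": {3, 1, 2}}, default=encode_set)
print(text)              # {"ids": [1, 2, 3]}
print(json.loads(text))  # {'ids': [1, 2, 3]}
```

If the receiving side needs a set again, it has to convert the list back explicitly, because JSON itself has no set type.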

6, Serializing byte types using json

  • Demonstration
import json

dic = {"name": "派大星", "age": 22}
res = json.dumps(dic)
print(res)  # {"name": "\u6d3e\u5927\u661f", "age": 22}
  • Add the ensure_ascii=False argument
dic = {"name": "派大星", "age": 22}
res = json.dumps(dic, ensure_ascii=False)
print(res)  # {"name": "派大星", "age": 22}
  • Summary
# Nothing changes in how you use the data: what you save is what you get back. By default json simply stores non-ASCII characters as Unicode escape sequences (\uXXXX)
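
To confirm that the two forms carry the same data, the following sketch compares them, using the name behind the \u6d3e\u5927\u661f escapes (派大星). It assumes Python 3.7 or later, because it uses str.isascii().

```python
import json

dic = {"name": "派大星", "age": 22}

escaped = json.dumps(dic)                       # non-ASCII characters written as \uXXXX
readable = json.dumps(dic, ensure_ascii=False)  # characters kept as they are

print(escaped.isascii())   # True
print(readable.isascii())  # False

# Both strings deserialize to the same dictionary.
print(json.loads(escaped) == json.loads(readable) == dic)  # True
```

When writing the readable form to a file, open the file with an explicit encoding such as utf-8 so the characters are stored correctly.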

PS: in Java, for performance reasons, there are many packages to complete serialization and deserialization: Google's gson, Alibaba's open source fastjson, and so on

Supplement:

# For reference: in Python 2, str is a byte type. In Python 3, str is Unicode text and bytes is a separate type. json.loads() accepts bytes in Python 2.7 and in Python 3.6 and later, where the json module was extended to handle bytes input. It does not work in Python 3.5, as shown below:
>>> import json
>>> json.loads(b'{"a":111}')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/linhaifeng/anaconda3/lib/python3.5/json/__init__.py", line 312, in loads
    s.__class__.__name__))
TypeError: the JSON object must be str, not 'bytes'  
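
If code has to run on interpreters where json.loads() rejects bytes, one option is to decode the bytes first. The helper below is a sketch under that assumption; the name loads_any is made up for this example, and UTF-8 is assumed as the encoding.

```python
import json

def loads_any(data, encoding="utf-8"):
    """Parse JSON from either str or bytes by decoding bytes first."""
    if isinstance(data, (bytes, bytearray)):
        data = data.decode(encoding)
    return json.loads(data)

print(loads_any(b'{"a":111}'))  # {'a': 111}
print(loads_any('{"a":111}'))   # {'a': 111}
```

On Python 3.6 and later the explicit decode is not required, but it does no harm.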

Keywords: Python JSON

Added by zushiba on Fri, 21 Jan 2022 19:48:40 +0200