"Detailed explanation" of json encoder and decoder of Python standard library


1, Introduction to Python json Library

JSON (JavaScript Object Notation) is a lightweight data-interchange format conceived and designed by Douglas Crockford. Its content is made up of attribute-value pairs, which makes it easy to read and process. JSON is a data format independent of any programming language: it is a subset of JavaScript, and it also adopts conventions familiar from the C family of languages. Today many programming languages can parse and serialize it, and this wide adoption has made it a general-purpose data format.

Python's json library provides JSON serialization support, with an API similar to that of the standard library modules marshal and pickle.

2, Import json Library

Don't forget to import the json standard library before working through the content below:

import json

3, Correspondence between Python and JSON data types

1), JSON to Python data type conversion

JSON data format    Python data format
object              dict
array               list
string              str
number (int)        int
number (real)       float
true                True
false               False
null                None

2), Python to JSON data type conversion

Python data format                        JSON data format
dict                                      object
list, tuple                               array
str                                       string
int, float, int- and float-derived Enums  number
True                                      true
False                                     false
None                                      null
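
As a quick check of the two tables above, the following sketch round-trips one value of each type through json.dumps() and json.loads(). The sample data is made up for illustration.

```python
import json

# Sample data invented for illustration: one value per row of the tables.
original = {
    "dict": {"nested": True},
    "list": [1, 2],
    "tuple": (3, 4),  # tuples are written as JSON arrays
    "str": "text",
    "int": 7,
    "float": 1.5,
    "true": True,
    "false": False,
    "none": None,
}

encoded = json.dumps(original)
decoded = json.loads(encoded)

# Every JSON array comes back as a list, so the tuple is not restored.
print(type(decoded["tuple"]))
print(decoded["int"], decoded["float"], decoded["none"])
```

The conversion is therefore not symmetric: a tuple goes out as an array and comes back as a list.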

3), Source code reference

In the source code of the encoder and decoder of the JSON standard library, you can see comments describing the type conversions:

JSONEncoder

class JSONEncoder(object):
    """Extensible JSON <http://json.org> encoder for Python data structures.

    Supports the following objects and types by default:

    +-------------------+---------------+
    | Python            | JSON          |
    +===================+===============+
    | dict              | object        |
    +-------------------+---------------+
    | list, tuple       | array         |
    +-------------------+---------------+
    | str               | string        |
    +-------------------+---------------+
    | int, float        | number        |
    +-------------------+---------------+
    | True              | true          |
    +-------------------+---------------+
    | False             | false         |
    +-------------------+---------------+
    | None              | null          |
    +-------------------+---------------+

    To extend this to recognize other objects, subclass and implement a
    ``.default()`` method with another method that returns a serializable
    object for ``o`` if possible, otherwise it should call the superclass
    implementation (to raise ``TypeError``).

    """
    pass

JSONDecoder

class JSONDecoder(object):
    """Simple JSON <http://json.org> decoder

    Performs the following translations in decoding by default:

    +---------------+-------------------+
    | JSON          | Python            |
    +===============+===================+
    | object        | dict              |
    +---------------+-------------------+
    | array         | list              |
    +---------------+-------------------+
    | string        | str               |
    +---------------+-------------------+
    | number (int)  | int               |
    +---------------+-------------------+
    | number (real) | float             |
    +---------------+-------------------+
    | true          | True              |
    +---------------+-------------------+
    | false         | False             |
    +---------------+-------------------+
    | null          | None              |
    +---------------+-------------------+

    It also understands ``NaN``, ``Infinity``, and ``-Infinity`` as
    their corresponding ``float`` values, which is outside the JSON spec.

    """
    pass

4, Basic usage (key points) 🧊

1. Serialization operation

1),json.dump(obj, fp, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)

Serialize obj as a JSON-formatted stream and write it to the file object fp. Note that the json module always produces str objects, not bytes. Therefore files should be opened in text mode when reading and writing JSON, and fp.write() must support str input.
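
A minimal sketch of the text-mode requirement, using io.StringIO and io.BytesIO as stand-ins for files opened in text and binary mode. The sample data is made up for illustration.

```python
import io
import json

data = {"name": "demo", "values": [1, 2, 3]}  # sample data, made up for illustration

# io.StringIO stands in for a file opened in text mode: its write() accepts str.
buffer = io.StringIO()
json.dump(data, buffer)

# json.dump() writes the same text that json.dumps() returns.
print(buffer.getvalue())
print(buffer.getvalue() == json.dumps(data))

# A binary buffer rejects the str chunks that json.dump() writes.
try:
    json.dump(data, io.BytesIO())
except TypeError as exc:
    print("binary mode fails:", exc)
```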

The parameters are as follows:

  • obj: the object to be serialized.
  • fp: a file-like object that has a .write() method. It must be a file object opened in text mode, and the encoding should be UTF-8, UTF-16 or UTF-32.
  • skipkeys: if True (the default is False), dict keys that are not of a basic type (str, int, float, bool, None) are skipped. Otherwise a TypeError is raised for such keys.
  • ensure_ascii: if True (the default), all non-ASCII characters in the output are escaped. If ensure_ascii is False, these characters are output as is.
  • check_circular: circular reference check. If False (the default is True), the circular reference check for container types is skipped, and a circular reference results in a RecursionError (or worse).
  • allow_nan: controls how out-of-range float values are handled. If False (the default is True), a ValueError is raised when serializing float values outside the strict JSON specification (nan, inf and -inf). If True, their JavaScript equivalents (NaN, Infinity and -Infinity) are used.
  • cls: the JSON encoder class. The built-in JSONEncoder is used by default. To use a custom JSON encoder, such as a subclass of JSONEncoder, pass it through this parameter.
  • indent: controls indentation.
    a. The default value None selects the most compact representation, that is, everything on a single line.
    b. If indent is a non-negative integer or a string, JSON array elements and object members are pretty-printed with that indent level. A positive integer indents each level by that many spaces:
       if indent is 4, each level is indented by 4 spaces. If indent is a string (such as "\t"), that string is used to indent each level.
    c. If indent is zero, negative, or "", only line breaks are inserted, with nothing in front of each item.
  • separators: the separators. Because two kinds of separator are used, it should be an (item_separator, key_separator) tuple.
       Its default value depends on indent. When indent is None, the default is (', ', ': '), so both separators are followed by a space. When indent is not None, the default is (',', ': '), because every item is already followed by a line break.
       Tip: to get the most compact JSON representation, leave indent as None and set separators to (',', ':'), which leaves no spaces at all.
  • default: if specified, it should be a function. It is called for every object that cannot otherwise be serialized. The function takes one argument, the object that cannot be serialized.
    It should return a JSON-encodable version of the object or raise a TypeError. If not specified, a TypeError is raised directly.
  • sort_keys: if True (the default is False), dictionaries are output sorted by key.
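
The sketch below exercises several of these parameters on small inputs. All sample values are made up for illustration, and the snippet is meant as a quick experiment rather than a complete tour.

```python
import json

record = {"b": 1.5, "a": [1, 2]}              # made-up sample data
with_bad_key = {"ok": 1, (1, 2): "tuple key"}  # a tuple is not a valid JSON key

# skipkeys=True silently drops the tuple key instead of raising TypeError.
print(json.dumps(with_bad_key, skipkeys=True))

# sort_keys=True orders the keys; (",", ":") removes the spaces after separators.
compact = json.dumps(record, sort_keys=True, separators=(",", ":"))
print(compact)

# allow_nan=False rejects values that strict JSON cannot represent.
try:
    json.dumps(float("nan"), allow_nan=False)
except ValueError as exc:
    print("ValueError:", exc)

# With the default check_circular=True, a self-referencing container is detected.
loop = []
loop.append(loop)
try:
    json.dumps(loop)
except ValueError as exc:
    print("ValueError:", exc)
```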

Specific explanation:

  • Disabling non-ASCII escaping

    When JSON-encoded data is written to a file or returned as a string, it is easy to notice that every character outside ASCII has been escaped into a \uXXXX Unicode escape sequence. Chinese characters are the most typical example.

    # -*- coding: utf-8 -*-
    import json
    from pathlib import Path
    
    BASE_DIR = Path(__file__).parent
    
    happly_new_year = {
        "年份": 2022, "生肖": "虎",  # year; Chinese zodiac: "tiger"
        # Blessing: "On this beautiful Spring Festival, I wish all friends a good mood every day in the new year!"
        "祝福语": "在这美丽的春节之际，祝福各位朋友在新的一年里，天天都有份好心情!",
        "春节快乐": None  # "Happy Spring Festival"
    }
    # File path
    file_path = BASE_DIR / "test.json"
    
    # Print to the console
    print(json.dumps(happly_new_year))
    # Write the output to a file
    with file_path.open("w", encoding="utf-8") as f_w:
        json.dump(happly_new_year, fp=f_w)
    

    Console output and file content

    {"\u5e74\u4efd": 2022, "\u751f\u8096": "\u864e", "\u795d\u798f\u8bed": "\u5728\u8fd9\u7f8e\u4e3d\u7684\u6625\u8282\u4e4b\u9645\uff0c\u795d\u798f\u5404\u4f4d\u670b\u53cb\u5728\u65b0\u7684\u4e00\u5e74\u91cc\uff0c\u5929\u5929\u90fd\u6709\u4efd\u597d\u5fc3\u60c5!", "\u6625\u8282\u5feb\u4e50": null}
    

    So how can the escaping of non-ASCII characters be turned off during JSON encoding, so that these characters are output as they are? The answer is simple: set the ensure_ascii parameter to False on each encoding call.

    The code above stays the same; just add the ensure_ascii parameter.

    ......
    
    # Print to the console
    print(json.dumps(happly_new_year, ensure_ascii=False))
    # Write the output to a file
    
    with file_path.open("w", encoding="utf-8") as f_w:
        json.dump(happly_new_year, fp=f_w, ensure_ascii=False)
    

    Console output and file content

    {"年份": 2022, "生肖": "虎", "祝福语": "在这美丽的春节之际，祝福各位朋友在新的一年里，天天都有份好心情!", "春节快乐": null}
    
  • JSON data formatting

    By default the JSON output is written on a single line. That is readable when the amount of data is small, but the reading experience degrades quickly as the data grows. JSON is a format built around data structure, and that structure helps most during maintenance and debugging when it is visible, so pretty-printing is a common practice.
    The indent parameter controls this formatting. Its behavior is described in the parameter list; the examples here show it in action.

    Indent with four spaces

    # -*- coding: utf-8 -*-
    import json
    from pathlib import Path
    
    BASE_DIR = Path(__file__).parent
    
    happly_new_year = {
        "particular year": 2022, "the Chinese zodiac": "tiger",
        "Blessing words": "On the occasion of this beautiful Spring Festival, I wish all friends a good mood every day in the new year!",
        "Happy Spring Festival": None
    }
    # File path
    file_path = BASE_DIR / "test.json"
    
    # Printout to console
    print(json.dumps(happly_new_year, indent=4, ensure_ascii=False))
    # Store output to file
    with file_path.open("w", encoding="utf-8") as f_w:
        json.dump(happly_new_year, fp=f_w, indent=4, ensure_ascii=False)
    
    {
        "particular year": 2022,
        "the Chinese zodiac": "tiger",
        "Blessing words": "On the occasion of this beautiful Spring Festival, I wish all friends a good mood every day in the new year!",
        "Happy Spring Festival": null
    }
    

    Indent with two spaces

    ......
    
    # Printout to console
    print(json.dumps(happly_new_year, indent=2, ensure_ascii=False))
    # Store output to file
    with file_path.open("w", encoding="utf-8") as f_w:
        json.dump(happly_new_year, fp=f_w, indent=2, ensure_ascii=False)
    
    {
      "particular year": 2022,
      "the Chinese zodiac": "tiger",
      "Blessing words": "On the occasion of this beautiful Spring Festival, I wish all friends a good mood every day in the new year!",
      "Happy Spring Festival": null
    }
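
    The two remaining indent cases from the parameter list, a string indent and a zero indent, can be tried with the short sketch below. The sample data is made up for illustration.

```python
import json

data = {"a": [1, 2]}  # made-up sample data

# A string indent is repeated once per nesting level.
tabbed = json.dumps(data, indent="\t")
print(tabbed)

# indent=0 only inserts line breaks, with nothing in front of each item.
flat = json.dumps(data, indent=0)
print(flat)
```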
    
  • Making good use of the default parameter

    default is a very convenient parameter. By passing it a function that handles custom objects, you can convert those objects into a JSON-encodable form during serialization, without having to manually convert the data that the JSON encoder cannot handle beforehand.

    Suppose there is a fruit class, with one object created for each kind of fruit. Each object has a method that converts it into a JSON-encodable form, and this method must be called correctly on every serialization. This is where the default parameter comes in handy. The function passed to default has to do two things: convert the object if it is an instance of the fruit class, and raise an exception otherwise.

    # -*- coding: utf-8 -*-
    import json
    from pathlib import Path
    
    BASE_DIR = Path(__file__).parent
    
    
    class Fruits:
        """Fruits"""
        def __init__(self, name, price):
            self.name = name
            self.price = price
    
        def __str__(self):
            return f"Fruit type:{self.name},Selling price:{self.price}"
    
        def todict(self):
            """Convert the fruit information into dictionary format for output"""
            return {"Fruits": self.name, "price": self.price}
    
    
    def obj_to_json(obj):
        """Convert a custom object into a JSON-encodable form.
        
        Args:
            obj: custom object
        Returns:
            A version of the data that can be JSON encoded
        Raises:
            TypeError: if obj is not an instance of the target class
        """
        if isinstance(obj, Fruits):
            return obj.todict()
        # type(obj).__name__ works for any instance; obj.__name__ would raise AttributeError
        raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")
    
    
    pear = Fruits("Pear", 3.3)
    apple = Fruits("Apple", 5.6)
    banana = Fruits("Banana", 11.6)
    orange = Fruits("orange", 6.6)
    
    
    fruits = [pear, apple, banana, orange]
    # File path
    file_path = BASE_DIR / "test.json"
    
    with file_path.open("w", encoding="utf-8") as f_w:
        json.dump(fruits, fp=f_w, default=obj_to_json, ensure_ascii=False)
    

    JSON data

    [{"Fruits": "Pear", "price": 3.3}, {"Fruits": "Apple", "price": 5.6}, {"Fruits": "Banana", "price": 11.6}, {"Fruits": "orange", "price": 6.6}]
    

2),json.dumps(obj, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)

Serialize obj into a JSON-formatted str and return it, without writing to a file.

Apart from the missing fp parameter, which is only needed for writing to a file, the parameters have the same meaning as in json.dump(), so the usage is the same as well. The only difference is that one writes to a file while the other returns the JSON-encoded data directly, so no further explanation is given here.

Note: keys in JSON key-value pairs are always of type str. When a dictionary is converted to JSON, all of its keys are coerced to strings. As a result, a dictionary that is converted to JSON and then back may not equal the original. In other words, if x has non-string keys, then loads(dumps(x)) != x. For example, integer keys in a Python dict become strings once converted to JSON.
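
The note above can be checked with a short round trip. The dictionary below is made up for illustration.

```python
import json

# Made-up dictionary with an int key, a float key and a None key.
x = {1: "one", 2.5: "two and a half", None: "nothing"}

round_tripped = json.loads(json.dumps(x))

print(round_tripped)       # every key is now a str
print(round_tripped == x)  # False: the original key types are lost
```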

2. Deserialization operation

1),json.load(fp, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)

Deserialize fp (a text file or binary file that supports .read() and contains a JSON document) into a Python object.

The parameters are as follows:

  • fp: a file object that has a .read() method. It may be a text file or a binary file. For binary files, the input encoding should be UTF-8, UTF-16 or UTF-32.
  • cls: the JSON decoder class. The built-in JSONDecoder is used by default. To use a custom JSON decoder, such as a subclass of JSONDecoder, pass it through this parameter. Additional keyword arguments are passed on to the constructor of that class.
  • object_hook: if specified, it should be a function. It is called with the result of every decoded JSON object (a dict). The function takes one argument, the decoded dict.
    Its return value is used in place of the original dict. In short, this function is responsible for converting the decoded basic-type objects into objects of user-defined types. This feature can be used to implement custom decoders (such as JSON-RPC class hinting).
  • parse_float: parses floating-point numbers. If specified, it is called with the string of every JSON float to be decoded. It should be a function that takes one argument, the string of the JSON float. By default this is equivalent to float(num_str).
    It can be used to apply another data type or parser (such as decimal.Decimal) to JSON floats.
  • parse_int: parses integers. If specified, it is called with the string of every JSON integer to be decoded. It should be a function that takes one argument, the string of the JSON integer. By default this is equivalent to int(num_str).
    It can be used to apply another data type or parser (such as float) to JSON integers.
  • parse_constant: parses constants. If specified, it is called with one of the following strings: '-Infinity', 'Infinity', 'NaN'. It should be a function that takes one argument, the constant string. It can be used to raise an exception when an invalid JSON number is encountered.
  • object_pairs_hook: if specified, it is called with the result of every decoded JSON object as an ordered list of (key, value) pairs. It serves the same purpose as object_hook; the difference is that the key-value pairs passed to object_pairs_hook keep their order. Features that depend on the order of the key-value pairs can use object_pairs_hook instead of object_hook.
    If object_hook is also defined, object_pairs_hook takes priority.
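
As a small illustration of parse_constant, the sketch below rejects the non-standard constants. The helper name reject_constant and the sample text are assumptions made for this example.

```python
import json


def reject_constant(name):
    """Hypothetical helper: refuse NaN, Infinity and -Infinity."""
    raise ValueError(f"{name} is not valid in strict JSON")


text = '{"ratio": NaN}'  # made-up input containing a non-standard constant

# By default the constant is accepted and becomes a float.
print(json.loads(text))

# With the hook in place, the same input is rejected.
try:
    json.loads(text, parse_constant=reject_constant)
except ValueError as exc:
    print("rejected:", exc)
```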

Specific explanation:

  • JSON floating point number processing

    The parse_float parameter controls how JSON floating-point numbers are converted to Python values. By default they are converted with float(num_str). The numbers can also be processed along the way, for example rounded, rounded up or down, or converted to decimal.Decimal. All of this can be done directly through the parse_float parameter.

    # -*- coding: utf-8 -*-
    import math
    import json
    from pathlib import Path
    from decimal import Decimal
    
    BASE_DIR = Path(__file__).parent
    # File path
    file_path = BASE_DIR / "test.json"
    
    
    def json_ceil(num_str):
        """Helper that rounds a JSON floating-point number up."""
        return math.ceil(float(num_str))
    
    
    def json_floor(num_str):
        """Helper that rounds a JSON floating-point number down."""
        return math.floor(float(num_str))
    
    
    def json_round(num_str):
        """Helper that rounds a JSON floating-point number to the nearest integer."""
        return round(float(num_str))
    
    
    def float_to_decimal(num_str):
        """Helper that converts a JSON floating-point number to Decimal.

        Going through float() first carries the binary rounding error into the
        Decimal; Decimal(num_str) would preserve the literal text instead.
        """
        return Decimal.from_float(float(num_str))
        
    .......
    

    Source JSON data

    [{"Fruits": "Pear", "price": 3.3}, {"Fruits": "Apple", "price": 5.6}, {"Fruits": "Banana", "price": 11.6}, {"Fruits": "orange", "price": 6.6}]
    

    Round up

    ......
    
    with file_path.open("r", encoding="utf-8") as f_r:
        print(json.load(f_r, parse_float=json_ceil))
    
    [{'Fruits': 'Pear', 'price': 4}, {'Fruits': 'Apple', 'price': 6}, {'Fruits': 'Banana', 'price': 12}, {'Fruits': 'orange', 'price': 7}]
    

    Round down

    ......
    
    with file_path.open("r", encoding="utf-8") as f_r:
        print(json.load(f_r, parse_float=json_floor))
    
    [{'Fruits': 'Pear', 'price': 3}, {'Fruits': 'Apple', 'price': 5}, {'Fruits': 'Banana', 'price': 11}, {'Fruits': 'orange', 'price': 6}]
    

    Round to the nearest integer

    ......
    
    with file_path.open("r", encoding="utf-8") as f_r:
        print(json.load(f_r, parse_float=json_round))
    
    [{'Fruits': 'Pear', 'price': 3}, {'Fruits': 'Apple', 'price': 6}, {'Fruits': 'Banana', 'price': 12}, {'Fruits': 'orange', 'price': 7}]
    

    Convert to Decimal type

    ......
    
    with file_path.open("r", encoding="utf-8") as f_r:
        print(json.load(f_r, parse_float=float_to_decimal))
    
    [{'Fruits': 'Pear', 'price': Decimal('3.29999999999999982236431605997495353221893310546875')}, {'Fruits': 'Apple', 'price': Decimal('5.5999999999999996447286321199499070644378662109375')}, {'Fruits': 'Banana', 'price': Decimal('11.5999999999999996447286321199499070644378662109375')}, {'Fruits': 'orange', 'price': Decimal('6.5999999999999996447286321199499070644378662109375')}]
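
    The long digit strings above come from converting to float before building the Decimal. As an alternative sketch, decimal.Decimal can be passed directly as parse_float, since it accepts the number's original text. The input below is a shortened copy of the source data.

```python
import json
from decimal import Decimal

text = '[{"Fruits": "Pear", "price": 3.3}]'  # shortened copy of the source data

# Decimal receives the original text "3.3", so no binary rounding is involved.
exact = json.loads(text, parse_float=Decimal)
print(exact)
```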
    
  • JSON integer processing
    When JSON integer strings are parsed, they are converted with int(num_str) by default. They can be converted to a floating-point type instead, directly through the parse_int parameter.

    # -*- coding: utf-8 -*-
    import math
    import json
    from pathlib import Path
    from decimal import Decimal
    
    BASE_DIR = Path(__file__).parent
    file_path = BASE_DIR / "test.json"
    
    
    with file_path.open("r", encoding="utf-8") as f_r:
        print(json.load(f_r, parse_int=float))
    

    Source JSON data

    {"Blessing words": "Happy New Year!", "Good luck": 1314}
    

    Program printing

    {'Blessing words': 'Happy New Year!', 'Good luck': 1314.0}
    
  • Applications of object_hook and object_pairs_hook
    The object_hook and object_pairs_hook parameters serve the same purpose, but the functions written for them receive different arguments. The function passed as object_hook receives each decoded object as the most basic type, that is, a dict. The function passed as object_pairs_hook receives each decoded object as a list of tuples, [(key, value), ...], which is equivalent to calling the .items() method on the dictionary.

    Award rankings make a typical example. Using the numbers 1, 2, 3... as the ranks, the example uses the object_hook parameter and the object_pairs_hook parameter in turn.

    # -*- coding: utf-8 -*-
    import json
    
    s_data = '''{
        "3": ["third", "Wang Han"], "1": ["the first", "Xiao Ming Wang"], "2": ["proxime accessit", "Xiao Ming"],
        "4": ["Fourth place", "Li Hua"], "5": ["Fifth place", "Li Dachuan"], "6": ["Sixth place", "xianzhe_"]
    }'''
    
    # Using the object_hook parameter
    json.loads(s_data, object_hook=lambda x: print(f"Type:{type(x)},Content:{x}"))
    # Using the object_pairs_hook parameter
    json.loads(s_data, object_pairs_hook=lambda x: print(f"Type:{type(x)},Content:{x}"))
    
    Type:<class 'dict'>,Content:{'3': ['third', 'Wang Han'], '1': ['the first', 'Xiao Ming Wang'], '2': ['proxime accessit', 'Xiao Ming'], '4': ['Fourth place', 'Li Hua'], '5': ['Fifth place', 'Li Dachuan'], '6': ['Sixth place', 'xianzhe_']}
    Type:<class 'list'>,Content:[('3', ['third', 'Wang Han']), ('1', ['the first', 'Xiao Ming Wang']), ('2', ['proxime accessit', 'Xiao Ming']), ('4', ['Fourth place', 'Li Hua']), ('5', ['Fifth place', 'Li Dachuan']), ('6', ['Sixth place', 'xianzhe_'])]
    

    As you can see, object_pairs_hook does not receive a sorted list; the pairs arrive in the same order as with object_hook, which is the order in which they appear in the JSON document. "Ordered" here means that the document order is preserved, not that the keys are sorted. However, since the argument passed to object_pairs_hook is a list, you can sort it manually.

    Returning the result sorted by key

    # -*- coding: utf-8 -*-
    import json
    
    s_data = '''{
        "3": ["third", "Wang Han"], "1": ["the first", "Xiao Ming Wang"], "2": ["proxime accessit", "Xiao Ming"],
        "4": ["Fourth place", "Li Hua"], "5": ["Fifth place", "Li Dachuan"], "6": ["Sixth place", "xianzhe_"]
    }'''
    
    
    def sort_hook(dic):
        """Helper that sorts the JSON decoding result.
    
        When used as object_hook, the dict is returned unchanged.
        Args:
            dic: the decoded object (a dict for object_hook, a list of pairs for object_pairs_hook)
        """
        if isinstance(dic, list):
            # The keys are strings, so this is a lexicographic sort
            dic.sort(key=lambda i: i[0])
        return dic
    
    
    # Using object_hook parameters
    print(json.loads(s_data, object_hook=sort_hook))
    # Using object_pairs_hook parameters
    print(json.loads(s_data, object_pairs_hook=sort_hook))
    
    {'3': ['third', 'Wang Han'], '1': ['the first', 'Xiao Ming Wang'], '2': ['proxime accessit', 'Xiao Ming'], '4': ['Fourth place', 'Li Hua'], '5': ['Fifth place', 'Li Dachuan'], '6': ['Sixth place', 'xianzhe_']}
    [('1', ['the first', 'Xiao Ming Wang']), ('2', ['proxime accessit', 'Xiao Ming']), ('3', ['third', 'Wang Han']), ('4', ['Fourth place', 'Li Hua']), ('5', ['Fifth place', 'Li Dachuan']), ('6', ['Sixth place', 'xianzhe_'])]
    

    When both parameters are given, object_pairs_hook takes priority:

    # -*- coding: utf-8 -*-
    import json
    
    s_data = '''{
        "3": ["third", "Wang Han"], "1": ["the first", "Xiao Ming Wang"], "2": ["proxime accessit", "Xiao Ming"],
        "4": ["Fourth place", "Li Hua"], "5": ["Fifth place", "Li Dachuan"], "6": ["Sixth place", "xianzhe_"]
    }'''
    
    
    def hook(dic):
        if isinstance(dic, list):
            return "object_pairs_hook first"
        else:
            return "object_hook first"
    
    
    print(json.loads(s_data, object_hook=hook, object_pairs_hook=hook))
    
    object_pairs_hook first
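
    Beyond printing and sorting, object_hook is typically used to rebuild user-defined objects, as the parameter description mentions. A minimal sketch follows; the Fruit class and the dict_to_fruit helper are hypothetical names introduced only for this illustration.

```python
import json


class Fruit:
    """Hypothetical class used only for this illustration."""

    def __init__(self, name, price):
        self.name = name
        self.price = price


def dict_to_fruit(dic):
    """Turn a decoded dict into a Fruit when it has the expected keys."""
    if set(dic) == {"Fruits", "price"}:
        return Fruit(dic["Fruits"], dic["price"])
    return dic  # leave any other object untouched


text = '[{"Fruits": "Pear", "price": 3.3}, {"Fruits": "Apple", "price": 5.6}]'
fruits = json.loads(text, object_hook=dict_to_fruit)

for fruit in fruits:
    print(fruit.name, fruit.price)
```

    Checking the keys before converting keeps the hook safe for documents that also contain other kinds of objects.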
    

2),json.loads(s, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)

Deserialize s, a string containing a JSON document, into a Python object.

Unlike json.load(), it does not read the JSON-encoded data from a file object; the JSON-encoded string is passed in directly. The other parameters have the same meaning as in json.load(), so the usage is the same and is not explained again here.

The parameters are as follows:

  • s: short for "string"; the str object to be deserialized. A bytes or bytearray instance containing UTF-8, UTF-16 or UTF-32 encoded JSON is also accepted.

Note: if the data being deserialized is not a valid JSON document, a JSONDecodeError is raised.
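
A minimal sketch of handling that error, assuming a made-up input with a trailing comma. The exact message text may differ between Python versions, so only the attribute names are relied on here.

```python
import json

broken = '{"name": "demo", "values": [1, 2,]}'  # trailing comma makes this invalid JSON

try:
    json.loads(broken)
except json.JSONDecodeError as exc:
    # msg, lineno, colno and pos describe where parsing stopped.
    print(f"{exc.msg} (line {exc.lineno}, column {exc.colno}, char {exc.pos})")
    # JSONDecodeError is a subclass of ValueError.
    print(isinstance(exc, ValueError))
```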

5, Encoder and decoder 🥃

1),class json.JSONDecoder(*, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, strict=True, object_pairs_hook=None)

JSONDecoder is the core of JSON decoding in Python: a simple JSON decoder.
When the deserialization functions json.load() and json.loads() are called, most of the arguments passed in are actually forwarded to the decoder, which is why the parameter names match.
It also understands the JSON values NaN, Infinity and -Infinity as their corresponding Python float values.

Source code:

You can see from the source code that the deserialization method of JSON actually calls the JSONDecoder decoder.

Partial source code

	......
    if cls is None:
        cls = JSONDecoder
    if object_hook is not None:
        kw['object_hook'] = object_hook
    if object_pairs_hook is not None:
        kw['object_pairs_hook'] = object_pairs_hook
    if parse_float is not None:
        kw['parse_float'] = parse_float
    if parse_int is not None:
        kw['parse_int'] = parse_int
    if parse_constant is not None:
        kw['parse_constant'] = parse_constant
    return cls(**kw).decode(s)

The parameters are as follows:

Most of the decoder's parameters work the same way as those of the deserialization functions json.load() and json.loads().

  • object_hook: if specified, it should be a function. It is called with the result of every decoded JSON object (a dict). The function takes one argument, the decoded dict.
    Its return value is used in place of the original dict. In short, this function is responsible for converting the decoded basic-type objects into objects of user-defined types. This feature can be used to implement custom decoders (such as JSON-RPC class hinting).
  • parse_float: parses floating-point numbers. If specified, it is called with the string of every JSON float to be decoded. It should be a function that takes one argument, the string of the JSON float. By default this is equivalent to float(num_str).
  • parse_int: parses integers. If specified, it is called with the string of every JSON integer to be decoded. It should be a function that takes one argument, the string of the JSON integer. By default this is equivalent to int(num_str).
  • parse_constant: parses constants. If specified, it is called with one of the following strings: '-Infinity', 'Infinity', 'NaN'. It should be a function that takes one argument, the constant string. It can be used to raise an exception when an invalid JSON number is encountered.
  • strict: strict mode. If False (the default is True), control characters are allowed inside strings. Control characters in this context are those with character codes in the 0-31 range, including '\t' (tab), '\n', '\r' and '\0'.
  • object_pairs_hook: if specified, it is called with the result of every decoded JSON object as an ordered list of (key, value) pairs. It serves the same purpose as object_hook; the difference is that the key-value pairs passed to object_pairs_hook keep their order. Features that depend on the order of the key-value pairs can use object_pairs_hook instead of object_hook.
    If object_hook is also defined, object_pairs_hook takes priority.
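
The strict parameter is the only one not available as a named argument of json.load() and json.loads(), so here is a short sketch of its effect. The sample string is made up for illustration.

```python
import json

# The Python escape \t below puts a real TAB character (code 9) inside the JSON string.
text = '"left\tright"'

try:
    json.JSONDecoder().decode(text)  # strict=True by default
except json.JSONDecodeError as exc:
    print("strict mode rejects it:", exc.msg)

# With strict=False the control character is accepted as part of the string.
lenient = json.JSONDecoder(strict=False).decode(text)
print(repr(lenient))
```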

Methods

  • decode(s)
    Returns the Python representation of s, a str instance containing a JSON document.

  • raw_decode(s)
    Raw decoding. Decodes a JSON document from s (a str beginning with a JSON document) and returns a 2-tuple of the Python representation and the index in s where the document ended.
    This can be used to decode a JSON document from a string that may have extraneous data at the end.
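
A short sketch of the difference between the two methods, using a made-up string with extra text after the JSON document:

```python
import json

decoder = json.JSONDecoder()
text = '{"a": 1} trailing notes'  # made-up input with extra text at the end

# raw_decode() returns the decoded object and the index where the document ended.
obj, end = decoder.raw_decode(text)
print(obj, end)
print(repr(text[end:]))

# decode() insists that nothing but whitespace follows the document.
try:
    decoder.decode(text)
except json.JSONDecodeError as exc:
    print("decode() fails:", exc.msg)
```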

2),class json.JSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)

JSONEncoder is the core of JSON encoding in Python: an extensible JSON encoder for Python data structures.
When the serialization functions json.dump() and json.dumps() are called, most of the arguments passed in are actually forwarded to the encoder, which is why the parameter names match.

To extend it to recognize other objects, subclass it and implement a default() method that returns a serializable object for o if possible; otherwise it should call the superclass implementation (to raise TypeError).
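
A minimal sketch of such a subclass. The DateEncoder name and the choice of ISO 8601 strings for dates are assumptions made for this illustration.

```python
import json
from datetime import date


class DateEncoder(json.JSONEncoder):
    """Hypothetical encoder that writes dates as ISO 8601 strings."""

    def default(self, o):
        if isinstance(o, date):
            return o.isoformat()
        # Let the base class raise TypeError for anything else.
        return super().default(o)


payload = {"event": "Spring Festival", "day": date(2022, 2, 1)}  # made-up data
encoded = json.dumps(payload, cls=DateEncoder)
print(encoded)
```

The subclass is passed through the cls parameter of json.dump() or json.dumps().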

Source code:

You can see from the source code that the serialization functions of json actually call the JSONEncoder encoder.

Partial source code

	......
    # cached encoder
    if (not skipkeys and ensure_ascii and
        check_circular and allow_nan and
        cls is None and indent is None and separators is None and
        default is None and not sort_keys and not kw):
        return _default_encoder.encode(obj)
    if cls is None:
        cls = JSONEncoder
    return cls(
        skipkeys=skipkeys, ensure_ascii=ensure_ascii,
        check_circular=check_circular, allow_nan=allow_nan, indent=indent,
        separators=separators, default=default, sort_keys=sort_keys,
        **kw).encode(obj)
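
The relationship shown in the source excerpt can be checked directly: calling json.dumps() with some options should give the same text as building a JSONEncoder with those options and calling encode(). The data and option values below are made up for illustration.

```python
import json

data = {"b": [1, 2], "a": None}             # made-up sample data
options = {"sort_keys": True, "indent": 2}  # any encoder options would do

via_function = json.dumps(data, **options)
via_class = json.JSONEncoder(**options).encode(data)
print(via_function == via_class)

# iterencode() yields the same text in chunks, which suits streaming output.
chunks = list(json.JSONEncoder(**options).iterencode(data))
print("".join(chunks) == via_class)
```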

The parameters are as follows:

  • skipkeys: when True (False by default), dictionary keys that are not of a basic type (str, int, float, bool, None) are skipped. Otherwise, a TypeError is raised for such keys.
  • ensure_ascii: when True (the default), all non-ASCII characters in the output are guaranteed to be escaped. When False, these characters are output as is.
  • check_circular: when False (True by default), the circular reference check for container types is skipped, and a circular reference will result in a RecursionError (or worse).
  • allow_nan: when False (True by default), a ValueError is raised when serializing out-of-range float values (nan, inf, and -inf), in strict compliance with the JSON specification. When True, their JavaScript equivalents (NaN, Infinity, and -Infinity) are used.
  • sort_keys: when True (False by default), dictionary output is sorted by key.
  • indent: controls indentation of the output.
    a. The default value None selects the most compact representation, that is, a single line.
    b. If indent is a non-negative integer or a string, JSON array elements and object members are pretty-printed with that indent level. A positive integer indents each level by that many spaces;
       if indent is 4, each level is indented with 4 spaces. If indent is a string (such as "\t"), that string is used to indent each level.
    c. If indent is 0, negative, or "", only newlines are inserted, with no indentation before each item.
  • separators: the separator strings. Because two kinds of separators are used, it should be an (item_separator, key_separator) tuple.
       The default depends on indent: when indent is None, the default is (', ', ': '), so items on the single output line are separated by a comma and a space; otherwise the default is (',', ': '), since each item ends its line and a trailing space would be redundant.
       tip: to get the most compact JSON representation, leave indent as None and set separators to (',', ':'), which removes all whitespace.
  • default: when specified, it should be a function. It will be called whenever an object cannot be serialized. This function accepts a parameter, which is the object that cannot be serialized.
    It should return a version of the object that can be encoded by JSON or raise a TypeError. If not specified, TypeError will be thrown directly.
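
The parameters above can be tried one at a time through json.dumps(), which forwards them to the encoder. This is a minimal sketch with made-up sample values; complex_to_list is a hypothetical helper written for the default parameter.

```python
import json

# skipkeys: the tuple key is not a basic type, so it is dropped instead of raising TypeError.
skipped = json.dumps({"a": 1, (1, 2): "x"}, skipkeys=True)

# ensure_ascii: non-ASCII characters are escaped unless it is turned off.
escaped = json.dumps("é")
unescaped = json.dumps("é", ensure_ascii=False)

# allow_nan: NaN is written by default, but rejected when allow_nan is False.
nan_text = json.dumps(float("nan"))
try:
    json.dumps(float("inf"), allow_nan=False)
    strict_rejected = False
except ValueError:
    strict_rejected = True

# sort_keys: dictionary output is ordered by key.
sorted_text = json.dumps({"b": 2, "a": 1}, sort_keys=True)

# indent and separators: pretty-printed versus most compact output.
pretty = json.dumps({"a": [1, 2]}, indent=2)
compact = json.dumps({"a": [1, 2]}, separators=(",", ":"))


def complex_to_list(o):
    """Hypothetical fallback: represent a complex number as [real, imag]."""
    if isinstance(o, complex):
        return [o.real, o.imag]
    raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")


# default: called only for objects the encoder cannot serialize by itself.
with_default = json.dumps({"z": 1 + 2j}, default=complex_to_list)

for text in (skipped, escaped, unescaped, nan_text, sorted_text, pretty, compact, with_default):
    print(text)
```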

Method function

  • default(o)
    Implement this method in a subclass so that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

    For example, to support arbitrary iterators, you could implement default() like this:

    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return json.JSONEncoder.default(self, o)

  • encode(o)
    Returns a JSON string representation of the Python data structure o. For example:

    >>> json.JSONEncoder().encode({"foo": ["bar", "baz"]})
    '{"foo": ["bar", "baz"]}'
    
  • iterencode(o)
    Encodes the given object o and yields each string representation as it becomes available. For example:

    for chunk in json.JSONEncoder().iterencode(bigobject):
        mysocket.write(chunk)
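
The three methods above can be combined into one runnable sketch. IterableEncoder is a hypothetical class name chosen for this illustration, and the sample data is made up; the default() body follows the iterator example shown earlier.

```python
import json


class IterableEncoder(json.JSONEncoder):
    """Hypothetical subclass: serializes any iterable object as a JSON array."""

    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class raise TypeError for unsupported objects.
        return super().default(o)


encoder = IterableEncoder()

# range and frozenset are iterable but not handled by the stock encoder.
data = {"numbers": range(4), "tags": frozenset({"x"})}

# encode() returns the whole document as a single string.
encoded = encoder.encode(data)

# iterencode() yields the same document piece by piece, which is useful
# when writing to a file or socket without building one large string.
chunks = list(encoder.iterencode(data))

print(encoded)
print("".join(chunks) == encoded)

# An object that is not iterable still ends up in the base class TypeError.
try:
    encoder.encode(object())
    unsupported_rejected = False
except TypeError:
    unsupported_rejected = True
print(unsupported_rejected)
```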
    

6, Exceptions 🧃

exception json.JSONDecodeError(msg, doc, pos)

The parameters are as follows:

  • msg
    Unformatted error message
  • doc
    JSON document being parsed.
  • pos
    The start index in doc where parsing failed.

Source code:

The source code is relatively simple, so it is reproduced here.

class JSONDecodeError(ValueError):
    """Subclass of ValueError with the following additional properties:

    msg: The unformatted error message
    doc: The JSON document being parsed
    pos: The start index of doc where parsing failed
    lineno: The line corresponding to pos
    colno: The column corresponding to pos

    """
    # Note that this exception is used from _json
    def __init__(self, msg, doc, pos):
        lineno = doc.count('\n', 0, pos) + 1
        colno = pos - doc.rfind('\n', 0, pos)
        errmsg = '%s: line %d column %d (char %d)' % (msg, lineno, colno, pos)
        ValueError.__init__(self, errmsg)
        self.msg = msg
        self.doc = doc
        self.pos = pos
        self.lineno = lineno
        self.colno = colno

    def __reduce__(self):
        return self.__class__, (self.msg, self.doc, self.pos)

Demo code:

Raising this exception manually is rarely needed in practice. The demonstration below is mainly useful for understanding how the source code computes the line and column.

# -*- coding: utf-8 -*-
import json

err_json = "{key: {\n'key2':value1}}}key3:value"
# pos is set to len(err_json), so the error is reported at the end of the document
raise json.JSONDecodeError("json data in wrong format", err_json, len(err_json))

Output:

json.decoder.JSONDecodeError: json data in wrong format: line 2 column 27 (char 34)
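
In everyday code the exception is usually caught rather than raised. The following is a minimal sketch with a made-up malformed document; it only reads the attributes listed in the docstring above and re-checks how lineno and colno are derived from pos.

```python
import json

# Made-up malformed input: the value for "b" is missing on the second line.
bad_json = '{"a": 1,\n "b": }'

caught = None
try:
    json.loads(bad_json)
except json.JSONDecodeError as exc:
    caught = exc

# JSONDecodeError is a subclass of ValueError.
print(isinstance(caught, ValueError))

# The unformatted message and the location of the failure.
print(caught.msg)
print(caught.pos, caught.lineno, caught.colno)

# lineno and colno follow from pos exactly as computed in __init__.
expected_lineno = bad_json.count("\n", 0, caught.pos) + 1
expected_colno = caught.pos - bad_json.rfind("\n", 0, caught.pos)
print(caught.lineno == expected_lineno, caught.colno == expected_colno)
```

Catching ValueError also works because of the inheritance, but catching JSONDecodeError gives access to the position attributes.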

Keywords: Python JSON

Added by schilly on Thu, 03 Feb 2022 15:23:38 +0200