Introduction: These are learning notes on models and layers in Paddle. They record preliminary tests and related study of how layers are built and run in Paddle.
Key words: Layer, Paddle
§ 01 Model and layer
The model is one of the important concepts in deep learning. Its core function is to map one group of input variables to another group of output variables through a series of calculations; this mapping function represents a deep learning algorithm. In the Paddle framework, a model includes the following two aspects:
- A combination of a series of layers for mapping (forward execution)
- Some parameter variables that are updated continuously during training
In this document, you will learn how to define and use a Paddle model, and understand the relationship between the model and the layer.
1, Define models and layers in Paddle
In Paddle, most models are composed of a series of layers, and the layer is the basic logical execution unit of a model. A layer holds two kinds of things:
- On the one hand, the variables needed for the calculation, held as members of the layer in the form of temporary variables or parameters
- On the other hand, one or more specific operators that carry out the corresponding calculation.
1. Model and layer
Building layers and models from scratch out of variables and operators is a very complex process, and a lot of redundant code is hard to avoid. Paddle therefore provides the base class paddle.nn.Layer so that you can quickly implement your own layers and models. Both models and layers can be implemented by extending paddle.nn.Layer, so a model can also be regarded as just a special layer. The following demonstrates how to build your own model with paddle.nn.Layer:
import paddle

class Model(paddle.nn.Layer):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = paddle.nn.Flatten()

    def forward(self, inputs):
        y = self.flatten(inputs)
        return y
In this example, a model class Model is built by inheriting from paddle.nn.Layer. It contains only one paddle.nn.Flatten layer. When the model is executed, the input variable is passed through the paddle.nn.Flatten layer.
2. Test case
x = paddle.to_tensor([[1,2,3],[4,5,6]])
print(x)
model = Model()
y = model(x)
print(y)
Tensor(shape=[2, 3], dtype=int64, place=CPUPlace, stop_gradient=True,
       [[1, 2, 3],
        [4, 5, 6]])
Tensor(shape=[2, 3], dtype=int64, place=CPUPlace, stop_gradient=True,
       [[1, 2, 3],
        [4, 5, 6]])
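What is going on here? Why was the result not flattened by the layer? The reason is that paddle.nn.Flatten by default uses start_axis=1 and stop_axis=-1, so it keeps the first dimension and only merges the remaining ones; a [2, 3] input therefore keeps its shape.
Modify the previous code so that the flattening is done through NumPy instead: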
class Model(paddle.nn.Layer):
    def __init__(self):
        super(Model, self).__init__()

    def forward(self, inputs):
        y = self.Flatten(inputs)
        return y

    def Flatten(self, x):
        # Flatten every dimension by going through NumPy
        xx = x.numpy().flatten()
        return paddle.to_tensor(xx)

x = paddle.to_tensor([[1,2,3],[4,5,6]])
print(x)
model = Model()
y = model(x)
print(y)
You can now get the desired result:
Tensor(shape=[2, 3], dtype=int64, place=CPUPlace, stop_gradient=True,
       [[1, 2, 3],
        [4, 5, 6]])
Tensor(shape=[6], dtype=int64, place=CPUPlace, stop_gradient=True,
       [1, 2, 3, 4, 5, 6])
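The shapes in the two tests above can be worked out by hand. The sketch below is plain Python, not Paddle code: flattened_shape is a hypothetical helper written for these notes, and it assumes the documented defaults of paddle.nn.Flatten (start_axis=1, stop_axis=-1), which merge only the dimensions in that range.

```python
from math import prod


def flattened_shape(shape, start_axis=1, stop_axis=-1):
    """Return the shape after merging dimensions start_axis..stop_axis.

    Hypothetical helper; the default axes are an assumption taken from
    the documented defaults of paddle.nn.Flatten.
    """
    ndim = len(shape)
    start = start_axis % ndim   # turn negative axes into positive ones
    stop = stop_axis % ndim
    merged = prod(shape[start:stop + 1])
    return list(shape[:start]) + [merged] + list(shape[stop + 1:])


print(flattened_shape([2, 3]))                # first dimension is kept
print(flattened_shape([2, 3], start_axis=0))  # every dimension is merged
print(flattened_shape([1, 3, 256, 256]))      # image-like input
```

Under that assumption, building the layer as paddle.nn.Flatten(start_axis=0) would be another way to merge all dimensions without leaving Paddle through NumPy.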
2, Sublayer interface
If you want to access or modify a layer defined in a model, you can call the sublayer-related interfaces.
1. View sublayers
Take the simple model created above as an example. If you want to view all sublayers defined in the model:
class Model(paddle.nn.Layer):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = paddle.nn.Flatten()

    def forward(self, inputs):
        y = self.flatten(inputs)
        return y

model = Model()
print(model.sublayers())
print("----------------------")
for item in model.named_sublayers():
    print(item)
[Flatten()]
----------------------
('flatten', Flatten())
As you can see, calling the model.sublayers() interface prints all sublayers held in the model (at this point the model has only one paddle.nn.Flatten sublayer).
Iterating over model.named_sublayers() yields, in each round of the loop, a tuple of the sublayer name ('flatten') and the sublayer object (paddle.nn.Flatten).
Next, add one more layer to self to see more sublayers.
class Model(paddle.nn.Layer):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = paddle.nn.Flatten()
        self.f1 = paddle.nn.Flatten()

    def forward(self, inputs):
        y = self.flatten(inputs)
        return y

model = Model()
print(model.sublayers())
print("----------------------")
for item in model.named_sublayers():
    print(item)
[Flatten(), Flatten()]
----------------------
('flatten', Flatten())
('f1', Flatten())
2. Add sublayer
Next, if you want to add a further sublayer, you can call the add_sublayer() interface:
fc = paddle.nn.Linear(10, 3)
model.add_sublayer("fc", fc)
print(model.sublayers())
[Flatten(), Linear(in_features=10, out_features=3, dtype=float32)]
You can see that model.add_sublayer() adds a paddle.nn.Linear sublayer to the model, so the model now holds two sublayers, paddle.nn.Flatten and paddle.nn.Linear.
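The bookkeeping behind these interfaces can be pictured with a small framework-free sketch. ToyLayer, ToyFlatten, ToyLinear and ToyModel are hypothetical names invented here, and the registry logic is an assumption about how such a mechanism could work, not Paddle's actual implementation. Unlike a real framework, the toy only tracks direct children.

```python
class ToyLayer:
    """Toy stand-in for a layer base class; not Paddle's implementation."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the registry itself.
        object.__setattr__(self, "_sub_layers", {})

    def __setattr__(self, name, value):
        # Assigning a layer to an attribute also registers it as a sublayer.
        if isinstance(value, ToyLayer):
            self._sub_layers[name] = value
        object.__setattr__(self, name, value)

    def add_sublayer(self, name, layer):
        # Explicit registration ends up in the same registry.
        setattr(self, name, layer)
        return layer

    def named_sublayers(self):
        # Yield (name, layer) pairs in registration order.
        yield from self._sub_layers.items()

    def sublayers(self):
        return [layer for _, layer in self.named_sublayers()]


class ToyFlatten(ToyLayer):
    def __repr__(self):
        return "ToyFlatten()"


class ToyLinear(ToyLayer):
    def __repr__(self):
        return "ToyLinear()"


class ToyModel(ToyLayer):
    def __init__(self):
        super().__init__()
        self.flatten = ToyFlatten()  # registered through __setattr__


model = ToyModel()
model.add_sublayer("fc", ToyLinear())
print(model.sublayers())
print(list(model.named_sublayers()))
```

The point of the sketch is that attribute assignment in __init__ and an explicit add_sublayer() call both feed one ordered registry, which is why the listing interfaces see layers added either way.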
3. Modify sublayer
Thousands of sublayers can be added to a model through the methods above. When a model contains a large number of sublayers, how can all of them be modified efficiently?
Paddle provides the apply() interface. With it, you can define a custom function and then apply that function to all sublayers in a batch:
def function(layer):
    print(layer)

model.apply(function)
Flatten()
Linear(in_features=10, out_features=3, dtype=float32)
Model(
  (flatten): Flatten()
  (fc): Linear(in_features=10, out_features=3, dtype=float32)
)
In this example, a function named function is defined that takes a layer as its parameter and prints the layer it receives. Calling the model.apply() interface applies function to every sublayer of the model and, as the last entry of the output shows, to the model itself, so the information of all sublayers is printed.
Another pair of interfaces for batch access to sublayers is children() and named_children(). Both access each sublayer through an iterator:
sublayer_iter = model.children()
for sublayer in sublayer_iter:
    print(sublayer)
Flatten()
Linear(in_features=10, out_features=3, dtype=float32)
As you can see, when iterating over model.children(), each round of the loop yields the corresponding paddle.nn.Layer object in the order in which the sublayers were registered.
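The visiting order seen in the two outputs above (sublayers first and the model last for apply(), direct sublayers only for children()) can be reproduced with a short framework-free sketch. ToyLayer and its add() method are hypothetical, and the traversal is an assumption that merely matches the printed order; it is not taken from Paddle's source.

```python
class ToyLayer:
    """Toy layer that only tracks a name and its direct children."""

    def __init__(self, name):
        self.name = name
        self._children = []

    def add(self, child):
        # Hypothetical helper: register a direct child in order.
        self._children.append(child)
        return child

    def children(self):
        # Iterator over direct children, in registration order.
        return iter(self._children)

    def apply(self, fn):
        # Visit every child (recursively) first, then the layer itself.
        for child in self.children():
            child.apply(fn)
        fn(self)
        return self


model = ToyLayer("model")
model.add(ToyLayer("flatten"))
model.add(ToyLayer("fc"))

visited = []
model.apply(lambda layer: visited.append(layer.name))
print(visited)
print([child.name for child in model.children()])
```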
3, Variable members in layers
1. Parameter variable addition and modification
Sometimes you want to add a parameter to the network to serve as the input. For example, in an image style transfer model, a parameter is used as the input image, and this image parameter is updated continuously during training so that the style-transferred image is finally obtained.
In that case you can combine create_parameter() and add_parameter() to create and register the parameter:
class Model(paddle.nn.Layer):
    def __init__(self):
        super(Model, self).__init__()
        img = self.create_parameter([1,3,256,256])
        self.add_parameter("img", img)
        self.flatten = paddle.nn.Flatten()

    def forward(self):
        y = self.flatten(self.img)
        return y
The example above creates a parameter named "img" and adds it to the model. You can then access this parameter directly as model.img.
The added parameters can be accessed with parameters() or named_parameters():
model = Model()
print(model.parameters())
print('--------------------------------------------')
for item in model.named_parameters():
    print(item)
[Parameter containing:
Tensor(shape=[1, 3, 256, 256], dtype=float32, place=CPUPlace, stop_gradient=False,
       [[[[-0.00323536, 0.00417978, 0.00387184, ..., -0.00263438, 0.00336105, -0.00079275],
          [-0.00398997, 0.00305213, 0.00338405, ..., 0.00321609, 0.00385862, 0.00383085],
          [ 0.00456822, 0.00335924, -0.00396630, ..., -0.00260351, 0.00388722, 0.00292703],
          ...,
          ...,
          [-0.00302772, -0.00052290, -0.00259735, ..., 0.00325148, 0.00051726, 0.00464376],
          [ 0.00238924, -0.00105374, 0.00219904, ..., -0.00279356, -0.00214116, -0.00319181],
          [ 0.00180969, 0.00476100, 0.00380237, ..., 0.00249749, 0.00374650, 0.00050141]]]])]
--------------------------------------------
('img', Parameter containing:
Tensor(shape=[1, 3, 256, 256], dtype=float32, place=CPUPlace, stop_gradient=False,
       [[[[-0.00323536, 0.00417978, 0.00387184, ..., -0.00263438, 0.00336105, -0.00079275],
          [-0.00398997, 0.00305213, 0.00338405, ..., 0.00321609, 0.00385862, 0.00383085],
          [ 0.00456822, 0.00335924, -0.00396630, ..., -0.00260351, 0.00388722, 0.00292703],
          ...,
          ...,
          [-0.00302772, -0.00052290, -0.00259735, ..., 0.00325148, 0.00051726, 0.00464376],
          [ 0.00238924, -0.00105374, 0.00219904, ..., -0.00279356, -0.00214116, -0.00319181],
          [ 0.00180969, 0.00476100, 0.00380237, ..., 0.00249749, 0.00374650, 0.00050141]]]]))
As you can see, model.parameters() returns all parameters in the model as a list.
During actual model training, when the backward pass is executed, Paddle calculates the gradient of each parameter in the model and saves it in the corresponding parameter object. Once the parameters have been updated with the gradients, or if for some reason you do not want the gradients to accumulate into the next round of training, you can call clear_gradients() to clear these gradient values.
model = Model()
out = model()
out.backward()
model.clear_gradients()
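Why clearing matters is easiest to see with numbers. The following is a framework-free toy, assuming (as the paragraph above implies) that each backward pass adds to the gradient already stored on a parameter. ToyParam and the standalone backward() function are hypothetical names for illustration only.

```python
class ToyParam:
    """Toy parameter holding a value and an accumulated gradient."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0

    def clear_gradient(self):
        self.grad = 0.0


def backward(param, x):
    # For y = param.value * x the derivative dy/dparam is x.
    # The new gradient is added to whatever is already stored.
    param.grad += x


w = ToyParam(2.0)

backward(w, 3.0)
backward(w, 3.0)               # no clearing in between: gradients add up
grad_without_clear = w.grad

w.clear_gradient()
backward(w, 3.0)               # after clearing: only the latest gradient
grad_after_clear = w.grad

print(grad_without_clear, grad_after_clear)
```

In a training loop, this is the reason the clearing call usually sits right after the optimizer step.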
2. Addition of non-parameter variables
Parameter variables usually need to take part in gradient updates, but in many cases all that is needed is a temporary variable or even a constant. For example, to save an intermediate variable during model execution, you need to call the create_tensor() interface:
class Model(paddle.nn.Layer):
    def __init__(self):
        super(Model, self).__init__()
        self.saved_tensor = self.create_tensor(name="saved_tensor0")
        self.flatten = paddle.nn.Flatten()
        self.fc = paddle.nn.Linear(10, 100)

    def forward(self, input):
        y = self.flatten(input)
        # Save intermediate tensor
        paddle.assign(y, self.saved_tensor)
        y = self.fc(y)
        return y

model = Model()
print(model.sublayers())
[Flatten()]
Here self.create_tensor() is called to create a temporary variable, which is recorded in the model's self.saved_tensor. When the model executes, paddle.assign is called to record the value of variable y in this temporary variable.
3. Addition of Buffer variable
The concept of a Buffer only affects the conversion from dynamic graph to static graph. In the previous section, a temporary variable was created to hold the value of an intermediate variable. However, this temporary variable is not recorded in the static computation graph during dynamic-to-static conversion. If you want this variable to become part of the static graph, you additionally need to call the register_buffer() interface:
class Model(paddle.nn.Layer):
    def __init__(self):
        super(Model, self).__init__()
        saved_tensor = self.create_tensor(name="saved_tensor0")
        self.register_buffer("saved_tensor", saved_tensor, persistable=True)
        self.flatten = paddle.nn.Flatten()
        self.fc = paddle.nn.Linear(10, 100)

    def forward(self, input):
        y = self.flatten(input)
        # Save intermediate tensor
        paddle.assign(y, self.saved_tensor)
        y = self.fc(y)
        return y
In this way, saved_tensor will be recorded in the static graph when the dynamic graph is converted to a static graph.
The buffers registered in the model can be accessed with buffers() or named_buffers():
model = Model()
print(model.buffers())
for item in model.named_buffers():
    print(item)
[Tensor(Not initialized)]
('saved_tensor', Tensor(Not initialized))
You can see that model.buffers() returns all buffers registered in the model as a list.
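The difference between parameters and buffers can be pictured as two separate registries. The sketch below is a framework-free toy with hypothetical names (ToyLayer and its methods only mimic the interface names used above). It also assumes that the persistable flag decides whether a buffer is included in the saved state; treat that as an assumption of this sketch rather than a statement about Paddle's internals.

```python
class ToyLayer:
    """Toy layer keeping parameters and buffers in separate registries."""

    def __init__(self):
        self._parameters = {}
        self._buffers = {}
        self._non_persistable = set()

    def add_parameter(self, name, value):
        self._parameters[name] = value

    def register_buffer(self, name, value, persistable=True):
        self._buffers[name] = value
        if not persistable:
            self._non_persistable.add(name)

    def parameters(self):
        return list(self._parameters.values())

    def buffers(self):
        return list(self._buffers.values())

    def state_dict(self):
        # Assumption: parameters plus persistable buffers are saved.
        state = dict(self._parameters)
        for name, value in self._buffers.items():
            if name not in self._non_persistable:
                state[name] = value
        return state


layer = ToyLayer()
layer.add_parameter("weight", [0.1, 0.2])
layer.register_buffer("saved_tensor", [1.0, 2.0], persistable=True)
layer.register_buffer("scratch", [0.0], persistable=False)

print(layer.parameters())          # only the parameter
print(layer.buffers())             # both buffers, in registration order
print(sorted(layer.state_dict()))  # the non-persistable buffer is left out
```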
4, Functions of execution layer
After this series of model configurations, suppose a Paddle model is ready, as follows:
class Model(paddle.nn.Layer):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = paddle.nn.Flatten()

    def forward(self, inputs):
        y = self.flatten(inputs)
        return y
To execute the model, you first need to set the execution mode.
1. Execution mode setting
A model has two execution modes. If training is required, call train(); if only forward execution is performed, call eval():
x = paddle.randn([10, 1], 'float32')
model = Model()
model.eval()   # set model to eval mode
out = model(x)
model.train()  # set model to train mode
out = model(x)
Here the execution mode of the model is set to eval and then to train. The two modes are mutually exclusive, and a new mode setting overwrites the previous one.
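One way to picture the mode switch is a single boolean flag that is pushed down to every sublayer. The sketch below is framework-free; ToyLayer, ToyScale and the 0.5 factor are hypothetical and chosen only to make the two modes give visibly different results, in the way that layers such as dropout behave differently in training and evaluation.

```python
class ToyLayer:
    """Toy layer with a training flag shared with its children."""

    def __init__(self):
        self.training = True
        self._children = []

    def add(self, child):
        self._children.append(child)
        return child

    def train(self):
        self._set_mode(True)

    def eval(self):
        self._set_mode(False)

    def _set_mode(self, training):
        # One flag, so the two modes are mutually exclusive by design.
        self.training = training
        for child in self._children:
            child._set_mode(training)


class ToyScale(ToyLayer):
    """Hypothetical layer whose forward pass depends on the mode."""

    def __call__(self, values):
        factor = 0.5 if self.training else 1.0
        return [factor * v for v in values]


model = ToyLayer()
scale = model.add(ToyScale())

model.eval()
eval_out = scale([1.0, 2.0])

model.train()   # overrides the earlier eval() setting
train_out = scale([1.0, 2.0])

print(eval_out, train_out)
```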
2. Execution function
Once the mode is set, the execution function can be called directly. You can call the forward() method directly for forward execution, or call __call__() (that is, call the model object itself) to execute the forward calculation logic defined in forward().
class Model(paddle.nn.Layer):
    def __init__(self):
        super(Model, self).__init__()
        self.flatten = paddle.nn.Flatten()

    def forward(self, inputs):
        y = self.Flatten(inputs)
        return y

    def Flatten(self, inputs):
        return paddle.to_tensor(inputs.numpy().flatten())

model = Model()
x = paddle.randn([10, 1], 'float32')
out = model(x)
print(out)
Tensor(shape=[10], dtype=float32, place=CPUPlace, stop_gradient=True,
       [-0.26968753, -2.34697795, 0.87075204, 1.20670414, 2.26653862, 0.25821996, 0.70133287, 1.44512081, 0.96671742, 0.96629554])
Here the model object is called directly, which goes through the __call__() method to run the model's forward execution logic.
3. Add hook function
Sometimes you want certain variables to be preprocessed before they enter a layer. This can be done by registering a hook. A hook is a custom function that acts on variables and is called when the model executes. Hook functions registered on a layer are of two kinds, pre_hook and post_hook. A pre_hook can process the input variables of the layer, and its return value is used as the new input for the layer's calculation. A post_hook can process the output variables of the layer, and after this further processing its return value is used as the output of the layer.
Through the register_forward_post_hook() interface, a post_hook can be registered:
def forward_post_hook(layer, input, output):
    return 2*output

x = paddle.ones([10, 1], 'float32')
model = Model()
forward_post_hook_handle = model.flatten.register_forward_post_hook(forward_post_hook)
out = model(x)
print(out)
Tensor(shape=[10, 1], dtype=float32, place=CPUPlace, stop_gradient=True,
       [[2.],
        [2.],
        [2.],
        [2.],
        [2.],
        [2.],
        [2.],
        [2.],
        [2.],
        [2.]])
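The order in which hooks and forward() run can be sketched without any framework. In the toy below, ToyLayer and ToyIdentity are hypothetical, and the __call__ logic is an assumption that follows the description above (pre_hook result replaces the input, post_hook result replaces the output). The toy passes a single input value, whereas a real framework may hand the hook a tuple of inputs.

```python
class ToyLayer:
    """Toy layer whose __call__ wraps forward() with hooks."""

    def __init__(self):
        self._pre_hooks = []
        self._post_hooks = []

    def register_forward_pre_hook(self, hook):
        self._pre_hooks.append(hook)

    def register_forward_post_hook(self, hook):
        self._post_hooks.append(hook)

    def forward(self, x):
        raise NotImplementedError

    def __call__(self, x):
        # 1. pre-hooks may replace the input
        for hook in self._pre_hooks:
            result = hook(self, x)
            if result is not None:
                x = result
        # 2. the layer's own computation
        out = self.forward(x)
        # 3. post-hooks may replace the output
        for hook in self._post_hooks:
            result = hook(self, x, out)
            if result is not None:
                out = result
        return out


class ToyIdentity(ToyLayer):
    def forward(self, x):
        return list(x)


def add_one_pre_hook(layer, input):
    return [v + 1 for v in input]


def double_post_hook(layer, input, output):
    return [2 * v for v in output]


layer = ToyIdentity()
layer.register_forward_pre_hook(add_one_pre_hook)
layer.register_forward_post_hook(double_post_hook)
out = layer([1, 1])
print(out)
```

The sketch also shows why calling the object is preferred over calling forward() directly: in this toy, only __call__ runs the hooks.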
5, Save model parameters
If you want to save the parameters of a model without storing the model itself, you can first call the state_dict() interface, which stores the parameters and persistent variables of the model in a Python dictionary, and then save that dictionary.
model = Model()
state_dict = model.state_dict()
paddle.save(state_dict, "./paddle_dy.pdparams")
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/framework/io.py:729: UserWarning: The input state dict is empty, no need to save.
  warnings.warn("The input state dict is empty, no need to save.")
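The warning appears because this model contains only a Flatten layer, which holds no parameters, so its state dict is empty. When running the second time, there is no such prompt.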
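The empty dictionary can be reproduced with a framework-free sketch that gathers parameters from nested layers into one flat dictionary. ToyLayer, ToyFlatten and ToyLinear are hypothetical, and the dotted key naming is an assumption made for illustration.

```python
class ToyLayer:
    """Toy layer that collects parameters from itself and its sublayers."""

    def __init__(self):
        self._parameters = {}
        self._sub_layers = {}

    def state_dict(self, prefix=""):
        # Own parameters first, then each sublayer under a dotted prefix.
        state = {prefix + name: value
                 for name, value in self._parameters.items()}
        for name, layer in self._sub_layers.items():
            state.update(layer.state_dict(prefix + name + "."))
        return state


class ToyFlatten(ToyLayer):
    """Reshapes only, so it owns no parameters."""


class ToyLinear(ToyLayer):
    def __init__(self, in_features, out_features):
        super().__init__()
        self._parameters["weight"] = [[0.0] * out_features
                                      for _ in range(in_features)]
        self._parameters["bias"] = [0.0] * out_features


flatten_only = ToyLayer()
flatten_only._sub_layers["flatten"] = ToyFlatten()
print(flatten_only.state_dict())          # nothing to save

with_linear = ToyLayer()
with_linear._sub_layers["flatten"] = ToyFlatten()
with_linear._sub_layers["fc"] = ToyLinear(10, 3)
print(sorted(with_linear.state_dict()))   # the linear layer contributes entries
```

Adding a layer that owns parameters, such as the Linear layer used earlier in these notes, would give the real model something to save.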
※ Summary ※
These are learning notes on models and layers in Paddle. They record preliminary tests and related study of how layers are built and run in Paddle.