Building network models in deep learning
1, Building a model by inheriting the Module class
The Module class, provided in the nn module, is a model construction class and the base class of all neural network modules; we can inherit it to define the model we want. The following code inherits the Module class to construct the multi-layer perceptron mentioned at the beginning of this section. The MLP class defined here overrides the Module class's __init__ and forward functions, which are used to create the model parameters and to define the forward computation, respectively. The forward computation is also called forward propagation.
import torch
from torch import nn

class MLP(nn.Module):
    # Declare a layer with model parameters; here two fully connected layers are declared
    def __init__(self, **kwargs):
        # Call the constructor of the parent class Module to perform the necessary initialization.
        # In this way, other function parameters can also be specified when constructing an instance,
        # such as the model parameter params described in the "access, initialization and sharing of model parameters" section
        super(MLP, self).__init__(**kwargs)
        self.hidden = nn.Linear(784, 256)  # Hidden layer
        self.act = nn.ReLU()
        self.output = nn.Linear(256, 10)   # Output layer

    # Define the forward computation of the model, i.e. how to compute and return the required model output from the input x
    def forward(self, x):
        a = self.act(self.hidden(x))
        return self.output(a)
We can instantiate the MLP class to obtain the model variable net. The following code initializes net and passes the input data X to it for a forward computation. Here net(X) calls the __call__ function that MLP inherits from the Module class, which in turn calls the forward function defined by the MLP class to complete the forward computation.
X = torch.rand(2, 784)
net = MLP()
print(net)
net(X)
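For reference, print(net) displays the structure of the registered submodules as shown below; net(X) returns a tensor of shape (2, 10), whose values depend on the random input and parameter initialization.

MLP(
  (hidden): Linear(in_features=784, out_features=256, bias=True)
  (act): ReLU()
  (output): Linear(in_features=256, out_features=10, bias=True)
)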
2, Subclasses of Module
1. Sequential class
When the forward computation of the model simply chains each layer in turn, the Sequential class can define the model in a simpler way. This is the purpose of the Sequential class: it can receive an ordered dictionary (OrderedDict) of submodules, or a series of submodules, as arguments and add the Module instances one by one; the forward computation of the model then applies these instances one by one in the order they were added. For example, the following AlexNet construction:
class AlexNet(nn.Module):  # Training AlexNet
    '''
    5 convolutional layers, 3 fully connected layers
    '''
    def __init__(self):
        super(AlexNet, self).__init__()
        # Five convolutions; input 32 * 32 * 3
        self.conv1 = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=6, kernel_size=3, stride=1, padding=1),  # (32-3+2)/1+1 = 32
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=0)  # (32-2)/2+1 = 16
        )
        self.conv2 = nn.Sequential(  # Input 16 * 16 * 6
            nn.Conv2d(in_channels=6, out_channels=16, kernel_size=3, stride=1, padding=1),  # (16-3+2)/1+1 = 16
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=0)  # (16-2)/2+1 = 8
        )
        self.conv3 = nn.Sequential(  # Input 8 * 8 * 16
            nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, stride=1, padding=1),  # (8-3+2)/1+1 = 8
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=0)  # (8-2)/2+1 = 4
        )
        self.conv4 = nn.Sequential(  # Input 4 * 4 * 32
            nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=1, padding=1),  # (4-3+2)/1+1 = 4
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=0)  # (4-2)/2+1 = 2
        )
        self.conv5 = nn.Sequential(  # Input 2 * 2 * 64
            nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=1, padding=1),  # (2-3+2)/1+1 = 2
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=0),  # (2-2)/2+1 = 1
            # nn.Flatten()
        )
        # Last convolutional layer, output 1 * 1 * 128
        # Fully connected layers
        self.dense = nn.Sequential(
            nn.Linear(128, 120),
            nn.ReLU(),
            # nn.Dropout(),
            nn.Linear(120, 84),
            nn.ReLU(),
            # nn.Dropout(),
            nn.Linear(84, 10),
            # nn.ReLU(),
            # nn.Softmax()
        )

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.conv4(x)
        x = self.conv5(x)
        x = x.view(x.size()[0], -1)
        x = self.dense(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)
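The AlexNet example above builds its Sequential blocks from positional arguments. As mentioned earlier, Sequential can also take an OrderedDict of named submodules. A minimal sketch of both construction styles (the layer sizes here just mirror the earlier MLP and are only illustrative):

import collections
import torch
from torch import nn

# Positional arguments: submodules are named by their index (0, 1, 2)
net_args = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# OrderedDict: submodules keep the names given as keys
net_dict = nn.Sequential(collections.OrderedDict([
    ('hidden', nn.Linear(784, 256)),
    ('act', nn.ReLU()),
    ('output', nn.Linear(256, 10)),
]))

X = torch.rand(2, 784)
print(net_dict)            # shows the named submodules
print(net_dict(X).shape)   # forward runs the submodules in insertion order -> torch.Size([2, 10])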
2. ModuleList class
ModuleList receives a list of submodules as input and then supports append and extend operations like a Python list:
net = nn.ModuleList([nn.Linear(784, 256), nn.ReLU()])
net.append(nn.Linear(256, 10))  # append operation similar to a list
print(net[-1])                  # index access similar to a list
print(net)
Output is:
Linear(in_features=256, out_features=10, bias=True)
ModuleList(
  (0): Linear(in_features=784, out_features=256, bias=True)
  (1): ReLU()
  (2): Linear(in_features=256, out_features=10, bias=True)
)
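The extend operation mentioned above behaves like list.extend; a minimal sketch:

net = nn.ModuleList([nn.Linear(784, 256)])
net.extend([nn.ReLU(), nn.Linear(256, 10)])  # add a sequence of modules at the end
print(len(net))   # 3
print(net[1])     # ReLU()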
The difference between Sequential and ModuleList:
① ModuleList is just a list for storing modules; there is no connection or order between them (so there is no need for the input and output dimensions of adjacent layers to match), and no forward function is implemented: you must implement it yourself. That is why calling net(torch.zeros(1, 784)) on the ModuleList above raises a NotImplementedError (see the sketch after this list).
② The modules in a Sequential must be arranged in order, the input and output sizes of adjacent layers must match, and the forward function is already implemented internally.
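A minimal sketch contrasting the two containers on the same layers (as noted in ① above, calling the ModuleList directly fails because it has no forward):

layers = [nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10)]
seq = nn.Sequential(*layers)    # forward is provided: runs the layers in order
mlist = nn.ModuleList(layers)   # only a container: stores the layers, no forward

X = torch.zeros(1, 784)
print(seq(X).shape)             # torch.Size([1, 10])
try:
    mlist(X)
except NotImplementedError:
    print("ModuleList defines no forward; wrap it in a Module and write one yourself")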
a. Flexible use of ModuleList, example 1
ModuleList mainly serves to make the definition of forward propagation in a network more flexible, as in the example below:
class MyModule(nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        self.linears = nn.ModuleList([nn.Linear(10, 10) for i in range(10)])

    def forward(self, x):
        # ModuleList can act as an iterable, or be indexed using ints
        for i, l in enumerate(self.linears):
            x = self.linears[i // 2](x) + l(x)
        return x
b. Flexible use of ModuleList, example 2
ModuleList differs from an ordinary Python list: the parameters of all modules added to a ModuleList are automatically registered with the whole network.
class Module_ModuleList(nn.Module):
    def __init__(self):
        super(Module_ModuleList, self).__init__()
        self.linears = nn.ModuleList([nn.Linear(10, 10)])

class Module_List(nn.Module):
    def __init__(self):
        super(Module_List, self).__init__()
        self.linears = [nn.Linear(10, 10)]

net1 = Module_ModuleList()
net2 = Module_List()

print("net1:")
for p in net1.parameters():
    print(p.size())

print("net2:")
for p in net2.parameters():
    print(p)
Output is:
net1:
torch.Size([10, 10])
torch.Size([10])
net2:
3. ModuleDict class
ModuleDict receives a dictionary of submodules as input and then supports adding and accessing entries like a dictionary:
net = nn.ModuleDict({
    'linear': nn.Linear(784, 256),
    'act': nn.ReLU(),
})
net['output'] = nn.Linear(256, 10)  # add
print(net['linear'])                # access
print(net.output)
print(net)
Output is:
Linear(in_features=784, out_features=256, bias=True)
Linear(in_features=256, out_features=10, bias=True)
ModuleDict(
  (act): ReLU()
  (linear): Linear(in_features=784, out_features=256, bias=True)
  (output): Linear(in_features=256, out_features=10, bias=True)
)
Similarities between ModuleDict and ModuleList
① Like ModuleList, a ModuleDict instance only stores a dictionary of modules and does not define a forward function; you need to define it yourself (see the sketch after this list).
② Like ModuleList, ModuleDict also differs from Python's dict: the parameters of all modules in a ModuleDict are automatically registered with the whole network.
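As a minimal sketch of point ① above, the ModuleDict can be wrapped in a Module that supplies its own forward (the class name DictMLP here is just illustrative):

class DictMLP(nn.Module):
    def __init__(self):
        super(DictMLP, self).__init__()
        # The ModuleDict only stores the layers; forward below decides how they are used
        self.layers = nn.ModuleDict({
            'linear': nn.Linear(784, 256),
            'act': nn.ReLU(),
            'output': nn.Linear(256, 10),
        })

    def forward(self, x):
        x = self.layers['act'](self.layers['linear'](x))
        return self.layers['output'](x)

net = DictMLP()
print(net(torch.rand(2, 784)).shape)  # torch.Size([2, 10])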
3, Build your own complex model
1. Construct your own model FancyMLP
A slightly more complex network, FancyMLP, is constructed below. In this network, we create a parameter rand_weight that is not updated during training, i.e. a constant parameter. In the forward computation, besides using this constant parameter, we also use Tensor functions and Python control flow, and call the same layer multiple times.
class FancyMLP(nn.Module):
    def __init__(self, **kwargs):
        super(FancyMLP, self).__init__(**kwargs)
        self.rand_weight = torch.rand((20, 20), requires_grad=False)  # Untrainable parameter (constant parameter)
        self.linear = nn.Linear(20, 20)

    def forward(self, x):
        x = self.linear(x)
        # Use the constant parameter created above, together with relu and mm from nn.functional / torch
        x = nn.functional.relu(torch.mm(x, self.rand_weight.data) + 1)
        # Reuse the fully connected layer. Equivalent to two fully connected layers sharing parameters
        x = self.linear(x)
        # Control flow. Call item() to get a Python scalar for comparison
        while x.norm().item() > 1:  # norm() computes the L2 norm; item() extracts the value of a single-element tensor
            x /= 2
        if x.norm().item() < 0.8:
            x *= 10
        return x.sum()

X = torch.rand(2, 20)
net = FancyMLP()
print(net)
net(X)
Output is:
FancyMLP(
  (linear): Linear(in_features=20, out_features=20, bias=True)
)
tensor(0.8432, grad_fn=<SumBackward0>)
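As a quick check of the "constant parameter" claim above, named_parameters() only lists the Linear layer's weight and bias; rand_weight is a plain tensor, so it is never registered or updated by an optimizer. A minimal sketch, reusing the FancyMLP defined above:

net = FancyMLP()
for name, p in net.named_parameters():
    print(name, p.shape)
# linear.weight torch.Size([20, 20])
# linear.bias torch.Size([20])
# rand_weight does not appear: it is not an nn.Parameter, so it is not registered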
2. Nesting the FancyMLP constructed above to build a new network, NestMLP
Because both FancyMLP and Sequential are subclasses of the Module class, they can be nested and called inside one another.
class NestMLP(nn.Module):
    def __init__(self, **kwargs):
        super(NestMLP, self).__init__(**kwargs)
        self.net = nn.Sequential(nn.Linear(40, 30), nn.ReLU())

    def forward(self, x):
        return self.net(x)

net = nn.Sequential(NestMLP(), nn.Linear(30, 20), FancyMLP())

X = torch.rand(2, 40)
print(net)
net(X)
Output:
Sequential(
  (0): NestMLP(
    (net): Sequential(
      (0): Linear(in_features=40, out_features=30, bias=True)
      (1): ReLU()
    )
  )
  (1): Linear(in_features=30, out_features=20, bias=True)
  (2): FancyMLP(
    (linear): Linear(in_features=20, out_features=20, bias=True)
  )
)
tensor(14.4908, grad_fn=<SumBackward0>)
Summary
- You can construct the model by inheriting the Module class.
- Sequential, ModuleList and ModuleDict classes all inherit from the Module class.
- Unlike Sequential, ModuleList and ModuleDict do not define a complete network; they just store different modules together, and you need to define the forward function yourself.
- Although Sequential and the other container classes can make model construction simpler, inheriting the Module class directly greatly expands the flexibility of model construction.