The most complete Vision Transformer (ViT) paper interpretation and code reproduction (based on the Paddle framework)

Preface: The pioneering contribution of the ViT model is its use of a pure Transformer structure. As the paper's title, "An Image Is Worth 16x16 Words", suggests, it embeds an image into a sequence and, through multiple encoder structures and heads, achieves results comparable to SOTA CNN models. Image classification t ...

Added by T2theC on Sat, 18 Dec 2021 09:54:15 +0200

Layers and models in Paddle network architecture

Introduction: These are learning notes on models and layers in Paddle, covering preliminary tests and related study of how layers are constructed and run in Paddle. Keywords: Layer, Paddle   § 01 Models and layers   The model is one of the important concepts in deep learning. The core function o ...

Added by MasterHernan on Sun, 12 Dec 2021 14:37:16 +0200

AI Studio: use a minimalist network in the Paddle framework to recognize MNIST

Introduction: ※ This post tests a minimalist Paddle program for recognizing MNIST found online, which uses a very simple linear regression network, as a first look at network architecture in Paddle. It also gives an example of converting tensors from numpy to Paddle. Keywords: AI Studio, Paddle, MNIST   ...

Added by jguy on Sun, 12 Dec 2021 05:30:22 +0200