Preface
With the current popularity of artificial intelligence, many industries have turned their attention to it, and this has sparked wave after wave of enthusiasm for learning AI. Although the principles behind artificial intelligence cannot be covered in detail in a short article, as in other disciplines we do not need to "reinvent the wheel": we can quickly build AI models with the help of the many mature AI frameworks available, and get started with the AI trend from there.
Artificial intelligence refers to a set of technologies that enable machines to process information in a human-like way. Machine learning is the process of using computer programs to learn from historical data and make predictions on new data. A neural network is a machine learning model whose structure is inspired by the biological brain. Deep learning is a subset of machine learning that processes large amounts of unstructured data such as speech, text and images. These concepts are therefore nested: artificial intelligence is the broadest term, and deep learning is the most specific.
To give you a preliminary understanding of the Python libraries commonly used in artificial intelligence, and to help you choose the ones that fit your needs, this article gives a brief but broad overview of the most common AI libraries in use today.
Introduction to common machine learning and deep learning libraries in Python
1, NumPy
NumPy (Numerical Python) is an extension library for Python. It supports large, multi-dimensional arrays and matrices, and provides a large collection of mathematical functions to operate on them. NumPy's core is written in C, and values are stored directly in the array rather than as object pointers, so its computations are much faster than equivalent pure Python code.
In the example below, we compare the speed of pure Python and NumPy when computing the sine of every element of a list:
```python
import numpy as np
import math
import time

start = time.time()
for i in range(10):
    list_1 = list(range(1, 10000))
    for j in range(len(list_1)):
        list_1[j] = math.sin(list_1[j])
print("Using pure Python takes {}s".format(time.time() - start))

start = time.time()
for i in range(10):
    list_1 = np.arange(1, 10000)
    list_1 = np.sin(list_1)
print("Using NumPy takes {}s".format(time.time() - start))
```
From the following running results, we can see that NumPy is noticeably faster than pure Python:
```
Using pure Python takes 0.017444372177124023s
Using NumPy takes 0.001619577407836914s
```
2, OpenCV
OpenCV is a cross-platform computer vision library that runs on Linux, Windows and macOS. It is lightweight and efficient - it consists of a set of C functions and a small number of C++ classes - and it provides a Python interface that implements many common algorithms in image processing and computer vision.
The following code applies some simple filters to an image, including mean smoothing, Gaussian blur and bilateral filtering:
```python
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

img = cv.imread('h89817032p0.png')

# Mean (averaging) filter
kernel = np.ones((5, 5), np.float32) / 25
dst = cv.filter2D(img, -1, kernel)
# Gaussian blur
blur_1 = cv.GaussianBlur(img, (5, 5), 0)
# Bilateral filter
blur_2 = cv.bilateralFilter(img, 9, 75, 75)

plt.figure(figsize=(10, 10))
plt.subplot(221), plt.imshow(img[:, :, ::-1]), plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(222), plt.imshow(dst[:, :, ::-1]), plt.title('Averaging')
plt.xticks([]), plt.yticks([])
plt.subplot(223), plt.imshow(blur_1[:, :, ::-1]), plt.title('Gaussian')
plt.xticks([]), plt.yticks([])
plt.subplot(224), plt.imshow(blur_2[:, :, ::-1]), plt.title('Bilateral')
plt.xticks([]), plt.yticks([])
plt.show()
```
You can refer to Fundamentals of OpenCV Image Processing (Transformation and Denoising) to learn more about OpenCV image processing operations.
3, Scikit-image
Scikit-image is an image processing library built on SciPy that treats images as NumPy arrays.
For example, you can use scikit-image to rescale images; it provides functions such as rescale, resize and downscale_local_mean.
```python
from skimage import data, color, io
from skimage.transform import rescale, resize, downscale_local_mean
from matplotlib import pyplot as plt

image = color.rgb2gray(io.imread('h89817032p0.png'))
image_rescaled = rescale(image, 0.25, anti_aliasing=False)
image_resized = resize(image, (image.shape[0] // 4, image.shape[1] // 4),
                       anti_aliasing=True)
image_downscaled = downscale_local_mean(image, (4, 3))

plt.figure(figsize=(20, 20))
plt.subplot(221), plt.imshow(image, cmap='gray'), plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(222), plt.imshow(image_rescaled, cmap='gray'), plt.title('Rescaled')
plt.xticks([]), plt.yticks([])
plt.subplot(223), plt.imshow(image_resized, cmap='gray'), plt.title('Resized')
plt.xticks([]), plt.yticks([])
plt.subplot(224), plt.imshow(image_downscaled, cmap='gray'), plt.title('Downscaled')
plt.xticks([]), plt.yticks([])
plt.show()
```
4, Python Imaging Library(PIL)
Python Imaging Library (PIL) has long been the de facto standard image processing library for Python, because it is very powerful while its API remains simple and easy to use.
However, PIL only supports Python 2.7 and has been unmaintained for a long time, so a group of volunteers created a compatible fork named Pillow, which supports the latest Python 3.x and adds many new features. We can therefore skip PIL and install and use Pillow directly.
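As a quick illustration of how simple the API is, here is a minimal sketch that opens an image, converts it to grayscale and saves a thumbnail (the file name h89817032p0.png is just the example image used elsewhere in this article and is assumed to exist):
```python
from PIL import Image

# Open an image, convert it to grayscale, create a thumbnail and save it
im = Image.open('h89817032p0.png')
gray = im.convert('L')
gray.thumbnail((128, 128))
gray.save('thumbnail.png')
```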
5, Pillow
Generate a letter verification code image using Pillow:
```python
from PIL import Image, ImageDraw, ImageFont, ImageFilter
import random

# Random uppercase letter:
def rndChar():
    return chr(random.randint(65, 90))

# Random background color:
def rndColor():
    return (random.randint(64, 255), random.randint(64, 255), random.randint(64, 255))

# Random text color:
def rndColor2():
    return (random.randint(32, 127), random.randint(32, 127), random.randint(32, 127))

# 360 x 360 image:
width = 60 * 6
height = 60 * 6
image = Image.new('RGB', (width, height), (255, 255, 255))
# Create a Font object:
font = ImageFont.truetype('/usr/share/fonts/wps-office/simhei.ttf', 60)
# Create a Draw object:
draw = ImageDraw.Draw(image)
# Fill each pixel with a random color:
for x in range(width):
    for y in range(height):
        draw.point((x, y), fill=rndColor())
# Draw the text:
for t in range(6):
    draw.text((60 * t + 10, 150), rndChar(), font=font, fill=rndColor2())
# Blur the image:
image = image.filter(ImageFilter.BLUR)
image.save('code.jpg', 'jpeg')
```
6, SimpleCV
SimpleCV is an open source framework for building computer vision applications. With it, you can access high-performance computer vision libraries such as OpenCV without having to first learn about bit depths, file formats, color spaces, buffer management, eigenvalues or matrix storage. However, its support for Python 3 is very poor; the following code is written for Python 2.7:
```python
from SimpleCV import Image, Color, Display

# load an image from imgur
img = Image('http://i.imgur.com/lfAeZ4n.png')
# use a keypoint detector to find areas of interest
feats = img.findKeypoints()
# draw the list of keypoints
feats.draw(color=Color.RED)
# show the resulting image.
img.show()
# apply the stuff we found to the image.
output = img.applyLayers()
# save the results.
output.save('juniperfeats.png')
```
Running it under Python 3 reports the following error, so it is not recommended for use with Python 3:
```
SyntaxError: Missing parentheses in call to 'print'. Did you mean print('unit test')?
```
7, Mahotas
Mahotas is a fast computer vision algorithm library built on NumPy. It currently has more than 100 image processing and computer vision functions, and is still growing.
Use Mahotas to load an image and operate on its pixels:
```python
import numpy as np
import mahotas
import mahotas.demos
from matplotlib import pyplot as plt

# Load the demo image as grayscale
f = mahotas.demos.load('lena', as_grey=True)
f = f[128:, 128:]
plt.gray()
# Show the data:
print("Fraction of zeros in original image: {0}".format(np.mean(f == 0)))
plt.imshow(f)
plt.show()
```
8, Ilastik
Ilastik provides users with good bio-image analysis services based on machine learning. Using machine learning algorithms, it can easily segment, classify, track and count cells or other experimental data. Most operations are interactive and do not require machine learning expertise. See https://www.ilastik.org/documentation/basics/installation.html for installation and usage instructions.
9, Scikit-learn
Scikit-learn is a free machine learning library for the Python programming language. It offers a variety of classification, regression and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means and DBSCAN.
Implement the KMeans algorithm with scikit-learn:
```python
import time

import numpy as np
import matplotlib.pyplot as plt

from sklearn.cluster import MiniBatchKMeans, KMeans
from sklearn.metrics.pairwise import pairwise_distances_argmin
from sklearn.datasets import make_blobs

# Generate sample data
np.random.seed(0)

batch_size = 45
centers = [[1, 1], [-1, -1], [1, -1]]
n_clusters = len(centers)
X, labels_true = make_blobs(n_samples=3000, centers=centers, cluster_std=0.7)

# Compute clustering with KMeans
k_means = KMeans(init='k-means++', n_clusters=3, n_init=10)
t0 = time.time()
k_means.fit(X)
t_batch = time.time() - t0

# Compute clustering with MiniBatchKMeans
mbk = MiniBatchKMeans(init='k-means++', n_clusters=3, batch_size=batch_size,
                      n_init=10, max_no_improvement=10, verbose=0)
t0 = time.time()
mbk.fit(X)
t_mini_batch = time.time() - t0

# Plot result
fig = plt.figure(figsize=(8, 3))
fig.subplots_adjust(left=0.02, right=0.98, bottom=0.05, top=0.9)
colors = ['#4EACC5', '#FF9C34', '#4E9A06']

# We want to have the same colors for the same cluster from the
# MiniBatchKMeans and the KMeans algorithm. Let's pair the cluster centers per
# closest one.
k_means_cluster_centers = k_means.cluster_centers_
order = pairwise_distances_argmin(k_means.cluster_centers_, mbk.cluster_centers_)
mbk_means_cluster_centers = mbk.cluster_centers_[order]

k_means_labels = pairwise_distances_argmin(X, k_means_cluster_centers)
mbk_means_labels = pairwise_distances_argmin(X, mbk_means_cluster_centers)

# KMeans
for k, col in zip(range(n_clusters), colors):
    my_members = k_means_labels == k
    cluster_center = k_means_cluster_centers[k]
    plt.plot(X[my_members, 0], X[my_members, 1], 'w',
             markerfacecolor=col, marker='.')
    plt.plot(cluster_center[0], cluster_center[1], 'o', markerfacecolor=col,
             markeredgecolor='k', markersize=6)
plt.title('KMeans')
plt.xticks(())
plt.yticks(())

plt.show()
```
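Scikit-learn is not limited to clustering. As a minimal sketch of the classification side mentioned above, here is a random forest trained on the bundled iris dataset (the choice of estimator and dataset is just an illustration, not part of the original example):
```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load the iris dataset and split it into training and test sets
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit a random forest and report its accuracy on the held-out data
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print(clf.score(X_test, y_test))
```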
10, SciPy
The SciPy library provides many user-friendly and efficient numerical routines, such as numerical integration, interpolation, optimization and linear algebra.
SciPy also defines many special functions of mathematical physics, including elliptic functions, Bessel functions, the gamma function, the beta function, hypergeometric functions and parabolic cylinder functions.
```python
from scipy import special
import matplotlib.pyplot as plt
import numpy as np

def drumhead_height(n, k, distance, angle, t):
    kth_zero = special.jn_zeros(n, k)[-1]
    return np.cos(t) * np.cos(n * angle) * special.jn(n, distance * kth_zero)

theta = np.r_[0:2 * np.pi:50j]
radius = np.r_[0:1:50j]
x = np.array([r * np.cos(theta) for r in radius])
y = np.array([r * np.sin(theta) for r in radius])
z = np.array([drumhead_height(1, 1, r, theta, 0.5) for r in radius])

fig = plt.figure()
ax = fig.add_axes(rect=(0, 0.05, 0.95, 0.95), projection='3d')
ax.plot_surface(x, y, z, rstride=1, cstride=1, cmap='RdBu_r', vmin=-0.5, vmax=0.5)
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_xticks(np.arange(-1, 1.1, 0.5))
ax.set_yticks(np.arange(-1, 1.1, 0.5))
ax.set_zlabel('Z')
plt.show()
```
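Besides special functions, the numerical integration and optimization mentioned above are also available; a minimal sketch using scipy.integrate and scipy.optimize (the integrand and objective function here are arbitrary examples):
```python
from scipy import integrate, optimize
import numpy as np

# Numerical integration: integral of exp(-x^2) from 0 to infinity, which equals sqrt(pi)/2
value, error = integrate.quad(lambda x: np.exp(-x ** 2), 0, np.inf)
print(value, np.sqrt(np.pi) / 2)

# Scalar optimization: minimum of (x - 2)^2 + 1 is at x = 2
result = optimize.minimize_scalar(lambda x: (x - 2) ** 2 + 1)
print(result.x, result.fun)
```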
11, NLTK
NLTK is a library for building Python programs that work with natural language data. It provides easy-to-use interfaces to more than 50 corpora and lexical resources (such as WordNet), along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing and semantic reasoning, as well as wrappers for industrial-strength natural language processing (NLP) libraries.
NLTK is called "a wonderful tool for teaching, and working in, computational linguistics using Python".
```python
import nltk
from nltk.corpus import treebank

# Download the required data on first use
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('maxent_ne_chunker')
nltk.download('words')
nltk.download('treebank')

sentence = """At eight o'clock on Thursday morning Arthur didn't feel very good."""
# Tokenize and tag
tokens = nltk.word_tokenize(sentence)
tagged = nltk.pos_tag(tokens)
# Identify named entities
entities = nltk.chunk.ne_chunk(tagged)
# Display a parse tree
t = treebank.parsed_sents('wsj_0001.mrg')[0]
t.draw()
```
12, spaCy
spaCy is a free, open source library for advanced NLP in Python. It can be used to build applications that process large volumes of text, to build information extraction or natural language understanding systems, or to preprocess text for deep learning.
```python
import spacy

texts = [
    "Net income was $9.4 million compared to the prior year of $2.7 million.",
    "Revenue exceeded twelve billion dollars, with a loss of $1b.",
]

nlp = spacy.load("en_core_web_sm")
for doc in nlp.pipe(texts, disable=["tok2vec", "tagger", "parser",
                                    "attribute_ruler", "lemmatizer"]):
    # Do something with the doc here
    print([(ent.text, ent.label_) for ent in doc.ents])
```
nlp.pipe yields Doc objects, so we can iterate over them and access the named entity predictions:
```
[('$9.4 million', 'MONEY'), ('the prior year', 'DATE'), ('$2.7 million', 'MONEY')]
[('twelve billion dollars', 'MONEY'), ('1b', 'MONEY')]
```
13, LibROSA
librosa is a Python library for music and audio analysis. It provides the building blocks necessary to create a music information retrieval system.
```python
# Beat tracking example
import librosa

# 1. Get the file path to an included audio example
filename = librosa.example('nutcracker')

# 2. Load the audio as a waveform `y`
#    Store the sampling rate as `sr`
y, sr = librosa.load(filename)

# 3. Run the default beat tracker
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
print('Estimated tempo: {:.2f} beats per minute'.format(tempo))

# 4. Convert the frame indices of beat events into timestamps
beat_times = librosa.frames_to_time(beat_frames, sr=sr)
```
14, Pandas
Pandas is a fast, powerful, flexible and easy-to-use open source data analysis and manipulation tool. Pandas can import data from various file formats such as CSV, JSON, SQL and Microsoft Excel, and can perform operations such as merging, reshaping and selection, as well as data cleaning and wrangling. Pandas is widely used in academia, finance, statistics and other data analysis fields.
```python
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

ts = pd.Series(np.random.randn(1000), index=pd.date_range("1/1/2000", periods=1000))
ts = ts.cumsum()

df = pd.DataFrame(np.random.randn(1000, 4), index=ts.index, columns=list("ABCD"))
df = df.cumsum()
df.plot()
plt.show()
```
15, Matplotlib
Matplotlib is a Python plotting library. It provides a complete set of MATLAB-like command APIs that can generate exquisite, publication-quality figures. Matplotlib makes plotting very simple and strikes an excellent balance between ease of use and performance.
Use Matplotlib to plot multiple curves:
```python
# plot_multi_curve.py
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0.1, 2 * np.pi, 100)
y_1 = x
y_2 = np.square(x)
y_3 = np.log(x)
y_4 = np.sin(x)

plt.plot(x, y_1)
plt.plot(x, y_2)
plt.plot(x, y_3)
plt.plot(x, y_4)
plt.show()
```
For more information about plotting with Matplotlib, please refer to the earlier blog post Python Matplotlib Visualization.
16, Seaborn
Seaborn is a Python data visualization library that provides a higher-level API on top of Matplotlib, which makes plotting easier. Seaborn should be regarded as a complement to Matplotlib rather than a replacement.
```python
import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme(style="ticks")
df = sns.load_dataset("penguins")
sns.pairplot(df, hue="species")
plt.show()
```
17, Orange
Orange is an open source data mining and machine learning package that provides a series of components for data exploration, visualization, preprocessing and modeling. Orange has a beautiful and intuitive interactive user interface, which makes it well suited for novices doing exploratory data analysis and visualization; at the same time, advanced users can also use it as a Python module for data manipulation and component development.
Orange can be installed with pip:
```bash
$ pip install orange3
```
After installation, run the orange-canvas command on the command line to start the Orange graphical interface:
```bash
$ orange-canvas
```
After startup, you can see the Orange graphical interface and perform various operations. As mentioned above, Orange can also be used directly as a Python module; a small scripting example follows.
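A minimal scripting sketch (assuming the bundled iris dataset that ships with the standard Orange distribution; the choice of a tree learner is just an example):
```python
import Orange

# Load the bundled iris dataset and train a classification tree
data = Orange.data.Table("iris")
learner = Orange.classification.TreeLearner()
model = learner(data)

# Predict the class of the first few instances
print(model(data[:5]))
```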
18, PyBrain
PyBrain is a modular machine learning library for Python. Its goal is to offer flexible, easy-to-use yet powerful algorithms for machine learning tasks, together with a variety of predefined environments to test and compare algorithms. PyBrain is short for Python-Based Reinforcement Learning, Artificial Intelligence and Neural Networks Library.
We will use a simple example to show the usage of PyBrain by building a multi-layer perceptron (MLP).
First, we create a new feedforward network object:
```python
from pybrain.structure import FeedForwardNetwork

n = FeedForwardNetwork()
```
Next, build the input, hidden and output layers:
```python
from pybrain.structure import LinearLayer, SigmoidLayer

inLayer = LinearLayer(2)
hiddenLayer = SigmoidLayer(3)
outLayer = LinearLayer(1)
```
In order to use the layers you build, you must add them to the network:
```python
n.addInputModule(inLayer)
n.addModule(hiddenLayer)
n.addOutputModule(outLayer)
```
Multiple input and output modules can be added. In order to forward-propagate inputs and back-propagate errors, the network must know which layers are inputs and which are outputs.
We also need to specify explicitly how the layers are connected. For this we use the most common connection type, the fully connected layer, which is implemented by the FullConnection class:
```python
from pybrain.structure import FullConnection

in_to_hidden = FullConnection(inLayer, hiddenLayer)
hidden_to_out = FullConnection(hiddenLayer, outLayer)
```
Like layers, we must explicitly add them to the network:
```python
n.addConnection(in_to_hidden)
n.addConnection(hidden_to_out)
```
All elements are now in place; finally, we need to call the sortModules() method to make the MLP usable:
```python
n.sortModules()
```
This call performs some internal initialization, which is necessary before the network can be used.
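The network can then be activated on an input vector; a minimal sketch (the input values here are arbitrary, chosen only to match the 2-dimensional input layer built above):
```python
# Feed a 2-dimensional input through the network and print the output
print(n.activate([1, 2]))
```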
19, Milk
MILK (MACHINE LEARNING TOOLKIT) is a machine learning toolkit for Python. It focuses on supervised classification and includes many classifiers such as SVMs, k-NN, random forests and decision trees. It can also perform feature selection and combine classifiers into different classification systems. For unsupervised learning, MILK supports affinity propagation and k-means clustering.
Train a classifier with MILK:
```python
import numpy as np
import milk

features = np.random.rand(100, 10)
labels = np.zeros(100)
features[50:] += .5
labels[50:] = 1

learner = milk.defaultclassifier()
model = learner.train(features, labels)

# Now you can use the model on new examples:
example = np.random.rand(10)
print(model.apply(example))
example2 = np.random.rand(10)
example2 += .5
print(model.apply(example2))
```
20, TensorFlow
TensorFlow is an end-to-end open source machine learning platform with a comprehensive and flexible ecosystem. It can broadly be divided into TensorFlow 1.x and TensorFlow 2.x; the main difference between them is that TF 1.x uses static computation graphs, while TF 2.x uses eager execution with dynamic graphs.
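A minimal sketch of eager execution in TensorFlow 2.x, where operations run immediately and return concrete values without building a session (assuming TensorFlow 2 is installed):
```python
import tensorflow as tf

# In TF 2.x this matrix multiplication executes immediately (eager mode)
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
print(tf.matmul(a, a))
```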
Here we mainly use TensorFlow 2.x as an example and show how to build a convolutional neural network (CNN) with it:
```python
import tensorflow as tf
from tensorflow.keras import datasets, layers, models

# Data loading
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Data preprocessing
train_images, test_images = train_images / 255.0, test_images / 255.0

# Model building
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))

# Model compilation and training
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(test_images, test_labels))
```
For more TensorFlow 2.x examples, please refer to the TensorFlow column.
21, PyTorch
The predecessor of PyTorch is Torch. Its backend is the same as the Torch framework, but a large amount of content has been rewritten in Python. It is not only more flexible and supports dynamic graphs, but also provides a Python interface.
```python
# Import libraries
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda, Compose
import matplotlib.pyplot as plt

# Select the device
device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using {} device".format(device))

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28 * 28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
            nn.ReLU()
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)

# Loss function and optimizer
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Model training
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")
```
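The snippet above only defines the model and the training loop. A minimal sketch of driving it with the FashionMNIST dataset from torchvision (the dataset choice, batch size and epoch count are assumptions added for illustration, not part of the original snippet):
```python
# Download FashionMNIST and wrap it in a DataLoader
training_data = datasets.FashionMNIST(root="data", train=True,
                                      download=True, transform=ToTensor())
train_dataloader = DataLoader(training_data, batch_size=64)

# Run a few epochs of training
epochs = 5
for t in range(epochs):
    print(f"Epoch {t + 1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
```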
22, Theano
Theano is a Python library built on top of NumPy that allows you to define, optimize and efficiently evaluate mathematical expressions involving multi-dimensional arrays.
Calculate the Jacobian matrix in Theano:
```python
import theano
import theano.tensor as T

x = T.dvector('x')
y = x ** 2
J, updates = theano.scan(lambda i, y, x: T.grad(y[i], x),
                         sequences=T.arange(y.shape[0]), non_sequences=[y, x])
f = theano.function([x], J, updates=updates)
f([4, 4])
```
23, Keras
Keras is a high-level neural network API written in Python that can run on top of TensorFlow, CNTK or Theano as its backend. The development focus of Keras is on enabling fast experimentation, so that ideas can be turned into results with minimal delay.
```python
from keras.models import Sequential
from keras.layers import Dense

# Model building
model = Sequential()
model.add(Dense(units=64, activation='relu', input_dim=100))
model.add(Dense(units=10, activation='softmax'))

# Model compilation and training (x_train and y_train must be prepared beforehand)
model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=32)
```
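Since x_train and y_train are not defined above, here is a minimal sketch that generates random dummy data just to make the example runnable (the shapes match the input_dim=100 and the 10 output classes; the data itself is meaningless):
```python
import numpy as np
from keras.utils import to_categorical

# Dummy data: 1000 samples with 100 features each, 10 classes
x_train = np.random.random((1000, 100))
y_train = to_categorical(np.random.randint(10, size=(1000, 1)), num_classes=10)
```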
24, Caffe
The Caffe2 official website states that Caffe2 is now part of PyTorch. Although the Caffe2 APIs will continue to work, users are encouraged to use the PyTorch APIs instead.
25, MXNet
MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to mix symbolic and imperative programming to maximize efficiency and productivity.
Build a handwritten digit recognition model using MXNet:
```python
import mxnet as mx
from mxnet import gluon
from mxnet.gluon import nn
from mxnet import autograd as ag
import mxnet.ndarray as F

# Data loading
mnist = mx.test_utils.get_mnist()
batch_size = 100
train_data = mx.io.NDArrayIter(mnist['train_data'], mnist['train_label'],
                               batch_size, shuffle=True)
val_data = mx.io.NDArrayIter(mnist['test_data'], mnist['test_label'], batch_size)

# CNN model
class Net(gluon.Block):
    def __init__(self, **kwargs):
        super(Net, self).__init__(**kwargs)
        self.conv1 = nn.Conv2D(20, kernel_size=(5, 5))
        self.pool1 = nn.MaxPool2D(pool_size=(2, 2), strides=(2, 2))
        self.conv2 = nn.Conv2D(50, kernel_size=(5, 5))
        self.pool2 = nn.MaxPool2D(pool_size=(2, 2), strides=(2, 2))
        self.fc1 = nn.Dense(500)
        self.fc2 = nn.Dense(10)

    def forward(self, x):
        x = self.pool1(F.tanh(self.conv1(x)))
        x = self.pool2(F.tanh(self.conv2(x)))
        # 0 means copy over size from corresponding dimension.
        # -1 means infer size from the rest of dimensions.
        x = x.reshape((0, -1))
        x = F.tanh(self.fc1(x))
        x = F.tanh(self.fc2(x))
        return x

net = Net()

# Initialization and optimizer definition
# set the context to GPU if available, otherwise CPU
ctx = [mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()]
net.initialize(mx.init.Xavier(magnitude=2.24), ctx=ctx)
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.03})

# Model training
# Use Accuracy as the evaluation metric.
metric = mx.metric.Accuracy()
softmax_cross_entropy_loss = gluon.loss.SoftmaxCrossEntropyLoss()
epoch = 10  # number of training epochs (not defined in the original snippet)

for i in range(epoch):
    # Reset the train data iterator.
    train_data.reset()
    for batch in train_data:
        data = gluon.utils.split_and_load(batch.data[0], ctx_list=ctx, batch_axis=0)
        label = gluon.utils.split_and_load(batch.label[0], ctx_list=ctx, batch_axis=0)
        outputs = []
        # Inside training scope
        with ag.record():
            for x, y in zip(data, label):
                z = net(x)
                # Computes softmax cross entropy loss.
                loss = softmax_cross_entropy_loss(z, y)
                # Backpropagate the error for one iteration.
                loss.backward()
                outputs.append(z)
        metric.update(label, outputs)
        trainer.step(batch.data[0].shape[0])
    # Gets the evaluation result.
    name, acc = metric.get()
    # Reset evaluation result to initial state.
    metric.reset()
    print('training acc at epoch %d: %s=%f' % (i, name, acc))
```
26, PaddlePaddle
Based on Baidu's years of deep learning research and business applications, PaddlePaddle integrates a core deep learning training and inference framework, basic model libraries, end-to-end development kits and rich tool components. It is China's first industrial-grade deep learning platform that is independently developed, full-featured, open source and open.
Use PaddlePaddle to implement LeNet-5:
```python
# Import required packages
import paddle
import numpy as np
from paddle.nn import Conv2D, MaxPool2D, Linear
import paddle.nn.functional as F

# Define the LeNet network structure
class LeNet(paddle.nn.Layer):
    def __init__(self, num_classes=1):
        super(LeNet, self).__init__()
        # Create the convolution and pooling layers
        # Create the 1st convolution layer
        self.conv1 = Conv2D(in_channels=1, out_channels=6, kernel_size=5)
        self.max_pool1 = MaxPool2D(kernel_size=2, stride=2)
        # Size logic: the pooling layer does not change the number of channels; it is still 6
        # Create the 2nd convolution layer
        self.conv2 = Conv2D(in_channels=6, out_channels=16, kernel_size=5)
        self.max_pool2 = MaxPool2D(kernel_size=2, stride=2)
        # Create the 3rd convolution layer
        self.conv3 = Conv2D(in_channels=16, out_channels=120, kernel_size=4)
        # Size logic: the input to the fully connected layer is flattened from [B, C, H, W] to [B, C*H*W]
        # The input size is [28, 28]; after three convolutions and two poolings, C*H*W equals 120
        self.fc1 = Linear(in_features=120, out_features=64)
        # Create the fully connected layers. The first outputs 64 neurons,
        # the second outputs as many neurons as there are classification labels
        self.fc2 = Linear(in_features=64, out_features=num_classes)

    # Forward computation of the network
    def forward(self, x):
        x = self.conv1(x)
        # Each convolution layer uses a Sigmoid activation function, followed by 2x2 pooling
        x = F.sigmoid(x)
        x = self.max_pool1(x)
        x = F.sigmoid(x)
        x = self.conv2(x)
        x = self.max_pool2(x)
        x = self.conv3(x)
        # Size logic: flatten the data [B, C, H, W] -> [B, C*H*W]
        x = paddle.reshape(x, [x.shape[0], -1])
        x = self.fc1(x)
        x = F.sigmoid(x)
        x = self.fc2(x)
        return x
```
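The class above only defines the network. A minimal sketch of instantiating it and running a forward pass on a random batch (the input shape and class count are assumptions added for illustration):
```python
# Instantiate LeNet for 10-class classification and run a dummy forward pass
model = LeNet(num_classes=10)
x = paddle.randn([1, 1, 28, 28], dtype='float32')
logits = model(x)
print(logits.shape)  # expected: [1, 10]
```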
27, CNTK
CNTK (Cognitive Toolkit) is a deep learning toolkit that describes a neural network as a series of computational steps in a directed graph. In this directed graph, leaf nodes represent input values or network parameters, while other nodes represent matrix operations on their inputs. CNTK makes it easy to implement and combine popular model types such as CNNs.
CNTK uses a network description language (NDL) to describe a neural network. In short, you describe the input features, the input labels, some parameters, the computational relationships between the parameters and the inputs, and what the target nodes are.
```
NDLNetworkBuilder=[
    run=ndlLR

    ndlLR=[
        # sample and label dimensions
        SDim=$dimension$
        LDim=1

        features=Input(SDim, 1)
        labels=Input(LDim, 1)

        # parameters to learn
        B0 = Parameter(4)
        W0 = Parameter(4, SDim)
        B = Parameter(LDim)
        W = Parameter(LDim, 4)

        # operations
        t0 = Times(W0, features)
        z0 = Plus(t0, B0)
        s0 = Sigmoid(z0)

        t = Times(W, s0)
        z = Plus(t, B)
        s = Sigmoid(z)

        LR = Logistic(labels, s)
        EP = SquareError(labels, s)

        # root nodes
        FeatureNodes=(features)
        LabelNodes=(labels)
        CriteriaNodes=(LR)
        EvalNodes=(EP)
        OutputNodes=(s,t,z,s0,W0)
    ]
]
```
Summary and classification
Summary of common machine learning and deep learning libraries in Python
Library name | Official website | Brief introduction |
---|---|---|
NumPy | http://www.numpy.org/ | Providing support for large multidimensional arrays, NumPy is a key library in computer vision, because images can be represented as multidimensional arrays. There are many advantages in representing images as NumPy arrays |
OpenCV | https://opencv.org/ | Open source computer vision library |
Scikit-image | https://scikit-image.org/ | A collection of image processing algorithms; images are manipulated as NumPy arrays |
Python Imaging Library(PIL) | http://www.pythonware.com/products/pil/ | Image processing library provides powerful image processing and graphics functions |
Pillow | https://pillow.readthedocs.io/ | A branch of PIL |
SimpleCV | http://simplecv.org/ | Computer vision framework provides the key functions of image processing |
Mahotas | https://mahotas.readthedocs.io/ | Provides a set of functions for image processing and computer vision, which were originally designed for biological image informatics; However, now it also plays an important role in other fields. It is completely based on numpy array as its data type |
Ilastik | http://ilastik.org/ | User friendly and simple interactive image segmentation, classification and analysis tools |
Scikit-learn | http://scikit-learn.org/ | Machine learning library, with various classification, regression and clustering algorithms |
SciPy | https://www.scipy.org/ | Science and technology computing library |
NLTK | https://www.nltk.org/ | Libraries and programs for processing natural language data |
spaCy | https://spacy.io/ | Open source software library for advanced natural language processing in Python |
LibROSA | https://librosa.github.io/librosa/ | Library for music and audio processing |
Pandas | https://pandas.pydata.org/ | The library built on NumPy provides advanced data computing tools and easy-to-use data structures |
Matplotlib | https://matplotlib.org | Plotting library that provides a set of MATLAB-like command APIs capable of generating publication-quality figures |
Seaborn | https://seaborn.pydata.org/ | It is a drawing library based on Matplotlib |
Orange | https://orange.biolab.si/ | Open source machine learning and Data Visualization Toolkit for novices and experts |
PyBrain | http://pybrain.org/ | Machine learning library, which provides the latest easy-to-use algorithm for machine learning |
Milk | http://luispedro.org/software/milk/ | Machine learning toolbox is mainly used for multi classification problems in supervised learning |
TensorFlow | https://www.tensorflow.org/ | Open source machine learning and deep learning library |
PyTorch | https://pytorch.org/ | Open source machine learning and deep learning library |
Theano | http://deeplearning.net/software/theano/ | Library for fast definition, evaluation and computation of mathematical expressions, compiled to run on both CPU and GPU architectures |
Keras | https://keras.io/ | High-level deep learning library that can run on top of TensorFlow, Theano or CNTK (Microsoft Cognitive Toolkit) |
Caffe2 | https://caffe2.ai/ | Caffe2 is a deep learning framework with expressiveness, speed and modularity. It is an experimental reconstruction of Caffe and can organize computing in a more flexible way |
MXNet | https://mxnet.apache.org/ | Designed as an efficient and flexible deep learning framework, it allows mixed symbolic programming and imperative programming |
PaddlePaddle | https://www.paddlepaddle.org.cn | Based on Baidu's years of deep learning technology research and business application, it integrates deep learning core training and reasoning framework, basic model library, end-to-end development kit and rich tool components |
CNTK | https://cntk.ai/ | The deep learning toolkit describes the neural network as a series of calculation steps through a directed graph. In this directed graph, leaf nodes represent input values or network parameters, while other nodes represent matrix operations on their inputs |
Classification
These libraries can be classified according to their main purposes:
Category | Library |
---|---|
Image processing | NumPy, OpenCV, scikit-image, PIL, Pillow, SimpleCV, Mahotas, Ilastik |
Text processing | NLTK, spaCy, NumPy, scikit-learn, PyTorch |
Audio processing | LibROSA |
Machine learning | Pandas, scikit-learn, Orange, PyBrain, Milk |
Data visualization | Matplotlib, Seaborn, scikit-learn, Orange |
Deep learning | TensorFlow, PyTorch, Theano, Keras, Caffe2, MXNet, PaddlePaddle, CNTK |
Scientific computing | SciPy |
More
Other Python libraries and packages for AI and machine learning can be found at https://python.libhunt.com/packages/artificial-intelligence.