Even if you're not a fan of kittens and puppies, you can generally still tell which are cats and which are dogs, because cats and dogs have different characteristics. So how can we train a network to distinguish cats from dogs using machine learning?
We chose a dataset from Kaggle (https://www.kaggle.com/c/dogs-vs-cats/data) and will train a neural network on it. The downloaded dataset is a bit large for our test: there are 12,500 training pictures each of cats and dogs. Let's first shrink the training set, then build and train the model. Our approach is to select, for each of cats and dogs, 1,000 training pictures, 500 validation pictures, and 500 test pictures, which we can do manually. All we need to do is:
```bash
# Non-executable sketch; it shows the idea clearly, and the executable
# Python version is in the complete code attached at the end.
mkdir dog-vs-cats-small
cp dog-vs-cats/train/cat/pic-{0..999}.jpg          dog-vs-cats-small/train/cat/
cp dog-vs-cats/train/dog/pic-{0..999}.jpg          dog-vs-cats-small/train/dog/
cp dog-vs-cats/validation/cat/pic-{1000..1499}.jpg dog-vs-cats-small/validation/cat/
cp dog-vs-cats/validation/dog/pic-{1000..1499}.jpg dog-vs-cats-small/validation/dog/
cp dog-vs-cats/test/cat/pic-{1500..1999}.jpg       dog-vs-cats-small/test/cat/
cp dog-vs-cats/test/dog/pic-{1500..1999}.jpg       dog-vs-cats-small/test/dog/
```
From our previous articles, we know that this kind of convolutional network can be built by stacking relu-activated Conv2D layers and MaxPooling2D layers. What needs slight modification is the size of the network: a larger network is required to handle the larger amount of data.
In a convolutional neural network, depth is often negatively correlated with feature-map size: the deeper the network, the smaller each feature map and the more channels it has. Typical figures are: depth 32 -> 128, size 150x150 -> 7x7. Here's the network we've built:
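This is the model-building excerpt from the complete code attached at the end (the Dropout layer, which we only add later when tackling overfitting, is left out at this stage):

```python
from keras import layers
from keras import models

model = models.Sequential()
# Four Conv2D/MaxPooling2D blocks: depth grows 32 -> 128 while
# the feature maps shrink from 148x148 down to 7x7.
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
# Flatten the 3D feature maps and classify with dense layers.
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.summary()
```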
The optimizer is still RMSprop, with the learning rate lowered from the default of 0.001 to 0.0001; we will introduce other optimizers in later articles. Since the result we need is "cat or dog", the activation of the last layer is sigmoid, and the natural loss function is binary_crossentropy. The network is now built; next we need to feed it data.
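In code, the compilation step looks like this (excerpted from the complete listing below):

```python
from keras import optimizers

# Binary classification: binary_crossentropy loss, sigmoid output,
# RMSprop with the learning rate lowered to 1e-4.
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])
```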
Our raw data are individual pictures in JPEG format, which is not what the network wants, so we need some preprocessing: read the files, decode the JPEGs into RGB pixel grids, convert the pixels into floating-point tensors, and rescale the pixel values from the interval 0-255 to 0-1, since the network is better at handling values between 0 and 1. Sounds a little troublesome, doesn't it? Fortunately, Keras, the easiest deep learning framework, has these cumbersome but useful tools built in: ImageDataGenerator, in keras.preprocessing.image, can turn directories of images into batches of RGB tensors with binary labels.
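Here's the preprocessing excerpt. This plain, rescale-only training generator is the one that appears commented out in the complete code below, where it has been replaced by the augmented version:

```python
from keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values from the 0-255 range to 0-1.
train_datagen = ImageDataGenerator(rescale=1. / 255)
test_datagen = ImageDataGenerator(rescale=1. / 255)

# Read images from the class subdirectories, resize them to 150x150,
# and yield batches of float tensors with binary labels.
train_generator = train_datagen.flow_from_directory(
    train_dir, target_size=(150, 150), batch_size=32, class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
    validation_dir, target_size=(150, 150), batch_size=32, class_mode='binary')
```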
Next, we fit the model to the data with fit_generator, passing it the generators created above. With that, we can train the network and draw the loss and accuracy curves just as before.
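The fitting and plotting excerpt (accuracy shown; the loss plot is identical apart from the metric):

```python
import matplotlib.pyplot as plt

# Draw 100 batches per epoch from the training generator,
# and evaluate on 50 batches from the validation generator.
history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=100,
    validation_data=validation_generator,
    validation_steps=50)

# The History object records per-epoch metrics for plotting.
epochs = range(len(history.history['acc']))
plt.plot(epochs, history.history['acc'], 'bo', label='Training acc')
plt.plot(epochs, history.history['val_acc'], 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.show()
```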
Training accuracy gradually approaches 100%, alerting us to the danger of overfitting; validation accuracy plateaus at around 70% after the fifth or sixth epoch and no longer increases.
The validation loss reaches its minimum after roughly five to ten epochs. Well... clearly the model is overfitting, and we need to reduce that.
Overfitting happened because there are too few training samples, and we use **data augmentation** to solve this problem. The practice is to generate more training data from the existing training data by applying random transformations that produce pictures which still look plausible. This lets the model see more distinct images during training and makes it generalize better. ImageDataGenerator provides the ability to randomly rotate, zoom, shift, and flip pictures. Adding a Dropout layer before the densely connected classifier reduces overfitting further, as shown in the excerpt below; here are the results:
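The corresponding excerpt, both parts of which appear in the complete code below:

```python
# Augmentation: random rotations, shifts, shears, zooms, and
# horizontal flips applied on the fly to the training images only.
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    rotation_range=40,        # rotate by up to 40 degrees
    width_shift_range=0.2,    # shift horizontally by up to 20%
    height_shift_range=0.2,   # shift vertically by up to 20%
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
)

# Dropout before the densely connected classifier randomly zeroes
# half of the activations during training, further reducing overfitting.
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
```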
As you can see, it works much better: training accuracy can now reach at least 80%. To improve the accuracy significantly, we will need some other methods, which we'll cover in the next article.
As usual, the complete code is attached:
```python
#!/usr/bin/env python3
import os
import shutil
import time

import matplotlib.pyplot as plt
from keras import layers
from keras import models
from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator


def make_small():
    original_dataset_dir = '/Users/renyuzhuo/Desktop/cat/dogs-vs-cats/train'
    base_dir = '/Users/renyuzhuo/Desktop/cat/dogs-vs-cats-small'
    os.mkdir(base_dir)

    train_dir = os.path.join(base_dir, 'train')
    os.mkdir(train_dir)
    validation_dir = os.path.join(base_dir, 'validation')
    os.mkdir(validation_dir)
    test_dir = os.path.join(base_dir, 'test')
    os.mkdir(test_dir)

    train_cats_dir = os.path.join(train_dir, 'cats')
    os.mkdir(train_cats_dir)
    train_dogs_dir = os.path.join(train_dir, 'dogs')
    os.mkdir(train_dogs_dir)
    validation_cats_dir = os.path.join(validation_dir, 'cats')
    os.mkdir(validation_cats_dir)
    validation_dogs_dir = os.path.join(validation_dir, 'dogs')
    os.mkdir(validation_dogs_dir)
    test_cats_dir = os.path.join(test_dir, 'cats')
    os.mkdir(test_cats_dir)
    test_dogs_dir = os.path.join(test_dir, 'dogs')
    os.mkdir(test_dogs_dir)

    # Copy the first 1,000 cat images to the training set,
    # the next 500 to validation, and the next 500 to test.
    fnames = ['cat.{}.jpg'.format(i) for i in range(1000)]
    for fname in fnames:
        src = os.path.join(original_dataset_dir, fname)
        dst = os.path.join(train_cats_dir, fname)
        shutil.copyfile(src, dst)

    fnames = ['cat.{}.jpg'.format(i) for i in range(1000, 1500)]
    for fname in fnames:
        src = os.path.join(original_dataset_dir, fname)
        dst = os.path.join(validation_cats_dir, fname)
        shutil.copyfile(src, dst)

    fnames = ['cat.{}.jpg'.format(i) for i in range(1500, 2000)]
    for fname in fnames:
        src = os.path.join(original_dataset_dir, fname)
        dst = os.path.join(test_cats_dir, fname)
        shutil.copyfile(src, dst)

    # The same split for the dog images.
    fnames = ['dog.{}.jpg'.format(i) for i in range(1000)]
    for fname in fnames:
        src = os.path.join(original_dataset_dir, fname)
        dst = os.path.join(train_dogs_dir, fname)
        shutil.copyfile(src, dst)

    fnames = ['dog.{}.jpg'.format(i) for i in range(1000, 1500)]
    for fname in fnames:
        src = os.path.join(original_dataset_dir, fname)
        dst = os.path.join(validation_dogs_dir, fname)
        shutil.copyfile(src, dst)

    fnames = ['dog.{}.jpg'.format(i) for i in range(1500, 2000)]
    for fname in fnames:
        src = os.path.join(original_dataset_dir, fname)
        dst = os.path.join(test_dogs_dir, fname)
        shutil.copyfile(src, dst)


def cat():
    base_dir = '/Users/renyuzhuo/Desktop/cat/dogs-vs-cats-small'
    train_dir = os.path.join(base_dir, 'train')
    validation_dir = os.path.join(base_dir, 'validation')

    # Stacked Conv2D/MaxPooling2D blocks; depth grows 32 -> 128
    # while the feature maps shrink toward 7x7.
    model = models.Sequential()
    model.add(layers.Conv2D(32, (3, 3), activation='relu',
                            input_shape=(150, 150, 3)))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(128, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(128, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(512, activation='relu'))
    model.add(layers.Dense(1, activation='sigmoid'))
    model.summary()

    model.compile(loss='binary_crossentropy',
                  optimizer=optimizers.RMSprop(lr=1e-4),
                  metrics=['acc'])

    # Without augmentation:
    # train_datagen = ImageDataGenerator(rescale=1. / 255)
    # With augmentation:
    train_datagen = ImageDataGenerator(
        rescale=1. / 255,
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
    )
    # The validation data must not be augmented, only rescaled.
    test_datagen = ImageDataGenerator(rescale=1. / 255)

    train_generator = train_datagen.flow_from_directory(
        train_dir,
        target_size=(150, 150),
        batch_size=32,
        class_mode='binary')
    validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=32,
        class_mode='binary')

    history = model.fit_generator(
        train_generator,
        steps_per_epoch=100,
        epochs=100,
        validation_data=validation_generator,
        validation_steps=50)
    model.save('cats_and_dogs_small_2.h5')

    # Plot the training/validation accuracy and loss curves.
    acc = history.history['acc']
    val_acc = history.history['val_acc']
    loss = history.history['loss']
    val_loss = history.history['val_loss']
    epochs = range(len(acc))

    plt.plot(epochs, acc, 'bo', label='Training acc')
    plt.plot(epochs, val_acc, 'b', label='Validation acc')
    plt.title('Training and validation accuracy')
    plt.legend()
    plt.show()

    plt.figure()
    plt.plot(epochs, loss, 'bo', label='Training loss')
    plt.plot(epochs, val_loss, 'b', label='Validation loss')
    plt.title('Training and validation loss')
    plt.legend()
    plt.show()


if __name__ == "__main__":
    time_start = time.time()
    # make_small()
    cat()
    time_end = time.time()
    print('Time Used: ', time_end - time_start)
```
> First published on the WeChat public account: RAIS