Deep learning project 2 of Youda school city: dog breed classifier

In this notebook, you will take the first step to develop algorithms that can be part of mobile or Web applications. At the end of this project, your program will be able to take any image provided by the user as input. If a dog can be detected from the image, it will output the prediction of dog breed. If the image shows a human face, it predicts the kind of dog most similar to it. The following figure shows the possible output results after completing the project.

Step 0: import dataset
Step 1: face detection
Step 2: detect dogs
Step 3: create a CNN from scratch to classify dog breeds
Step 4: use a CNN to distinguish dog breeds (using transfer learning)
Step 5: complete your algorithm
Step 6: test your algorithm

Step 0: import dataset

Download dog dataset: dog dataset , place it in the data folder and unzip it.

Downloaders dataset: human dataset (a human face recognition data set), place it in the data folder and unzip it.

Step 1: detect face

We will use the in OpenCV Haar feature-based cascade classifiers To detect the face in the image. OpenCV provides many pre trained face detection models, which are saved in XML files github . We have downloaded one of the detection models and stored it in the haarcascades directory.

In the following code unit, we will demonstrate how to use this detection model to find faces in sample images.

Write a face detector

We can use this program to write a function. If a face is detected in the image, it returns True, otherwise it returns False. This function is called face_detector, which takes the file path of the image as input and appears in the following code block.

Evaluate face detector

Question 1: use the following code cell to test face_ Performance of the detector function.

In human_ What is the probability of detecting a face in the first 100 images of files?
In dog_ What is the probability of detecting a face in the first 100 images of files?

Step 2: detect dogs

In this part, we use the pre trained vgg16 model to detect dogs in the image.

Given an image, the pre trained vgg16 model will return a prediction (from 1000 possible categories in ImageNet) for the objects contained in the image.

Use the pre trained model for prediction

In the next code cell, you will write a function that takes the path of the image as input and returns the index corresponding to the ImageNet class predicted by the pre trained vgg -16 model. The output should always be an integer between 0 and 999 (including 999).

from PIL import Image
import torchvision.transforms as transforms

# Set PIL to be tolerant of image files that are truncated.
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

def VGG16_predict(img_path):
    '''
    Use pre-trained VGG-16 model to obtain index corresponding to 
    predicted ImageNet class for image at specified path
    
    Args:
        img_path: path to an image
        
    Returns:
        Index corresponding to VGG-16 model's prediction
    '''
    
    ## TODO: Complete the function.
    ## Load and pre-process an image from the given img_path
    ## Return the *index* of the predicted class for that image
    normalize = transforms.Normalize(
       mean=[0.485, 0.456, 0.406],
       std=[0.229, 0.224, 0.225]
    )
    transform = transforms.Compose([transforms.Resize((224,224)),
                                    transforms.CenterCrop(224),
                                    transforms.ToTensor(),
                                    normalize
                                   ])
    image = Image.open(img_path)
    #print(image.size)
    image = transform(image)
    image.unsqueeze_(0)
    #print(image.size)
    if use_cuda:
        image = image.cuda()
        
    output = VGG16(image)
    
    if use_cuda:
        output = output.cuda()
        
    class_index = output.data.cpu().numpy().argmax()    
    
    
    return class_index # predicted class index

#print(dog_files[0])
VGG16_predict(dog_files[0])

result:

Write a dog detector

Studying the detailed list You will notice that the serial number corresponding to the dog category is 151-268. Therefore, when checking the pre training model to determine whether the image contains dogs, we only need to check vgg16 above_ Whether the predict function returns a value between 151 and 268 (including the end of the interval).

We use these ideas to complete the "dog" below_ The detector function returns {True if a dog is detected from the image, otherwise} False.

Question 2: use the following code cell to test the dog_ Performance of the detector function.

In human_ files_ What percentage of dogs are detected in short?
In dog_ files_ What percentage of dogs are detected in short?

You are free to explore other pre training networks (such as Inception-v3, ResNet-50, etc.). Please use the following code unit to test other pre trained PyTorch models. If you decide to perform this optional task, please report to human_files_short and dog_files_short performance.

Step 3: create a CNN from scratch to classify dog breeds

Now we have implemented a function that can recognize humans and dogs in images. But we need further methods to identify the category of dogs. In this step, you need to implement a convolutional neural network to classify dog breeds. You need to implement your convolutional neural network from scratch (at this stage, you can't use transfer learning), and you need to achieve an accuracy of more than 1% of the test set. In the five steps of this project, you also have the opportunity to use transfer learning to achieve a model with greatly improved accuracy.

It is worth noting that classifying dog images is a very challenging task. Because even a normal person, it is difficult to distinguish between the Litani and the Welsh Springer.

It is not difficult to find that other dog breeds have small class differences (such as golden retriever and American water hound).

Similarly, labradors are yellow, brown and black. Then your vision based algorithm will have to overcome this high class difference in order to divide these dogs of different colors into the same breed.

We also mentioned that random classification will get a very low result: regardless of the influence of slight imbalance of varieties, the probability of randomly guessing the correct varieties is 1 / 133, and the corresponding accuracy is less than 1%.

Remember, in the field of deep learning, practice is much higher than theory. Try a lot of different frameworks and trust your intuition! Of course, have fun!

Specifies the data loader for the Dog dataset

Question 3: describe the data preprocessing process you selected.

How does your code resize the image (crop, stretch, etc.)? What is the size of the input tensor and why?
Have you decided to expand the data set? If so, how (by translation, flipping, rotation, etc.)? If not, why?

Specify loss function and optimizer

Training and validation model

# the following import is required for training to be robust to truncated images
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

def train(n_epochs, loaders, model, optimizer, criterion, use_cuda, save_path):
    """returns trained model"""
    # initialize tracker for minimum validation loss
    valid_loss_min = np.Inf 
    
    for epoch in range(1, n_epochs+1):
        # initialize variables to monitor training and validation loss
        train_loss = 0.0
        valid_loss = 0.0
        
        ###################
        # train the model #
        ###################
        model.train()
        for batch_idx, (data, target) in enumerate(loaders['train']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()
            ## find the loss and update the model parameters accordingly
            ## record the average training loss, using something like
            ## train_loss = train_loss + ((1 / (batch_idx + 1)) * (loss.data - train_loss))
            optimizer.zero_grad()    
            ## find the loss and update the model parameters accordingly
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()
            ## record the average training loss, using something like
            ## train_loss = train_loss + ((1 / (batch_idx + 1)) * (loss.data - train_loss))
            train_loss += ((1 / (batch_idx + 1)) * (loss.data - train_loss))
            
             # print training/validation statistics 
            if (batch_idx+1) % 40 == 0:
                print('Epoch: {} \tBatch: {} \tTraining Loss: {:.6f}'.format(epoch, batch_idx + 1, train_loss))

            
        ######################    
        # validate the model #
        ######################
        model.eval()
        for batch_idx, (data, target) in enumerate(loaders['valid']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()
            ## update the average validation loss
            output = model(data)
            loss = criterion(output, target)
            valid_loss += ((1 / (batch_idx + 1)) * (loss.data - valid_loss))
            # print training/validation statistics 
            if (batch_idx+1) % 40 == 0:
                print('Epoch: {} \tBatch: {} \tValidation Loss: {:.6f}'.format(epoch, batch_idx + 1, valid_loss))
            
        # print training/validation statistics 
        print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(
            epoch, 
            train_loss,
            valid_loss
            ))
        
        ## TODO: save the model if validation loss has decreased
        if valid_loss <= valid_loss_min:
            print('Validation loss decreased ({:.6f} --> {:.6f}).  Saving model ...'.format(
                    valid_loss_min,
                    valid_loss))
            torch.save(model.state_dict(), save_path)
            valid_loss_min = valid_loss    
            
    # return trained model
    return model

epoch = 10

# train the model
model_scratch = train(epoch, loaders_scratch, model_scratch, optimizer_scratch, 
                      criterion_scratch, use_cuda, 'model_scratch.pt')

# load the model that got the best validation accuracy
model_scratch.load_state_dict(torch.load('model_scratch.pt'))

result:

Epoch: 1 	Batch: 40 	Training Loss: 4.889728
Epoch: 1 	Batch: 80 	Training Loss: 4.887787
Epoch: 1 	Batch: 120 	Training Loss: 4.887685
Epoch: 1 	Batch: 160 	Training Loss: 4.885648
Epoch: 1 	Batch: 200 	Training Loss: 4.881847
Epoch: 1 	Batch: 240 	Training Loss: 4.873278
Epoch: 1 	Batch: 280 	Training Loss: 4.862270
Epoch: 1 	Batch: 320 	Training Loss: 4.854340
Epoch: 1 	Batch: 40 	Validation Loss: 4.712207
Epoch: 1 	Training Loss: 4.851301 	Validation Loss: 4.704982
Validation loss decreased (inf --> 4.704982).  Saving model ...
Epoch: 2 	Batch: 40 	Training Loss: 4.730308
Epoch: 2 	Batch: 80 	Training Loss: 4.719476
Epoch: 2 	Batch: 120 	Training Loss: 4.701708
Epoch: 2 	Batch: 160 	Training Loss: 4.695746
Epoch: 2 	Batch: 200 	Training Loss: 4.692133
Epoch: 2 	Batch: 240 	Training Loss: 4.675904
Epoch: 2 	Batch: 280 	Training Loss: 4.663143
Epoch: 2 	Batch: 320 	Training Loss: 4.650386
Epoch: 2 	Batch: 40 	Validation Loss: 4.488307
Epoch: 2 	Training Loss: 4.643542 	Validation Loss: 4.494160
Validation loss decreased (4.704982 --> 4.494160).  Saving model ...
Epoch: 3 	Batch: 40 	Training Loss: 4.474283
Epoch: 3 	Batch: 80 	Training Loss: 4.501595
Epoch: 3 	Batch: 120 	Training Loss: 4.477735
Epoch: 3 	Batch: 160 	Training Loss: 4.488136
Epoch: 3 	Batch: 200 	Training Loss: 4.490754
Epoch: 3 	Batch: 240 	Training Loss: 4.487989
Epoch: 3 	Batch: 280 	Training Loss: 4.490090
Epoch: 3 	Batch: 320 	Training Loss: 4.481546
Epoch: 3 	Batch: 40 	Validation Loss: 4.262285
Epoch: 3 	Training Loss: 4.479444 	Validation Loss: 4.275317
Validation loss decreased (4.494160 --> 4.275317).  Saving model ...
Epoch: 4 	Batch: 40 	Training Loss: 4.402790
Epoch: 4 	Batch: 80 	Training Loss: 4.372338
Epoch: 4 	Batch: 120 	Training Loss: 4.365306
Epoch: 4 	Batch: 160 	Training Loss: 4.367325
Epoch: 4 	Batch: 200 	Training Loss: 4.374326
Epoch: 4 	Batch: 240 	Training Loss: 4.369847
Epoch: 4 	Batch: 280 	Training Loss: 4.365964
Epoch: 4 	Batch: 320 	Training Loss: 4.363493
Epoch: 4 	Batch: 40 	Validation Loss: 4.249445
Epoch: 4 	Training Loss: 4.364571 	Validation Loss: 4.248449
Validation loss decreased (4.275317 --> 4.248449).  Saving model ...
Epoch: 5 	Batch: 40 	Training Loss: 4.229365
Epoch: 5 	Batch: 80 	Training Loss: 4.267400
Epoch: 5 	Batch: 120 	Training Loss: 4.269664
Epoch: 5 	Batch: 160 	Training Loss: 4.257591
Epoch: 5 	Batch: 200 	Training Loss: 4.261866
Epoch: 5 	Batch: 240 	Training Loss: 4.247512
Epoch: 5 	Batch: 280 	Training Loss: 4.239336
Epoch: 5 	Batch: 320 	Training Loss: 4.230827
Epoch: 5 	Batch: 40 	Validation Loss: 4.043582
Epoch: 5 	Training Loss: 4.231559 	Validation Loss: 4.039588
Validation loss decreased (4.248449 --> 4.039588).  Saving model ...
Epoch: 6 	Batch: 40 	Training Loss: 4.180193
Epoch: 6 	Batch: 80 	Training Loss: 4.140314
Epoch: 6 	Batch: 120 	Training Loss: 4.153989
Epoch: 6 	Batch: 160 	Training Loss: 4.140887
Epoch: 6 	Batch: 200 	Training Loss: 4.151268
Epoch: 6 	Batch: 240 	Training Loss: 4.153749
Epoch: 6 	Batch: 280 	Training Loss: 4.153314
Epoch: 6 	Batch: 320 	Training Loss: 4.156451
Epoch: 6 	Batch: 40 	Validation Loss: 3.940857
Epoch: 6 	Training Loss: 4.149529 	Validation Loss: 3.945810
Validation loss decreased (4.039588 --> 3.945810).  Saving model ...
Epoch: 7 	Batch: 40 	Training Loss: 4.060485
Epoch: 7 	Batch: 80 	Training Loss: 4.065772
Epoch: 7 	Batch: 120 	Training Loss: 4.056967
Epoch: 7 	Batch: 160 	Training Loss: 4.068470
Epoch: 7 	Batch: 200 	Training Loss: 4.076772
Epoch: 7 	Batch: 240 	Training Loss: 4.087616
Epoch: 7 	Batch: 280 	Training Loss: 4.074337
Epoch: 7 	Batch: 320 	Training Loss: 4.080192
Epoch: 7 	Batch: 40 	Validation Loss: 3.860693
Epoch: 7 	Training Loss: 4.078263 	Validation Loss: 3.884382
Validation loss decreased (3.945810 --> 3.884382).  Saving model ...
Epoch: 8 	Batch: 40 	Training Loss: 3.960585
Epoch: 8 	Batch: 80 	Training Loss: 3.983979
Epoch: 8 	Batch: 120 	Training Loss: 3.965129
Epoch: 8 	Batch: 160 	Training Loss: 3.965021
Epoch: 8 	Batch: 200 	Training Loss: 3.965830
Epoch: 8 	Batch: 240 	Training Loss: 3.976013
Epoch: 8 	Batch: 280 	Training Loss: 3.975547
Epoch: 8 	Batch: 320 	Training Loss: 3.978744
Epoch: 8 	Batch: 40 	Validation Loss: 3.784086
Epoch: 8 	Training Loss: 3.980776 	Validation Loss: 3.779312
Validation loss decreased (3.884382 --> 3.779312).  Saving model ...
Epoch: 9 	Batch: 40 	Training Loss: 3.917738
Epoch: 9 	Batch: 80 	Training Loss: 3.967938
Epoch: 9 	Batch: 120 	Training Loss: 3.934165
Epoch: 9 	Batch: 160 	Training Loss: 3.917138
Epoch: 9 	Batch: 200 	Training Loss: 3.910391
Epoch: 9 	Batch: 240 	Training Loss: 3.909857
Epoch: 9 	Batch: 280 	Training Loss: 3.907439
Epoch: 9 	Batch: 320 	Training Loss: 3.901893
Epoch: 9 	Batch: 40 	Validation Loss: 3.826410
Epoch: 9 	Training Loss: 3.903304 	Validation Loss: 3.824970
Epoch: 10 	Batch: 40 	Training Loss: 3.910845
Epoch: 10 	Batch: 80 	Training Loss: 3.910709
Epoch: 10 	Batch: 120 	Training Loss: 3.915924
Epoch: 10 	Batch: 160 	Training Loss: 3.900949
Epoch: 10 	Batch: 200 	Training Loss: 3.883792
Epoch: 10 	Batch: 240 	Training Loss: 3.882242
Epoch: 10 	Batch: 280 	Training Loss: 3.875963
Epoch: 10 	Batch: 320 	Training Loss: 3.858594
Epoch: 10 	Batch: 40 	Validation Loss: 3.638186
Epoch: 10 	Training Loss: 3.861175 	Validation Loss: 3.647465
Validation loss decreased (3.779312 --> 3.647465).  Saving model ...
<All keys matched successfully>

test model

Try your model in the test data set of dog pictures. Use the following code unit to calculate and print test loss and accuracy. Make sure your test accuracy is greater than 10%.

Step 4: use CNN migration learning to distinguish dog breeds

Using the method of Transfer Learning can help us greatly reduce the training time without losing accuracy. In the following steps, you can try to use Transfer Learning to train your own CNN.

Model architecture

Specify loss function and optimizer

Training and validation model

# train the model
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

def train(n_epochs, loaders, model, optimizer, criterion, use_cuda, save_path):
    """returns trained model"""
    # initialize tracker for minimum validation loss
    valid_loss_min = np.Inf 
    
    for epoch in range(1, n_epochs+1):
        # initialize variables to monitor training and validation loss
        train_loss = 0.0
        valid_loss = 0.0
        
        ###################
        # train the model #
        ###################
        model.train()
        for batch_idx, (data, target) in enumerate(loaders['train']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()
                
            optimizer.zero_grad()    
            ## find the loss and update the model parameters accordingly
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()
            ## record the average training loss, using something like
            ## train_loss = train_loss + ((1 / (batch_idx + 1)) * (loss.data - train_loss))
            train_loss += ((1 / (batch_idx + 1)) * (loss.data - train_loss))
            
             # print training/validation statistics 
            if (batch_idx+1) % 40 == 0:
                print('Epoch: {} \tBatch: {} \tTraining Loss: {:.6f}'.format(epoch, batch_idx + 1, train_loss))
        ######################    
        # validate the model #
        ######################
        model.eval()
        for batch_idx, (data, target) in enumerate(loaders['valid']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()
            ## update the average validation loss
            output = model(data)
            loss = criterion(output, target)
            valid_loss += ((1 / (batch_idx + 1)) * (loss.data - valid_loss))
            # print training/validation statistics 
            if (batch_idx+1) % 40 == 0:
                print('Epoch: {} \tBatch: {} \tValidation Loss: {:.6f}'.format(epoch, batch_idx + 1, valid_loss))
           
        # print training/validation statistics 
        print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(
            epoch, 
            train_loss,
            valid_loss
            ))
        
        ## TODO: save the model if validation loss has decreased
        if valid_loss <= valid_loss_min:
            print('Validation loss decreased ({:.6f} --> {:.6f}).  Saving model ...'.format(
                    valid_loss_min,
                    valid_loss))
            torch.save(model.state_dict(), save_path)
            valid_loss_min = valid_loss    
    # return trained model
    return model


n_epochs = 2
model_transfer = train(n_epochs, loaders_transfer, model_transfer, optimizer_transfer, criterion_transfer, use_cuda, 'model_transfer.pt')

# load the model that got the best validation accuracy (uncomment the line below)
model_transfer.load_state_dict(torch.load('model_transfer.pt'))

result:

test model

Prediction of dog breeds by model

Write a function that takes the image path as input and returns the breed of dog predicted by your model

Step 5: complete your algorithm

Implement an algorithm whose input is the path of the image. It can distinguish whether the image contains a person, a dog or neither, and then:

If a dog is detected from the image, the predicted breed is returned.
If a person is detected from the image, the most similar dog breed is returned.
If neither can be detected in the image, an error message is output.

We welcome you to write your own function to detect humans and dogs in the image. You can use the function completed above at will_ Detector and dog_detector function. You need to use your CNN to predict dog breeds in step 5.

The example output of the algorithm is provided below, but you are free to design your own model!

Step 6: test your algorithm

In this section, you will try your new algorithm! What kind of dog does the algorithm think you look like? If you have a dog, can it accurately predict your dog's breed? If you have a cat, will it misjudge your cat as a dog?