CIFAR-10 classification using Keras Tutorial

Posted 27/08/2018

The CIFAR-10 dataset consists of 60000 32×32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

Recognizing images from the CIFAR-10 collection is one of the most common benchmark problems in machine learning today. I'm going to show you, step by step, how to build multi-layer convolutional neural networks that recognize CIFAR-10 images with an accuracy of about 80%, and how to visualize the results.

Building 4 and 6-layer Convolutional Neural Networks

To build our CNNs (convolutional neural networks) we will use Keras and introduce a few newer deep learning techniques, such as the ReLU activation function and dropout.
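
As a rough illustration of the two techniques named above, the NumPy sketch below implements ReLU and inverted dropout by hand. It is not how Keras implements them internally; the function names relu and dropout, and the rate and rng parameters, are chosen here for illustration only.

```python
import numpy as np

def relu(x):
    """Rectified linear unit: negative values become 0, positive values pass through."""
    return np.maximum(x, 0.0)

def dropout(x, rate, rng):
    """Inverted dropout: zero a fraction `rate` of the inputs at random and
    rescale the survivors by 1 / (1 - rate) so the expected value is unchanged."""
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)

activations = relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0]))
print(activations)

rng = np.random.default_rng(0)
dropped = dropout(np.ones(10000), rate=0.25, rng=rng)
print(dropped.mean())  # close to 1.0 on average
```

Dropout is only active during training; at prediction time the layer passes its input through unchanged, which is why the surviving values are rescaled during training.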

Keras is an open-source neural network library for Python that can run on top of other machine learning libraries such as TensorFlow, CNTK or Theano. It allows easy and fast prototyping and supports convolutional networks, recurrent networks, and combinations of the two.

First, we will briefly cover what Keras and deep learning are, what we are going to learn, and the CIFAR-10 collection. Then, step by step, we will build a 4-layer and a 6-layer neural network, visualize their structure, and report the classification accuracy with a graphical interpretation.

Finally, we will see the results and compare the two networks in terms of the accuracy and speed of training for each epoch.

The CIFAR-10 Dataset

The dataset is divided into five training batches and one test batch, each with 10000 images.

The test batch contains exactly 1000 randomly-selected images from each class.

The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another.

Between them, the training batches contain exactly 5000 images from each class.
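
The per-class balance described above is easy to check with np.bincount. The sketch below uses synthetic labels as a stand-in so that it runs without downloading anything; with the real data, the flattened label array returned by the Keras loader would take the place of labels (an assumed substitution).

```python
import numpy as np

num_classes = 10
images_per_class = 5000

# Synthetic stand-in for the training labels: 5000 of each class, shuffled.
rng = np.random.default_rng(42)
labels = rng.permutation(np.repeat(np.arange(num_classes), images_per_class))

# Count how many times each class index occurs in the whole training set.
counts = np.bincount(labels, minlength=num_classes)
print(counts)

# A single batch of 10000 images is not guaranteed to be balanced.
first_batch_counts = np.bincount(labels[:10000], minlength=num_classes)
print(first_batch_counts)
```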

You can download it from here; Keras also fetches it automatically the first time cifar10.load_data() is called.

Convolutional Neural Networks – The Code

First of all, we import all of the classes and functions we will need:

# Import all modules
import time
import matplotlib.pyplot as plt
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.constraints import maxnorm
from keras.optimizers import SGD
from keras.layers import Activation
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.normalization import BatchNormalization
from keras.utils import np_utils
from keras_sequential_ascii import sequential_model_to_ascii_printout
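from keras import backend as K
# Use the channels-first ("th", Theano-style) image layout: (channels, rows, cols).
# The plotting code further down relies on it when it transposes images for display.
if K.backend()=='tensorflow':
    K.set_image_dim_ordering("th")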

# Import Tensorflow with multiprocessing
import tensorflow as tf
import multiprocessing as mp

# Loading the CIFAR-10 datasets
from keras.datasets import cifar10

As good practice suggests, we declare our variables up front:

batch_size – the number of training examples in one forward/backward pass. The higher the batch size, the more memory you'll need
num_classes – the number of classes in the CIFAR-10 dataset
epochs – the number of epochs; one epoch is one forward pass and one backward pass over all the training examples
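
To make the relationship between these three values concrete, the short calculation below counts the weight updates implied by the settings used in this tutorial. The variable names train_size, updates_per_epoch and total_updates are introduced here for illustration.

```python
import math

train_size = 50000   # CIFAR-10 training images
batch_size = 32
epochs = 100

# Each epoch visits every training image once, one mini-batch at a time.
# The last batch is smaller when batch_size does not divide train_size evenly.
updates_per_epoch = math.ceil(train_size / batch_size)
total_updates = updates_per_epoch * epochs

print(updates_per_epoch)  # 1563
print(total_updates)      # 156300
```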

# Declare variables

batch_size = 32
# 32 examples in a mini-batch; a smaller batch size means more updates in one epoch

num_classes = 10 # number of CIFAR-10 classes
epochs = 100 # repeat 100 times

Next, we can load the CIFAR-10 data set.

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# x_train - training data (images), y_train - labels (class indices 0-9)

Next, we plot one random image from each of the 10 classes.

# Plot one random image from each class

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

fig = plt.figure(figsize=(8,3))
for i in range(num_classes):
    ax = fig.add_subplot(2, 5, 1 + i, xticks=[], yticks=[])
    idx = np.where(y_train[:]==i)[0]
    features_idx = x_train[idx,::]
    img_num = np.random.randint(features_idx.shape[0])
    # Images are channels-first here, so move the channel axis last for imshow
    im = np.transpose(features_idx[img_num,::],(1,2,0))
    ax.set_title(class_names[i])
    plt.imshow(im)
plt.show()

Running the code creates a 2×5 grid of images showing one example from each class.

The pixel values are in the range of 0 to 255 for each of the red, green and blue channels.

It’s good practice to work with normalized data.

Because the input values are well understood, we can easily normalize to the range 0 to 1 by dividing each value by the maximum observation which is 255.

Note that the data is loaded as integers, so we must cast it to floating-point values in order to perform the division.
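
The small experiment below shows why the cast has to come first. It uses a three-pixel array as a stand-in for the image data; the names pixels, scaled and in_place_failed are illustrative only.

```python
import numpy as np

# A tiny stand-in for image data: raw pixel values stored as unsigned 8-bit integers.
pixels = np.array([[0, 128, 255]], dtype=np.uint8)

# In-place true division cannot store fractional results in an integer array.
try:
    pixels /= 255
    in_place_failed = False
except TypeError:
    in_place_failed = True
print(in_place_failed)

# Casting first gives a float array that can hold values between 0 and 1.
scaled = pixels.astype('float32') / 255
print(scaled.dtype, scaled.min(), scaled.max())
```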

# Convert and pre-processing

y_train = np_utils.to_categorical(y_train, num_classes)
y_test = np_utils.to_categorical(y_test, num_classes)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train  /= 255
x_test /= 255

The class labels are one-hot encoded: each label becomes a vector of length 10 containing a 1 at the index of its class and 0 everywhere else.
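
For readers who want to see what this encoding does, here is a minimal NumPy sketch that mimics it. The helper one_hot is a stand-in written for this example, not the Keras function itself.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class indices into one-hot rows."""
    labels = np.asarray(labels).reshape(-1)          # accept shape (n,) or (n, 1)
    return np.eye(num_classes, dtype='float32')[labels]

# CIFAR-10 labels arrive as a column of class indices, i.e. shape (n, 1).
y = np.array([[3], [0], [9]])
encoded = one_hot(y, num_classes=10)
print(encoded.shape)   # (3, 10)
print(encoded[0])      # a 1 at index 3, zeros elsewhere
```

Taking np.argmax along axis 1 reverses the encoding, which is what the confusion-matrix code later in this tutorial does with y_test.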

Let’s start by defining a simple CNN model.

We will use a model with four convolutional layers followed by max pooling and a flattening out of the network to fully connected layers to make predictions:

Convolutional input layer, 32 feature maps with a 3×3 kernel, a rectifier activation function
Convolutional layer, 32 feature maps with a 3×3 kernel, a rectifier activation function
Max Pool layer with size 2×2
Dropout set to 25%
Convolutional layer, 64 feature maps with a 3×3 kernel, a rectifier activation function
Convolutional layer, 64 feature maps with a 3×3 kernel, a rectifier activation function
Max Pool layer with size 2×2
Dropout set to 25%
Flatten layer
Fully connected layer with 512 units and a rectifier activation function
Dropout set to 50%
Fully connected output layer with 10 units and a softmax activation function
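
It can help to trace how the 32×32 input shrinks on its way through this stack. The sketch below does the arithmetic by hand, assuming stride-1 convolutions and the padding used in the model code that follows (padding='same' on the first convolution of each pair, the Keras default 'valid' on the second). The helpers conv_out and pool_out are written for this example.

```python
def conv_out(size, kernel, padding):
    """Spatial size after a stride-1 convolution."""
    return size if padding == 'same' else size - kernel + 1

def pool_out(size, pool):
    """Spatial size after non-overlapping max pooling (any remainder is dropped)."""
    return size // pool

size = 32                               # CIFAR-10 images are 32x32
size = conv_out(size, 3, 'same')        # 32
size = conv_out(size, 3, 'valid')       # 30
size = pool_out(size, 2)                # 15
size = conv_out(size, 3, 'same')        # 15
size = conv_out(size, 3, 'valid')       # 13
size = pool_out(size, 2)                # 6

flattened = 64 * size * size            # 64 feature maps of 6x6
dense_params = flattened * 512 + 512    # weights plus biases of the 512-unit layer
print(size, flattened, dense_params)    # 6 2304 1180160
```

Most of the trainable parameters therefore sit in the first fully connected layer, not in the convolutions.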

A logarithmic loss function (categorical cross-entropy) is used with the stochastic gradient descent (SGD) optimization algorithm, configured with a large momentum (0.9), Nesterov updates, a small learning-rate decay, and an initial learning rate of 0.1.
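
The decay argument of the legacy Keras SGD optimizer acts as a learning-rate schedule. The sketch below assumes the time-based rule used by that legacy optimizer, lr / (1 + decay * iterations), where one iteration is one mini-batch update; newer Keras releases handle schedules differently, so treat the numbers as illustrative.

```python
import math

initial_lr = 0.1
decay = 1e-6
updates_per_epoch = math.ceil(50000 / 32)   # 1563 mini-batches per epoch

def decayed_lr(iterations):
    """Time-based decay, as assumed for the legacy Keras SGD optimizer."""
    return initial_lr / (1.0 + decay * iterations)

for epoch in (0, 10, 50, 100):
    lr = decayed_lr(epoch * updates_per_epoch)
    print(f"after {epoch:3d} epochs: lr = {lr:.5f}")
```

Under this assumption the learning rate falls only gradually over the 100 epochs, so training stays fairly aggressive throughout.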

Then we can fit this model with 100 epochs and a batch size of 32.

def base_model():

    model = Sequential()
    model.add(Conv2D(32, (3, 3), padding='same', input_shape=x_train.shape[1:]))
    model.add(Activation('relu'))
    model.add(Conv2D(32,(3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    model.add(Conv2D(64, (3, 3), padding='same'))
    model.add(Activation('relu'))
    model.add(Conv2D(64, (3,3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    model.add(Flatten())
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes))
    model.add(Activation('softmax'))

    sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)

    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
    return model

cnn_n = base_model()
cnn_n.summary()

# Fit model

cnn = cnn_n.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_data=(x_test,y_test),shuffle=True)

The second variant is a 6-layer model:

Convolutional input layer, 32 feature maps with a 3×3 kernel, a rectifier activation function
Dropout set to 20%
Convolutional layer, 32 feature maps with a 3×3 kernel, a rectifier activation function
Max Pool layer with size 2×2
Convolutional layer, 64 feature maps with a 3×3 kernel, a rectifier activation function
Dropout set to 20%
Convolutional layer, 64 feature maps with a 3×3 kernel, a rectifier activation function
Max Pool layer with size 2×2
Convolutional layer, 128 feature maps with a 3×3 kernel, a rectifier activation function
Dropout set to 20%
Convolutional layer, 128 feature maps with a 3×3 kernel, a rectifier activation function
Max Pool layer with size 2×2
Flatten layer
Dropout set to 20%
Fully connected layer with 1024 units, a rectifier activation function and a weight constraint of max norm set to 3
Dropout set to 20%
Fully connected output layer with 10 units and a softmax activation function

def base_model():
    model = Sequential()

    model.add(Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=x_train.shape[1:]))
    model.add(Dropout(0.2))

    model.add(Conv2D(32,(3,3),padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(2,2)))

    model.add(Conv2D(64,(3,3),padding='same',activation='relu'))
    model.add(Dropout(0.2))

    model.add(Conv2D(64,(3,3),padding='same',activation='relu'))
    model.add(MaxPooling2D(pool_size=(2,2)))

    model.add(Conv2D(128,(3,3),padding='same',activation='relu'))
    model.add(Dropout(0.2))

    model.add(Conv2D(128,(3,3),padding='same',activation='relu'))
    model.add(MaxPooling2D(pool_size=(2,2)))

    model.add(Flatten())
    model.add(Dropout(0.2))
    model.add(Dense(1024,activation='relu',kernel_constraint=maxnorm(3)))
    model.add(Dropout(0.2))
    model.add(Dense(num_classes, activation='softmax'))

    # Compile with the same optimizer settings as the 4-layer model (assumed)
    sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
    return model
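
The max-norm constraint on the 1024-unit layer caps the length of each unit's incoming weight vector. The NumPy sketch below shows the idea on a tiny weight matrix; it is an illustration of the technique under the assumption that each column holds the incoming weights of one unit, not a copy of the Keras source. The function name max_norm and the epsilon parameter are chosen for this example.

```python
import numpy as np

def max_norm(weights, max_value=3.0, epsilon=1e-7):
    """Rescale each column (the incoming weights of one unit) so that its
    L2 norm does not exceed max_value; shorter columns are left almost unchanged."""
    norms = np.sqrt(np.sum(np.square(weights), axis=0, keepdims=True))
    desired = np.clip(norms, 0.0, max_value)
    return weights * (desired / (epsilon + norms))

# Two units with 2 incoming weights each: norms 5.0 and 1.0 before the constraint.
w = np.array([[3.0, 0.6],
              [4.0, 0.8]])
constrained = max_norm(w)
print(np.linalg.norm(constrained, axis=0))  # roughly [3. 1.]
```

Applied after every update, this keeps individual units from growing very large weights, which tends to combine well with dropout.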

In this section, we visualize the model structure. For this, we can use keras_sequential_ascii, a Keras library by Piotr Migdał for investigating the architectures and parameters of sequential models.

# Visualizing model structure

sequential_model_to_ascii_printout(cnn_n)

First variant for 4-layer:

Second variant for 6-layer:

After the training process, we can see loss and accuracy on plots using the code below:

# Plots for training and testing process: loss and accuracy

# Newer Keras versions name the accuracy keys 'accuracy' and 'val_accuracy'
plt.figure(0, figsize=(8, 6))
plt.plot(cnn.history['acc'],'r')
plt.plot(cnn.history['val_acc'],'g')
plt.xticks(np.arange(0, 101, 2.0))
plt.xlabel("Num of Epochs")
plt.ylabel("Accuracy")
plt.title("Training Accuracy vs Validation Accuracy")
plt.legend(['train','validation'])

plt.figure(1, figsize=(8, 6))
plt.plot(cnn.history['loss'],'r')
plt.plot(cnn.history['val_loss'],'g')
plt.xticks(np.arange(0, 101, 2.0))
plt.xlabel("Num of Epochs")
plt.ylabel("Loss")
plt.title("Training Loss vs Validation Loss")
plt.legend(['train','validation'])

plt.show()

4-layer:

6-layer:

scores = cnn_n.evaluate(x_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

During training, Keras prints the loss and accuracy on the training and test datasets for each epoch; the snippet above prints the final accuracy on the test set.

After that, we can print a confusion matrix for our example with graphical interpretation.

A confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one (in unsupervised learning it is usually called a matching matrix). In the scikit-learn function used below, each row represents the instances of an actual class and each column represents the instances of a predicted class; other sources use the opposite orientation.
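
To see how such a table is built, here is a small NumPy sketch that counts (true, predicted) pairs for a toy 3-class problem, with rows as true classes and columns as predictions. The function confusion_matrix_np and the toy labels are made up for this example.

```python
import numpy as np

def confusion_matrix_np(y_true, y_pred, num_classes):
    """Count (true, predicted) pairs; rows are true classes, columns are predictions."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)   # handles repeated index pairs correctly
    return cm

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
cm_small = confusion_matrix_np(y_true, y_pred, num_classes=3)
print(cm_small)
# [[1 1 0]
#  [0 2 0]
#  [1 0 1]]
```

Correct predictions land on the diagonal; every off-diagonal entry is a specific kind of mistake, for example the 1 in the bottom-left corner is a class-2 example predicted as class 0.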

# Confusion matrix result

from sklearn.metrics import classification_report, confusion_matrix
Y_pred = cnn_n.predict(x_test, verbose=2)
y_pred = np.argmax(Y_pred, axis=1)
y_true = np.argmax(y_test, axis=1)

# Compute the matrix once; each row sum is the number of test images in that class
cm = confusion_matrix(y_true, y_pred)
for ix in range(10):
    print(ix, cm[ix].sum())
print(cm)

# Visualizing the confusion matrix
import seaborn as sn
import pandas  as pd


df_cm = pd.DataFrame(cm, range(10),
                  range(10))
plt.figure(figsize = (10,7))
sn.set(font_scale=1.4)#for label size
sn.heatmap(df_cm, annot=True,annot_kws={"size": 12})# font size
plt.show()

4-layer confusion matrix and its visualization:

[[599   5  74  98  55  14  12   9 117  17]
 [ 16 738  12  65   9  26   7   6  40  81]
 [ 31   0 523 168 136  86  33  14   9   0]
 [ 10   1  31 652  90 175  19  15   5   2]
 [  6   0  34 132 717  55  16  31   9   0]
 [  5   1  17 233  53 661  10  15   4   1]
 [  2   1  39 157 105  48 637   3   7   1]
 [  6   0  14  97 103  96   5 637   5   1]
 [ 41   7  28  84  19  18   6   4 783  10]
 [ 25  28   8  77  29  27   5  19  59 723]]

6-layer confusion matrix and its visualization:

[[736  11  54  45  30  14  15   9  61  25]
 [ 10 839   6  38   3  13   7   5  22  57]
 [ 47   2 566  96 145  65  51  17   7   4]
 [ 23   6  56 570  97 140  57  29  12  10]
 [ 16   2  52  80 700  55  25  64   3   3]
 [ 10   1  64 211  59 582  24  39   6   4]
 [  4   3  42 114 121  40 650  13   5   8]
 [ 14   1  40  57  69  68  11 723   3  14]
 [ 93  32  26  37  16  15   6   2 752  21]
 [ 34  83   8  42  12  21   6  21  25 748]]
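
Matrices like the two above can be condensed into a few summary numbers. The sketch below computes overall accuracy and per-class recall from a confusion matrix whose rows are true classes; the helper summarize and the 3-class toy_cm are hypothetical and only keep the example short.

```python
import numpy as np

def summarize(cm):
    """Return overall accuracy and per-class recall from a confusion matrix
    whose rows are true classes and whose columns are predicted classes."""
    cm = np.asarray(cm, dtype=float)
    accuracy = np.trace(cm) / cm.sum()
    recall = np.diag(cm) / cm.sum(axis=1)
    return accuracy, recall

# Hypothetical 3-class matrix, used only to keep the example short.
toy_cm = [[8, 1, 1],
          [2, 6, 2],
          [0, 5, 5]]
accuracy, recall = summarize(toy_cm)
print(accuracy)   # 19 correct out of 30
print(recall)     # fraction of each true class that was recognized
```

Passing the cm array computed earlier instead of toy_cm should give the same kind of summary for either network, which makes it easier to see which classes each model confuses most.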