CIFAR-10 Classification Using Keras – Tutorial
The CIFAR-10 dataset consists of 60000 32×32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
Recognizing photos from the CIFAR-10 collection is one of the most common problems in today's world of machine learning. I'm going to show you, step by step, how to build multi-layer artificial neural networks that recognize images from the CIFAR-10 set with an accuracy of about 80%, and how to visualize the results.
Building 4- and 6-layer Convolutional Neural Networks
To build our CNNs (Convolutional Neural Networks) we will use Keras and introduce a few techniques common in modern deep learning models, such as the ReLU activation function and dropout.
Keras is an open-source neural network Python library that can run on top of other machine learning libraries like TensorFlow, CNTK or Theano. It allows easy and fast prototyping and supports convolutional neural networks, recurrent neural networks and combinations of the two.
First, we will cover what Keras and deep learning are and briefly describe the CIFAR-10 collection. Then, step by step, we will build a 4-layer and a 6-layer neural network along with their visualization, arriving at the percentage classification accuracy of each together with its graphical interpretation.
Finally, we will look at the results and compare the two networks in terms of accuracy and training speed per epoch.
The CIFAR-10 Dataset
The dataset is divided into five training batches and one test batch, each with 10000 images.
The test batch contains exactly 1000 randomly-selected images from each class.
The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another.
Between them, the training batches contain exactly 5000 images from each class.
You can download it from the CIFAR-10 page: https://www.cs.toronto.edu/~kriz/cifar.html.
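If you'd like to verify these class counts yourself, here is a minimal standalone sketch (my own addition, not part of the original tutorial) that loads the dataset through keras.datasets and counts the labels:

# Optional sanity check: count how many images each class has
import numpy as np
from keras.datasets import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print(np.bincount(y_train.flatten()))  # expected: 5000 per class
print(np.bincount(y_test.flatten()))   # expected: 1000 per class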
Convolutional Neural Networks – The Code
First of all, we will be defining all of the classes and functions we will need:
# Import all modules
import time
import matplotlib.pyplot as plt
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.constraints import maxnorm
from keras.optimizers import SGD
from keras.layers import Activation
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.normalization import BatchNormalization
from keras.utils import np_utils
from keras_sequential_ascii import sequential_model_to_ascii_printout
from keras import backend as K
if K.backend() == 'tensorflow':
    K.set_image_dim_ordering("th")

# Import TensorFlow with multiprocessing
import tensorflow as tf
import multiprocessing as mp

# Loading the CIFAR-10 datasets
from keras.datasets import cifar10
As good practice suggests, we first declare our variables:
- batch_size – the number of training examples in one forward/backward pass. The higher the batch size, the more memory you will need
- num_classes – the number of classes in the CIFAR-10 dataset
- epochs – one epoch is one forward pass and one backward pass over all of the training examples
# Declare variables
batch_size = 32   # 32 examples in a mini-batch; a smaller batch size means more updates in one epoch
num_classes = 10
epochs = 100      # repeat 100 times
Next, we can load the CIFAR-10 data set.
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# x_train - training data (images), y_train - labels (digits)
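A quick optional check (not in the original post) to confirm what we just loaded; note that the exact image shape depends on the image dimension ordering configured earlier ("th" gives channels first):

print(x_train.shape, y_train.shape)  # e.g. (50000, 3, 32, 32) (50000, 1) with channels-first ordering
print(x_test.shape, y_test.shape)    # e.g. (10000, 3, 32, 32) (10000, 1)
print(x_train.dtype)                 # uint8 pixel values in 0..255 before pre-processing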
Let's print a figure with ten random images, one from each class of the CIFAR-10 dataset.
# Print figure with one random image from each class
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']  # CIFAR-10 class labels

fig = plt.figure(figsize=(8, 3))
for i in range(num_classes):
    ax = fig.add_subplot(2, 5, 1 + i, xticks=[], yticks=[])
    idx = np.where(y_train[:] == i)[0]
    features_idx = x_train[idx, ::]
    img_num = np.random.randint(features_idx.shape[0])
    im = np.transpose(features_idx[img_num, ::], (1, 2, 0))  # channels-first to channels-last for imshow
    ax.set_title(class_names[i])
    plt.imshow(im)
plt.show()
Running the code creates a 2×5 plot of images showing an example from each class.
The pixel values are in the range of 0 to 255 for each of the red, green and blue channels.
It’s good practice to work with normalized data.
Because the input values are well understood, we can easily normalize them to the range 0 to 1 by dividing each value by the maximum, 255.
Note that the data is loaded as integers, so we must cast it to floating point values in order to perform the division.
# Convert and pre-processing
y_train = np_utils.to_categorical(y_train, num_classes)  # one-hot encode the labels
y_test = np_utils.to_categorical(y_test, num_classes)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
The output labels are integers from 0 to 9, so to_categorical converts each of them into a 10-element binary vector with a 1 in the position of the correct class.
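To make the encoding concrete, here is a tiny standalone illustration (my addition) of what np_utils.to_categorical produces:

# Standalone one-hot encoding example
from keras.utils import np_utils
print(np_utils.to_categorical([0, 3, 9], 10))
# [[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
#  [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
#  [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]]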
Let’s start by defining a simple CNN model.
We will use a model with four convolutional layers followed by max pooling and a flattening out of the network to fully connected layers to make predictions:
- Convolutional input layer, 32 feature maps with a size of 3×3, a rectifier activation function
- Convolutional layer, 32 feature maps with a size of 3×3, a rectifier activation function
- Max Pool layer with size 2×2
- Dropout set to 25%
- Convolutional layer, 64 feature maps with a size of 3×3, a rectifier activation function
- Convolutional layer, 64 feature maps with a size of 3×3, a rectifier activation function
- Max Pool layer with size 2×2
- Dropout set to 25%
- Flatten layer
- Fully connected layer with 512 units and a rectifier activation function
- Dropout set to 50%
- Fully connected output layer with 10 units and a softmax activation function
A logarithmic loss function (categorical cross-entropy) is used with the stochastic gradient descent (SGD) optimization algorithm, configured with a large momentum and weight decay, starting with a learning rate of 0.1.
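As a side note, the decay argument implements Keras 2's time-based schedule, lr_t = lr / (1 + decay * iterations). Here is a quick illustrative calculation (my own addition) showing how gently decay=1e-6 shrinks the step size:

# Effect of time-based learning rate decay (illustrative only)
lr, decay = 0.1, 1e-6
for it in [0, 10000, 100000, 1000000]:
    print(it, lr / (1 + decay * it))
# after a million updates the learning rate has only halved, to 0.05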
Then we can fit this model with 100 epochs and a batch size of 32.
def base_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), padding='same', input_shape=x_train.shape[1:]))
    model.add(Activation('relu'))
    model.add(Conv2D(32, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Conv2D(64, (3, 3), padding='same'))
    model.add(Activation('relu'))
    model.add(Conv2D(64, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes))
    model.add(Activation('softmax'))

    # SGD with weight decay, momentum and Nesterov acceleration
    sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)

    # Train model
    model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
    return model

cnn_n = base_model()
cnn_n.summary()

# Fit model
cnn = cnn_n.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
                validation_data=(x_test, y_test), shuffle=True)
The second variant is a 6-layer model:
- Convolutional input layer, 32 feature maps with a size of 3×3, a rectifier activation function
- Dropout set to 20%
- Convolutional layer, 32 feature maps with a size of 3×3, a rectifier activation function
- Max Pool layer with size 2×2
- Convolutional layer, 64 feature maps with a size of 3×3, a rectifier activation function
- Dropout set to 20%
- Convolutional layer, 64 feature maps with a size of 3×3, a rectifier activation function
- Max Pool layer with size 2×2
- Convolutional layer, 128 feature maps with a size of 3×3, a rectifier activation function
- Dropout set to 20%
- Convolutional layer, 128 feature maps with a size of 3×3, a rectifier activation function
- Max Pool layer with size 2×2
- Flatten layer
- Dropout set to 20%
- Fully connected layer with 1024 units, a rectifier activation function and a weight constraint of max norm set to 3
- Dropout set to 20%
- Fully connected output layer with 10 units and a softmax activation function
def base_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=x_train.shape[1:]))
    model.add(Dropout(0.2))
    model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
    model.add(Dropout(0.2))
    model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
    model.add(Dropout(0.2))
    model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dropout(0.2))
    model.add(Dense(1024, activation='relu', kernel_constraint=maxnorm(3)))
    model.add(Dropout(0.2))
    model.add(Dense(num_classes, activation='softmax'))

    # Compile with the same SGD optimizer as the 4-layer variant;
    # build and fit exactly as before
    sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
    return model
In this section, we can visualize the model structure. For this, we can use keras_sequential_ascii, a library by Piotr Migdał for investigating the architectures and parameters of sequential Keras models.
# Visualizing model structure
sequential_model_to_ascii_printout(cnn_n)
First variant (4-layer):
Second variant (6-layer):
After the training process, we can see loss and accuracy on plots using the code below:
# Plots for training and testing process: loss and accuracy
plt.figure(0)
plt.plot(cnn.history['acc'], 'r')
plt.plot(cnn.history['val_acc'], 'g')
plt.xticks(np.arange(0, 101, 2.0))
plt.rcParams['figure.figsize'] = (8, 6)
plt.xlabel("Num of Epochs")
plt.ylabel("Accuracy")
plt.title("Training Accuracy vs Validation Accuracy")
plt.legend(['train', 'validation'])

plt.figure(1)
plt.plot(cnn.history['loss'], 'r')
plt.plot(cnn.history['val_loss'], 'g')
plt.xticks(np.arange(0, 101, 2.0))
plt.rcParams['figure.figsize'] = (8, 6)
plt.xlabel("Num of Epochs")
plt.ylabel("Loss")
plt.title("Training Loss vs Validation Loss")
plt.legend(['train', 'validation'])

plt.show()
4-layer:
6-layer:
# Final evaluation of the model on the test set
scores = cnn_n.evaluate(x_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1] * 100))
Running this example prints the classification accuracy and loss on the training and test datasets for each epoch, followed by the final test accuracy.
After that, we can print a confusion matrix for our example with graphical interpretation.
A confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one (in unsupervised learning it is usually called a matching matrix). Each row of the matrix represents the instances in a predicted class, while each column represents the instances in an actual class (or vice versa).
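As a tiny worked example (hypothetical labels, my addition): scikit-learn's confusion_matrix uses the "vice versa" convention, with rows for the actual class and columns for the predicted class.

from sklearn.metrics import confusion_matrix
y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]
print(confusion_matrix(y_true, y_pred))
# [[1 1]   -> one class-0 sample classified correctly, one misclassified as 1
#  [1 2]]  -> one class-1 sample misclassified as 0, two classified correctly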
# Confusion matrix result
from sklearn.metrics import classification_report, confusion_matrix

Y_pred = cnn_n.predict(x_test, verbose=2)
y_pred = np.argmax(Y_pred, axis=1)

for ix in range(10):
    print(ix, confusion_matrix(np.argmax(y_test, axis=1), y_pred)[ix].sum())

cm = confusion_matrix(np.argmax(y_test, axis=1), y_pred)
print(cm)

# Visualizing the confusion matrix
import seaborn as sn
import pandas as pd

df_cm = pd.DataFrame(cm, range(10), range(10))
plt.figure(figsize=(10, 7))
sn.set(font_scale=1.4)  # for label size
sn.heatmap(df_cm, annot=True, annot_kws={"size": 12})  # font size
plt.show()
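Note that classification_report is imported above but never called; as an optional extra (assuming the class_names list defined earlier), it prints per-class precision, recall and F1-score from the same predictions:

# Optional: per-class precision/recall/F1 for the same test predictions
print(classification_report(np.argmax(y_test, axis=1), y_pred, target_names=class_names))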
4-layer confusion matrix and visualizing:
[[599 5 74 98 55 14 12 9 117 17]
[ 16 738 12 65 9 26 7 6 40 81]
[ 31 0 523 168 136 86 33 14 9 0]
[ 10 1 31 652 90 175 19 15 5 2]
[ 6 0 34 132 717 55 16 31 9 0]
[ 5 1 17 233 53 661 10 15 4 1]
[ 2 1 39 157 105 48 637 3 7 1]
[ 6 0 14 97 103 96 5 637 5 1]
[ 41 7 28 84 19 18 6 4 783 10]
[ 25 28 8 77 29 27 5 19 59 723]]
6-layer confusion matrix and visualizing:
[[736 11 54 45 30 14 15 9 61 25]
[ 10 839 6 38 3 13 7 5 22 57]
[ 47 2 566 96 145 65 51 17 7 4]
[ 23 6 56 570 97 140 57 29 12 10]
[ 16 2 52 80 700 55 25 64 3 3]
[ 10 1 64 211 59 582 24 39 6 4]
[ 4 3 42 114 121 40 650 13 5 8]
[ 14 1 40 57 69 68 11 723 3 14]
[ 93 32 26 37 16 15 6 2 752 21]
[ 34 83 8 42 12 21 6 21 25 748]]
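One handy way to read these matrices (my own suggestion, reusing the cm and class_names variables from above): the diagonal divided by the row sums gives per-class accuracy, and large off-diagonal entries reveal the most common confusions, such as the 233 dogs predicted as cats in the 4-layer matrix.

# Per-class accuracy from the confusion matrix: correct predictions / actual instances
per_class_acc = cm.diagonal() / cm.sum(axis=1)
for name, acc in zip(class_names, per_class_acc):
    print("%-12s %.1f%%" % (name, acc * 100))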
Comparison of Accuracy [%] between the 4-layer and 6-layer CNNs
As we can see in the chart below, the 4-layer CNN reaches its best accuracy between epochs 20 and 50, while the 6-layer CNN reaches its best accuracy between epochs 10 and 20.
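If you want to act on that observation rather than always training for the full 100 epochs, one option (not used in the runs above) is Keras' EarlyStopping callback, which halts training when the validation metric stops improving; a hedged sketch:

# Stop training once validation accuracy plateaus (optional alternative run)
from keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_acc', patience=10, verbose=1)
cnn = cnn_n.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
                validation_data=(x_test, y_test), shuffle=True,
                callbacks=[early_stop])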
Comparison of Training Time between the 4-layer and 6-layer CNNs
As we can see in the chart below, training takes considerably longer for the 6-layer network.
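To measure per-epoch training time yourself, here is a minimal sketch (my addition) using a custom Keras callback:

# Minimal per-epoch timer, passed to model.fit via callbacks
import time
from keras.callbacks import Callback

class EpochTimer(Callback):
    """Prints how long each training epoch takes."""
    def on_epoch_begin(self, epoch, logs=None):
        self.start = time.time()
    def on_epoch_end(self, epoch, logs=None):
        print("Epoch %d took %.1f s" % (epoch + 1, time.time() - self.start))

# Usage: pass callbacks=[EpochTimer()] to model.fit(...)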
Summary
After working through this tutorial you learned:
- What the Keras library is and how to use it
- What deep learning is
- How to use ready-made datasets
- What Convolutional Neural Networks (CNNs) are
- How to build a Convolutional Neural Network (CNN) step by step
- How the results of the two models differ
- The basics of machine learning
- An introduction to Artificial Intelligence (AI)
- What a confusion matrix is and how to visualize it
If you have any questions about the project or this post, please ask your question in the comments.
You can download the full source code from GitHub.