Visualizing the Output Images of the Convolutional Layers of a CNN

TL;DR: Have you ever wondered what the output images of the convolutional layers of a Convolutional Neural Network (CNN) look like? If so, this article is for you.

Posted by Fathima Fazla on 2021-Sep-09

Have you ever wondered what the output images of the convolutional layers of a Convolutional Neural Network (CNN) look like? If so, this article is for you. Even if not, read on to find the answer to the question you have just read.

CNNs are widely used for image classification. As the name suggests, convolution is the main technique used in a CNN. The convolutional layers of a CNN perform the convolution operation on images using filters. The height and width of a convolutional layer's output images depend on the size of its input images, the size and stride of the filters, and the amount of padding, while the number of output images equals the number of filters used in the convolution operation.
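The dependence on input size, filter size, stride and padding can be written as a small helper. This is only a sketch using the standard output-size formula for a square input and square filter; the function name conv_output_size is made up for illustration and is not part of any library.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Height (or width) of a convolution output for a square input.

    Standard formula: floor((input + 2 * padding - filter) / stride) + 1
    """
    return (input_size + 2 * padding - filter_size) // stride + 1


# A 28-pixel-wide image, a 2x2 filter, stride 1, no padding
print(conv_output_size(28, 2))  # 27

# A 3x3 filter with 1 pixel of padding keeps the size unchanged
print(conv_output_size(28, 3, stride=1, padding=1))  # 28
```

With the 2×2 filters used later in this article, a 28×28 MNIST image therefore becomes a 27×27 output image per filter.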

If you want to learn more about the theory of CNNs and how to implement one, I suggest going through the following articles.

  1. Convolutional Neural Network : An Overview
  2. Layers of a Convolutional Neural Network (Part 1)
  3. Layers of a Convolutional Neural Network (Part 2)
  4. Building a CNN model for the Classification of Fashion MNIST (Step by Step)

Now let's see how to visualize the output images of the convolutional layers. First we need to train a CNN on an image dataset. For that I am going to use the MNIST digits dataset. Using that dataset, let’s train a CNN with 2 convolutional layers (each having 32 filters of size 2×2 and the ReLU activation function), each followed by a max pooling layer, and finally 2 fully connected layers.

#Importing required libraries
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

#Loading the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()

#Normalizing the images of the MNIST dataset
train_images, test_images = train_images/255.0, test_images/255.0

#Reshaping the dimension of input images to (28, 28, 1)
train_images_new = train_images.reshape(-1, 28, 28, 1)
test_images_new = test_images.reshape(-1, 28, 28, 1)

#Creating the CNN model
model_cnn = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (2,2), input_shape = (28, 28, 1), activation = 'relu'),
    tf.keras.layers.MaxPooling2D((2,2)),
    tf.keras.layers.Conv2D(32, (2,2), activation = 'relu'),
    tf.keras.layers.MaxPooling2D((2,2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(500, activation = 'sigmoid'),
    tf.keras.layers.Dense(10, activation = 'softmax')
])

#Compiling the CNN model
model_cnn.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])

#Training the CNN model
results_cnn = model_cnn.fit(train_images_new, train_labels, epochs = 20, batch_size = 1000, validation_data = (test_images_new, test_labels))
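Before looking at any output images, it can help to know what sizes to expect from each layer. The sketch below traces the spatial size through the two Conv2D and two MaxPooling2D layers of the model above without needing TensorFlow. It assumes the Keras defaults the model relies on (no padding, convolution stride 1, pooling stride equal to the pool size); the helper names conv_size and pool_size are made up for illustration.

```python
def conv_size(n, f, stride=1):
    # Convolution without padding
    return (n - f) // stride + 1


def pool_size(n, p):
    # Max pooling without padding, stride equal to the pool size
    return (n - p) // p + 1


size = 28  # MNIST images are 28x28
trace = []
for kind in ("conv", "pool", "conv", "pool"):
    size = conv_size(size, 2) if kind == "conv" else pool_size(size, 2)
    trace.append(size)

# Number of values handed to the first fully connected layer (32 filters)
flattened = size * size * 32

print(trace)      # [27, 13, 12, 6]
print(flattened)  # 1152
```

So the first convolutional layer should give 32 images of 27×27 pixels, and the second one 32 images of 12×12 pixels. Calling model_cnn.summary() should report the same shapes.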

We have trained a CNN using the MNIST dataset. Now we are going to visualize the output images of the first convolutional layer.

#Defining the outputs of all layers
layer_outputs = [layer.output for layer in model_cnn.layers]
activation_model = tf.keras.models.Model(inputs = model_cnn.input, outputs = layer_outputs)

#Taking the outputs of all layers for the first image of the training set
activations = activation_model.predict(train_images_new[0].reshape(-1, 28, 28, 1))

The ‘activations’ list (obtained from the code above) consists of the outputs of all layers. From the ‘activations’ list, let’s pick the output images of the first convolutional layer.

first_layer = activations[0]

From the output images of the first convolutional layer, let’s pick the output image of the first filter (each convolutional layer has 32 filters, so there are 32 output images) and visualize it.

output = first_layer[0,:,:,0]
plt.figure(figsize=(5,5))
plt.imshow(output, cmap = 'viridis')
plt.xticks([])
plt.yticks([])
plt.show()

The output image, a single 27×27 feature map, will look as follows:

We can also visualize all 32 output images of the first convolutional layer, one for each of its 32 filters.

#Joining the 32 output images into a grid of 4 rows and 8 columns
seq = []
for k in (0, 8, 16, 24):
    seq.append(np.concatenate([first_layer[0,:,:,i] for i in range(k, k+8)], axis = 1))
stack = np.concatenate(seq, axis = 0)
plt.figure(figsize=(6,6))
plt.imshow(stack, cmap = 'viridis')
plt.xticks([])
plt.yticks([])
plt.show()

The output images will look as follows:
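The tiling code above is written for exactly 32 filters arranged as 4 rows of 8. A more general version might look like the sketch below. The function tile_feature_maps is a hypothetical helper, not a Keras or NumPy function, and a random array stands in for first_layer so that the snippet runs without a trained model.

```python
import numpy as np


def tile_feature_maps(layer_output, columns=8):
    """Arrange the feature maps of one image into a single 2-D grid.

    layer_output: array of shape (1, height, width, filters).
    Unused cells in the last row (if any) stay zero.
    """
    maps = layer_output[0]                  # drop the batch axis
    height, width, filters = maps.shape
    rows = -(-filters // columns)           # ceiling division
    grid = np.zeros((rows * height, columns * width), dtype=maps.dtype)
    for i in range(filters):
        r, c = divmod(i, columns)
        grid[r * height:(r + 1) * height,
             c * width:(c + 1) * width] = maps[:, :, i]
    return grid


# Stand-in for first_layer: random values with the same shape (1, 27, 27, 32)
fake_first_layer = np.random.rand(1, 27, 27, 32)
grid = tile_feature_maps(fake_first_layer)
print(grid.shape)  # (108, 216)
```

With a trained model, the result of tile_feature_maps(first_layer) could be passed to plt.imshow in the same way as ‘stack’ above.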

Likewise, from the ‘activations’ list you can access the output of each layer (first_layer = activations[0], second_layer = activations[1], third_layer = activations[2] and so on). I hope you now know how to visualize the output images of not only the convolutional layers, but also the pooling layers.
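Note that only the convolutional and pooling layers produce image-like outputs; the Flatten and Dense layers return flat vectors that cannot be shown with the same code. One way to pick out the layers worth plotting is to check the number of dimensions, as in this sketch. The list fake_activations is a stand-in filled with zeros, using the shapes the seven layers of the model above should produce for a single image; with a trained model you would loop over ‘activations’ instead.

```python
import numpy as np

# Stand-in for the 'activations' list (one entry per layer, batch of 1 image)
fake_activations = [
    np.zeros((1, 27, 27, 32)),  # first Conv2D
    np.zeros((1, 13, 13, 32)),  # first MaxPooling2D
    np.zeros((1, 12, 12, 32)),  # second Conv2D
    np.zeros((1, 6, 6, 32)),    # second MaxPooling2D
    np.zeros((1, 1152)),        # Flatten
    np.zeros((1, 500)),         # first Dense
    np.zeros((1, 10)),          # second Dense (class probabilities)
]

# Image-like outputs have 4 axes: (batch, height, width, filters)
image_like = [(index, output.shape)
              for index, output in enumerate(fake_activations)
              if output.ndim == 4]

for index, shape in image_like:
    print(index, shape)
```

Each index printed here can be used as activations[index] in the visualization code shown earlier.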

If you have any questions, feel free to ask below.