A Convolutional Neural Network (CNN) is a type of artificial neural network used largely for image recognition and processing. It is a powerful tool for recognizing patterns in images, but it typically requires a large amount of labeled data for training. In this article, we take an in-depth look at CNNs.
What is CNN?
A Convolutional Neural Network (CNN), also called a ConvNet, is a type of artificial neural network used largely for image recognition and processing. CNNs were first introduced by Yann LeCun in the 1980s. They are powerful at recognizing patterns in images, but training them typically requires a large amount of labeled data.
Although CNNs were created to handle visual tasks such as image classification, they are also used in natural language processing, drug development, and health risk assessment. They can also help self-driving cars estimate depth.
How Do Convolutional Neural Networks Work?
Convolutional Neural Networks stand out from conventional neural networks through their stronger performance on image, speech, and audio signal inputs. A CNN is built from three main types of layers:
- Convolution Layer
- Pooling Layer
- Fully-Connected Layer
We will further discuss these layers in detail in this blog.
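When an input image is fed to the network, it first passes through a convolution layer followed by a ReLU activation. A color image is a 3D array of pixels, with one channel each for red, green, and blue. The result then goes to a pooling layer, which shrinks it by keeping only the maximum value in each region, and this convolution and pooling cycle repeats several times. Together, these stages extract the features that the network learns during training. The extracted features are then passed to fully connected layers, which classify the image. For example, given a picture of a car, softmax assigns each class a probability between 0 and 1, and the class with the highest probability, here the car, is taken as the prediction.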
Convolutional Neural Network Architecture
The CNN architecture is made up of two important components:
- Feature extraction: convolution filters isolate and identify the distinct characteristics of a picture for analysis. This part of the network consists of the input, the convolution layers, and the pooling layers.
- Classification: this part consists of the fully connected layers and the output. It uses the features produced by the convolution stage to predict the image’s class.
A CNN becomes more complex with each additional layer, which allows it to detect larger portions of the picture. The first few layers concentrate on basic elements such as colors and edges. As the image data travels through the CNN layers, the network starts to recognize larger components or features of the image and eventually identifies the target object. We will talk about these layers in detail in the upcoming section.
Convolutional Neural Network Layers
Convolutional layers, pooling layers, and fully-connected (FC) layers are the three types of layers that make up a CNN. Stacking these layers forms a CNN architecture. Here is a detailed explanation of these three layers.
Convolution layer: The convolutional layer is the most essential component of a CNN, as this is where most of the computation takes place. It works with three things: the input data, a filter, and a feature map, which is the output the filter produces.
Suppose the input is a color picture, which is made up of a 3D matrix of pixels. The input therefore has three dimensions: height, width, and depth, where the depth corresponds to the picture’s three RGB color channels. The image is split into its red, green, and blue channels, and a filter is applied to each channel.
A feature detector, commonly known as a kernel or filter, moves across the image’s receptive fields, checking whether a feature is present. At each position it computes the dot product between the filter weights and the pixels beneath it and writes the result into the feature map.
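The sliding-window operation described above can be sketched in plain NumPy. This is a minimal illustration and not how Keras implements convolution internally: the function name `conv2d_single_channel`, the 4x5 input, and the vertical-edge kernel are all assumptions made up for this example. Like most deep learning libraries, it computes a cross-correlation, meaning the kernel is not flipped.

```python
import numpy as np

def conv2d_single_channel(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # The patch of the image currently under the kernel.
            patch = image[i:i + kh, j:j + kw]
            # Element-wise product summed up: the dot product of patch and kernel.
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

# Hypothetical single-channel image: dark on the left, bright on the right.
image = np.array([
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
], dtype=float)

# Hypothetical 3x3 kernel that responds to dark-to-bright vertical edges.
kernel = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
], dtype=float)

feature_map = conv2d_single_channel(image, kernel)
print(feature_map)
```

The feature map is zero where the patch is uniformly dark and positive where the patch contains the edge, which is the sense in which a filter "checks for the presence" of a feature.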
Pooling Layer: Pooling is a dimensionality reduction technique that shrinks the feature map and so reduces the number of values passed to later layers. Like the convolutional layer, the pooling operation sweeps a filter across the input. Unlike the convolutional layer, however, this filter has no weights.
Instead, the kernel applies an aggregation function to the values in the receptive field to populate the output array. Pooling is also known as downsampling. Max pooling and average pooling are the two basic forms of pooling.
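To make the two forms of pooling concrete, here is a small NumPy sketch. It assumes non-overlapping windows (stride equal to the window size); the helper name `pool2d` and the 4x4 input values are hypothetical and chosen only for illustration.

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling with a size x size window (stride equals size)."""
    h, w = feature_map.shape
    out_h, out_w = h // size, w // size
    # Drop edge rows/columns that do not fill a whole window,
    # then group the remaining pixels into (size x size) windows.
    trimmed = feature_map[:out_h * size, :out_w * size]
    windows = trimmed.reshape(out_h, size, out_w, size)
    if mode == "max":
        return windows.max(axis=(1, 3))
    if mode == "average":
        return windows.mean(axis=(1, 3))
    raise ValueError(f"unknown mode: {mode}")

feature_map = np.array([
    [1, 3, 2, 0],
    [5, 7, 1, 1],
    [0, 2, 4, 8],
    [2, 0, 6, 2],
], dtype=float)

max_pooled = pool2d(feature_map, mode="max")      # [[7, 2], [2, 8]]
avg_pooled = pool2d(feature_map, mode="average")  # [[4, 1], [1, 5]]
print(max_pooled)
print(avg_pooled)
```

Note that the function has no weights to learn: it only aggregates the values in each window, halving the height and width of the input.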
Fully-Connected Layer: The fully-connected layer’s name is a perfect description of what it is. In the partially connected convolution and pooling layers, the input image’s pixel values are not directly connected to the output layer.
In the fully-connected layer, however, each node in the output layer connects directly to every node in the preceding layer. This layer performs classification based on the features extracted by the preceding layers and their filters.
While convolutional layers generally use ReLU activation functions, the final FC layer typically uses a softmax activation function to produce a probability between 0 and 1 for each class.
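The two activation functions mentioned here are short enough to write out. The following is a minimal NumPy sketch; the raw scores and the idea that they stand for three classes (say car, truck, and bike) are assumptions made up for this example.

```python
import numpy as np

def relu(x):
    """ReLU keeps positive values and replaces negative values with zero."""
    return np.maximum(0.0, x)

def softmax(logits):
    """Turn raw scores into probabilities that are positive and sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow.
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

# Hypothetical raw scores for three classes: car, truck, bike.
scores = np.array([2.0, -1.0, 0.5])

activated = relu(scores)             # [2.0, 0.0, 0.5]
probs = softmax(scores)              # one probability per class
predicted = int(np.argmax(probs))    # index of the most likely class

print(activated)
print(probs, probs.sum())
print(predicted)
```

The class with the largest raw score also receives the largest probability, so taking the `argmax` of the softmax output gives the predicted class.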
Important aspects of CNN
The important aspects of CNN are filters, receptive field, stride, and padding.
Filters
Filters in Convolutional Neural Networks recognize spatial patterns such as edges in an image by detecting changes in the picture’s intensity values.
Receptive Field
The receptive field is the region of the input that feeds into a given unit of a layer. Within a single convolutional layer, the filter size determines the receptive field.
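When layers are stacked, a unit deeper in the network sees a wider region of the original image than its own filter size alone would suggest, because each layer builds on the output of the one before it. The sketch below uses the standard recurrence for this growth. The function name `receptive_field` and the example stack of 3x3 convolutions and 2x2 pooling layers are assumptions chosen for illustration.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one unit after a stack of layers.

    `layers` is a list of (kernel_size, stride) tuples ordered from input to output.
    """
    rf = 1     # a single input pixel sees only itself
    jump = 1   # distance in input pixels between neighboring units of the current layer
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf

# One 3x3 convolution: the receptive field equals the filter size.
print(receptive_field([(3, 1)]))          # 3

# Two stacked 3x3 convolutions see a 5x5 region of the input.
print(receptive_field([(3, 1), (3, 1)]))  # 5

# Hypothetical stack: conv 3x3, pool 2x2 (stride 2), conv 3x3, pool 2x2, conv 3x3.
stack = [(3, 1), (2, 2), (3, 1), (2, 2), (3, 1)]
print(receptive_field(stack))             # 18
```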
Stride
The kernel’s stride is the number of pixels it moves across the input matrix at each step. Although stride values of two or more are less common, a larger stride produces a smaller output.
Padding
Padding adds extra pixels, usually zeros, around the border of the image before the kernel is applied. As the kernel/filter moves over the picture, it would otherwise convert the image into a smaller one; padding lets the kernel cover the border pixels and keeps the output from shrinking.
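The combined effect of filter size, stride, and padding on the output size can be captured in one formula: output = floor((input + 2 * padding - kernel) / stride) + 1. The helper below is a small sketch of that arithmetic for one spatial dimension; the name `conv_output_size` and the 28-pixel example input are assumptions for illustration.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one dimension after a convolution or pooling step."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# A 3x3 kernel with no padding shrinks a 28-pixel side to 26.
print(conv_output_size(28, 3))                       # 26

# One pixel of padding on each side keeps the size at 28.
print(conv_output_size(28, 3, padding=1))            # 28

# A stride of 2 roughly halves the output.
print(conv_output_size(28, 3, stride=2))             # 13
print(conv_output_size(28, 3, stride=2, padding=1))  # 14
```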
Implementation of Convolutional Neural Networks using TensorFlow and Keras
Importing the necessary libraries
import numpy as np
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt
Loading the dataset
# MNIST: 28x28 grayscale images of handwritten digits with labels 0-9
(x_train, y_train), (x_test, y_test) = datasets.mnist.load_data()
Normalizing the data
# Scale pixel values to [0, 1] and add a channel axis: (N, 28, 28) -> (N, 28, 28, 1)
x_train = (x_train / 255.0)[..., np.newaxis]
x_test = (x_test / 255.0)[..., np.newaxis]
Converting the labels into one-hot vectors
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
Building the CNN Model
model = models.Sequential([
    # Feature extraction: convolution + ReLU followed by max pooling
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    # Classification: flatten the feature maps and apply fully-connected layers
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])
Compiling the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
Training the model
history = model.fit(x_train,
                    y_train,
                    epochs=10,
                    validation_data=(x_test, y_test),
                    batch_size=64)
Evaluating the model
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print(f'\nTest accuracy: {test_acc}')
Limitations of CNN
- Operations such as max pooling can make a Convolutional Neural Network slower.
- Training a CNN with many layers takes a long time on a machine without a powerful GPU.
- A ConvNet requires a large dataset to train the neural network.
- It recognizes patterns but does not truly comprehend the contents of a picture.
Conclusion
Regardless of the limitations of CNNs, there’s no doubt that they’ve ushered in a new era in Artificial Intelligence. Face recognition, picture search and editing, augmented reality, and other computer vision applications all employ CNNs today. The results are impressive and valuable, as improvements in CNNs demonstrate, but we are still a long way from reproducing the core components of human intelligence. We hope this blog helps you understand the essentials of convolutional neural networks. If you want to learn more about CNNs, check out our Artificial Intelligence Course.