Before discussing convolutional neural networks, we first need to understand deep learning. Deep learning is a subset of machine learning inspired by how the human brain works. Deep learning algorithms try to emulate the brain by repeatedly analyzing data with a given logical structure. To do this, they use a multilayered structure called a neural network.
Artificial neural networks are made up of neurons, which are the main processing units of the network.
Let's look at the structure of such a network. The first layer is the input layer, where the training observations are fed to the neurons. All the computations are performed by the hidden layers, and the output layer then uses their results to make the prediction.
Let's use an example in which an image of M × M pixels is passed to the network, with every pixel fed as an input to a first-layer neuron.
The neurons of each layer are connected to the neurons of the next layer through channels, and each channel has a specific weight associated with it.
The inputs are multiplied by these weights, and the resulting weighted sum is sent to the hidden-layer neurons.
Each of these neurons is also associated with a value called the bias, which is added to the weighted sum.
The result is then passed through an activation function, which determines the value transferred to the neurons of the next layer.
The predicted output is determined by the output neuron with the highest value. This is how data is transmitted through the network.
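As a minimal sketch of this computation (NumPy only, with hypothetical layer sizes and values, not the Keras code used later), one layer's forward pass is just a weighted sum plus a bias, followed by an activation:

import numpy as np

# Hypothetical sizes: 4 input pixels feeding 3 hidden neurons.
x = np.array([0.2, 0.5, 0.1, 0.9])   # input values (e.g. pixel intensities)
W = np.random.randn(3, 4) * 0.1      # one weight per input-to-neuron channel
b = np.zeros(3)                      # bias added to each neuron's weighted sum

z = W @ x + b                        # weighted sum of inputs plus bias
a = np.maximum(0, z)                 # ReLU activation passed to the next layer

print(a)                             # the neuron with the highest value drives the prediction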
There are several types of neural networks, such as the Multi-Layered Perceptron, the Artificial Neural Network, the Recurrent Neural Network, and the Convolutional Neural Network, which is the focus of this article.
Convolutional Neural Networks
A convolutional neural network (CNN) extracts features from input images in order to perform tasks such as face recognition and image classification. The network usually has more than one layer for feature extraction; each layer multiplies its input by learned weights (filters) and preserves the important features without human supervision.
CNNs are applied specifically to image classification. With an Artificial Neural Network (ANN), the two-dimensional image must be flattened into a one-dimensional vector before training the model.
Moreover, as the image size increases, the number of parameters needed to train the ANN grows rapidly, which increases storage and computation cost. An ANN also cannot capture the spatial relationships between neighboring pixels.
Therefore, a CNN is preferred; the rough parameter count below illustrates the difference.
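As a rough illustration (the 64 × 64 RGB input size, the 256-unit dense layer, and the 2 × 2, 32-filter convolution are taken from the example code later in this article):

# Rough parameter count for a fully connected (ANN) first layer on a 64x64 RGB image.
pixels = 64 * 64 * 3                                  # flattened 1-D vector length: 12,288
hidden_units = 256                                    # dense hidden-layer width
ann_weights = pixels * hidden_units + hidden_units    # about 3.1 million trainable parameters

# The same image handled by a convolutional layer with 32 filters of size 2x2.
cnn_weights = 32 * (2 * 2 * 3) + 32                   # only 416 trainable parameters
print(ann_weights, cnn_weights)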
First/Convolution Layer:
The first layer of the network extracts features by sliding a filter over the input image. The resulting output (the feature map) corresponds to the original image's textures, curves, sharp edges, and so on. In a network with several convolutional layers, generic features are extracted in the initial layers and more complex features are extracted as the network gets deeper.
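Here is a minimal NumPy sketch of that sliding-filter operation on a tiny 4 × 4 image with made-up values (the Keras Conv2D layer used later does the same thing, except that it also learns the filter values):

import numpy as np

image = np.array([[1, 2, 0, 1],
                  [3, 1, 1, 0],
                  [0, 2, 4, 1],
                  [1, 0, 2, 3]], dtype=float)   # 4x4 grayscale input

kernel = np.array([[1, 0],
                   [0, -1]], dtype=float)       # 2x2 filter (an edge-like detector)

h, w = image.shape
kh, kw = kernel.shape
feature_map = np.zeros((h - kh + 1, w - kw + 1))

# Slide the filter over the image and take the weighted sum at each position.
for i in range(feature_map.shape[0]):
    for j in range(feature_map.shape[1]):
        feature_map[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)

print(feature_map)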
Second/Pooling Layer:
The function of this layer is to decrease the number of trainable parameters by reducing the spatial size of the feature maps; this is how the computational cost is reduced.
Pooling is applied to each feature map separately, so the depth of the output is unchanged. The most common method is max pooling, where only the largest element in each window is kept. Max pooling therefore reduces the dimensions considerably while keeping the essential information.
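A minimal NumPy sketch of 2 × 2 max pooling with stride 2 on a made-up feature map (the Keras MaxPool2D layer used later performs the same operation):

import numpy as np

feature_map = np.array([[1, 3, 2, 4],
                        [5, 6, 1, 2],
                        [7, 2, 9, 0],
                        [3, 1, 4, 8]], dtype=float)

# 2x2 max pooling with stride 2: keep only the largest value in each window,
# halving each spatial dimension while preserving the strongest responses.
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)   # [[6. 4.]
                #  [7. 9.]]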
Fully Connected Layer:
The last layers, the fully connected layers, are the ones that determine the output. The pooling-layer output is flattened into a 1-D vector and then fed into the fully connected layer.
The final (output) layer has the same number of neurons as the number of classes in our classification problem, thereby mapping the extracted features to specific labels.
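A minimal NumPy sketch of this step, with a hypothetical pooled output and three made-up classes:

import numpy as np

pooled = np.random.rand(2, 2, 32)      # hypothetical pooling-layer output
flattened = pooled.reshape(-1)         # compressed into a 1-D vector (128 values)

num_classes = 3                        # one output neuron per class
W = np.random.randn(num_classes, flattened.size) * 0.01
b = np.zeros(num_classes)

scores = W @ flattened + b
probabilities = np.exp(scores) / np.sum(np.exp(scores))   # softmax over the class scores
print(probabilities, probabilities.argmax())              # predicted label = highest value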
The above process is called forward propagation. The generated output is compared with the actual output to compute the error.
The error is then propagated back through the network to update the filter weights and bias values.
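A minimal sketch of one such update for a single weight and bias under a squared-error loss (the values and learning rate are made up; Keras handles all of this automatically during training):

x, target = 0.5, 1.0
weight, bias, lr = 0.2, 0.0, 0.1               # initial weight, bias, and learning rate

for step in range(3):
    output = weight * x + bias                 # forward propagation
    error = (output - target) ** 2             # compare prediction with the actual value
    # Backpropagation: gradient of the squared error w.r.t. the weight and bias.
    d_out = 2 * (output - target)
    weight -= lr * d_out * x                   # move the weight against its gradient
    bias -= lr * d_out                         # update the bias the same way
    print(step, round(error, 4))

The complete example below then builds, compiles, and summarizes a small Keras CNN for exactly this kind of binary image classification.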
import cv2
import numpy as np
import os
import pandas as pd
import keras
import sklearn
import tensorflow as tf
from keras.models import Sequential
from keras.preprocessing.image import ImageDataGenerator
# Preprocessing the data: rescale pixel values and augment the training images
datagen_train = ImageDataGenerator(rescale=1. / 255, zoom_range=0.2, shear_range=0.2, horizontal_flip=True)
train_gen = datagen_train.flow_from_directory(r"C:\cats\train vs dogs", target_size=(64, 64), batch_size=32, class_mode='binary')
datagen_test = ImageDataGenerator(rescale=1. / 255)
validation_gen = datagen_test.flow_from_directory(r"C:\cats\train vs dogs", target_size=(64, 64), batch_size=32, class_mode='binary')
# CNN model
cnn_model = tf.keras.models.Sequential()
# Convolution: first convolutional layer (input shape matches the 64x64 RGB images)
cnn_model.add(tf.keras.layers.Conv2D(filters=32, kernel_size=2, activation='relu', input_shape=[64, 64, 3]))
# Pooling
cnn_model.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))
# Adding another convolutional layer
cnn_model.add(tf.keras.layers.Conv2D(filters=32, kernel_size=2, activation='relu'))
# Adding one pooling layer
cnn_model.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))
# Flatten and fully connected layers
cnn_model.add(tf.keras.layers.Flatten())
cnn_model.add(tf.keras.layers.Dense(units=256, activation='relu'))
# Single output neuron for the binary (two-class) problem
cnn_model.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
# Compiling the model
cnn_model.compile(optimizer="adam", loss="binary_crossentropy", metrics=['accuracy'])
cnn_model.summary()
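To actually start training with the generators defined above, a call along these lines would follow (the epoch count here is arbitrary):

# Runs forward propagation, error computation, and backpropagation for each batch.
cnn_model.fit(train_gen, validation_data=validation_gen, epochs=10)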