Artificial neural networks (NNs) are loosely inspired by the structure of the brain and mimic the way humans learn. They consist of layers of neurons, where each neuron in a layer is connected to the neurons in the next layer. In general, NNs have three kinds of layers:
- Input Layer : The first layer of the network. Each input feature is fed to one neuron of this layer
- Hidden Layer : The layer between the input and output layers, where feature extraction takes place
- Output Layer : The last layer of the network, which provides the final output
Based on their architecture, we can categorize NNs into three types:
- Simple Neural Network : It has an input layer and an output layer only (no hidden layers)
- Shallow Neural Network : It has a single hidden layer between the input layer and the output layer
- Deep Neural Network : It has more than one hidden layer between the input layer and the output layer
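The three categories above depend only on the number of hidden layers, which can be sketched in a few lines of Python. The function name `network_type` and the list-of-layer-sizes representation are assumptions made for illustration.

```python
def network_type(layer_sizes):
    """Name the architecture category from a list of layer sizes.

    layer_sizes lists the neurons per layer, input layer first and
    output layer last, e.g. [2, 1] or [2, 3, 1] (hypothetical format).
    """
    if len(layer_sizes) < 2:
        raise ValueError("need at least an input layer and an output layer")
    hidden_layers = len(layer_sizes) - 2  # everything between input and output
    if hidden_layers == 0:
        return "simple"
    if hidden_layers == 1:
        return "shallow"
    return "deep"


print(network_type([2, 1]))        # simple
print(network_type([2, 3, 1]))     # shallow
print(network_type([2, 4, 4, 1]))  # deep
```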
Let's see how input data passes through the network. We feed each feature of an input sample to a neuron in the input layer, the data is processed as it passes through the hidden layers, and finally we receive the output at the output layer. By passing all the training data through the network in this way for several iterations, we train the neural network. After training, we check the accuracy of the trained network using the test data. Training a neural network can be roughly divided into two parts:
- Forward Propagation
- Backward Propagation
First, let's see what forward propagation is and how it works, using an example.
Forward propagation means moving in the forward direction: we pass the input data through the neural network from the input layer to the output layer. To calculate the output of a neuron (node) in a layer, we multiply each input (from the neurons of the previous layer) by the weight of its connection to that neuron, sum these products, add a bias, and pass the result through an activation function.
Let's look at the terms used above.
- Weight : Every connection between two neurons carries a real value called a weight. Put simply, it indicates how strongly an input influences the output.
- Bias : A value that does not depend on the inputs and is added to the weighted sum before the activation function is applied. It shifts the activation function's curve to the left or right.
- Activation function : It is essential because it introduces nonlinearity into the network. This is what allows a neural network to learn nonlinear relationships in the data.
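To tie these three terms together, here is a minimal sketch of a single neuron in plain Python. The names `sigmoid` and `neuron_output` and the numeric values are hypothetical, and sigmoid is just one possible choice of activation function.

```python
import math


def sigmoid(z):
    """Logistic activation; squashes any real z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs, plus bias, passed through an activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


# Hypothetical values, chosen only for illustration
inputs = [1.0, 2.0]
weights = [0.5, -0.25]

print(neuron_output(inputs, weights, bias=0.0))            # 0.5
print(round(neuron_output(inputs, weights, bias=1.0), 4))  # 0.7311
```

Changing only the bias moves the output from 0.5 to about 0.73, which is the shifting effect described above.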
Let's work through the following data to understand forward propagation. We will create a simple neural network and train it using this data.
We can represent our input data as follows,
Here each row of the input data represents one feature.
First, let's create a simple neural network. Since each input sample has two feature values, we use two neurons in the input layer. Since the output is either 0 or 1 (binary classification), we use one neuron in the output layer. Because this is a simple neural network, there are no hidden layers.
Note : In regression, the output is a continuous dependent variable, so you can use a single neuron in the output layer. In binary classification, you can also use a single neuron in the output layer. In multi-class classification, choose the number of neurons in the output layer based on the number of classes.
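The rule of thumb in this note can be written as a small helper. This is only a sketch: the function name and task labels are hypothetical, and it assumes the common choice of one output neuron per class for multi-class problems.

```python
def output_layer_size(task, num_classes=None):
    """Suggest the number of output neurons for a task.

    task: "regression", "binary", or "multiclass" (hypothetical labels).
    Assumes one output neuron per class in the multi-class case.
    """
    if task in ("regression", "binary"):
        return 1
    if task == "multiclass":
        if num_classes is None or num_classes < 3:
            raise ValueError("multiclass needs num_classes >= 3")
        return num_classes
    raise ValueError(f"unknown task: {task!r}")


print(output_layer_size("regression"))                 # 1
print(output_layer_size("binary"))                     # 1
print(output_layer_size("multiclass", num_classes=4))  # 4
```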
Now let's see what our weight matrix looks like. The size of the weight matrix between layers (n-1) and n is given by (here n = 1 denotes the input layer),
In our case n = 2, so the size of our weight matrix is given by,
Let's set the bias to 1 and the weights as follows,
We can visualize the structure of the simple neural network we are going to use as in the figure below (it can also be called a single-layer perceptron).
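As a rough sketch of how the weight matrix and bias are used, the Python below assumes one common convention: the weight matrix between layers n-1 and n has one row per neuron in layer n and one column per neuron in layer n-1. The helper names `weight_shape` and `weighted_sums`, and every numeric value except the bias of 1, are hypothetical and chosen only for illustration.

```python
def weight_shape(layer_sizes, n):
    """Shape (rows, cols) of the weight matrix between layers n-1 and n.

    Layers are numbered from 1 (the input layer), as in the text.
    Assumed convention: rows = neurons in layer n, cols = neurons in layer n-1.
    """
    if not 2 <= n <= len(layer_sizes):
        raise ValueError("n must be between 2 and the number of layers")
    return (layer_sizes[n - 1], layer_sizes[n - 2])


def weighted_sums(W, X, b):
    """Compute Z = W X + b using plain lists.

    W: one row per output neuron, one column per input feature
    X: one row per feature, one column per sample
    b: a single bias value added to every weighted sum
    """
    n_samples = len(X[0])
    return [
        [sum(w * X[i][s] for i, w in enumerate(row)) + b for s in range(n_samples)]
        for row in W
    ]


layer_sizes = [2, 1]  # two input neurons, one output neuron
print(weight_shape(layer_sizes, 2))  # (1, 2)

# Hypothetical inputs and weights (not the article's values)
X = [[0.0, 1.0, 2.0],   # feature 1 for three samples
     [1.0, 0.0, 2.0]]   # feature 2 for three samples
W = [[0.5, -1.0]]
b = 1.0
print(weighted_sums(W, X, b))  # [[0.0, 1.5, 0.0]]
```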
If we forward propagate the input data through the network, we get the following weighted sums after adding the bias,
Next, we pass each weighted sum through an activation function. Here we use the sigmoid activation function, as it limits the output to the range between 0 and 1. The sigmoid function is given by σ(z) = 1 / (1 + e^(-z)), where z is the weighted sum after adding the bias.
If we pass the z value of each input through the sigmoid activation function, the outputs of the activation function will be,
After passing through the activation function, we have to define a threshold value to classify the output as 0 or 1. We are going to use a threshold value of 0.5. Then,
The predicted outputs will be,
These are the predicted outputs of the simple neural network from forward propagation.
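The activation and thresholding steps can be sketched as follows. The z values are hypothetical, the function names are illustrative, and treating an activation exactly equal to the threshold as class 1 is an assumption.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(z_values, threshold=0.5):
    """Apply sigmoid, then label activations >= threshold as 1, else 0."""
    activations = [sigmoid(z) for z in z_values]
    labels = [1 if a >= threshold else 0 for a in activations]
    return activations, labels


# Hypothetical weighted sums (not the article's values)
z_values = [-2.0, 0.0, 3.0]
activations, labels = predict(z_values)
print([round(a, 4) for a in activations])  # [0.1192, 0.5, 0.9526]
print(labels)                              # [0, 1, 1]
```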
This is how forward propagation works and predicts the outputs. However, you may notice that there are errors between the predicted outputs and the actual outputs.
To reduce these errors, we use backpropagation, during which we update the weights and bias. In the next article, we will discuss how backpropagation works using these predicted outputs.
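One simple way to quantify such errors is to count the mismatched labels and compute the accuracy. The sketch below uses hypothetical predicted and actual labels, and the function names are illustrative.

```python
def count_errors(predicted, actual):
    """Number of positions where the predicted label differs from the actual one."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(1 for p, a in zip(predicted, actual) if p != a)


def accuracy(predicted, actual):
    """Fraction of labels predicted correctly."""
    if not actual:
        raise ValueError("need at least one label")
    return 1 - count_errors(predicted, actual) / len(actual)


# Hypothetical labels (not the article's values)
predicted = [1, 1, 0, 1]
actual = [1, 0, 0, 0]
print(count_errors(predicted, actual))  # 2
print(accuracy(predicted, actual))      # 0.5
```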