From the course: Artificial Intelligence Foundations: Neural Networks
How neural networks learn
- [Instructor] In this video, let's learn how neural networks actually learn the data's characteristics and patterns. This image is what we'll use to understand how neural networks learn. Our use case is predicting the price of a house. Is this a classification problem or a regression problem? If you answered regression, you are correct. You are not looking to classify house prices into buckets of low, medium, or high, but to predict the house price based on a set of features. Now, back to our image. First, let's refer to the transfer function as a weighted sum for easier understanding. And let's add Y hat as the output. This is commonly referred to as the predicted value. In learning, the predicted value is compared to the actual value. For your dataset, this would be the predicted price of the house compared to the actual price of the house in the dataset. And we've added the summation and activation function formulas to remind you of the mathematical calculations. To simplify the formula, let's use notation for the summation and activation function. So, how does a neural network learn? One method is by using the backpropagation algorithm. This image shows the regression problem as a supervised learning use case. The actual values of the size and price of the house are shown in the table and plotted on a graph. The line shown on the graph is the line of best fit and represents the predicted values. First, your housing dataset training samples are passed through the network; the number of training examples in one forward pass is called the batch size. This image shows the output, or Y hat, obtained from the network's forward pass compared to the actual output. The goal is to minimize the error, or the difference between the actual and the predicted value. This image shows the error calculations for each individual data point. 
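The forward pass described above, a weighted sum passed through an activation to produce Y hat, can be sketched in Python. This is a minimal illustration, not the course's actual network; the feature values, weights, and bias below are made-up numbers, and a linear (identity) activation is assumed since we are predicting a price.

```python
import numpy as np

def forward(x, w, b):
    """One node's forward pass: weighted sum (transfer function) then activation."""
    z = np.dot(w, x) + b   # weighted sum of inputs plus bias
    y_hat = z              # identity activation, suitable for regression output
    return y_hat

x = np.array([1500.0, 3.0])   # hypothetical features: size in sq ft, bedrooms
w = np.array([0.2, 10.0])     # hypothetical weights
b = 5.0                       # hypothetical bias

y_hat = forward(x, w, b)      # predicted price (Y hat)
y = 340.0                     # hypothetical actual price from the dataset
error = y - y_hat             # the difference learning will try to shrink
```

In a real training loop, a whole batch of rows would be passed through at once, which is exactly the batch size mentioned above.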
The error is used to adjust or change the weights of the nodes so that the error, meaning the distance between the true or actual value and the predicted value, decreases gradually. Why is this important? We need a way to evaluate our predictive model. In our regression housing example, we use root mean squared error as our cost or loss function, shown here. So what we mean by learning is finding which weights and biases minimize a certain cost function, in our case, root mean squared error. The process for calculating root mean squared error, our loss function evaluation metric, is to, one, get the errors for the training examples. Two, compute the squares of the error values. Three, compute the mean of the squared error values. And four, take the square root of the mean. The notation here is to use a little hat symbol on top of the Y to represent your model's prediction and a plain Y to represent the label. So now, back to backpropagation, no pun intended. Think of our goal as decreasing the distance between our true values and the predicted line. The backpropagation step is also known as updating the weights, which is a bit easier to visualize. So how is this weight adjustment done? Well, the backpropagation algorithm iteratively passes batches of data through the network, updating the weights so that the error measured in the forward pass decreases. The signal flow moves from the input layer, through the hidden layers, to the output layer, and the decision of the output layer is measured against the ground truth, or actual label. In the backward pass, using backpropagation and calculus, the error is propagated backward through the network, and the various weights and biases are adjusted. The network keeps playing this back and forth like a game of tennis until the error can go no lower. This state is known as convergence.
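The four RMSE steps above, followed by a single gradient-descent weight update of the kind backpropagation performs, can be sketched as follows. All numbers are hypothetical, and the single linear node with mean squared error is a deliberate simplification of the multi-layer network in the video:

```python
import numpy as np

# --- RMSE, following the four steps from the video ---
y_hat = np.array([335.0, 210.0, 400.0])  # hypothetical predictions
y     = np.array([340.0, 200.0, 390.0])  # hypothetical actual prices

errors       = y - y_hat          # 1) get the errors
squared      = errors ** 2        # 2) square the error values
mean_squared = squared.mean()     # 3) mean of the squared errors
rmse         = np.sqrt(mean_squared)  # 4) square root of the mean

# --- One backward-pass weight update for a single linear node ---
w, b, lr = 0.5, 0.0, 0.01                # hypothetical weight, bias, learning rate
x_batch = np.array([1.0, 2.0, 3.0])      # one batch of inputs
y_batch = np.array([2.0, 4.0, 6.0])      # corresponding labels

y_pred = w * x_batch + b                 # forward pass over the batch
grad_w = -2 * np.mean((y_batch - y_pred) * x_batch)  # dMSE/dw via calculus
grad_b = -2 * np.mean(y_batch - y_pred)              # dMSE/db
w -= lr * grad_w                         # adjust weight against the gradient
b -= lr * grad_b                         # adjust bias against the gradient
```

Repeating the forward pass and this update batch after batch, the "game of tennis" in the narration, is what drives the error down toward convergence.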