From the course: Artificial Intelligence Foundations: Neural Networks
Hyperparameters and neural networks
- [Instructor] In this video, we look at hyperparameters, which provide techniques to let you improve your model's performance and avoid underfitting and overfitting. If you are not using automated machine learning, where the training process is handled automatically by an AutoML framework, then hyperparameters need to be set by you. Lastly, model parameters are not hyperparameters. Let's look at some examples.

This image shows a six-layer deep neural network using the dataset features from our hands-on lab in the previous section, for example, digital, television, newspaper, and radio. Below the neural network, there is an expanded drawing of the summation and activation functions, which were covered in a previous video. Model parameters are something a model learns on its own. They are estimated and learned from the data during training. So when you think of model parameters, think of the weights and biases in a neural network. These model weights cannot be manually set by you. They must be learned through the forward and backward pass workflow you saw in a previous video. But when you think of hyperparameters, think of what you can set. You can determine the number of epochs, the learning rate, and which regularization techniques to use. These are the components shown in the blue bubbles. There are other hyperparameters we'll cover later in the video.

So let's look at the hands-on neural network from the Keras lab you previously built. Model parameters are learned during training when we optimize a loss function. Model hyperparameters cannot be learned during training but are set beforehand. Model hyperparameters define the model's architecture but are external to the model. Thus, they do not change with your model during training, and there is no learning of them as part of model training. Since they are not part of the trained model, no values are saved. Hyperparameters are used to optimize model performance. Typical hyperparameters for a neural network include the number and size of the hidden layers and nodes, the weight initialization scheme, the learning rate and its decay, the activation functions, dropout, and gradient clipping thresholds, for example. Model parameters, in contrast, are estimated during training with historical data. They are part of the model, and their estimated values are saved with the trained model. They do not define the model's architecture or optimize performance.
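To make the distinction concrete, here is a minimal sketch in Keras, the framework used in the hands-on lab. The layer sizes, learning rate, epoch count, dropout rate, and clipping threshold below are illustrative assumptions rather than the lab's actual settings, and the random arrays stand in for the lab's dataset.

```python
# Minimal sketch (assumes TensorFlow/Keras). All numeric values are
# illustrative hyperparameters chosen before training, not learned values.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hyperparameters: set by you, external to the model, not saved with it.
LEARNING_RATE = 0.01   # assumed value
EPOCHS = 50            # assumed value
DROPOUT_RATE = 0.2     # regularization hyperparameter (assumed value)

# Architecture hyperparameters: number and size of hidden layers,
# activation functions, and the weight initialization scheme.
model = keras.Sequential([
    keras.Input(shape=(4,)),  # 4 features: digital, television, newspaper, radio
    layers.Dense(8, activation="relu", kernel_initializer="he_normal"),
    layers.Dropout(DROPOUT_RATE),
    layers.Dense(4, activation="relu"),
    layers.Dense(1),          # single regression output
])

# The learning rate and the gradient clipping threshold are also
# hyperparameters, passed to the optimizer before training starts.
optimizer = keras.optimizers.SGD(learning_rate=LEARNING_RATE, clipnorm=1.0)
model.compile(optimizer=optimizer, loss="mse")

# Model parameters (weights and biases) are learned from data during the
# forward and backward pass workflow; you never set them by hand.
X = np.random.rand(100, 4)  # placeholder data standing in for the lab's dataset
y = np.random.rand(100, 1)
model.fit(X, y, epochs=EPOCHS, verbose=0)

# The learned parameters are part of the trained model and are saved with
# it; the hyperparameters above are not stored in the model itself.
weights, biases = model.layers[0].get_weights()
print(weights.shape, biases.shape)  # (4, 8) (8,)
```

Note that everything passed to the layer constructors, compile(), and fit() above is a hyperparameter chosen beforehand; only the arrays returned by get_weights() are model parameters estimated during training.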