From the course: Deep Learning with Python: Optimizing Deep Learning Models
Defining a tunable deep learning model in Keras - Python Tutorial
- [Instructor] In this video, you'll learn how to define a tunable deep learning model in preparation for hyperparameter tuning. I will run the code in the 04_04e file. You can follow along by completing the empty code cells in the 04_04b file. Note that this video is the first in a two-video sequence that teaches you how to tune the hyperparameters of a deep learning model. Make sure to run the previously written code to import and preprocess the data, as well as to build and train the baseline model. I've already done so, so the previously written code has been run above to create a baseline model. Now we're going to define a tunable model architecture. Before we search for the optimal hyperparameters for a model, we need to define a function that specifies the architectural blueprint of the model. The blueprint will incorporate hyperparameters for the number of units per layer, the dropout rates, and the optimizer's learning rate. Keras Tuner will later invoke this function multiple times with different hyperparameter values in order to find the combination that maximizes validation accuracy. Before we continue, we first need to import a couple of things. First is the dropout layer, so we import Dropout from tensorflow.keras.layers. Next, we import the Adam optimizer from tensorflow.keras.optimizers. Now we define our tunable function, the function that actually defines our model. We'll call this function build_model, and we'll give it an argument, hp, that represents the hyperparameters we're trying to tune. We begin by initializing our model with keras.Sequential, and then we specify the input layer. The shape is 784, which we saw earlier; if you read the code above, you'll get a better sense of what's going on here. The next thing we do is add the first dense layer, the first hidden layer. For this dense layer, we are going to try values for the number of neurons between 32 and 512.
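The code described so far might look something like this. This is a minimal sketch rather than the course's exact notebook code; the relu activation is an assumption, since the transcript doesn't name the activation function for the hidden layer.

```python
from tensorflow import keras
from tensorflow.keras.layers import Dense, Dropout, Input
from tensorflow.keras.optimizers import Adam

def build_model(hp):
    # `hp` is supplied by Keras Tuner on each trial.
    model = keras.Sequential()
    model.add(Input(shape=(784,)))  # 784 flattened input features

    # First hidden layer: try 32 to 512 neurons.
    model.add(Dense(
        units=hp.Int("hidden1", min_value=32, max_value=512, step=32),
        activation="relu"))  # activation is an assumption

    # The remaining layers are added in the following steps.
    return model
```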
So we specify hp.Int, which means the hyperparameter values are going to be integers. We'll call this hyperparameter hidden1; this is just a label to describe the layer. We go from 32 to 512 with a step of 32, which means we'll try 32 neurons, then 32 more, which is 64 neurons, and so forth, all the way to 512 neurons. The next layer is a dropout layer. We're going to try dropout rates between 0.1 and 0.5 with a step of 0.1, so dropout rates of 0.1, 0.2, 0.3, 0.4, and 0.5. The idea here is to figure out exactly which of those dropout rates is best for this layer in this model. We specify hp.Float, indicating that the hyperparameter values are going to be floats, or decimal numbers. The second hidden layer we call hidden2. Here we also try multiple values for the number of neurons: integers between 16 and 128 with a step of 16. That means the first value will be 16, then we try 32, and we go forward from there, all the way to 128. We add another dropout layer, and this time around we also evaluate dropout rates between 0.1 and 0.5 with a step of 0.1. Finally, we specify our output layer. The output layer is not a hyperparameter that needs to be tuned, because its size is fixed by the number of possible outcomes in our model. So this is a typical output layer with units set to 10, which means there are going to be 10 neurons, or nodes, in this output layer. Finally, we specify the potential learning rate values for our optimizer. In the past we've used hp.Float for decimal values and hp.Int for integer values; here we use hp.Choice, which means we have a set of discrete values that we want to evaluate.
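With all of the layers just described in place, the architecture portion of the function might look like this. This is a sketch under a few assumptions: the hidden-layer activations (relu), the softmax output activation, and the dropout labels dropout1 and dropout2 are not named in the transcript.

```python
from tensorflow import keras
from tensorflow.keras.layers import Dense, Dropout, Input

def build_model(hp):
    model = keras.Sequential()
    model.add(Input(shape=(784,)))

    # First hidden layer: 32 to 512 neurons in steps of 32.
    model.add(Dense(
        units=hp.Int("hidden1", min_value=32, max_value=512, step=32),
        activation="relu"))  # activation is an assumption
    # Dropout rate sampled from 0.1, 0.2, ..., 0.5.
    model.add(Dropout(
        rate=hp.Float("dropout1", min_value=0.1, max_value=0.5, step=0.1)))

    # Second hidden layer: 16 to 128 neurons in steps of 16.
    model.add(Dense(
        units=hp.Int("hidden2", min_value=16, max_value=128, step=16),
        activation="relu"))
    model.add(Dropout(
        rate=hp.Float("dropout2", min_value=0.1, max_value=0.5, step=0.1)))

    # Output layer is fixed at 10 classes, so there is nothing to tune here.
    model.add(Dense(units=10, activation="softmax"))
    return model
```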
This time around, we evaluate the values 0.0001, 0.001, and 0.01. That's what we specify here as the possible values to evaluate during hyperparameter tuning. And then finally, we compile the model. We specify model.compile with the Adam optimizer as the optimizer we want to use, passing it the learning rate, which comes from the values we specified up here. Every time the function is called during the hyperparameter tuning process, it evaluates one of these learning rates along with the other hyperparameters we're trying out. The loss is categorical cross-entropy, and the metric we want to use to evaluate performance is accuracy. So this is how we define a tunable model for hyperparameter tuning. I'm going to go ahead and run this. All this does at this point is get the function ready, so that when we're actually doing the hyperparameter search, this function is called over and over again, which allows us to search through the space. Next, we'll walk through the process of running a hyperparameter search to identify the optimal set of hyperparameters for our problem. See you on the other side.
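Putting it all together with the compile step, the complete function might look like this. Again, this is a sketch under stated assumptions: the activations and the hp labels other than hidden1 and hidden2 are not named in the transcript.

```python
from tensorflow import keras
from tensorflow.keras.layers import Dense, Dropout, Input
from tensorflow.keras.optimizers import Adam

def build_model(hp):
    model = keras.Sequential()
    model.add(Input(shape=(784,)))
    model.add(Dense(
        units=hp.Int("hidden1", min_value=32, max_value=512, step=32),
        activation="relu"))  # activation is an assumption
    model.add(Dropout(
        rate=hp.Float("dropout1", min_value=0.1, max_value=0.5, step=0.1)))
    model.add(Dense(
        units=hp.Int("hidden2", min_value=16, max_value=128, step=16),
        activation="relu"))
    model.add(Dropout(
        rate=hp.Float("dropout2", min_value=0.1, max_value=0.5, step=0.1)))
    model.add(Dense(units=10, activation="softmax"))

    # hp.Choice picks from a discrete set of candidate learning rates.
    learning_rate = hp.Choice("learning_rate", values=[1e-4, 1e-3, 1e-2])
    model.compile(
        optimizer=Adam(learning_rate=learning_rate),
        loss="categorical_crossentropy",
        metrics=["accuracy"])
    return model
```

If keras_tuner is installed, you can sanity-check the function before running a search by calling `build_model(keras_tuner.HyperParameters())`, which builds one compiled model using each hyperparameter's default value.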