Using KerasTuner for hyperparameter tuning

From the course: Deep Learning with Python: Optimizing Deep Learning Models

- [Person] In this video, you will learn how to use KerasTuner to search for the optimal hyperparameters of a deep learning model. I'll be writing the code in the "04_05e" file. You can follow along by completing the empty code cells in the "04_05b" file. Note that this video is the second in a two-video sequence on tuning the hyperparameters of a deep learning model. If you have not done so, watch the previous video on how to define a tunable deep learning model for a detailed explanation of the earlier code. Also, make sure to run the preliminary code to get your environment up to speed. I've already done so.

Having defined our hyperparameter-tunable model, we now need to set up a tuner. Here, we choose Hyperband, a resource-efficient approach to hyperparameter tuning that builds on random search and combines it with the principles of early stopping. Its primary goal is to reduce the computational cost of hyperparameter tuning by dynamically allocating more resources to promising hyperparameter configurations and fewer resources to less promising ones.

We start by importing KerasTuner, aliased as "kt," and we call the Hyperband function from KerasTuner. We pass our model-building function as the first argument; this is the tunable model we defined in the previous video, and every time the process cycles through, it calls that function to define a new model. We set "max_epochs" to five, the maximum number of epochs each trial gets to try to improve its performance, and we set a seed so that you and I get the same results when we run this process at different times on different computers. We set "overwrite" to "True," which means the tuner should overwrite the logs of any previous tuning attempt. Our "objective" is to maximize validation accuracy. We specify a directory, "tuning_logs," where the tuner keeps track of the process, and we give the project a name. Let's go ahead and run this to initialize our tuner.

Now we can start the search process using the "tuner.search" method. This command will build and train multiple models using different combinations of hyperparameters. As we specified in our tunable model, we're going to try different layer sizes, different dropout rates, and different learning rates, and we're also going to try different batch sizes, which we specify in this next code chunk. Within "tuner.search," we specify the training data, the training labels, five epochs per trial, and a validation split of 0.1. We also specify the different batch sizes we want to evaluate: "batch_size" is the label for this hyperparameter, and we tell the tuner to try batch sizes from 32 to 128 in steps of 32, so 32, 64, 96, and 128, to see which batch size gives us the best performance. So we go ahead and launch our tuner, and the search process begins.
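For reference, here is a minimal sketch of the tuner definition and search call described above. It assumes the tunable model-building function from the previous video is named model and that the training data live in X_train and y_train; the batch-size search shown here uses KerasTuner's documented HyperModel fit override, which is one way to include batch size in the search and may differ from the exact code in the exercise files.

```python
import keras_tuner as kt

# Sketch only: `model` is assumed to be the tunable model-building function
# from the previous video, and X_train / y_train the prepared training data.
class TunableHyperModel(kt.HyperModel):
    def build(self, hp):
        # Reuse the tunable model definition (layer sizes, dropout, learning rate).
        return model(hp)

    def fit(self, hp, built_model, *args, **kwargs):
        # Register batch size as a searchable hyperparameter: 32, 64, 96, 128.
        return built_model.fit(
            *args,
            batch_size=hp.Int("batch_size", min_value=32, max_value=128, step=32),
            **kwargs,
        )

tuner = kt.Hyperband(
    TunableHyperModel(),
    objective="val_accuracy",         # maximize validation accuracy
    max_epochs=5,                     # upper bound on training epochs per trial
    seed=42,                          # assumed seed, for reproducible results
    overwrite=True,                   # discard logs from any previous tuning run
    directory="tuning_logs",          # where trial results are stored
    project_name="hyperband_tuning",  # assumed project name
)

# Build and train multiple models with different hyperparameter combinations.
tuner.search(X_train, y_train, epochs=5, validation_split=0.1)
```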
We'll let this run, and we can see it going through the hyperparameter tuning process, trying to find the best combination of hyperparameters for our problem. We'll keep observing to see where we are in the process. Okay, the process is complete: we went through 10 different trials, and it took about three minutes and 13 seconds to search the hyperparameter space.

Now that the search is complete, we can output the best configuration of hyperparameters. Running this cell, we get: the optimal number of units is 416 for the first densely connected layer, with a dropout rate of 0.40, and 64 for the second densely connected layer, with a dropout rate of 0.20. The optimal learning rate for the optimizer is 0.001, and the optimal training batch size is 96. So we used the Hyperband approach to identify these optimal hyperparameters.

Next, using these optimal hyperparameters, we create a tuned model, that is, a model defined with those values. We go ahead and create the model, and finally, we train the tuned model for 10 epochs, using the optimal batch size; the model itself, of course, was defined with the optimal values for the other hyperparameters.

Now we can also evaluate how well the tuned model generalizes to new data. Here, we see that the test loss is 0.0649 and the test accuracy is 0.9813. So how is this better than what we had before? Let's scroll up a little to see how the baseline model performed. Up here, we see that the test accuracy of the baseline model was 96%, and our tuned model improved on that slightly, to about 98%. What we did here was a very basic hyperparameter search; we could do something much more extensive, but even this small search improved the performance of our model. If you've been following along with this video and the one before it, you now know how to tune the hyperparameters of a deep learning model in Python using KerasTuner. Good job.
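As a rough sketch of those final steps, the post-search workflow looks something like the following. It assumes the tuner object from the search above and test data in X_test and y_test (names not shown in the video); batch_size is only retrievable from the best hyperparameters if it was registered during the search, as in the earlier sketch.

```python
# Retrieve the best hyperparameter configuration found during the search.
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
print(best_hps.values)  # dict mapping hyperparameter names to their best values

# Build a fresh model using those optimal values
# (layer sizes, dropout rates, learning rate).
tuned_model = tuner.hypermodel.build(best_hps)

# Train the tuned model with the optimal batch size.
history = tuned_model.fit(
    X_train,
    y_train,
    epochs=10,
    validation_split=0.1,
    batch_size=best_hps.get("batch_size"),
)

# Evaluate how well the tuned model generalizes to unseen data.
test_loss, test_accuracy = tuned_model.evaluate(X_test, y_test)
print(f"Test loss: {test_loss:.4f}, test accuracy: {test_accuracy:.4f}")
```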
