From the course: AI Text Summarization with Hugging Face

Pushing the model to the Hugging Face Hub - Hugging Face Tutorial

At this point, we've successfully fine-tuned our model on the CNN/DailyMail dataset. We fine-tuned on only a small subset of the original dataset because, with a single GPU on Colab, training on the full dataset would have been very onerous; we simply would not have had sufficient resources. Because we specified push_to_hub = True as a training argument, you can see that a repository with our model and tokenizer parameters has been created here on the Hugging Face Hub. Notice that it is named after the output directory we specified, cnn_news_summary_model_trained_on_reduced_data.

The model card is empty; I'll show you how to populate it in just a bit. Under Files and versions, you can see a bunch of configuration files, as well as the serialized tokenizer and the serialized model. The PyTorch model that we trained is the pytorch_model.bin file, stored in pickled format, and training_args.bin holds the training arguments, again in pickled format. These arguments, config files, and the model were pushed during the training process.

Now, heading back to our Colab notebook, I'm going to call trainer.push_to_hub(). This will push our final trained model to the Hub, along with the evaluation metrics computed on the validation results and even a model card. Back on the Hub, let's refresh the page, and you can see the updated contents that have been pushed here. Observe that a model card with a standard template has been automatically generated for you, and it includes the ROUGE scores computed on your validation data. There are also sections for the model description and other details; the information for those is not generated automatically. At the bottom, though, we have the training results: the performance of the model over three epochs of training. For example, look at the rouge1 score.
After the first epoch, it was 0.2175; then it moved up to 0.2183, and then to 0.219. Three epochs of training is not very much, so we can't have expected this model to improve a whole lot. You can see that the framework versions the model was trained with are also logged here. If you go to the Files and versions tab, you will find the final version of our serialized model now that training is complete. You can see that the model file, pytorch_model.bin, was updated just about a minute ago, when we pushed it to the Hub.
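Once the model is on the Hub, it can be loaded back for inference from anywhere. A minimal sketch, where `<username>` is a placeholder for your own Hub namespace (the repo id below is hypothetical):

```python
from transformers import pipeline

def load_summarizer(model_id: str):
    """Build a summarization pipeline from a Hub repo id.

    This downloads the config, tokenizer, and model weights that
    trainer.push_to_hub() uploaded.
    """
    return pipeline("summarization", model=model_id)

# Usage (requires network access and a real repo id):
# summarizer = load_summarizer("<username>/cnn_news_summary_model_trained_on_reduced_data")
# print(summarizer(article_text, max_length=60, min_length=20)[0]["summary_text"])
```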
