From the course: AI Text Summarization with Hugging Face
Using Colab to work with Hugging Face Transformers
In this demo, we'll see how we can write Python code in a Colab notebook to access and use a pre-trained transformer model from the Hugging Face Transformers library. We'll pick a pre-trained model that works well for text summarization, fine-tune that model on a dataset that we load, the CNN/Daily Mail dataset, and then generate summaries and compute ROUGE scores for the model on our data.

The pre-trained NLP model that we'll use from the Hugging Face library is the T5 Small model. Let's look at the model card and see what this model is about. You can see here that it is a text-to-text transformer model that has been trained on a variety of NLP tasks. Out of the box, you prepend a different prefix to the input based on the task you're planning to use it for, for example, translate, summarize, and so on.

We'll write Python code to access and work with this pre-trained model using Colab notebooks. Colab is a free Jupyter Notebook environment that runs entirely in the cloud. You don't have to do any setup; you just run the notebook within your browser. Colab was developed by Google to provide free access to GPUs and TPUs to anyone who needs them to build machine learning models, and we'll actually use Colab GPUs to generate summaries and to fine-tune our summarization model.

Let's sign in to Colab. All this needs is a Gmail account, so as long as you have a Gmail account, you can simply use that to sign in. I have the notebook where I perform abstractive text summarization using the T5 model on my local machine, so I'm simply going to upload that notebook to Colab, execute it, and we'll discuss the code that it contains. Select the notebook abstractive text summarization using T5; the notebook is now uploaded to Colab and we can run it from here.

First, let's take a look at the runtime environment. Go to Runtime, Change runtime type, and notice that this uses the T4 GPU. So I'll be using a GPU to perform summarization and to fine-tune this model.

Next, we pip install the libraries that we need to work with Hugging Face Transformers. The transformers library gives us access to the pre-trained model, the datasets library gives us access to Hugging Face datasets, the evaluate library allows us to compute evaluation metrics on our model, the rouge_score library allows us to compute ROUGE scores for our summaries, and the accelerate library allows us to perform distributed training on GPUs. We have just the one GPU, but we do need the accelerate library as well. Go ahead and pip install all of these packages, and once that's done, make sure you restart your runtime so that all of the newly installed Python packages are available for you to use.
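To make that installation step concrete, here is a minimal sketch of what the install cell might look like in the Colab notebook. The package names come from the description above; the exact command and pinned versions in the course notebook may differ.

# Notebook cell: install the libraries used in this demo.
# After this finishes, restart the runtime so the new packages are picked up.
!pip install transformers datasets evaluate rouge_score accelerate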
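Here is a minimal sketch of how the T5 Small model can be loaded and prompted with the "summarize:" prefix. The placeholder article text and the generation settings (max_length, num_beams) are illustrative assumptions, not values taken from the course notebook.

# A minimal sketch: load the pre-trained t5-small checkpoint and generate a
# summary by prepending the "summarize:" task prefix to the input text.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

article = "..."  # placeholder: the news article you want to summarize
inputs = tokenizer("summarize: " + article, return_tensors="pt",
                   max_length=512, truncation=True)
# Generation settings here are illustrative, not the course's exact values.
summary_ids = model.generate(**inputs, max_length=60, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))

The same prefix mechanism works for T5's other tasks, for example "translate English to German: ..." for translation.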
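And here is a minimal sketch of loading the CNN/Daily Mail dataset and computing ROUGE scores with the evaluate library. The "3.0.0" dataset config, the small test slice, the article truncation, and the generation settings are illustrative assumptions rather than the course's exact choices.

# A minimal sketch: load a small slice of CNN/Daily Mail, summarize it with
# the t5-small summarization pipeline, and score the results with ROUGE.
from datasets import load_dataset
from transformers import pipeline
import evaluate

# "3.0.0" is the commonly used config name for this dataset on the Hub.
dataset = load_dataset("cnn_dailymail", "3.0.0", split="test[:8]")

summarizer = pipeline("summarization", model="t5-small")

# Articles are truncated here purely for illustration, so they fit the
# model's input length; the course notebook may handle this differently.
predictions = [
    summarizer(article[:2000], max_length=60, min_length=20)[0]["summary_text"]
    for article in dataset["article"]
]
references = dataset["highlights"]  # human-written reference summaries

rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # rouge1, rouge2, rougeL, rougeLsum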