From the course: AI Text Summarization with Hugging Face
Exploring Hugging Face - Hugging Face Tutorial
In this course, we'll be performing text summarization using Hugging Face. Before we do anything else, let's get introduced to the Hugging Face community and platform and see what it has to offer. Head over to huggingface.co. Hugging Face is both a company and an open-source community focused on natural language processing and artificial intelligence. The platform offers a range of tools and services related to NLP and machine learning.
Now, you might have a particular kind of task that you wish to solve using artificial intelligence. Head over to the Tasks option on Hugging Face, and you'll find a collection of predefined tasks that users can use to fine-tune and evaluate pre-trained models. You can see that there are models for all kinds of tasks. There are models for computer vision tasks, including image classification, image segmentation, object detection, and many others. If you scroll down, you'll find the models available for natural language processing tasks: question answering, sentence similarity, text classification, text generation, and summarization. Summarization is the task we're interested in right now. The Hugging Face community also offers models for audio-related tasks, tabular classification and tabular regression tasks, multimodal tasks, and even reinforcement learning. Whatever kind of pre-trained model you're looking for in the field of artificial intelligence, you should be able to find it on Hugging Face.
Today we're focused on text summarization, so let's scroll back up and click through to "Summarization" under natural language processing. This gives you an overview of what exactly the task is about, along with all of the resources that Hugging Face has to offer for text summarization.
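The task browsing described above can also be done from code. What follows is a minimal sketch, not an official client: it assumes the Hub exposes a public REST endpoint at `https://huggingface.co/api/models` that accepts `filter`, `sort`, `direction`, and `limit` query parameters, and the helper names `build_task_query` and `list_models_for_task` are hypothetical, introduced only for illustration.

```python
import json
import urllib.parse
import urllib.request
from typing import List

# Assumption: public Hub endpoint for listing models.
HUB_MODELS_API = "https://huggingface.co/api/models"


def build_task_query(task: str, limit: int = 5) -> str:
    """Build a URL asking the Hub for the most-downloaded models tagged with a task."""
    params = {
        "filter": task,         # task tag, e.g. "summarization"
        "sort": "downloads",    # order results by download count
        "direction": "-1",      # descending order
        "limit": str(limit),    # cap the number of results
    }
    return f"{HUB_MODELS_API}?{urllib.parse.urlencode(params)}"


def list_models_for_task(task: str, limit: int = 5) -> List[str]:
    """Fetch model ids for a task. Requires network access."""
    with urllib.request.urlopen(build_task_query(task, limit), timeout=30) as resp:
        models = json.load(resp)
    # Each entry is expected to carry its repository id under "id" or "modelId".
    return [m.get("id") or m.get("modelId") for m in models]
```

Calling `list_models_for_task("summarization")` should return a short list of model ids similar to what the Summarization task page shows, assuming the endpoint behaves as described.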
These resources could be models, datasets, or apps that the Hugging Face community has built and made available here. If you head over to Models, you'll find all of the pre-trained models available on Hugging Face, and every model is categorized by the kind of task it performs. If you click through to the summarization category, you can drill down into all of the models available for summarization. Most of these are transformer models, but there are other, simpler models performing extractive summarization as well.
Let's click through to the very first one. This is the bart-large-cnn model from Facebook. It's a pre-trained sequence-to-sequence transformer model based on the paper linked in the middle of the screen, "BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension." This is just one model available for text summarization. As you can see, Hugging Face provides many more.
Now, if you're interested in datasets for a particular kind of task, head over to the Datasets tab. Since we're focused on summarization, we can click on the summarization task off to the left of the screen, and this lists all of the summarization datasets available from the Hugging Face community. A common dataset used for fine-tuning summarization models is the cnn_dailymail dataset, and if you click through, you'll get an overview of what this dataset is all about. Observe a sample here. Each record comprises the original article and the corresponding summary, and this can be used to train or fine-tune summarization models.
Hugging Face also offers something known as Spaces, which is essentially where members of the Hugging Face community publish applications that they've built around ML models.
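To make the model and dataset pages above more concrete, here is a minimal sketch of how the two might be combined using the `transformers` and `datasets` libraries. Several things are assumptions rather than anything stated in the video: the Hub ids `facebook/bart-large-cnn` and `cnn_dailymail`, the dataset configuration `3.0.0`, the field names `article` and `highlights`, and the generation settings. The helpers `preview`, `first_article_and_summary`, and `summarize` are hypothetical names for illustration. Nothing is downloaded until you call the functions.

```python
MODEL_ID = "facebook/bart-large-cnn"   # assumed Hub id of the model shown on screen
DATASET_ID = "cnn_dailymail"           # assumed Hub id of the dataset shown on screen
DATASET_CONFIG = "3.0.0"               # assumed dataset configuration name


def preview(text: str, max_chars: int = 200) -> str:
    """Collapse whitespace and trim long text so it prints compactly."""
    flat = " ".join(text.split())
    return flat if len(flat) <= max_chars else flat[:max_chars].rstrip() + "..."


def first_article_and_summary(split: str = "test"):
    """Load one record and return (article, reference summary). Downloads data on first use."""
    from datasets import load_dataset  # third-party: pip install datasets

    ds = load_dataset(DATASET_ID, DATASET_CONFIG, split=f"{split}[:1]")
    record = ds[0]
    # Assumed field names: "article" is the source text, "highlights" the summary.
    return record["article"], record["highlights"]


def summarize(text: str, model_name: str = MODEL_ID) -> str:
    """Summarize text with a pre-trained model. Downloads model weights on first use."""
    from transformers import pipeline  # third-party: pip install transformers

    summarizer = pipeline("summarization", model=model_name)
    result = summarizer(
        text,
        max_length=130,    # illustrative generation limits, not tuned values
        min_length=30,
        do_sample=False,   # deterministic decoding
        truncation=True,   # cut inputs that exceed the model's maximum length
    )
    return result[0]["summary_text"]


def main() -> None:
    """Print a reference summary next to a generated one for a single article."""
    article, reference = first_article_and_summary()
    print("Article:  ", preview(article))
    print("Reference:", preview(reference))
    print("Generated:", preview(summarize(article)))
```

Running `main()` downloads both the dataset and the model weights, so it needs network access and some disk space. The third-party imports sit inside the functions so that the pure `preview` helper can be used without either library installed.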
There is a whole host of applications here for different kinds of machine learning tasks, and in fact, we'll be using one of these apps for extractive text summarization. We'll get to that in a bit.
Hugging Face offers a variety of machine learning models and other applications. If you head over to Docs, you'll get an overview of what Hugging Face has to offer, along with documentation on how to use its different components. And the great thing about the Hugging Face community is that most of this is available to you absolutely free. If you look at Pricing, you can see that there are paid plans for Hugging Face, but the HF Hub, which is what we'll be using, is an absolutely free plan. For prototyping and playing around with ML models, the HF Hub is more than sufficient.