From the course: Applied AI: Getting Started with Hugging Face Transformers
Challenges with building Transformers
- [Instructor] Machine learning technologies for NLP have grown leaps and bounds in the last few years. Transformers are the state of the art, but does that make building and serving transformer models easy? Building transformer models from scratch is not the same as building classical machine learning models. Transformer models pose some unique challenges, and overcoming them is critical for building successful NLP applications. Let's begin with language modeling challenges. Irrespective of the specific application task, all NLP models need to represent human languages in some form. Human languages are complex in terms of how they are spoken and interpreted. While all languages have a syntax or grammar, general usage by humans may not follow it. There are semantic relationships between the words in a language, like synonyms, antonyms, and others. For example, if we have the word king, what is its relationship with other words like queen, boy, and emperor? Unless these relationships are modeled explicitly, it is hard to interpret words correctly. Also, the same word may have different meanings based on the context where it is used. For example, the word file may mean a physical cardboard file or a computer file, and file can also be a verb. Capturing all these relationships in a single model usually results in a huge model with extensive training requirements. The next set of challenges is related to training the model itself. First, the training data sizes are huge, partly because it is text data, as opposed to numeric data, and also because a large corpus is needed to capture all contexts and relationships. Labeling text data is also difficult and resource intensive. There are heavy pre-processing and cleansing requirements to prepare text data for machine learning. The resulting transformer models are usually huge, sometimes a few gigabytes in size. Compute requirements, like CPU, memory, and disk, are significantly high for both training and inference, and transformer models typically need GPUs to train and predict. In general, NLP use cases with transformers are more expensive to develop and maintain than classical machine learning. Because of these challenges, building every transformer model from scratch is not cost-effective. But since all transformer models are trained on general language characteristics, it is possible to develop pre-trained models for them. Using a pre-trained model and then customizing it for the specific task through transfer learning is more effective and less time-consuming, and this approach is becoming increasingly popular. Hugging Face and its Transformers library provide us with these pre-trained models, and we will explore how to use and customize them in the rest of the course.
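As a preview of the approach the rest of the course takes, here is a minimal sketch in Python of loading a pre-trained model through the Hugging Face Transformers pipeline API instead of training from scratch. It assumes the transformers library and a backend such as PyTorch are installed; the sentiment-analysis task and the sample sentence are illustrative choices, not something prescribed in this video.

    # Minimal sketch: use a pre-trained Transformer via the pipeline API
    # rather than building and training a model from scratch.
    from transformers import pipeline

    # Downloads a default pre-trained checkpoint for this task on first use.
    classifier = pipeline("sentiment-analysis")

    # Run inference with no task-specific training of our own.
    result = classifier("Transformers make NLP applications much easier to build.")
    print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]

Because the expensive language modeling has already happened during pre-training, the remaining work for a specific use case is typically light customization, or fine-tuning, of a model like this on your own data.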