From the course: Prompt Engineering with LangChain
What are language models?
- [Instructor] Our world is increasingly shaped by technology, from asking a digital assistant about the weather to translating documents into your native language. Our interactions with technology are becoming deeply rooted in natural language, and language models are increasingly at the heart of these interactions, the backbone of our daily digital engagements. They help machines understand and generate human language, making our interactions with technology smoother and more intuitive.

A language model is a machine learning model trained to understand, generate, and interact with human language. These models learn the patterns, structures, and nuances of a language. Given a word or a sequence of words, a language model predicts what might come next based on what it has learned, much like how, after hearing the phrase "peanut butter," many of us instinctively complete it with "jelly." In practice, a language model assigns a probability to a word sequence being valid. It's important to note that validity here doesn't strictly refer to the grammatical correctness of the generated text; instead, it's about how closely the sequence mirrors human-like writing patterns. This is achieved by training these models on vast amounts of textual data, enabling them to understand the context, nuances, and intricacies of human language.

There are primarily two types of language models: probabilistic language models and neural network-based language models. Probabilistic language models are typically based on n-gram probabilities: they predict the next word from the preceding n words. An n-gram is a sequence of n items from a text. For instance, "I love dogs" contains two bigrams, "I love" and "love dogs." The biggest limitation of these probabilistic language models lies in their inability to capture deep context. Neural network-based language models excel at predicting the next word in a sequence, using mechanisms like attention for contextual understanding.

When referring to large language models, or LLMs, we're often talking about the transformer architecture, which is a type of deep neural network architecture. Transformers understand the contextual relationships of words in a sequence using a mechanism called self-attention. Transformers can be categorized into the following varieties: encoder-only, like BERT, which stands for Bidirectional Encoder Representations from Transformers; decoder-only, like GPT, the Generative Pre-trained Transformer; and encoder-decoder, like the T5 model.

LLMs have some key features that distinguish them from previous language models. First, these large language models display surprising emergent abilities that were not observed in smaller models. For example, GPT-3 can handle few-shot tasks through in-context learning, something its predecessor GPT-2 struggled with. Second, accessing LLMs is primarily done through a prompting interface, like the ChatGPT UI or the GPT-4 API from OpenAI, which means users need to understand how LLMs function and format their prompts appropriately. Third, developing LLMs blurs the line between pure research and engineering: training these models demands practical experience in large-scale data processing and distributed parallel training. The short sketches below make a few of these ideas concrete.
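First, the n-gram idea as a minimal sketch: the Python below counts adjacent word pairs in a tiny, made-up corpus and predicts the most frequent continuation of a word. The corpus and function names are illustrative, not from the course materials.

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            counts[current_word][next_word] += 1
    return counts

def predict_next(model, word):
    """Return the most frequent continuation of `word`, if any."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# A tiny illustrative corpus; real models train on vast amounts of text.
corpus = [
    "I love dogs",
    "I love peanut butter and jelly",
    "peanut butter and jelly sandwiches",
]
model = train_bigram_model(corpus)
print(predict_next(model, "peanut"))  # -> "butter"
print(predict_next(model, "butter"))  # -> "and"
```

Because this model only ever looks one word back, it cannot capture deep context, which is exactly the limitation of probabilistic language models noted above.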
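Next, a rough sketch of the self-attention mechanism transformers use for contextual understanding: a bare-bones scaled dot-product attention in NumPy, with random matrices standing in for learned projections of a token sequence. This illustrates the idea only; real transformers use learned query/key/value projections and multiple attention heads.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each token's output is a weighted mix of all value vectors,
    with weights derived from query/key similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # pairwise similarity between tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V  # contextualized representation of each token

# Four tokens, each embedded in 8 dimensions (illustrative random values).
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8): one context-aware vector per token
```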
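Finally, a sketch of the prompting interface and few-shot in-context learning, using LangChain since that's this course's focus. It assumes the langchain-openai package is installed and an OPENAI_API_KEY is set in the environment; the model name and the sentiment-classification prompt are illustrative choices, not prescribed by the course.

```python
from langchain_openai import ChatOpenAI

# Assumes `pip install langchain-openai` and OPENAI_API_KEY in the environment.
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # model name is illustrative

# Few-shot, in-context learning: the examples inside the prompt teach the task.
# No fine-tuning or weight updates are involved.
prompt = """Classify the sentiment of each review as positive or negative.

Review: "The battery lasts all day." -> positive
Review: "It broke after a week." -> negative
Review: "The screen is gorgeous." ->"""

response = llm.invoke(prompt)
print(response.content)  # expected: "positive"
```

Notice that the model was never trained on this classification task; the two examples in the prompt are enough for it to infer the pattern, which is the emergent ability GPT-2 struggled with.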
As we become increasingly reliant on AI assistance, a foundational grasp of language models equips us to better appreciate the marvel of this technology. It empowers us to utilize these tools to their fullest potential and to critically evaluate them when necessary. Remember, these models will underpin more and more advanced technologies, from search engines to digital assistants built into tools you already use every day, to even operating systems. LLMs represent a significant leap in AI capabilities, and they'll continue to shape and redefine our digital experiences.