Introduction to Large Language Models
Google Cloud - Introduction to Generative AI path
This is a collection of notes from the Introduction to Large Language Models course on Google Cloud, taught by John Ewald. Images are taken from the course itself.
It is a detailed compilation; annotated excerpts will be available on my LinkedIn profile.
Course Overview
The course is divided into 4 parts:
Define Large Language Models (LLMs)
LLM Use Cases
Prompt Tuning and Tuning LLMs
Generative AI Development Tools
Defining Large Language Models (LLMs)
Large Language Models, as the name suggests, are large general purpose language models that can be pretrained and fine-tuned.
Pretraining is the process of teaching an LLM to perform basic tasks such as Text Classification, Question Answering (QA), Document Summarization, and Text Generation.
These LLMs can then be tailored to solve domain-specific tasks in fields such as Retail, Finance, and Entertainment.
Major Features of LLMs
Large
This refers to the large training dataset required to train the model, as well as to the large number of parameters, which determine the model's problem-solving ability. The parameters encode the memories and knowledge the model learns during training.
General Purpose
A single model is sufficient to solve common problems, because human languages share a great deal of commonality. Since these models require enormous resources to train, only a few organizations have the capacity to create foundation models.
Pretrained and Fine-tuned
The models are pretrained on large datasets and can then be fine-tuned for domain-specific tasks using much smaller datasets.
Types of LLMs
There are three types of LLMs, and each requires a different type of prompting. The first two types are easily confused, so they must be used with care.
Generic (Raw) Language Model
It predicts the next token based on patterns in its training data, much like autocomplete in a search bar.
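The idea can be illustrated with a toy bigram model that always picks the most frequent continuation. This is a deliberately simplified sketch with a made-up corpus; a real LLM predicts over a huge vocabulary using billions of learned parameters, not raw counts:

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; a real LLM is trained on petabytes of text.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count bigram frequencies: which word tends to follow which.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent continuation, like autocomplete."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" most often in this corpus
```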
Instruction Tuned Language Model
It predicts a response to the instruction given in the input. Examples: summarize a text, generate a poem in a given style, list synonyms, classify sentiment.
Dialog Tuned Language Model
It is trained to predict the next response in a dialog. It is a special case of instruction tuning in which requests are typically framed as questions. This further specialization is expected to handle longer context and work better with natural, question-like phrasing.
Chain of Thought Reasoning
Language models produce better answers when they first output the reasoning behind an answer, rather than arriving at it directly. The effect is most prominent in numerical calculations.
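A common way to elicit this behavior is to include a worked example whose answer spells out the reasoning step by step. The prompt strings below are my own illustration, not taken from the course:

```python
# A direct prompt asks only for the answer.
direct_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
    "How many balls does he have now?\nA:"
)

# A chain-of-thought prompt prepends a worked example whose answer
# shows the intermediate arithmetic, nudging the model to reason first.
cot_example = (
    "Q: A cafeteria had 23 apples. It used 20 and bought 6 more. "
    "How many apples are left?\n"
    "A: The cafeteria started with 23 apples. 23 - 20 = 3. "
    "3 + 6 = 9. The answer is 9.\n"
)
cot_prompt = cot_example + direct_prompt
print(cot_prompt)
```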
Benefits
Single model for different tasks
Built using petabytes of data and billions of parameters, a single model can perform tasks such as language translation, sentence completion, QA, and more.
Fine-tuning requires minimal field data
These models achieve decent performance with little domain-specific data and can be used in few-shot or even zero-shot scenarios.
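As a sketch, the difference between zero-shot and few-shot use is simply whether labeled examples are packed into the prompt. The helper function and example texts below are hypothetical, for illustration only:

```python
def build_prompt(instruction, examples, query):
    """Assemble a prompt; zero-shot if examples is empty, few-shot otherwise."""
    shots = "".join(f"Input: {x}\nOutput: {y}\n" for x, y in examples)
    return f"{instruction}\n{shots}Input: {query}\nOutput:"

# Zero-shot: the instruction alone must carry the task.
zero = build_prompt(
    "Classify the sentiment as positive or negative.", [], "I loved it"
)

# Few-shot: a handful of labeled examples guide the model.
few = build_prompt(
    "Classify the sentiment as positive or negative.",
    [("Great movie!", "positive"), ("Terrible plot.", "negative")],
    "I loved it",
)
print(few)
```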
Continuous performance growth
These models can be continuously improved by providing more data and increasing the number of parameters.
Example - PaLM
Pathways Language Model (PaLM) was released by Google in April 2022. It has 540 billion parameters and achieved state-of-the-art performance on a variety of tasks. It is a dense, decoder-only Transformer model.
It leverages Google's new Pathways system, which efficiently trains a single model across multiple TPU v4 Pods. Pathways is a new AI architecture that can handle many tasks at once, learn new tasks quickly, and reflect a better understanding of the world. It enables PaLM to orchestrate distributed computation for accelerators.
LLM Development vs Traditional Development
LLM Use Cases
The course discusses Text Generation - Question Answering as an example application of LLMs.
Question Answering (QA)
It is a subfield of Natural Language Processing that answers questions posed in natural language.
QA systems are trained on large amounts of text and code.
They handle a wide range of question types, including factual, definitional, and opinion-based questions.
Generative QA
It generates free-form text from the given context. It leverages text generation models and does not need domain knowledge.
Bard QA
Bard performs the operation directed by the prompt and also provides a definition. Getting the desired results requires prompt design.
Prompt Tuning
Both prompt design and prompt engineering involve creating a prompt that is clear, concise, and informative, but they differ as follows:
Prompt Design
Involves creating a prompt tailored for a specific task.
Requires instructions and context.
It is essential.
Prompt Engineering
Involves creating a prompt to improve performance.
Requires domain knowledge as well as provided examples.
Involves using effective keywords.
Necessary for systems that require high accuracy or performance.
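A hypothetical contrast between the two, using made-up prompt text (the role line and worked example are my own illustration, not from the course):

```python
task = "Translate the following sentence to French."

# Prompt design: a clear, task-tailored instruction with context.
designed = f"{task}\nSentence: The weather is nice today.\nFrench:"

# Prompt engineering: adds effective keywords (a role) and a worked
# example to push accuracy higher on a system that demands it.
engineered = (
    "You are a professional English-to-French translator.\n"
    "Sentence: Good morning.\nFrench: Bonjour.\n"
    f"{task}\nSentence: The weather is nice today.\nFrench:"
)
print(engineered)
```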
Tuning LLMs
Adapting a Large Language Model to a new domain is called Tuning.
Example: Tuning for legal or medical domains.
Fine-tuning
Fine-tuning an LLM requires bringing your own dataset and retraining every weight in the LLM. This means a big training job and hosting your own fine-tuned model.
This process is expensive and often unrealistic, as not every organization has the resources to perform it. Instead, a Parameter Efficient Tuning Method (PETM) is employed.
Parameter Efficient Tuning Method (PETM)
This involves tuning an LLM without duplicating the model. The base model remains unaltered; instead, small add-on layers are added and tuned, and these can be swapped in and out at inference time.
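One popular add-on-layer scheme is a low-rank adapter (LoRA-style; the course does not name a specific method, so this is one illustrative choice). A minimal NumPy sketch, treating a single linear layer as the "base model":

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hidden size (tiny, for illustration)

# Frozen base weight: never updated during tuning.
W_base = rng.normal(size=(d, d))

# Small add-on layer: only A and B would be trained, and the (A, B)
# pair can be swapped per task at inference time.
r = 2  # adapter rank, far smaller than d
A = np.zeros((d, r))       # zero init, so the adapter starts as a no-op
B = rng.normal(size=(r, d))

def forward(x, adapter=None):
    """Base output plus an optional adapter's low-rank contribution."""
    out = x @ W_base
    if adapter is not None:
        A_, B_ = adapter
        out = out + x @ A_ @ B_
    return out

x = rng.normal(size=(1, d))
# With A = 0 the adapted model matches the base model exactly.
print(np.allclose(forward(x), forward(x, adapter=(A, B))))  # True
```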
Another simple form of PETM is prompt tuning, which can also alter the model's output without retraining it.
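In prompt tuning, the only trainable parameters are a handful of "soft prompt" vectors prepended to the input embeddings while the model itself stays frozen. A tiny sketch with illustrative shapes (not the course's implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8          # embedding size
n_prompt = 4   # number of learnable "soft prompt" vectors

# Frozen token embeddings for a user input of 3 tokens.
input_embeddings = rng.normal(size=(3, d))

# The soft prompt is the only trainable part: a few vectors prepended
# to every input, steering the frozen model toward the target task.
soft_prompt = rng.normal(size=(n_prompt, d)) * 0.01

model_input = np.vstack([soft_prompt, input_embeddings])
print(model_input.shape)  # (7, 8): prompt vectors + input tokens
```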
Generative AI Development Tools
Vertex AI Search and Conversation, Gen AI Studio, and MakerSuite were discussed in the previous article: Introduction to Generative AI (Part 2).
Ending Note
More details about prompts and prompt engineering will be discussed in subsequent posts. Stay tuned for lots of Generative AI content.