to build and deploy customized LLMs. Learn to fine-tune open-source LLMs on proprietary data and deploy your customized LLM models using AWS SageMaker and Streamlit.
Fine-tune open-source LLMs for custom business purposes
Deploy and scale models for enterprise purposes using AWS SageMaker and Streamlit
Understand and implement QLoRA from theory to code
Learn to preprocess proprietary datasets with chunking, tokenization, and attention masking
Monitor training and performance to ensure optimal business results
Manage cloud resources and optimize for cost
Apply advanced AI engineering techniques including quantization and more
- Introduction
Course Introduction (What We're Building)
Exercise: Meet Your Classmates and Instructor
Course Resources
ZTM Plugin + Understanding Your Video Player
Set Your Learning Streak Goal
- Setting up our AWS Account
Signing in to AWS
Creating an IAM User
Using our new IAM User
What To Do In Case You Get Hacked!
- Setting Up AWS SageMaker Environment
Creating a SageMaker Domain
Logging in to our SageMaker Environment
Introduction to JupyterLab
Let's Have Some Fun (+ More Resources)
- Gathering, Chunking, Tokenizing and Uploading our Dataset
SageMaker Sessions, Regions, and IAM Roles
Examining Our Dataset from HuggingFace
Tokenization and Word Embeddings
HuggingFace Authentication with SageMaker
Applying the Templating Function to our Dataset
Attention Masks and Padding
Star Unpacking with Python
Chain Iterator, List Constructor and Attention Mask example with Python
Understanding Batching
Slicing and Chunking our Dataset
Creating our Custom Chunking Function
Tokenizing our Dataset
Running our Chunking Function
Understanding the Entire Chunking Process
Uploading the Training Data to AWS S3
Course Check-In
- Understanding LoRA and Setting up HuggingFace Estimator
Setting Up Hyperparameters for the Training Job
Creating our HuggingFace Estimator in SageMaker
Introduction to Low-Rank Adaptation (LoRA)
LoRA Numerical Example
LoRA Summarization and Cost Saving Calculation
(Optional) Matrix Multiplication Refresher
Understanding LoRA Programmatically Part 1
Understanding LoRA Programmatically Part 2
Unlimited Updates
- Improving Training Speed with Bfloat16
Bfloat16 vs Float32
Comparing Bfloat16 vs Float32 Programmatically
Implement a New Life System
- Setting up the QLoRA Training Script with Mixed Precision & Double Quantization
Setting up Imports and Libraries for the Train Script
Argument Parsing Function Part 1
Argument Parsing Function Part 2
Understanding Trainable Parameters Caveats
Introduction to Quantization
Identifying Trainable Layers for LoRA
Setting up Parameter Efficient Fine Tuning
Implement LoRA Configuration and Mixed Precision Training
Understanding Double Quantization
Creating the Training Function Part 1
Creating the Training Function Part 2
Exercise: Imposter Syndrome
Finishing our SageMaker Script
Gaining Access to Powerful GPUs with AWS Quotas
Final Fixes Before Training
- Running our Fine Tuning Script for our LLM
Starting our Training Job
Inspecting the Results of our Training Job and Monitoring with CloudWatch
- Deploying our Fine Tuned LLM
Deploying our LLM to a SageMaker Endpoint
Testing our LLM in SageMaker Locally
Creating the Lambda Function to Invoke our Endpoint
Creating API Gateway to Deploy the Model Through the Internet
Implementing our Streamlit App
Streamlit App Correction
- Cleaning up Resources
Congratulations and Cleaning up AWS Resources
- Where To Go From Here?
Thank You!
Review This Course!
Become An Alumni
Learning Guideline
ZTM Events Every Month
LinkedIn Endorsements