What You Need to Know Before You Start
Starts: 27 June 2025, 14:23
Ends: 27 June 2025
Scale to 0 LLM Inference: Cost Efficient Open Model Deployment on Serverless GPUs
Devoxx
17 minutes
Optional upgrade available
Progress at your own speed
Free Video
Overview
Discover an approach to deploying LLMs on serverless GPUs that scale down to zero during periods of inactivity, so you pay nothing while the model sits idle. This session walks through running Ollama on this kind of infrastructure, enabling cost-effective deployment of open models.
Gain complete control over both your models and your private data while optimizing performance and expenditure.
Syllabus
- **Introduction to Serverless GPU Computing**
- **Overview of Ollama and LLM Deployment**
- **Setting Up a Serverless Environment**
- **Deploying LLMs on Serverless GPUs**
- **Cost Optimization Strategies** (see the cost sketch after this list)
- **Maintaining Model and Data Privacy**
- **Performance Optimization**
- **Troubleshooting and Support**
- **Capstone Project**
- **Course Conclusion and Future Directions**
Subjects
Computer Science