What You Need to Know Before You Start

Starts 7 June 2025 18:21

Ends 7 June 2025

Unlocking New Pose in HPC - Containerization, Cloud, and GPU-based Workloads

Explore containerization, cloud, and GPU-based workloads in HPC. Learn about Kubernetes for resource management, GPU virtualization, custom scheduling, and monitoring for efficient AI development and deployment.
CNCF [Cloud Native Computing Foundation] via YouTube

41 minutes

Optional upgrade available

Not Specified

Progress at your own speed

Conference Talk

Syllabus

  • Introduction to HPC and AI Workloads
      • Overview of HPC and its relevance to AI
      • Introduction to GPU-based workloads
      • The role of cloud computing in HPC
  • Fundamentals of Containerization
      • Basics of containers and their benefits in HPC
      • Container orchestration tools overview
  • Deep Dive into Kubernetes
      • Kubernetes architecture and components
      • Setting up a Kubernetes cluster for HPC
      • Deploying and managing applications with Kubernetes
      • Resource management techniques in Kubernetes
  • GPU Virtualization and Utilization
      • Understanding GPU architecture and capabilities
      • Tools and techniques for GPU virtualization
      • Best practices for maximizing GPU utilization in HPC
  • Custom Scheduling in HPC
      • Introduction to custom schedulers in Kubernetes
      • Designing custom scheduling algorithms for optimized performance
      • Hands-on exercises with Kubernetes scheduling
  • Monitoring and Performance Tuning
      • Monitoring tools and techniques for Kubernetes
      • Analyzing performance bottlenecks in GPU-based workloads
      • Techniques for effective performance tuning
  • Cloud Integration with HPC
      • Exploring cloud service providers and their offerings for HPC
      • Design patterns for hybrid cloud HPC environments
      • Case studies of successful cloud-enabled HPC deployments
  • Case Studies and Real-World Applications
      • Analyzing real-world AI applications using HPC
      • Lessons learned from deploying AI workloads in HPC clusters
  • Future Trends and Innovations in HPC
      • Emerging technologies and their potential impact
      • Research directions in AI and HPC integration
  • Course Conclusion
      • Recap of key concepts learned
      • Final project: designing a scalable AI workload deployment strategy using Kubernetes and cloud resources

Subjects

Conference Talks