What you need to know before you start

Starts 5 June 2026 14:59

Ends 5 June 2026


AI and ML: The Critical Operational Side of Running Applications in Kubernetes

CNCF [Cloud Native Computing Foundation] via YouTube

CNCF [Cloud Native Computing Foundation]

6076 courses


28 minutes

Optional upgrade available

Not Specified

Progress at your own pace

Free Video

Optional upgrade available

Overview

Discover how to effectively manage AI and ML operations using service mesh, focusing on GPU workloads, multitenancy, and scaling in Kubernetes environments for reliable and observable ML applications.

Syllabus

  • Introduction to Kubernetes for AI/ML
    • Overview of Kubernetes architecture
    • Key concepts: pods, nodes, and clusters
    • Kubernetes networking basics
  • Understanding AI and ML Workloads in Kubernetes
    • Characteristics of AI/ML workloads
    • Common challenges in deploying AI/ML on Kubernetes
    • Introduction to GPU utilization in Kubernetes
  • Service Mesh Fundamentals
    • Definition and benefits of a service mesh
    • Overview of popular service mesh technologies (Istio, Linkerd, etc.)
    • Implementing a service mesh in Kubernetes for AI/ML applications
  • Managing GPU Workloads with Kubernetes
    • Configuring Kubernetes for GPU scheduling
    • Best practices for GPU resource management
    • Tools and frameworks for optimizing GPU workload performance
  • Multitenancy in Kubernetes
    • Approaches to achieving multitenancy
    • Managing namespaces and resource quotas
    • Security considerations in multitenant environments
  • Scaling AI/ML Applications in Kubernetes
    • Horizontal and vertical pod autoscaling
    • Load balancing and resilience strategies
    • Handling stateful vs stateless workloads
  • Observability and Monitoring in AI/ML Operations
    • Setting up monitoring for Kubernetes-based applications
    • Using tools like Prometheus and Grafana
    • Implementing logging and tracing
  • Case Studies and Best Practices
    • Real-world examples of AI/ML deployments in Kubernetes
    • Lessons learned and best practices from industry
  • Summary and Q&A
    • Key takeaways from the course
    • Open floor for questions and clarifications

Topics

Programming