What you need to know before you start

Starts 6 July 2026 00:11

Ends 6 July 2026


AI Orchestration: From local models to cloud

Master AI orchestration across local and cloud environments: build prompt engineering pipelines, deploy models with Ollama and llamafile, optimize GPU inference in Rust, and design cost-effective workflows using AWS Spot instances.

Pragmatic AI Labs via Coursera

5 hours

Optional upgrade available

Beginner

Self-paced

Paid Course


Overview

Learn to orchestrate AI systems across local and cloud environments through hands-on infrastructure setup, model deployment, and workflow integration. You will build a prompt engineering pyramid from basic prompts to chain-of-thought reasoning implemented in Rust, then evaluate six decision factors for choosing between local and cloud models including latency, throughput, cost, and privacy.

The course covers local AI infrastructure in depth: running Ollama with custom Modelfiles for task-specific assistants, deploying llamafile for zero-dependency portable inference, compiling Rust Candle with CUDA for GPU-accelerated local inference, and optimizing local RAG with caching strategies. You will configure a complete AI workstation with tmux for session management, nvidia-smi and Zenith for GPU monitoring, and NVIDIA GPU optimization.

The final module covers cloud workflows including AWS Spot instances for cost-effective GPU compute, Hugging Face model discovery and download, and GitHub AI models integration. By completing this course, you will be able to set up local AI infrastructure, deploy models across local and cloud environments, and design orchestration workflows that balance cost, privacy, and performance.
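As a rough illustration of the local-versus-cloud decision described in the overview, the sketch below encodes the four factors the overview names (latency, throughput, cost, privacy) as simple rules. The `Request` struct, the `choose_target` function, and every threshold are hypothetical placeholders and are not taken from the course material; real values depend on the hardware and provider pricing involved.

```rust
/// Where a request could run.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Target {
    Local,
    Cloud,
}

/// Hypothetical request profile built from the four factors the overview names.
struct Request {
    /// Privacy: the data must not leave the machine.
    private_data: bool,
    /// Latency: maximum acceptable response time in milliseconds.
    max_latency_ms: u32,
    /// Throughput: expected requests per second.
    requests_per_sec: u32,
    /// Cost: budget per 1,000 requests, in US cents.
    budget_cents_per_1k: u32,
}

// Illustrative thresholds only (assumptions, not measured figures).
const LOCAL_MAX_RPS: u32 = 5;
const CLOUD_ROUND_TRIP_MS: u32 = 300;
const CLOUD_COST_CENTS_PER_1K: u32 = 50;

fn choose_target(req: &Request) -> Target {
    // Privacy is treated as a hard constraint: private data stays local.
    if req.private_data {
        return Target::Local;
    }
    // A latency budget tighter than the assumed cloud round trip forces local.
    if req.max_latency_ms < CLOUD_ROUND_TRIP_MS {
        return Target::Local;
    }
    // A budget below the assumed cloud price forces local.
    if req.budget_cents_per_1k < CLOUD_COST_CENTS_PER_1K {
        return Target::Local;
    }
    // Otherwise, load beyond the assumed local capacity goes to the cloud.
    if req.requests_per_sec > LOCAL_MAX_RPS {
        Target::Cloud
    } else {
        Target::Local
    }
}

fn main() {
    let req = Request {
        private_data: false,
        max_latency_ms: 2000,
        requests_per_sec: 50,
        budget_cents_per_1k: 100,
    };
    println!("{:?}", choose_target(&req)); // prints "Cloud"
}
```

Ordering the checks from hard constraints (privacy) to soft preferences (throughput) keeps the rules easy to audit; a weighted score would be a reasonable alternative when no single factor is decisive.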

Syllabus

Orchestration Fundamentals

A comprehensive course covering prompt engineering with chain-of-thought reasoning, local inference runtimes (Ollama, llamafile, Candle), GPU workstation configuration, and cost-optimized cloud deployment with AWS Spot instances.

Local AI Infrastructure

Covers local vs cloud model tradeoffs, caching strategies, local RAG optimization, Ollama with custom Modelfiles, llamafile portable deployment, and Candle GPU-accelerated Rust inference.

Workstation and Cloud Workflows

Covers tmux session management, nvidia-smi and Zenith GPU monitoring, local workstation orchestration, AWS Spot instance deployment, Hugging Face and GitHub AI model workflows, and Rust project structure.

Capstone

Head-to-head comparison of Ollama vs `apr` ([paiml/aprender](https://github.com/paiml/aprender)) running Qwen2.5-Coder-1.5B on the same prompt suite, same hardware. Build a chain-of-thought routing engine that selects runtimes based on task complexity and validation requirements, with cost analysis spanning local workstations, Spot instances, and Bedrock.
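The routing idea in the capstone, picking a runtime from task complexity and validation requirements, might be sketched as a single `match`. This is only a minimal sketch: the `Task` type, the `route` function, and above all the mapping from inputs to runtimes are hypothetical and chosen for illustration. The listing does not say which runtime the course assigns to which kind of task.

```rust
/// Runtimes named in the capstone description.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Runtime {
    Ollama,
    Apr,
    Bedrock,
}

/// Hypothetical coarse complexity levels.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Complexity {
    Simple,
    Moderate,
    Complex,
}

/// Hypothetical task description used as routing input.
struct Task {
    complexity: Complexity,
    needs_validation: bool,
}

/// Placeholder routing policy: the assignments below are assumptions,
/// not the course's actual rules.
fn route(task: &Task) -> Runtime {
    match (task.complexity, task.needs_validation) {
        // Assumed: the hardest tasks go to the managed cloud service.
        (Complexity::Complex, _) => Runtime::Bedrock,
        // Assumed: remaining tasks that need validation go to `apr`.
        (_, true) => Runtime::Apr,
        // Assumed: everything else runs on Ollama.
        (_, false) => Runtime::Ollama,
    }
}

fn main() {
    let tasks = [
        Task { complexity: Complexity::Simple, needs_validation: false },
        Task { complexity: Complexity::Moderate, needs_validation: true },
        Task { complexity: Complexity::Complex, needs_validation: false },
    ];
    for task in &tasks {
        println!("{:?} -> {:?}", task.complexity, route(task));
    }
}
```

Matching on a tuple keeps the policy in one place, so swapping in rules derived from an actual benchmark or cost analysis only touches the `match` arms.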

Taught by

Alfredo Deza and Noah Gift

Topics

Artificial Intelligence
