What you should know before you begin

Starts 23 July 2026 10:19

Ends 23 July 2026

Prompt Engineering for Vision Models

Master prompt engineering techniques for vision models including SAM, OWL-ViT, and Stable Diffusion through hands-on image generation, segmentation, and object detection tasks.

DeepLearning.AI via Coursera

1 hour 30 minutes

Optional upgrade available

Not Specified

Learn at your own pace

Paid Course

Overview

In this course, you’ll learn to prompt different vision models like Meta’s Segment Anything Model (SAM), a universal image segmentation model, OWL-ViT, a zero-shot object detection model, and Stable Diffusion 2.0, a widely used diffusion model. You’ll also use a fine-tuning technique called DreamBooth to tune a diffusion model to associate a text label with an object of your preference.

In detail, you’ll explore:

1. Image Generation: Prompt with text and by adjusting hyperparameters like strength, guidance scale, and number of inference steps.

2. Image Segmentation: Prompt with positive or negative coordinates, and with bounding box coordinates.

3. Object Detection: Prompt with natural language to produce a bounding box to isolate specific objects within images.

4. In-painting: Combine the above techniques to replace objects within an image with generated content.

5. Personalization with Fine-tuning: Generate custom images based on pictures of people or places that you provide, using a fine-tuning technique called DreamBooth.

6. Iterating and Experiment Tracking: Prompting and hyperparameter tuning are iterative processes, and therefore experiment tracking can help to identify the most effective combinations. This course will use Comet, a library to track experiments and optimize visual prompt engineering workflows.
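To make the iterative part concrete, here is a minimal sketch of how one might enumerate combinations of the image generation hyperparameters named above before running and comparing them. It is not taken from the course material: the value ranges, the `build_runs` helper, and the example prompt are assumptions chosen for illustration, and the step of actually generating images or logging results to a tracker such as Comet is deliberately left out.

```python
from itertools import product

# Hypothetical value ranges, chosen only for illustration.
strengths = [0.5, 0.75]
guidance_scales = [5.0, 7.5, 10.0]
inference_steps = [25, 50]


def build_runs(prompt):
    """Return one config dict per hyperparameter combination."""
    return [
        {
            "prompt": prompt,
            "strength": s,
            "guidance_scale": g,
            "num_inference_steps": n,
        }
        for s, g, n in product(strengths, guidance_scales, inference_steps)
    ]


# Each dict describes one experiment that could be run and logged.
runs = build_runs("a watercolor painting of a lighthouse")
for run in runs[:3]:
    print(run)
```

Keeping each run as a plain dictionary means the same record can be passed to whatever generation and tracking code is in use, so the results of different combinations stay comparable.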

Syllabus

Prompt Engineering for Vision Models

Prompt engineering applies to vision models as well as text models. Depending on the model, a prompt may be text, but it can also be pixel coordinates, bounding boxes, or segmentation masks. This course covers prompting SAM, OWL-ViT, and Stable Diffusion 2.0, fine-tuning with DreamBooth, and experiment tracking with Comet, as detailed in the overview.
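As an illustration of what a non-text prompt can look like, the sketch below represents point and bounding box prompts as plain data. The `SegmentationPrompt` class is hypothetical and is not part of SAM or any other library. It assumes the common convention that a label of 1 marks a positive (foreground) point and 0 a negative (background) point, and that boxes are given as (x_min, y_min, x_max, y_max) in pixel coordinates.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SegmentationPrompt:
    """Hypothetical container for point and box prompts (not a library class)."""

    points: List[Tuple[int, int]] = field(default_factory=list)
    labels: List[int] = field(default_factory=list)  # 1 = positive, 0 = negative
    box: Optional[Tuple[int, int, int, int]] = None  # (x_min, y_min, x_max, y_max)

    def add_point(self, x: int, y: int, positive: bool = True) -> None:
        # Keep points and labels aligned: one label per coordinate pair.
        self.points.append((x, y))
        self.labels.append(1 if positive else 0)

    def set_box(self, x_min: int, y_min: int, x_max: int, y_max: int) -> None:
        # Reject boxes with zero or negative width or height.
        if x_min >= x_max or y_min >= y_max:
            raise ValueError("box must satisfy x_min < x_max and y_min < y_max")
        self.box = (x_min, y_min, x_max, y_max)


prompt = SegmentationPrompt()
prompt.add_point(120, 80)                  # pixel on the object to segment
prompt.add_point(40, 200, positive=False)  # pixel to exclude from the mask
prompt.set_box(10, 20, 300, 240)           # rough region containing the object
print(prompt)
```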

Taught by

Abigail Morgan, Jacques Verre, and Caleb Kaiser

Subjects

Computer Science
