What you need to know before you start

Starts 14 July 2026 12:09

Ends 14 July 2026

The Dark Side of AI: Jailbreaking, Injections, Hallucinations & more

Explore AI vulnerabilities through hands-on jailbreaking, prompt injections, and bias testing with real models like ChatGPT to understand security risks and prevention methods.

via Zero To Mastery

3 hours

Optional upgrade available

Intermediate

Self-paced

Paid Course

Overview

Step over to the dark side and learn about the vulnerabilities, exploits, and unintended consequences that AI models like LLMs suffer from, with hands-on prompting and exercises.

What jailbreaking models involves and how to do it yourself

Understanding vulnerabilities inherent to models, including prompt and data leakage

The risks of exposing LLMs to proprietary or sensitive data

Exploring the toxicity and bias inherently built into different models

Real-world tests using ChatGPT, DeepSeek and other models

Experiment with steering an LLM's neurons to prevent hallucinations

Syllabus

Introduction

Welcome to The Dark Side (Intro to Guardrails and Jailbreaking)

Exercise: Meet Your Classmates and Instructor

Course Resources

The Dark Side of AI

Jailbreak! (The DAN Prompt)

Exercise: Create Your Own Jailbreak

Many Shot Jailbreaking

Prompt Injections - Part 1

Prompt Injections - Part 2

Thinking Like LLMs - Multi-Modal Injection

Leaking - Part 1 (Prompt Leaking)

Leaking - Part 2 (Data Leaking)

Exposure

Poisoning

Toxicity

Hallucinations

Thinking Like LLMs - Big vs Small

Challenge: Conduct Your Own Mechanistic Interpretability Research on Hallucinations

Challenge Instructions

Leaderboard: Mechanistic Interpretability

The Model Card

Model Cards Deep Dive

Exercise: Explore the Model Card for GPT-o3-mini and Learn Something New!

Where To Go From Here?

Let's Keep Learning Together!

Review This Byte!

Taught by

Scott Kerr

Topics

Computer Science
