What You Need to Know Before You Start
Starts 5 June 2026 05:48
Ends 5 June 2026
Probabilistic Safety Guarantees Using Model Internals
Simons Institute
46 minutes
Optional upgrade available
Progress at your own speed
Free Video
Overview
This session explores probabilistic safety guarantees for language models. Led by Jacob Hilton from the Alignment Research Center, it focuses on the analysis of model internals.
Aimed at enthusiasts and professionals in artificial intelligence and computer science, this YouTube event covers approaches to improving model safety and reliability.
Syllabus
- Introduction to Probabilistic Safety
- Fundamentals of Model Internals
- Analyzing Model Internals
- Probabilistic Methods in AI Safety
- Developing Safety Guarantees
- Case Studies and Practical Examples
- Implementing Safety Frameworks
- Evaluating Safety in Language Models
- Tools and Resources
- Guest Lecture by Jacob Hilton
- Conclusion and Future Directions
- Final Project
Subjects
Computer Science