What You Need to Know Before You Start
Starts 3 July 2025 10:40
Ends 3 July 2025
Sesame AI and RVQs - The Network Architecture Behind Viral Speech Models
Neural Breakdown with AVB
2765 Courses
19 minutes
Optional upgrade available
Not Specified
Progress at your own speed
Free Video
Overview
Explore the inner workings of the Sesame Conversational Speech Model. Learn how the Mimi Encoder uses split RVQ (Residual Vector Quantization) tokenization to encode speech into semantic and acoustic codes.
See how the Autoregressive Transformer Backbone enables natural-sounding speech interactions. The session is hosted on YouTube and is aimed at enthusiasts in Artificial Intelligence and Computer Science.
Syllabus
- Introduction to Conversational Speech Models
- Sesame Conversational Speech Model Architecture
- Mimi Encoder and Tokenization
- Split Residual Vector Quantization (RVQ)
- Semantic and Acoustic Codes
- Autoregressive Transformer Backbone
- Applications of Sesame AI
- Practical Implementation and Case Studies
Subjects
Computer Science