What You Need to Know Before You Start
Starts 8 June 2025 00:55
Ends 8 June 2025
Giving Sight to Speech Models
Massachusetts Institute of Technology
5 Courses
The Massachusetts Institute of Technology (MIT) is a globally recognized research university known for its interdisciplinary curriculum, pioneering research, and groundbreaking discoveries.
24 minutes
Optional upgrade available
Not Specified
Progress at your own speed
Free Video
Optional upgrade available
Overview
Discover how Whisper-Flamingo integrates visual lip features into speech recognition models, improving performance in noisy conditions for both English recognition and multilingual translation.
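The overview does not say how the visual features are combined with the audio model. One common approach, and the one the Flamingo name suggests, is gated cross-attention: the speech model's hidden states attend over the lip features, and the result is added back through a learnable gate. The sketch below illustrates that idea in plain Python under simplifying assumptions (a single head, no learned projections, toy vectors). The function names `cross_attention` and `gated_fusion` are hypothetical and are not taken from the Whisper-Flamingo code.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Single-head scaled dot-product attention.

    Simplification: the same vectors serve as keys and values, and
    there are no learned projection matrices.
    """
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys_values
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[d] for w, v in zip(weights, keys_values)) for d in range(dim)]
        )
    return attended


def gated_fusion(audio_states, visual_feats, gate):
    """Add visually attended context to audio states through a tanh gate.

    With gate = 0.0 the output equals the audio states, so the fused
    model starts out behaving like the audio-only model.
    """
    attended = cross_attention(audio_states, visual_feats)
    g = math.tanh(gate)
    return [
        [a + g * c for a, c in zip(a_row, c_row)]
        for a_row, c_row in zip(audio_states, attended)
    ]


# Toy example: two audio hidden states, three lip-feature frames.
audio = [[1.0, 0.0], [0.0, 1.0]]
visual = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]
unchanged = gated_fusion(audio, visual, gate=0.0)  # identical to audio
fused = gated_fusion(audio, visual, gate=1.0)      # audio plus visual context
```

Starting the gate at zero is a design choice that lets training begin from the pretrained audio-only behaviour and bring in visual information gradually as the gate moves away from zero.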
Syllabus
- **Introduction to Whisper-Flamingo**
- **Fundamentals of Speech Recognition**
- **Introduction to Visual Lip Features**
- **Integration of Visual and Audio Data**
- **Improving Performance in Noisy Conditions**
- **English Language Speech Recognition**
- **Multilingual Translation with Whisper-Flamingo**
- **Model Evaluation and Performance Metrics**
- **Advanced Topics and Future Directions**
- **Project and Practical Implementation**
- **Course Wrap-Up and Next Steps**
Subjects
Computer Science