What You Need to Know Before You Start
Starts 13 June 2026 08:46
Ends 13 June 2026
Not Specified
Optional upgrade available
Advanced
Progress at your own speed
Free Online Course
Overview
ABOUT THE COURSE:
This course explores how Generative AI is applied to modern computer vision tasks. Unlike existing NPTEL courses, it specifically emphasizes vision-based generative AI models.
It begins with mathematical foundations and classical vision techniques, followed by deep learning architectures. The course then introduces generative learning paradigms, including GANs, VAEs, diffusion models, and transformers, and discusses evaluation metrics and training challenges such as mode collapse and diffusion noise scheduling.
It also covers large language models used in vision applications, such as GPT-4V, LLaMA, PaLM-E, and Flamingo. The course focuses primarily on deep generative learning for computer vision tasks such as image captioning, visual question answering (VQA), and scene understanding.
It further discusses multimodal generative models and agentic AI systems for automatic image synthesis and reasoning.
INTENDED AUDIENCE:
Final/Pre-final year B.Tech/BE, M.Tech/ME, MS, PhD students, Industry professionals, and Faculty members.
PREREQUISITES:
Basics of Machine Learning and Computer Vision. Neural Networks for Vision and NLP.
INDUSTRY SUPPORT:
Relevant for AI/ML roles in IT companies, startups, research labs, and product-based companies working in generative AI and computer vision domains.
Syllabus
- Introduction
- Mathematical Foundations
- Classical Vision Techniques
- Deep Learning Architectures
- Generative Learning Paradigms
- Multimodal Generative Models
- Transformers and Vision Applications
- Training Challenges and Evaluation Metrics
- Application Domains and Case Studies
- Industry Use Cases and Open Research Areas
- Course Conclusion
- Assignments and Project Work
- Additional Resources
Taught by
Prof. Arijit Sur
Subjects
Artificial Intelligence