What you should know before you begin

Starts 6 June 2026 05:02

Ends 6 June 2026

Multimodal and cross-modal AI integrations

Discover how to build AI applications that seamlessly integrate text, images, and speech using Azure AI Services for sophisticated multimodal solutions.
Microsoft via Coursera

19 hours 55 minutes

Optional upgrade available

Not Specified

Learn at your own pace

Paid Course

Overview

Learn to build AI that sees, hears, and understands the world in an integrated way. This course takes you beyond single-modality models, teaching you to architect applications that connect different data types like text, images, and speech.

Starting with text-to-image generation, you will progress to integrating various AI components and orchestrating the full power of Azure AI Services to build sophisticated, cross-modal solutions. By the end, you'll be equipped to design the next generation of intelligent, multi-faceted AI applications.

Syllabus

  • Multimodal AI component integration
    This module introduces the foundational concepts of multimodal AI. You will learn the architectural patterns for combining different AI components, such as text and image models, and progress from basic integration to building complex systems that can reason across multiple data types.
  • Text-to-image generation
    This module takes a close look at generating images from text descriptions. You will explore the models that power this technology, like DALL·E, and learn both basic and advanced prompting techniques to craft and refine specific, high-quality visual outputs.
  • Cross-modal applications with Azure AI Vision
    This module focuses on practical implementation with Azure AI Vision. You will use its features to build and optimize cross-modal applications such as image captioning and visual search, and learn how this single service can analyze visual content to generate textual descriptions and extract embedded text (OCR), the core components of a multimodal solution.
  • Advanced AI integration with Azure services
    This capstone module builds on your work with Azure AI Vision. You will learn to integrate your vision applications with other Azure AI Services, such as Language and Speech, to create end-to-end solutions. The focus is on orchestrating these distinct services into an application that solves a real-world business problem, demonstrating your ability to design and build a complete multimodal system from the ground up.

Taught by

Microsoft


Subjects

Artificial Intelligence