What you should know before you begin

Starts 9 July 2026 04:52

Ends 9 July 2026

Multimodal and cross-modal AI integrations

Discover how to build AI applications that seamlessly integrate text, images, and speech using Azure AI Services for sophisticated multimodal solutions.

Microsoft via Coursera

19 hours 55 minutes

Optional upgrade available

Not Specified

Learn at your own pace

Paid Course

Overview

Learn to build AI that sees, hears, and understands the world in an integrated way. This course takes you beyond single-modality models, teaching you to architect applications that connect different data types like text, images, and speech.

Starting with text-to-image generation, you will progress to integrating various AI components and orchestrating the full power of Azure AI Services to build sophisticated, cross-modal solutions. By the end, you'll be equipped to design the next generation of intelligent, multi-faceted AI applications.

Syllabus

Multimodal AI component integration

This module introduces the foundational concepts of multimodal AI. You will learn the architectural patterns for combining different AI components, such as text and image models, and progress from basic integration to building complex systems that can reason across multiple data types.

Text-to-image generation

This module provides a deep dive into the popular and creative task of generating images from text descriptions. You will explore the models that power this technology, like DALL·E, and learn both basic and advanced prompting techniques to craft and refine specific, high-quality visual outputs.

Cross-modal applications with Azure AI Vision

This module focuses on practical implementation with Azure AI Vision. You will use its features to build and optimize cross-modal applications such as image captioning and visual search. You'll learn how this single service can analyze visual content to generate rich textual descriptions and extract embedded text (OCR), providing the core components for multimodal solutions.

Advanced AI integration with Azure services

This capstone module builds on your work with Azure AI Vision. You will learn to integrate your vision applications with other Azure AI Services, such as Language and Speech, to create end-to-end solutions. The focus is on orchestrating these distinct services to develop an application that solves a real-world business problem, demonstrating your ability to design and build a complete multimodal system from the ground up.

Taught by

Microsoft

Subjects

Artificial Intelligence
