IBM

Build Multimodal Generative AI Applications

Ready to level up your GenAI skills? Step into the exciting world of multimodal AI, where language, images, and speech come together to build smarter, more interactive applications. In this hands-on course, you’ll learn how to build systems that work across multiple modalities, from creating AI-powered storytellers and meeting assistants to developing image captioning tools and video generation apps. You’ll gain experience with real-world tools like IBM’s Granite, OpenAI’s Whisper, Sora and DALL·E, Meta’s Llama, Mistral’s Mixtral, and Gradio. Plus, you'll explore multimodal search, question answering, and retrieval systems that combine text, speech, and visual data. By the end of the course, you’ll be able to design and build full-stack multimodal AI solutions using Python and frameworks like Flask and Gradio. If you’re looking to gain in-demand skills for building the next generation of AI applications, enroll today and power up your AI career!

Status: Software Development
Status: LLM Application
Intermediate · Course · hours

Featured reviews

MH

5.0 · Reviewed on: Oct 26, 2025

Wow, It was next Level Experience to learn the Multimodal Gen AI Development. Truly Amazing.

All reviews

Showing: 4 of 4

Muhammad Ali Hasnain
5.0
Reviewed on: Oct 27, 2025
Mansib Miraj
5.0
Reviewed on: Oct 15, 2025
Filip Pisowicz
4.0
Reviewed on: Mar 30, 2026
Sajjan Malik
1.0
Reviewed on: Sep 22, 2025