Unlock the full potential of generative AI with our advanced course module focused on state-of-the-art multimodal models. This course is designed for learners eager to bridge the gap between images and text, and to master the latest techniques in AI-driven content generation. You’ll begin by exploring the foundational concepts behind multimodal models, learning how contrastive language-image pre-training (CLIP) maps images and text into a shared embedding space. Discover how these models power applications like semantic image search, allowing you to query image content without manual labeling. Dive deeper into the mechanics of latent diffusion models and unravel the inner workings of Stable Diffusion, gaining the skills to transform text prompts into entirely new images. The course also covers essential strategies for evaluating generative models and introduces parameter-efficient methods for fine-tuning and adapting pre-trained models to new styles and subjects. By the end, you’ll be equipped to build, adapt, and optimize text-to-image systems for creative, research, or commercial settings.
This module delves into multimodal generative AI, focusing on models that connect images and text. Learners explore contrastive language-image pre-training for semantic image search and uncover the workings of latent diffusion and Stable Diffusion for text-to-image generation. The module then covers evaluation of generative models, parameter-efficient fine-tuning, and techniques to teach pre-trained models new styles and subjects. It concludes with methods to optimize diffusion models for faster, near real-time image generation, equipping students with both conceptual understanding and practical skills in advanced multimodal AI systems.
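As a rough illustration of the semantic image search idea described above, the sketch below ranks images by cosine similarity between a text-query embedding and image embeddings. It is not taken from the course materials: the file names and the three-dimensional vectors are hypothetical, hand-written stand-ins for embeddings that a model such as CLIP would produce, and `rank_images` is an illustrative helper rather than a library function.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(text_embedding, image_embeddings):
    """Return image names sorted from most to least similar to the query."""
    scores = {
        name: cosine_similarity(text_embedding, vector)
        for name, vector in image_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy embeddings; a real system would compute these with a
# contrastively trained image encoder and text encoder.
image_embeddings = {
    "dog_on_beach.jpg": [0.9, 0.1, 0.3],
    "city_skyline.jpg": [0.1, 0.9, 0.2],
    "cat_on_sofa.jpg": [0.7, 0.2, 0.1],
}

# Hypothetical embedding of a text query such as "a dog playing by the sea".
query_embedding = [0.8, 0.0, 0.4]

ranking = rank_images(query_embedding, image_embeddings)
print(ranking)
```

Because the query and the images live in the same vector space, no image needs a manual label: the search reduces to a nearest-neighbor lookup over embeddings.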
What's included
44 videos, 3 assignments
44 videos • Total 408 minutes
Topics • 1 minute
Components of a Multimodal Model • 5 minutes
Vision-Language Understanding • 10 minutes
Contrastive Language-Image Pretraining • 6 minutes
Embedding Text and Images with CLIP • 14 minutes
Zero-Shot Image Classification with CLIP • 4 minutes
Semantic Image Search with CLIP • 11 minutes
Conditional Generative Models • 5 minutes
Introduction to Latent Diffusion Models • 9 minutes
The Latent Diffusion Model Architecture • 6 minutes
Failure Modes and Additional Tools • 7 minutes
Stable Diffusion Deconstructed • 12 minutes
Writing Our Own Stable Diffusion Pipeline • 11 minutes
Decoding Images from the Stable Diffusion Latent Space • 5 minutes
Improving Generation with Guidance • 9 minutes
Playing with Prompts • 30 minutes
Topics • 1 minute
Methods and Metrics for Evaluating Generative AI • 7 minutes
Manual Evaluation of Stable Diffusion with DrawBench • 14 minutes
Quantitative Evaluation of Diffusion Models with Human Preference Predictors • 20 minutes
Overview of Methods for Fine-Tuning Diffusion Models • 10 minutes
Sourcing and Preparing Image Datasets for Fine-Tuning • 8 minutes
Generating Automatic Captions with BLIP-2 • 8 minutes
Parameter Efficient Fine-Tuning with LoRA • 12 minutes
Inspecting the Results of Fine-Tuning • 5 minutes
Inference with LoRAs for Style-Specific Generation • 12 minutes
Conceptual Overview of Textual Inversion • 8 minutes
Subject-Specific Personalization with Dreambooth • 8 minutes
Dreambooth versus LoRA Fine-Tuning • 6 minutes
Dreambooth Fine-Tuning with Hugging Face • 14 minutes
Inference with Dreambooth to Create Personalized AI Avatars • 14 minutes
Adding Conditional Control to Text-to-Image Diffusion Models • 4 minutes
Creating Edge and Depth Maps for Conditioning • 16 minutes
Depth and Edge-Guided Stable Diffusion with ControlNet • 17 minutes
Understanding and Experimenting with ControlNet Parameters • 9 minutes
Generative Text Effects with Font Depth Maps • 3 minutes
Few Step Generation with Adversarial Diffusion Distillation (ADD) • 7 minutes
Reasons to Distill • 6 minutes
Comparing SDXL and SDXL Turbo • 12 minutes
Text-Guided Image-to-Image Translation • 17 minutes
Video-Driven Frame-by-Frame Generation with SDXL Turbo • 13 minutes
Near Real-Time Inference with PyTorch Performance Optimizations • 11 minutes
Programming Generative AI: Summary • 1 minute
Course Summary • 1 minute
3 assignments • Total 90 minutes
Connecting Text and Images Quiz • 30 minutes
Post-Training Procedures for Diffusion Models Quiz • 30 minutes
The World’s Leading Learning Company
Pearson provides in-demand training and expert resources across business, technology, and professional development.
Designed to help learners at all levels gain new skills, advance their careers, and stay competitive in a rapidly changing world, Pearson's expert-led courses offer practical, real-world knowledge from industry leaders. Whether you're preparing for a certification, enhancing workplace skills, or driving impact in your organization, Pearson is your trusted partner in lifelong learning.
Explore Pearson's courses and take the next step in your professional journey.