This intermediate course provides a practical, hands-on exploration of Databricks Governance, focusing on the essential tools and workflows for managing and securing your data lakehouse. You will learn to navigate and control access to your data assets using Unity Catalog, the foundation of Databricks governance. The course covers the core hierarchy of metastores, catalogs, schemas, and tables, and teaches you how to manage them programmatically using the Databricks Python SDK, CLI, and VS Code extension.
Beyond foundational access control, you will master the skills to implement modern CI/CD and MLOps practices directly within the Databricks environment. You'll learn to integrate Databricks Repos with GitHub, automate notebook testing and deployment with GitHub Actions, and understand the architectural considerations for managing machine learning models in production. Finally, you will explore how to ensure ongoing data reliability by setting up and understanding Lakehouse Monitoring for data quality and freshness.
This course is unique because it moves beyond theory, demonstrating how to apply these governance concepts with the actual tools and code used by data professionals. By the end, you'll be equipped to build, deploy, and monitor secure and reliable data pipelines and AI applications on the Databricks platform
This module establishes the foundation of Databricks governance
through Unity Catalog. You'll navigate the metastore-catalog-schema-
table hierarchy, set up role-based access control using service
principals and GRANT/REVOKE statements, and learn to manage your
governance setup programmatically with the Databricks Python SDK,
CLI, and VS Code extension.
涵盖的内容
16个视频9篇阅读材料1个作业
显示有关单元内容的信息
16个视频•总计48分钟
Course Introduction•1分钟
Introduction•0分钟
Unity Catalog overview•5分钟
Navigating the catalog hierarchy•5分钟
Setting up your first Unity Catalog•6分钟
Summary•0分钟
Introduction•0分钟
Introducing the Databricks Python SDK•7分钟
Setting up the Databricks VS Code extension•3分钟
Overview of the Databricks CLI•5分钟
Summary•0分钟
Introduction•0分钟
Principals and configurations•3分钟
Using the SDK to create a Service Principal•5分钟
Writing REVOKE and GRANT statements•4分钟
Summary•1分钟
9篇阅读材料•总计17分钟
About this course and your instructors•1分钟
Key terms•1分钟
Lab•5分钟
Reflection•1分钟
Key terms•1分钟
Lab•5分钟
Reflection•1分钟
Key terms•1分钟
Reflection•1分钟
1个作业•总计30分钟
Quiz: Governance •30分钟
CI/CD and MLOps
第 2 单元•小时 后完成
单元详情
This module covers the workflows that take Databricks code from a
developer's laptop to production. You'll integrate Databricks Repos
with GitHub using branching strategies and code review, automate
notebook testing and deployment with GitHub Actions, and build a
complete MLOps pipeline that serves a GenAI application through a
model serving endpoint.
涵盖的内容
16个视频9篇阅读材料1个作业
显示有关单元内容的信息
16个视频•总计52分钟
Introduction•0分钟
Connecting Databricks to GitHub•5分钟
Authenticating to GitHub•4分钟
Branching strategies and code review•4分钟
Summary•1分钟
Introduction•0分钟
Running notebooks as jobs•6分钟
Challenges with notebooks•6分钟
Automating tests and runs with GitHub Actions•6分钟
Summary•1分钟
Introduction•0分钟
Overview of ML and AI capabilities•4分钟
Creating a GenAI application•5分钟
Creating a serving endpoint•6分钟
MLOps architectural overview•4分钟
Summary•0分钟
9篇阅读材料•总计17分钟
Key terms•1分钟
Lab•5分钟
Reflection•1分钟
Databricks Free Edition•1分钟
Key terms•1分钟
Lab•5分钟
Reflection•1分钟
Key terms•1分钟
Reflection•1分钟
1个作业•总计30分钟
Quiz: CI/CD and MLOps•30分钟
Monitoring and quality
第 3 单元•小时 后完成
单元详情
This module closes the production loop with Lakehouse Monitoring.
You'll enable quality and freshness monitoring on Unity Catalog
tables, interpret monitoring results to detect data anomalies and
drift, and review the recommendations that turn a working pipeline
into a production-ready governance setup.
I'm already using the Databricks UI for governance. Why do I need the SDK or CLI?
While the UI is great for one-off tasks, managing governance at scale requires automation. This course teaches you how to use the SDK and CLI to programmatically manage users, permissions, and data assets, which is essential for integrating governance into your CI/CD pipelines and Infrastructure as Code practices.
I'm a data engineer, not a machine learning expert. Will the ML module be too advanced?
The ML module is designed to give data engineers the necessary context for working with ML teams. It focuses on the operational aspects—like setting up a serving endpoint and the overall MLOps architecture—that are relevant for integrating and supporting ML models within governed data pipelines.
When will I have access to the lectures and assignments?
To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Specialization?
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If fin aid or scholarship is available for your learning program selection, you’ll find a link to apply on the description page.