Learner Reviews and Feedback for ETL and Data Pipelines with Shell, Airflow and Kafka by IBM

4.5
433 ratings

About the Course

Delve into the two different approaches to converting raw data into analytics-ready data. One approach is the Extract, Transform, Load (ETL) process. The other, contrasting approach is the Extract, Load, and Transform (ELT) process. ETL processes apply to data warehouses and data marts. ELT processes apply to data lakes, where the data is transformed on demand by the requesting/calling application. In this course, you will learn about the different tools and techniques that are used with ETL and data pipelines. Both ETL and ELT extract data from source systems, move the data through the data pipeline, and store the data in destination systems. During this course, you will experience how ELT and ETL processing differ and identify use cases for both. You will identify methods and tools used for extracting the data, merging extracted data either logically or physically, and loading data into data repositories. You will also define transformations to apply to source data to make the data credible, contextual, and accessible to data users. You will be able to outline some of the multiple methods for loading data into the destination system, verifying data quality, monitoring load failures, and using recovery mechanisms in case of failure. By the end of this course, you will also know how to use Apache Airflow to build data pipelines and be knowledgeable about the advantages of using this approach. You will also learn how to use Apache Kafka to build streaming pipelines, and learn about the core components of Kafka, which include brokers, topics, partitions, replications, producers, and consumers. Finally, you will complete a shareable final project that enables you to demonstrate the skills you acquired in each module....

Top reviews

BN

Mar 30, 2023

Overall it's a good course. I wish I could use dos2unix, tr, or sed for removing ^M from the toll_data.tsv. The Final Assignment Instructions could have been clearer.

DS

Jun 13, 2022

Excellent introduction to these topics. The labs contain all you need to know to start using this type of technology. Highly recommended.

26 - 50 of 94 Reviews for ETL and Data Pipelines with Shell, Airflow and Kafka

By Steven W

Jul 19, 2023

I feel, though, that the final project suffered from issues with permissions, and there was a lack of a standard setup. Where should DAG scripts go? Why should they be in a folder with admin-only permissions? Submitting screenshots is tedious and (frankly) shows a lack of willingness on the part of the course designers to use tools like nbgrader/Jupyter notebooks or other automated grading solutions.

Warning: if you can write a "Hello World" program in any language, you probably want to skip this course/certification.

By Trevor F K

Feb 16, 2024

As with all these IBM courses, this one is super boring. Robot voice talking over PowerPoint slides, as usual. This one stuck out as especially bad because the online lab environment is very unreliable. So much time was wasted waiting for Airflow to fail to start. Extremely frustrating!

By Matthew M

Apr 21, 2023

Great course! I found the challenge intensity for the final peer-graded assignment to be at a perfect level for this course. It brought together many skills from this course and several previous courses in the IBM Data Engineering Professional Certificate curriculum.

By Sureerat P

May 2, 2024

The course is excellent and well-prepared. The instructor is very helpful and responsive on the discussion board. I really appreciate having the opportunity to learn from this course. Thank you to all the instructors and peers for reviewing assignments.

By Brusk A

Feb 25, 2023

Amazing for beginners to this subject! The labs are super useful and everything is explained in a really nice way. It can definitely get you started on a simple project using all that you've learned. Something nice for your portfolio and GitHub :)

By Sreepad P

Jul 6, 2022

The course is simply amazing and provides a good amount of hands-on sessions for learning to build data pipelines with Shell scripting, Airflow and Kafka. I highly recommend this course to anyone who wants to be a Data Engineer.

By Bach N

Mar 31, 2023

Overall it's a good course. I wish I could use dos2unix, tr, or sed for removing ^M from the toll_data.tsv. The Final Assignment Instructions could have been clearer.

By Mohamed A

Jun 10, 2022

Thanks to all the instructors' efforts, this is one of the best data engineering courses. It contains hands-on experience with essential data tools.

By Sergei K

Jan 21, 2025

Relevant information in the recordings, a good recap of every video, and a hands-on lesson at the end to cement the knowledge.

By Jibran

Jul 23, 2023

Labs in this course are very helpful and to the point. It took me a while to complete this course, but I learned a lot.

By Darrick L

Sep 7, 2022

Very useful high-level overview with practical examples of the major technologies that drive modern data pipelines.

By Theodosios T

Jan 5, 2023

The explanation was very thorough and easy to understand. The exercises were very helpful. Great course overall!

By Uchechi N

Dec 31, 2022

This was my first introduction to Apache Airflow, and I found the course detailed and practical.

By k b

Apr 24, 2022

Nice intro to ETL and Data Pipelines. Beginner-level, easy-to-follow, hands-on Airflow and Kafka.

By Reinaldo R

Aug 28, 2024

Very satisfied with the course content and the labs. Thank you very much!

By Younes G

Oct 26, 2022

Perfect intro to Airflow and Kafka!

The Final Project is very fun and instructive.

By Олег К

Jan 26, 2024

It was not easy to pass this course. Lots of laboratory work. But I liked it.

By Tea K

Jul 23, 2023

Great hands-on course (like the others in the pack)! Practical and interactive.

By Takahiko A

Aug 20, 2024

Personally, I found this to be the highlight course of the Data Warehouse Engineer track. Using Python makes it dramatically more enjoyable.

By Rorisang S

Mar 14, 2022

Succinctly presented. Labs really hammered the point home :)

By Maxime T

Jun 23, 2024

Great content but the labs can be challenging to work with.

By Asanka W

Sep 8, 2022

Great course. The assignments are really well prepared and flexible.

By Monica D

Mar 11, 2023

Great learning course for Kafka/Airflow, well presented.

By SARADA N G

Jul 13, 2023

Learned a lot about Apache Airflow and Kafka from scratch.

By Joseph O

Jun 11, 2024

That Kafka unit was so mind-boggling, but worth it.