The course "Data Analysis Using Hadoop Tools" provides a thorough and hands-on introduction to key tools within the Hadoop ecosystem, such as Hive, Pig, HBase, and Apache Spark, for data processing, management, and analysis. Learners will gain practical experience with Hive's SQL-like interface for complex data querying, Pig Latin scripting for data transformation, and HBase's NoSQL capabilities for efficient big data management. The course also covers Apache Spark's powerful in-memory computation capabilities for high-performance data processing tasks. By the end, participants will be equipped with the skills to leverage these technologies within the Hadoop platform to address real-world big data challenges.
What makes this course unique is its comprehensive approach to integrating various Hadoop tools into a cohesive workflow. You'll not only learn how to use each tool individually but also understand how to effectively combine them to optimize data processing and analysis. Through hands-on exercises and examples, you'll gain the confidence and skills to tackle complex data challenges and extract valuable insights from big data. Whether you're looking to enhance your data analysis capabilities for work or want to deepen your knowledge of Hadoop and big data tools, this course offers valuable skills that will help you succeed.
This course provides a comprehensive overview of key tools within the Hadoop ecosystem, including Hive, Pig, HBase, and Apache Spark. You will learn how to set up and configure these technologies for data processing, management, and analysis. The course covers Hive's query execution, Pig's scripting language, and HBase's NoSQL capabilities. You'll also gain hands-on experience with Spark's core programming model for efficient big data processing. By the end, you'll be equipped to leverage these tools for optimized data analysis and management.
What's included
2 readings
2 readings • Total 15 minutes
Course Overview • 5 minutes
Instructor Biography: Prof. Karthik Shyamsunder • 10 minutes
Data Analysis Using Hive
Module 2
Module details
In this module, we will cover MapReduce programming using Hive, a higher-level tool that translates SQL-like Hive queries into MapReduce jobs.
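As a rough illustration of what "translating a query to MapReduce" means, the sketch below mimics in plain Python the three phases a grouped count such as `SELECT dept, COUNT(*) FROM employees GROUP BY dept` could be broken into. It is not Hive's actual planner or API: the `employees` table, the `ROWS` data, and all function names are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical rows of an "employees" table: (name, dept)
ROWS = [("ana", "eng"), ("bo", "ops"), ("cy", "eng"), ("di", "eng")]


def map_phase(rows):
    """Emit one (dept, 1) pair per input row."""
    for _name, dept in rows:
        yield dept, 1


def shuffle(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key, giving COUNT(*) per dept."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_dept(rows):
    """Run the three phases in order on an in-memory list of rows."""
    return reduce_phase(shuffle(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_dept(ROWS))
```

In a real cluster the map and reduce steps would run in parallel across many machines; the single-process version here only shows the shape of the computation.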
What's included
9 videos, 7 readings, 4 assignments
9 videos • Total 107 minutes
Introduction - Hive • 2 minutes
Hive Overview and Architecture • 23 minutes
Setting up Hive • 26 minutes
Simple Hive Example • 20 minutes
Loading Data • 9 minutes
Hive Statements • 11 minutes
Partitions • 6 minutes
Joins • 8 minutes
Summary - Hive • 2 minutes
7 readings • Total 105 minutes
Hive Overview and Architecture • 10 minutes
Setting up Hive • 10 minutes
Simple Hive Example • 10 minutes
Loading Data • 10 minutes
Hive Statement • 10 minutes
Partitions and Joins in Hive • 15 minutes
Self-Reflective Reading: Balancing Hive, Java MapReduce, and Pig in Hadoop Architectures • 40 minutes
4 assignments • Total 105 minutes
Data Analysis using Hive • 60 minutes
Introduction to Hive: Overview, Architecture, and Setup • 15 minutes
Working with Hive: Basic Examples, Data Loading, and Hive Statements • 15 minutes
Advanced Hive: Partitions, Joins, and Summary • 15 minutes
Data Analysis Using Pig
Module 3
Module details
In this module, we will cover MapReduce programming using Pig, a higher-level platform that translates Pig Latin scripts into MapReduce jobs.
What's included
9 videos, 7 readings, 4 assignments
9 videos • Total 132 minutes
Introduction - Pig • 2 minutes
Pig: Overview and Architecture • 22 minutes
Setting up Pig • 8 minutes
Grunt Interactive Shell • 18 minutes
Pig Latin Language Basics • 10 minutes
Pig Data Types and Schema • 15 minutes
Core Relational Operators • 14 minutes
Join Operators • 26 minutes
Debug Operators • 17 minutes
7 readings • Total 102 minutes
Pig: Overview and Architecture • 15 minutes
Grunt Interactive Shell • 7 minutes
Exploring Pig Latin Basics: Data Structures, Syntax, and Commands • 10 minutes
Understanding Schemas, Data Types, and Functions in Apache Pig • 10 minutes
Core Relational Operators in Pig Latin: An Overview • 10 minutes
Exploring Relational Join Operators in Apache Pig • 10 minutes
Self-Reflective Reading: Hive vs. Pig for Your Big Data Strategy • 40 minutes
4 assignments • Total 105 minutes
Data Analysis using Pig • 60 minutes
Introduction to Pig: Overview, Architecture, and Setup • 15 minutes
Pig Fundamentals: Grunt Shell, Pig Latin Basics, and Data Types • 15 minutes
Pig Advanced: Core Operators, Joins, Debugging, and Summary • 15 minutes
Hadoop NoSQL Database HBase
Module 4
Module details
In this module, we will start with a primer on NoSQL databases and then dive into HBase, a NoSQL database built on top of Hadoop that provides random, real-time read/write access to your big data.
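To make "random read/write access" a little more concrete, here is a minimal in-memory sketch of the kind of data model HBase exposes: rows sorted by key, cells addressed by a `family:qualifier` column, and multiple timestamped versions per cell. The `ToyTable` class and its method names are hypothetical and exist only for this illustration; they are not the HBase shell or Java API covered in the module.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for an HBase-style table (illustrative only)."""

    def __init__(self):
        # row key -> "family:qualifier" -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, ts):
        """Write one version of a cell, identified by row, column, and timestamp."""
        self._rows[row][column][ts] = value

    def get(self, row, column):
        """Return the newest version of a cell, or None if it does not exist."""
        versions = self._rows.get(row, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self, start, stop):
        """Yield row keys in sorted order, start inclusive and stop exclusive."""
        for row in sorted(self._rows):
            if start <= row < stop:
                yield row


if __name__ == "__main__":
    table = ToyTable()
    table.put("user#002", "info:name", "Bo", ts=1)
    table.put("user#001", "info:name", "Ana", ts=1)
    table.put("user#001", "info:name", "Ann", ts=2)
    print(table.get("user#001", "info:name"))
    print(list(table.scan("user#001", "user#003")))
```

Because rows are kept in key order, a lookup by row key or a scan over a key range does not need to touch the whole table, which is the property the module description refers to.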
What's included
8 videos, 3 readings, 3 assignments
8 videos • Total 175 minutes
Introduction - HBase • 2 minutes
NoSQL Primer • 35 minutes
HBase Overview and Architecture • 31 minutes
Setting up HBase • 25 minutes
HBase Data Model • 16 minutes
HBase Shell • 33 minutes
CRUD operations using Java API • 31 minutes
Summary - HBase • 3 minutes
3 readings • Total 60 minutes
HBase Overview and Architecture • 10 minutes
HBase Data Model • 10 minutes
Self-Reflective Reading: Coexisting Databases: Balancing NoSQL and RDBMS in Modern Applications • 40 minutes
3 assignments • Total 90 minutes
Hadoop NoSQL Database HBase • 60 minutes
Introduction to HBase: NoSQL Basics, Architecture, and Setup • 15 minutes
HBase Fundamentals: Data Model, Shell, CRUD Operations, and Summary • 15 minutes
Spark
Module 5
Module details
In this module, we will cover the Spark engine and framework and show how it integrates with the Hadoop platform.
What's included
8 videos, 5 readings, 4 assignments
8 videos • Total 233 minutes
Introduction - Spark • 3 minutes
Spark Overview • 38 minutes
Spark Architecture • 29 minutes
Setting up and Running Spark • 55 minutes
Spark Core Programming Model • 41 minutes
Hands-on Spark • 39 minutes
Miscellaneous Spark Components • 20 minutes
Summary - Spark • 7 minutes
5 readings • Total 85 minutes
Spark Architecture • 15 minutes
Reading References • 10 minutes
Setting up and Running Spark • 10 minutes
Reading References • 10 minutes
Self-Reflective Reading: Hadoop vs Spark: Competing or Complementary? • 40 minutes
4 assignments • Total 105 minutes
Spark • 60 minutes
Introduction to Spark: Overview and Architecture • 15 minutes
The mission of The Johns Hopkins University is to educate its students and cultivate their capacity for life-long learning, to foster independent and original research, and to bring the benefits of discovery to the world.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade, but it also means that you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Specialization?
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can't afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you'll find a link to apply on the description page.