Johns Hopkins University

Big Data Processing Using Hadoop Specialization

Master Big Data Processing with Hadoop. Gain hands-on experience with Hadoop tools and techniques to efficiently process, analyze, and manage big data in real-world applications.

Included with Coursera Plus

Gain in-depth knowledge of a subject
Intermediate level
12 weeks to complete at 5 hours a week
Flexible schedule: learn at your own pace

What you'll learn

  • Gain expertise in Hadoop ecosystem components like HDFS, YARN, and MapReduce for big data processing and management across various tasks.

  • Learn to set up, configure, and utilize tools like Hive, Pig, HBase, and Spark for efficient data analysis, processing, and real-time management.

  • Develop advanced programming techniques for MapReduce, optimization methods, and parallelism strategies to handle large-scale data sets effectively.

  • Understand the architecture and functionality of Hadoop and its components, applying them to solve complex data challenges in real-world scenarios.

Details to know

Shareable certificate

Add to your LinkedIn profile

Taught in English

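Advance your subject-matter expertise

  • Learn in-demand skills from university and industry experts
  • Master a subject or tool with hands-on projects
  • Develop a deep understanding of key concepts
  • Earn a career certificate from Johns Hopkins University

Specialization - 4 course series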

What you'll learn

  • Define Big Data, explore its relevance in analytics and data science, and understand trends shaping modern data processing technologies.

  • Examine Hadoop architecture, its ecosystem, and subprojects, distinguishing distributions and their roles in Big Data solutions.

  • Acquire practical skills to install, configure, and run Hadoop on a Linux virtual machine, enabling effective Big Data processing.

Skills you'll gain

  • Apache Hadoop
  • Big Data
  • Distributed Computing
  • Linux
  • Scalability
  • Data Infrastructure
  • Software Installation
  • System Configuration
  • Data Processing
  • Analytics
  • Data Science

HDFS Architecture and Programming

Course 2, 14 hours

What you'll learn

  • Understand HDFS architecture, components, and how it ensures scalability and availability for big data processing.

  • Learn to configure Hadoop for Java programming and perform file CRUD operations using HDFS APIs.

  • Master advanced HDFS programming concepts like compression, serialization, and working with specialized file structures like Sequence and Map files.

Skills you'll gain

  • File Systems
  • Data Storage
  • Apache Hadoop
  • Scalability
  • Distributed Computing
  • Java
  • Infrastructure Architecture
  • Data Processing
  • Big Data
  • Data Structures
  • File Management
  • Systems Architecture
  • Development Environment

What you'll learn

  • Learn the fundamentals of YARN and MapReduce architectures, including how they work together to process large-scale data efficiently.

  • Understand and implement Mapper and Reducer parallelism in MapReduce jobs to improve data processing efficiency and scalability.

  • Apply optimization techniques such as combiners, partitioners, and compression to enhance the performance and I/O operations of MapReduce jobs.

  • Explore advanced concepts like multithreading, speculative execution, input/output formats, and how to avoid common MapReduce anti-patterns.
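
To make the combiner and partitioner ideas in the list above concrete, here is a minimal sketch of a word-count job. It is a plain-Python simulation and does not use the Hadoop API; the function names (`map_phase`, `combine`, `partition`, `run_job`) and the sample input are assumptions made only for illustration.

```python
import zlib
from collections import Counter, defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def combine(pairs):
    """Combiner: pre-aggregate one mapper's output locally, before the shuffle."""
    local = Counter()
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def partition(word, num_reducers):
    """Partitioner: route a key to a reducer index using a stable hash."""
    return zlib.crc32(word.encode("utf-8")) % num_reducers


def run_job(splits, num_reducers=2):
    """Run map, combine, shuffle, and reduce over a list of input splits.

    Each split is a list of lines and stands in for one mapper's input.
    Returns the final counts and the number of records sent through the shuffle.
    """
    # One bucket per simulated reducer: key -> list of partial counts.
    shuffled = [defaultdict(list) for _ in range(num_reducers)]
    records_shuffled = 0
    for split in splits:
        mapped = [pair for line in split for pair in map_phase(line)]
        for word, count in combine(mapped):
            shuffled[partition(word, num_reducers)][word].append(count)
            records_shuffled += 1
    # Reducer: sum the partial counts for each key in its bucket.
    result = {}
    for bucket in shuffled:
        for word, counts in bucket.items():
            result[word] = sum(counts)
    return result, records_shuffled


if __name__ == "__main__":
    counts, shuffled_records = run_job([["big data big"], ["data big"]])
    print(counts, shuffled_records)
```

In this sketch the two mappers emit five pairs in total, but only four records cross the simulated shuffle, because the combiner merges the repeated key inside the first split. That reduction in shuffled data is the I/O saving a combiner is meant to provide.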

Skills you'll gain

  • Distributed Computing
  • Apache Hadoop
  • Data Processing
  • System Configuration
  • Big Data
  • Performance Tuning
  • Software Architecture
  • Java
  • Scalability

Data Analysis Using Hadoop Tools

Course 4, 23 hours

What you'll learn

  • Learn to set up and configure Hive, Pig, HBase, and Spark for efficient big data analysis and processing within the Hadoop ecosystem.

  • Master Hive’s SQL-like queries for data retrieval, management, and optimization using partitions and joins to enhance query performance.

  • Understand Pig Latin for scripting data transformations, including the use of operators like join and debug to process large datasets effectively.

  • Gain expertise in NoSQL databases with HBase for real-time read/write operations, and use Spark’s core programming model for fast data processing.

Skills you'll gain

  • Apache Hadoop
  • NoSQL
  • Query Languages
  • Big Data
  • Apache Hive
  • Data Transformation
  • Apache Spark
  • Data Processing
  • Scripting Languages
  • SQL
  • Data Manipulation
  • Data Management

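Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructor

Karthik Shyamsunder
Johns Hopkins University
4 courses, 992 learners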
