Dartmouth College
Predictive Analytics

Reed H. Harder
Vikrant S. Vaze


Included with Coursera Plus

Gain insight into a topic and learn the fundamentals.
Intermediate level


5 weeks to complete
at 10 hours a week
Flexible schedule
Learn at your own pace


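Build your subject-matter expertise

This course is part of the Data Analytics for Digital Transformation Specialization.
When you enroll in this course, you'll also be enrolled in this Specialization.
  • Learn new concepts from industry experts
  • Gain a foundational understanding of a subject or tool
  • Develop job-relevant skills with hands-on projects
  • Earn a shareable career certificate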

There are 9 modules in this course

Welcome to Predictive Analytics for Digital Transformation. This course is part of the Digital Transformation for Data Analytics Certificate. It is designed to equip you with the tools and knowledge to transform raw data into actionable insights. Whether you want to enhance organizational efficiency, improve customer experiences, or innovate within your field, this course provides the foundational skills to leverage predictive analytics effectively. Throughout this course, you will explore the theoretical underpinnings and practical applications of predictive analytics, starting with linear and logistic regression and advancing to more complex models and techniques. Using Python and cloud-based tools, you'll gain hands-on experience in building, training, and evaluating models that solve real-world business challenges. Topics include diagnosing model performance issues like overfitting and underfitting, selecting appropriate features, and working with skewed datasets. You’ll also explore advanced modeling techniques and cross-validation methods to ensure your models are generalizable and robust. Guided by Drs. Vikrant Vaze and Reed Harder, you’ll complete practical activities, reflection exercises, and case-based projects designed to simulate real-world scenarios. Along the way, you’ll learn to integrate analytics into digital transformation initiatives, empowering you to lead data-driven innovations in your industry. Whether you're a seasoned professional or new to the field, this course will challenge you to think critically, code effectively, and apply your skills to meaningful, data-centric problems.

What's included

2 videos, 10 readings, 2 assignments, 2 ungraded labs

You may have heard the analogies “Data is the new oil,” and “Analytics is the combustion engine.” What is meant by these comparisons? In the digital transformation era, traditional companies seek to gather, refine, and mathematically study all kinds of available information, from customer demographics to operational metrics, to reimagine business models and processes for the 21st century. Indeed, quality data is the fuel that drives organizational decision-making! In this module, you will get hands-on practice with two key predictive analytics tools, regression and classification, and learn how to create mathematical models representative of business situations. We’ll conclude with instructions on implementing the models in code using Scikit-learn, a common Python library for machine learning.
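As a rough illustration of the two tools named in this module, the sketch below fits one regression model and one classification model with Scikit-learn. The data values, the variable names, and the choice of `LinearRegression` and `LogisticRegression` are assumptions made for illustration; they are not taken from the course labs.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Hypothetical regression data: route distance (miles) -> average fare (dollars).
# The fares are made up so that fare = 50 + 0.2 * distance exactly.
distance = np.array([[200.0], [400.0], [600.0], [800.0], [1000.0]])
fare = np.array([90.0, 130.0, 170.0, 210.0, 250.0])

# Regression predicts a continuous value.
reg = LinearRegression().fit(distance, fare)
predicted_fare = float(reg.predict(np.array([[700.0]]))[0])

# Hypothetical classification data: length of stay (days) -> readmitted (1) or not (0).
stay_days = np.array([[1.0], [2.0], [3.0], [8.0], [9.0], [10.0]])
readmitted = np.array([0, 0, 0, 1, 1, 1])

# Classification predicts a discrete label.
clf = LogisticRegression().fit(stay_days, readmitted)
predicted_labels = clf.predict(np.array([[2.0], [9.0]]))

print(predicted_fare, predicted_labels)
```

Both estimators share the same `fit` and `predict` interface, which is why swapping one model for another in Scikit-learn usually requires only small code changes.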

What's included

2 videos, 5 readings, 2 assignments, 2 ungraded labs

In this unit, we build our first predictive model using linear regression, a fundamental and powerful method in supervised learning. To illustrate its application, we return to our airfare prediction example: an airline collects historical data to predict the average airfare for a new route based on its distance. We aim to determine the line that best fits the data, minimizing the difference between predicted and actual fares. This process introduces the model training objective, where we optimize parameters (e.g., slope and intercept) by minimizing an error function. We'll explore how the gradient descent algorithm, a versatile and iterative optimization method, achieves this. Linear regression is a cornerstone of digital transformation, enabling organizations to derive actionable insights from data. For example, the healthcare industry can predict patient outcomes based on variables like age, medical history, and treatment options, driving more personalized care. Similarly, businesses can forecast sales in retail based on historical purchasing trends, inventory levels, and seasonal factors, enabling smarter supply chain management. Industries are transforming operations, decision-making, and customer experiences by integrating models like this.
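The training objective described above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a half mean squared error function, a fixed learning rate, and made-up fare data; the function name `fit_line_gradient_descent` and all numbers are hypothetical and not taken from the course.

```python
def fit_line_gradient_descent(x, y, learning_rate=0.5, n_steps=2000):
    """Fit y ~ slope * x + intercept by minimizing half the mean squared error."""
    m = len(x)
    slope, intercept = 0.0, 0.0
    for _ in range(n_steps):
        # Prediction error for every training example under the current line.
        errors = [slope * xi + intercept - yi for xi, yi in zip(x, y)]
        # Partial derivatives of the error function with respect to each parameter.
        grad_slope = sum(e * xi for e, xi in zip(errors, x)) / m
        grad_intercept = sum(errors) / m
        # Step both parameters a small distance against the gradient.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Hypothetical routes: distance in thousands of miles, average fare in dollars.
# The fares are made up so that fare = 50 + 200 * distance exactly.
distance_kmi = [0.2, 0.4, 0.6, 0.8, 1.0]
fare = [90.0, 130.0, 170.0, 210.0, 250.0]

slope, intercept = fit_line_gradient_descent(distance_kmi, fare)
print(round(slope, 3), round(intercept, 3))
```

Distance is expressed in thousands of miles so that the feature stays on a small scale; with raw mileage the same learning rate would make the updates diverge.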

What's included

3 videos, 6 readings, 2 assignments, 4 ungraded labs

In this unit, we deepen our understanding of predictive analytics by exploring more complex models and concepts that enhance decision-making through digital transformation. Building on our foundation in linear regression, we will expand into multivariate linear regression, where multiple features contribute to predictions, reflecting the multifaceted nature of real-world data. We will also introduce classification models, which predict discrete outcomes rather than continuous ones. Using practical examples like hospital readmission prediction, we’ll see how these models can address critical questions such as whether a patient is likely to be readmitted. Additional scenarios, such as predicting flight delays, customer behavior, or tumor diagnoses, will further demonstrate the power of classification. To effectively build and refine these models, we will introduce three essential concepts: feature transformation, feature selection, and overfitting. These techniques help answer pivotal questions about which features to include, how to transform data for optimal results, and how to avoid overly complex models that fail to generalize. We will understand the trade-offs and risks in creating robust supervised learning models by applying these ideas to previously explored examples, such as airfare prediction and hospital readmissions.
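To make multivariate regression and feature transformation concrete, here is a small sketch that fits the same fare data twice with ordinary least squares: once on the raw features and once with distance squared added as a transformed feature. The data, the `competitors` feature, and the helper `fit_least_squares` are hypothetical, and NumPy's `lstsq` stands in for whatever solver the course labs use.

```python
import numpy as np

# Hypothetical data: distance (thousands of miles) and number of competing airlines.
distance = np.array([0.2, 0.5, 0.8, 1.0, 1.5, 2.0, 2.5, 3.0])
competitors = np.array([1.0, 3.0, 2.0, 4.0, 1.0, 3.0, 2.0, 5.0])
# Made-up fares that depend on distance, distance squared, and competition.
fare = 80.0 + 20.0 * distance + 5.0 * distance**2 - 10.0 * competitors


def fit_least_squares(features, target):
    """Return coefficients (intercept first) and the training mean squared error."""
    design = np.column_stack([np.ones(len(target))] + features)
    coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
    mse = float(np.mean((design @ coefs - target) ** 2))
    return coefs, mse


# Multivariate linear regression on the raw features.
coefs_raw, mse_raw = fit_least_squares([distance, competitors], fare)
# Same model after a feature transformation: distance squared becomes a new feature.
coefs_sq, mse_sq = fit_least_squares([distance, distance**2, competitors], fare)

print(mse_raw, mse_sq)
```

A lower training error after adding features does not by itself show that the model generalizes better; that is the overfitting risk this unit raises, and it has to be checked on data held out from training.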

What's included

4 videos, 7 readings, 2 assignments, 4 ungraded labs

In this unit, we bridge the gap between foundational predictive analytics and practical implementation in modern digital transformation contexts. We begin by exploring Predictive Analytics in Python, where we leverage Python’s powerful libraries to build, train, and evaluate regression and classification models. Through hands-on exercises, you’ll learn how to process data, apply linear and logistic regression, and visualize results effectively. Next, we extend our focus to Linear Regression on the Cloud, demonstrating how cloud platforms enable scalable, efficient training of regression models on large datasets. You’ll gain practical experience in using cloud-based tools and services to handle real-world data challenges, such as forecasting trends and optimizing resource allocation. We also delve into Logistic Regression on the Cloud, emphasizing its applications in predicting discrete outcomes. By hosting and training logistic regression models in a cloud environment, we unlock the ability to process high-volume, real-time data, essential for tasks like customer behavior prediction, fraud detection, and healthcare analytics. Throughout the unit, we’ll highlight the role of predictive analytics in digital transformation, showing how cloud computing and Python empower organizations to make data-driven decisions.

What's included

2 videos, 3 readings, 2 assignments, 3 ungraded labs

Now that you are able to translate various business situations into predictive analytics models, the next challenge is to choose which model will best perform for the task at hand. Model choices may vary depending on the nature of your project, such as the requirements and constraints of the stakeholders, the time and resources available, and the availability of data. In this module, we will introduce more advanced modeling techniques which will aid the effective use of different kinds of datasets, and allow you to evaluate and improve your models in a way that incorporates the risk and uncertainty that is inherent in any real-world situation. Hands-on practice in Python to implement these advanced models will enhance your coding skills.

What's included

3 videos, 6 readings, 2 assignments, 3 ungraded labs

In this unit, we bring together the key concepts and techniques of predictive analytics to build robust, generalizable models that address real-world challenges. We start by diagnosing two critical issues—overfitting and underfitting—which can significantly impact a model’s performance. Using diagnostic tools, we will explore how to systematically identify and mitigate these problems to enhance model accuracy and reliability. Next, we introduce cross-validation, a powerful method to ensure models perform well on unseen data. By dividing data into training, validation, and test sets, we’ll learn how to make informed decisions about features, model complexity, and regularization parameters. This approach ensures that the predictive analytics models we develop are optimized for generalizability, a key requirement for leveraging digital transformation technologies effectively. We also tackle the challenges posed by skewed datasets, especially in classification problems with binary labels. Through practical examples, we’ll understand why standard metrics like misclassification error may fall short in such scenarios. To address this, we’ll introduce more nuanced evaluation metrics—precision, recall, and F-score—and demonstrate how to balance these measures by adjusting decision thresholds. By the end of this unit, you’ll have a comprehensive understanding of how to diagnose and refine predictive analytics models.
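The point about skewed datasets can be illustrated with a short plain-Python sketch. It assumes a made-up set of 20 examples with only 2 positives and hypothetical predicted probabilities, then computes precision, recall, and F-score (here the F1 variant) at several decision thresholds. The function name `precision_recall_f1` is illustrative only.

```python
def precision_recall_f1(labels, probs, threshold):
    """Compute precision, recall, and F1 for the positive class at a threshold."""
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for y, yhat in zip(labels, preds) if y == 1 and yhat == 1)
    fp = sum(1 for y, yhat in zip(labels, preds) if y == 0 and yhat == 1)
    fn = sum(1 for y, yhat in zip(labels, preds) if y == 1 and yhat == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical skewed dataset: 18 negatives followed by 2 positives.
labels = [0] * 18 + [1, 1]
probs = [0.02, 0.03, 0.05, 0.05, 0.08, 0.10, 0.10, 0.12, 0.15, 0.15,
         0.18, 0.20, 0.22, 0.25, 0.30, 0.35, 0.55, 0.60, 0.45, 0.80]

# A model that always predicts "negative" looks accurate but finds no positives.
baseline_accuracy = labels.count(0) / len(labels)

# Lowering the threshold raises recall; raising it favors precision.
metrics_low = precision_recall_f1(labels, probs, 0.4)
metrics_mid = precision_recall_f1(labels, probs, 0.5)
metrics_high = precision_recall_f1(labels, probs, 0.7)

print(baseline_accuracy, metrics_low, metrics_mid, metrics_high)
```

On this toy data the always-negative baseline reaches 90% accuracy with zero recall, which is why misclassification error alone is a weak guide when one class is rare.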

What's included

3 videos, 6 readings, 2 assignments, 4 ungraded labs

In this unit, we take a significant step forward in predictive analytics by exploring the application of neural networks for both regression and classification tasks. Neural networks offer powerful capabilities for capturing complex patterns in data, making them an essential tool for digital transformation across industries. By leveraging the scalability and efficiency of cloud platforms, we will learn to build, train, and evaluate neural network models capable of addressing real-world challenges. We begin by revisiting familiar datasets, such as expanded versions of the readmission dataset for classification and the market size dataset for regression. Through these examples, we’ll explore how neural networks handle continuous and discrete predictions, enabling us to address diverse business problems. To optimize model performance, we’ll incorporate techniques like cross-validation to fine-tune hyperparameters, such as regularization, and understand how these choices impact the trade-off between underfitting and overfitting. The unit also introduces advanced diagnostics to evaluate model performance, using metrics such as mean absolute error and mean squared error for regression, and confusion matrices for classification tasks with skewed datasets. These metrics help us refine our models and ensure their robustness. Additionally, we will tackle common data preparation challenges, such as handling missing or erroneous data, merging datasets, and transforming categorical variables into usable formats. Finally, we delve into practical examples, such as predicting flight delays, to illustrate the end-to-end workflow of cleaning, processing, and modeling data with neural networks. By iterating through models of varying complexity—linear, quadratic, and cubic—we’ll identify how to balance complexity and generalizability to avoid overfitting. 
By the end of this unit, you will have a comprehensive understanding of how to implement and evaluate neural networks in cloud environments, preparing you to harness their full potential for data-driven decision-making.
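The idea of iterating through linear, quadratic, and cubic models and comparing their errors can be sketched as follows. Note the assumptions: this uses polynomial least squares via `numpy.polyfit` rather than a neural network, the data and the fixed noise values are made up, and a simple alternating train/validation split stands in for full cross-validation.

```python
import numpy as np

# Hypothetical data: a quadratic trend plus small fixed perturbations.
x = np.linspace(0.0, 3.0, 12)
noise = np.array([0.10, -0.05, -0.10, 0.05, 0.00, 0.10,
                  -0.10, 0.05, -0.05, 0.00, 0.10, -0.10])
y = 2.0 + x + 3.0 * x**2 + noise

# Alternate points between a training set and a validation set.
x_train, y_train = x[::2], y[::2]
x_val, y_val = x[1::2], y[1::2]

results = {}
for name, degree in [("linear", 1), ("quadratic", 2), ("cubic", 3)]:
    coefs = np.polyfit(x_train, y_train, degree)
    train_err = np.polyval(coefs, x_train) - y_train
    val_err = np.polyval(coefs, x_val) - y_val
    results[name] = {
        "train_mse": float(np.mean(train_err**2)),  # mean squared error, training
        "val_mse": float(np.mean(val_err**2)),      # mean squared error, validation
        "val_mae": float(np.mean(np.abs(val_err))), # mean absolute error, validation
    }

for name, scores in results.items():
    print(name, scores)
```

Training error can only go down as the polynomial degree increases, so the validation columns are the ones to read when judging whether extra complexity is helping or overfitting.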

What's included

2 videos, 2 readings, 2 assignments, 3 ungraded labs

The final unit of this course is a practicum that serves as a mini-capstone project, allowing you to consolidate your learning and demonstrate mastery of the tools and techniques introduced throughout the course. This project is your opportunity to apply predictive analytics, cloud-based tools, and data science methodologies to a practical business problem, providing actionable insights that align with digital transformation initiatives. You will select a dataset and problem of interest, either from your own professional or academic context or from one of the structured scenarios provided. Using your analytics toolbox, you will explore, analyze, and develop a data-driven solution to inform strategic and operational decisions. This hands-on project will challenge you to:
  • Frame a business problem in terms of predictive analytics.
  • Develop and evaluate models, leveraging tools like Scikit-learn, neural networks, or optimization techniques.
  • Diagnose model performance, validate results, and provide implementable recommendations.
  • Translate your findings into a technical report with a comprehensive executive summary tailored to stakeholders.

What's included

4 readings, 2 assignments, 2 ungraded labs

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructors

Reed H. Harder
Dartmouth College
6 courses, 1,463 learners
Vikrant S. Vaze
Dartmouth College
5 courses, 2,103 learners

Offered by

Dartmouth College
