This course covers both the theory and the practical application of data mining in engineering. It reviews the fundamentals and central concepts of data mining, introduces the principal data mining methodologies, and shows how to carry them out with various algorithms. Students will be introduced to a range of data mining techniques, including data preprocessing, association rule extraction, classification, prediction, clustering, and the exploration of complex data, and will implement a capstone project that applies them. Case studies illustrate how data mining is applied across diverse sectors, including manufacturing, healthcare, medicine, business, and various service industries.
In this module, participants will explore essential data concepts across domains, understanding diverse data types, attributes, and features. They will grasp the fundamental principles, methodologies, and scope of data mining.
What's included
4 videos • 9 readings • 1 assignment
4 videos • 22 minutes total
Course Overview • 1 minute
Intro to Data Mining-Data • 6 minutes
Intro to Data Mining-Mining • 6 minutes
Data Mining Techniques • 9 minutes
9 readings • 41 minutes total
Course Introduction • 2 minutes
Meet Your Faculty: Chinthaka Pathum "Dinesh" Herath Gedara • 10 minutes
Machine Learning and Data Analytics Part 1 Syllabus • 10 minutes
Academic Integrity • 3 minutes
Intro to Data Mining-Data • 2 minutes
Intro to Data Mining-Mining • 5 minutes
Data Mining Life Cycle • 5 minutes
Data Mining Techniques • 1 minute
Types of Machine Learning • 3 minutes
1 assignment • 20 minutes total
Module 1: Assess Your Learning: Introduction to Data Mining in Engineering • 20 minutes
Exploratory Data Analysis and Visualization
Module 2
This module aims to impart a comprehensive understanding of data concepts across various domains. Participants will learn to distinguish among data types, attributes, and features, and will explore the fundamental principles and methodologies of data mining.
What's included
3 videos • 13 readings • 1 assignment
3 videos • 13 minutes total
Exploratory Data Analysis (EDA) • 4 minutes
Data Cleaning and Preprocessing • 6 minutes
Data Transformation • 3 minutes
13 readings • 92 minutes total
Exploratory Data Analysis (EDA) • 30 minutes
Data Cleaning and Preprocessing • 1 minute
Data Cleaning and Preprocessing Methods • 5 minutes
Data Visualization Techniques: Charts and Plots • 3 minutes
Bar and Pie Charts • 5 minutes
Line Graphs and Scatter Plots • 5 minutes
Pearson Correlation, Pair Plots, and Radar Charts • 8 minutes
Parallel Coordinates Plot and Sankey Plot • 5 minutes
Histograms, Box Plots and Violin Chart • 8 minutes
Area Plots and Bubble Charts • 5 minutes
Heat, Tree, and Choropleth Maps • 8 minutes
Word Clouds and Network Graphs • 8 minutes
Data Transformation • 1 minute
1 assignment • 30 minutes total
Module 2: Assess Your Learning: Exploratory Data Analysis and Visualization • 30 minutes
Dimensionality Reduction
Module 3
This module covers dimensionality reduction, a technique for simplifying complex datasets so they can be analyzed and visualized efficiently. By implementing dimensionality reduction methods such as Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE), we gain insight into how to reduce the number of features while preserving essential information. We'll also learn to select and apply the most suitable dimensionality reduction technique based on data types and analytical goals.
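To make the idea of "reducing features while preserving information" concrete, here is a minimal sketch of PCA on two-dimensional data. It is not taken from the course materials: the function name `pca_2d` and the sample points are illustrative assumptions, and the closed-form eigenvalue shortcut only works for two features. Real datasets would normally go through a numerical library.

```python
import math


def pca_2d(points):
    """Return (direction, scores, explained_ratio) for the first principal
    component of a list of 2-D points.

    Illustrative helper (not from the course). Assumes at least two points
    that are not all identical.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    trace = sxx + syy
    gap = math.sqrt((sxx - syy) ** 2 + 4 * sxy ** 2)
    largest = (trace + gap) / 2

    # Matching eigenvector; the axis-aligned cases cover sxy == 0.
    if sxy != 0:
        vx, vy = largest - syy, sxy
    elif sxx >= syy:
        vx, vy = 1.0, 0.0
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    direction = (vx / norm, vy / norm)

    # Project each centered point onto the principal direction.
    scores = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, scores, largest / trace


# Hypothetical sample: points lying on a straight line.
sample = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
direction, scores, ratio = pca_2d(sample)
print(direction, scores, ratio)
```

The returned ratio is the share of total variance captured by the first component, which is the usual guide for deciding how many components to keep.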
What's included
5 videos • 11 readings • 1 assignment
5 videos • 52 minutes total
Feature Selection–Linear Methods PCA • 8 minutes
PCA • 11 minutes
t-SNE: What is it? • 5 minutes
t-SNE: How it Works • 15 minutes
Linear Discriminant Analysis (LDA) • 13 minutes
11 readings • 69 minutes total
Dimensionality Reduction • 3 minutes
Why Dimensionality Reduction? • 5 minutes
Feature Selection and Extraction • 3 minutes
Feature Extraction–Linear Methods PCA • 1 minute
Principal Component Analysis (PCA) • 1 minute
Covariance Matrix • 5 minutes
Correlation Matrix • 5 minutes
PCA Example • 15 minutes
t-SNE • 25 minutes
Linear Discriminant Analysis (LDA) • 1 minute
LDA Example • 5 minutes
1 assignment • 10 minutes total
Module 3: Assess Your Learning: Dimensionality Reduction • 10 minutes
Performance Evaluation Metrics
Module 4
In this module, we learn the concept of the Bias-Variance Trade-Off in machine learning. Building models that generalize well requires balancing bias and variance to avoid underfitting and overfitting. Bias represents the error from oversimplifying a complex problem, while variance quantifies the model's sensitivity to different training data subsets. We will explore strategies for reducing bias and variance so that models strike the right balance between accuracy and generalization. Moving on to regression metrics, we will look at practical tools used to measure and evaluate model performance in regression tasks, focusing on metrics such as Root Mean Squared Error (RMSE). Finally, we will cover how to assess model performance in binary classification tasks, exploring advanced measures like the F1 score, Matthews Correlation Coefficient (MCC), propensity scores, and the AUC-ROC curve.
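Three of the metrics named above can be computed directly from their standard definitions. The sketch below is illustrative rather than course code: the function names are assumptions, binary labels are assumed to be encoded as 0 and 1, and both F1 and MCC are defined here to return 0.0 when their denominator is zero.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Squared Error for a regression task."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, fn, tn):
    """Matthews Correlation Coefficient, ranging from -1 to 1."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical labels and predictions for a binary classifier.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
tp, fp, fn, tn = confusion_counts(actual, predicted)
print(f1_score(tp, fp, fn), mcc(tp, fp, fn, tn))
```

Unlike F1, MCC uses all four cells of the confusion matrix, which is why it is often preferred when the classes are imbalanced.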
Practical Application of Lift and Gains Chart • 10 minutes
1 assignment • 10 minutes total
Module 4: Assess Your Learning: Performance Evaluation Metrics • 10 minutes
Foundational Classification Algorithms - Part 1
Module 5
In this module, we continue building your understanding and application of essential machine learning techniques. By mastering foundational classification algorithms such as KNN, LDA, and logistic regression, you'll gain the tools to tackle practical data mining tasks effectively. Through real-world dataset analysis, you'll learn to implement these algorithms, extract insights from data, and make informed decisions in various domains.
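As a rough illustration of how KNN classifies a new point, the sketch below uses Euclidean distance and a majority vote among the k closest training points. The function name `knn_predict`, the choice of distance, and the toy data are assumptions made for this example, not the course's own implementation. The vote share it returns is one simple way to read a propensity-style score from KNN.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return (predicted_label, vote_share) using the k nearest neighbors.

    Illustrative helper: Euclidean distance, unweighted majority vote.
    """
    # Distance from the query to every training point, nearest first.
    neighbors = sorted(
        ((math.dist(point, query), label)
         for point, label in zip(train_points, train_labels)),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in neighbors[:k])
    label, count = votes.most_common(1)[0]
    return label, count / k


# Hypothetical two-class training set.
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, (1.5, 1.5), k=3))
```

Small values of k follow the training data closely and tend to overfit, while large values smooth the decision boundary, so k is usually chosen by comparing validation error across several candidates.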
What's included
6 videos • 9 readings • 1 assignment
6 videos • 42 minutes total
Classification • 6 minutes
K-Nearest Neighbors (KNN) Model Distances • 10 minutes
Performing KNN, Picking Best K, Propensity Score, and Regression Prediction • 9 minutes
Logistic Regression, Intuitions, Odds/Logits, and Interpretation • 9 minutes
Parameter Estimation • 3 minutes
Multiclass Classification • 4 minutes
9 readings • 72 minutes total
Classification • 35 minutes
K-Nearest Neighbors (KNN) Model Distances • 10 minutes
Performing KNN, Picking Best K, Propensity Score, and Regression Prediction • 1 minute
KNN Example • 10 minutes
KNN–Advantages and Limitations • 3 minutes
Logistic Regression, Intuitions, Odds/Logits, and Interpretation • 1 minute
Parameter Estimation • 1 minute
Logistic Regression Example • 10 minutes
Multiclass Classification • 1 minute
1 assignment • 10 minutes total
Module 5: Assess Your Learning: Foundational Classification Algorithms - Part 1 • 10 minutes
Foundational Classification Algorithms - Part 2
Module 6
This module continues the study of classification algorithms. We'll cover foundational techniques such as decision trees, the Bayes classifier, and ensemble learning as you learn to analyze real-world datasets with confidence. After covering the Bayes classifier, we will move on to decision trees and their use in regression tasks. Finally, we will turn to ensemble learning. By the end of the module, you'll have the knowledge and skills to implement these algorithms effectively in your data mining work.
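The core step of growing a decision tree is choosing a split. The sketch below searches one numeric feature for the threshold that minimizes weighted Gini impurity. Gini is an assumption here, since the description does not say which impurity measure the course uses (entropy is a common alternative), and the names `gini` and `best_split` are illustrative.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split of the form
    value <= threshold, or (None, parent_gini) if no split helps.

    Illustrative helper for a single numeric feature.
    """
    pairs = sorted(zip(values, labels), key=lambda pair: pair[0])
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        # Only consider thresholds between two distinct feature values.
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical feature values with their class labels.
print(best_split([1, 2, 3, 10, 11, 12], ["no", "no", "no", "yes", "yes", "yes"]))
```

A full tree repeats this search over every feature and then recurses into each side of the chosen split. Ensemble methods such as random forests train many such trees on resampled data and combine their votes.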
What's included
4 videos • 12 readings • 1 assignment
4 videos • 44 minutes total
Bayes Classifier • 14 minutes
Decision Trees • 11 minutes
Decision Trees: Regression Analysis • 9 minutes
Ensemble Learning • 9 minutes
12 readings • 170 minutes total
Bayes Classifier • 1 minute
Naive Bayes • 10 minutes
Bayesian Decision Theory • 5 minutes
How to Apply Bayes Classifier in a Dataset • 8 minutes
How to Apply Naive Bayes Classifier in a Dataset • 10 minutes
Module 6: Assess Your Learning: Foundational Classification Algorithms - Part 2 • 10 minutes
Key Regression Techniques - Part 1
Module 7
In this module, we cover essential regression techniques, equipping you with the skills to analyze and model real-world data. Through hands-on lessons, you will learn the fundamentals of linear, multiple, and logistic regression and practice implementing these methods on diverse datasets for predictive modeling. Lessons range from understanding linear regression and calculating coefficients to exploring polynomial regression and feature selection. By the end of this module, you will have a solid understanding of regression techniques, enabling you to make informed decisions and draw useful insights from data.
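For simple linear regression, the coefficients that minimize the mean squared error cost have a closed form. The sketch below applies it to a small made-up dataset. The function names and data are assumptions for illustration, and the course may instead present the matrix form or gradient descent.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept.

    Illustrative helper; assumes xs contains at least two distinct values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mse_cost(xs, ys, slope, intercept):
    """Mean squared error of the fitted line on the given data."""
    errors = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


# Hypothetical data points.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept, mse_cost(xs, ys, slope, intercept))
```

Polynomial regression reuses the same least-squares idea with extra input columns such as x squared, which turns it into a multiple regression problem.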
What's included
3 videos • 6 readings • 1 assignment
3 videos • 22 minutes total
Linear Regression • 11 minutes
Linear Regression: Calculating Coefficients and Minimizing Cost Function • 6 minutes
Polynomial Regression • 5 minutes
6 readings • 26 minutes total
Linear Regression • 1 minute
Linear Regression: Calculating Coefficients and Minimizing Cost Function • 1 minute
Applying Linear Regression in a Dataset • 10 minutes
Advantages and Disadvantages of Linear Regression Models • 3 minutes
Polynomial Regression • 1 minute
Congratulations! • 10 minutes
1 assignment • 10 minutes total
Module 7: Assess Your Learning: Linear, Multiple and Logistic Regression Techniques • 10 minutes
Founded in 1898, Northeastern is a global research university with a distinctive, experience-driven approach to education and discovery. The university is a leader in experiential learning, powered by the world’s most far-reaching cooperative education program. The spirit of collaboration guides a use-inspired research enterprise focused on solving global challenges in health, security, and sustainability.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. It also means that you will not be able to purchase a Certificate experience.
What will I get if I purchase the Certificate?
When you purchase a Certificate you get access to all course materials, including graded assignments. Upon completing the course, your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.