This course covers both the theory and the practical application of data mining in engineering. It reviews the fundamentals and central concepts of data mining, introduces the principal methodologies, and shows how to carry them out with specific algorithms. Students will work with a range of techniques, including data preprocessing, association rule extraction, classification, prediction, clustering, and the exploration of complex data, and will complete a capstone project that applies them. Case studies illustrate how data mining is used in manufacturing, healthcare, medicine, business, and various service industries, among other sectors.
In this module, participants will explore essential data concepts across domains, understanding diverse data types, attributes, and features. They will grasp the fundamental principles, methodologies, and scope of data mining, enabling them to effectively analyze data and extract valuable insights. Through this comprehensive approach, learners will gain proficiency in utilizing key data concepts, facilitating informed decision-making and innovation across various domains.
What's included
5 videos, 8 readings, 2 assignments, 2 discussion prompts
5 videos • Total 23 minutes
Meet Your Faculty: Kiran Trivedi • 1 minute
Course Introduction • 2 minutes
Intro to Data Mining-Data • 6 minutes
Intro to Data Mining-Mining • 6 minutes
Data Mining Techniques • 9 minutes
8 readings • Total 24 minutes
Welcome to Data Mining • 2 minutes
Syllabus - Practical Engineering Data Mining: Techniques and Uses • 5 minutes
Academic Integrity • 1 minute
Intro to Data Mining-Data • 2 minutes
Intro to Data Mining-Mining • 5 minutes
Data Mining Life Cycle • 5 minutes
Data Mining Techniques • 1 minute
Types of Machine Learning • 3 minutes
2 assignments • Total 20 minutes
Check Your Knowledge: Data Mining • 10 minutes
Check Your Knowledge: CRISP-DM Transformation • 10 minutes
2 discussion prompts • Total 55 minutes
Meet Your Fellow Learners • 10 minutes
Intro to Data Mining • 45 minutes
Exploratory Data Analysis and Visualization
Module 2
Module details
This module aims to build a thorough understanding of data concepts across domains. Participants will learn to distinguish among data types, attributes, and features, and will explore the fundamental principles and methodologies of data mining so that they can extract meaningful insights from datasets. These skills prepare learners to analyze data effectively and make informed decisions in diverse professional settings.
What's included
3 videos, 13 readings, 1 assignment, 1 discussion prompt
3 videos • Total 13 minutes
Exploratory Data Analysis (EDA) • 4 minutes
Data Cleaning and Preprocessing • 6 minutes
Data Transformation • 3 minutes
13 readings • Total 94 minutes
Exploratory Data Analysis (EDA) • 30 minutes
Data Cleaning and Preprocessing • 1 minute
Data Cleaning and Preprocessing Methods • 7 minutes
Data Visualization Techniques: Charts and Plots • 3 minutes
Bar and Pie Charts • 5 minutes
Line Graphs and Scatter Plots • 5 minutes
Pearson Correlation, Pair Plots, and Radar Charts • 8 minutes
Parallel Coordinates Plot and Sankey Plot • 5 minutes
Histograms, Box Plots and Violin Chart • 8 minutes
Area Plots and Bubble Charts • 5 minutes
Heat, Tree, and Choropleth Maps • 8 minutes
Word Clouds and Network Graphs • 8 minutes
Data Transformation • 1 minute
1 assignment • Total 30 minutes
Check Your Knowledge: Data Visualization Techniques • 30 minutes
1 discussion prompt • Total 45 minutes
Exploratory Data Analysis • 45 minutes
Dimensionality Reduction
Module 3
Module details
In this module, we explore dimensionality reduction, a family of techniques for simplifying complex datasets so that analysis and visualization become more efficient. By implementing methods such as Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE), we will see how to reduce the number of features while preserving essential information. We will learn to select and apply the most suitable technique for a given data type and analytical goal, which can improve model performance and interpretability. The module provides the tools to extract meaningful insights from high-dimensional datasets, supporting more effective data analysis and decision-making.
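As a rough illustration of the PCA idea named above, the sketch below centers a small dataset, builds its covariance matrix, and estimates the first principal component by power iteration. It is not taken from the course materials: the function names, the sample data, and the choice of power iteration (instead of a full eigendecomposition from a numerical library) are assumptions made to keep the example dependency-free.

```python
import math

def center(rows):
    """Subtract the per-feature mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance_matrix(centered):
    """Sample covariance matrix (divides by n - 1) of already-centered rows."""
    n = len(centered)
    d = len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_principal_component(cov, iterations=200):
    """Estimate the dominant eigenvector and eigenvalue by power iteration.

    Caveat: this fails if the start vector happens to be orthogonal to the
    dominant eigenvector; the start vector below is an arbitrary choice.
    """
    d = len(cov)
    v = [float(i + 1) for i in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # zero covariance: no direction to find
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the variance along v (v has unit length).
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue

def explained_variance_ratio(cov, eigenvalue):
    """Share of total variance (trace of cov) captured by one component."""
    total = sum(cov[i][i] for i in range(len(cov)))
    return eigenvalue / total

def project(centered, direction):
    """Reduce each centered row to a single score along the given direction."""
    return [sum(x * w for x, w in zip(row, direction)) for row in centered]

# Hypothetical two-feature sample, used only to exercise the functions.
sample = [[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]]
sample_centered = center(sample)
sample_cov = covariance_matrix(sample_centered)
direction, variance = first_principal_component(sample_cov)
print("direction:", direction)
print("explained variance ratio:", explained_variance_ratio(sample_cov, variance))
print("1-D scores:", project(sample_centered, direction))
```

Projecting onto the single direction reduces two features to one score per row. In practice a library routine that returns all components at once would normally be preferred over hand-written power iteration.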
What's included
4 videos, 9 readings, 1 assignment, 1 discussion prompt
4 videos • Total 39 minutes
Feature Selection–Linear Methods PCA • 8 minutes
PCA • 11 minutes
t-SNE: What is it? • 5 minutes
t-SNE: How it Works • 15 minutes
9 readings • Total 63 minutes
Dimensionality Reduction • 3 minutes
Why Dimensionality Reduction? • 5 minutes
Feature Selection and Extraction • 3 minutes
Feature Selection–Linear Methods PCA • 1 minute
Principal Component Analysis (PCA) • 1 minute
Covariance Matrix • 5 minutes
Correlation Matrix • 5 minutes
PCA Example • 15 minutes
t-SNE • 25 minutes
1 assignment • Total 5 minutes
Check Your Knowledge: Dimensionality Reduction • 5 minutes
1 discussion prompt • Total 45 minutes
Dimensionality Reduction • 45 minutes
Performance Evaluation Metrics
Module 4
Module details
This module introduces the bias-variance trade-off in machine learning. Models that generalize well must balance bias against variance to avoid both underfitting and overfitting. Bias is the error introduced by oversimplifying a complex problem, while variance measures the model's sensitivity to the particular training data subset it sees. We explore strategies for reducing bias and variance so that models strike the right balance between accuracy and generalization. Turning to regression, we look at practical metrics for measuring and evaluating model performance, focusing on Root Mean Squared Error (RMSE). Finally, we cover performance assessment for binary classification tasks, including the F1-Score, Matthews Correlation Coefficient (MCC), propensity scores, and the AUC-ROC curve.
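To make the metrics named above concrete, here is a minimal sketch of RMSE, the F1-Score, MCC, and AUC-ROC using their standard textbook definitions. The function names and the convention that labels are 0 (negative) and 1 (positive) are assumptions of this example, not something specified by the course; the AUC is computed by direct pairwise comparison, which is simple but slow for large datasets.

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Squared Error for a regression task."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp, fp, fn, tn):
    """Matthews Correlation Coefficient in [-1, 1]; 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def auc_roc(y_true, scores):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive example gets the higher score; ties
    count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predictions, used only to exercise the functions.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
tp, fp, fn, tn = confusion_counts(labels, predicted)
print("F1:", f1_score(tp, fp, fn))
print("MCC:", mcc(tp, fp, fn, tn))
print("RMSE:", rmse([3.0, 5.0, 2.0], [2.0, 5.0, 4.0]))
print("AUC:", auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

F1 ignores true negatives, whereas MCC uses all four cells of the confusion matrix, which is one reason the two can disagree on imbalanced data.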
Founded in 1898, Northeastern is a global research university with a distinctive, experience-driven approach to education and discovery. The university is a leader in experiential learning, powered by the world’s most far-reaching cooperative education program. The spirit of collaboration guides a use-inspired research enterprise focused on solving global challenges in health, security, and sustainability.