Put the keystone in your Python data science skills by becoming proficient in data visualization and modeling. This course suits intermediate programmers with some NumPy and Pandas experience who want to expand their skills for a career in data science. Whether you come to data science from the social sciences and statistics or from a programming background, this course integrates the two perspectives and offers insights from each.
You’ll begin by becoming adept with matplotlib, an essential Python plotting library that will help you discover and communicate insights about data. You’ll progress to classification by creating a K-Nearest Neighbors (KNN) classifier, a foundational algorithm in data science and machine learning. Finally, you’ll write Python programs that apply inferential statistics, so that you can describe relationships between variables in your data.
By the end of the course, you’ll be able to quickly visualize a dataset, explore it for insights, determine relationships between variables, and communicate it all with effective plots. In the last module of this course, you’ll produce a publication-quality figure based on data that you’ve prepared and cleaned yourself: the first artifact in your data science portfolio.
Throughout this course you’ll get plenty of hands-on experience through interactive programming assignments, live coding demos from data scientists, and analysis of the data behind important real-world problems (like carbon emissions, real estate prices, and infant mortality). Guided activities in each module will reinforce your proficiency with data science techniques and your analytical approach as a data scientist.
Solidify your understanding of these critical data science concepts and begin your data science portfolio by mastering visualization and modeling. Start this integrative and transformative learning journey today!
In this module, you will learn about plotting in Python, an important technique for exploring a dataset and an indispensable tool for communicating insights. You’ll learn to make the most common types of plots used in data science, from basics like line, bar, and scatter plots to more advanced types such as histograms and heatmaps. You’ll learn both how to make these plots and how to customize them using matplotlib, a core Python plotting library that serves as the backbone for many other Python plotting tools. You’ll learn how to create professional, accessible, and information-rich plots, which you will use to quickly identify trends in data that would otherwise be difficult to recognize. We've also included some optional additional readings if you want to further enhance your learning!
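As a rough illustration of the kind of matplotlib workflow this module describes, the sketch below draws a line plot and a histogram side by side and renders the figure as a PNG. The data are synthetic, and every variable name here is a placeholder rather than course material.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data, made up purely for illustration.
rng = np.random.default_rng(seed=0)
x = np.linspace(0, 10, 50)
y = np.sin(x)
samples = rng.normal(loc=0.0, scale=1.0, size=500)

# One figure holding two axes: a line plot and a histogram.
fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

ax_line.plot(x, y, color="tab:blue", label="sin(x)")
ax_line.set_xlabel("x")
ax_line.set_ylabel("y")
ax_line.set_title("Line plot")
ax_line.legend()

ax_hist.hist(samples, bins=20, color="tab:orange", edgecolor="white")
ax_hist.set_xlabel("value")
ax_hist.set_ylabel("count")
ax_hist.set_title("Histogram")

fig.tight_layout()

# Render to an in-memory PNG; passing a file name instead would write to disk.
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=150)
```

Calling methods on the `fig` and `ax` objects directly, rather than on `plt`, keeps it clear which axes each customization applies to once a figure has more than one panel.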
What's included
1 video, 30 readings, 1 assignment, 5 ungraded labs
1 video • Total 4 minutes
Why Do Data Scientists Code? • 4 minutes
30 readings • Total 388 minutes
Is This Course For Me? (Given My Knowledge of Machine Learning and/or Statistics)? • 5 minutes
Report a problem with the course • 5 minutes
Plotting Introduction • 5 minutes
Effective Plotting Practices • 8 minutes
Basic Plotting with MatPlotLib • 10 minutes
A Figure in Ten Pieces: Matplotlib Customization • 15 minutes
Plotting text (and a side note on axis scaling) • 20 minutes
Deep Dive: Bar Plots • 20 minutes
Stack Plots (Optional) • 15 minutes
Pie Charts • 15 minutes
Subplots • 15 minutes
Deep Dive: Subplots • 15 minutes
Deep Dive: Scatter Plots • 15 minutes
Error Bars • 10 minutes
Heat Maps • 10 minutes
Histograms • 10 minutes
Two Dimensional Histograms (Optional) • 10 minutes
Tying together Histograms (Optional) • 15 minutes
Legends • 15 minutes
Saving to File • 10 minutes
Explicit and Implicit Syntax • 10 minutes
Plotting Zoo: Multiple Ways of Visualization • 10 minutes
Plotting with Pandas • 10 minutes
Customizing Plot Styles • 20 minutes
Plotting Zoo Restyled • 15 minutes
The Matplotlib Model • 15 minutes
Making Plots Pretty: Laying the Foundation • 15 minutes
Making Plots Pretty: The Process • 10 minutes
Plotting with Seaborn • 25 minutes
Seaborn Object Recipes • 15 minutes
1 assignment • Total 30 minutes
Module 1 Wrap Up Quiz • 30 minutes
5 ungraded labs • Total 300 minutes
Exercise: Anscombe's Quartet • 60 minutes
Exercise: Creating a Series of Timeseries • 60 minutes
Practice Exercise: Histograms of Poker Outcomes • 60 minutes
Practice Exercise: Creating a Custom Bar Chart • 60 minutes
Practice Exercise: Creating a Custom Stack Plot • 60 minutes
Prediction
Module 2
Module details
In this module, you will learn the basics of using code to make predictions from data. After discussing what prediction is, you’ll learn to describe the concepts that underlie predictive algorithms in the context of the K-Nearest Neighbors (KNN) algorithm for both classification and regression. You’ll also learn to evaluate the accuracy of a predictive algorithm to assess how well it generalizes to new data. You will build your own KNN classification and regression algorithms from scratch and make predictions with each of them. The module ends with a quiz that lets you evaluate your understanding of predictive algorithms and reflect on your experience implementing your own.
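To make the idea of a from-scratch KNN classifier concrete, here is a minimal sketch. It assumes Euclidean distance and a simple majority vote, which may differ from the course's own implementation; the function name `knn_predict` and the toy data are invented for illustration.

```python
import numpy as np


def knn_predict(X_train, y_train, X_new, k=3):
    """Label each row of X_new by majority vote among its k nearest training points."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    X_new = np.asarray(X_new, dtype=float)

    predictions = []
    for point in X_new:
        # Euclidean distance from this point to every training point.
        distances = np.sqrt(((X_train - point) ** 2).sum(axis=1))
        # Indices of the k closest training points.
        nearest = np.argsort(distances)[:k]
        # Majority vote; ties resolve to the smallest label value.
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        predictions.append(labels[np.argmax(counts)])
    return np.array(predictions)


# Toy data (made up): two well-separated clusters labeled 0 and 1.
X_train = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [4.0, 4.2], [4.1, 3.9], [3.8, 4.0]]
y_train = [0, 0, 0, 1, 1, 1]

# Held-out points, used to estimate how well the classifier generalizes.
X_test = [[1.1, 1.0], [4.0, 4.0]]
y_test = [0, 1]

y_pred = knn_predict(X_train, y_train, X_test, k=3)
accuracy = (y_pred == np.array(y_test)).mean()
```

Measuring accuracy on points that were not used for training is what gives an estimate of generalization; accuracy on the training set itself would be overly optimistic. Replacing the majority vote with the mean of the neighbors' target values turns the same procedure into KNN regression.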
What's included
1 video, 7 readings, 1 assignment, 5 ungraded labs
1 video • Total 52 minutes
Live Coding: Creating and Evaluating a KNN Classifier • 52 minutes
Interactive Reading: KNN for Regression: Application • 60 minutes
Regression
Module 3
Module details
In this module, you will learn how to describe the differences between prediction and inference, two key data science concepts. You’ll learn how to implement linear regression, one of the most useful tools data scientists have for inference and prediction, along with other statistical models in Python. You’ll apply this knowledge by examining a dataset, regressing multiple variables on each other, and describing what the results reveal about their relationships.
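A minimal sketch of fitting a linear regression in Python follows. It uses NumPy's least-squares solver as a stand-in, not necessarily the modeling library the course teaches, and the data are synthetic with a known intercept and slope so the estimates can be checked against the truth.

```python
import numpy as np

# Synthetic data: y depends linearly on x (intercept 2.0, slope 0.5) plus noise.
rng = np.random.default_rng(seed=42)
x = rng.uniform(0, 10, size=200)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=200)

# Design matrix: a column of ones for the intercept, then the predictor.
X = np.column_stack([np.ones_like(x), x])

# Ordinary least squares: find coefficients minimizing the squared residuals.
coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef

# R^2: the share of variance in y explained by the fitted line.
y_hat = X @ coef
r_squared = 1 - ((y - y_hat) ** 2).sum() / ((y - y.mean()) ** 2).sum()
```

Reading `slope` as "the expected change in y per unit change in x" is the inference use of the model, while computing `y_hat` for new values of x is the prediction use.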
What's included
1 video, 7 readings, 1 assignment, 2 ungraded labs
1 video • Total 26 minutes
Live Coding: Exploring Data with Linear Regression • 26 minutes
7 readings • Total 135 minutes
Linear Regression in Python • 10 minutes
Linear Regression: A Brief Introduction • 20 minutes
A Brief Intro to Categorical Variables • 20 minutes
Inference vs. Prediction in Data Science • 15 minutes
Linear Regression in Python • 25 minutes
From Pandas to Numpy with patsy • 20 minutes
Optional Reading: Linear Regression Extensions • 25 minutes
1 assignment • Total 30 minutes
Module 3 Wrap Up Quiz • 30 minutes
2 ungraded labs • Total 120 minutes
Lab Used in Module 3 Live Coding Video • 60 minutes
Module 3 Wrap Up Lab (For Graded Quiz) • 60 minutes
Final Project
Module 4
Module details
In this module, you’ll bring together the concepts and skills you’ve developed throughout the course to create a final project for your data science portfolio. You’ll recreate a now-famous data visualization that illustrates the relationship between the income of countries and their greenhouse gas emissions on a global scale. To do this, you’ll explore and prepare four datasets and merge them into a composite dataset that you’ll plot. Creating this merged dataset is an important step, and you’ll validate it with a short quiz on the insights within. The end result will be a publication-quality plot that makes a compelling point about the relationship between emissions and income: an impactful visualization that showcases your growing programming skills for data science applications.
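Merging several datasets and then checking the result could look roughly like the sketch below. It uses two tiny made-up tables instead of the project's four datasets, and the column names (`country`, `gdp_per_capita`, `co2_per_capita`) are hypothetical rather than taken from the course data.

```python
import pandas as pd

# Made-up tables standing in for two of the project's datasets.
income = pd.DataFrame(
    {"country": ["A", "B", "C"], "gdp_per_capita": [1000.0, 20000.0, 45000.0]}
)
emissions = pd.DataFrame(
    {"country": ["A", "B", "D"], "co2_per_capita": [0.5, 6.0, 9.0]}
)

# Outer merge keeps every country from both tables.
# validate= raises an error if a key is unexpectedly duplicated;
# indicator= adds a "_merge" column recording where each row came from.
merged = income.merge(
    emissions, on="country", how="outer", validate="one_to_one", indicator=True
)

# Rows found in only one table deserve a look before plotting.
unmatched = merged[merged["_merge"] != "both"]

# Rows with both income and emissions values are ready to plot.
complete = merged[merged["_merge"] == "both"].drop(columns="_merge")
```

Inspecting `unmatched` before discarding anything helps catch mismatched country names or codes, a common reason rows silently drop out of a merged dataset.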
Duke University has about 13,000 undergraduate and graduate students and a world-class faculty helping to expand the frontiers of knowledge. The university has a strong commitment to applying knowledge in service to society, both near its North Carolina campus and around the world.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Specialization?
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.