The field of natural language processing (NLP) aims to get computers to perform useful and interesting tasks with human language. This course introduces students to the three pillars underlying modern NLP: probabilistic language models, simple neural networks with a focus on gradient-based learning, and vector-based meaning representations in the form of word embeddings. At the end of the course, students will be able to implement and analyze probabilistic language models based on N-grams, text classifiers using logistic regression and gradient-based learning, and vector-based approaches to word meaning and text classification.
This course can be taken for academic credit as part of CU Boulder’s MS in Data Science or MS in Computer Science degrees offered on the Coursera platform. These fully accredited graduate degrees offer targeted courses, short 8-week sessions, and pay-as-you-go tuition. Admission is based on performance in three preliminary courses, not academic history. CU degrees on Coursera are ideal for recent graduates or working professionals. Learn more:
MS in Data Science: https://hua.dididi.sbs/degrees/master-of-science-data-science-boulder
MS in Computer Science: https://coursera.org/degrees/ms-computer-science-boulder
This first week of Fundamentals of Natural Language Processing introduces the fundamental concepts of natural language processing (NLP), focusing on how computers process and analyze human language. You will explore key linguistic structures, including words and morphology, and learn essential techniques for text normalization and tokenization.
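As a rough illustration of what text normalization and tokenization involve, here is a minimal Python sketch. It is not taken from the course materials: the token pattern `TOKEN_RE` and the helper names `normalize` and `tokenize` are assumptions chosen for this example, and real tokenizers handle many more cases (URLs, abbreviations, emoji, and so on).

```python
import re
import unicodedata

# Hypothetical token pattern (an assumption for this sketch): a word, optionally
# joined to further word parts by an apostrophe or hyphen, or one punctuation mark.
TOKEN_RE = re.compile(r"\w+(?:['’-]\w+)*|[^\w\s]")

def normalize(text: str) -> str:
    """Apply Unicode NFKC normalization, lowercase, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    text = text.lower()
    return re.sub(r"\s+", " ", text).strip()

def tokenize(text: str) -> list[str]:
    """Normalize the text, then split it into word and punctuation tokens."""
    return TOKEN_RE.findall(normalize(text))

tokens = tokenize("The cats   weren't chasing state-of-the-art mice!")
print(tokens)
# ['the', 'cats', "weren't", 'chasing', 'state-of-the-art', 'mice', '!']
```

Keeping normalization separate from tokenization makes it easy to swap in a different scheme, such as subword tokenization, without touching the cleanup step.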
What's included
5 videos, 8 readings, 2 assignments
5 videos • Total 56 minutes
Meet Your Instructor • 1 minute
Course Introduction • 7 minutes
Morphology • 16 minutes
Text Normalization • 17 minutes
Subword Tokenization • 15 minutes
8 readings • Total 141 minutes
Course Updates and Accessibility Support • 1 minute
Earn Academic Credit for Your Work! • 10 minutes
Course Support • 10 minutes
Assessment Expectations • 5 minutes
AI Citation and Acknowledgement • 10 minutes
Morphology • 30 minutes
Text Normalization • 60 minutes
Byte-Pair Encoding • 15 minutes
2 assignments • Total 35 minutes
AI Policy Quiz • 5 minutes
Quiz 1: Morphology and Tokenization • 30 minutes
Probabilistic Language Models
Module 2
This week explores foundational language modeling techniques, focusing on n-gram models and their role in statistical Natural Language Processing. You will learn how n-gram language models are constructed, smoothed, and evaluated for effectiveness.
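To make the construct, smooth, and evaluate cycle concrete, the following is a small sketch of a bigram model with add-one (Laplace) smoothing and a perplexity computation. It is an illustration under stated assumptions rather than the course's assignment code: the function names, the `<s>` and `</s>` boundary markers, and the toy corpus are all hypothetical, and the sketch assumes every evaluated word is already in the training vocabulary.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # hypothetical sentence-boundary markers

def train_bigram(sentences):
    """Count bigrams and their left contexts over tokenized sentences."""
    context_counts, bigram_counts, vocab = Counter(), Counter(), {EOS}
    for sent in sentences:
        tokens = [BOS] + sent + [EOS]
        vocab.update(sent)                       # words the model can predict
        context_counts.update(tokens[:-1])       # every token that has a successor
        bigram_counts.update(zip(tokens, tokens[1:]))
    return context_counts, bigram_counts, vocab

def prob(prev, word, model):
    """Add-one smoothed estimate of P(word | prev)."""
    context_counts, bigram_counts, vocab = model
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + len(vocab))

def perplexity(sentences, model):
    """Exponentiated average negative log-probability per predicted token."""
    log_prob, n = 0.0, 0
    for sent in sentences:
        tokens = [BOS] + sent + [EOS]
        for prev, word in zip(tokens, tokens[1:]):
            log_prob += math.log(prob(prev, word, model))
            n += 1
    return math.exp(-log_prob / n)

corpus = [["i", "like", "nlp"], ["i", "like", "models"]]  # toy data, made up
model = train_bigram(corpus)
print(prob("i", "like", model))   # (2 + 1) / (2 + 5)
print(perplexity(corpus, model))
```

Add-one smoothing is used here only because it is the simplest scheme to write down; interpolation and backoff generally work better in practice.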
What's included
4 videos, 4 readings, 1 assignment, 1 programming assignment
4 videos • Total 61 minutes
Introducing Language Models • 14 minutes
N-Gram Based Language Models • 16 minutes
Smoothing N-Gram Language Models • 22 minutes
Evaluating Language Models • 9 minutes
4 readings • Total 80 minutes
N-Gram Language Models: Introduction • 10 minutes
N-Gram Language Models: N-Grams • 20 minutes
N-Gram Language Models: Smoothing, Interpolation, and Backoff • 20 minutes
Evaluating Language Models • 30 minutes
1 assignment • Total 30 minutes
Quiz 2: Language Models • 30 minutes
1 programming assignment • Total 180 minutes
Constructing a Language Model • 180 minutes
Text Classification and Logistic Regression
Module 3
This week introduces text classification and explores logistic regression as a powerful classification technique. You will learn how logistic regression models work, including key mathematical concepts such as the logit function, gradients, and stochastic gradient descent. The week also covers evaluation metrics for assessing classifier performance.
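The pieces named above (the logit, the gradient, and stochastic gradient descent) fit together in only a few lines. The sketch below is a hedged illustration, not the course's reference implementation: the two-feature representation (counts of positive and negative words), the toy data, and the learning rate and epoch count are assumptions made for this example.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Log-odds of a probability; the inverse of the sigmoid."""
    return math.log(p / (1.0 - p))

def predict(x, w, b):
    """P(y = 1 | x) under a logistic regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train_sgd(examples, lr=0.5, epochs=200):
    """Fit weights by SGD on the cross-entropy loss, one example at a time."""
    w, b = [0.0] * len(examples[0][0]), 0.0
    for _ in range(epochs):
        for x, y in examples:
            err = predict(x, w, b) - y  # derivative of the loss w.r.t. the score
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

# Toy features, made up for illustration: [positive-word count, negative-word count]
data = [([2, 0], 1), ([1, 0], 1), ([0, 2], 0), ([0, 1], 0)]
w, b = train_sgd(data)
for x, y in data:
    print(x, y, round(predict(x, w, b), 3))
```

The update uses the fact that, for cross-entropy loss, the gradient with respect to each weight is the prediction error times the corresponding feature value.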
What's included
6 videos, 3 readings, 1 assignment, 1 programming assignment
6 videos • Total 91 minutes
Introduction to Text Classification • 10 minutes
Logistic Regression • 16 minutes
Introducing the Logit • 7 minutes
Learning in Logistic Regression • 19 minutes
Learning Algorithms for Logistic Regression • 17 minutes
Evaluating Classifiers • 21 minutes
3 readings • Total 125 minutes
Introduction to Text Classification • 35 minutes
Logistic Regression • 60 minutes
Evaluating Classifiers • 30 minutes
1 assignment • Total 30 minutes
Quiz 3: Logistic Regression • 30 minutes
1 programming assignment • Total 180 minutes
Sentiment Classification with Logistic Regression • 180 minutes
Vector Space Semantics and Word Embeddings
Module 4
This final week explores how words can be represented as vectors in a high-dimensional space, allowing computational models to capture semantic relationships between words. You will learn about both sparse and dense vector representations, including TF-IDF, Pointwise Mutual Information (PMI), Latent Semantic Analysis (LSA), and Word2Vec. The module also covers techniques for evaluating and applying word embeddings.
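As one possible illustration of a sparse vector representation, the sketch below builds word vectors weighted by positive PMI (PPMI) from windowed co-occurrence counts and compares them with cosine similarity. It covers only the PMI idea; TF-IDF, LSA, and Word2Vec are not shown. The function names, the window size, and the toy sentences are assumptions for this example and do not come from the course.

```python
import math
from collections import Counter, defaultdict

def cooccurrence(sentences, window=2):
    """Count how often each context word appears within `window` tokens of a word."""
    counts = defaultdict(Counter)
    for sent in sentences:
        for i, word in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][sent[j]] += 1
    return counts

def ppmi_vectors(counts):
    """Turn raw counts into sparse vectors of PMI values, keeping positive ones only."""
    total = sum(sum(ctx.values()) for ctx in counts.values())
    row = {w: sum(ctx.values()) for w, ctx in counts.items()}
    col = Counter()
    for ctx in counts.values():
        col.update(ctx)
    vectors = {}
    for w, ctx in counts.items():
        vectors[w] = {}
        for c, n in ctx.items():
            # PMI(w, c) = log2( P(w, c) / (P(w) * P(c)) )
            pmi = math.log2(n * total / (row[w] * col[c]))
            if pmi > 0:
                vectors[w][c] = pmi
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Toy corpus, made up for illustration
sentences = [
    ["the", "cat", "drinks", "milk"],
    ["the", "dog", "drinks", "water"],
    ["the", "cat", "chases", "the", "dog"],
]
vectors = ppmi_vectors(cooccurrence(sentences))
print(cosine(vectors["cat"], vectors["dog"]))
```

Storing each vector as a dict keeps the representation sparse: only contexts with positive PMI take up space, which matters once the vocabulary is large.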
CU Boulder is a dynamic community of scholars and learners on one of the most spectacular college campuses in the country. As one of 34 U.S. public institutions in the prestigious Association of American Universities (AAU), we have a proud tradition of academic excellence, with five Nobel laureates and more than 50 members of prestigious academic academies.
What is the recommended background for this course?
Learners should be proficient in Python programming, including the use of packages such as NumPy, scikit-learn, and pandas. Students should be proficient in data structures and basic topics in algorithm design, such as sorting and searching, dynamic programming, and algorithm analysis. Students should also have basic familiarity with introductory concepts from calculus, discrete probability, and linear algebra.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. It also means that you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Specialization?
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.