What is the recommended background for this course?

Learners should be proficient in Python programming, including the use of packages such as NumPy, scikit-learn, and pandas. They should also be proficient in data structures and basic topics in algorithm design, such as sorting and searching, dynamic programming, and algorithm analysis, and have basic familiarity with introductory concepts from calculus, discrete probability, and linear algebra.

When will I have access to the lectures and assignments?

To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade, but it also means that you will not be able to purchase a Certificate experience.

What will I get if I subscribe to this Specialization?

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page. From there, you can print your Certificate or add it to your LinkedIn profile.

Is financial aid available?

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.

Fundamentals of Natural Language Processing

Instructor: James Martin

2,849 already enrolled

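4 modules

Gain insight into a topic and learn the fundamentals.

17 reviews

Intermediate level

Recommended experience

Flexible schedule

2 weeks at 10 hours a week

Learn at your own pace

Build toward a degree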

What you'll learn

Analyze corpora to develop effective lexicons using subword tokenization.
Develop language models that can assign probabilities to texts.
Design, implement, and evaluate the effectiveness of text classifiers using gradient-based learning techniques.
Design, implement and evaluate unsupervised methods for learning word embeddings.

Skills you'll gain

Tools you'll learn

Classification Algorithms

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

5 assignments

Taught in English

There are 4 modules in this course

The field of natural language processing (NLP) aims at getting computers to perform useful and interesting tasks with human language. This course introduces students to the three pillars underlying modern NLP: probabilistic language models, simple neural networks with a focus on gradient-based learning, and vector-based meaning representations in the form of word embeddings. At the end of the course, students will be able to implement and analyze probabilistic language models based on N-grams, text classifiers using logistic regression and gradient-based learning, and vector-based approaches to word meaning and text classification.

This course can be taken for academic credit as part of CU Boulder’s MS in Data Science or MS in Computer Science degrees offered on the Coursera platform. These fully accredited graduate degrees offer targeted courses, short 8-week sessions, and pay-as-you-go tuition. Admission is based on performance in three preliminary courses, not academic history. CU degrees on Coursera are ideal for recent graduates or working professionals. Learn more:

MS in Data Science: https://hua.dididi.sbs/degrees/master-of-science-data-science-boulder
MS in Computer Science: https://coursera.org/degrees/ms-computer-science-boulder

Module details

The first week introduces the fundamental concepts of natural language processing (NLP), focusing on how computers process and analyze human language. You will explore key linguistic structures, including words and morphology, and learn essential techniques for text normalization and tokenization.
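As a rough illustration of the subword tokenization topic in this module, the following is a minimal sketch of learning byte-pair encoding (BPE) merge rules. It is not taken from the course materials: the function name `learn_bpe`, the `_` end-of-word marker, and the tie-breaking behavior (first pair seen wins) are all assumptions made for this example.

```python
from collections import Counter

def learn_bpe(words, num_merges):
    """Learn up to num_merges BPE merge rules from a list of word tokens.

    Hypothetical helper for illustration; assumes "_" never occurs in the input.
    """
    # Each word becomes a tuple of characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("_",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word, fusing each occurrence of the best pair.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges
```

Calling `learn_bpe(["hug"] * 10 + ["pug"] * 5 + ["hugs"] * 5, 2)` learns the pair `("u", "g")` first, because it is the most frequent adjacent pair in that toy corpus.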

What's included

5 videos, 8 readings, 2 assignments

5 videos, 56 minutes total

Meet Your Instructor (1 minute)
Course Introduction (7 minutes)
Morphology (16 minutes)
Text Normalization (17 minutes)
Subword Tokenization (15 minutes)

8 readings, 141 minutes total

Course Updates and Accessibility Support (1 minute)
Earn Academic Credit for Your Work! (10 minutes)
Course Support (10 minutes)
Assessment Expectations (5 minutes)
AI Citation and Acknowledgement (10 minutes)
Morphology (30 minutes)
Text Normalization (60 minutes)
Byte-Pair Encoding (15 minutes)

2 assignments, 35 minutes total

AI Policy Quiz (5 minutes)
Quiz 1: Morphology and Tokenization (30 minutes)

This week explores foundational language modeling techniques, focusing on n-gram models and their role in statistical Natural Language Processing. You will learn how n-gram language models are constructed, smoothed, and evaluated for effectiveness.
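To make the three steps named above concrete (construction, smoothing, evaluation), here is a minimal sketch of a bigram model with add-one (Laplace) smoothing, evaluated by perplexity. The function names and the `<s>` and `</s>` boundary tokens are assumptions for this example, and add-one is only one of several smoothing methods; the course may use others.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Collect unigram and bigram counts from tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(prev, word, unigrams, bigrams):
    """Add-one smoothed estimate of P(word | prev).

    Simplification: the vocabulary size counts "<s>" as well, so a little
    probability mass goes to a token that can never follow anything.
    """
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

def perplexity(sentences, unigrams, bigrams):
    """Perplexity: exp of the average negative log-probability per token."""
    log_prob, n = 0.0, 0
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            log_prob += math.log(bigram_prob(prev, word, unigrams, bigrams))
            n += 1
    return math.exp(-log_prob / n)
```

Lower perplexity means the model finds the text less surprising. A sentence seen in training should therefore score lower than one built from unseen bigrams.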

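What's included

4 videos, 4 readings, 1 assignment, 1 programming assignment

4 videos, 61 minutes total

Introducing Language Models (14 minutes)
N-Gram Based Language Models (16 minutes)
Smoothing N-Gram Language Models (22 minutes)
Evaluating Language Models (9 minutes)

4 readings, 80 minutes total

N-Gram Language Models: Introduction (10 minutes)
N-Gram Language Models: N-Grams (20 minutes)
N-Gram Language Models: Smoothing, Interpolation, and Backoff (20 minutes)
Evaluating Language Models (30 minutes)

1 assignment, 30 minutes total

Quiz 2: Language Models (30 minutes)

1 programming assignment, 180 minutes total

Constructing a Language Model (180 minutes)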

This week introduces text classification and explores logistic regression as a powerful classification technique. You will learn how logistic regression models work, including key mathematical concepts such as the logit function, gradients, and stochastic gradient descent. The week also covers evaluation metrics for assessing classifier performance.
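The ideas listed above can be sketched in a few lines of plain Python. This is an illustrative binary logistic regression trained with stochastic gradient descent on the cross-entropy loss, not the course's assignment code. The function names, the learning rate, the epoch count, and the two-feature representation (counts of positive and negative words) are assumptions.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logreg(examples, lr=0.1, epochs=200, seed=0):
    """Fit weights by SGD; examples is a list of (features, label), label in {0, 1}."""
    rng = random.Random(seed)
    w = [0.0] * len(examples[0][0])
    b = 0.0
    data = list(examples)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the cross-entropy loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(x, w, b):
    """Return 1 if the estimated P(y=1 | x) is at least 0.5, else 0."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0
```

With features of the form `[positive_word_count, negative_word_count]`, training on a handful of labeled examples should yield a positive weight on the first feature and a negative weight on the second.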

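What's included

6 videos, 3 readings, 1 assignment, 1 programming assignment

6 videos, 91 minutes total

Introduction to Text Classification (10 minutes)
Logistic Regression (16 minutes)
Introducing the Logit (7 minutes)
Learning in Logistic Regression (19 minutes)
Learning Algorithms for Logistic Regression (17 minutes)
Evaluating Classifiers (21 minutes)

3 readings, 125 minutes total

Introduction to Text Classification (35 minutes)
Logistic Regression (60 minutes)
Evaluating Classifiers (30 minutes)

1 assignment, 30 minutes total

Quiz 3: Logistic Regression (30 minutes)

1 programming assignment, 180 minutes total

Sentiment Classification with Logistic Regression (180 minutes)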

This final week explores how words can be represented as vectors in a high-dimensional space, allowing computational models to capture semantic relationships between words. You will learn about both sparse and dense vector representations, including TF-IDF, Pointwise Mutual Information (PMI), Latent Semantic Analysis (LSA), and Word2Vec. The module also covers techniques for evaluating and applying word embeddings.
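As one concrete instance of the sparse representations mentioned above, the sketch below builds positive PMI (PPMI) word vectors from windowed co-occurrence counts and compares them with cosine similarity. The function names, the window size, and the dictionary-based sparse vector format are assumptions for illustration. TF-IDF, LSA, and Word2Vec are not shown.

```python
import math
from collections import Counter

def cooccurrence(sentences, window=2):
    """Count (word, context) pairs within a symmetric window."""
    counts = Counter()
    for sent in sentences:
        for i, w in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[(w, sent[j])] += 1
    return counts

def ppmi_vectors(counts):
    """Map each word to a sparse dict {context: PPMI}; non-positive PMI is dropped."""
    total = sum(counts.values())
    word_tot, ctx_tot = Counter(), Counter()
    for (w, c), n in counts.items():
        word_tot[w] += n
        ctx_tot[c] += n
    vectors = {}
    for (w, c), n in counts.items():
        # PMI = log2( P(w, c) / (P(w) * P(c)) )
        pmi = math.log2(n * total / (word_tot[w] * ctx_tot[c]))
        if pmi > 0:
            vectors.setdefault(w, {})[c] = pmi
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(val * v[k] for k, val in u.items() if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

A word whose PMI with every context is zero or negative gets no entry in the result, so look vectors up with `vectors.get(word, {})` before passing them to `cosine`.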