CSE 5xx - Natural Language Processing
Teaching Assistant: Prateek Rawat
This course covers a broad range of topics in NLP, including basic text processing (such as tokenization and stemming), language modeling, morphology, syntax, dependency parsing, distributional and lexical semantics, word sense disambiguation, and information extraction. We will also introduce the underlying theory from probability, statistics, and machine learning that is essential for understanding fundamental NLP techniques such as language modeling and hidden Markov models (HMMs).
The course ends with more advanced topics in NLP, such as stylometry analysis, sentiment analysis, named-entity disambiguation, and machine translation. The term projects will give students hands-on experience in designing real-world NLP models.
1. Introduction, Regular Expressions, Text Normalization, and Edit Distance
2. Regular Language & FSA
3. Morphology & Finite-state Transducers
4. Probabilistic models & Spelling correction
5. N-grams, smoothing and entropy
6. HMM, Viterbi and A* decoding
7. Word classes and POS tagging
8. CFG for English and Parsing
9. Semantics: Introduction & Distributional semantics
10. Lexical semantics & Word sense disambiguation
11. Advanced topics: Text classification, Text summarization
12. Advanced topics: Sentiment analysis, Stylometry analysis
13. Advanced topics: Web mining, Named-entity disambiguation, Concluding remarks
Class Attendance/Performance 5%
Scientific Blog 5%
Please refer to the policy: http://iiitd.ac.in/education/resources/academic-dishonesty
1. CSE322 Theory of Computation
2. CSE222 Algorithm Design & Analysis
3. CSE101 Intro to Programming
1. Python/Java programming - desirable
Monday 11:30 am – 1:00 pm
Thursday 11:30 am – 1:00 pm
Open Hour: Mon & Thu, 2:30 pm – 3:00 pm
(Students can meet the instructor for discussion during the Open Hour.)
T.A. Hours: Monday 5:00 pm – 6:00 pm (Library Ground Floor)
Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition by Daniel Jurafsky and James H. Martin
Foundations of Statistical Natural Language Processing by Christopher D. Manning and Hinrich Schütze
Journal of Computational Linguistics, Transactions of the Association for Computational Linguistics, Journal of Information Retrieval, Journal of Machine Learning
ACL, EACL, EMNLP, NAACL, COLING, IJCNLP, SIGIR, WWW, ICON