CSE 5xx - Natural Language Processing

IIIT-Delhi

Monsoon 2017

4 credits

Instructor

Tanmoy Chakraborty

Teaching Assistant

Prateek Rawat

Overview

This course covers a broad range of topics in NLP, including basic text processing (such as tokenization and stemming), language modeling, morphology, syntax, dependency parsing, distributional and lexical semantics, word sense disambiguation, and information extraction. It also introduces the underlying theory from probability, statistics, and machine learning that is essential for understanding fundamental NLP techniques such as language models and HMMs.
The course ends with more advanced topics in NLP, such as stylometry analysis, sentiment analysis, named-entity disambiguation, and machine translation. The term projects give students hands-on experience in designing real-world NLP models.

Important Links

Description

Content:
   1. Introduction, Regular Expressions, Text Normalization, and Edit Distance
   2. Regular Language & FSA
   3. Morphology & Finite-state Transducers
   4. Probabilistic models & Spelling correction
   5. N-grams, smoothing and entropy
   6. HMM, Viterbi and A* decoding
   7. Word classes and POS tagging
   8. CFG for English and Parsing 
   9. Semantics: Introduction & Distributional semantics
   10. Lexical semantics & Word Sense disambiguation
   11. Advanced topics: Text classification, Text summarization
   12. Advanced topics: Sentiment analysis, Stylometry analysis
   13. Advanced topics: Web mining, Named-entity disambiguation, Concluding remarks

Evaluation

Mid-sem  15%
End-sem  25%
Class Attendance/Performance  5%
Scientific Blog  5%
Quiz  5%
Assignment  15%
Project  30%

Please refer to the academic dishonesty policy: http://iiitd.ac.in/education/resources/academic-dishonesty

Pre-requisites

Mandatory:
   1. CSE322 Theory of Computation
   2. CSE222 Algorithm Design & Analysis
   3. CSE101 Intro to Programming

Desirable:
   1. Python/Java programming

Class Timings

Room C-12

Monday 11:30 am to 1:00 pm
Thursday 11:30 am to 1:00 pm

Open Hour: Mon & Thu, 2:30 pm to 3:00 pm
(Students can meet the instructor during the Open Hour.)

T.A. Hours: Monday 5:00 pm to 6:00 pm (Library Ground Floor)

Textbooks

Textbook:
Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition by Daniel Jurafsky and James H. Martin

Reference Book:
Foundations of Statistical Natural Language Processing by Christopher D. Manning and Hinrich Schütze

Journals:
Journal of Computational Linguistics, Transactions of the Association for Computational Linguistics, Journal of Information Retrieval, Journal of Machine Learning

Conferences:
ACL, EACL, EMNLP, NAACL, COLING, IJCNLP, SIGIR, WWW, ICON