CSE556 - Introduction to Natural Language Processing

IIIT-Delhi

Monsoon 2018

4 credits

Instructor

Tanmoy Chakraborty

Overview

This course covers a broad range of topics in NLP, including basic text processing (such as tokenization and stemming), language modeling, morphology, syntax, dependency parsing, distributional and lexical semantics, word sense disambiguation, and information extraction. We will also introduce the underlying theory from probability, statistics, and machine learning that is essential for understanding fundamental NLP techniques such as language modeling and HMMs.
The course ends with more advanced topics in NLP, such as stylometry analysis, sentiment analysis, named-entity disambiguation, and machine translation. The term projects give students hands-on experience in designing real-world NLP models.

Description

Introduction
Regular Expressions, Text Normalization, and Edit Distance
Morphology & Finite-state Transducers
Probabilistic models & Spelling correction
N-grams, smoothing and entropy
HMM, Viterbi and A* decoding
Word classes and POS tagging
CFG for English and Parsing
Semantics: Introduction & Distributional semantics
Lexical semantics & Word Sense disambiguation
Advanced topics: Text classification, Information retrieval
Advanced topics: Sentiment analysis, Stylometry analysis
Advanced topics: Web mining, Named-entity disambiguation

Research paper presentation schedule: https://docs.google.com/spreadsheets/d/1NgEyt4sKkz7tNiQzKr5znBZE3X2mLbazrSkryoyHLzw/edit?usp=sharing

Evaluation

Mid-sem  15%
End-sem  25%
Class Attendance  2%
Quiz (sudden + declared)  10%
Assignment  15%
Research Paper Discussion  3%
Project / Scribe  30%
Bonus Credits based on class performance

Please refer to the academic dishonesty policy: http://iiitd.ac.in/education/resources/academic-dishonesty

Pre-requisites

Mandatory:
   1. CSE322 Theory of Computation 
   2. CSE222 Algorithm Design & Analysis 
   3. CSE101 Intro to Programming

Desirable:
   1.  Python/Java programming

Class Timings

Monday 11:30 am to 1:00 pm
Thursday 11:30 am to 1:00 pm

Office Hours

Open Hour: Mon & Thu, 2:30 pm – 3:00 pm
(Students can meet the instructor during the Open Hour.)

Teaching Assistants:
1. Hridoy Sankar Dutta: Friday 3:30 pm to 4:30 pm (please drop an email before coming)
2. Vishal Raj Dutta: Thursday 3:30 pm to 4:30 pm (please drop an email before coming)
3. Mohit Chawla: Wednesday 10:30 am to 11:30 am (please drop an email before coming)
4. Shubhrata Khandelwal: Tuesday 4:00 to 5:00 (please drop an email before coming)
5. Subhankar Adak: Wednesday 10:00 am to 11:00 am (please drop an email before coming)
6. Raunak Sinha: Wednesday 2:30 pm to 3:30 pm (please drop an email before coming)

Textbooks

Textbook:
Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition by Daniel Jurafsky and James H. Martin

Reference Books:
Foundations of Statistical Natural Language Processing by Christopher D. Manning and Hinrich Schütze
Natural Language Understanding by James Allen

Journals: 
Journal of Computational Linguistics, Transactions of the Association for Computational Linguistics, Journal of Information Retrieval, Journal of Machine Learning 

Conferences: 
ACL, EACL, EMNLP, NAACL, COLING, IJCNLP, SIGIR, WWW, ICON