List of resources for studying Natural Language Processing, split into two sections: Textbooks & Papers, and Tools. Many of the links come from Slav Petrov’s excellent course Statistical Natural Language Processing at NYU. I update the list on an ongoing basis. See Neural Networks for general resources on deep learning.
Textbooks & Papers
- Natural Language Processing, Stanford, Dan Jurafsky & Chris Manning: The whole course is available on YouTube. This is the link to the first lecture.
- Neural Network Methods for Natural Language Processing: Excellent, concise and up to date book by Yoav Goldberg. It gives a comprehensive overview of the different neural network based techniques for NLP.
- Language Models: Tutorials on n-gram models for language modeling, followed by a comprehensive overview of the different smoothing techniques.
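To make the tutorials concrete, here is a minimal sketch of a bigram language model with add-one (Laplace) smoothing, the simplest of the smoothing techniques those tutorials cover; the function names and the toy corpus are my own illustration, not from the tutorials.

```python
from collections import Counter

def train_bigram_lm(corpus):
    """Count unigrams and bigrams from a list of tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(unigrams, bigrams, w1, w2):
    """P(w2 | w1) with add-one (Laplace) smoothing over the vocabulary."""
    vocab_size = len(unigrams)
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + vocab_size)
```

Adding one to every count means unseen bigrams get a small non-zero probability instead of zero, at the cost of taking probability mass away from observed events; fancier schemes (Kneser–Ney, Witten–Bell) redistribute that mass more carefully.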
- TnT – A Statistical Part of Speech Tagger: Nice paper by Thorsten Brants explaining his hidden Markov model tagger. The paper is sufficiently detailed to follow along and implement your own version.
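The decoding step at the heart of such a tagger is the Viterbi algorithm. The following is a rough bigram sketch with toy hand-set probabilities, not TnT itself (TnT uses trigram transitions and carefully smoothed, suffix-based emission estimates):

```python
def viterbi(words, tags, trans, emit):
    """Bigram Viterbi decoding: find the most likely tag sequence.

    trans[(t1, t2)]: P(t2 | t1), with "<s>" as the start tag.
    emit[(t, w)]: P(w | t). Unlisted pairs default to probability 0.
    """
    # best[t] = (probability of the best path ending in tag t, that path)
    best = {t: (trans.get(("<s>", t), 0.0) * emit.get((t, words[0]), 0.0), [t])
            for t in tags}
    for w in words[1:]:
        new_best = {}
        for t in tags:
            # Pick the predecessor tag that maximizes the path probability.
            p, prev = max(
                (best[t0][0] * trans.get((t0, t), 0.0), t0) for t0 in tags
            )
            new_best[t] = (p * emit.get((t, w), 0.0), best[prev][1] + [t])
        best = new_best
    return max(best.values(), key=lambda v: v[0])[1]
```

In practice you would work in log space to avoid underflow on long sentences, and add beam search for speed, both of which the paper discusses.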
- Word2Vec: The now classic 2013 paper by Mikolov, Chen, Corrado, and Dean, introducing an efficient, completely unsupervised technique for computing continuous vector representations of words from very large datasets.
- Word Alignments: In-depth comparison of different statistical word alignment models.
- Neural networks and NLP: Really nice review of recurrent neural networks for sequence learning.
- The Unreasonable Effectiveness of Recurrent Neural Networks: Classic post by Andrej Karpathy on the magic of recurrent nets.
- Understanding LSTM networks: Clear and thoughtful introduction to LSTMs.
- Written Memories: Understanding, Deriving and Extending the LSTM: Superb post on Long Short-Term Memory networks.
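The core idea both posts explain is the gated cell update. A minimal scalar sketch (one hidden unit, plain floats; real implementations vectorize over hidden units and batches) of a single LSTM step:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    params maps each of "i", "f", "o", "g" to its (w, u, b) weights:
    input gate, forget gate, output gate, and candidate value.
    """
    w, u, b = params["i"]; i = sigmoid(w * x + u * h_prev + b)  # input gate
    w, u, b = params["f"]; f = sigmoid(w * x + u * h_prev + b)  # forget gate
    w, u, b = params["o"]; o = sigmoid(w * x + u * h_prev + b)  # output gate
    w, u, b = params["g"]; g = math.tanh(w * x + u * h_prev + b)  # candidate
    c = f * c_prev + i * g    # keep a gated part of the old memory, write new
    h = o * math.tanh(c)      # expose a gated view of the cell state
    return h, c
```

The additive update of `c` is the whole point: gradients can flow through `f * c_prev` across many time steps without vanishing the way they do through a plain recurrent layer's repeated matrix multiplications.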
- Neural machine translation: Clear analysis of Google’s neural network architecture for machine translation based on this paper. The article walks you through each of the model components, explaining the ideas behind them from both a theoretical and a practical standpoint.
- DeVISE: Interesting application of word embeddings to the label space of a visual classifier. By forcing the classifier to output label embeddings, the semantic relationship between labels was incorporated into the model, resulting in much better generalization to unknown classes and more semantically reasonable errors.
- FastText: Proof that word embeddings + linear models can give excellent results.
Tools
- NLTK: Python library for natural language processing. Comes with the NLTK book too!
- gensim: Python package that makes it really easy to train your own word embeddings. A model can be trained in a single line of code.
I am always looking to learn more. Please send suggestions or comments to contact [at] learningmachinelearning [dot] org