General resources for machine learning.
- Andrew Ng’s Machine Learning course on Coursera: A fantastic introductory course and a great place to start learning about machine learning. It assumes a basic knowledge of programming. Primary course language: MATLAB / Octave.
- Python for Data Analysis: Introduction to data processing and manipulation in Python. Focuses on two libraries, NumPy and Pandas. Written by Wes McKinney, the author of the Pandas library.
- Machine Learning with SciKit Learn: Fast-paced overview of SciKit Learn by Andreas Mueller, a core developer and co-maintainer of the library. Andreas packs a lot of information into one hour but makes it easy to take in. The introduction to Grid Search and Pipelines was a highlight for me: together they are a really useful and flexible way to manage your data pre-processing and model building.
- Advanced SciKit Learn: Where to go to learn about SciKit Learn’s more advanced features. I recommend watching Machine Learning with SciKit Learn (linked above) first.
- Kullback-Leibler divergence explained: Nice, simple explanation.
- Yann LeCun, Director of AI Research at Facebook, discussing deep learning and the future of artificial intelligence.
- AI, Deep Learning, and Machine Learning: A Primer: Great video by Frank Chen. The best part is the history of AI and the timeline of AI winters!
- The Master Algorithm: Overview of the five different families of AI algorithms. Particularly useful for understanding the motivations for and philosophies underpinning the different AI schools of thought.
- Model selection, model evaluation, and algorithm selection in machine learning: Excellent series of in-depth tutorials by Sebastian Raschka.
- How to use t-SNE effectively: Clear explanation of the effect of different parameter settings on t-SNE plots with lots of visual examples.
- How to set up an AWS GPU for machine learning: Great tutorial by Jason Brownlee that walks you through the process step by step. I recommend starting with the AWS “Setting Up” documentation to get an IAM user and security group set up before switching to Jason’s tutorial.
- Jeff Knupp’s blog and videos are fantastic resources for improving your Python programming.
- This Matplotlib tutorial: 95% of my plotting is basic, but I’m constantly forgetting the commands to produce a clean, simple, clear plot. This always saves me.
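To make the Grid Search and Pipelines idea mentioned above more concrete, here is a minimal sketch. It is not taken from Andreas Mueller's talk; the dataset (iris), the step names ("scale", "clf"), the choice of an SVM, and the parameter values are all assumptions picked purely for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Small built-in dataset, chosen only for illustration.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The pipeline chains pre-processing and the model into one estimator,
# so the scaler is re-fit on the training part of every cross-validation fold.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", SVC()),
])

# Parameters of a pipeline step are addressed as "<step name>__<parameter>".
param_grid = {
    "clf__C": [0.1, 1, 10],
    "clf__gamma": [0.01, 0.1, 1],
}

# Try every combination in the grid with 5-fold cross-validation.
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Held-out accuracy:", search.score(X_test, y_test))
```

Because the scaler lives inside the pipeline, the grid search never sees statistics computed from the validation folds, which is the main reason to combine the two.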
Essential Python Libraries
- NumPy: Scientific computing package
- Pandas: Essential for data pre-processing and manipulation
- SciKit Learn: Machine learning in Python. Comprehensive, very well documented and easy to use
- NLTK: Natural Language Toolkit
- Matplotlib: Data visualisation
- Keras: Minimalist, highly modular neural network library that runs on top of Theano or TensorFlow
- TensorFlow: Google’s machine learning software library
- PyTorch: Fast, flexible library for implementing and training neural networks. Dynamic graph computation and seamless transfers from CPU to GPU.
- OpenCV: Open source computer vision library to process and transform images, detect features, analyze videos, and calibrate cameras.
- Gensim: Python library for topic modeling. Particularly useful for the Word2Vec implementation which makes it very easy to train word embeddings with just one line of code.
- Python Object Serialization (the pickle module): Very useful for saving data structures (e.g. lists)
- Pillow: Fork of the Python Imaging Library (PIL), for Python 3.5
- HDF5 for Python: Essential for storing large amounts of numerical data
- Python style guide
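As a quick illustration of the object serialization entry in the list above, here is a minimal sketch of saving a data structure with the standard library's pickle module and loading it back. The `results` dictionary and the file name `results.pkl` are made up for the example.

```python
import os
import pickle
import tempfile

# Hypothetical data structure to save; any picklable object works.
results = {"model": "svm", "scores": [0.91, 0.94, 0.89]}

# Write into a temporary directory so the example leaves no stray files.
path = os.path.join(tempfile.mkdtemp(), "results.pkl")

# Pickle files are binary, so open them in "wb" / "rb" mode.
with open(path, "wb") as f:
    pickle.dump(results, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

# The restored object is an equal copy, not the same object.
print(restored == results)
```

Only unpickle files you trust: loading a pickle can execute arbitrary code.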
I’m always looking to learn more. Please send suggestions or comments to contact [at] learningmachinelearning [dot] org