General resources for machine learning.

  • Andrew Ng’s Machine Learning course on Coursera: A fantastic introductory course and a great place to start learning about machine learning. It assumes a basic knowledge of programming. Primary course language: MATLAB / Octave.
  • Python for Data Analysis: Introduction to data processing and manipulation in Python. Focuses on two libraries, NumPy and Pandas. Written by Wes McKinney, the author of the Pandas library.
  • Machine Learning with SciKit Learn: Fast-paced overview of SciKit Learn by Andreas Mueller, a core developer and co-maintainer of the library. Andreas packs a lot of information into one hour but makes it easy to take in. The introduction to Grid Search and Pipelines was a highlight for me: together they are a really useful and flexible way to manage your data pre-processing and model building.
  • Advanced SciKit Learn: Where to go to learn about SciKit Learn’s more advanced features. I recommend watching Machine Learning with SciKit Learn (link above) first.
  • Kullback-Leibler divergence explained: Nice, simple explanation.
  • Yann LeCun, Director of AI Research at Facebook, discussing deep learning and the future of artificial intelligence.
  • AI, Deep Learning, and Machine Learning: A Primer: Great video by Frank Chen. The best part is the history of AI and the timeline of AI winters!
  • The Master Algorithm: Overview of the five different families of AI algorithms. Particularly useful for understanding the motivations for and philosophies underpinning the different AI schools of thought.
  • Model selection, model evaluation, and algorithm selection in machine learning: Excellent series of in-depth tutorials by Sebastian Raschka.
  • How to use t-SNE effectively: Clear explanation of the effect of different parameter settings on t-SNE plots with lots of visual examples.
  • How to set up an AWS GPU for machine learning: Great tutorial by Jason Brownlee that walks you through the process step by step. I recommend starting with the AWS "Setting up" documentation to get an IAM user and security group set up before switching to Jason’s tutorial.
  • Jeff Knupp’s blog and videos are fantastic resources for improving your Python programming.
  • This Matplotlib tutorial: 95% of my plotting is basic, but I’m constantly forgetting the commands to produce a clean, simple, clear plot. This always saves me.
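
As a minimal sketch of the Grid Search and Pipelines combination mentioned in the list above: the dataset, step names and parameter values below are my own illustrative assumptions, not taken from Andreas's talk. The idea is that the pre-processing step and the model live in one Pipeline object, and the grid search tunes the whole thing with cross-validation.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative dataset (an assumption): the small built-in iris data.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Pre-processing and model chained together. The step names "scale" and
# "clf" are arbitrary labels chosen for this sketch.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Parameters of a pipeline step are addressed as <step name>__<parameter>.
param_grid = {"clf__C": [0.1, 1.0, 10.0]}

# 5-fold cross-validated search; the scaler is re-fitted inside each fold,
# so no information from the validation fold leaks into pre-processing.
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

# Accuracy of the best pipeline (refitted on all training data) on held-out data.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, test_accuracy)
```

Swapping in a different model or adding another pre-processing step only means changing the list of steps and the matching keys in the parameter grid.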

Essential Python Libraries

    • NumPy: Scientific computing package
    • Pandas: Essential for data pre-processing and manipulation
    • SciKit Learn: Machine learning in Python. Comprehensive, very well documented and easy to use
    • NLTK: Natural Language Toolkit
    • Matplotlib: Data visualisation
    • Keras: Minimalist, highly modular neural networks library which runs on top of Theano or TensorFlow
    • TensorFlow: Google’s machine learning software library
    • PyTorch: Fast, flexible library for implementing and training neural networks. Dynamic graph computation and seamless transfers from CPU to GPU.
    • OpenCV: Open source computer vision library to process and transform images, detect features, analyze videos, and calibrate cameras.
    • Gensim: Python library for topic modeling. Particularly useful for the Word2Vec implementation which makes it very easy to train word embeddings with just one line of code.
    • Python Object Serialization: Very useful for saving data structures (e.g. lists)
    • Python Imaging Library (Pillow fork) for Python 3.5
    • HDF5 for Python: Essential for storing large amounts of numerical data
    • Python style guide
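
To illustrate the object serialization entry in the list above, here is a minimal sketch using the standard library's pickle module to save a list and load it back. The variable names, file name and example values are hypothetical, chosen only for illustration.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical example data: any picklable Python object works the same way.
scores = [0.91, 0.87, 0.93]

# Use a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "scores.pkl"

    # Serialise the list to disk; pickle files must be opened in binary mode.
    with open(path, "wb") as f:
        pickle.dump(scores, f)

    # Read the file back into a new Python object.
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == scores)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.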

I’m always looking to learn more. Please send suggestions or comments to contact [at] learningmachinelearning [dot] org