Computer Vision

Resources for learning about computer vision. They focus mostly on deep learning for computer vision but include the odd traditional computer vision paper. I update the list on an ongoing basis. See Neural Networks for general resources on deep learning.

Convolutional Networks

Deep Learning for Computer Vision: Andrej Karpathy’s tutorial on Convolutional Networks from the Bay Area Deep Learning School. An excellent tutorial and 90 minutes well spent. His message, “don’t be a hero”, has stuck.
ImageNet classification with Convolutional Neural Networks: THE paper which convinced the machine learning community that neural networks could achieve excellent results. Kicked off the deep learning renaissance.
Visualizing convolutional neural networks (part 1): Keras blog post and Python code to visualize individual neurons in the VGG16 model.
Visualizing convolutional neural networks (part 2): Deep visualization toolbox
Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images: Important paper highlighting the ease of generating adversarial images which fool convolutional neural networks and the difficulty of training networks to robustly avoid these types of errors.
Interactive tool for visualizing convolutional network features for Keras.
Visualizing weights and intermediate outputs of a CNN in Keras: Great 12-minute video tutorial by Anuj Shah.
Faster R-CNN: The successor to R-CNN and Fast R-CNN, this uses convolutional object detection and region proposal networks for near real-time multi-object detection. The main competitor to YOLO.
YOLO (You Only Look Once): Very fast multi-object detection. Capable of processing images at 40-90 fps! See here for an implementation in C.
Generating image descriptions: Seminal paper by Fei-Fei Li and Andrej Karpathy, involving multi-modal training data. Google has open-sourced an implementation of their related work, Show and Tell. Finally, Microsoft Research came out with some really interesting work on open-ended Visual Question Answering.
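
For readers new to the topic, the operation at the heart of every network in this list can be sketched in a few lines. This is a minimal pure-Python sketch of my own, not taken from any of the resources above; the function name conv2d_valid, the toy image and the hand-picked kernel are all assumptions made for illustration. Like the convolution layers in most deep learning libraries, it computes cross-correlation (the kernel is not flipped), with stride 1 and no padding.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (both lists of lists of numbers).

    Uses stride 1 and no padding, so the output shrinks by
    (kernel size - 1) along each dimension.
    """
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out_h, out_w = img_h - k_h + 1, img_w - k_w + 1

    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the image patch whose top-left corner is (i, j).
            total = 0
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Toy 4x4 image with a vertical edge between the second and third columns.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical hand-picked kernel that responds to a left-to-right
# increase in intensity. A trained network learns its kernels instead.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, edge_kernel)
print(feature_map)
```

The resulting feature map is nonzero only where the window straddles the edge. A convolutional layer applies many such kernels at once and learns their values from data rather than having them chosen by hand.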

Traditional computer vision

Distinctive Image Features from Scale-Invariant Keypoints: Classic paper from 2004 introducing the SIFT algorithm by David Lowe. Still widely used today.

I am always looking to learn more. Please send suggestions or comments to contact [at] learningmachinelearning [dot] org

