Research in General
Basics of machine learning
Basics of deep learning
- Practical recommendations for gradient-based training of deep architectures
- Quick’n’dirty introduction to deep learning: Advances in Deep Learning
- Contractive auto-encoders: Explicit invariance during feature extraction (see the auto-encoder sketch after this list)
- An Analysis of Single Layer Networks in Unsupervised Feature Learning
- The importance of Encoding Versus Training With Sparse Coding and Vector Quantization
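To make the auto-encoder entries above concrete, here is a rough numpy sketch of a tied-weight auto-encoder with the contractive penalty (squared Frobenius norm of the encoder Jacobian). The sigmoid encoder, tied decoder, and penalty weight are illustrative assumptions, not the exact setup of the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def contractive_ae_loss(x, W, b_hid, b_out, lam=0.1):
    """Tied-weight auto-encoder: reconstruction error plus the contractive
    penalty, i.e. the squared Frobenius norm of the encoder Jacobian."""
    h = sigmoid(x @ W + b_hid)          # encoder
    x_rec = sigmoid(h @ W.T + b_out)    # decoder with tied weights
    recon = ((x - x_rec) ** 2).sum(axis=1).mean()
    # for a sigmoid encoder, dh_j/dx = h_j * (1 - h_j) * W[:, j]
    jac_frob = ((h * (1 - h)) ** 2 * (W ** 2).sum(axis=0)).sum(axis=1).mean()
    return recon + lam * jac_frob
```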
Feedforward nets
- “Improving Neural Nets with Dropout” by Nitish Srivastava (see the dropout sketch after this list)
- “What is the best multi-stage architecture for object recognition?”
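A minimal numpy sketch of the dropout idea from the Srivastava reference above, using the “inverted” scaling convention so that nothing changes at test time; the retention probability and layer shapes are illustrative assumptions.

```python
import numpy as np

def dropout_forward(h, keep_prob=0.5, train=True, rng=np.random):
    """Inverted dropout: zero units with prob 1 - keep_prob and rescale the
    survivors so the expected activation is unchanged at test time."""
    if not train:
        return h  # with inverted dropout, test time needs no rescaling
    mask = (rng.uniform(size=h.shape) < keep_prob) / keep_prob
    return h * mask

# illustrative usage on a random hidden-layer activation
h = np.random.randn(4, 8)
h_train = dropout_forward(h, keep_prob=0.5, train=True)
h_test = dropout_forward(h, train=False)
```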
MCMC
- Radford Neal’s Review Paper (old but still very comprehensive)
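As a concrete companion to Neal’s review, a tiny numpy sketch of random-walk Metropolis sampling from an unnormalized log-density; the Gaussian proposal and step size are illustrative choices.

```python
import numpy as np

def metropolis(log_p, x0, n_samples=1000, step=0.5, rng=np.random):
    """Random-walk Metropolis: propose x' = x + noise, accept with
    probability min(1, p(x') / p(x))."""
    x, lp = x0, log_p(x0)
    samples = []
    for _ in range(n_samples):
        x_prop = x + step * rng.randn()
        lp_prop = log_p(x_prop)
        if np.log(rng.rand()) < lp_prop - lp:   # accept or reject
            x, lp = x_prop, lp_prop
        samples.append(x)
    return np.array(samples)

# illustrative usage: sample from a standard normal
draws = metropolis(lambda x: -0.5 * x ** 2, x0=0.0)
```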
Restricted Boltzmann Machines
- Unsupervised learning of distributions of binary vectors using 2-layer networks
- Training restricted Boltzmann machines using approximations to the likelihood gradient
- Tempered Markov Chain Monte Carlo for training of Restricted Boltzmann Machines
- Enhanced Gradient for Training Restricted Boltzmann Machines
- Using fast weights to improve persistent contrastive divergence
- Training Products of Experts by Minimizing Contrastive Divergence
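Several of the RBM papers above revolve around the contrastive divergence (CD) approximation to the likelihood gradient. A rough numpy sketch of one CD-1 update for a binary-binary RBM; the array layout and learning rate are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b_vis, b_hid, lr=0.1, rng=np.random):
    """One CD-1 step for a binary-binary RBM on a minibatch v0 (n x n_vis)."""
    # positive phase: sample hidden units given the data
    p_h0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.uniform(size=p_h0.shape) < p_h0).astype(v0.dtype)
    # negative phase: one Gibbs step back to the visibles and up again
    p_v1 = sigmoid(h0 @ W.T + b_vis)
    p_h1 = sigmoid(p_v1 @ W + b_hid)
    # approximate gradient: data statistics minus reconstruction statistics
    n = v0.shape[0]
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / n
    b_vis += lr * (v0 - p_v1).mean(axis=0)
    b_hid += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b_vis, b_hid
```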
Boltzmann Machines
- Deep Boltzmann Machines (Salakhutdinov & Hinton)
- A Two-stage Pretraining Algorithm for Deep Boltzmann Machines
Regularized Auto-Encoders
Regularization
Stochastic Nets & GSNs
Others
- Slow, Decorrelated Features for Pretraining Complex Cell-like Networks
- What Regularized Auto-Encoders Learn from the Data Generating Distribution
Recurrent Nets
- Learning long-term dependencies with gradient descent is difficult
- Learning recurrent neural networks with Hessian-free optimization
- On the importance of initialization and momentum in deep learning
- Long short-term memory (Hochreiter & Schmidhuber; see the LSTM sketch after this list)
- Long Short-Term Memory in Echo State Networks: Details of a Simulation Study
- The "echo state" approach to analysing and training recurrent neural networks
- Backpropagation-Decorrelation: online recurrent learning with O(N) complexity
- New results on recurrent network training: Unifying the algorithms and accelerating convergence
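The Hochreiter & Schmidhuber paper above introduces the LSTM cell that several of the later entries build on. A minimal numpy sketch of a single LSTM forward step; the gate ordering and stacked weight layout are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W maps [x, h_prev] to the stacked gate pre-activations
    (input, forget, output, candidate), each of size n_hidden."""
    n_hidden = h_prev.shape[-1]
    z = np.concatenate([x, h_prev], axis=-1) @ W + b
    i = sigmoid(z[..., 0 * n_hidden:1 * n_hidden])   # input gate
    f = sigmoid(z[..., 1 * n_hidden:2 * n_hidden])   # forget gate
    o = sigmoid(z[..., 2 * n_hidden:3 * n_hidden])   # output gate
    g = np.tanh(z[..., 3 * n_hidden:4 * n_hidden])   # candidate cell update
    c = f * c_prev + i * g                           # new cell state
    h = o * np.tanh(c)                               # new hidden state
    return h, c
```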
Convolutional Nets
- ImageNet Classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E Hinton, NIPS 2012.
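The AlexNet paper rests on the basic 2D convolution operation; a naive single-channel numpy sketch of a "valid" convolution (really a cross-correlation, as most CNN code computes it). Real implementations are vectorized, multi-channel, and GPU-bound; this is only for intuition.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive single-channel 2D cross-correlation with 'valid' padding."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out
```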
Optimization issues with DL
- Knowledge Matters: Importance of Prior Information for Optimization
- Practical recommendations for gradient-based training of deep architectures (see the SGD sketch after this list)
- Hessian Free
- Natural Gradient (TONGA)
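The "Practical recommendations" paper above is largely about plain stochastic gradient descent and its hyper-parameters. A minimal numpy sketch of a minibatch SGD update with classical momentum and a 1/t learning-rate decay; all constants are made up for illustration.

```python
import numpy as np

def sgd_momentum_step(params, grads, velocities, t,
                      lr0=0.01, decay=1e-4, momentum=0.9):
    """One minibatch SGD update with classical momentum and a 1/t
    learning-rate decay; params, grads, velocities are parallel lists
    of numpy arrays, updated in place."""
    lr = lr0 / (1.0 + decay * t)   # simple learning-rate schedule
    for p, g, v in zip(params, grads, velocities):
        v *= momentum
        v -= lr * g                # velocity accumulates a decayed gradient history
        p += v                     # in-place parameter update
    return params, velocities
```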
NLP + DL
- Distributed Representations of Words and Phrases and their Compositionality (see the skip-gram sketch after this list)
- Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection
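The word-vector paper above trains skip-gram with negative sampling. A rough numpy sketch of a single update for one (center, context) pair plus sampled negatives; the array layout and learning rate are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_update(W_in, W_out, center, context, negatives, lr=0.025):
    """One skip-gram negative-sampling update. W_in and W_out are
    (vocab_size, dim) input/output embedding matrices, updated in place."""
    v = W_in[center]
    grad_v = np.zeros_like(v)
    for word, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
        u = W_out[word]
        g = lr * (label - sigmoid(v @ u))   # gradient of the log-sigmoid objective
        grad_v += g * u
        W_out[word] += g * v
    W_in[center] += grad_v
    return W_in, W_out
```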
CV + RBM
CV + DL
Scaling Up
DL + Reinforcement learning
Graphical Models Background
- An Introduction to Graphical Models (Mike Jordan, brief course notes)
- A View of the EM Algorithm that Justifies Incremental, Sparse and Other Variants (Neal & Hinton, important paper for the modern understanding of Expectation-Maximization; see the EM sketch after this list)
- A Unifying Review of Linear Gaussian Models (Roweis & Ghahramani, ties together PCA, factor analysis, hidden Markov models, Gaussian mixtures, k-means, linear dynamical systems)
- An Introduction to Variational Methods for Graphical Models (Jordan et al., mean-field, etc.)
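As a concrete reference point for the EM discussion above, a minimal numpy sketch of plain EM for a one-dimensional Gaussian mixture; the initialization and iteration count are illustrative choices, not from the papers.

```python
import numpy as np

def em_gmm_1d(x, n_components=2, n_iters=50, rng=np.random):
    """Plain EM for a 1D Gaussian mixture: E-step responsibilities,
    M-step weighted mean/variance/mixing-proportion updates."""
    mu = rng.choice(x, n_components)
    var = np.full(n_components, x.var())
    pi = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iters):
        # E-step: posterior responsibility of each component for each point
        dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from weighted sufficient statistics
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
    return mu, var, pi
```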
Writing
Software documentation
- Python, Theano, Pylearn2, Linux (bash) (at least the first 5 sections), git (first 5 sections), github/contributing to it (Theano doc), vim tutorial or emacs tutorial
Software lists of built-in commands/functions
Other Software stuff to know about:
- screen
- ssh
- ipython
- matplotlib
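Since much of this work happens over ssh inside screen, a small illustrative matplotlib snippet that writes a plot to disk instead of opening a window; the Agg backend choice, dummy data, and file name are just one common convention, not a prescribed workflow.

```python
import matplotlib
matplotlib.use("Agg")            # render without a display, e.g. over ssh
import matplotlib.pyplot as plt
import numpy as np

# illustrative training curve: random data standing in for logged losses
loss = np.exp(-np.linspace(0, 3, 100)) + 0.05 * np.random.rand(100)

plt.plot(loss)
plt.xlabel("epoch")
plt.ylabel("training loss")
plt.savefig("loss.png")
```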