Cognitive Science 38 (6):1078-1101 (2014)

It is possible to learn multiple layers of non-linear features by backpropagating error derivatives through a feedforward neural network. This is a very effective learning procedure when there is a huge amount of labeled training data, but for many learning tasks very few labeled examples are available. In an effort to overcome the need for labeled data, several different generative models were developed that learned interesting features by modeling the higher-order statistical structure of a set of input vectors. One of these generative models, the restricted Boltzmann machine (RBM), has no connections between its hidden units, and this makes perceptual inference and learning much simpler. More significantly, after a layer of hidden features has been learned, the activities of these features can be used as training data for another RBM. By applying this idea recursively, it is possible to learn a deep hierarchy of progressively more complicated features without requiring any labeled data. This deep hierarchy can then be treated as a feedforward neural network that can be discriminatively fine-tuned using backpropagation. Using a stack of RBMs to initialize the weights of a feedforward neural network allows backpropagation to work effectively in much deeper networks, and it leads to much better generalization. A stack of RBMs can also be used to initialize a deep Boltzmann machine that has many hidden layers. Combining this initialization method with a new method for fine-tuning the weights leads to the first efficient way of training Boltzmann machines with many hidden layers and millions of weights.
Keywords: Backpropagation; Learning graphical models; Variational learning; Boltzmann machines; Distributed representations; Deep learning; Contrastive divergence; Learning features
DOI 10.1111/cogs.12049
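
The greedy layer-wise procedure described in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the class and function names (`RBM`, `train_stack`, `forward`), the toy binary data, the layer sizes, the learning rate, and the use of one-step contrastive divergence (CD-1) with reconstruction probabilities are all assumptions chosen for brevity. Only the Python standard library is used.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Binary RBM (hypothetical helper): no hidden-hidden or visible-visible links."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        # Hidden units are conditionally independent given v,
        # so inference is a single bottom-up pass.
        return [sigmoid(self.b_hid[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def cd1_update(self, v0, lr):
        # Positive phase: hidden probabilities driven by the data vector.
        h0_probs = self.hidden_probs(v0)
        h0 = [1.0 if self.rng.random() < p else 0.0 for p in h0_probs]
        # Negative phase: one reconstruction step (probabilities, not samples).
        v1 = self.visible_probs(h0)
        h1_probs = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0_probs)):
                self.W[i][j] += lr * (v0[i] * h0_probs[j] - v1[i] * h1_probs[j])
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0_probs)):
            self.b_hid[j] += lr * (h0_probs[j] - h1_probs[j])


def train_stack(data, hidden_sizes, epochs=50, lr=0.1, seed=0):
    """Train one RBM per layer; each layer's hidden activities feed the next."""
    rng = random.Random(seed)
    stack, layer_input = [], data
    for n_hidden in hidden_sizes:
        rbm = RBM(len(layer_input[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in layer_input:
                rbm.cd1_update(v, lr)
        stack.append(rbm)
        # The learned feature activities become training data for the next RBM.
        layer_input = [rbm.hidden_probs(v) for v in layer_input]
    return stack


def forward(stack, v):
    """Treat the trained stack as a deterministic feedforward network."""
    for rbm in stack:
        v = rbm.hidden_probs(v)
    return v


# Toy unlabeled data (an assumption, not from the paper).
data = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1],
        [1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1]]
stack = train_stack(data, hidden_sizes=[4, 2])
features = forward(stack, data[0])
```

The sketch stops at unsupervised pretraining. The discriminative fine-tuning step mentioned in the abstract would add an output layer on top of `forward` and adjust all weights by backpropagation, and the deep Boltzmann machine variant is not covered here.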