Topics in Cognitive Science 3 (1):74-91 (2011)

We describe a deep generative model in which the lowest layer represents the word-count vector of a document and the top layer represents a learned binary code for that document. The top two layers of the generative model form an undirected associative memory and the remaining layers form a belief net with directed, top-down connections. We present efficient learning and inference procedures for this type of generative model and show that it allows more accurate and much faster retrieval than latent semantic analysis. By using our method as a filter for a much slower method called TF-IDF, we achieve higher accuracy than TF-IDF alone and save several orders of magnitude in retrieval time. By using short binary codes as addresses, we can perform retrieval on very large document sets in a time that is independent of the size of the document set, using only one word of memory to describe each document.
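The retrieval scheme described above ("semantic hashing") treats each document's short binary code as a memory address, so nearby documents can be found by flipping a few bits of the query's address rather than scanning the corpus. The sketch below illustrates that lookup step only; it is not the authors' code, and the codes here are random placeholders standing in for the output of a trained deep model (all names such as `to_address` and `retrieve` are hypothetical).

```python
# Illustrative sketch of semantic-hashing retrieval (random placeholder
# codes; a real system would produce them with a trained deep autoencoder).
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)

CODE_BITS = 20                  # short code, so it fits in one machine word
n_docs = 10_000
codes = rng.integers(0, 2, size=(n_docs, CODE_BITS), dtype=np.uint8)

def to_address(bits):
    """Pack a binary code vector into a single integer address."""
    return int(np.dot(bits, 1 << np.arange(bits.size, dtype=np.int64)))

# Hash table: address -> list of document ids (one word of memory per doc).
table = defaultdict(list)
for doc_id, bits in enumerate(codes):
    table[to_address(bits)].append(doc_id)

def retrieve(query_bits, radius=1):
    """Return ids of documents whose code lies within `radius` bit flips
    of the query code. Cost depends on the radius and code length,
    not on the number of documents stored."""
    base = to_address(query_bits)
    hits = list(table.get(base, []))
    if radius >= 1:
        for b in range(CODE_BITS):          # probe all one-bit-flip neighbors
            hits.extend(table.get(base ^ (1 << b), []))
    return hits

candidates = retrieve(codes[0], radius=1)
assert 0 in candidates          # a document always matches its own code
```

In the filtering setup the abstract mentions, the candidate list returned by `retrieve` would then be re-ranked by a slower but more precise scorer such as TF-IDF, so the expensive comparison runs only on the small short-list rather than the whole collection.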
Keywords: Auto-encoders, Restricted Boltzmann machines, Deep learning, Document retrieval, Semantic hashing, Binary codes
DOI 10.1111/j.1756-8765.2010.01109.x