Discovering Binary Codes for Documents by Learning Deep Generative Models

Topics in Cognitive Science 3 (1):74-91 (2011)
Abstract

We describe a deep generative model in which the lowest layer represents the word-count vector of a document and the top layer represents a learned binary code for that document. The top two layers of the generative model form an undirected associative memory, and the remaining layers form a belief net with directed, top-down connections. We present efficient learning and inference procedures for this type of generative model and show that it allows more accurate and much faster retrieval than latent semantic analysis. By using our method as a filter for a much slower method called TF-IDF, we achieve higher accuracy than TF-IDF alone and save several orders of magnitude in retrieval time. By using short binary codes as addresses, we can perform retrieval on very large document sets in a time that is independent of the size of the document set, using only one word of memory to describe each document.
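The idea of using short binary codes as addresses can be illustrated with a minimal sketch. It is not the authors' implementation: the 8-bit codes are invented for illustration (in the paper they would be produced by the top layer of the learned model), and the names `build_index`, `hamming_ball`, and `retrieve` are hypothetical. The sketch stores each document at the address given by its code and answers a query by probing every address within a small Hamming distance of the query code.

```python
from collections import defaultdict
from itertools import combinations


def build_index(codes):
    """Map each integer code (the 'address') to the document ids stored there."""
    index = defaultdict(list)
    for doc_id, code in codes.items():
        index[code].append(doc_id)
    return index


def hamming_ball(code, n_bits, radius):
    """Yield every n_bits-wide address within `radius` bit flips of `code`."""
    for r in range(radius + 1):
        for positions in combinations(range(n_bits), r):
            flipped = code
            for p in positions:
                flipped ^= 1 << p  # flip bit p
            yield flipped


def retrieve(index, query_code, n_bits, radius):
    """Collect documents whose address lies in the Hamming ball around the query."""
    hits = []
    for address in hamming_ball(query_code, n_bits, radius):
        hits.extend(index.get(address, []))
    return hits


# Hypothetical 8-bit codes, made up for this example (not learned by any model).
codes = {"doc_a": 0b10110010, "doc_b": 0b10110011, "doc_c": 0b01001100}
index = build_index(codes)
shortlist = retrieve(index, 0b10110010, n_bits=8, radius=1)
```

The number of addresses probed depends only on the code length and the search radius, not on how many documents are indexed. Under the filtering scheme the abstract describes, a shortlist like this one would then be re-ranked by the slower method, such as TF-IDF.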

Similar books and articles

Where Do Features Come From? Geoffrey Hinton - 2014 - Cognitive Science 38 (6):1078-1101.
A Method for Representing the Attribute Space of Words (単語の属性空間の表現方法). 稲子 希望, 笠原 要 - 2002 - Transactions of the Japanese Society for Artificial Intelligence 17:539-547.

Analytics

Added to PP
2010-08-19
