Abstract
The Information Bottleneck method aims to extract a compact representation which preserves the maximum relevant information. The sub-optimality in agglomerative Information Bottleneck (aIB) algorithm restricts the applications of Information Bottleneck method. In this paper, the concept of density-based chains is adopted to evaluate the information loss among the neighbors of an element, rather than the information loss between pairs of elements. The DaIB algorithm is then presented to alleviate the sub-optimality problem in aIB while simultaneously keeping the useful hierarchical clustering tree-structure. The experiment results on the benchmark data sets show that the DaIB algorithm can get more relevant information and higher precision than aIB algorithm, and the paired t-test indicates that these improvements are statistically significant.
This research was supported by the National Science Foundation of China under grant No. 60773048., and Deakin CRGS Grant 2008.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Naftali Tishby, F.C.P., Bialek, W.: The information bottleneck method. on Communication and Computation. In: Proc. 37th Allerton Conference, pp. 368–377 (1999)
Slonim, N.: The Information Bottleneck: Theory and Applications. Ph.D thesis, the Senate of the Hebrew University (2002)
Slonim, N., Tishby, N.: Agglomerative information bottleneck. In: Advances in Neural Information Processing Systems (NIPS), vol. 12, pp. 617–623 (1999)
Noam Slonim, N.F., Tishby, N.: Unsupervised document classification using sequential information maximization on Research and Development in Information Retrieval. In: Proc. of the 25th Ann. Int. ACM SIGIR Conf., 129–136 (2002)
Jacob Goldberger, S.G., Greenspan, H.: Unsupervised image set clustering using an information theoretic framework. IEEE Transactions on Image Processing 15(2), 449–458 (2006)
Ester, M., Hans-Peter Kriegel, J.S., Xu., X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of 2nd international conference on knowledge discovery and data mining (KDD), pp. 226–231 (1996)
Slonim, N., Tishby, N.: Document clustering using word clusters via the information bottleneck method on Research and Development in Information Retrieval. In: Proc. of the 23rd Ann. Int. ACM SIGIR Conf., pp. 208–215 (2000)
Slonim, N., Tishby, N.: The power of word clusters for text classification. In: 23rd European Colloquium on Information Retrieval Research (ECIR) (2001)
Lang, K.: Learning to filter netnews. In: Proc. of the 12th International Conf. on Machine Learning, pp. 331–339 (1995)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ren, Y., Ye, Y., Li, G. (2008). The Density-Based Agglomerative Information Bottleneck. In: Ho, TB., Zhou, ZH. (eds) PRICAI 2008: Trends in Artificial Intelligence. PRICAI 2008. Lecture Notes in Computer Science(), vol 5351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89197-0_32
Download citation
DOI: https://doi.org/10.1007/978-3-540-89197-0_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89196-3
Online ISBN: 978-3-540-89197-0
eBook Packages: Computer ScienceComputer Science (R0)