Abstract
Named entities have been combined with keywords to enhance information retrieval performance. However, there is not yet a formal and complete model that takes entity names, classes, and identifiers into account together. Our work explores various adaptations of the traditional Vector Space Model that combine different ontological features with keywords in different ways. It shows that the proposed models perform better than the keyword-based Lucene, and it demonstrates their advantages for both text retrieval and the representation of documents and queries.
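To make the idea of combining ontological features with keywords concrete, the following is a minimal sketch and not the models proposed in the paper. It assumes each document and query is represented by separate sparse vectors for keywords, entity names, entity classes, and entity identifiers, and that the per-space cosine similarities are merged by a weighted sum. The function names, the `alpha` weight, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_score(doc, query, alpha=0.5):
    """Hypothetical combination: weighted sum of per-space cosines.

    `doc` and `query` map a feature-space name ("keyword", "name",
    "class", "identifier") to a sparse vector. `alpha` is an assumed
    weight for the keyword space; the entity spaces present in the
    query share the remaining weight equally. If the query has no
    entity spaces, the keyword cosine is returned unweighted.
    """
    entity_spaces = [s for s in query if s != "keyword"]
    kw = cosine(doc.get("keyword", {}), query.get("keyword", {}))
    if not entity_spaces:
        return kw
    ent = sum(cosine(doc.get(s, {}), query[s]) for s in entity_spaces)
    return alpha * kw + (1 - alpha) * ent / len(entity_spaces)


# Toy data (invented): both documents mention an entity named "Paris",
# but only doc_a's entity belongs to the class "City".
doc_a = {
    "keyword": Counter("city population growth".split()),
    "name": {"Paris": 1.0},
    "class": {"City": 1.0},
}
doc_b = {
    "keyword": Counter("paris hilton celebrity news".split()),
    "name": {"Paris": 1.0},
    "class": {"Person": 1.0},
}
query = {
    "keyword": {"population": 1.0},
    "name": {"Paris": 1.0},
    "class": {"City": 1.0},
}

score_a = combined_score(doc_a, query)
score_b = combined_score(doc_b, query)
```

In this toy setup the entity class separates the two documents even though they share the entity name, which is the kind of distinction a keyword-only index cannot draw.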
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cao, T.H., Le, K.C., Ngo, V.M. (2008). Exploring Combinations of Ontological Features and Keywords for Text Retrieval. In: Ho, TB., Zhou, ZH. (eds) PRICAI 2008: Trends in Artificial Intelligence. PRICAI 2008. Lecture Notes in Computer Science, vol 5351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89197-0_55
DOI: https://doi.org/10.1007/978-3-540-89197-0_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89196-3
Online ISBN: 978-3-540-89197-0
eBook Packages: Computer Science, Computer Science (R0)