The ARXMLIV corpus is a remarkable collection of text containing scientific mathematical discourse. With more than half a million documents, it is an ambitious target for large-scale linguistic and semantic analysis, requiring a generalized and distributed approach. In this paper we implement an architecture that solves and automates the issues of knowledge representation and knowledge management, providing an abstraction layer for distributed development of semantic analysis tools. Furthermore, we enable document interaction and visualization, and present current implementations of semantic tools and follow-up applications using this architecture. We identify five different stages, or purposes, that such an architecture needs to address, encapsulating each in an independent module. These stages are determined by the different properties of the document formats used, as well as by the state of processing and linguistic enrichment introduced so far. We discuss the need for migration between XML representations and the challenges it poses for our system, revealing the benefits and trade-offs of each format we employ. At the heart of the architecture lies the Semantic Blackboard module. The Semantic Blackboard is a system built on a centralized RDF database that facilitates distributed corpus analysis by arbitrary applications, or analysis modules. This is achieved by providing a document abstraction layer and a mechanism for storing, reusing and communicating results via RDF stand-off annotations deposited in the central database. Achieving a properly encapsulated and automated pipeline from the input corpus document to a semantically enriched output in a state-of-the-art representation is the task of the Preprocessing, Semantic Result and Output Generation modules. Each of them addresses the task of format migration and enhances the document for further semantic enrichment or aggregation. The fifth module, targeting Visualization and Feedback, enables user interaction and display of the different stages of processing.
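The stand-off annotation mechanism described above can be illustrated with a minimal sketch. It is not the paper's implementation: the `Blackboard` class, the `annotate_range` helper, the `ex:` predicate names and the `doc:0001#range(...)` addressing scheme are all hypothetical, and a plain in-memory set of triples stands in for the centralized RDF database.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

# A triple is (subject, predicate, object); plain strings stand in for RDF terms.
Triple = Tuple[str, str, str]


@dataclass
class Blackboard:
    """Hypothetical in-memory stand-in for the centralized RDF database."""

    triples: Set[Triple] = field(default_factory=set)

    def deposit(self, subject: str, predicate: str, obj: str) -> None:
        # Storing in a set means depositing the same result twice is harmless.
        self.triples.add((subject, predicate, obj))

    def query(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        # None acts as a wildcard for that position of the triple.
        pattern = (subject, predicate, obj)
        return sorted(
            t for t in self.triples
            if all(p is None or p == v for p, v in zip(pattern, t))
        )


def annotate_range(board: Blackboard, doc: str, start: int, end: int,
                   module: str, label: str) -> str:
    """Deposit a stand-off annotation that points at a token range.

    The document itself is never modified; the annotation lives only on the
    blackboard. The '#range(start,end)' addressing is an assumption made here.
    """
    node = f"{doc}#range({start},{end})"
    board.deposit(node, "ex:annotatedBy", module)
    board.deposit(node, "ex:hasLabel", label)
    return node


board = Blackboard()
node = annotate_range(board, "doc:0001", 12, 15,
                      "ex:FormulaTagger", "ex:MathExpression")
# A second analysis module reuses the deposited result instead of recomputing it.
found = board.query(predicate="ex:hasLabel", obj="ex:MathExpression")
```

Because annotations refer to the document by address rather than being embedded in it, independent analysis modules can read and extend each other's results without coordinating edits to a shared file.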
Added to index: 2010-12-22