Abstract
Using web standards, such as uniform resource identifiers (URIs), XML and HTTP, to name and describe resources that are not information objects is the key difference between the Web as we know it today and the Semantic Web. Naming and interlinking resources of this type by HTTP URIs (instead of individual constants in a formal language) is the key feature that distinguishes traditional knowledge representation from web-scale knowledge representation. However, this use of URIs has brought renewed attention to the old philosophical problem of identity and reference, now in a new form. In this paper, we analyze the new version of the problem, provide a formal model for dealing with it when interlinking knowledge on the Web, and argue for the need to distinguish between the use of URIs for describing and accessing resources and the use of URIs for fixing the reference. We show that in the current practice of linking data these roles are not clearly distinguished, and that this may cause unwanted effects and prevent some basic forms of data integration. We also discuss the role of an entity name system as a potential piece of infrastructure for fixing the reference in the Semantic Web.
Notes
See http://www.w3.org/2001/09/06-ecdl/slide17-0.html for a classical representation of the cake.
See http://richard.cyganiak.de/2007/10/lod/ for an up-to-date representation of the so-called linked data cloud.
RDF is the acronym of Resource Description Framework, the W3C standard for describing resources. The building blocks of RDF are the so-called “triples”, simple statements of the form subject-predicate-object (like “Paris – is the capital of – France”). A set of RDF triples can be seen as a graph, because the object of one triple may be the subject of another (e.g., “France – is part of – Europe”). For more technical information, see http://www.w3.org/RDF/.
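The graph reading of a set of triples can be illustrated with a minimal sketch. It assumes plain Python tuples and strings in place of real RDF terms and URIs; the helper names `outgoing` and `reachable` are hypothetical and not part of any RDF library.

```python
# Triples as (subject, predicate, object) tuples.
# Plain strings stand in for URIs here; this is an illustrative assumption.
triples = [
    ("Paris", "is the capital of", "France"),
    ("France", "is part of", "Europe"),
]


def outgoing(graph, node):
    """Return the (predicate, object) edges that leave a node."""
    return [(p, o) for s, p, o in graph if s == node]


def reachable(graph, start):
    """Collect every node reachable from start by following triples."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _, obj in outgoing(graph, node):
            if obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return seen


if __name__ == "__main__":
    # "France" is the object of the first triple and the subject of the
    # second, so "Europe" is reachable from "Paris".
    print(reachable(triples, "Paris"))
```

Because "France" appears both as an object and as a subject, the two statements chain into a single path, which is what makes a set of triples a graph rather than a flat list.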
DBpedia is a partial representation of Wikipedia in RDF, where triples are mainly extracted from the so-called infobox of any Wikipedia page.
Technically, the second method is a special case of the first, as making an owl:sameAs statement presupposes the ability to name resources defined in other graphs. However, the intended use can be quite different, so we maintain the distinction.
In the rest of the paper, we follow Gangemi and Presutti (2006, 2007) and Halpin and Presutti (2009) in using the distinction between web resources and non-web resources instead of the distinction between information resources and non-information resources, as proposed for example in http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial. The point is that, for example, even a non-digital book counts as an information resource, although it cannot be made directly accessible on the Web. Only digital objects can be accessed on the Web, and the notion of web resource captures the idea of a digital object accessible on the Web by dereferencing a URI. It is also important to note that the distinction between web and non-web resources is independent of the online/offline distinction: according to the original definition, an HTML page (or any kind of digital object) can be a web resource even if it happens not to be published on the Web. What matters is that it is suitable for becoming retrievable through a URI, not that it is actually retrievable.
This term was introduced by Kendall Grant Clark in http://www.xml.com/pub/a/2002/09/11/deviant.html in 2002.
See his article on Curing the Web’s Identity Crisis at http://www.ontopia.net/topicmaps/materials/identitycrisis.html.
The examples below and the terminology are taken from a very influential document on Cool URIs for the Semantic Web at http://www.w3.org/TR/cooluris.
Technically oriented readers may want to check http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html for an illustration of the solution proposed in the Technical Architecture Group of W3C on this issue.
See, for example, Tabulator http://dig.csail.mit.edu/2007/tab, Disco http://www4.wiwiss.fu-berlin.de/bizer/ng4j/disco and the OpenLink RDF browser http://demo.openlinksw.com/DAV/JS/rdfbrowser/index.html.
A well-known tutorial on how to do this can be found at http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial.
The OKKAM project has deployed an implementation of the ENS, which can be used through its APIs presented at http://api.okkam.org/. More documentation and a few showcases can be found at http://community.okkam.org/.
One would obtain the same result using the URI http://UNI2/resource/John in the query.
See the concept of “nearly same as” discussed by Pat Hayes in his ISWC2009 invited talk: http://www.slideshare.net/PatHayes/blogic-iswc-2009-invited-talk.
See Bouquet et al. (2009a) for a model of querying the Web of Data.
Remember that ENS URIs are HTTP URIs.
For the formal definition of this property see Bouquet et al. (2009b). Informally speaking, the property of ens:corefer holds between semantic URIs and ENS URIs and means that the semantic URI gives information about the non-web resource that is the bearer of the ENS URI. ens:corefer is irreflexive, asymmetric and intransitive. Clearly, if two semantic URIs stand in the ens:corefer property to the same ENS URI, that means that the two semantic URIs give information about one and the same non-web resource.
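The last point of this note can be sketched in a few lines. This is only an illustration under stated assumptions, not the ENS implementation: the statements are modelled as plain (semantic URI, ENS URI) pairs, and the ENS URIs under `ens.example.org`, the URI `http://UNI1/resource/John`, and the helper names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical ens:corefer statements as (semantic URI, ENS URI) pairs.
# All URIs below are illustrative assumptions.
corefer = [
    ("http://UNI1/resource/John", "http://ens.example.org/entity/1"),
    ("http://UNI2/resource/John", "http://ens.example.org/entity/1"),
    ("http://UNI2/resource/Mary", "http://ens.example.org/entity/2"),
]


def group_by_entity(statements):
    """Map each ENS URI to the set of semantic URIs that corefer to it."""
    groups = defaultdict(set)
    for semantic_uri, ens_uri in statements:
        groups[ens_uri].add(semantic_uri)
    return dict(groups)


def same_entity(statements, uri_a, uri_b):
    """True if both semantic URIs stand in ens:corefer to one ENS URI."""
    return any(
        uri_a in uris and uri_b in uris
        for uris in group_by_entity(statements).values()
    )


if __name__ == "__main__":
    print(same_entity(corefer,
                      "http://UNI1/resource/John",
                      "http://UNI2/resource/John"))
```

In this sketch no statement links two semantic URIs directly: that they describe the same non-web resource is inferred only from their shared ENS URI.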
Note that the entity lifecycle management mechanisms inside the ENS are designed to prevent the generation of two ENS URIs for one and the same entity.
This recalls Frege’s (Frege 1892) treatment of indirect contexts.
We stress that it is the pragmatic role of ENS URIs, i.e., the way in which and the purpose for which they are used, that distinguishes them from semantic URIs. ENS URIs are not used to make assertions about non-web resources but to gather all the information about such non-web resources that is published on the Web.
See http://sig.ma/.
References
Austin, J. L. (1962). How to do things with words: The William James lecture. Oxford: Oxford University Press.
Bouquet, P., Ghidini, C., & Serafini, L. (2009a). Querying the web of data: A formal approach. In: Y. Yu, A. Gómez-Pérez, & Y. Ding (Eds.), Proceedings of the 4th Asian Semantic Web Conference (ASWC2009), Shanghai, China, 7–9 December 2009. LNCS 5926. Berlin: Springer. pp. 291–305.
Bouquet, P., Palpanas, T., Stoermer, H., & Vignolo, M. (2009b). A conceptual model for a web-scale entity name system. In: Y. Yu, A. Gómez-Pérez, & Y. Ding (Eds.), Proceedings of the 4th Asian Semantic Web Conference (ASWC2009), Shanghai, China, 7–9 December 2009. LNCS 5926. Berlin: Springer. pp. 46–60.
Donnellan, K. S. (1966). Reference and definite descriptions. The Philosophical Review, 75. pp. 281–304.
Frege, G. (1892). Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik, 100. pp. 25–50.
Gangemi, A., & Presutti, V. (2006). Towards an OWL ontology for identity on the web. In: Semantic Web Applications and Perspectives (SWAP2006). http://CEUR-WS.org/Vol-201/43.pdf.
Gangemi, A., & Presutti, V. (2007). A grounded ontology for identity and reference of web resources. In: i3: Identity, identifiers, identification. Proceedings of the WWW2007 Workshop on Entity-Centric Approaches to Information and Knowledge Management on the Web, Banff, Canada, 8 May 2007. http://CEUR-WS.org/Vol-249/submission_71.pdf.
Halpin, H., & Presutti, V. (2009). An ontology of resources: Solving the identity crisis. In: L. Aroyo, et al. (Eds.), Proceedings of ESWC2009. Studies in logic and computation. Chichester: Research Studies Press/Wiley. pp. 121–140.
Kaplan, D. (1968). Quantifying in. Synthese, 19. pp. 178–214.
Kripke, S. (1980). Naming and necessity. Boston: Basil Blackwell.
Acknowledgements
This work is partially supported by the FP7 EU Large-Scale Integrating Project OKKAM—Enabling a Web of Entities (contract no. 215032). For more details, visit http://project.okkam.org/.
Cite this article
Bouquet, P., Stoermer, H. & Vignolo, M. Web of Data and Web of Entities: Identity and Reference in Interlinked Data in the Semantic Web. Philos. Technol. 25, 5–26 (2012). https://doi.org/10.1007/s13347-010-0011-6
DOI: https://doi.org/10.1007/s13347-010-0011-6