The value of any kind of data is greatly enhanced when it exists in a form that allows it to be integrated with other data. One approach to integration is through the annotation of multiple bodies of data using common controlled vocabularies or ‘ontologies’. Unfortunately, the very success of this approach has led to a proliferation of ontologies which itself creates obstacles to integration. The Open Biomedical Ontologies (OBO) consortium has set in train a strategy to overcome this problem. Existing OBO ontologies, including the Gene Ontology, are undergoing a process of coordinated reform, and new ontologies are being created on the basis of an evolving set of shared principles governing ontology development. The result is an expanding family of ontologies designed to be interoperable, logically well-formed, and to incorporate accurate representations of biological reality. We describe the OBO Foundry initiative, and provide guidelines for those who might wish to become involved.
The Ontology for Biomedical Investigations (OBI) is an ontology that provides terms with precisely defined meanings to describe all aspects of how investigations in the biological and medical domains are conducted. OBI re-uses ontologies that provide a representation of biomedical knowledge from the Open Biological and Biomedical Ontologies (OBO) project and adds the ability to describe how this knowledge was derived. We here describe the state of OBI and several applications that are using it, such as adding semantic expressivity to existing databases, building data entry forms, and enabling interoperability between knowledge resources. OBI covers all phases of the investigation process, such as planning, execution and reporting. It represents information and material entities that participate in these processes, as well as roles and functions. Prior to OBI, it was not possible to use a single internally consistent resource that could be applied to multiple types of experiments for these applications. OBI has made this possible by creating terms for entities involved in biological and medical investigations and by importing parts of other biomedical ontologies such as GO, Chemical Entities of Biological Interest (ChEBI) and Phenotype Attribute and Trait Ontology (PATO) without altering their meaning. OBI is being used in a wide range of projects covering genomics, multi-omics, immunology, and catalogs of services. OBI has also spawned other ontologies (Information Artifact Ontology) and methods for importing parts of ontologies (Minimum information to reference an external ontology term (MIREOT)). The OBI project is an open cross-disciplinary collaborative effort, encompassing multiple research communities from around the globe. To date, OBI has created 2366 classes and 40 relations along with textual and formal definitions.
The OBI Consortium maintains a web resource providing details on the people, policies, and issues being addressed in association with OBI.
Basic Formal Ontology (BFO) is a top-level ontology consisting of thirty-six classes, designed to support information integration, retrieval, and analysis across all domains of scientific investigation, presently employed in over 350 ontology projects around the world. BFO is a genuine top-level ontology, containing no terms particular to material domains, such as physics, medicine, or psychology. In this paper, we demonstrate how a series of cases illustrating common types of change may be represented by universals, defined classes, and relations employing the BFO framework. We discuss these cases to provide a template for other ontologists using BFO, as well as to facilitate comparison with the strategies proposed by ontologists using different top-level ontologies.
Ontologies are being used increasingly to promote the reusability of scientific information by allowing heterogeneous data to be integrated under a common, normalized representation. Definitions play a central role in the use of ontologies both by humans and by computers. Textual definitions allow ontologists and data curators to understand the intended meaning of ontology terms and to use these terms in a consistent fashion across contexts. Logical definitions allow machines to check the integrity of ontologies and reason over data annotated with ontology terms to make inferences that promote knowledge discovery. Therefore, it is important not only to include in ontologies multiple types of definitions in both formal and in natural languages, but also to ensure that these definitions meet good quality standards so they are useful. While tools such as Protégé can assist in creating well-formed logical definitions, producing good definitions in a natural language is still to a large extent a matter of human ingenuity supported at best by just a small number of general principles. For lack of more precise guidelines, definition authors are often left to their own personal devices. This paper aims to fill this gap by providing the ontology community with a set of principles and conventions to assist in definition writing, editing, and validation, by drawing on existing definition writing principles and guidelines in lexicography, terminology, and logic.
The Protein Ontology (PRO) provides terms for and supports annotation of species-specific protein complexes in an ontology framework that relates them both to their components and to species-independent families of complexes. Comprehensive curation of experimentally known forms and annotations thereof is expected to expose discrepancies, differences, and gaps in our knowledge. We have annotated the early events of innate immune signaling mediated by Toll-Like Receptor 3 and 4 complexes in human, mouse, and chicken. The resulting ontology and annotation data set has allowed us to identify species-specific gaps in experimental data and possible functional differences between species, and to employ inferred structural and functional relationships to suggest plausible resolutions of these discrepancies and gaps.
Definitions vary according to context of use and target audience. They must be made relevant for each context to fulfill their cognitive and linguistic goals. This involves adapting their logical structure, type of content, and form to each context of use. We examine from these perspectives the case of definitions in ontologies.
We are developing the Neurological Disease Ontology (ND) to provide a framework to enable representation of aspects of neurological diseases that are relevant to their treatment and study. ND is a representational tool that addresses the need for unambiguous annotation, storage, and retrieval of data associated with the treatment and study of neurological diseases. ND is being developed in compliance with the Open Biomedical Ontology Foundry principles and builds upon the paradigm established by the Ontology for General Medical Science (OGMS) for the representation of entities in the domain of disease and medical practice. Initial applications of ND will include the annotation and analysis of large data sets and patient records for Alzheimer’s disease, multiple sclerosis, and stroke.
Representing species-specific proteins and protein complexes in ontologies that are both human- and machine-readable facilitates the retrieval, analysis, and interpretation of genome-scale data sets. Although existing protein-centric informatics resources provide the biomedical research community with well-curated compendia of protein sequence and structure, these resources lack formal ontological representations of the relationships among the proteins themselves. The Protein Ontology (PRO) Consortium is filling this informatics resource gap by developing ontological representations and relationships among proteins and their variants and modified forms. Because proteins are often functional only as members of stable protein complexes, the PRO Consortium, in collaboration with existing protein and pathway databases, has launched a new initiative to implement logical and consistent representation of protein complexes. We describe here how the PRO Consortium is meeting the challenge of representing species-specific protein complexes, how protein complex representation in PRO supports annotation of protein complexes and comparative biology, and how PRO is being integrated into existing community bioinformatics resources. The PRO resource is accessible at http://pir.georgetown.edu/pro/.
Biological ontologies are used to organize, curate, and interpret the vast quantities of data arising from biological experiments. While this works well when using a single ontology, integrating multiple ontologies can be problematic, as they are developed independently, which can lead to incompatibilities. The Open Biological and Biomedical Ontologies Foundry was created to address this by facilitating the development, harmonization, application, and sharing of ontologies, guided by a set of overarching principles. One challenge in reaching these goals was that the OBO principles were not originally encoded in a precise fashion, and interpretation was subjective. Here we show how we have addressed this by formally encoding the OBO principles as operational rules and implementing a suite of automated validation checks and a dashboard for objectively evaluating each ontology’s compliance with each principle. This entailed a substantial effort to curate metadata across all ontologies and to coordinate with individual stakeholders. We have applied these checks across the full OBO suite of ontologies, revealing areas where individual ontologies require changes to conform to our principles. Our work demonstrates how a sizable federated community can be organized and evaluated on objective criteria that help improve overall quality and interoperability, which is vital for the sustenance of the OBO project and towards the overall goals of making data FAIR.
We have begun work on two separate but related ontologies for the study of neurological diseases. The first, the Neurological Disease Ontology (ND), is intended to provide a set of controlled, logically connected classes to describe the range of neurological diseases and their associated signs and symptoms, assessments, diagnoses, and interventions that are encountered in the course of clinical practice. ND is built as an extension of the Ontology for General Medical Science, a high-level candidate OBO Foundry ontology that provides a set of general classes that can be used to describe general aspects of medical science. ND is being built with classes utilizing both textual and axiomatized definitions that describe and formalize the relations between instances of other classes within the ontology itself as well as to external ontologies such as the Gene Ontology, Cell Ontology, Protein Ontology, and Chemical Entities of Biological Interest. In addition, references to similar or associated terms in external ontologies, vocabularies and terminologies are included when possible. Initial work on ND is focused on the areas of Alzheimer’s and other diseases associated with dementia, multiple sclerosis, and stroke and cerebrovascular disease. Extensions to additional groups of neurological diseases are planned. The second ontology, the Neuro-Psychological Testing Ontology (NPT), is intended to provide a set of classes for the annotation of neuropsychological testing data. The intention of this ontology is to allow for the integration of results from a variety of neuropsychological tests that assay similar measures of cognitive functioning. Neuropsychological testing is an important component in developing the clinical picture used in the diagnosis of patients with a range of neurological diseases, such as Alzheimer’s disease and multiple sclerosis, and following stroke or traumatic brain injury.
NPT is being developed as an extension to the Ontology for Biomedical Investigations.
How do we find what is clinically significant in the swarms of data being generated by today’s diagnostic technologies? As electronic records become ever more prevalent – and digital imaging and genomic, proteomic, salivaomics, metabolomics, pharmacogenomics, phenomics and transcriptomics techniques become commonplace – different clinical and biological disciplines are facing up to the need to put their data houses in order to avoid the consequences of an uncontrolled explosion of different ways of describing information. We describe a new strategy to advance the consistency of data in the dental research community. The strategy is based on the idea that existing systems for data collection in dental research will continue to be used, but proposes a methodology in which past, present and future data will be described using a consensus-based controlled structured vocabulary called the Ontology for Dental Research (ODR).
Monoclonal antibodies are essential biomedical research and clinical reagents that are produced by companies and research laboratories. The NIAID ImmPort (Immunology Database and Analysis Portal) resource provides a long-term, sustainable data warehouse for immunological data generated by NIAID, DAIT and DMID funded investigators for data archiving and re-use. A variety of immunological data is generated using techniques that rely upon monoclonal antibody reagents, including flow cytometry, immunofluorescence, and ELISA. In order to facilitate querying, integration, and reuse of data, standardized terminology for describing monoclonal antibody reagents and their targets needs to be used for annotating data submitted to ImmPort.
Vaccine research, as well as the development, testing, clinical trials, and commercial uses of vaccines, involve complex processes with various biological data that include gene and protein expression, analysis of molecular and cellular interactions, study of tissue and whole body responses, and extensive epidemiological modeling. Although many data resources are available to meet different aspects of vaccine needs, it remains a challenge to standardize vaccine annotation, integrate data about varied vaccine types and resources, and support advanced vaccine data analysis and inference. To address these problems, the community-based Vaccine Ontology (VO) has been developed through collaboration with vaccine researchers and many national and international centers and programs, including the National Center for Biomedical Ontology (NCBO), the Infectious Disease Ontology (IDO) Initiative, and the Ontology for Biomedical Investigations (OBI). VO utilizes the Basic Formal Ontology (BFO) as the top ontology and the Relation Ontology (RO) for definition of term relationships. VO is represented in the Web Ontology Language (OWL) and edited using Protégé-OWL. Currently VO contains more than 2000 terms and relationships. VO emphasizes classification of vaccines and vaccine components, vaccine quality and phenotypes, and host immune response to vaccines. These reflect different aspects of vaccine composition and biology and can thus be used to model individual vaccines. More than 200 licensed vaccines and many vaccine candidates in research or clinical trials have been modeled in VO. VO is being used for vaccine literature mining through collaboration with the National Center for Integrative Biomedical Informatics (NCIBI). Multiple VO applications will be presented.
This paper proposes a reformulation of the treatment of boundaries, fiat parts and aggregates of entities in Basic Formal Ontology. These are currently treated as mutually exclusive, which is inadequate for biological representation since some entities may simultaneously be fiat parts, boundaries and/or aggregates. We introduce functions which map entities to their boundaries, fiat parts or aggregations. We make use of time, space and spacetime projection functions which, along the way, allow us to develop a simple temporal theory.
Identification of non-coding RNAs (ncRNAs) has been significantly enhanced due to the rapid advancement in sequencing technologies. On the other hand, semantic annotation of ncRNA data lags behind their identification, and there is a great need to effectively integrate discovery from relevant communities. To this end, the Non-Coding RNA Ontology (NCRO) is being developed to provide a precisely defined ncRNA controlled vocabulary, which can fill a specific and highly needed niche in unification of ncRNA biology.
Identification of non-coding RNAs (ncRNAs) has been significantly improved over the past decade. On the other hand, semantic annotation of ncRNA data is facing critical challenges due to the lack of a comprehensive ontology to serve as common data elements and data exchange standards in the field. We developed the Non-Coding RNA Ontology (NCRO) to handle this situation. By providing a formally defined ncRNA controlled vocabulary, the NCRO aims to fill a specific and highly needed niche in semantic annotation of large amounts of ncRNA biological and clinical data.
Interoperability across data sets is a key challenge for quantitative histopathological imaging. There is a need for an ontology that can support effective merging of pathological image data with associated clinical and demographic data. To foster organized, cross-disciplinary, information-driven collaborations in the pathological imaging field, we propose to develop an ontology to represent imaging data and methods used in pathological imaging and analysis, and call it Quantitative Histopathological Imaging Ontology – QHIO. We apply QHIO to breast cancer hot-spot detection with the goal of enhancing reliability of detection by promoting the sharing of data between image analysts.
We describe the rationale for an application ontology covering the domain of human body fluids that is designed to facilitate representation, reuse, sharing and integration of diagnostic, physiological, and biochemical data. We briefly review the Blood Ontology (BLO), Saliva Ontology (SALO) and Kidney and Urinary Pathway Ontology (KUPO) initiatives. We discuss the methods employed in each, and address the project of using them as a starting point for a unified body fluids ontology resource. We conclude with a description of how the body fluids ontology initiative may provide support to basic and translational science.
Letter commenting on the paper: Barry Smith, Louis J. Goldberg, Alan Ruttenberg & Michael Glick, "Ontology and the Future of Dental Research Informatics", Journal of the American Dental Association 2010;141(10):1173-75, with responses by the authors of the paper.