Ontologies are being used increasingly to promote the reusability of scientific information by allowing heterogeneous data to be integrated under a common, normalized representation. Definitions play a central role in the use of ontologies both by humans and by computers. Textual definitions allow ontologists and data curators to understand the intended meaning of ontology terms and to use these terms in a consistent fashion across contexts. Logical definitions allow machines to check the integrity of ontologies and reason over data annotated with ontology terms to make inferences that promote knowledge discovery. Therefore, it is important not only to include in ontologies multiple types of definitions in both formal and natural languages, but also to ensure that these definitions meet good quality standards so they are useful. While tools such as Protégé can assist in creating well-formed logical definitions, producing good definitions in a natural language is still to a large extent a matter of human ingenuity supported at best by just a small number of general principles. For lack of more precise guidelines, definition authors are often left to their own devices. This paper aims to fill this gap by providing the ontology community with a set of principles and conventions to assist in definition writing, editing, and validation, by drawing on existing definition writing principles and guidelines in lexicography, terminology, and logic.
Identification of non-coding RNAs (ncRNAs) has been significantly improved over the past decade. On the other hand, semantic annotation of ncRNA data is facing critical challenges due to the lack of a comprehensive ontology to serve as common data elements and data exchange standards in the field. We developed the Non-Coding RNA Ontology (NCRO) to handle this situation. By providing a formally defined ncRNA controlled vocabulary, the NCRO aims to fill a specific and highly needed niche in semantic annotation of large amounts of ncRNA biological and clinical data.
Definitions vary according to context of use and target audience. They must be made relevant for each context to fulfill their cognitive and linguistic goals. This involves adapting their logical structure, type of content, and form to each context of use. We examine from these perspectives the case of definitions in ontologies.
The Protein Ontology (PRO) provides terms for and supports annotation of species-specific protein complexes in an ontology framework that relates them both to their components and to species-independent families of complexes. Comprehensive curation of experimentally known forms and annotations thereof is expected to expose discrepancies, differences, and gaps in our knowledge. We have annotated the early events of innate immune signaling mediated by Toll-Like Receptor 3 and 4 complexes in human, mouse, and chicken. The resulting ontology and annotation data set has allowed us to identify species-specific gaps in experimental data and possible functional differences between species, and to employ inferred structural and functional relationships to suggest plausible resolutions of these discrepancies and gaps.
The Ontology for Biomedical Investigations (OBI) is an ontology that provides terms with precisely defined meanings to describe all aspects of how investigations in the biological and medical domains are conducted. OBI re-uses ontologies that provide a representation of biomedical knowledge from the Open Biological and Biomedical Ontologies (OBO) project and adds the ability to describe how this knowledge was derived. We here describe the state of OBI and several applications that are using it, such as adding semantic expressivity to existing databases, building data entry forms, and enabling interoperability between knowledge resources. OBI covers all phases of the investigation process, such as planning, execution and reporting. It represents information and material entities that participate in these processes, as well as roles and functions. Prior to OBI, it was not possible to use a single internally consistent resource that could be applied to multiple types of experiments for these applications. OBI has made this possible by creating terms for entities involved in biological and medical investigations and by importing parts of other biomedical ontologies such as GO, Chemical Entities of Biological Interest (ChEBI) and Phenotype Attribute and Trait Ontology (PATO) without altering their meaning. OBI is being used in a wide range of projects covering genomics, multi-omics, immunology, and catalogs of services. OBI has also spawned other ontologies (Information Artifact Ontology) and methods for importing parts of ontologies (Minimum information to reference an external ontology term (MIREOT)). The OBI project is an open cross-disciplinary collaborative effort, encompassing multiple research communities from around the globe. To date, OBI has created 2366 classes and 40 relations along with textual and formal definitions.
The OBI Consortium maintains a web resource (http://obi-ontology.org) providing details on the people, policies, and issues being addressed in association with OBI. The current release of OBI is available at http://purl.obolibrary.org/obo/obi.owl.
Interoperability across data sets is a key challenge for quantitative histopathological imaging. There is a need for an ontology that can support effective merging of pathological image data with associated clinical and demographic data. To foster organized, cross-disciplinary, information-driven collaborations in the pathological imaging field, we propose to develop an ontology to represent imaging data and methods used in pathological imaging and analysis, and call it Quantitative Histopathological Imaging Ontology – QHIO. We apply QHIO to breast cancer hot-spot detection with the goal of enhancing reliability of detection by promoting the sharing of data between image analysts.
Identification of non-coding RNAs (ncRNAs) has been significantly enhanced by the rapid advancement in sequencing technologies. On the other hand, semantic annotation of ncRNA data lags behind their identification, and there is a great need to effectively integrate discoveries from relevant communities. To this end, the Non-Coding RNA Ontology (NCRO) is being developed to provide a precisely defined ncRNA controlled vocabulary, which can fill a specific and highly needed niche in unification of ncRNA biology.
Monoclonal antibodies are essential biomedical research and clinical reagents that are produced by companies and research laboratories. The NIAID ImmPort (Immunology Database and Analysis Portal) resource provides a long-term, sustainable data warehouse for immunological data generated by NIAID, DAIT and DMID funded investigators for data archiving and re-use. A variety of immunological data is generated using techniques that rely upon monoclonal antibody reagents, including flow cytometry, immunofluorescence, and ELISA. In order to facilitate querying, integration, and reuse of data, standardized terminology for describing monoclonal antibody reagents and their targets needs to be used for annotating data submitted to ImmPort.
We are developing the Neurological Disease Ontology (ND) to provide a framework to enable representation of aspects of neurological diseases that are relevant to their treatment and study. ND is a representational tool that addresses the need for unambiguous annotation, storage, and retrieval of data associated with the treatment and study of neurological diseases. ND is being developed in compliance with the Open Biomedical Ontology Foundry principles and builds upon the paradigm established by the Ontology for General Medical Science (OGMS) for the representation of entities in the domain of disease and medical practice. Initial applications of ND will include the annotation and analysis of large data sets and patient records for Alzheimer’s disease, multiple sclerosis, and stroke.
Vaccine research, as well as the development, testing, clinical trials, and commercial uses of vaccines, involves complex processes with various biological data that include gene and protein expression, analysis of molecular and cellular interactions, study of tissue and whole body responses, and extensive epidemiological modeling. Although many data resources are available to meet different aspects of vaccine needs, it remains a challenge to standardize vaccine annotation, integrate data about varied vaccine types and resources, and support advanced vaccine data analysis and inference. To address these problems, the community-based Vaccine Ontology (VO) has been developed through collaboration with vaccine researchers and many national and international centers and programs, including the National Center for Biomedical Ontology (NCBO), the Infectious Disease Ontology (IDO) Initiative, and the Ontology for Biomedical Investigations (OBI). VO utilizes the Basic Formal Ontology (BFO) as the top ontology and the Relation Ontology (RO) for definition of term relationships. VO is represented in the Web Ontology Language (OWL) and edited using Protégé-OWL. Currently VO contains more than 2000 terms and relationships. VO emphasizes classification of vaccines and vaccine components, vaccine quality and phenotypes, and host immune response to vaccines. These reflect different aspects of vaccine composition and biology and can thus be used to model individual vaccines. More than 200 licensed vaccines and many vaccine candidates in research or clinical trials have been modeled in VO. VO is being used for vaccine literature mining through collaboration with the National Center for Integrative Biomedical Informatics (NCIBI). Multiple VO applications will be presented.
How do we find what is clinically significant in the swarms of data being generated by today’s diagnostic technologies? As electronic records become ever more prevalent – and digital imaging and genomic, proteomic, salivaomics, metabolomics, pharmacogenomics, phenomics and transcriptomics techniques become commonplace – different clinical and biological disciplines are facing up to the need to put their data houses in order to avoid the consequences of an uncontrolled explosion of different ways of describing information. We describe a new strategy to advance the consistency of data in the dental research community. The strategy is based on the idea that existing systems for data collection in dental research will continue to be used, but proposes a methodology in which past, present and future data will be described using a consensus-based controlled structured vocabulary called the Ontology for Dental Research (ODR).
We describe the rationale for an application ontology covering the domain of human body fluids that is designed to facilitate representation, reuse, sharing and integration of diagnostic, physiological, and biochemical data. We briefly review the Blood Ontology (BLO), Saliva Ontology (SALO) and Kidney and Urinary Pathway Ontology (KUPO) initiatives. We discuss the methods employed in each, and address the project of using them as a starting point for a unified body fluids ontology resource. We conclude with a description of how the body fluids ontology initiative may provide support to basic and translational science.
Representing species-specific proteins and protein complexes in ontologies that are both human and machine-readable facilitates the retrieval, analysis, and interpretation of genome-scale data sets. Although existing protein-centric informatics resources provide the biomedical research community with well-curated compendia of protein sequence and structure, these resources lack formal ontological representations of the relationships among the proteins themselves. The Protein Ontology (PRO) Consortium is filling this informatics resource gap by developing ontological representations and relationships among proteins and their variants and modified forms. Because proteins are often functional only as members of stable protein complexes, the PRO Consortium, in collaboration with existing protein and pathway databases, has launched a new initiative to implement logical and consistent representation of protein complexes. We describe here how the PRO Consortium is meeting the challenge of representing species-specific protein complexes, how protein complex representation in PRO supports annotation of protein complexes and comparative biology, and how PRO is being integrated into existing community bioinformatics resources. The PRO resource is accessible at http://pir.georgetown.edu/pro/.
We have begun work on two separate but related ontologies for the study of neurological diseases. The first, the Neurological Disease Ontology (ND), is intended to provide a set of controlled, logically connected classes to describe the range of neurological diseases and their associated signs and symptoms, assessments, diagnoses, and interventions that are encountered in the course of clinical practice. ND is built as an extension of the Ontology for General Medical Science, a high-level candidate OBO Foundry ontology that provides a set of general classes that can be used to describe general aspects of medical science. ND is being built with classes utilizing both textual and axiomatized definitions that describe and formalize the relations between instances of other classes within the ontology itself as well as to external ontologies such as the Gene Ontology, Cell Ontology, Protein Ontology, and Chemical Entities of Biological Interest. In addition, references to similar or associated terms in external ontologies, vocabularies and terminologies are included when possible. Initial work on ND is focused on the areas of Alzheimer’s and other diseases associated with dementia, multiple sclerosis, and stroke and cerebrovascular disease. Extensions to additional groups of neurological diseases are planned. The second ontology, the Neuro-Psychological Testing Ontology (NPT), is intended to provide a set of classes for the annotation of neuropsychological testing data. The intention of this ontology is to allow for the integration of results from a variety of neuropsychological tests that assay similar measures of cognitive functioning. Neuro-psychological testing is an important component in developing the clinical picture used in the diagnosis of patients with a range of neurological diseases, such as Alzheimer’s disease and multiple sclerosis, and following stroke or traumatic brain injury.
NPT is being developed as an extension to the Ontology for Biomedical Investigations.
This paper proposes a reformulation of the treatment of boundaries, fiat parts and aggregates of entities in Basic Formal Ontology. These are currently treated as mutually exclusive, which is inadequate for biological representation since some entities may simultaneously be fiat parts, boundaries and/or aggregates. We introduce functions which map entities to their boundaries, fiat parts or aggregations. We make use of time, space and spacetime projection functions which, along the way, allow us to develop a simple temporal theory.