Abstract

Intelligent sensors should be interconnected seamlessly, securely, and in a trustworthy manner to enable automated high-level smart applications. Semantic metadata can provide the contextual information that supports these features, making it easier for machines and humans to process sensory data and achieve interoperability. This unique overview of sensor ontologies, organized according to the semantic needs of the layers of IoT solutions, can serve as a guideline for engineers and researchers interested in developing intelligent sensor-based solutions. The explored trends show that ontologies will play an even more essential role in interlinked IoT systems, as interoperability and the generation of controlled, linkable data sources should be based on semantically enriched sensory data.

1. Introduction

Internet of Things (IoT)-based systems are spreading at a fast pace with the promise to improve the quality of our lives [1, 2] and the efficiency of production systems [3]. Applications of IoT frequently perform data analysis and real-time predictive analytics [4] that require informative automated measurements. IoT-based sensor solutions attempt to support ubiquitous computing [5] and interoperability [6–8] by transforming low-level sensor data into high-level knowledge that is comprehensible to humans and machines [9].

The enrichment of raw sensory data is becoming more and more critical as rigorous management and stewardship of digital resources are preconditions that support knowledge discovery and innovation. Therefore, data-driven systems should ensure the properties of the FAIR guiding principles (Findability, Accessibility, Interoperability, and Reusability) [10].

Semantic modelling produces an explicit description of the meaning of data in a structured way by merging domain knowledge and context-relevant information with raw measured data [11]. Semantics includes ontologies, contexts, and structured metadata. As ontologies can describe problem-relevant knowledge [12–14] by answering a 4W1H question (what, where, when, who, and how) [15, 16], ontological modelling provides a flexible framework for knowledge management (Ontohub (https://ontohub.org) and DAML (http://www.daml.org/) list about 5536 and 282 ontologies, respectively).

Ontologies can enrich sensory data [17] and ensure interoperability by providing an abstraction layer [18].

Interoperability is one of the most significant challenges in an IoT smart environment, where different products, processes, and organizations are connected. The ontology-based development of IoT solutions can enable developers to obtain universal solutions that ensure the success of IoT. The development of these semantic models should follow the trends of IoT solutions. Collaborative IoT (C-IoT) is gaining ground, which further encourages interoperability [19]. By increasing the degree of interconnectedness, additional functionalities can be developed that outperform individual applications; e.g., smart cars can react based on shared information [20], and contextual information sharing significantly enhances the performance of assisted/autonomous driving algorithms [21]. With contextual information fusion, new types of knowledge can be extracted [22]. The smart interconnection of sensors, actuators, and knowledge elements enables the development of solutions required for smart city- and Cyber-Physical Systems (CPS)-type applications [23], in addition to edge-computing-based real-time autonomous decisions [24, 25]. As reasoning and decisions are based on data, one of the key enablers of these technologies is the semantic models that support the management of sensory measurements [26].

According to [27], three types of IoT ontologies can be distinguished:
(i) Device ontology that describes actuators and sensors based on their detailed characteristics.
(ii) Domain ontology that represents real-world physical concepts based on observations, measurements, and their high-level relations to each other.
(iii) Estimation ontology that describes the Quality of Service and provides the information needed for service composition.

Although IoT technologies [28] and the related implementation and installation methodologies [29], databases [30], requirements [31], privacy and security aspects [32], manufacturing-specific architectures [33], and communication standards [34] have already been overviewed, and frameworks for sensory data access, service discovery, architecture, and heterogeneity have also been presented [35], a detailed discussion of the semantic models of IoT solutions has yet to be provided.

Although some aspects of semantic sensor technologies have already been overviewed, a systematic review that follows the structure of IoT solutions is yet to be conducted. A historical study of the evolution of ontologies up until 2014 is presented in [36]. The possible methods of semantic annotation are overviewed in [37], which also compares high-level, application-oriented ontologies for context management. The contribution of the Open Geospatial Consortium (OGC) to semantic sensor networks is unquestionably significant, as most of the ontologies developed after 2012 are based on its concept [38].

The review is structured similarly to the ITU-T Y.2060 standard [39] that describes the IoT reference model and identifies the high-level requirements of IoT solutions. The sensor, device, network (which we refer to as the gateway), service and application support, and application layers of IoT systems are shown in Figure 1. The enabling technologies of these layers, like communication standards [1, 40], protocols [40, 41], high-level reasoning [42], and linked open data enrichment possibilities [43], are developing at a fast pace and are crying out for semantic model-based standardization, which is overviewed in this paper.

This work provides a unique overview of IoT semantics, covering both technologies and models, to support the design of sensor networks and IoT solutions. The review is structured according to these layers and presents their functionalities together with the details of the related ontologies. This structured breakdown provides insight and a starting point for research and development projects that consider semantic-based applications at any point in the process.

This systematic review is based on the examination of the literature in Google Scholar, Scopus, and Web of Science, following the PRISMA-P protocol [44]. The PRISMA-P (Preferred Reporting Items for Systematic reviews and Meta-Analyses for Protocols) workflow consists of a 17-item checklist intended to facilitate the preparation and reporting of a robust protocol for systematic reviews. In the following, only the main details of the process are given. The information sources were last fully queried in October 2018. As the main research question was how ontologies and semantic models can be utilized in the layers of IoT solutions, (“semantic model” OR “sensor ontology”) AND (“Internet of Things” OR IoT) were the search keywords, which resulted in approximately 750 papers.

The inclusion and eligibility criteria were based on how closely the publications are connected to semantic models and ontologies of sensors. As Figure 2 illustrates, the selection process was supported by a network analysis of the keywords. This meta-analysis was useful for combining data from different works and for retrieving the most important topics and their connections. Once a comprehensive list of abstracts had been grouped and reviewed, the papers appearing to meet the inclusion criteria were obtained and reviewed in full. The evolution of the technologies was also tracked, so the roots of the semantic models and standards are presented in addition to recent trends.

To maintain the focus on the semantic context, only the 162 most closely related publications are cited and discussed in this work. With the aim of minimizing bias, we included all the relevant ontologies that were identified as general or widely applied in a specific field. As a synthesis of the results, the extracted information was structured according to the layers of IoT solutions, which ensures the uniqueness and applicability of our work. The resultant overview can serve as a guideline for engineers interested in the development of easily linkable and compatible IoT solutions, as well as for researchers interested in finding worthwhile areas of research. The limitations of this review stem from its focused viewpoint. The assignment of the ontologies to the layers of IoT solutions is subjective. The new algorithms developed to support the extraction of information from semantic sensor data are not discussed, as the results of the fast-developing fields of semantic data analysis, big data, and linked data deserve a review of their own.

2. Semantic Representations of Sensory Data

As semantics plays a significant role in knowledge organization [45], it can support the enrichment of measurements and the extraction of knowledge from IoT systems. Figure 3 shows how semantic metadata, such as the context, the description of the sensor, and its configuration (e.g., its optimal range), improves the understanding of a single measurement. The following section provides an overview of how these ontologies have evolved, followed by how this approach should be applied in the design of the layers of IoT systems.
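As a concrete illustration of Figure 3, the following minimal Python sketch contrasts a bare reading with a semantically enriched record; the field names and identifiers are illustrative assumptions rather than terms taken from any particular ontology.

```python
# Minimal illustration of Figure 3: the same reading, raw versus semantically enriched.
# All field names and identifiers are illustrative assumptions.

raw_reading = 23.4  # all that a bare sensor reports

enriched_reading = {
    "value": 23.4,
    "unit": "Cel",                              # UCUM code for degree Celsius
    "observedProperty": "AirTemperature",
    "sensor": {
        "id": "urn:example:sensor:dht22-01",    # hypothetical identifier
        "type": "TemperatureSensor",
        "accuracy": 0.5,                        # +/- 0.5 degC, from the datasheet
        "optimalRange": [-40.0, 80.0],          # configuration: valid operating range
    },
    "featureOfInterest": "Room 204",            # spatial context
    "resultTime": "2018-10-05T14:48:00Z",       # temporal context (ISO 8601)
}

# A consumer that only sees raw_reading cannot tell what was measured, where, when,
# or how reliably; the enriched record answers the 4W1H questions directly.
print(enriched_reading["observedProperty"], enriched_reading["value"], enriched_reading["unit"])
```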

The evolution of sensor network ontologies is motivated by the problem of giving context to the measurements. The first pioneering applications of the OWL-encoded context ontology (CONON) already demonstrated that ontologies can support logic-based context reasoning [83] and can be used to develop context-aware applications, like the iMuseum [84]. Without enumerating these contextual ontologies, in the following we cover the evolution of the sensor network ontologies developed to support sensor description, measurement description, sensor state description, and sensor discovery. These ontologies are summarized in Table 1 and Figure 4. The most interesting properties of these base-level ontologies are discussed as follows:
(i) Avancha et al. described one of the first ontologies for sensors to define the conditions and expected behaviour of the sensor network [46].
(ii) Pedigree Ontology handles service-level information of different sensors (e.g., magnetic, acoustic, electro-optical, etc.) [50].
(iii) Sensor Web for Autonomous Mission Operations (SWAMO) enables the dynamic interoperability of sensor webs and describes autonomous agents for system-wide resource sharing, distributed decision-making, and autonomous operations. NASA uses this ontology during stellar missions. SWAMO is based on the Sensor Model Language (SensorML) [51] and uses the Unified Code for Units of Measure (UCUM) [85] to describe measurements.
(iv) WIreless Sensor Networks Ontology (WISNO) is a simple proof of concept of how to use the Web Ontology Language (OWL) and the Semantic Web Rule Language (SWRL), built upon IEEE 1454.1 and SensorML [52].
(v) Device-Agent Based Middleware Approach for Mixed Mode Environments (A3ME) describes environments with different dimensions of heterogeneity based on the Foundation for Intelligent Physical Agents (FIPA) device ontology, OntoSensor, and the SOPRANO context ontology [53, 86].
(vi) The SEEK (http://seek.ecoinformatics.org/) Extensible Observation Ontology (OBOE) enables automated data merging and discovery and is encoded in the Web Ontology Language Description Logic (OWL-DL) [54].
(vii) ISTAR ontology describes tasks, sensors, and deployment platforms to support automated task allocation. An interface to the physical sensor environment allows instantaneous sensor configuration [55].
(viii) CSIRO Sensor Ontology: the Commonwealth Scientific and Industrial Research Organisation (CSIRO) published a number of ontologies that can be used in data integration, search, and workflow management [56, 57].
(ix) OntoSensor is based on the Suggested Upper Merged Ontology (SUMO) by the Institute of Electrical and Electronics Engineers (IEEE), ISO 19115 by the International Organization for Standardization (ISO), and SensorML. Although the ontology is no longer updated, it can serve as a good starting point for further developments, as it supports the discovery, processing, and analysis of sensor measurements and the geolocation of observed values, and it contains an explicit description of the process by which an observation was obtained [58–60].
(x) Ontonym-Sensor covers the core concepts of location, people, time, event, and sensing. Ontonym-Sensor contains eight classes that provide a high-level description of a sensor and its capabilities (frequency, coverage, accuracy, and precision pairs) in addition to a description of sensor observations (observation-specific information, metadata, sensor, timestamp, the time period over which the value is valid, and the rate of change) [61].
(xi) SENSEI O&M is the metadata annotation assigned to a gateway, which receives raw data and wraps the value with annotations taken from a template (i.e., a semantic model) so that the annotated data can be transmitted to information subscribers [62].
(xii) W3C Semantic Sensor Network (SSN) Ontology (https://w3c.github.io/sdw/ssn/) is the first W3C standard in this field. The Semantic Sensor Networks Incubator Group (SSN-XG) introduced the Stimulus-Sensor-Observation (SSO) pattern [56]. The three parts of SSO are the stimulus, which deals with the observed property; the sensor, which transforms the incoming stimulus into a digital representation; and the observation, which connects the stimulus to the sensor and gives a symbolic representation of the phenomenon, yielding the beginning of contexts [65]. Ontology design patterns are useful resources and design methods for pattern-matching algorithms, visualizations, reasoning, and knowledge-base creation [87]. SSN does not provide the facilities for abstraction, categorization, or reasoning offered by semantic technologies [56].
(xiii) WSSN ontology extends the SSN ontology by describing the context and communication policy of the nodes. The need for this ontology emerged from low-energy nodes and their unsolved data stream management, which WSSN addresses by implementing communication policies directly in the ontology [70].
(xiv) Coastal Environmental Sensing Networks (CESN) is built on Marine Metadata Interoperability (MMI), SensorML, and CSIRO and provides sensor types, a description logic (DL), and a rule-based reasoning engine to make inferences about measurement anomalies [64].
(xv) DOLCE Ultra Light (DUL) is a lightweight version of the Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE) and distinguishes between physical, temporal, and abstract qualities (http://www.loa.istc.cnr.it/ontologies/DUL.owl) [88]. DUL is built upon the W3C SSN-XG ontology, so the Stimulus-Sensor-Observation (SSO) ontology design pattern is also implemented and followed. DOLCE Ultra Light is a stimulus-centred extension of an ontology design pattern [65]. DUL can either be used directly, e.g., for Linked Sensor Data, or be integrated into more complex ontologies as a common ground for alignment, matching, translation, or interoperability in general.
(xvi) Semantic Sensor Grids for Rapid Application Development for Environmental Management (SemSorGrid4Env) was introduced for the prediction of flood emergencies [66]. The ontology is divided into four layers: ontologies of specific fields, information ontologies, upper ontologies, and external ontologies. The layers meet different requirements concerning knowledge representation.
(xvii) Sensor Web Resources Ontology for Atmospheric Observation (SWROAO) adds location taxonomies to sensor data for atmospheric observations [67].
(xviii) Sensor Core Ontology (SCoreO) extends the SSN ontology with modules such as the component module, service module, and context module. In the context module, three important classes are added: space, time, and theme [68].
(xix) Sensor Measurement Ontology (SenMESO) automatically converts heterogeneous sensor measurements into semantic data [89].
(xx) Wireless Semantic Sensor Network (WSSN) ontology is an extension of SSN with sensor node state descriptors. WSSN uses a Stimulus-WSNnode-Communication (SWSNC) ontology design pattern that treats the stimulus as the starting point of any process and the trigger of sensor or communication equipment [56].
(xxi) Sensor Data Ontology was created to support the search for relevant sensor data in distributed and heterogeneous sensor networks. The ontology utilizes the Suggested Upper Merged Ontology (SUMO), a sensor hierarchy subontology that describes sensors and sensor data, and a sensor data subontology that describes the context of sensory data with respect to spatial and/or temporal observations [90].
(xxii) National Institute of Standards and Technology (NIST) ontology is also based on SSN. The NIST ontology describes the detailed dimensions, weights, and resolution of the sensors; the abilities of the system; and the sensor network in manufacturing environments [71].
(xxiii) Agencia Estatal de Meteorología (AEMET) ontology is used for the meteorological forecasts of the Spanish Meteorological Office. As the ontology follows the Linked Data concept, the measurements are easily transformable into linked data [72].
(xxiv) Sensor Cloud Ontology (SCloudO) is another extension of SSN, with the aim of drawing up a semantic description of the sensor data in the sensor cloud [73].
(xxv) IoT-Lite (https://www.w3.org/Submission/2015/SUBM-iot-lite-20151126/) allows the representation and use of IoT platforms without consuming an excessive amount of processing time when querying the ontology. IoT-Lite describes IoT concepts using three classes: objects, systems or resources, and services. IoT-Lite is focused on sensing, and it is suitable for dynamic environments thanks to its real-time sensor discovery functionality [74, 75].
(xxvi) MyOntoSens details the measurement process, including inputs, outputs, description, calibration, drift, latency, the unit of measurement, and precision [76].
(xxvii) Sensor description in context-awareness systems: the novelty of this ontology is that machines can identify different sensors according to their processing capabilities marked in the ontology [78].
(xxviii) Dynamic ontology-based sensor binding: interestingly, while most ontologies keep extending and adding information to increase the availability of data, this ontology works the other way around; it subtracts from existing ontologies, e.g., OntoSensor [79].
(xxix) Smart Onto Sensor is an ontology for smartphone-based sensors, built on SSN and SensorML [80]. The Multimedia Semantic Sensor Network Ontology (MSSN-Onto) can effectively model multimedia sensor networks (streams of audio and video) and multimedia data, define complex events, and also provide an event querying engine for multimedia sensor networks [81].
(xxx) Sensor, Observation, Sample, and Actuator (SOSA) ontology is a lightweight, event-centric ontology built on top of SSN [82].

Ontologies are continually evolving, providing ever more room for reasoning and simplification. Special, application-oriented ontologies emerge and are integrated into standards. Lightweight and highly extendable ontologies also support the development of tailored applications and convertibility between formats, in addition to information transfer between applications.

The forerunner of this trend is the SOSA ontology thanks to its linked data-like structure.

3. Layer-Wise Overview of Semantic Models of Measurement Systems

3.1. The Sensor/Device Layer

The sensor layer is commonly referred to as the observation and measurement (O&M) level [56]. The World Wide Web Consortium Semantic Sensor Network (W3C SSN) ontology contains most of the annotations needed for describing sensors and observations. The SOSA-extended SSN ontology, the current version of SSN, is visualized in Figure 5 and can be considered one of the most general frameworks for semantic sensor models. The most crucial functionalities of this layer are the following [91, 92]:
(i) Description of the sensor enables remote configuration and asset management as well as the discovery of the sensor. A description of the sensors may contain physical characteristics (e.g., interface, energy source, etc.), hierarchy, deployment, and manufacturer information.
(ii) Description of the measurement annotates and clarifies the data of the measurement by defining its unit, value, and the measurement process. The Unified Code for Units of Measure (UCUM) [85] is widely used to describe measurements (http://ontolog.cim3.net/cgi-bin/wiki.pl?UoM); a minimal sketch of such an annotated observation follows this list.
(iii) Sensor discovery is challenging in dynamic IoT systems [93]. Sensor discovery is standardized by link-layer-level protocols, like the Link Layer Discovery Protocol (LLDP) that corresponds to the standard IEEE 802.1AB, Web Service Deployment Descriptors (WSDD) (http://docs.oasis-open.org/ws-dd/discovery/1.1/wsdd-discovery-1.1-spec.html), the Bonjour configuration network (https://developer.apple.com/bonjour/), or the Simple Service Discovery Protocol (SSDP) as part of the Universal Plug and Play (UPnP) set of protocols (http://quimby.gnus.org/internet-drafts/draft-cai-ssdp-v1-03.txt).
(iv) Sensor state description plays a key role in the operation, configuration, and maintenance of the device.
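Under the assumption that the rdflib package is available, the sketch below shows how such an annotated observation and the corresponding sensor description could be expressed with the published W3C SOSA vocabulary; the sensor, property, and feature identifiers are hypothetical.

```python
# A minimal sketch of a SOSA-style observation and sensor description built with rdflib.
# Only the sosa: terms follow the published W3C vocabulary; all ex: identifiers are made up.
from rdflib import Graph, Literal, Namespace, RDF, XSD

SOSA = Namespace("http://www.w3.org/ns/sosa/")
EX = Namespace("http://example.org/iot/")   # assumption: our own namespace

g = Graph()
g.bind("sosa", SOSA)
g.bind("ex", EX)

# The observation: simple result, observed property, feature of interest, and time
obs = EX["observation/42"]
g.add((obs, RDF.type, SOSA.Observation))
g.add((obs, SOSA.madeBySensor, EX["sensor/dht22-01"]))
g.add((obs, SOSA.observedProperty, EX["property/AirTemperature"]))
g.add((obs, SOSA.hasFeatureOfInterest, EX["room/204"]))
g.add((obs, SOSA.hasSimpleResult, Literal(23.4, datatype=XSD.double)))
g.add((obs, SOSA.resultTime, Literal("2018-10-05T14:48:00Z", datatype=XSD.dateTime)))

# The sensor description that enables discovery and remote asset management
sensor = EX["sensor/dht22-01"]
g.add((sensor, RDF.type, SOSA.Sensor))
g.add((sensor, SOSA.observes, EX["property/AirTemperature"]))

print(g.serialize(format="turtle"))
```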

Communication technologies and strategies are hard to assign to a single layer. As a detailed discussion would lead far away from semantic technologies, we remain very brief on this topic. Semantic technologies are often tailored seamlessly into other technologies; here we only mention the three main communication strategies: the publish-subscribe protocol MQTT, the web transfer protocol CoAP, and streaming. Semantics is typically tailored to these technologies in the form of context-based access control.

3.2. Infrastructure Layers

The infrastructure layers are used to manage the complexity and heterogeneity of sensor networks. These layers are separated into a gateway layer, which enables heterogeneous duplex communication, and a service and application support (middleware) layer, which serves as an interface towards the applications and enables the following advanced functionalities:
(i) Context management corresponds to characterizing and tagging situations, objects, places, or persons, which plays a key role in authorization and adapts the operation to suit the environment [94]. The context may vary across application areas; however, the operation is the same in every application, namely, context acquisition, modelling, reasoning, and distribution [95]. Context acquisition is expressed in the form of five factors: (1) the process of acquisition, (2) frequency, (3) responsibility, (4) sensor type, and (5) source [96]. These five factors also correspond to the previously described ontologies. One interesting factor, namely, responsibility, was not taken into account on the lower layer. The most applicable question is: “What does a sensor measure?” Context modelling is defined as the context representation that assists in the understanding of the properties, relationships, and details of the measured objects. High-level reasoning will be discussed in the application layer. An emerging and important part of interoperability is security and privacy measures, which are often associated with severe bottlenecks in the information flow; however, common techniques like temporal, spatial, risk-based, and event-based roles are already in use [97].
(ii) Complex Event Processing (CEP) and stream reasoning merge observations into complex events, often determined by multiple data sources. CEP systems aim to process data efficiently and immediately recognize interesting situations when they occur. Context management and context handling are crucial with regard to CEP [98]. CEP not only delivers information in the form of events from providers to consumers but also supports the detection of relationships between events and discovers temporal event correlation rules, referred to as event patterns [99] (a minimal pattern-detection sketch follows this list). As IoT supports the modular development of real-time solutions, so do CEP and stream reasoning [100]. Current high-level stream processors that can derive information for the system upon the arrival or nonarrival of data are C-SPARQL, CQELS, and ACEIS [101]. With special architectures and techniques, machine learning-based solutions can also be implemented [102].
(iii) Service configuration is key to the vision of autonomic computing [103, 104]. IBM stated that systems have to be self-managed, which means self-configuration, self-optimization, self-protection, and self-healing. The vision is still far off, but remote configuration and remote access are already in place in the form of Connected Device Platforms (CDPs), which ensure that devices and sensors can be easily connected and remotely managed [105].
(iv) Interoperability allows the automated detection of services and devices using novel tools like Semantic Markup for Web Services (OWL-Siot) [106] and Multistage Semantic Service Matching [107], hides the details of diversity by channelling services through a simple, single application programming interface (API) layer, and ensures security and privacy functionalities [108].
(v) Device management functionalities include device provisioning, device configuration, software module management, lifecycle management operations (e.g., install, update, and uninstall), and fault management [109].
(vi) Software-defined networking (SDN) enables the management of networked assets through an integrated interface.
(vii) Data storage mostly relies on graph databases, parallel databases, key-value stores, wide-column stores, document stores, etc. The key is always speed and scalability, as IoT ecosystems tend to continuously add sensors and more and more end nodes. Proper data storage manages both the individual measurement points and the semantic descriptions. A taxonomy of metadata stores is presented in Figure 7.
(viii) Notification: the SWE Sensor Alert Service (SAS) is a standard web service interface for publishing and subscribing to alerts from sensors [91].
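To make the CEP idea tangible, the toy Python sketch below detects a simple event pattern over a stream of observations; real deployments would delegate this to engines such as C-SPARQL, CQELS, or ACEIS, and the threshold, window size, and event name used here are assumptions.

```python
# Toy complex-event-processing sketch: raise a complex event whenever WINDOW
# consecutive readings exceed THRESHOLD. Both constants are assumed values.
from collections import deque

WINDOW = 3
THRESHOLD = 30.0  # assumed alert level in degrees Celsius

def detect_overheat(stream):
    """Yield an 'Overheat' event for every run of WINDOW consecutive readings above THRESHOLD."""
    window = deque(maxlen=WINDOW)
    for timestamp, value in stream:
        window.append((timestamp, value))
        if len(window) == WINDOW and all(v > THRESHOLD for _, v in window):
            yield {"event": "Overheat", "since": window[0][0], "until": timestamp}

readings = [("t1", 24.0), ("t2", 31.0), ("t3", 32.5), ("t4", 33.0), ("t5", 29.0)]
for event in detect_overheat(readings):
    print(event)   # {'event': 'Overheat', 'since': 't2', 'until': 't4'}
```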

3.2.1. The Gateway/Network Layer

The gateway shares services [110] and serves as the bridge between sensor networks and conventional communication networks [109], so the gateway layer possesses the following functionalities:
(i) The provision of positional information requires more capable hardware, so gateways, as merging points that have information about the peripheral nodes, can perform positioning based on Global Positioning System (GPS) sensors, IP geolocation [111], or indoor positioning systems [112, 113].
(ii) The provision of temporal attributes is not trivial, as passive and very low energy devices cannot provide temporal attributes by themselves, so this task should be performed at the gateway layer. It can be based on ISO 8601, the standard for the representation of dates and times; on a time model [114] that describes and defines seven relationships between intervals (duration, starts, finishes, before, after, meets, and equals); or on the OWL Time vocabulary (https://www.w3.org/TR/owl-time/), used to express relations between instants and intervals together with information about durations, dates, and times (a minimal sketch follows this list).
(iii) Low-level ontology alignment is the first step towards domain knowledge, as it provides a merging point towards upper-level or domain ontologies. Gateways have much more energy than devices; therefore, the first energy-demanding semantic processing can take place at this point.
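The following minimal sketch illustrates the gateway-side provision of temporal attributes: it stamps an incoming reading with an ISO 8601 time and evaluates a few of the interval relations listed above; the function and field names are our own simplifications for illustration.

```python
# Gateway-side sketch: attach an ISO 8601 timestamp to a reading from a clockless device
# and evaluate a few of the interval relations (before, meets, equals) named above.
from datetime import datetime, timezone

def stamp(reading: dict) -> dict:
    """Attach an ISO 8601 UTC timestamp at the gateway (the device itself has no clock)."""
    reading["resultTime"] = datetime.now(timezone.utc).isoformat()
    return reading

def before(a, b):   # interval a ends strictly before interval b starts
    return a[1] < b[0]

def meets(a, b):    # interval a ends exactly where interval b starts
    return a[1] == b[0]

def equals(a, b):   # the intervals share both endpoints
    return a == b

print(stamp({"value": 23.4}))                       # the reading now carries a resultTime

# Intervals expressed as (start, end) pairs of ISO 8601 strings
a = ("2018-10-05T14:00:00Z", "2018-10-05T14:10:00Z")
b = ("2018-10-05T14:10:00Z", "2018-10-05T14:20:00Z")
print(before(a, b), meets(a, b), equals(a, b))      # False True False
```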

3.2.2. Service and Application Support Layer (Middleware)

From an architectural design pattern point of view, the service and application support layer is a middleware, seen as the core part of the backend and also referred to as the data handler; it can also be imagined as a classic Extract-Transform-Load (ETL) component. Operationally, middleware offers some interesting ideas to keep up with scalability, as-is data handling, interoperability, data propagation, etc. Semantics is often criticized because it can be a severe performance bottleneck [115]. The patterns presented in this section, however, gain speed, security, and confidence from semantics.

The middleware layer aggregates heterogeneous sensory data for the application layer [121]. Figure 6 illustrates that design patterns can be based on context-, device-, data-, and application-oriented approaches. Table 2 summarizes these approaches based on their semantic, computational, and storage demands. Event-based as well as service- and database-oriented approaches are founded on specific semantic models, as shown below:
(i) Event-based middleware should handle event specification metadata and event processing rules [122]. The SenaaS ontology [116] supports the creation of events from observations.
(ii) The service-oriented design approach facilitates dynamic, component-based application development [123]. Therefore, service-oriented ontologies should attach service properties to measurements, in the same way that the Distributed Internet-like Architecture for Things (DIAT) supports autonomous data collection and processing and provides contextual inference for common device management, as well as situational awareness for minimal human intervention and zero configuration [124].
(iii) Database-oriented middleware architectures utilize distributed databases [120] and standardized query languages (e.g., SPARQL, Structured Query Language (SQL), Cypher, etc.). The straightforward approach is to collect the data in a standardized way and immediately store it in its raw form. An important consideration is that most sensors transmit continuous data, also referred to as streaming data. Two primary solutions are available to store streaming data:
(1) Hardcoded method, where the correspondence between the streaming data and the ontology is defined in programs. The different fields and formats of streaming data require different types of coding, and the streaming data are continuously written into RDF files according to the program code. The hardcoded method is lightning fast once written, but every single entity must be coded by hand, which makes it very rigid; extending the flow or applying another ontology requires the mapping to be created from scratch. The object-oriented variant involves the use of an object-relational mapper (ORM) (e.g., Entity Framework by Microsoft [125] and Hibernate for Java [126]).
(2) Mapping language method, where mapping languages (e.g., D2RQ [73, 127], R2O [128], R2RML [128–130], and Ontop [131]) provide a set of well-defined semantic primitives to describe the mapping relationships. The mapping between the streaming data and the ontology can be added manually or generated automatically by programming.
Alternative approaches exist to reduce the coding volume. An example of mapping messages from the widely used publish/subscribe messaging protocol Message Queuing Telemetry Transport (MQTT) directly into JavaScript Object Notation for Linked Data (JSON-LD) (https://github.com/w3c/json-ld-syntax) with Grok patterns through the Logstash processing pipeline (https://www.elastic.co/products/logstash) is shown in [132]; a simplified sketch of such a JSON-LD payload follows this list.
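As a much-simplified stand-in for such a pipeline, the sketch below publishes a JSON-LD-annotated observation over MQTT with the paho-mqtt client; the broker address, topic, and identifiers are assumptions.

```python
# Publish a JSON-LD-annotated observation over MQTT (a simplified stand-in for the
# Grok/Logstash pipeline of [132]). Broker, topic, and identifiers are assumptions;
# requires the paho-mqtt package (1.x-style client construction shown here).
import json
import paho.mqtt.client as mqtt

payload = {
    "@context": {"sosa": "http://www.w3.org/ns/sosa/"},
    "@type": "sosa:Observation",
    "sosa:hasSimpleResult": 23.4,
    "sosa:resultTime": "2018-10-05T14:48:00Z",
    "sosa:madeBySensor": {"@id": "urn:example:sensor:dht22-01"},
}

client = mqtt.Client()                        # paho-mqtt 2.x additionally needs a callback API version argument
client.connect("broker.example.org", 1883)    # hypothetical broker
client.publish("building/room204/temperature", json.dumps(payload))
client.disconnect()
```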

As has been presented, well-designed infrastructure layers should focus on interoperability. For this purpose, multiple semantic techniques can be used, e.g., ontology alignment, multiple merging points, gateways, and special architectures.

3.3. The Application Layer

Successful applications require that all the needed atomic data fragments are at hand and semantically described, together with all the rules needed to draw conclusions from the data. Therefore, the challenges of ontologies in the application layer are the reuse of domain knowledge, cross-domain knowledge interlinking, and reasoning. This is why managing domain knowledge is still very significant. The following tools have been used for domain knowledge referencing:
(i) Ontology extension is a commonly used technique for building new ontologies, starting from a fixed, often standard, base ontology [133]. In Section 2, we presented how ontologies are mostly derived from W3C standards.
(ii) High-level ontology alignment: the goal of this technique is to find correspondences between two or more ontologies [134]. Many tools and ready-to-use frameworks exist, as presented in ontology mapping reviews [134, 135] (a naive label-matching sketch follows this list).
(iii) Ontology merging, where the goal is to combine two or more ontologies into one [134]. This process requires in-depth knowledge of both ontologies to avoid the duplication of stored data and to create deep interconnections between logical elements.
(iv) Ontology and dataset catalogues supported by semantic search engines form one group of tools for mapping data and its meaning from multiple sources.
(v) Manual referencing is the most tiring tool in this list, where every ontology correspondence is described and matched by hand [136].
(vi) Knowledge graphs are unstructured or semistructured linked data used to handle whole-domain and interdomain knowledge [137].
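The deliberately naive sketch below illustrates label-based ontology alignment; production alignment frameworks, as the cited mapping reviews show, combine far richer lexical, structural, and logical evidence, and the class names and similarity threshold used here are assumptions.

```python
# Naive label-based ontology alignment: match the classes of two small ontologies by
# normalized label similarity. Class names and the 0.8 threshold are assumptions.
from difflib import SequenceMatcher

ontology_a = ["TemperatureSensor", "HumiditySensor", "ObservationValue"]
ontology_b = ["Temperature_Sensor", "Humidity", "SensorOutput"]

def normalize(label: str) -> str:
    return label.replace("_", "").replace("-", "").lower()

def align(classes_a, classes_b, threshold=0.8):
    """Return (a, b, score) tuples for every pair whose label similarity reaches the threshold."""
    matches = []
    for a in classes_a:
        for b in classes_b:
            score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
            if score >= threshold:
                matches.append((a, b, round(score, 2)))
    return matches

print(align(ontology_a, ontology_b))   # [('TemperatureSensor', 'Temperature_Sensor', 1.0)]
```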

As the information provided by the semantic enrichment of sensor data can be processed without requiring any knowledge of the sensor system [91], the application layer can efficiently process interpretable, contextualized high-level abstractions of sensory data [138] to generate situation-based notifications and decisions [38]. Based on this concept, the ontologies of the application layer should handle information related to the following functionalities:
(i) Situation awareness fuses information to make better decisions. Event prediction and Human Activity Recognition (HAR) are special types of situation awareness [139, 140]. The basic ontologies that support situation awareness are the Semantic Web Rule Language Temporal Ontology (SWRLTO) [141] and the Temporal Abstractions Ontology (TAO) [142]. TAO is designed to capture semantic temporal abstractions, while SWRLTO provides temporal modelling and reasoning. The most well-known ontologies built upon them that support situation awareness are the following:
(a) Situation Awareness Core (SAW-CORE) ontology formalizes the knowledge representation of objects, relations, and their temporal evolution, which leads to good performance. The basic elements of SAW-CORE are SituationObjects and Relations; every SituationObject consists of Attributes that participate in Relations, and the attributes trigger the Relations to be true or false [143].
(b) Standard Ontology for Ubiquitous and Pervasive Applications (SOUPA) provides a spatial-temporal representation to cover concepts like time instants (TimeInstant), intervals (TimeInterval), movable spatial things (SpecialTemporalThing), and geographical entities [144].
(c) Event-Model-F supports the participation of objects in events, the temporal duration of events, as well as relations between events in mereological (events being based upon each other), causal, and correlational forms [145].
(d) SNAP/SPAN, where SNAP and SPAN handle relations between spatial and temporal events, respectively [146].
(ii) Tracking is a frequent IoT application problem. Besides status information, the I2oTonology ontology also handles accessibility and compatibility issues of the objects [147].
(iii) High-level queries enrich query results from cloud sources like the Linked Open Data cloud (https://lod-cloud.net/), resulting in the Semantic Web of Things (SWoT) [42]. The two main technologies for high-level queries are semantic queries based on variants of SPARQL [36] and Ontology-Based Data Access (OBDA) [148] (a minimal SPARQL sketch follows this list). Access control technologies will become more important in the future [149]. Contextual conditions bring new challenges to context-sensitive access control, as context information will play a crucial role in dynamically changing environments [150]. As a promising example, an ontology-based approach that captures such contextual conditions and incorporates them into the policies, utilizing ontology languages and fuzzy logic-based reasoning, has recently been presented [151].
(iv) IoT streaming data integration and analysis by machine learning techniques are gaining interest, as most streaming decision models should run in resource-aware environments and detect and react to changes in the environment, and many organizations need to deal with massive datasets in different formats coming from multiple sources. Best practices for the performance assessment of such machine learning models are given in [152], while the challenges of IoT streaming data integration are summarized in [153].
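The sketch below illustrates such a high-level query as a SPARQL filter over a handful of synthetic SOSA observations using rdflib; in a deployed system, the same query would typically be issued against a triple store or an OBDA endpoint.

```python
# A minimal high-level query: select all SOSA observations whose result exceeds a threshold.
# The data below is synthetic; in practice the query would run against a triple store.
from rdflib import Graph

turtle_data = """
@prefix sosa: <http://www.w3.org/ns/sosa/> .
@prefix ex:   <http://example.org/iot/> .

ex:obs1 a sosa:Observation ; sosa:hasSimpleResult 21.5 ; sosa:madeBySensor ex:s1 .
ex:obs2 a sosa:Observation ; sosa:hasSimpleResult 34.2 ; sosa:madeBySensor ex:s1 .
ex:obs3 a sosa:Observation ; sosa:hasSimpleResult 36.0 ; sosa:madeBySensor ex:s2 .
"""

g = Graph()
g.parse(data=turtle_data, format="turtle")

query = """
PREFIX sosa: <http://www.w3.org/ns/sosa/>
SELECT ?obs ?result ?sensor WHERE {
    ?obs a sosa:Observation ;
         sosa:hasSimpleResult ?result ;
         sosa:madeBySensor ?sensor .
    FILTER (?result > 30.0)
}
"""

for obs, result, sensor in g.query(query):
    print(obs, result, sensor)   # obs2 and obs3 exceed the threshold of 30
```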

The development of cloud computing platforms should focus on ensuring scalability and reliability [154, 155], device and data management, monitoring [156, 157], information arrival rate, and flexibility [158]. General IoT platforms, like Google Cloud IoT (https://cloud.google.com/solutions/iot/), Microsoft Azure IoT Suite (https://azure.microsoft.com/en-us/services/iot-hub/), and Siemens MindSphere (https://www.siemens.com/global/en/home/products/software/mindsphere.html), do not natively support semantic descriptions; instead, they use extendable internal data models and functions to implement the previously mentioned functionalities. The Orion Context Broker (OCB) (https://fiware-orion.readthedocs.io/en/master/) is an outlier in this Platform-as-a-Service (PaaS) market, as it natively supports semantics through the NGSI-LD description, which is based on JSON-LD (JSON for Linking Data) and designed to represent knowledge as RDF triples of entities, properties, and relationships.
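The following sketch shows a minimal NGSI-LD-style entity of the kind such a broker consumes; the identifiers, attribute names, and values are assumptions, while the Property and Relationship attribute shapes follow the NGSI-LD information model.

```python
# A minimal NGSI-LD-style entity. Identifiers, attribute names, and values are assumptions;
# the Property/Relationship shapes follow the NGSI-LD information model.
import json

entity = {
    "id": "urn:ngsi-ld:TemperatureSensor:dht22-01",
    "type": "TemperatureSensor",
    "temperature": {
        "type": "Property",
        "value": 23.4,
        "unitCode": "CEL",                      # UN/CEFACT code for degree Celsius
        "observedAt": "2018-10-05T14:48:00Z",
    },
    "isInstalledIn": {
        "type": "Relationship",
        "object": "urn:ngsi-ld:Room:204",
    },
    "@context": "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld",
}

# In practice, this document would be POSTed (Content-Type: application/ld+json) to the
# broker's NGSI-LD entity endpoint; here we only print the payload.
print(json.dumps(entity, indent=2))
```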

The extension of data models into information models ensures interoperability and is useful for the evaluation of Quality of Service (QoS) [159, 160] and Quality of Information (QoI) [161]. The LOV4IoT ontology catalogue in the Linked Open Vocabularies (LOV) (https://lov.linkeddata.es/) references 510 ontology-based research projects for IoT and its applicative domains (https://lov4iot.appspot.com/?p=ontologies) related to Healthcare (159); Environment, e.g., smart energy, weather, etc. (95); Generic IoT (86); Smart Cities (24); Smart Homes (59); Robotics (28); Agriculture (24); Tourism (31); Transportation (58); and others, e.g., security, measurements, etc. (49). The LOV4IoT catalogue is a comprehensive source for application-oriented ontology selection, so the remaining part of this section focuses on how semantic technologies are used in some of the most frequently applied solutions.

Generic IoT semantic technologies are mostly used to ensure interconnection between platforms and operation domains:
(i) Hypercat is both a format and an API ecosystem developed for interacting with, fetching, and searching IoT catalogues [162]. Hypercat was developed to resolve the differences between IoT-based applications by defining a universal adapter language. Hypercat describes API functionalities and usage so that no new or universal API has to be built for existing ones. Hypercat can point to any resource with a URL or URI; common resource types in IoT applications are SenML objects representing time series [163].
(ii) OpenIoT (http://www.openiot.eu/) is a W3C SSN-based platform that focuses on connecting sensor devices to software by enabling context and semantic discovery and sensor management. OpenIoT uses Hypercat for high-level API interoperability. The visual monitoring and configuration possibilities for sensor metadata offered by the Integrated Development Environment (OpenIoT IDE) enable zero-programming application development [164]. OpenIoT is also a good example of the Sensing-as-a-Service (SaaS) [165] paradigm [166].
(iii) FIESTA-IoT addresses semantic interoperability at all levels of IoT. FIESTA-IoT uses the previously described OpenIoT framework to manage hardware-level interoperability at the sensor level. The ontology ensures data-level interoperability at the device layer by semantic annotations of the raw data; model-level interoperability at the gateway layer by ontology alignments to existing IoT ontologies; query-level interoperability at the service and application support layer by querying unified knowledge bases; reasoning-level interoperability based on the deduction of meaningful information; and applicative domain-level interoperability at the service/application level to support Linked Open Services and cross-domain applications [167].
(iv) The Inter-IoT framework is based on open-source hardware and software tools granting multilayer interoperability among IoT system layers (devices, networks, middleware, application services, and data/semantics). The interoperability of data and semantics is addressed by the introduction of a new ontology, GOIoTP (Generic Ontology of IoT Platform) [168]. The interoperability between ontologies is handled by the ontology matching and merging routines of the Inter-Platform Semantic Mediator (IPSM) tool [169].

IoT-based healthcare applications are very diverse, ranging from cognition monitoring (dementia care, assisted living) through electrocardiography to comprehensive personal health monitoring with wearables. Unique factors of healthcare ontologies are how they manage medical sensors and measuring equipment, handle diagnoses and care pathways, and ensure interconnection with hospital information systems. In the following, we discuss the role of the ontologies through the presentation of the details of some of these solutions:
(i) Smart Appliances REFerence for Health (SAREF4Health) was developed for handling real-time electrocardiography (ECG) with a focus on wearable devices. The ontology is based on the Smart Appliances REFerence (SAREF) (http://ontology.tno.nl), which also includes extensions for smart energy, environment, and buildings. In this extension, the ontology is designed to serialize SAREF4Health messages as JSON-LD to represent the time series of ECG signals and to ensure compatibility with the HL7 Fast Healthcare Interoperability Resources (FHIR) [170].
(ii) Health and alarm ontology is an ontology-based context management system developed to support home-based care. The rule-based reasoning that it utilizes evaluates risks based on environmental conditions, alarms, and social contexts to notify relatives and caretakers about the detected risks on multiple levels [171].
(iii) The Technology Integrated Health Management (TIHM) (http://iot.ee.surrey.ac.uk/tihm/models/fhir4tihm) application was developed to support home-based dementia care by machine learning-based information extraction from aggregated environmental and physiological data [172].

Internet of Robotic Things (IoRT)-related ontologies play an essential role in handling information about positions, controls, commands, and observations to enable the necessary robotic abilities, i.e., perception, motion, manipulation, autonomous decision-making, and interaction [173]. In the following, some examples are given to describe how semantic technologies are used in this field:
(i) RoboEarth is a knowledge-based system designed to enhance robot intelligence through cloud services. The ontology includes action recipes and reduces the learning curve of robots by sharing contextual data such as pictures and object descriptions (e.g., weight, surface friction, etc.) [174, 175]. The semantic mapping system of RoboEarth is based on SLAM (Simultaneous Localization And Map building), which provides the scene geometries and the object locations. The semantic reasoning and contextually shared data boost the mapping and aid the robot in operation, e.g., in object recognition and learning [176].
(ii) Smart and Networking Underwater Robots in Cooperation Meshes (SWARM) ontology increases the safety and reduces the operational cost of offshore operations carried out by cooperative autonomous underwater vehicles (AUVs) [177]. The SWARM platform consists of several domain-specific ontologies related to the mission and planning, robotic vehicle, environment recognition and sensing, and communication and network domains [178] and has a publish/subscribe-based semantic middleware designed for device and service registration, semantic enhancement, and context awareness [179].
(iii) Core Ontology for Robotics and Automation (CORA) (https://raw.githubusercontent.com/srfiorini/IEEE1872-owl/master/cora.owl) was developed by the IEEE Ontologies for Robotics and Automation Working Group (ORA). The main role of CORA is to maintain consistency among the different subontologies in the standard, serving as a mediator of interactions among heterogeneous agents with different sensors and different capabilities [180]. CORA covers the most general concepts, relations, and axioms of robotics and automation and serves as a reference for knowledge representation and reasoning in robotics, as well as a formal reference vocabulary for communicating knowledge about robotics and automation between robots and humans [181], which has significant benefits in human-robot interaction [182].

Smart city is a vision of IoT data-driven, efficient asset and resource management in urban areas. The concept interconnects transportation (smart transportation), energy management (smart energy), public administration (smart administration), industries (smart industries), and security management (smart security), as well as healthcare (smart health). The primary information sources of smart city solutions are wireless sensor networks, and the interconnection of heterogeneous data related to multiple domains represents the main challenge of the developments, so this field of application motivates the development of multidomain sensor fusion [183] and semantic technologies, as demonstrated in the following:
(i) READY4SmartCities is a project aiming to reduce energy consumption and CO2 emissions at the level of smart city communities. The ecosystem contains an ontology collection with descriptions to handle issues related to energy (e.g., energy type, demand, etc.), climate (e.g., rainfall, sunshine hours, etc.), weather (e.g., temperature and wind speed), environment (e.g., pollution), and buildings (e.g., building characteristics, owner, manager, etc.) [184].
(ii) Knowledge Model for City (Km4City) (http://www.disit.org/km4city/schema) was developed to manage the complex data sources of smart cities. The model contains six macroclasses: general public administration, street types, points of interest (e.g., services and activities), information about local public transport (e.g., schedules, the rail graph, etc.), sensors (e.g., traffic flow, pollution, and weather), and temporal attributes (e.g., time intervals) [185].
(iii) SCRIBED is a semantic model created by IBM for large-scale data gathering from cities worldwide. It introduces four semantic technologies: the Common Alerting Protocol (CAP), the National Information Exchange Model (NIEM), UCore (a simple OWL implementation), and the Municipal Reference Model (MRM). These technologies address problems like interagency data exchange. The SCRIBED ontology is a robust upper ontology describing high-level geospatial entities (e.g., roads, landmarks, etc.) and temporal objects (e.g., road works), as well as Key Performance Indicators (KPIs) of the cities [186].

This overview illustrated that the goal of IoT research is to enable ubiquitous access to information in high-level applications that need context awareness for decision-making [187].

3.4. Discussion and Future Trends

Based on the overview of the layers presented above, it can be concluded that in every layer the application of ontologies aims to enrich the data to obtain more system-independent and contextualized information from the raw measurements. The contextualized information improves the compatibility, flexibility, and connectivity of IoT applications, which leads to linked (open) data-based systems. Some illustrative solutions related to this trend have already been implemented; Sensor-based Linked Open Rules (S-LOR) [188], linked sensor middleware [189], and publishing-style systems [190, 191] are good examples of this approach.

Privacy and security will be the cornerstones of sensor network-based applications. The access control of linkable data is becoming ever more critical, since users will enrich the data with sensitive contextual information, such as location [192]. The analysis of privacy issues like identification, tracking, and profiling is presented in [193]. These issues call for research on how the availability of this information should be controlled [194]. The development of threat model-based taxonomies is one of the promising approaches that combine contextual security and access management [195]. Blockchain technology represents a promising step towards delivery security. The description of sensors and sensor states can support monitoring and reasoning about which information should be securely provided to the application layer. Motivated by this requirement, interest in the development of trust ontologies is also arising. We should also mention that security mechanisms often increase latency, which is crucial in real-time applications, and often form bottlenecks in the processing pipelines. The services of the infrastructure layers (see Section 3.2) are also crying out for solutions that improve security, scalability, and reliability [194] without severe bottlenecks.
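As a toy illustration of context-sensitive access control, the following sketch grants a request only when the requester's role, location context, and the time of day satisfy a policy; the roles, locations, action name, and time window are assumptions chosen purely for illustration.

```python
# Toy context-sensitive access-control check: a request is granted only when the role,
# location context, and time of day all satisfy the policy. All names are assumptions.
from datetime import time

POLICY = {
    "read:patient-vitals": {
        "roles": {"nurse", "physician"},
        "locations": {"ward-A"},                # contextual condition: on-site only
        "hours": (time(6, 0), time(22, 0)),     # contextual condition: day shift only
    }
}

def is_permitted(action, role, location, now):
    rule = POLICY.get(action)
    if rule is None:
        return False                            # deny by default
    start, end = rule["hours"]
    return role in rule["roles"] and location in rule["locations"] and start <= now <= end

print(is_permitted("read:patient-vitals", "nurse", "ward-A", time(14, 30)))   # True
print(is_permitted("read:patient-vitals", "nurse", "remote", time(14, 30)))   # False
```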

Future research and development should focus on open and modular platforms that utilize easy-to-extend semantic technologies. Cloud-based systems will also emerge, and unique structures will be developed. Among these, it is worth mentioning dew server-based solutions, where dew servers will collect and process streaming data from the IoT sensors and devices and will ensure communication with higher-level servers in the cloud [196].

An availability statistic of the LOV4IoT collection illustrates that the majority of the published ontologies are not freely available or not widely applicable: lost or confidential (27), not freely available (203), ongoing work (46), published but not following the LOV best practices and therefore hard to interconnect (198), and published and in LOV (31). This statistic shows that future research and development should focus much more on standardization and on ensuring open-source availability.

As ever more intelligence is to be embedded in sensors and IoT systems, semantic models should support the self-awareness, self-monitoring, and device lifecycle management of intelligent sensor systems. Edge-computing-based techniques are the focus of this research and development [197], so one of the most interesting problems of the future is how semantic and machine learning models should be integrated.

4. Conclusions

As Internet of Things-based products and processes are rapidly developing, there is a need for tools that can support their fast and cost-effective implementation. Ontologies in sensor technology have led to an incredible advance in the development of these techniques by standardizing the manipulation, sharing, reuse, and integration of sensory measurements.

This article presented these semantic technologies by following the layers of IoT solutions. Based on the analysis of the presented approaches, we can conclude that there is a need for further standardization to achieve more flexible connectivity, interoperability, and fast application-oriented development.

Real-time applications require a better understanding of the data, especially in the on-site processing of complex events and multivariate time series, which is the promise of fog and edge-computing solutions. Therefore, further research and development should focus on automated information enrichment, data fusion, and the processing of complex event series. Enriched sensor descriptions also help to create more detailed contexts that can serve as real-time information for linked (open) data-based solutions.

Conflicts of Interest

The authors declare that no conflicts of interest exist with regard to the publication of this paper.

Acknowledgments

This research was supported by the National Research, Development and Innovation Office (NKFIH) through the project OTKA-116674 (Process Mining and Deep Learning in the Natural Sciences and Process Development) and the EFOP-3.6.1-16-2016-00015 Smart Specialization Strategy (S3) Comprehensive Institutional Development Program.