Abstract

This article proposes an ontological approach as a tool for managing the construction of mathematical models based on interval data and the further use of these models for solving applied problems. Mathematical models built using interval data analysis are quite effective in many applications, as they have “guaranteed” predictive properties, which are determined by the accuracy of the experimental data. However, the application of these modeling methods is complicated by the lack of software tools that implement the procedures for constructing models of this type. To address this, the article proposes an ontological model that operates with the categories of the subject area of mathematical modeling, regardless of the modeling object. This approach makes it possible to generate tools for mathematical modeling of various objects based on interval data analysis for any software development environment selected by the user. The article presents the technology of creating software on the basis of the developed ontological superstructure for mathematical modeling using interval data for different objects, as well as various forms of user interface implementation. A number of schemes that illustrate the technology of using the ontological approach to mathematical modeling based on interval data are presented, and the features of its interpretation when solving environmental monitoring problems are described.

1. Introduction

Mathematical modeling is one of the main tools that allows describing the object in a simple form, exploring it, and predicting its behavior. Mathematical modeling is understood as the process of building a model and applying it to certain applied problems [1–4].

Mathematical modeling processes consist of a large number of procedures, which are mainly implemented in the relevant tools, that is, in the form of certain software systems [3, 4].

Examples of these software environments are Matlab, GNU Octave, Scilab, and SageMath. These tools are multipurpose and well developed. However, practitioners often need more specialized tools for building mathematical models, or need to adapt existing tools to nonstandard conditions that the noted environments do not cover. In this case, such tools are difficult to use and interpret because the simulation procedures are hidden from the researcher, which makes it difficult to adapt them through appropriate software changes [4–8].

In this case, the most appropriate solution is to create an ontological description of certain methods of mathematical modeling, which describes in detail the components of the model building process and its application. This ontological description is then used to generate appropriate software. This approach, on the one hand, allows the created software to be integrated into various applied systems and, on the other hand, makes it possible to modify existing software [4, 9–12].

The availability of ontological descriptions of modeling processes based on certain methods makes it possible to unify the software used for a wide range of tasks. It also makes it possible to accumulate, based on experience, a repository of mathematical models that can be used to model a wide range of mathematically similar properties [13–23].

The positive effect of this approach will be a significant simplification of the process of creating tools for both the modeling processes organization and their application to applied problems.

One of the directions of mathematical modeling is the inductive approach, which is based on a self-organized process of the evolutionary transition from primary data to explicit mathematical models that reflect the patterns of functioning of simulated objects and systems, which are implicit in existing experimental research and statistical data [24–27].

An important feature of implementing the inductive approach is the nature of the uncertainty in the data sets (probabilistic, interval, fuzzy), as this approach is based on methods of data analysis. In a number of works [28–30], the ontological approach to constructing mathematical models within the framework of the inductive approach is based on the group method of data handling (GMDH). Within that approach, the key parameters of the main components of the modeling process are identified; these determine the possibility of generalization and the expediency of constructing multifunctional software modules in the development of computer inductive modeling tools based on GMDH [26, 31, 32]. Since the mentioned approach has a complex structure, which is interpreted using Protege [33–36], and does not contain applied software-interpreted solutions, its practical use in other approaches to mathematical modeling is not advisable. Formalizing the subject area with such an approach is time-consuming, and the complexity of its presentation within the Protege system is unlikely to win support among developers of the corresponding applied software solutions [19, 37, 38].

Another direction in mathematical modeling according to the inductive approach is represented by the methods of mathematical modeling based on interval data [39–43]. The distinguishing feature of these methods is that they yield multiple estimates of the parameters of the “input-output” model, built on the results of an experiment in which the output variables are obtained in interval form [44, 45].

As a result of applying the methods of interval analysis, instead of one “input-output” model, a corridor (set) of equivalent interval models of the system is obtained. The properties of the obtained models depend on the chosen method of estimating the sets of parameters. Preferably, sets of parameter estimates can be presented in the form of a polyhedron, a multidimensional ellipsoid, or a rectangular parallelepiped that specifies the intervals of parameter values [46, 47].

Given that the methods of systems modeling, based on the analysis of interval data, require minimal information about the research system, their applications significantly expand the class of research systems [48].

However, the use of these methods by both researchers and practitioners is limited by the lack of a developed ontological description for this area of mathematical modeling, which would make it possible to expand the scope of application of the existing interval models for a particular subject area and to develop new models. An example is the building of mathematical models for medicine [41] or environmental monitoring, in particular, the description of mathematical models based on interval data for the processes of air pollution by harmful emissions from vehicles [46–48]. The long-term experience of the authors in creating and applying this type of model has shown that when the state of the environment or the conditions for obtaining interval data change, most built interval models lose accuracy or become inadequate. Applying the ontological superstructure to the development and use of models significantly expands the possibilities of modeling the characteristics of these systems and increases the accuracy of the model in specific cases. Simply put, an ontological model as an “add-on” can use “switch” functions to select the best model from the repository, depending on changes in the simulation environment.

The need for automated, systematic, and reusable mathematical models as an environment for knowledge obtaining, accumulating, and reusing is fully justified in the context of a large amount of information about knowledge, which is generated and stored.

Therefore, the aim of this article is to create an ontology of mathematical modeling based on interval data, which would expand the possibilities for researchers dealing with the objects of different nature, data on which were obtained in interval form, as well as for practitioners who can use it for modeling processes in medicine, environmental monitoring, etc.

2. Statement of the Problem of Mathematical Modeling Based on Interval Data

The problem of object modeling based on interval data is considered in [42, 47]. The authors of the interval approach declare that it has a number of advantages over the stochastic (probabilistic) approach. Among them is the absence of a requirement to research the statistical characteristics of the simulated object. As it is known, this reduces the number of experiments (data sampling). Therefore, the interval approach is more useful for researching the object properties in conditions of limited data sampling. A declarative approach to presenting knowledge about object modeling methods based on interval data analysis makes it possible to develop tools for using this approach by both researchers and practitioners. To develop a declarative ontology, the basic concepts of this approach should be considered.

First, the basic concept refers to a method of presenting data in the form of intervals of possible values of the simulated characteristic:

$$[z^{-}_{i,j,h,k};\ z^{+}_{i,j,h,k}], \tag{1}$$

where $z^{-}_{i,j,h,k}$, $z^{+}_{i,j,h,k}$ are, accordingly, the lower and upper bounds of the intervals of possible values of the output characteristic at a point with discretely given spatial coordinates $i$, $j$, $h$ (for objects with distributed parameters) and at time discrete $k$ (for dynamic objects, for example, the dynamics of air pollution from vehicles in discrete time).

Note that in a measuring experiment, the lower and upper bounds can be set by the relative error of the measuring device: $z^{-}_{i,j,h,k} = z_{i,j,h,k} - z_{i,j,h,k}\cdot\varepsilon$ and $z^{+}_{i,j,h,k} = z_{i,j,h,k} + z_{i,j,h,k}\cdot\varepsilon$, where $z_{i,j,h,k}$ is the measured value of the characteristic and $\varepsilon$ is the relative measurement error.
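As a minimal sketch of how such intervals might be formed from raw measurements, the following Python snippet builds the bounds from a measured value and a relative error. The names `Interval` and `interval_from_measurement` are illustrative and not taken from the article; taking the absolute value of the measurement, so that the bounds stay ordered for negative readings, is also an assumption.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Closed interval [lo, hi] of possible values of a characteristic."""
    lo: float
    hi: float


def interval_from_measurement(z: float, rel_err: float) -> Interval:
    """Build the interval z -/+ |z| * rel_err around a measured value z.

    rel_err is the relative error of the measuring device (e.g. 0.05 for 5%).
    """
    if rel_err < 0:
        raise ValueError("relative error must be non-negative")
    delta = abs(z) * rel_err  # half-width of the interval
    return Interval(z - delta, z + delta)


# Example: a reading of 10.0 taken with a device of 5% relative error.
reading = interval_from_measurement(10.0, 0.05)
```

The half-width grows with the magnitude of the reading, so a series of measurements produces a corridor of intervals of varying width.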

Representation of experimental data in interval form (1) is reasonable in the following cases: when the measurement error significantly exceeds the methodological and modeling errors; when intervals (1) set the tolerance bounds of deviations of the simulated characteristic of the object from its nominal value; and when the maximum values of the errors in the experiment are known.

Next, it is necessary to determine the mathematical object to represent the object model. In this case, it is limited to a discrete linear model in generalwhere is a vector of basic functions, in general nonlinear, with the help of which the values of the simulated characteristic of the object are transformed, as well as the input variables at discrete space points and for a certain time are discrete.

As a result of performing the procedure of structural identification, a discrete model is determined, in particular: the vector of basic functions ; sets and dimension of vectors of input variables (controls) ; is an order of a discrete model, which as is known is equivalent to the order of a differential equation analogous to a discrete model. To implement a discrete model, it is also necessary to specify the initial conditions, i.e., the value of each element in the set for certain discrete, as a rule, initial one, and set the value of the components in the parameters vector .

If the general form of the discrete model is known, for example, due to physical considerations, it remains to identify the parameters in a way to ensure maximum agreement of the simulated characteristic of the object with the experimentally obtained values of this characteristic. This task is called the parametric identification task [42].

Let’s assume that the vector of estimates of parameters in the difference operator (2) is obtained on the basis of interval data analysis. Substituting a vector of parameter estimates from difference operator instead of the vector of their true values in expression (2) together with the specified initial interval values of each element of the set and given vectors of input variables an interval estimate of the simulated characteristic at points with discrete spatial coordinates , , and on time discrete can be obtained:

Now, the problem of parametric identification of the interval discrete model (IDM) based on the interval data analysis can be mathematically formulated.

The conditions of matching the experimental data presented in the interval form (1) with the data obtained on the basis of the macromodel in the form of IDM (3) are formulated as follows:

Conditions (4) ensure that the interval estimates of the simulated characteristic of the object fall within the intervals of possible values of the characteristic obtained experimentally.
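The inclusion requirement expressed by conditions (4) can be illustrated with a short check. This is only a sketch: intervals are represented as `(lower, upper)` tuples, and the function names `is_included` and `conditions_hold` are hypothetical.

```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def is_included(predicted: Interval, experimental: Interval) -> bool:
    """True if the predicted interval lies inside the experimental one."""
    return experimental[0] <= predicted[0] and predicted[1] <= experimental[1]


def conditions_hold(predicted: Sequence[Interval],
                    experimental: Sequence[Interval]) -> bool:
    """Check the inclusion at every grid node (one interval pair per node)."""
    if len(predicted) != len(experimental):
        raise ValueError("both corridors must cover the same nodes")
    return all(is_included(p, e) for p, e in zip(predicted, experimental))
```

A model is acceptable in this sense only when the check succeeds at every node; a single predicted interval that leaves its experimental interval is enough to reject the current parameter estimate.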

Substitute in equation (4) instead of interval estimates of the simulated characteristic; its interval values are calculated on the basis of IDM (3) together with taking into account the given initial interval values of each element from a set:and given vectors of input variables , and receive the following:

Therefore, an equation (6) is obtained by substituting interval estimates of initial characteristics (given as initial conditions and predicted on the basis of expression (3) in the remaining nodes of the grid) in conditions (4).

As it is known, the obtained system is an interval system of nonlinear algebraic equations (ISNAE). Therefore, the task of identifying the parameters of IDM (3) under conditions (4) is the task of solving ISNAE in the form (6).

It should be noted that ISNAE (6) is formed recurrently. The total number of interval equations is a product of .

Obviously, the greater the number of equations in the interval system, the more difficult it is to find the ISNAE solution.

Given that this problem cannot be solved in a predetermined number of iterations, this type of problem belongs to the class of NP-complete problems. It can be solved only by exhaustive search or random search. Given the complexity of the IDM parametric identification task, random search methods can be used to find at least one ISNAE solution [42].

These computational schemes for implementing the method of IDM parametric identification are based on a four-step procedure [44].

Step 1. Set the initial conditions in the form (5).

Step 2. Set the initial estimate, or randomly generate the current estimate, of the vector of the IDM parameters.

Step 3. Calculate the interval estimates of the simulated characteristic at points with discretely given spatial coordinates and at each time discrete using the recurrent scheme (3).

Step 4. Check the “quality” of the current approximation of the estimate of the vector of IDM parameters [39, 40].

In this step, assume that the “quality” of the approximation is higher the closer the predicted corridor, built on the basis of this approximation of the parameter vector, is to the experimental one.

If the calculated value of the “quality” of the current approximation of the estimate of the vector of IDM parameters at the current iteration is zero, then the procedure is over; otherwise, go to Step 2.
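The four-step procedure can be sketched as a plain random search loop. Everything model-specific below is an assumption made for illustration: the toy recurrence v_k = g · v_{k-1} with a single non-negative parameter g, the initial interval `V0`, the experimental corridor `EXPERIMENTAL`, and a simplified quality measure that equals zero exactly when every predicted interval lies inside its experimental interval. The iteration budget `max_iter` is an added safeguard that the textual procedure does not mention.

```python
import random
from typing import Callable, List, Sequence, Tuple


def random_search(quality: Callable[[List[float]], float],
                  bounds: Sequence[Tuple[float, float]],
                  max_iter: int = 10_000,
                  seed: int = 0) -> Tuple[List[float], float]:
    """Repeat Steps 2-4 until the quality is zero or the budget runs out."""
    rng = random.Random(seed)
    best_params: List[float] = []
    best_value = float("inf")
    for _ in range(max_iter):
        # Step 2: randomly generate the current parameter estimate.
        params = [rng.uniform(lo, hi) for lo, hi in bounds]
        # Steps 3-4: predict the corridor and score it.
        value = quality(params)
        if value < best_value:
            best_params, best_value = params, value
        if value == 0:
            break
    return best_params, best_value


# Step 1 (toy data): initial interval and experimental corridor.
V0 = (0.9, 1.1)
EXPERIMENTAL = [(0.4, 0.6), (0.2, 0.3)]


def toy_quality(params: List[float]) -> float:
    """Largest amount by which a predicted interval leaves its bounds."""
    g = params[0]  # assumed non-negative, so interval bounds keep their order
    lo, hi = V0
    worst = 0.0
    for e_lo, e_hi in EXPERIMENTAL:
        lo, hi = g * lo, g * hi  # recurrent prediction of the next interval
        worst = max(worst, e_lo - lo, hi - e_hi)
    return worst


params, value = random_search(toy_quality, [(0.0, 1.0)])
```

Because each parameter estimate is drawn independently, the expected number of iterations grows quickly as the feasible region shrinks.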

The quality of the approximation will be quantified as the difference between the centers of the most distant predictive and experimental intervals in the case when they do not intersect, and the width of the intersection of the predictive and experimental intervals is the smallest, for the case of their intersection [40].

Formally, these conditions are written as follows:where and are operations for determining the center and width of the interval correspondingly.
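One possible reading of this quality measure is sketched below. It is an interpretation, not the article's formulas (7) and (8): when at least one pair of intervals does not intersect, the score is the largest distance between the centers of predicted and experimental intervals; when all pairs intersect, the score is the largest part of a predicted interval's width that falls outside the intersection, so that the score is zero exactly when every predicted interval is contained in its experimental interval. The helper names `mid`, `wid`, and `intersect` are assumed.

```python
from typing import Optional, Sequence, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def mid(iv: Interval) -> float:
    """Center of an interval."""
    return 0.5 * (iv[0] + iv[1])


def wid(iv: Interval) -> float:
    """Width of an interval."""
    return iv[1] - iv[0]


def intersect(a: Interval, b: Interval) -> Optional[Interval]:
    """Intersection of two intervals, or None if they are disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def quality(predicted: Sequence[Interval],
            experimental: Sequence[Interval]) -> float:
    """Non-negative score; 0 means every prediction is inside its bounds."""
    pairs = list(zip(predicted, experimental))
    if any(intersect(p, e) is None for p, e in pairs):
        # Some intervals are disjoint: use the largest distance of centers.
        return max(abs(mid(p) - mid(e)) for p, e in pairs)
    # All pairs intersect: use the largest width lying outside the overlap.
    return max(wid(p) - wid(intersect(p, e)) for p, e in pairs)
```

The two branches give the search a usable signal both far from a solution, where the corridors do not overlap at all, and close to it, where only the remaining overshoot matters.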

Therefore, the problem of parametric identification of interval models of the object is formulated in the form of an optimization task:where the value of the objective function is calculated by formula (7) or (8).

Let’s consider the problem of IDM structural identification in the general form (3). The complexity of configuring IDM (3) is that not only the parameters but also the structure is unknown. In this case, it is necessary both to solve the problem of parametric identification, to find the IDM parameters, and to identify the structure (structural identification). These two tasks are very closely related: parametric identification is a stage of structural identification, and finding one solution to the latter requires many attempts to find the vector of IDM parameters. The “success” of finding the vector of IDM parameters in turn depends directly on the success of selecting the structure. After all, if the chosen IDM structure is “unsuccessful,” then it is impossible to find a solution of the parametric identification task.

Therefore, parametric identification is a stage of structural identification. When the data is given in interval form, this step is to find estimates of the IDM parameters by solving the ISNAE (6) for some known vector of basic functions (structural elements of the IDM).

To solve ISNAE (6), the method of parametric identification based on random search procedures is used. The application of this method involves, instead of ISNAE (6) solving, the search for some approximation to its solution, which determines the quality of the current IDM structure [47].

Let’s use some notations that are necessary to reveal the essence of the task formulation. Denote by the current IDM structurewhere is a set of structural elements that specify the current s IDM structure.

Next, denote the following symbols: is a number of elements in the current structure ; is the set of all structural elements, where (power of the set ); is a vector of unknown parameter values. Structural identification aims at finding the IDM structure in the form of (10) so that the interval discrete model is formed on its basis [48].

The conditions (4) are true, i.e., the interval estimates of the predicted value of the simulated characteristic are included in the intervals of tolerance values of the simulated characteristic on the set of all discrete.

The quality of the current IDM structure is estimated on the basis of the value of the indicator , which quantifies the proximity of the current structure to a satisfactory level in terms of providing conditions (4). Afterward, will be called the objective function of the optimization task of the structural identification of a mathematical model with guaranteed prognostic properties.

The value of the quality indicator for the current IDM structure is calculated using modified expressions (7) and (8):where are operations from interval analysis determining the center and width of the intervals, accordingly.

Expression (12) describes the “proximity” of the current structure to a satisfactory level in the initial iterations, and expression (13) in the case of ensures the fulfillment of conditions (4).

The task of IDM structural identification is written formally in the form of the task of finding the minimum of the objective function :where is a number of elements in interval model structure; is a set of potential structure elements in a model.

From expressions (12) and (13), it is seen that for the calculated value of the objective function for the IDM structure , the inequality will be satisfied under any conditions. Therefore, the objective function has a global minimum only at those points for which the equality holds. Based on the theory of multiplicity of models [40], it can be stated that in the search space for solutions to the IDM structural identification task, the function has many global minima.

The smaller the value of , the “better” the current IDM structure. If , then the current IDM structure makes it possible to build an adequate model for which the interval estimates of the predicted characteristic belong to the intervals of possible values of the modeled characteristic.

As can be seen, IDM structural identification reduces to repeatedly solving the parametric identification problem. Therefore, it is important to develop methods of structural identification that reduce the number of iterations needed to find an adequate structure of the mathematical model and, accordingly, the number of times the parametric identification problem has to be solved.

3. Methods of Mathematical Modeling Based on Interval Data

The previous section presents a four-step procedure for solving the problem of parametric identification. However, to date, the most effective methods for solving this optimization problem are methods based on behavioral models of artificial bee colonies (ABC) [49]. The substantiation of this fact is given in [40, 44].

To build a method of parametric identification, the principles of behavioral models of the bee colony are used.

Initialization phase. Vectors that determine the possible minimum points of the objective function (9) are the vectors of parameter estimates and are denoted by . In the context of the behavioral model of the bee colony, this means that each vector of the nectar source coordinates corresponds to one bee that investigates it. Let’s set the number of the entire population to be equal to the value and set the bounds of the parameter estimates

In this phase the following formula is used:where are lower and upper bounds of parameter values at the initialization phase.

Notice that in this phase, all the parameters of the algorithm are also configured [42].

The phase of worker bees. In the context of the optimization task, the phase of worker bees means the search for new estimates of solutions (16) with smaller values of the objective function. To calculate the possible points of the local minimum of the objective function, the following formulas are used:

After calculating the coordinates of the possible points of the minimum a pairwise comparison of the existing and current values of the parameter estimates (16) is performed using the objective function:

The phase of researcher bees. In the context of the optimization task, this stage determines the most promising points (vectors of parameter values), around which a detailed study of the objective function is needed. It is these points that are candidates for local minima of the objective function. For this purpose, a probabilistic approach is used: for each point given by a previously found vector of parameter values, the probability that further research around it is expedient is calculated. The expression for calculating the specified probability is as follows:

It should be noted that in the case of a significant deviation between the values of the objective function , calculated for different points (vectors of parameter values), it is necessary to rewrite formula (20), taking into account the normalization of the values of this function. In this case, the formula takes the following form:

Based on the calculated probabilities, the number of points for researching the possible local minima of the objective function from task (9) is determined. However, given that the value of in this formula must be an integer because it determines the number of points in the neighborhood of the studied point to find the minimum of the objective function, the formula will be rewritten as follows:where is the operator of selection of the integer part from number.

Then the procedure is repeated to determine the points where the lowest value of the objective function is achieved.

To avoid focusing on the local minima of the objective function, the phase of scout bees is used.

The phase of scout bees. This is the phase where new solutions to the optimization problem are randomly calculated again. To do this, formula (18) is used. As mentioned above, in the context of the behavioral model of the bee colony, this means the exhausting of current nectar sources.

Each iteration of calculations produces a number of new points in addition to the current ones, so at the end of each iteration there is an enlarged pool of candidate points for research. Therefore, at the end of the iteration, the points with the smallest values of the objective function are selected so that their number again equals the population size. This procedure is called group selection. The procedure ends when the value of the objective function reaches zero.
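The phases described above can be illustrated with a compact sketch of the textbook artificial bee colony scheme. It is not the authors' exact computational scheme: the neighbor formula, the fitness 1 / (1 + value) used to distribute the researcher bees (which assumes a non-negative objective), the stagnation limit, and all function and parameter names are assumptions. In particular, pairwise replacement is used instead of the group selection over a pooled set of candidates.

```python
import random
from typing import Callable, List, Sequence, Tuple


def abc_minimize(objective: Callable[[List[float]], float],
                 bounds: Sequence[Tuple[float, float]],
                 colony_size: int = 20,
                 limit: int = 50,
                 max_iter: int = 300,
                 seed: int = 0) -> Tuple[List[float], float]:
    """Minimize a non-negative objective with a basic bee colony scheme."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_point() -> List[float]:
        # Initialization and scout phases: uniform draw inside the bounds.
        return [lo + rng.random() * (hi - lo) for lo, hi in bounds]

    points = [random_point() for _ in range(colony_size)]
    values = [objective(p) for p in points]
    stalled = [0] * colony_size  # iterations without improvement

    def explore(i: int) -> None:
        # Shift one coordinate of point i relative to a random partner and
        # keep the better of the two points (pairwise selection).
        partner = rng.choice([j for j in range(colony_size) if j != i])
        d = rng.randrange(dim)
        candidate = list(points[i])
        step = rng.uniform(-1.0, 1.0) * (points[i][d] - points[partner][d])
        lo, hi = bounds[d]
        candidate[d] = min(max(candidate[d] + step, lo), hi)
        value = objective(candidate)
        if value < values[i]:
            points[i], values[i], stalled[i] = candidate, value, 0
        else:
            stalled[i] += 1

    best_value = min(values)
    best_point = list(points[values.index(best_value)])

    for _iteration in range(max_iter):
        for i in range(colony_size):  # worker bees
            explore(i)
        fitness = [1.0 / (1.0 + v) for v in values]
        total = sum(fitness)
        for i in range(colony_size):  # researcher bees
            # Integer number of extra trials, proportional to fitness share.
            for _trial in range(int(colony_size * fitness[i] / total)):
                explore(i)
        i_best = min(range(colony_size), key=values.__getitem__)
        if values[i_best] < best_value:
            best_point, best_value = list(points[i_best]), values[i_best]
        if best_value == 0:
            break
        for i in range(colony_size):  # scout bees
            if stalled[i] >= limit:
                points[i] = random_point()
                values[i] = objective(points[i])
                stalled[i] = 0
    return best_point, best_value
```

The best point found so far is stored separately, so an exhausted source that the scout phase replaces is not lost. For interval models, `objective` would be the quality indicator of the predicted corridor, which is zero for an adequate parameter vector.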

Given the analogy between the mathematical formulation of problems of parametric and structural identification of object models, the main phases of the method for structural identification of models of dynamic objects based on the behavioral models of the bee colony are considered.

Initialization phase. In this phase, the main parameters of the method are set: ; ; ; is a current iteration number; is the total number of iterations and the set of structural elements is , and also the initial set (with power ) of the structures from the set of structural elements is randomly formed.

In this case, the structural elements will look different than in Table 1. The results of coding the structural elements for the case of developing a model of the characteristics of a dynamic object are shown in Table 1.

Next, to form structures, consider a set of operators. Note that their names and purposes are stored by analogy with the existing method of structural identification built on the ABC.

The phase of worker bees. In the phase of worker bees, the operator , which transforms the structure of the interval model in the form (10), is used. On the current iteration of implementation of the method of structural identification, this operator forms, on the basis of each of the current structures of the mathematical model, one “new” structure , which is close to the current one. Therefore, the operator converts the set of the current structures generated on the iteration into the set structures by randomly selecting and replacing part of the elements of the current structure and also replaces on selected elements from the set . In this case, the set of elements of the current structure that need to be replaced is inversely proportional to the value of the objective function , which is calculated by formulas (12) or (13).

Next, in this phase, using the operator , pairwise selection is performed to choose the best structure from the two ones: the current and the generated one. To do this, the following formula is used:

The operator implements the process of synthesis of the set of “best” structures from the current sets , . Thus, a set of structures of the first series of formation is obtained.
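The worker-bee operators for structures can be sketched as follows. A structure is represented here as a tuple of structural elements (plain strings), which is an assumed encoding; the function names are hypothetical, and the number of replaced elements `n_replace` is left to the caller because the exact rule that links it to the objective function value is not recoverable from the text.

```python
import random
from typing import Callable, Sequence, Tuple

Structure = Tuple[str, ...]


def mutate_structure(structure: Structure,
                     pool: Sequence[str],
                     n_replace: int,
                     rng: random.Random) -> Structure:
    """Replace randomly chosen elements with unused elements from the pool."""
    current = list(structure)
    unused = [e for e in pool if e not in structure]
    n = min(n_replace, len(current), len(unused))
    positions = rng.sample(range(len(current)), n)  # elements to drop
    incoming = rng.sample(unused, n)                 # elements to insert
    for pos, elem in zip(positions, incoming):
        current[pos] = elem
    return tuple(current)


def pairwise_select(current: Structure,
                    candidate: Structure,
                    objective: Callable[[Structure], float]) -> Structure:
    """Keep the structure with the smaller objective function value."""
    return candidate if objective(candidate) < objective(current) else current


# Example pool of structural elements for a difference equation (illustrative).
POOL = ["v[k-1]", "v[k-2]", "v[k-1]*v[k-2]", "u[k]", "u[k-1]", "v[k-1]*u[k]"]
new_structure = mutate_structure(("v[k-1]", "v[k-2]", "u[k]"), POOL, 2,
                                 random.Random(1))
```

Drawing replacements only from unused elements keeps each structure free of duplicates and preserves its size.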

The phase of researcher bees. As already mentioned, this phase determines the number of structures to be generated on the basis of each structure from the set of structures of the first series of formation. This indicator is calculated by formulas:

Next, in this phase also, the operator is used, which converts the current structure into a certain number of structures. In this case, the total number of structures distributed between the current structures is equal to . Thus, means the transformation of each structure from the set of structures of the first series of formations, generated by iterating the algorithm , to the set of structures , . Replacement of elements in each current structure (or some structures) is carried out randomly on the basis of the calculated value of the number elements in the current structure and is inversely proportional to the value of the objective function . This substitution is also performed on randomly selected elements from the set .

Also, in this phase, group selection of the “best” structure from the current is performed and the set is formed in its neighborhood by the values of the objective function. This selection operator, as distinct from the pair selection operator , has the following form:

Operator (25) implements the process of synthesis of the set of “best” IDM structures from the current sets by ranking all structures by the values of the objective function (12) or (13), with subsequent selection of the structures with the lowest values of the objective function of the optimization task (12), (13). Thus, the set of structures of interval models of the second series of formation is obtained.

Exit from the local minima of the objective function in task (12), (13) is carried out in the phase of scout bees.

The phase of scout bees. For each current structure, a counter is introduced; it is incremented by 1 each time the current structure is not “updated” during pairwise or group selection and is reset otherwise. Comparing the value of this counter with a constant given in the initialization phase makes it possible to decide whether the current structure has exhausted itself. If the counter reaches this constant, it is no longer appropriate to modify the current structure, which means that the function (14) is in a local minimum. Then an operator is applied that randomly generates a “new” structure from the set of all structural elements, as in the initialization phase, but only for one structure. Therefore, such structures will make up only a few percent of the total number of worker bees.
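The counter logic of the scout phase might be sketched as below. The dictionary-based bookkeeping, the function name `update_stall_counter`, and the parameter name `limit` for the constant set in the initialization phase are assumptions.

```python
from typing import Dict, Hashable


def update_stall_counter(counters: Dict[Hashable, int],
                         structure_id: Hashable,
                         was_updated: bool,
                         limit: int) -> bool:
    """Update the counter of one structure after a selection step.

    Returns True when the structure is considered exhausted and should be
    replaced by a randomly generated one.
    """
    if was_updated:
        counters[structure_id] = 0  # the structure improved: reset
    else:
        counters[structure_id] = counters.get(structure_id, 0) + 1
    return counters[structure_id] >= limit
```

A caller would invoke this once per structure after pairwise or group selection and regenerate only those structures for which it returns `True`.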

The procedure is completed when, for some structure, the objective function obtained in the task of parametric identification equals zero.

The main problem with using these methods is the lack of declarative ontological description, which does not allow developing software environments as a tool. On the other hand, as it is seen from the description of the structural identification task, the main problem for its solving is the formation of a set of potential structural elements of the model of difference (discrete) equation, which represents a mathematical model of the object. This problem can be solved by the ontological description of the subject area of modeling, i.e., operational ontology. Therefore, solutions to these problems will reduce the complexity of the modeling procedure and adequate models with guaranteed prognostic properties will be obtained.

4. Features of the Ontological Approach Implementation

The need for automated, systematic, reusable mathematical models as an environment for obtaining, accumulating, and reusing knowledge is fully justified given the large amount of previously generated and stored knowledge about modeling processes. To achieve these goals, and to expand the possibilities of researchers working with objects of different nature whose data is presented in interval form, it is necessary to build an ontology of mathematical modeling based on interval data.

In the proposed ontological approach to representing the concepts, methods, and tools of mathematical modeling based on interval data, mathematical knowledge is separated into declarative and procedural parts. The declarative part consists of the information needed to build the model, the information obtained from the model, and the corresponding mathematical expressions that represent the model. The procedural part consists of detailed parts of the model, appropriate methods and algorithms for their implementation, and procedures for initializing variables and their interpretations. Among the tools used to build and apply ontologies, Protege and OntoStudio are the most commonly used [33, 34, 50]. Due to their reliability, widespread use, scalability, and extensibility, these tools can also be used in the process of building appropriate ontological models to represent and manage the knowledge accumulated in the process of mathematical modeling [35, 51, 52]. However, these tools are difficult to integrate into software and hardware systems, which, in particular, are often used in medicine, where the speed and quality of management decisions are a priority. Therefore, the following tools are used for building the ontology in this paper:

(i) tools of modern relational databases for information storage [53–55];

(ii) algebra of tuples for the formalized presentation of knowledge and its subsequent program interpretation regardless of the selected software platforms for its implementation, as well as for implementation of effective methods of managing accumulated knowledge [56–59];

(iii) Python and Java as programming languages for the appropriate interpretation of the proposed methods and tools [60–63].
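As a rough illustration of relational storage for such a repository, the sketch below uses Python's built-in `sqlite3` module. The schema is entirely hypothetical: the article does not specify table or column names, and storing each structural element together with an interval of its parameter estimate is an assumption.

```python
import sqlite3
from typing import List, Sequence, Tuple

SCHEMA = """
CREATE TABLE model (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE structural_element (
    id         INTEGER PRIMARY KEY,
    model_id   INTEGER NOT NULL REFERENCES model(id),
    expression TEXT NOT NULL,
    param_lo   REAL NOT NULL,
    param_hi   REAL NOT NULL
);
"""

Element = Tuple[str, float, float]  # (expression, parameter lower, upper)


def create_repository(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the (assumed) repository tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def save_model(conn: sqlite3.Connection, name: str,
               elements: Sequence[Element]) -> int:
    """Store a model and its structural elements; return the model id."""
    cur = conn.execute("INSERT INTO model (name) VALUES (?)", (name,))
    model_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO structural_element "
        "(model_id, expression, param_lo, param_hi) VALUES (?, ?, ?, ?)",
        [(model_id, expr, lo, hi) for expr, lo, hi in elements],
    )
    conn.commit()
    return model_id


def load_model(conn: sqlite3.Connection, name: str) -> List[Element]:
    """Return the structural elements of a model in insertion order."""
    return conn.execute(
        "SELECT s.expression, s.param_lo, s.param_hi "
        "FROM structural_element AS s JOIN model AS m ON m.id = s.model_id "
        "WHERE m.name = ? ORDER BY s.id",
        (name,),
    ).fetchall()
```

Keeping the structure as rows rather than as a serialized object lets other applications query and reuse individual elements with ordinary SQL.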

In Figure 1 a general scheme of the relationship between the declarative and procedural parts of the knowledge that is accumulated in the process of mathematical modeling based on interval data within the proposed ontological approach is shown.

The declarative part of the ontological approach consists of an ontology of formalized mathematical models (declarative ontology), which contains model definitions and an information repository. The ontology of using mathematical models (operational ontology) contains design data, operating conditions, and equipment parameters for the use of models. Model ontology consists of a model class that has both attributes and instances.

A class of equations denotes model equations (integral equations, algebraic equations, or functions), model parameters, dependent and independent variables, and universal classes of constants. All of the above attributes of the class describe knowledge about the mathematical model explicitly, which makes the representation more computer-interpretable, systematic, and general.

The feature of the proposed approach is that the components of a model created in this way can be reused. That is, equations, variables, and assumptions from one model can be reused when creating another model, or the resulting repository of mathematical models can be reused in other information systems. Thus, the process of creating mathematical models and their practical use becomes more intuitive and user-oriented, even for a user who is not well versed in the modeling process. Each model in this approach is a specific instance of the ontology model class.
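The reuse of model components can be illustrated with a minimal Python sketch. The class names, field names, and all values below are illustrative assumptions and are not taken from the authors' implementation; the sketch only shows that one model instance can share equations and assumptions with another while carrying its own parameters.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Equation:
    """A formalized equation description (hypothetical structure)."""
    identifier: str
    expression: str  # e.g. "g0 + g1 * x"


@dataclass
class Model:
    """One instance of the ontology model class (hypothetical structure)."""
    name: str
    equations: list = field(default_factory=list)
    parameters: dict = field(default_factory=dict)
    variables: list = field(default_factory=list)
    assumptions: list = field(default_factory=list)


# A first model instance; all values are placeholders.
base = Model(
    name="model_a",
    equations=[Equation("eq1", "g0 + g1 * x")],
    parameters={"g0": (0.9, 1.1), "g1": (0.4, 0.6)},  # interval estimates
    variables=["x"],
    assumptions=["bounded observation error"],
)

# A second model reuses the equations, variables, and assumptions
# of the first one and only supplies its own parameter intervals.
derived = Model(
    name="model_b",
    equations=list(base.equations),
    parameters={"g0": (0.0, 0.2), "g1": (1.9, 2.1)},
    variables=list(base.variables),
    assumptions=list(base.assumptions),
)
```

Copying the lists while sharing the immutable Equation objects keeps the two instances independent, so that extending one model does not silently change the other.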

The ontology of formalized mathematical models also contains a functional representation of the model in the form of a graphical interpretation for the diagnosis of inaccuracies based on the improved model.

A subset of concepts and relationships that are fixed in the general ontological model is shown in Figure 2.

The procedural part of the ontological approach consists of a mechanism for construction based on methods of data relationship analysis, which analyzes equations in the ontological interpretation of mathematical models and translates them into expressions that can be interpreted in other external software environments. The general scheme of this approach is shown in Figure 3.

The ontology of a mathematical model consists of an operating class, the subclasses of which are various operations that occur during the implementation of the model and also contain the conditions for the implementation of each operation. This ontology also consists of a class of results, which stores the results of the model solving, as well as the results of experiments.

The model selection process control subsystem creates operators that initialize model parameters with the corresponding values, creates associations between index variables and the values they denote, initializes universal constants, collects the actual model solution commands, and finds the appropriate solution to a set of equations.

This software-interpreted ontological approach provides the user with a number of additional features in the form of implemented functions. Among these features is symbolic processing, which directly analyzes the equations in different formats and provides their interpretation in different programming languages.
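As a rough illustration of such symbolic processing, the following sketch translates a small arithmetic expression written in Python syntax into Java source text by using the standard `ast` module. The function name `to_java` and the supported subset of operations are assumptions made for this example; the paper does not describe the actual translation mechanism.

```python
import ast

# Binary operators that look the same in Python and Java.
_BINARY = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def to_java(expression: str) -> str:
    """Translate a small arithmetic expression into Java source text."""

    def emit(node):
        if isinstance(node, ast.BinOp):
            left, right = emit(node.left), emit(node.right)
            if isinstance(node.op, ast.Pow):
                # Java has no power operator, so Math.pow is used instead.
                return f"Math.pow({left}, {right})"
            if type(node.op) in _BINARY:
                return f"({left} {_BINARY[type(node.op)]} {right})"
        elif isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return f"(-{emit(node.operand)})"
        elif isinstance(node, ast.Name):
            return node.id
        elif isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            # Emit every number as a double literal.
            return repr(float(node.value))
        raise ValueError(f"unsupported construct: {ast.dump(node)}")

    return emit(ast.parse(expression, mode="eval").body)


java_source = to_java("g0 + g1 * x ** 2")
print(java_source)  # (g0 + (g1 * Math.pow(x, 2.0)))
```

Anything outside the supported subset, such as a function call, raises `ValueError`, so an equation is never translated silently into an incorrect form.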

The graphical user interface displays the results of solving (graphs or expressions), saves them back to the ontology of mathematical models, and is also used to select the model instance that is best suited to a particular application area.

Based on the analysis of the structure of interval models, the modeling process, and the features of experiments, the mathematical model is formalized, from the point of view of the ontological approach, by a structure with the following components: the subject area within which the mathematical model is constructed or used; the descriptions of the mathematical model; the set of objects where the model can be used; the set of parameters; the set that describes the result of building object models; the set of characteristics of the experiments; the set of methods for structural identification of models; and the set of methods for identifying model parameters.

In turn, the subject area is described by a tuple of two elements: the subject area identifier and the subject area itself.

Descriptions of the mathematical model have the following structure: the identifier of the equation and a formalized description of the equations of the mathematical model.

The description of the set of objects where the model can be used consists of an object identifier and the information that describes the structure of the object to which the model is applied.

The tuple that describes the set of parameters consists of a parameter identifier, a parameter type, and the values of the model parameters.

The results of building object models are presented as a result identifier together with the statements that describe the result.

The characteristics of the experiments are presented as follows: the identifier of the features that affect the experimental conditions; the main characteristics; the alternative characteristics; and a statement that describes the conditions of mathematical model usage.

The tuple for the set of methods of model structural identification consists of a method identifier, the method of model structure identification, the set of statements that describes the method, and the identifier of the parametric identification method.

The set of methods for identifying the parameters of the models is presented as follows: an identifier of the model parameter identification method, the method of parameter identification, and the set of statements that describes the method.
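Some of the tuple structures listed above can be sketched in Python as named tuples. The field names are hypothetical, since they were chosen only for this illustration, and the values are placeholders. The sketch shows how the identifier of the parametric identification method, which is stored in the structural identification tuple, links the two sets of methods.

```python
from typing import NamedTuple


class SubjectArea(NamedTuple):
    identifier: int
    name: str


class Parameter(NamedTuple):
    identifier: int
    kind: str      # parameter type
    values: tuple  # values of the model parameter


class ParametricMethod(NamedTuple):
    identifier: int
    method: str
    statements: tuple  # statements that describe the method


class StructuralMethod(NamedTuple):
    identifier: int
    method: str
    statements: tuple
    parametric_method_id: int  # link to a ParametricMethod


# Placeholder instances.
area = SubjectArea(1, "environmental monitoring")
parametric = ParametricMethod(1, "parametric method A", ("statement p1",))
structural = StructuralMethod(
    1, "structural method A", ("statement s1",), parametric.identifier
)

# Resolve the link from the structural method to its parametric method.
parametric_by_id = {parametric.identifier: parametric}
linked = parametric_by_id[structural.parametric_method_id]
```

Because each structure is a flat tuple with an identifier, it maps directly onto a relational table with a primary key, and the link above becomes a foreign key.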

An example of the implementation of the ontological approach for constructing models of fields of harmful emission concentrations in the surface layer of the atmosphere under conditions of large observation errors is shown in Figure 4.

The scheme of formalization of the mathematical model using the developed tool SmartOntologyModeller reflects the main structural components within the proposed ontological approach. As seen, the information repository with a formalized model description and external modeling environment, which describes the use of software-implemented models (in this case, an interval model with guaranteed interval parameter estimates), is translated to the index representation and stored in the HasEquation attribute. The diagram shows the dependent and independent variables and parameters combined to represent the structure of the interval model with guaranteed interval estimates of the parameters. On the right side of the diagram, the process of using assumptions for the implementation of methods, the conditions of experiments, recommendations for the use of methods, and visualization of simulation results are formalized.

As an option for using the above ontological description, consider the method of constructing a mathematical model for modeling based on interval data.

Let us present this method as a sequence of steps.
(1) The user selects the subject area. In the notation of this step, “_” denotes the prefix of choice and C is the selection procedure. The result is a proposed set of mathematical models for a set of objects.
(2) Selection of the object of modeling. The formal description of this procedure combines the projection operation of the tuple algebra, the sampling operation from the set by the given attributes, and the ordering operation by the values of the corresponding attributes. The result of the operation is a selected object with a set of possible models, if any of them are in the repository.
(3) Choosing the conditions of application of the model.
(4) Model selection by the corresponding selection procedure.
(5) For the selected elements, a set is formed that represents the results of building object models. If the repository does not have adequate models to describe the object, continue to build models.
(6) Choosing the conditions of model application (characteristics of the experiment).
(7) The user chooses the method of identifying the model structure.
(8) Determining the structure of the model and its parameters. The result of this operation is a set of object models.
(9) For the selected elements, a set is formed that describes the results of model construction.

Performing steps 1–5 makes it possible to choose from the repository an adequate model for describing the object. Steps 1, 2, and 6–9 are used when the repository contains no suitable models.
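The selection branch of the method (steps 1–5) can be sketched with three small functions that mimic the sampling, projection, and ordering operations on a list of records. The repository contents, the attribute names, and the ranking attribute `width` are assumptions introduced for this example; the paper does not specify how stored models are ranked.

```python
def select(rows, **conditions):
    """Sampling: keep the rows whose attributes match the given values."""
    return [r for r in rows if all(r[k] == v for k, v in conditions.items())]


def project(rows, *attributes):
    """Projection: keep only the named attributes of every row."""
    return [{a: r[a] for a in attributes} for r in rows]


def order(rows, *attributes):
    """Ordering: sort rows by the values of the named attributes."""
    return sorted(rows, key=lambda r: tuple(r[a] for a in attributes))


# Placeholder repository of previously built models.
REPOSITORY = [
    {"area": "air pollution", "object": "NO2 field",
     "conditions": "error 15%", "model": "M2", "width": 0.30},
    {"area": "air pollution", "object": "NO2 field",
     "conditions": "error 15%", "model": "M1", "width": 0.20},
    {"area": "air pollution", "object": "CO dynamics",
     "conditions": "error 10%", "model": "M3", "width": 0.25},
]


def choose_model(area, obj, conditions):
    """Return the stored models, or None when new ones must be built."""
    found = select(REPOSITORY, area=area, object=obj, conditions=conditions)
    ranked = project(order(found, "width"), "model", "width")
    return ranked or None


candidates = choose_model("air pollution", "NO2 field", "error 15%")
```

A return value of `None` corresponds to the case in which the repository has no adequate model, so that the construction branch (steps 6–9) has to be followed instead.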

The proposed ontological description makes it possible to develop the environment for modeling on the basis of interval data.

5. Results and Discussion

The practical implementation of the ontology of mathematical modeling based on interval data leads to the formation of common structural elements based on the specifics of their use for a particular subject area. The practical implementation of software as one of the options for using the developed repository of model experiments in various subject areas within the proposed ontological approach is described in this paper.

As an example of the application of the ontological approach, consider the problem of building models of fields of harmful emission concentrations in the surface layer of the atmosphere on the basis of macromodels in the form of difference operators, whose structure must be selected so that it agrees with the experimental data in the presence of large observation errors. Partial differential equations, or their difference analogs, serve as the theoretical basis for modeling the spread of pollutants in the atmosphere. In addition, because the observation errors are large but their bounds are usually known, the difference operators are built using methods of interval data analysis.

Consider the case of describing the field of concentrations of harmful emissions of a substance in the surface layer of the atmosphere by a macromodel in the form of a difference operator (2). In our case, the operator yields the predicted (true) value of the concentration of harmful substances in the surface layer of the atmosphere at a point in the city with given discrete coordinates, and it depends on an unknown vector of parameters of the difference operator.

To estimate the vector of parameters of the difference operator, the results of observations of the concentration of harmful substances at given discrete coordinates are used. Each observation gives the measured value of the concentration of harmful substances in the surface layer of the atmosphere at a point in the city with the given discrete coordinates and is affected by a random error that is bounded in amplitude; in the general case, the bounds depend on the discrete values of the space coordinates.

Using the model of observations (44) and taking into account the limitation on the amplitude of the error (45), the estimates of the concentration of harmful substances obtained from experimental data acquire an interval representation: for each point, a guaranteed interval that includes the true unknown concentration of the substance.

Then, substituting into expression (5) the predicted concentration given by the difference operator (43), the conditions for matching the experimental values of concentrations with the simulated ones are obtained.
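The interval representation of the observations and the matching conditions can be illustrated with a short sketch. It assumes, for illustration only, that the error bound is proportional to the measured value (a relative error, such as the 15% level mentioned later in this section); the function names and the numerical values are placeholders and are not data from the study.

```python
def to_intervals(measured, relative_error):
    """Turn measured values into guaranteed intervals (low, high)."""
    return [(m * (1 - relative_error), m * (1 + relative_error))
            for m in measured]


def is_consistent(predicted, intervals):
    """True when every predicted value lies inside its interval."""
    return all(low <= p <= high
               for p, (low, high) in zip(predicted, intervals))


measured = [0.040, 0.050, 0.060]          # placeholder concentrations
intervals = to_intervals(measured, 0.15)  # assumed relative error bound

# A candidate model is accepted only if all its predictions
# fall within the corresponding intervals.
accepted = is_consistent([0.042, 0.049, 0.065], intervals)
rejected = is_consistent([0.030, 0.049, 0.065], intervals)
```

In this sketch a candidate structure with given parameters either satisfies all interval conditions or is rejected, which is the check that an identification procedure would repeat for each candidate.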

Further, according to the description in paragraph 2, it is necessary to solve the problem of structural and parametric identification of the model using ABC algorithms.

One of the initial structures generated on the basis of the ontological description has the following form:

As a result of solving the problem of structural and parametric identification, a difference operator that adequately describes the spatial distribution of concentrations of nitrogen dioxide is obtained:

The mathematical models obtained in this way are stored in the repository.

If the object is changed, then in general the identification scheme remains unchanged.

The authors of this article have developed a number of models not only for predicting the spatial distribution of nitrogen dioxide concentrations for different conditions but also for predicting the dynamics of this harmful substance or the dynamics of carbon monoxide for different conditions. However, for their effective use, it is necessary to obtain a correct ontological description.

Based on the developed method of ontological description of the mathematical modeling of objects on the basis of interval data, some results of such description are shown in Table 2.

Based on the method of choosing a mathematical model within the ontological approach for modeling based on interval data, it is possible to switch models from the information repository depending on the conditions and specifics of the relevant experimental studies. The ability to control the switching process was practically implemented in the web-based information system SmartOntologyModeller.

Table 2 contains three columns that correspond to the description of the ontological model, namely Attribute, Description, and Value. These structural elements represent the subject area, object, modeling conditions (two groups of conditions), variables, etc. Also, for the specified conditions of application, there is a repository of models (four such models are given in the table).

Thus, having a repository for the specified object (concentrations of harmful emissions in the surface layer of the atmosphere), the first five steps of the above method of choosing a mathematical model for modeling based on interval data can be applied:
Step 1. Selection of the subject area: “pollution of the surface layer of the atmosphere by harmful emissions from vehicles.”
Step 2. Selection of the modeling object: “concentration of nitrogen dioxide emissions from vehicles.”
Step 3. Selection of the conditions for the application of the model: “error in measuring the concentration of nitrogen dioxide at the level of 15%; control of traffic intensity; uniform period of measurements.”
Step 4. Selection of a model from the repository for approximation of the fields of concentrations of nitrogen dioxide emissions from vehicles in Ternopil city, taking into account the results obtained in the previous steps.
Step 5. For the obtained model, tabular and visual results of its use can also be retrieved from the repository. For example, Table 3 compares the predicted nitrogen dioxide concentrations with those measured at control points.

Figure 5 shows an example of switching by choosing a mathematical model based on interval data depending on changes in the subject characteristics of the model. Switching occurs by changing the conditions of the simulation environment.
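The switching behavior can be sketched as a lookup that is repeated whenever the conditions of the simulation environment change. The class name `ModelSwitch`, the condition strings, and the model names are hypothetical and serve only as an illustration; they do not reproduce the interface of SmartOntologyModeller.

```python
class ModelSwitch:
    """Keeps track of the model that matches the current conditions."""

    def __init__(self, repository):
        # Maps a frozenset of condition strings to a model name.
        self._repository = repository
        self.active = None

    def update(self, conditions):
        """Activate the model registered for the given conditions, if any."""
        self.active = self._repository.get(frozenset(conditions))
        return self.active


# Placeholder repository: one model per set of conditions.
repository = {
    frozenset({"error 15%", "traffic intensity controlled"}): "model_1",
    frozenset({"error 10%", "traffic intensity controlled"}): "model_2",
}

switch = ModelSwitch(repository)
first = switch.update(["error 15%", "traffic intensity controlled"])
second = switch.update(["traffic intensity controlled", "error 10%"])
missing = switch.update(["error 5%"])  # no stored model for these conditions
```

Using a set of conditions as the key makes the lookup independent of the order in which the conditions are listed; a missing entry signals that a new model has to be built.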

It should be noted that for another task, such as modeling the dynamics of concentrations of harmful carbon monoxide emissions during the day in a certain area of the city, and given an existing repository of such models, the scheme of applying the method of choosing a mathematical model for modeling based on interval data will be the same. However, in the fifth step, the results will be presented in a form appropriate to the selected object. For this case, the results are presented in Figure 6.

The accuracy of the model of the dynamics of atmospheric pollution by vehicles is characterized by the equivalent accuracy of the measurement experiment. If the conditions of the experiment are changed, the accuracy of the model may also change. The advantage of the proposed approach is the saving of resources, which is achieved by reusing the models stored in the repository for the relevant objects.

Figure 6 shows the results of the corresponding switching, related to changes in the conditions of tracking traffic flows and according to the characteristics of the section of the street under research.

The connected Python toolkit allows the user to select a model instance and the corresponding operational example. The toolkit then uses the appropriate libraries to build operators that interpret equations from their formatted, indexed parts, initialize the model parameters from the corresponding operational example, and finally compute the required solution of the model. The results of the calculation are presented in the graphical interface as graphs, tables, and result files, and are also stored, with the appropriate refinements, in the operational part of the mathematical model. These refinements will make it possible in the future to choose the right models depending on the specifics of the experimental conditions and the relevant subject area.

6. Conclusions

The inductive approach to mathematical modeling of complex systems based on interval data is limited to strictly formalized and algorithmic procedures. The proposed ontological superstructure for mathematical modeling of objects based on interval data makes it possible to generate tools in the form of software for building interval models. On the other hand, in the presence of previously constructed interval discrete models, the ontological superstructure makes it possible to create a repository of these models, as well as to manage this repository. In this case, it serves as a “switch” that chooses the most accurate and adequate model from the repository of previously created models. The advantage of the proposed approach is illustrated by the example of modeling the processes of air pollution by harmful emissions from vehicles. In particular, the example illustrates the “switching” of the choice of a mathematical model based on interval data depending on changes in the subject characteristics of the model. Switching occurs by changing the conditions of the simulation environment.

In further research, the implementation of tools for integrating the proposed ontology into external information systems, for the purpose of their expansion and qualitative improvement, is planned.

Data Availability

The data cited in this study are available from the published papers or the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This research was funded by the Ministry of Science and Higher Education in Poland under the program “Regional Initiative of Excellence,” 2019–2022, project no. 015/RID/2018/19, total funding amount 10,721,040.00 PLN, and partially supported by the Ministry of Education and Science of Ukraine under the grant “Mathematical and computer modeling of objects with distributed parameters based on a combination of ontological and interval analysis,” January 2022–December 2024, state registration number 0122U001497.