Abstract

As tools for analyzing time series, grey prediction models have been widely used in many fields because of their high prediction accuracy and their ability to model small samples. The basic GM (1, N) model is the most popular and important multivariate grey model, in which “1” stands for “first order” and “N” stands for “multivariate.” The construction of the background values is not only an important step in grey modeling but also a key factor affecting the prediction accuracy of grey prediction models. To further improve the prediction accuracy of multivariate grey prediction models, this paper establishes a novel multivariate grey prediction model based on dynamic background values (abbreviated as the DBGM (1, N) model) and uses the whale optimization algorithm to solve for the optimal parameters of the model. The DBGM (1, N) model adapts to different time series by changing its parameters, which improves prediction accuracy and gives the model strong adaptability. Finally, four cases are used to verify the feasibility and effectiveness of the model. The results show that the proposed model significantly outperforms the other two multivariate grey prediction models.

1. Introduction

Time series prediction has always been an important issue in economics, finance, marketing, and social problems. With the development of science and technology, forecasting methods have emerged one after another. At present, hundreds of tools for analyzing time series have been developed, such as linear regression (LR), the autoregressive integrated moving average (ARIMA) model [1], and the dendritic neuron model [2, 3]. However, these prediction models can only be established when large samples are available, and in some real applications large samples are difficult to obtain. For example, drilling and sampling is an important means of analyzing the oil and gas reserves of a region; however, drilling is too costly for many holes to be drilled. Therefore, a large number of sample data on oil and gas reserves cannot be obtained. In such small sample situations, traditional forecasting models are no longer applicable.

Grey prediction models play an important role in grey system theory, which was pioneered by Deng [4]. At present, grey prediction models have been widely used in many fields because of their high prediction accuracy and their ability to model small samples [5–12]. Depending on the number of variables required for modeling, grey prediction models can be divided into univariate and multivariate grey prediction models. The GM (1, 1) model is the most basic univariate grey prediction model and is used to predict time series with high uncertainty. The GM (1, N) model is the most basic multivariate grey prediction model and is used to predict time series affected by several different factors. At present, scholars mainly focus on improving the GM (1, 1) model, and there are relatively few studies on improving the GM (1, N) model. Among them, Zeng et al. established a new optimized grey prediction model and confirmed its feasibility and effectiveness through examples [12]. To increase the adaptability of the multivariate grey prediction model, Wang established a multivariate grey prediction model with time power terms [13]. Xie et al. established a discrete multivariate grey prediction model [14]. To further improve the adaptability of the discrete multivariate grey prediction model, Ding et al. proposed a discrete multivariate grey prediction model with time power terms [15]. Considering the low accuracy of the discrete multivariate grey model, Ding et al. proposed a multivariate discrete grey prediction model with a time delay effect [16]. Ma et al. established a new multivariate grey prediction model with fractional order accumulation [17]. These improvements all further improved the accuracy of the multivariate grey prediction model and expanded grey system theory.

Many measures have been proposed to improve the GM (1, N) model, but very few of them optimize its background values. To make up for this shortcoming, this paper constructs new background values for the multivariate grey prediction model. The rest of this paper is organized as follows. Section 2 presents the preliminary knowledge required in this article, including the basic multivariate grey prediction model, its error analysis, and the optimized GM (1, N) model. Section 3 introduces the method of using the whale optimization algorithm to solve the new model. In Section 4, the advantages of the new model over the traditional grey model are illustrated by four real cases. The conclusions of this study are discussed in Section 5.

The main contributions of this paper are as follows:
(1) This paper proposes the idea of dynamic background values and combines it with the grey prediction model.
(2) This paper uses the whale optimization algorithm to solve for the optimal parameters of the model.
(3) The model proposed in this paper is successfully applied to the case of energy consumption in China.

2. Preliminary Knowledge

2.1. The Basic Multivariate Grey Prediction Model-GM (1, N) Model

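Definition 1. Assume that $X_1^{(0)} = \left(x_1^{(0)}(1), x_1^{(0)}(2), \ldots, x_1^{(0)}(n)\right)$ is the dependent variable sequence and $X_i^{(0)} = \left(x_i^{(0)}(1), x_i^{(0)}(2), \ldots, x_i^{(0)}(n)\right)$, $i = 2, 3, \ldots, N$, are the independent variable sequences. Then $X_i^{(1)} = \left(x_i^{(1)}(1), x_i^{(1)}(2), \ldots, x_i^{(1)}(n)\right)$, $i = 1, 2, \ldots, N$, are called the first-order accumulated generating sequences of $X_i^{(0)}$, where
$$x_i^{(1)}(k) = \sum_{j=1}^{k} x_i^{(0)}(j), \quad k = 1, 2, \ldots, n, \tag{1}$$
and the differential equation
$$\frac{dx_1^{(1)}(t)}{dt} + a x_1^{(1)}(t) = \sum_{i=2}^{N} b_i x_i^{(1)}(t) \tag{2}$$
is called the whitening differential equation of the multivariate grey forecasting model (GM (1, N) model).
The difference equation
$$x_1^{(0)}(k) + a z_1^{(1)}(k) = \sum_{i=2}^{N} b_i x_i^{(1)}(k) \tag{3}$$
is called the GM (1, N) model and is usually used to estimate the parameters of the model, where
$$z_1^{(1)}(k) = \frac{1}{2}\left(x_1^{(1)}(k) + x_1^{(1)}(k-1)\right), \quad k = 2, 3, \ldots, n, \tag{4}$$
are called the background values. The parameters $a$ and $b_2, \ldots, b_N$ of the GM (1, N) model are usually estimated using the least squares method, namely,
$$\hat{p} = \left[a, b_2, \ldots, b_N\right]^{T} = \left(B^{T}B\right)^{-1}B^{T}Y, \tag{5}$$
where
$$B = \begin{bmatrix} -z_1^{(1)}(2) & x_2^{(1)}(2) & \cdots & x_N^{(1)}(2) \\ -z_1^{(1)}(3) & x_2^{(1)}(3) & \cdots & x_N^{(1)}(3) \\ \vdots & \vdots & & \vdots \\ -z_1^{(1)}(n) & x_2^{(1)}(n) & \cdots & x_N^{(1)}(n) \end{bmatrix}, \quad Y = \begin{bmatrix} x_1^{(0)}(2) \\ x_1^{(0)}(3) \\ \vdots \\ x_1^{(0)}(n) \end{bmatrix}. \tag{6}$$
The discrete form of the solution of equation (2) can be written as
$$\hat{x}_1^{(1)}(k) = \left[x_1^{(0)}(1) - \frac{1}{a}\sum_{i=2}^{N} b_i x_i^{(1)}(k)\right]e^{-a(k-1)} + \frac{1}{a}\sum_{i=2}^{N} b_i x_i^{(1)}(k), \quad k = 2, 3, \ldots, m, \tag{7}$$
which is usually called the response function of the GM (1, N) model. According to the first-order inverse accumulation formula, the prediction formula of the GM (1, N) model can be obtained, namely,
$$\hat{x}_1^{(0)}(k) = \hat{x}_1^{(1)}(k) - \hat{x}_1^{(1)}(k-1), \quad k = 2, 3, \ldots, m, \tag{8}$$
where $\hat{x}_1^{(1)}(1) = x_1^{(0)}(1)$, n is the number of samples used to build the model, and m-n is the number of data points to be predicted.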
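The modeling procedure of this subsection can be sketched in code. The following is a minimal illustration that assumes the standard textbook form of the GM (1, N) model: mean-of-neighbours background values, least squares estimation of the parameters, an exponential response function, and inverse accumulation. The function names (`ago`, `solve_linear`, `gm1n_fit`, `gm1n_predict`) are hypothetical, and the sketch is not the authors' implementation.

```python
import math

def ago(seq):
    """First-order accumulated generating operation (1-AGO): running sums."""
    out, total = [], 0.0
    for value in seq:
        total += value
        out.append(total)
    return out

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def gm1n_fit(y0, xs0):
    """Least squares estimate of [a, b_2, ..., b_N].

    y0  : dependent series (length n)
    xs0 : list of independent series, each of length n
    Background values are assumed to be the mean of adjacent accumulated values.
    """
    y1 = ago(y0)
    xs1 = [ago(x) for x in xs0]
    n = len(y0)
    B = [[-0.5 * (y1[k] + y1[k - 1])] + [x1[k] for x1 in xs1]
         for k in range(1, n)]
    Y = [y0[k] for k in range(1, n)]
    p = len(B[0])
    # Normal equations (B^T B) params = B^T Y.
    BtB = [[sum(r[i] * r[j] for r in B) for j in range(p)] for i in range(p)]
    BtY = [sum(r[i] * y for r, y in zip(B, Y)) for i in range(p)]
    return solve_linear(BtB, BtY)

def gm1n_predict(params, y0_first, xs0):
    """Response function followed by inverse accumulation (1-IAGO).

    xs0 must cover the whole horizon (modeling and prediction periods).
    """
    a, bs = params[0], params[1:]
    xs1 = [ago(x) for x in xs0]
    m = len(xs0[0])
    y1_hat = [y0_first]
    for k in range(1, m):
        drive = sum(b * x1[k] for b, x1 in zip(bs, xs1)) / a
        y1_hat.append((y0_first - drive) * math.exp(-a * k) + drive)
    return [y1_hat[0]] + [y1_hat[k] - y1_hat[k - 1] for k in range(1, m)]
```

Solving the normal equations directly keeps the sketch dependency-free; for ill-conditioned data a QR-based or library least squares solver would likely be a better choice.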

2.2. Error Analysis of the GM (1, N) Model

According to equation (7), the prediction accuracy of the GM (1, N) model depends on its parameters, and these parameters are closely related to the background values. Therefore, the background values are the key factor influencing the prediction accuracy of the GM (1, N) model.

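Integrating the whitening equation (2), with parameters $a$ and $b_i$, over the interval $[k-1, k]$ gives
$$\int_{k-1}^{k} dx_1^{(1)}(t) + a\int_{k-1}^{k} x_1^{(1)}(t)\,dt = \sum_{i=2}^{N} b_i\int_{k-1}^{k} x_i^{(1)}(t)\,dt. \tag{9}$$

Since $\int_{k-1}^{k} dx_1^{(1)}(t) = x_1^{(1)}(k) - x_1^{(1)}(k-1) = x_1^{(0)}(k)$, it follows from equation (9) that
$$x_1^{(0)}(k) + a\int_{k-1}^{k} x_1^{(1)}(t)\,dt = \sum_{i=2}^{N} b_i\int_{k-1}^{k} x_i^{(1)}(t)\,dt. \tag{10}$$

Comparing the GM (1, N) difference equation with equation (10) shows that the original GM (1, N) model uses the background value $z_1^{(1)}(k)$ and the accumulated value $x_i^{(1)}(k)$ instead of $\int_{k-1}^{k} x_1^{(1)}(t)\,dt$ and $\int_{k-1}^{k} x_i^{(1)}(t)\,dt$, respectively, which creates errors.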
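The size of this error can be illustrated numerically. The sketch below assumes, purely for illustration, that the accumulated series follows an exponential trend exp(r t) with a hypothetical growth rate `r`, and compares the mean-of-neighbours background value with the exact integral over [k-1, k].

```python
import math

def exact_integral(r, k):
    """Exact integral of exp(r * t) over [k - 1, k]."""
    return (math.exp(r * k) - math.exp(r * (k - 1))) / r

def trapezoid_value(r, k):
    """Mean of the two endpoint values, as used by the basic background value."""
    return 0.5 * (math.exp(r * k) + math.exp(r * (k - 1)))

def relative_gap(r, k):
    """Relative overestimate of the integral by the endpoint mean."""
    exact = exact_integral(r, k)
    return (trapezoid_value(r, k) - exact) / exact

for r in (0.1, 0.3, 0.5):
    print(f"r = {r}: relative gap = {relative_gap(r, 3):.4%}")
```

For an exponential trend the relative gap does not depend on k and grows with |r|, which is consistent with the observation that a fixed background value becomes less accurate for rapidly changing series.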

2.3. Dynamic Background Values

This section will introduce the preliminary knowledge needed to establish the optimized GM (1, N) model.

According to the nature of grey modeling, we can know that the time series used to construct the background value forms a large interval on the time axis, which we denote as , and in this large interval , each pair of adjacent constitutes a small interval . It is easy to see that is continuous in the interval ; then, according to the knowledge of advanced mathematics, the following formula can be obtained:

This article provides a simple proof process, as shown in Lemma 1.

Lemma 1 (the second mean value theorem). If function is continuous on interval and is a monotonic bounded function on interval , then there is that makes true.

When Lemma 1 satisfies equation (12), we can get the following equation:

Using equation (11) to replace the equation (4) in the GM (1, N) model, the background value of the optimized GM (1, N) model is obtained, namely,

It can be seen that the parameters are dynamic values, which will change with the change of that constructs the background value. Therefore, this article calls them dynamic background values.
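Why such parameters must vary with the data can be seen in a small numerical sketch. It assumes that the dynamic background value is a weighted combination of two adjacent accumulated values, lam * x(k) + (1 - lam) * x(k - 1); both this form and the symbol `lam` are assumptions made for illustration, since the corresponding equations are not reproduced in this text. For a hypothetical exponential trend exp(r t), the weight that reproduces the exact integral depends on the growth rate `r`.

```python
import math

def implied_weight(x_prev, x_curr, integral):
    """Weight lam such that lam * x_curr + (1 - lam) * x_prev equals the integral."""
    return (integral - x_prev) / (x_curr - x_prev)

def exponential_case(r, k):
    """Implied weight on [k - 1, k] when the series follows exp(r * t)."""
    x_prev, x_curr = math.exp(r * (k - 1)), math.exp(r * k)
    integral = (x_curr - x_prev) / r
    return implied_weight(x_prev, x_curr, integral)

for r in (0.05, 0.5, 1.0):
    print(f"r = {r}: implied weight = {exponential_case(r, 3):.4f}")
```

Under these assumptions the weight approaches 0.5 (the classical mean background value) only when the series changes slowly, and moves away from 0.5 as the growth rate increases.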

2.4. The Optimized GM (1, N) Model Based on Dynamic Background Values

This section will introduce in detail how to optimize the GM (1, N) model.

Considering the integration of equation (2) in the interval , it follows

According to formula (14) and the trapezoidal formula, we can get the parameter estimation formula of the optimized GM (1, N) model, namely,

The GM (1, N) model with the same parameter estimation formula as formula (16) is called the optimized GM (1, N) model based on dynamic background values (abbreviated as DBGM (1, N) model).

Similarly, according to the least squares method, the parameter estimation method of the DBGM (1, N) model can be obtained, namely,where

The time response function of the DBGM (1, N) model can be obtained by substituting the parameters estimated from equation (17) into equation (7), namely,

According to the first-order inverse accumulation formula, the prediction formula of the DBGM (1, N) model can be obtained, namely,where n is the number of samples used to build the model and m-n is the number of data points to be predicted.

3. Solving Method of the DBGM (1, N) Model

3.1. Method for Determining Parameters of the DBGM (1, N) Model

Note that the dynamic background value parameters were assumed to be given when the proposed model was built. In fact, selecting their optimal values is also a significant issue, as it plays an important role in improving the accuracy of the DBGM (1, N) model. This section presents the details of how to compute these optimal values with the whale optimization algorithm. The optimal values should give the proposed model the highest accuracy on the given sample. Therefore, we establish an optimization problem whose objective is to minimize the error of the proposed model by changing the values of these parameters and whose constraints follow the modeling steps of the proposed model. In this paper, we choose the mean absolute percentage error (MAPE) as the criterion for evaluating the validation error of the proposed model, and the mathematical formulation of the optimization problem can then be written as

The abovementioned optimization problem is very complicated, and conventional methods cannot be used to solve it. Therefore, this article uses the whale optimization algorithm to solve it (see the next section for details).

3.2. The Whale Optimization Algorithm

Inspired by the social behavior of humpback whale groups, Mirjalili and Lewis proposed the whale optimization algorithm (WOA) in 2016 [18]. WOA has since been widely used in bioinformatics [19], image processing [20], and other fields because of its excellent performance. WOA has also been used to solve nonlinear programming problems that are more complex than problem (21) [21]. Therefore, this paper chooses WOA to solve the nonlinear programming problem (21). The main idea of WOA is as follows.

When whales prey, they move in a spiral to surround the school of fish currently considered the best target. Then, these whales update their positions based on the candidate target. This behavior can be expressed by a mathematical formula, namely,where represents the current position of the whales, represents the current best position of the whales, is a random number in the interval , is a stochastic number in the interval , is an arbitrary constant which determines the shape of the spiral movement, is the maximum number of iterations of the algorithm, and is a probability to choose a movement strategy from encircling and spiral moving behaviors. When the norm of is greater than 1, the position of all whales is updated based on the position of a whale randomly selected. This model can also be expressed by mathematical formulas, namely,where is the position of a randomly selected whale in the herd.
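The update rules described above can be sketched as follows. This is a minimal illustration of the commonly stated WOA scheme (encircling the best whale, spiral movement, and random search when |A| is at least 1), not the authors' implementation. The symbol names `a`, `A`, `C`, `b`, `l`, and `p` follow common usage and are assumptions, as are the function name `woa_minimize`, the box bounds, and the default settings.

```python
import math
import random

def woa_minimize(fitness, dim, lower, upper,
                 n_whales=20, max_iter=200, b=1.0, seed=0):
    """Minimize `fitness` over the box [lower, upper]^dim with a basic WOA."""
    rng = random.Random(seed)

    def clip(v):
        # Keep every coordinate inside the search box.
        return min(max(v, lower), upper)

    whales = [[rng.uniform(lower, upper) for _ in range(dim)]
              for _ in range(n_whales)]
    best = min(whales, key=fitness)[:]
    best_fit = fitness(best)

    for t in range(max_iter):
        a = 2.0 * (1.0 - t / max_iter)  # decreases linearly from 2 towards 0
        for i, x in enumerate(whales):
            p = rng.random()            # chooses the movement strategy
            if p < 0.5:
                A = 2.0 * a * rng.random() - a
                C = 2.0 * rng.random()
                # |A| < 1: encircle the best whale; otherwise follow a random whale.
                ref = best if abs(A) < 1.0 else whales[rng.randrange(n_whales)]
                new = [clip(ref[j] - A * abs(C * ref[j] - x[j]))
                       for j in range(dim)]
            else:
                l = rng.uniform(-1.0, 1.0)
                # Spiral movement around the current best position.
                new = [clip(abs(best[j] - x[j]) * math.exp(b * l)
                            * math.cos(2.0 * math.pi * l) + best[j])
                       for j in range(dim)]
            whales[i] = new
            f = fitness(new)
            if f < best_fit:
                best, best_fit = new[:], f
    return best, best_fit

# Usage on a toy objective (sum of squares), unrelated to the case studies.
best, best_fit = woa_minimize(lambda v: sum(c * c for c in v),
                              dim=2, lower=-5.0, upper=5.0)
print(best, best_fit)
```

In the setting of this paper, `fitness` would evaluate the MAPE of the model for a candidate set of background value parameters; here the best position is updated as soon as a better whale is found, which is a simplification.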

Since the original WOA is designed for unconstrained optimization problems, it cannot be directly used to solve optimization problems with constraints. Therefore, a fitness function needs to be established to calculate the fitness of each whale agent. According to the nonlinear programming problem described in Section 3.1, the fitness function can be described as

It is worth noting that using WOA to solve the model does not mean that WOA is the most suitable algorithm for this purpose. WOA is used here only to solve for the model parameters. In fact, there are algorithms that perform better than WOA, for example, the chaotic local search-based differential evolution algorithms proposed in reference [22] and the aggregate learning gravitational search algorithm with self-adaptive gravitational constants proposed in reference [23]; interested readers can refer to these works.

3.3. The Computational Steps

According to the principles of the DBGM (1, N) model, the computational steps can be summarized as follows:

Step 1. Calculate the parameters of the model according to the method described in Section 3.1.

Step 2. Substitute the parameters obtained in Step 1 into equation (14) to calculate the dynamic background values.

Step 3. Substitute the dynamic background values into equation (17) to calculate the least squares parameters of the DBGM (1, N) model.

Step 4. Substitute the least squares parameters obtained in Step 3 into equation (19) to get the predicted results of the DBGM (1, N) model.

Step 5. Compute the predicted values of the original series using the 1-IAGO equation (20).

3.4. Evaluation Indices of the Modeling Accuracy

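The absolute percentage error (APE) and the mean absolute percentage error (MAPE) are used to assess the accuracy of the prediction models. They are defined as
$$\mathrm{APE}(k) = \left|\frac{x^{(0)}(k) - \hat{x}^{(0)}(k)}{x^{(0)}(k)}\right| \times 100\%, \qquad \mathrm{MAPE} = \frac{1}{K}\sum_{k} \mathrm{APE}(k),$$
where $x^{(0)}(k)$ is the original series, $\hat{x}^{(0)}(k)$ is the fitted or predicted series, and the average is taken over the $K$ points being evaluated.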
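A minimal sketch of the two indices, assuming the usual definitions (APE as the absolute relative error of a single point in percent, MAPE as the mean of the APE values). The function names `ape` and `mape` and the numbers in the usage lines are illustrative only.

```python
def ape(actual, predicted):
    """Absolute percentage error of each point, in percent."""
    return [abs(a - p) / abs(a) * 100.0 for a, p in zip(actual, predicted)]

def mape(actual, predicted):
    """Mean absolute percentage error over all evaluated points, in percent."""
    errors = ape(actual, predicted)
    return sum(errors) / len(errors)

# Toy numbers, not taken from the case studies.
print(ape([100.0, 200.0], [110.0, 190.0]))
print(mape([100.0, 200.0], [110.0, 190.0]))
```

Both functions assume that no actual value is zero; series containing zeros would need a different index.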

4. Application

In this section, the advantages of the DBGM (1, N) model over the other models are demonstrated by four real cases. The comparison models are the GM (1, N) model, ARIMA, ANN, and the optimized GM (1, N) model based on the Simpson formula (abbreviated as Model 1) [24]. Only optimization measures of the same type are meaningful for comparison, and the authors found only one article that optimizes the background values of the GM (1, N) model; therefore, the model proposed in that article is used for comparison. Table 1 briefly lists the basic information for each data set.

4.1. Selection Method of N of Three Models

As is well known, a country’s energy consumption is closely related to the country’s economic situation and population. Therefore, this article chooses the population and GDP as the influencing factors; that is, China’s electricity consumption is set to , China’s energy consumption is set to , China’s per capita living energy consumption is set to , China’s per capita living electricity consumption is set to , the population is set to , and the GDP is set to .

4.2. Case 1: Forecasting China’s Electricity Consumption

With the continuous progress of human society and the continuous improvement of living standards, energy shortage has become an issue of close public concern. As one of the most important energy sources, electricity is a main driving force for the development of the country and society. Short-term electricity demand forecasting is especially important in power system planning, including the scheduling of fuel purchases, the economic dispatch of production capacity, and power system management. However, unlike other energy sources, electricity cannot be stored on a large scale. If power consumption is overestimated, power system operators will be misled into making inappropriate decisions, resulting in increased operating costs and wasted energy. If power consumption is underestimated, consumers will face electricity shortages. Bunn and Farmer pointed out that every 1% increase in the forecast error of power production leads to a loss of 10 million dollars in operating value [25]. Therefore, the power market needs an accurate and effective forecasting method to predict electricity consumption.

In this section, the DBGM (1, N) model is used to forecast China’s electricity consumption. The raw data of China’s electricity consumption from 2005 to 2017 are collected from the official website of the National Bureau of Statistics of China (http://www.stats.gov.cn/english/) and are listed in Table 1. The points from 2005–2011 are used for building the prediction models, and the last 6 points are used for testing the prediction accuracy of the models. The prediction results of the five prediction models are shown in Table 2. The parameters of the three grey prediction models in this case are shown in Table 3. The predicted values and MAPEs of the three grey prediction models are also plotted in Figure 1 (only the three grey prediction models are shown in the figure so that the differences between models of the same type can be seen more intuitively). According to the predicted results shown in Table 2, the predicted results of the DBGM (1, N) model are the closest to the actual values, while the performance of ANN is the worst. The curves shown in Figure 1 also show that the predicted values of the DBGM (1, N) model are the closest to the actual values. Therefore, the DBGM (1, N) model shows the best prediction performance in this case.

4.3. Case 2: Forecasting China’s Energy Consumption

As a basic resource, energy not only plays an important role in promoting economic development but is also closely related to people’s daily life; the production and consumption of energy also have an important impact on the ecological environment. Scientific and accurate prediction of energy consumption is the premise for formulating energy development plans. In this section, the DBGM (1, N) model is used to forecast China’s energy consumption. The raw data of China’s energy consumption from 2005 to 2019 are collected from the official website of the National Bureau of Statistics of China (http://www.stats.gov.cn/english/) and are listed in Table 1. The points from 2005–2011 are used for building the prediction models, and the last 8 points are used for testing the prediction accuracy of the models. The prediction results of the five prediction models are shown in Table 4. The parameters of the three grey prediction models in this case are shown in Table 5. The predicted values and MAPEs of the three grey prediction models are also plotted in Figure 2. According to the predicted results shown in Table 4, the predicted results of the DBGM (1, N) model are the closest to the actual values, while the performance of ANN is the worst. The curves shown in Figure 2 also show that the predicted values of the DBGM (1, N) model are the closest to the actual values. Therefore, the DBGM (1, N) model shows the best prediction performance in this case.

4.4. Case 3: Forecasting China’s per Capita Living Energy Consumption

With the advancement of urbanization, economic growth, and the improvement of residents’ living standards, the demand for energy in China has increased greatly. Residential energy consumption in particular is growing rapidly and accounts for a large proportion of China’s total energy consumption. Whether the future energy supply can support the sustainable growth of China’s economy has become a topic of concern at home and abroad. Therefore, accurately predicting future per capita living energy consumption is of great practical significance for maintaining the healthy, sustainable, and stable development of China’s society and economy.

In this section, the DBGM (1, N) model is used to forecast China’s per capita living energy consumption. The raw data of China’s per capita living energy consumption from 2005 to 2017 are collected from the official website of the National Bureau of Statistics of China (http://www.stats.gov.cn/english/) and are listed in Table 1. The points from 2005–2011 are used for building the prediction models, and the last 6 points are used for testing the prediction accuracy of the models. The prediction results of the five prediction models are shown in Table 6. The parameters of the three grey prediction models in this case are shown in Table 7. The predicted values and MAPEs of the three grey prediction models are also plotted in Figure 3. According to the predicted results shown in Table 6, the predicted results of the DBGM (1, N) model are the closest to the actual values, while the performance of ANN is the worst. The curves shown in Figure 3 also show that the predicted values of the DBGM (1, N) model are the closest to the actual values. Therefore, the DBGM (1, N) model shows the best prediction performance in this case.

4.5. Case 4: Forecasting China’s per Capita Living Electricity Consumption

In this section, the DBGM (1, N) model is used to forecast China’s per capita living electricity consumption. The raw data of China’s per capita living electricity consumption from 2005 to 2017 are collected from the official website of the National Bureau of Statistics of China and are listed in Table 1. The points from 2005–2011 are used for building the prediction models, and the last 6 points are used for testing the prediction accuracy of the models. The prediction results of the five prediction models are shown in Table 8. The parameters of the three grey prediction models in this case are shown in Table 9. The predicted values and MAPEs of the three grey prediction models are also plotted in Figure 4. According to the predicted results shown in Table 8, the predicted results of the DBGM (1, N) model are the closest to the actual values, while the performance of ANN is the worst. The curves shown in Figure 4 also show that the predicted values of the DBGM (1, N) model are the closest to the actual values. Therefore, the DBGM (1, N) model shows the best prediction performance in this case.

5. Conclusions

In this paper, we proposed a novel multivariate grey model based on dynamic background values (abbreviated as the DBGM (1, N) model), along with a method based on the whale optimization algorithm for optimizing its unknown parameters. The DBGM (1, N) model adapts to different time series by changing its parameters, which improves prediction accuracy and gives the model strong adaptability. To verify the feasibility and effectiveness of the DBGM (1, N) model, the DBGM (1, N) model, the GM (1, N) model, ARIMA, ANN, and Model 1 (another optimized GM (1, N) model, namely, the optimized GM (1, N) model based on the Simpson formula [24]) are applied to four real cases. The results of the four cases show that the prediction accuracy and fitting accuracy of the DBGM (1, N) model are greatly improved compared with those of the GM (1, N) model and that the prediction accuracy of the DBGM (1, N) model is the highest among the five prediction models. Therefore, the DBGM (1, N) model proposed in this paper has practical value. It is worth noting that the proposed method of optimizing the background values of the multivariate grey prediction model can be used not only to improve the prediction accuracy of grey prediction models but also to help solve the function approximation problem of definite integrals.

Although the DBGM (1, N) model proposed in this paper has high prediction accuracy, it still has room for improvement. Whether its prediction accuracy can be further improved by combining it with the fractional accumulation operator, and how to carry out this combination, remain open problems. In addition, although the multivariate grey prediction model is more reasonable and scientific than the univariate grey prediction model, it still has a shortcoming. When the multivariate grey prediction model is used to predict the dependent variable, the independent variable data must be provided; that is, to predict the dependent variable for the next five years, the independent variable data for the next five years must be available. How to obtain the independent variable data for the next five years is still an open question. This problem exists not only in multivariate grey prediction models but also in other prediction models that need to consider influencing factors. One possible direction is to first use a univariate grey prediction model to predict the independent variables and then use these predicted data to build a multivariate grey prediction model for prediction.

Data Availability

The raw data of China’s energy consumption from 2005 to 2019 are collected from the official website of the National Bureau of Statistics of China (http://www.stats.gov.cn/english/).

Conflicts of Interest

The authors declare that they have no conflicts of interest.