Abstract

Accurate and reliable predictions of landslide displacements are difficult to perform using traditional point prediction approaches due to the associated uncertainty. Prediction intervals are effective tools for quantifying the uncertainty of point predictions by estimating the limit of future landslide displacements. In this paper, under the framework of the original lower upper bound estimation method, a direct interval prediction approach is proposed for landslide displacements based on the least squares support vector machine (LSSVM) and differential search algorithms. Two LSSVM models are directly implemented to generate the interval of future displacements, and the optimal model parameters are derived by the differential search algorithm. The Baishuihe landslide and the Tanjiahe landslide located on the shoreline of the Three Gorges Reservoir, China, are used to test the proposed approach. Compared with other models, the proposed method performed best and presented the smallest coverage width-based criterion values of 0.8927 and 1.0562 at monitoring stations XD01 and ZG118 for the Baishuihe landslide, respectively, and 0.1316 and 0.1191 at monitoring stations ZG289 and ZG287 for the Tanjiahe landslide, respectively. The results indicate that the proposed approach can provide high-quality prediction intervals for landslide displacements in the Three Gorges Reservoir area.

1. Introduction

Landslides are a major hazard that threatens human life and property and causes tremendous damage to the environment. The Three Gorges Reservoir area in China is a high-incidence area for landslide disasters [1, 2], with more than 4,200 landslides [3]. It is impossible to comprehensively control all potential landslides in this area. Therefore, early warning systems, which represent a cost-effective approach, are urgently needed for mitigating disaster risk.

Accurate and reliable displacement prediction is a core element of landslide early warning systems [4]. However, landslide deformation is characterized by complexity, uncertainty, and nonlinearity. Various factors, such as the landform, geological structure, stratigraphic lithology, rainfall, and reservoir level, affect the evolution of landslides and increase the difficulty of accurately predicting landslide displacement. Numerous prediction models have been proposed during the past several years. These models can be divided into two main categories: physically based models and phenomenological models [5]. Physically based models use the relationship between the geomaterial physical properties (e.g., shear strength, deformation modulus, and permeability coefficient) and landslide displacement to predict future displacements. A physically based model facilitates a better understanding of landslide deformation mechanisms; however, due to uncertainty in the geomaterial properties and boundary conditions of the model and the time-consuming implementation, the applicability of such models in large and complex landslides is limited. Alternatively, phenomenological models directly reveal the empirical relationship between the limited displacement data and underlying causes [5]. Compared with physically based models, phenomenological models do not require the construction of explicit expressions between the geomaterial properties and landslide displacements; therefore, they have broad applicability. Since Saito [6] established an empirical approach for landslide prediction based on tertiary creep in 1965, phenomenological models have evolved from empirical models to statistical prediction models [7] and then to computational intelligence models [4, 8]. In recent years, machine learning methods have been introduced for landslide prediction because of their powerful nonlinear mapping ability [4, 9–11].
These prediction methods include single computational intelligence models, such as backpropagation neural network (BPNN) [12, 13], Gaussian process [14], and extreme learning machine (ELM) [15], as well as hybrid intelligent models, such as ensemble of ELMs [16], decision tree combined two-step cluster [17], SVM optimized by particle swarm optimization (PSO) [18], LSSVM optimized by a genetic algorithm [19], and chaotic ELM [20]. The outstanding performance of computational intelligence algorithms in landslide displacement prediction makes this method increasingly attractive.

However, most prediction approaches are deterministic or point prediction methods [21]. Although the output of a specific model is believed to be quite accurate, it is still affected by the uncertainty contributed by the model structure, input selection, dataset noise, and so on. A single point prediction cannot provide the degree of uncertainty associated with a prediction. Thus, decision makers cannot determine the level of risk when facing a mitigation decision. From a practical viewpoint, the information provided by traditional point prediction methods may be insufficient. Therefore, more meaningful methods must be proposed to evaluate the uncertainties associated with point prediction. Prediction intervals (PIs) are powerful techniques for quantifying the uncertainty of point predictions [22]. These techniques consist of upper and lower bounds within which future targets are expected to lie with a predetermined probability. In this form, the best and worst conditions can be obtained, thus enabling decision makers to make more informed decisions. In recent years, numerous studies reporting the application of PIs have been performed in many fields, such as electricity price forecasting [23], flood forecasting [24], and wind power forecasting [25, 26].

PIs have also been introduced into landslide displacement predictions. However, these studies are still at a nascent stage and very limited. In [21, 27], the bootstrap technique and the ELM method were combined for the interval prediction of landslide displacement. The bootstrap technique is the most frequently used technique for the construction of PIs, and it is easy to implement and quite reliable compared with other approaches. However, the calculation efficiency of this method is low for large datasets [28]. In [29], an improved lower upper bound estimation (LUBE) method for the interval prediction of landslide displacements was proposed. The method constructs PIs by utilizing an evolutionary algorithm to optimize the output weights of the artificial neural network (ANN) model with random hidden weights. The LUBE method [28] is a reliable method for interval predictions of time series data. It constructs PIs directly without the limitations of implementation difficulty, low computational efficiency, or doubtful assumptions regarding the data distribution compared with traditional methods (e.g., bootstrap and Bayesian methods) [30]. However, ANNs require the adjustment of many parameters and are prone to being trapped in a local optimum. These drawbacks of the ANN-based LUBE method may increase the forecasting uncertainty.

In this study, the LUBE method is applied to construct the PIs of landslide displacement. To overcome the disadvantages of ANNs, LSSVMs are applied instead. Unlike ANNs, the LSSVM has an excellent inference and generalization capacity with fewer debugging parameters and can always find a global minimum and avoid the overfitting problem. The differential search (DS) algorithm [31], which is a stochastic computational intelligence algorithm with a strong global optimization capability, is introduced to optimize the parameters of the LSSVMs. Thus, a direct interval prediction approach, namely, DS-LSSVM, for landslide displacement is developed. For testing the proposed approach, the Baishuihe landslide and Tanjiahe landslide located on the shoreline of the Three Gorges Reservoir, China, are applied. To validate the effectiveness of the proposed method, it is compared with several other methods, including the LSSVM model optimized by particle swarm optimization (PSO-LSSVM), LSSVM model optimized by genetic algorithm (GA-LSSVM), hybrid method combining the bootstrap, ELM, and ANN methods (bootstrap-ELM-ANN), and ELM optimized by particle swarm optimization (PSO-ELM).

2. Study Area

The study area is located in the Three Gorges Reservoir area in Shazhenxi, a town in Zigui County in Hubei Province, China (Figure 1(a)), and it belongs to a low-middle mountain topographic region. The mountain elevation in this area ranges from 500 to 900 m, and the range extends in the east-west direction. The lithology mainly consists of Jurassic and Triassic sandstone, shale, and limestone rocks, and their combination often forms soft and hard formations. The site belongs to a subtropical monsoon climate. The annual average temperature is 15°C, and the annual average precipitation is 958 mm. The rainy season is mainly concentrated from April to October, during which the monthly average rainfall is between 150 mm and 458 mm. The regulation of the Three Gorges Reservoir level adopts dry season and flood season schemes, and the reservoir level fluctuates between 145 m and 175 m and shows obvious seasonal features. From mid-September to the end of October every year, the water level rises from 145 m to 175 m and remains at 175 m until December. From January to May, the reservoir declines to 155 m. In mid-June, the reservoir reduces to a flood-control limit of 145 m. Between mid-June and mid-September, the water level is generally maintained at 145 m. Groundwater in the study area is mainly recharged by rainfall and the reservoir, and the reservoir level is the lowest groundwater erosion level. Due to the complex geological conditions, landslide hazards are frequent. The Baishuihe and Tanjiahe landslides (Figure 1(b)) are two typical landslides in this area.

2.1. Baishuihe Landslide
2.1.1. Geological Conditions

The Baishuihe landslide is located on the southern bank of the Yangtze River, approximately 56 km upstream of the Three Gorges Dam (Figure 1). Figure 2(a) shows the three-dimensional topographic contour map of the landslide, which is fan-shaped in plan view and covers an area of approximately 21.5 × 10⁴ m². The elevations of the top and tip of the landslide are 390 m and 75 m, respectively. The length and width of the displaced material are 500 m and 430 m, respectively. The average thickness of the landslide is 30 m according to the boreholes. The volume of the displaced material is estimated at 645 × 10⁴ m³, and the main movement direction of the landslide is 20°.

The engineering geological cross section of the Baishuihe landslide is shown in Figure 2(b). The soil profile consists of three overlying zones. (1) The first zone is a quaternary deposit composed of silty clay and fragmented rubble. Silty clay is moist and brown in colour. The fragmented rubble originates from the underlying parent rock (mostly siltstone) and varies in shape (angular and subangular) and diameter (mostly ranging from 0.1 m to 0.6 m), with a content between 10% and 30%. (2) The second zone is a narrow band of silty clay (represented by a red line in Figure 2(b), i.e., shear zone). The thickness of the band lies between 0.2 m and 1.3 m, and the average thickness is 0.7 m. Brown silty clay is wet and dense and has high plasticity. This clay contains 10–30% subrounded siltstone gravel with diameters between 10 and 20 mm. (3) The third zone is bedrock that consists of siltstone with interbedded layers of mudstone and carbon shale. The bedrock is moderately weathered and belongs to the Lower Jurassic Xiangxi Formation (J1x), and its dip direction and dip angle are 15° and 35°, respectively. The siltstone is hard, while the interbedded layers of mudstone and carbon shale are susceptible to weathering; thus, landslide susceptibility is increased because of the unfavourable properties and weak strength.

2.1.2. Deformation Characteristics

Since 2003, the Baishuihe landslide has undergone considerable deformation in May and July each year. To monitor the displacement and render an early warning system, eleven global positioning system (GPS) monitoring monuments were installed on the main body of the Baishuihe landslide and surveyed monthly. According to ground deformation monitoring, the landslide can be divided into an active block and a relatively stable block. The monitoring displacement values of the five GPS monuments located in the active block and the corresponding rainfall and reservoir water levels over a period of ten years between 2003 and 2013 are displayed in Figure 3. The results indicate that the displacement of the landslide has increased continuously with time since the GPS monuments were installed in 2003. Two deformation phases, short-term accelerating deformation and long-term almost imperceptible deformation, can be distinguished from the step-like deformation characteristics of the monitoring displacement. The short-term accelerating deformation phase mainly occurs during the rainy season and reservoir drawdown. The long-term almost imperceptible deformation phase mainly occurs during the dry season and reservoir impoundment, which indicates that heavy rainfall and reservoir drawdown are the two main triggering factors that cause serious deformation of the Baishuihe landslide.

2.1.3. Correlation Analysis

To better understand the triggers for seasonal rapid acceleration, the displacement velocity at XD01 is analyzed against the reservoir level, the rainfall, and the reservoir level change. As shown in Figure 4, large bubbles (denoting a high deformation rate) are mainly located at a reservoir level of approximately 145 m and lie in an area where the reservoir level changes slowly (between 0 and −7 m per month). These data indicate that reservoir level drawdown is a main triggering factor of accelerated deformation, and the landslide is in the most dangerous stage when the reservoir level is about to drop to the lowest water level of 145 m. Relatively large bubbles are distributed under different rainfall conditions. However, among the relatively large bubbles, approximately two-thirds are located above the rainfall level of 100 mm per month. This pattern demonstrates that rainfall is partly correlated with landslide deformation. The cold-coloured bubbles are mostly located at 100 mm/month rainfall and above the 150 m reservoir level. These bubbles are smaller than the warm-coloured bubbles at the 145 m water level. This pattern indicates that the combined effect of heavy rainfall and rising reservoir level on landslide deformation is less than the combined effect of heavy rainfall and water level drawdown. The above analysis indicates that reservoir level drawdown is the main triggering factor of Baishuihe landslide deformation, while rainfall is a secondary factor. Their combined effect on landslide deformation is greater than the effect of a single factor.

2.2. Tanjiahe Landslide
2.2.1. Geological Conditions

The Tanjiahe landslide is located on the right bank of the Yangtze River, 3 km upstream of the Baishuihe landslide and 59 km from the Three Gorges Dam (Figure 1). The landslide is horn-shaped in plan view and covers an area of approximately 40 × 10⁴ m² (Figure 5(a)). The elevations of the top and tip are 432 m and 135 m, respectively. The length, width, and average thickness of the displaced material are 1,000 m, 400 m, and 40 m, respectively. The volume of the displaced material is estimated to be 1,600 × 10⁴ m³. The main movement direction is 340°.

Figure 5(b) shows the engineering geological cross section of the Tanjiahe landslide. The soil profile consists of four layers from top to bottom. (1) The first layer is a thin quaternary deposit. The average thickness of this layer is 5 m. The quaternary deposit is composed of brown, yellow, and moist silt clay and siltstone rubble and quartz sandstone with various shapes (angular and subangular) and diameters (ranging from 0.1 m to 0.3 m). The rubble content is between 20% and 50%. (2) The second layer is thick and consists of a yellow debris deposit. The average thickness of this layer is 35 m. The material of the blocky rock is siltstone and quartz sandstone. This layer underlies the thin quaternary deposit and is the main component of the displaced material. (3) The third layer is a narrow band of black silty clay. This layer is the location of the shear zone represented by a red line in Figure 5(b). The thickness of this layer ranges from 0.1 m to 0.3 m. Black silty clay is wet and dense and has intermediate plasticity. This clay contains 20–40% subrounded gravel of siltstone and quartz sandstone, with diameters from 5 mm to 20 mm. (4) The fourth layer is bedrock that consists of carbonaceous siltstone and quartz sandstone with interbedded coal seams. The bedrock is moderately weathered and belongs to the Lower Jurassic Xiangxi Formation (J1x). The dip direction and the angle of the bedrock are 10° and 36°, respectively.

2.2.2. Deformation Characteristics

Four GPS monitoring monuments were installed to monitor the displacement of the landslide since 2006, and the survey frequency was once per month. Figure 6 shows the time series of the monitoring displacements and the corresponding reservoir water levels and rainfall over a period of nine years. The figure shows that the deformation of the lower part (ZG290) is obviously smaller than that of the central and upper parts (ZG287, ZG288, and ZG289), the monitoring displacement increases continuously with time, and the displacement of each monitoring monument is synchronous. The deformation characteristics of the Tanjiahe landslide are approximately uniform without an obvious acceleration phase such as in the Baishuihe landslide. The landslide deformation rate will increase when it encounters rainfall or when the reservoir level is high (between 160 m and 175 m). The deformation characteristic illustrates that a high reservoir level and rainfall accelerate the deformation of the Tanjiahe landslide.

2.2.3. Correlation Analysis

The correlations among the statistical data of the displacement velocity of ZG289, rainfall, reservoir level, and reservoir level velocity are illustrated in Figure 7. The figure shows that large bubbles (relatively high deformation rate) are distributed throughout the reservoir level intervals and present the greatest distribution at the lowest reservoir level (145 m) and the highest reservoir level (between 160 m and 175 m). Based on the relative height (rainfall) of the large bubble distribution, the location of the large bubbles in the highest reservoir level (primarily located below 50 mm rainfall) is lower than that in the lowest reservoir level (primarily located above 50 mm rainfall). These findings show that the high deformation rate at the lowest reservoir level is mainly caused by rainfall, while the high deformation rate at the highest reservoir level is mainly caused by the reservoir level. Further analysis of the colour of the large bubbles shows that the cold-coloured bubbles (reservoir level rising) are generally located under the warm-coloured bubbles (reservoir level drawdown), which means that an increasing reservoir level is the major factor leading to deformation. The large deformation rate during water level drawdown is mainly caused by rainfall. The above analysis shows that an increasing reservoir level and rainfall are the two main triggering factors that affect the deformation of the Tanjiahe landslide.

3. Methodology

3.1. Formulation of PIs

PIs provide a range that encapsulates the future unknown targets with a given confidence level (usually 95%). Given a dataset $D = \{(x_i, t_i)\}_{i=1}^{N}$, $x_i$ represents the input factors and $t_i$ represents the related output displacement of the forecast. A PI can be represented by the following equation:

$$I^{\alpha}(x_i) = \left[L(x_i),\; U(x_i)\right]. \tag{1}$$

The future predicted displacement $t_i$ is expected to be covered by $I^{\alpha}(x_i)$ with a coverage probability

$$P\left(t_i \in I^{\alpha}(x_i)\right) = 100(1-\alpha)\%, \tag{2}$$

where α is the quantile of the standard normal distribution and $L(x_i)$ and $U(x_i)$ denote the lower and upper bounds of the i-th PI, respectively.

In the original LUBE method, an ANN model with two outputs is used to construct PIs. The two outputs correspond to the lower and upper bounds. In this study, under the framework of LUBE, the LSSVM model is applied instead of an ANN to build the PIs of landslide displacement. One standard formulation of the LSSVM model can generate only a single output. Thus, two LSSVM models are applied to generate two outputs for predicting the two bounds of the PI. Figure 8 shows the LSSVM model used to construct PIs in the LUBE method.

3.2. Performance Indices

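Two indices, namely, the PI coverage probability (PICP) and normalized mean PI width (NMPIW), are applied to assess the performance of the proposed method. The PICP evaluates the probability that the future target values lie within the upper and lower limits. It can be expressed by the following formula:

$$\mathrm{PICP} = \frac{1}{N}\sum_{i=1}^{N} c_i, \tag{3}$$

where N is the number of predicted samples. With $L(x_i)$ and $U(x_i)$ denoting the lower and upper bounds of the i-th PI, the parameter $c_i$ can be expressed as follows:

$$c_i = \begin{cases} 1, & t_i \in \left[L(x_i),\, U(x_i)\right], \\ 0, & t_i \notin \left[L(x_i),\, U(x_i)\right]. \end{cases} \tag{4}$$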

The larger the PICP, the more targets fall within the prediction interval. The constructed PIs with a PICP greater than or close to the nominal confidence level are reliable PIs. Otherwise, the prediction interval is invalid and unreliable. Generally, an ideal PICP can be easily obtained by widening the PIs from either side. However, excessively wide PIs are meaningless in practice since they cannot provide accurate quantifications of target uncertainty. Therefore, the quality of the PI in terms of its width must also be assessed. The NMPIW is introduced in this paper and can be expressed by the following formula:

$$\mathrm{NMPIW} = \frac{1}{N\left(t_{\max} - t_{\min}\right)}\sum_{i=1}^{N}\left(U(x_i) - L(x_i)\right), \tag{5}$$

where $t_{\max}$ and $t_{\min}$ denote the maximum and minimum values of the monitored displacement, respectively, and $L(x_i)$ and $U(x_i)$ are the lower and upper bounds of the i-th PI. NMPIW denotes the mean width of PIs normalized by the range of the target. If the constructed PIs are reliable (have a satisfactory PICP), then a lower NMPIW value of the PIs will correspond to a higher quality of the PIs.
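The two indices can be computed directly from the targets and the interval bounds. The following is a minimal sketch, assuming closed intervals and plain Python lists; the function names and the toy values are illustrative and do not come from the monitoring data.

```python
def picp(targets, lower, upper):
    """Fraction of targets t_i that fall inside [L_i, U_i]."""
    covered = [1 if lo <= t <= up else 0
               for t, lo, up in zip(targets, lower, upper)]
    return sum(covered) / len(covered)


def nmpiw(targets, lower, upper):
    """Mean interval width divided by the range of the targets."""
    target_range = max(targets) - min(targets)
    mean_width = sum(up - lo for lo, up in zip(lower, upper)) / len(lower)
    return mean_width / target_range


# Hypothetical toy values (not from the monitoring records).
targets = [10.0, 12.0, 15.0, 20.0]
lower = [9.0, 11.0, 16.0, 18.0]
upper = [11.0, 13.0, 18.0, 22.0]

coverage = picp(targets, lower, upper)   # 3 of the 4 targets are covered
width = nmpiw(targets, lower, upper)     # mean width 2.5 over a range of 10
```

In this toy case the third target lies below its interval, so widening that interval would raise the PICP at the cost of a larger NMPIW, which is the trade-off the combined index is meant to balance.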

In general, a PI with a high PICP and low NMPIW is considered high quality. However, each measure assesses the PIs from only one aspect. This paper aims to construct PIs with higher PICP and lower NMPIW values. Therefore, a combined index, i.e., coverage width-based criterion (CWC) modified from [29], is applied to comprehensively evaluate the quality of PIs. The CWC can be given as follows:

where ψ is in the range (0.1%, 0.5%). The parameter μ is equal to 1 − α, and δ lies in the range (0, 1). In the training of LUBE, γ equals 1, and in testing, γ is defined as the following step function [22, 26, 28, 30]:

$$\gamma = \begin{cases} 0, & \mathrm{PICP} \ge \mu, \\ 1, & \mathrm{PICP} < \mu. \end{cases} \tag{7}$$

The aim of the CWC is to balance the NMPIW and PICP of the PIs. A small CWC corresponds to high-quality PIs, and vice versa.

3.3. LSSVM for Regression Analysis

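The LSSVM is an improved formulation of the original SVM. The LSSVM can greatly reduce the computational cost of SVMs by replacing the inequality constraints with equality constraints, so that training reduces to solving a set of linear equations. Details on this method can be found in [32]. In this study, the LSSVM is used for regression analysis. Given a training dataset $\{(x_i, y_i)\}_{i=1}^{N}$ with input data $x_i \in R^{m}$ and output target $y_i \in R$, where $R^{m}$ denotes the m-dimensional vector space, the formulation of the LSSVM for regression analysis can be represented using the following constrained optimization problem:

$$\min_{w,\, b,\, \xi}\; J(w, \xi) = \frac{1}{2} w^{T} w + \frac{\gamma}{2} \sum_{i=1}^{N} \xi_i^{2}, \quad \text{s.t. } y_i = w^{T}\varphi(x_i) + b + \xi_i,\; i = 1, \ldots, N, \tag{8}$$

where γ is a regularization parameter, $\xi_i$ represent random errors, $w$ is the weight vector, $\varphi(\cdot)$ is the kernel space function, and b is the bias.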

By solving the above optimization problem, the result can be constructed as

$$y(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b, \tag{9}$$

where $\alpha_i$ is the Lagrange multiplier and $K(x, x_i)$ is the kernel function. In this paper, the radial basis function (RBF) is selected as the kernel function of the LSSVM because it has fewer parameters and excellent nonlinear mapping performance [33, 34]. The form of the RBF function is represented as follows:

$$K(x, x_i) = \exp\left(-\frac{\lVert x - x_i \rVert^{2}}{2\delta^{2}}\right), \tag{10}$$

where δ is the bandwidth of the RBF (δ > 0).

The parameters γ and δ are two hyperparameters of the LSSVM that strongly influence the accuracy of the forecasting. In this study, the DS algorithm is applied to search for the optimal γ and δ.
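To illustrate how γ and δ enter the model, the following is a minimal sketch of LSSVM regression in pure Python, assuming the standard LSSVM linear system and an RBF kernel written as exp(−‖x − z‖²/(2δ²)), which is one common convention and may differ from the scaling used in this study. All function names and the toy data are hypothetical; a numerical library would normally replace the small Gaussian elimination routine.

```python
import math


def rbf_kernel(x, z, delta):
    """RBF kernel exp(-||x - z||^2 / (2 * delta^2)) (assumed convention)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * delta ** 2))


def solve_linear_system(a, rhs):
    """Solve a @ sol = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (m[r][n] - tail) / m[r][r]
    return sol


def lssvm_fit(xs, ys, gamma, delta):
    """Return (alphas, b) by solving the LSSVM linear system:

    [ 0   1^T         ] [ b     ]   [ 0 ]
    [ 1   K + I/gamma ] [ alpha ] = [ y ]
    """
    n = len(xs)
    a = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0]
        for j in range(n):
            k = rbf_kernel(xs[i], xs[j], delta)
            row.append(k + (1.0 / gamma if i == j else 0.0))
        a.append(row)
    sol = solve_linear_system(a, [0.0] + list(ys))
    return sol[1:], sol[0]


def lssvm_predict(x, xs, alphas, b, delta):
    """Evaluate y(x) = sum_i alpha_i * K(x, x_i) + b."""
    return sum(al * rbf_kernel(x, xi, delta) for al, xi in zip(alphas, xs)) + b


# Hypothetical 1-D toy data: y = x^2 sampled on a small grid.
xs = [[-1.0], [-0.5], [0.0], [0.5], [1.0]]
ys = [x[0] ** 2 for x in xs]
alphas, b = lssvm_fit(xs, ys, gamma=1e4, delta=0.5)
fitted = [lssvm_predict(x, xs, alphas, b, delta=0.5) for x in xs]
```

In this formulation the training residual at each point equals α_i/γ, so a large γ pulls the fit towards the training targets, while δ controls how quickly the influence of each training point decays with distance.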

3.4. DS Algorithm

The DS algorithm is a nature-inspired metaheuristic optimization algorithm proposed by Civicioglu [31] in 2012, and it simulates the Brownian-like random-walk movement of organism migration. The details of the DS algorithm can be found in [31]. The DS algorithm is simple and easy to use and can search a large range quickly. Multiple organisms can be simultaneously considered in the optimization process of the DS, which increases the likelihood of finding the global optimal solution. Therefore, the DS algorithm is applied for the optimization of the LSSVM.

3.5. Implementation of DS-LSSVM

To construct PIs, both lower and upper bounds of forecasting targets must be calculated. Unlike the ANN-based LUBE method, which can have two outputs, the LSSVM model can generate only a single output. Therefore, two LSSVM models are applied for predicting the two bounds of the PIs. γ and δ are the two parameters that affect the prediction accuracy of the LSSVM. Thus, the DS algorithm is used to optimize the four parameters of the two LSSVMs. In point predictions, optimal parameters of the LSSVM are usually obtained by the minimization of error-based cost functions, such as the sum of squared error [18, 19, 33]. Since this paper aims to use LSSVMs for PI construction, PI-based cost functions are more reasonable than error-based cost functions for training the LSSVM. In this study, the PI-based cost function of the DS algorithm is the CWC (equation (6)). Minimizing the CWC in the iterations of the DS algorithm enables the identification of optimal parameters. Then, the optimal parameters are transferred to the LSSVMs to directly construct the PIs.

In this paper, the XD01 and ZG118 sites of the Baishuihe landslide and the ZG289 and ZG287 sites of the Tanjiahe landslide are selected as prediction targets. The reservoir level and rainfall are recorded once a day, and the monitoring displacement is surveyed once a month. To ensure a consistent time scale, both reservoir level and rainfall are processed into a monthly average reservoir level and monthly rainfall (accumulated rainfall within one month). The overall flowchart of the construction of the PI by the LSSVM and DS algorithms is shown in Figure 9. The detailed steps are described as follows.

3.5.1. Input Variable Selection

To establish a nonlinear mapping between the influence factors and landslide displacement, suitable input variables must be selected, and these variables must be tightly correlated with landslide displacement. According to the previous studies of landslide displacement prediction in the Three Gorges Reservoir area [8, 13, 15, 17–19, 21], the current evolution state and external triggering factors of landslides are two aspects for input variable selection. Generally, three factors, namely, the displacements over the past 1, 2, and 3 months, are selected as the input variables corresponding to the current evolution state. Four factors, namely, the rainfall over the past month, the rainfall over the past 2 months, the average reservoir level in the current month, and the change in reservoir level in the current month, are usually selected as the input variables corresponding to the external factors. Thus, a total of seven input variables are selected. These input variables are commonly used and considered to be effective inputs [8, 13, 15, 17–19, 21]. The displacement in the current month is selected as the output.
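The seven-input layout can be sketched as follows. This is a minimal illustration, not the preprocessing used in the study: it assumes three aligned monthly series and interprets the inputs as lagged displacement values, the current and previous monthly rainfall totals, and the month-to-month difference in average reservoir level. The function name and the toy numbers are hypothetical.

```python
def build_samples(disp, rain, level):
    """Build (inputs, output) pairs from aligned monthly series.

    disp  : monthly displacement
    rain  : monthly accumulated rainfall
    level : monthly average reservoir level
    The lag interpretation below is an assumption made for illustration.
    """
    samples = []
    for t in range(3, len(disp)):
        inputs = [
            disp[t - 1],              # displacement 1 month back
            disp[t - 2],              # displacement 2 months back
            disp[t - 3],              # displacement 3 months back
            rain[t],                  # rainfall over the past month
            rain[t] + rain[t - 1],    # rainfall over the past 2 months
            level[t],                 # average reservoir level, current month
            level[t] - level[t - 1],  # change in reservoir level
        ]
        samples.append((inputs, disp[t]))
    return samples


# Hypothetical toy series (five months).
disp = [1.0, 2.0, 4.0, 7.0, 11.0]
rain = [10.0, 20.0, 30.0, 40.0, 50.0]
level = [175.0, 170.0, 165.0, 160.0, 155.0]
samples = build_samples(disp, rain, level)
```

Because three lagged displacements are needed, the first three months of each series produce no sample, so a series of M months yields M − 3 input-output pairs under this layout.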

3.5.2. Data Partitioning

The formulated data set with N samples should be partitioned into two parts, i.e., the training data set and testing data set. The former is used to train the LSSVM, and the latter is applied to validate the proposed method. In studies of landslide deformation prediction in the Three Gorges Reservoir area based on a machine learning algorithm [8, 13, 15, 17–19, 21, 27, 29], researchers usually select the monitoring displacement data of the most recent year as the testing set and use the remaining data as the training set. Therefore, this common criterion is employed to divide the data. The XD01 monitoring point of the Baishuihe landslide is processed in 100 groups. The training set is the 83 observations between February 2005 and September 2011, and the testing set is the 17 observations between January 2012 and May 2013. The ZG118 monitoring point of the Baishuihe landslide is processed in 117 groups. The training set is the 99 observations between October 2003 and September 2011, and the testing set is the 17 observations between January 2012 and May 2013. The ZG289 and ZG287 monitoring points of the Tanjiahe landslide are processed in 101 groups. The training set is the 83 observations between February 2007 and September 2013, and the testing set is the 18 observations between January 2014 and June 2015. To eliminate differences in magnitude among the variables, both the training set and the testing set are scaled to the range [−1, 1]. After training and testing, all the values are rescaled to their original units.
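The chronological split and the scaling to [−1, 1] can be sketched as below. This is a minimal illustration under one assumption that the text does not state: the minimum and maximum are taken from the training portion only and then reused for the testing portion. The helper names and the toy series are hypothetical.

```python
def chronological_split(samples, n_test):
    """Keep the last n_test samples (n_test > 0) as the testing set."""
    return samples[:-n_test], samples[-n_test:]


def scale(values, lo, hi):
    """Map values linearly so that lo -> -1 and hi -> +1."""
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def unscale(scaled, lo, hi):
    """Invert scale(): map [-1, 1] back to the original units."""
    return [(s + 1.0) * (hi - lo) / 2.0 + lo for s in scaled]


# Hypothetical toy displacement series, oldest first.
series = [100.0, 120.0, 150.0, 200.0, 180.0, 210.0]
train, test = chronological_split(series, n_test=2)

# Assumption: scaling limits come from the training data only.
lo, hi = min(train), max(train)
train_scaled = scale(train, lo, hi)
test_scaled = scale(test, lo, hi)
restored = unscale(test_scaled, lo, hi)
```

With training-only limits, a testing value above the training maximum maps slightly outside [−1, 1], as the last toy value does here; taking the limits from the full data set would avoid this at the cost of using information from the testing period.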

3.5.3. Parameter Initialization

In the DS algorithm, the dimension of the problem, the maximum number of iterations, and the number of elements in the superorganism must be determined. In this study, the proposed method contains two LSSVM models, and each model contains two hyperparameters (γ and δ). Thus, the dimension of the problem is four. The number of elements in the DS algorithm is set to 40. The maximum number of iterations is determined by multiple trials. Most of the trial results show that the cost function value tends to be stable by 400 iterations, and a larger number of iterations does not significantly change the fitness value. Therefore, the maximum number of iterations is set to 400. The initial hyperparameter search range should be coarse and is set to [10⁻¹⁵, 10¹⁵], and the hyperparameters are optimized within this initial range. Then, through multiple trials, the coarse range can be refined based on the range within which most of the optimal values fall. In the Baishuihe landslide, the search range of γ and δ is refined to [10¹⁰, 10¹⁵]. In the Tanjiahe landslide, the search ranges of γ and δ are [10¹⁰, 10¹⁵] and [10¹⁰, 10¹⁴], respectively. The confidence level is set to 95%. In the CWC, µ, δ, and ψ are set to 0.95, 0.05, and 0.1%, respectively.
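As an illustration of the initialization step, the sketch below draws 40 candidate vectors of four hyperparameters within the coarse range, read here as [10⁻¹⁵, 10¹⁵]. Sampling uniformly in the exponent (log-uniform) is an assumption made because the range spans 30 orders of magnitude; the text does not state the sampling scale, and the function name is hypothetical. This covers only the starting population, not the DS update rules described in [31].

```python
import random


def init_superorganism(n_elements, bounds_log10, seed=0):
    """Sample candidate vectors log-uniformly within the given bounds.

    bounds_log10: one (low_exponent, high_exponent) pair per hyperparameter.
    """
    rng = random.Random(seed)
    return [
        [10.0 ** rng.uniform(lo, hi) for lo, hi in bounds_log10]
        for _ in range(n_elements)
    ]


# Four dimensions: (gamma, delta) for the lower-bound LSSVM and
# (gamma, delta) for the upper-bound LSSVM, all in the coarse range.
coarse_bounds = [(-15.0, 15.0)] * 4
population = init_superorganism(40, coarse_bounds)
```

Refining the range after trial runs then amounts to passing narrower exponent pairs, for example `(10.0, 15.0)`, for the corresponding dimensions.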

3.5.4. Training the LSSVM Using the DS Algorithm

The training set is applied to train the two LSSVM models. The training algorithm is the DS algorithm, and the training goal is to minimize the cost function CWC. In each iteration, the CWC is evaluated, and the optimal parameters are updated based on the CWC. In this study, the termination criterion is the maximum number of iterations. Once the termination criterion is reached, the four optimal hyperparameters of the LSSVM are obtained, and the PIs of the training set can also be obtained.

3.5.5. Construct the PIs of the Testing Set

The optimal hyperparameters are transferred to the LSSVMs for constructing the PIs of the testing data set. Then, the quality of the constructed PIs is evaluated by the PICP, NMPIW, and CWC. The point prediction can also be derived by averaging the upper and lower bounds of the constructed PIs when deterministic results are demanded.

4. Results and Comparison

4.1. Numerical Results

To validate its repeatability, the proposed method is run 10 times in each case study. The prediction performances (CWC, PICP, and NMPIW) are calculated in each replicate, and the final lower and upper bounds of the PIs are constructed by averaging the results of the 10 replicates. The convergence behaviour of the fitness function for the four monitoring points for the LUBE method is shown in Figure 10. The test results are reported in Tables 1 and 2.

Tables 1 and 2 summarize the results obtained from the ten runs of the proposed method. The CWC, PICP, and NMPIW values and their statistics, including the median and standard deviation (SD), are computed for the quantitative assessment of the performance of the DS-LSSVM method. The highest-quality PIs (the smallest CWC and its associated PICP and NMPIW) in the 10 replicates are also listed in the row named Best. The PICP is very close to or higher than the 95% confidence level in most replicates in both landslide cases. This result shows that the PI generated by the proposed method is effective, reliable, and able to satisfactorily cover the target in different runs. The values of CWC, PICP, and NMPIW in the ten replicates are relatively consistent, and their medians and best values are close for the two case studies. The respective SDs of these three indices are 0.0709, 0.0304, and 0.0689 for XD01 and 0.1472, 0.0287, and 0.1450 for ZG118 of the Baishuihe landslide and 0.0571, 0.0315, and 0.0091 for ZG289 and 0.0474, 0.0454, and 0.0138 for ZG287 of the Tanjiahe landslide. The low variation in the three indices confirms the repeatability and stability of the proposed method.
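The replicate handling described above can be sketched with the Python standard library. This is a minimal illustration with hypothetical values: it averages the bounds sample by sample across replicates and summarizes an index by its median, SD, and smallest value. The use of the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not say whether the sample or population form was used.

```python
import statistics


def average_bounds(replicate_bounds):
    """Average (lower, upper) per sample across replicates.

    replicate_bounds: list of replicates, each a list of (lower, upper).
    """
    n = len(replicate_bounds)
    averaged = []
    for per_sample in zip(*replicate_bounds):
        lo = sum(b[0] for b in per_sample) / n
        up = sum(b[1] for b in per_sample) / n
        averaged.append((lo, up))
    return averaged


def summarize(values):
    """Median, sample SD, and smallest value of an index over replicates."""
    return {
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
        "best": min(values),  # smallest is best for the CWC
    }


# Hypothetical values: two replicates, two samples each.
replicates = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 5.0), (4.0, 8.0)],
]
final_bounds = average_bounds(replicates)

# Hypothetical CWC values from three replicates.
cwc_summary = summarize([0.9, 1.0, 1.1])
```

The "best" entry is meaningful only for indices where smaller is better, such as the CWC; for the PICP the associated value of the best-CWC replicate would be reported instead.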

Figure 11 shows the PIs of the Baishuihe and Tanjiahe landslides at a 95% confidence level, constructed by averaging the results of the ten replicates. The monitored displacements in both the training and testing sets for XD01, ZG118, ZG289, and ZG287 are well captured by the constructed PIs, and the trends of the upper and lower bounds are largely consistent with the actual observations. In Figures 11(a) and 11(b), a few displacements in the training data set are not covered by the constructed PIs. This is expected because the model is trained to construct PIs with 95% confidence, not 100%. The PICP in both case studies generally meets the given 95% confidence level. The above results indicate that the proposed method can effectively evaluate the uncertainties associated with landslide displacement prediction and provide satisfactory PIs.

4.2. Method Comparison

For comparison purposes, the PSO-LSSVM, GA-LSSVM, bootstrap-ELM-ANN [21], and PSO-ELM [25] methods are implemented to construct PIs using the same data sets. In the PSO-LSSVM and GA-LSSVM methods, the search range of γ and δ is [1010, 1015] in the Baishuihe landslide. In the Tanjiahe landslide, the search ranges of γ and δ are [1010, 1015] and [1010, 1014], respectively. By minimizing the cost function CWC in the GA and PSO, the optimal parameters of the LSSVM can be obtained. In the bootstrap method, a paired bootstrap is used, and the bootstrap replicate number is set to 200. The ELM is a single-hidden-layer feedforward neural network. The number of hidden nodes in the ELM is determined by a 10-fold cross-validation method using only the training set. The numbers of hidden-layer nodes of the ELM are 23 and 22 for the XD01 and ZG118 sites of the Baishuihe landslide, respectively, and 29 and 30 for the ZG287 and ZG289 sites of the Tanjiahe landslide, respectively. The sigmoid function is selected as the activation function. A single-hidden-layer ANN is selected, and the numbers of hidden-layer nodes of the ANN are 10 and 15 for the Baishuihe landslide and Tanjiahe landslide, respectively. In the PSO-ELM, an ELM model with two outputs is optimized by PSO to directly generate PIs [25]. The number of hidden-layer nodes and the activation function of the ELM are the same as those of the bootstrap-ELM-ANN model. The cost function of the neural network-based LUBE is equation (6). All the compared methods are run ten times. The statistical indicators of their comprehensive index (CWC) are calculated for comparison with the prediction performance of the proposed method.
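As an illustration of the paired bootstrap mentioned above, the following sketch resamples input-output pairs with replacement so that each input stays attached to its own target. It shows only the resampling step, not the ELM or ANN training of [21]; the function name, the seed, and the toy data are hypothetical.

```python
import random

def paired_bootstrap(inputs, targets, n_replicates=200, seed=0):
    """Draw n_replicates resampled data sets of (input, target) pairs.

    Each replicate has the same size as the original set and is drawn
    with replacement; pairs are never split.
    """
    rng = random.Random(seed)
    n = len(inputs)
    replicates = []
    for _ in range(n_replicates):
        idx = [rng.randrange(n) for _ in range(n)]
        replicates.append(([inputs[i] for i in idx], [targets[i] for i in idx]))
    return replicates

# Toy data: three samples with one feature each.
toy_inputs = [[1.0], [2.0], [3.0]]
toy_targets = [10.0, 20.0, 30.0]
replicates = paired_bootstrap(toy_inputs, toy_targets, n_replicates=200, seed=1)
```

In a bootstrap-based PI method, one point-prediction model would then be trained per replicate, and the spread of their predictions would feed the interval estimate.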

For a better quantitative comparison, the percentage improvement (PER) is used, and it is defined as

$$\mathrm{PER} = \frac{\mathrm{CWC}_{\mathrm{compared}} - \mathrm{CWC}_{\mathrm{proposed}}}{\mathrm{CWC}_{\mathrm{compared}}} \times 100\%, \qquad (11)$$

where proposed and compared represent the DS-LSSVM and the compared methods, respectively. Based on the definition of equation (11), if the value of PER is positive, then a greater PER value corresponds to higher-quality PIs of the DS-LSSVM relative to the PIs of the comparison method, and vice versa.
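A small numerical sketch may help in reading the PER values. It assumes that the PER is the relative reduction of a CWC statistic (best, median, or SD) achieved by the DS-LSSVM, with the compared method's value in the denominator; the function name and the numbers are hypothetical.

```python
def per(compared, proposed):
    """Percentage improvement of the proposed method over a compared method.

    Both arguments are values of the same CWC statistic (lower is better).
    """
    if compared == 0:
        raise ValueError("compared value must be non-zero")
    return (compared - proposed) / compared * 100.0

# Made-up CWC medians: the proposed method is lower, so the PER is positive.
improvement = per(compared=2.0, proposed=1.5)
# If the proposed method were worse, the PER would be negative.
regression = per(compared=1.0, proposed=1.2)
```

Under this assumed definition, the PER approaches 100% when the compared method's statistic is far larger than that of the proposed method.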

The PER values are reported in parentheses in Table 3. Hereafter, for ease of reference, the names of the compared methods are used as subscripts to denote their PERs (i.e., PER_{PSO-LSSVM}, PER_{GA-LSSVM}, PER_{Bootstrap-ELM-ANN}, and PER_{PSO-ELM}).

Table 3 shows that the PER values are all positive and differ among the methods. In the Baishuihe landslide, the minimum and the maximum of the median of the PERs are 13.73% and 67.55%, respectively. The minimum and the maximum of the SDs of the PERs are 35.71% and 92.17%, respectively. In the Tanjiahe landslide, the minimum and the maximum of the median of the PERs are 18.39% and 99.48%, respectively. The minimum and the maximum of the SDs of the PERs are 53.02% and 100%, respectively. These data indicate that the proposed method outperforms the four compared methods in terms of the reliability and stability of the landslide displacement interval prediction and can construct higher-quality PIs.

The LSSVM-based LUBE methods (PSO-LSSVM and GA-LSSVM) have a lower median of the PER than the neural network-based methods (bootstrap-ELM-ANN and PSO-ELM). Because a lower PER means a smaller gap to the DS-LSSVM, this result shows that the LSSVM methods establish higher-quality PIs than the neural network models.

Compared with the PSO-LSSVM and the GA-LSSVM algorithms, the proposed method provides a significant improvement in the median and SD. For example, in ZG289 of the Tanjiahe landslide, the median and SD of the PER_{PSO-LSSVM} are 18.39% and 87.72%, and the median and SD of the PER_{GA-LSSVM} are 66.65% and 94.74%, respectively. These data show that, for predicting the landslide displacement interval, higher-quality PIs can be obtained using the DS algorithm to optimize the LSSVM hyperparameters than using the PSO and GA algorithms.

Compared with the LSSVM-based LUBE methods, the bootstrap-ELM-ANN and PSO-ELM methods perform poorly in the two case studies. Specifically, the PSO-ELM method performs the worst and has serious overfitting problems in the Tanjiahe landslide. The PSO-ELM method can still obtain a reliable PI in the Baishuihe landslide case: the median of the CWC of XD01 is 1.70, although its stability is poor, with an SD of 1.15. In the Tanjiahe landslide, the PSO-ELM method is completely ineffective, and its best, median, and SD values are much larger than those of the other three methods. These data show that the PSO-ELM method predicts the cumulative displacement of the two landslides poorly.

The bootstrap-ELM-ANN method has higher reliability than the PSO-ELM method. The obtained PICP values in ten experiments are relatively stable and all higher than the 95% confidence level. For example, the SD of the PER_{Bootstrap-ELM-ANN} is smaller than that of the PER_{PSO-LSSVM} and PER_{GA-LSSVM} for the Tanjiahe landslide and XD01 of the Baishuihe landslide. However, the NMPIW is large, resulting in a large CWC. The best and median values of the PER_{Bootstrap-ELM-ANN} are higher than those of the PER_{PSO-LSSVM} and PER_{GA-LSSVM}, indicating that the bootstrap-ELM-ANN method tends to build wide PIs to achieve a satisfactory coverage probability.

5. Discussion

Landslides differ in their deformation characteristics and magnitudes because of their differing environments and engineering geological conditions. Therefore, the prediction accuracy of a given method varies from one landslide to another. For example, in this study, the prediction performance of the proposed method differs clearly between the Baishuihe landslide and the Tanjiahe landslide. The deformation rate of the Tanjiahe landslide is relatively mild, and the step-like deformation characteristic is insignificant. The deformation of this landslide is mainly controlled by internal factors, such as the creep of the sliding zone material, and does not respond dramatically to the triggering factors (reservoir level and rainfall), which makes the deformation of the Tanjiahe landslide easier to predict by an extrapolation of the early deformation. In contrast, the deformation of the Baishuihe landslide is drastic. Due to external factors, such as periodic rainfall and reservoir level fluctuations, the deformation of the Baishuihe landslide has obvious step-like characteristics. Because of the complex nonlinear relationship between the triggering factors and landslide deformation, accurately predicting the displacement of such a landslide is more difficult, especially in the accelerated deformation phase, and the prediction error is often large (this phenomenon can be found in [13, 15, 18]). Traditional point prediction methods provide limited information and do not allow decision makers to judge the reliability of the point prediction. Therefore, interval prediction, which can quantify the uncertainties in the displacement prediction, is a more reasonable alternative to traditional point prediction. Through interval prediction, the range of the future deformation trend is quantitatively evaluated, and the best and worst conditions are provided to decision makers so that more rational disaster prevention decisions can be made.

For the same landslide displacement data, the quality of the PIs constructed by different machine-learning-based interval prediction methods also varies because of imperfect model structure, overfitting, and hyperparameter choices. In practice, a PI is expected to have a coverage probability that is equal to or greater than the given confidence level and an interval width that is as narrow as possible. A PI with a high coverage probability but a very wide width is meaningless. This paper compares the proposed method with the PSO-LSSVM, GA-LSSVM, bootstrap-ELM-ANN, and PSO-ELM methods using the same data set. The comparative analysis shows that the proposed method is clearly superior to the compared methods and can construct a high-quality PI with a high coverage probability and a narrow width. The bootstrap-ELM-ANN method builds the PI via a quantile analysis of the point prediction errors with certain prior assumptions. When these assumptions do not conform with the actual situation, the established PIs can be unreliable and invalid. The proposed method, based on the LUBE framework, can directly build PIs without any error assumptions and with high robustness and reliability. Therefore, this method is a promising tool for constructing the PI of the landslide displacement. The neural network-based LUBE method (PSO-ELM) is also compared with the proposed method. However, the performance of this method on the test sets of the two case studies is poor. The PSO-ELM method overfits the Tanjiahe landslide, and the constructed PIs in the Tanjiahe landslide are not satisfactory in terms of the confidence level and the interval width. The overfitting problems may be due to the small number of training samples, noise in the training data, or improper training. Resolving the overfitting of such a model is relatively tedious, in contrast to the strong inference capacity of the proposed method and its good applicability to different landslides.

In practical applications, the proposed method is suitable for medium-term and long-term landslide displacement prediction and can be used to construct the PIs of the expected displacement, which has been correlated with the reservoir level and rainfall forcing. PIs are more realistic and relevant for decision makers than point predictions since they allow researchers to acknowledge uncertainty [21]. The construction of PIs can effectively supplement point predictions for early warning systems. The PIs can be used to detect changes in the creep stage. If the monitored value falls far outside the established PI range, then researchers should be alert and seek supplementary information to determine whether the landslide is in the tertiary creep stage. To this end, time-of-failure forecasting methods could be run in parallel to compute the alert velocity thresholds, and the corresponding early warning procedures should be considered until either collapse occurs or the landslide reaches a new equilibrium [4].
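The PI-based check described above could be expressed as a simple rule. The paper gives no numeric criterion for how far outside the PI a monitored value must fall, so the `margin` parameter below (a fraction of the PI width) and the function name are purely hypothetical.

```python
def exceeds_interval(observed, lower, upper, margin=0.5):
    """Return True when the observation lies outside [lower, upper]
    by more than margin times the interval width (hypothetical rule)."""
    slack = margin * (upper - lower)
    return observed > upper + slack or observed < lower - slack

# Made-up monthly displacement (mm) against a PI of [100, 110].
far_above = exceeds_interval(130.0, 100.0, 110.0)   # beyond 115: flag it
slightly_above = exceeds_interval(112.0, 100.0, 110.0)  # within the slack
```

A flag from such a rule would only prompt further investigation; the decision on the creep stage would still rest on supplementary information and time-of-failure methods, as noted above.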

In this study, the proposed method needs a minimum window of three months of previous measurements to predict the displacement of the next month. Thus, the method is only applicable to landslides with relatively long and continuous monitoring data. PIs are constructed at a 95% confidence level in this paper. PIs at other confidence levels can also be built by the proposed method if needed. Since the monitoring data of the landslide displacement are sparse, the prediction accuracy may be low for the cumulative displacement. Therefore, wavelet decomposition, empirical mode decomposition, and other time series decomposition techniques can be introduced and may improve the quality of the PIs. In addition, the appropriate selection of the input variables can effectively improve the prediction accuracy of the model. Because the focus of this paper is the interval prediction algorithm, input variable selection is treated only briefly, and only seven input variables are selected. Future research can fully investigate the influencing factors that may be related to deformation according to the specific landslide, and correlation analysis algorithms such as mutual information, maximum mutual information theory, and partial autocorrelation functions can be introduced to establish suitable variable selection criteria.
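The three-month input window mentioned above can be illustrated with a short sketch that turns a monthly displacement series into lagged samples. It covers only the displacement lags; the rainfall and reservoir-level inputs used in the paper are omitted, and the function name and sample series are hypothetical.

```python
def make_windows(series, window=3):
    """Build (inputs, targets) so that each target is predicted from
    the `window` measurements that immediately precede it."""
    inputs, targets = [], []
    for i in range(window, len(series)):
        inputs.append(series[i - window:i])
        targets.append(series[i])
    return inputs, targets

# Made-up monthly cumulative displacements (mm).
monthly = [1.0, 2.0, 3.0, 4.0, 5.0]
window_inputs, window_targets = make_windows(monthly, window=3)
```

A series with no more than three measurements yields no samples at all, which is one way to see why the method requires relatively long and continuous monitoring records.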

6. Conclusions

In this paper, a direct interval prediction method to quantify the uncertainties associated with landslide displacement prediction is developed. Landslide displacements can be predicted in advance with reasonable confidence by the proposed method. In this method, two LSSVMs are applied to construct the PIs of landslide displacement. The parameters of the LSSVMs are optimized by the DS algorithm, which has strong global optimization capabilities, by minimizing a PI-based cost function. The proposed method is applied to the Baishuihe landslide and the Tanjiahe landslide. The prediction results show that the proposed method has good applicability to the two landslides. The prediction performance of the proposed method is compared with that of several other algorithms, namely, the PSO-LSSVM, GA-LSSVM, bootstrap-ELM-ANN, and PSO-ELM methods. The comparison results confirm the effectiveness and superiority of the proposed method in the construction of high-quality PIs for landslide displacement. Therefore, the DS-LSSVM method is a promising technique for landslide displacement interval prediction in areas of the Three Gorges Reservoir with geological conditions similar to those of the Baishuihe and Tanjiahe landslides.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was funded by the National Key R&D Program of China (2017YFC1501305), the National Natural Science Foundation of China (Grant no. 41702328), the Hubei Provincial Natural Science Foundation of China (Grant no. 2019CFB585), Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan) (Grant nos. CUGL170813 and CUGQYZX1747), Xi’an Centre of Geological Survey, China Geological Survey (Grant no. DD20190714), Science and Technology Research Project of Hubei Education Department (D2019038), and the Open Foundation of Top Disciplines in Yangtze University.