Abstract

Light detection and ranging (LiDAR) is commonly used to create high-resolution maps; however, the efficiency and convergence of parameter estimation are difficult. To address this issue, we evaluated the structural characteristics of received LiDAR signals by decomposing them into Gaussian functions and applied the variable projection algorithm of the separable nonlinear least-squares problem to the process of waveform fitting. First, using a variable projection algorithm, we separated the linear (amplitude) and nonlinear (center position and width) parameters in the Gaussian function model; the linear parameters are expressed with nonlinear parameters by the function. Thereafter, the optimal estimation of the characteristic parameters of the Gaussian function components was transformed into a least-squares problem only comprising nonlinear parameters. Finally, the Levenberg–Marquardt algorithm was used to solve these nonlinear parameters, whereas the linear parameters were calculated simultaneously in each iteration, and the estimation results satisfying the nonlinear least-square criterion were obtained. Five groups of waveform decomposition simulation data and ICESat/GLAS satellite LiDAR waveform data were used for the parameter estimation experiments. During the experiments, for the same accuracy, the separable nonlinear least-squares optimization method required fewer iterations and lesser calculation time than the traditional method of not separating parameters; the maximum number of iterations was reached before the traditional method converged to the optimal estimate. The method of separating variables only required 14 iterations to obtain the optimal estimate, reducing the computational time from 1128 s to 130 s. Therefore, the application of the separable nonlinear least-squares problem can improve the calculation efficiency and convergence speed of the parameter solution process. It can also provide a new method for parameter estimation in the Gaussian model for LiDAR waveform decomposition.

1. Introduction

Light detection and ranging (LiDAR) is an active remote sensing measurement system that integrates a laser altimeter, a global navigation satellite system (GNSS), and an inertial measurement unit (IMU) [1]. Currently, LiDAR is widely employed for distance measurements [2], vegetation inversion [3, 4], urban modelling [5], marine mapping [6], and other such applications. In such applications, the received waveform is mathematically described as the convolution of the emitted laser pulse and ground response function [7]. The geometry, physical characteristics, and vertical distribution of the objects irradiated by the laser pulses can be obtained by analyzing the properties of the received waveform, such as its amplitude and pulse width.

Waveform decomposition is an important step when processing full-waveform LiDAR data. The reflected pulse corresponding to each object is extracted from the received signal, using waveform modelling. Using LiDAR full-waveform data is a prerequisite for terrain and building reconstruction [8]. As LiDAR data are dependent on several factors, such as the background light and electrical noise from the avalanche photodiode, the selection of an accurate decomposition model and an efficient and fast algorithm is important to ensure accurate and precise results. For a majority of the received signals, amplitude modulation approximates a Gaussian distribution; therefore, waveform signal decomposition based on Gaussian fitting is the most popular method used for LiDAR full-waveform decomposition [9, 10]. Hofton et al. [11] screened the waveform data according to the importance of the Gaussian component and then solved the optimal value of each waveform component parameter using the nonlinear optimization Levenberg–Marquardt (LM) algorithm. Wagner et al. [12] analyzed the theoretical basis of simulating a LiDAR waveform using a Gaussian mixture model and obtained the pulse amplitude, half-width, and distance via Gaussian decomposition. These factors were included in the calibration equation, and the data were calibrated using an external reference to estimate the target backscattering cross section. Lai et al. [13] decomposed the waveform components from the original waveform data, layer by layer, until the maximum peak value of the remaining waveform was less than a certain threshold. Thereafter, the initial parameters were optimized using the limited-memory Broyden–Fletcher–Goldfarb–Shanno algorithm, and the optimal solution of the waveform component was finally obtained. Lu et al. [14] improved the second-order reciprocal zero-point method to detect the initial position of waveform components and then iteratively optimized the initial value using the LM algorithm to achieve least-squares fitting of the waveforms.

In general, the Gaussian decomposition of waveforms consists of two steps: (1) determining the number of Gaussian components and their initial parameters and (2) optimizing and postprocessing the characteristic parameters. A majority of parameter estimation methods regard all Gaussian parameters as nonlinear variables [15] and then employ the nonlinear optimization method to solve the parameters. Liu et al. [16] combined this method with sequential quadratic programming (SQP), developing a gradient-based optimization algorithm; it determined the optimal time delays and system parameters in a novel dynamic optimization problem for nonlinear multistage systems. To address the randomness of setting the final iteration condition artificially in the process of obtaining the optimal valuation, Zhai et al. [17] proposed a novel characteristic value correction iteration method, and L curve was drawn to express the correlation of solutions and residuals and the iteration time corresponding to the maximum curvature point is chosen as the iteration termination condition to obtain the “optimal” intermediate estimates. To enhance the estimation accuracy, Ding et al. [18, 19] presented a filtering-based gradient iterative algorithm and a filtering-based least-squares iterative algorithm, which yielded an improved convergence speed. However, a majority of the classic LiDAR waveform decomposition methods focus on determining the initial parameters and the iterative algorithm of parameter optimization. These methods neglect the feature that the full-waveform model is a linear combination of multiple Gaussian components. Moreover, the traditional nonlinear optimization method relies on the initial value of all parameters involved in the calculation, which requires a high demand on the parameter optimization algorithm. If some of these initial parameters are not selected appropriately, the waveform parameters may not converge to the optimal estimation value, thereby worsening the waveform processing effect.

Golub and Pereyra proposed an explicit variable projection (VP) method for solving the separable nonlinear least-squares problem in 1973 [20]. Compared with the corresponding algorithm, which directly solves the nonlinear minimum resultant problem, it featured three advantages: fewer iteration steps, fewer initial point guesses, and potentially reduced ill-conditioned degree when the original problem is ill conditioned. Based on this method, the improvement and application [2123] of VP were conducted. Kaufman [24] proposed a modified VP algorithm based on trapezoidal decomposition to simplify the calculation, which reduced its computational complexity and improved computational efficiency. Subsequently, Ruano et al. [25] proposed an improved VP algorithm based on orthogonal triangle (QR) decomposition for the sparse nonlinear function matrix, which also improved operation efficiency. According to different calculation methods of the Jacobian matrix of variable projection algorithm, Gan et al. [26] summed up four separation algorithms to solve the problem of separable nonlinear least squares and proved that it is usually very beneficial to separate the variables into a linear and a nonlinear part. Furthermore, for cases where the number of observations was significantly higher than the number of linear parameters, Gan and Li [27] proposed a VP algorithm based on Gram–Schmidt matrix decomposition, which reduces the amount of calculation required. In addition, Chen et al. [28] proposed a modified Gram–Schmidt method-based variable projection algorithm for separable nonlinear models and verified its effectiveness experimentally. Wang et al. [29] proposed a variable projection algorithm based on singular value decomposition (SVD) to solve the separable nonlinear least-squares problem. SVD was adopted to simplify the VP algorithm and improve the stability of the calculation. Subsequently, after the elimination of the linear parameters by the improved VP algorithm, the separable least-squares problem was transformed into an optimization problem only containing nonlinear parameters.

In this study, we applied the variable projection algorithm based on the SVD of the separable nonlinear least-squares problem for the waveform fitting of received LiDAR signals. Linear parameters were expressed with nonlinear parameters by the function, and the problem was then transformed into a least-squares problem only containing nonlinear parameters. Finally, the improved LM algorithm [30, 31] was employed to solve the nonlinear parameters, whereas the linear parameters were solved using the variable projection algorithm for nonlinear functions.

The remainder of this paper is organized as follows. In Section 2, the method for determining the number of components and the initial values of the Gaussian decomposition parameters for LiDAR waveforms is presented, along with a brief description of the traditional method for estimating waveform parameters. Section 3 presents the proposed method for separating linear and nonlinear parameters in the waveform model based on a variable projection algorithm; the optimization algorithm and the processing of waveform parameters after separation are also discussed in this section. Subsequently, the methods with and without separation of the linear and nonlinear parameters, i.e., the proposed and existing methods, respectively, are compared using numerical experiments, as described in Section 4. Finally, the conclusions and implications of the study are discussed in Section 5.

2. Gaussian Decomposition of LiDAR Signals

A Gaussian mixture model with N Gaussian functions was used to describe the received LiDAR waveform according to the radar range formula. After fitting and decomposing, the received pulses related to single objects are separated and extracted. Finally, the attribute information of each signal received from the ground is obtained.

Assuming that a received waveform has sampling points and is composed of N waveform components, the waveform based on the Gaussian function model is expressed as follows:where , and are the pulse amplitude, center position, and half-width of the waveform component, respectively. Here, denotes the noise in the original waveform. The aim of full-waveform Gaussian decomposition is to optimize parameters of Gaussian components, where the number of linear parameters (as) is and the number of nonlinear parameters ( and ) is .

2.1. Determining the Number of Gaussian Waveform Components and Parameter Initialization

To determine the components of the Gaussian waveform, the amplitude threshold is set and the first derivative of the function is calculated. When the maximum value is greater than the amplitude threshold, it is determined as the peak value. Thereafter, the second derivative of the function is calculated, and the inflection point is obtained. The time difference between two adjacent inflection points should be greater than the half-width of the transmitted pulse. Finally, the peak and inflection points are matched, where the position of the peak should lie between the two inflection points.

After determining the number of Gaussian function components, the feature parameters are initialized, where and are set as the sampling times of adjacent inflection points. Subsequently, the center position and the half-width of the Gaussian component are calculated using equations (2) and (3), respectively. The pulse amplitude refers to the amplitude corresponding to the center position:

2.2. LM Optimization without Waveform Parameter Separation

After determining the number of components and initial parameters of the Gaussian function, the most commonly used method regards both linear and nonlinear parameters as nonlinear parameters and subsequently solves them using the LM algorithm, for the nonlinear least-squares optimization [20]. The objective function is constructed by minimizing the sum of the squares of errors between the fitted and measured waveforms after establishing the Gaussian function model. According to the principle of nonlinear least squares, the solutions of linear parameters and nonlinear parameters minimize the following functions:where, is the original observation signal, is the number of observations, is the nonlinear function component, is the residual vector, and is the Euclidean norm.

The parameters to be determined are

The function model is expressed as

The objective function is established under the least-squares principle:

The Jacobian matrix consisting of the first-order partial derivatives of the residual vector is

Thus, the Hessian matrix of is approximated as

Therefore, the LM optimization algorithm is used to iterate the parameters:where , is the iteration result, is the iteration direction, and is the iteration length. To ensure that the objective function, i.e., Equation (6), is always in a descending state, is calculated using the Armijo criterion [32] of imprecise search. The term is adjusted using a strategy similar to the radius of the trust region [33]. First, a quadratic function is defined at the current iteration point to represent the quadratic approximation of the objective function:

Thereafter, the ratio of the actual decrease in to the expected decrease in the objective function is calculated based on the iteration direction:

For an initial value of , is calculated at each step in the LM algorithm. Subsequently, is adjusted according to the value of . Finally, a line search is performed to complete the iteration. When is close to one, the fitting between the quadratic function and the objective function is good at point . The parameter should be smaller when the LM algorithm is used to solve the nonlinear least-squares problem. When is close to zero, the fitting between the quadratic function and the objective function is poor at point . The parameter should be larger when LM is used to solve the nonlinear least-squares problem. When is neither close to zero nor one, is suitable and does not require adjustment. The critical values are typically 0.25 and 0.75, and the rules of adjusting are as follows:

3. LM Optimization Method for Waveform Parameters Based on Variable Projection

3.1. Gaussian Model Based on a Variable Projection Algorithm

Considering the structural characteristics of the functional model for the received LiDAR signal, we derived a new equation as a linear combination of with respect to . Equation (4) is written in the matrix form aswhere is the vector of the original signal. The column of corresponding to the nonlinear function is for parameter . For the given nonlinear parameters , the linear parameter can be estimated by solving the following linear least-squares problem:where is the pseudoinverse of . Substituting (20) in (19) yieldswhere is the orthogonal projector on the linear space spanned by the columns of and is the projector on the orthogonal complement of vector in the column space. To simplify the calculation, the matrix composed of nonlinear functions is decomposed using SVD:where is an orthogonal matrix, is an diagonal matrix, and is a orthogonal matrix. We obtained as

Thus,

If the rank of the matrix is , the first elements on the diagonal of are nonzero, i.e., . Therefore, can be divided into an matrix and an matrix and can be divided into a matrix and a matrix :

Therefore, the minimization problem of the objective function (15) is simplified as

3.2. Optimal Estimation Method for Waveform Gaussian Parameters

To fit the separated LiDAR signal with the Gaussian decomposition model, we combined the differential equation of the pseudoinverse matrix and the variable projection algorithm. The Jacobian matrix [20] of the separated objective function is

The objective function only containing nonlinear parameters is optimized using the LM algorithm. To ensure global optimization, once the nonlinear parameters are calculated for a given iteration, the linear parameters are calculated using equation (15) until the overall gradient is less than a threshold value. For the convenience of expression, in the following text, the traditional LM method without the separation of parameters is referred to as LMunSep. Moreover, the separable nonlinear least-squares solution method (i.e., LM method with the SVD variable projection algorithm for parameter separation) is referred to as LMVP. The optimization process for the waveform decomposition parameters based on the LMVP method is summarized in Algorithm 1.

Step 1: for initial values for , set , threshold value , and ; the maximum number of iterations is 100.
Step 2: compute and using equation (14) and (23), respectively.
Step 3: compute the iteration length using the Armijo criterion and the iteration direction .
Step 4: adjust according to and then compute using the LM algorithm iteration, i.e., equation (10).
Step 5: compute according to the variable projection algorithm.
Step 6: if , stop. Else, and go to step (3).

4. Numerical Experiment

The nonlinear optimization LM method for parameter separation proposed herein for the estimation of Gaussian decomposition parameters of LiDAR signals was validated using five groups of simulated LiDAR signals and the ICESat/GLAS satellite LiDAR waveform data for the terrain of Germany in 2003. The same data were evaluated using the traditional nonlinear optimization method to compare the two models in terms of solution results, evaluation coefficient, and convergence. The experiments were performed using MATLAB 2016b on a 2.3 GHz desktop PC running Windows 10.

4.1. Example 1: Simulation Experiment

It was assumed that the five groups of simulated received LiDAR signals were composed of four Gaussian components, with a sampling interval of 0.5 ns and a sampling number of 200. The amplitude, position, and width of the original four Gaussian functions in the five groups are listed in Table 1.

Due to the influence of noise, the actual waveform distribution was not completely Gaussian. Based on the simulation data, random noise obeying a standard deviation of 0.5 was added to obtain the final five groups of discrete received signals. First, according to the parameter initialization method, the initial values were selected. Thereafter, the waveform parameters were iteratively calculated using both methods for a comparison.

4.1.1. Comparisons of Solution Results in Simulation Experiment

For the five groups of data, the results of the optimization methods with and without parameter separation are listed in Table 2.

From Table 2, it is evident that the results for both methods were identical, except in the case of group 4. For a clearer visualization of the decomposition and fitting effect, the original observation value, decomposition waveform, and fitting waveform obtained using both methods for the five groups of signals are depicted in Figures 1 and 2. As the fitting waveform was identical to the decomposition waveform obtained using the LMunsep method and the LMVP method, except in the case of group 4, groups 1, 2, 3, and 5 were represented as a graph, as depicted in Figure 1. Figures 2(a) and 2(b) present the results for group 4 obtained using the LMunsep method and the LMVP method, respectively.

As shown in Figure 1, the decomposition waveforms calculated via the two methods are identical. In Figure 2, the decomposition waveforms and the fitting waveforms obtained for group 4 are inconsistent. After comparing Figures 2(a) and 2(b) and the parameter results in Table 2, we observe that the optimization method and the decomposition waveform based on variable projection separation fit the received signal more closely than those of the traditional optimization method. Moreover, when the iterations reach the maximum limit, the traditional parameter optimization method did not converge on the optimal value.

4.1.2. Analysis of Simulation Example Results

To analyze the effect of the parameter optimization method based on variable projection separation, we compared it with the traditional parameter optimization method in terms of iterations (Iterations), computing time (T, the unit is in seconds), root mean square error (RMSE), goodness of fit (R2), correlation coefficient (C), and maximum deviation (M_diff):where is the original received signal and is the fitted received signal.where is the average of .where is the mean of . RMSE values closer to 0 indicate a better fit, whereas R2 and C values closer to 1 indicate a better correlation between the fitting result and the original signal.where M_diff is the maximum absolute value of the local difference between the received signal and the fitted signal.

Subsequently, the evaluation coefficient results were compared, as shown in Table 3.

Table 3 indicates that the calculation for group 4 data using the LMunSep method required the highest number of iterations (100), and its results differed significantly from the optimal estimate (as shown in Figure 2). However, the proposed improved LMVP method yielded the optimal estimate after just 13 iterations. Similarly, for the other groups of data, the improved method required fewer iterations than the LMunSep method, while yielding a significantly improved computational efficiency. The RMSE, R2, C, and M_diff values were identical for both methods and all data, except in the case of group 4, wherein both methods obtained the optimal estimation.

4.1.3. Convergence Analysis in Simulation Experiment

To further compare the results of the LMunSep and LMVP methods, the parameter iteration process for the five groups of data is depicted in Figure 3. Figure 3(a) presents a comparison between the iteration processes of the linear parameter for the five groups using the two different methods. Figures 3(b) and 3(c) present the same comparison for the nonlinear parameters and , respectively. Both the methods converged to the optimal value; however, the LMVP method required fewer iterations, while achieving faster convergence.

4.2. Example 2: Decomposition of Real Full-Waveform LiDAR Data

In the real experiment, we used the ICESat/GLAS satellite LiDAR waveform data for the terrain of Germany in 2003 (Sources: https://nsidc.org/data/), with a sampling interval of 1 ns and a sampling number of 544.

4.2.1. Comparisons of Solution Results in a Real Experiment

For the LMunSep method, we conducted two experiments that were restricted by different maximum number of iterations, i.e., 100 and 200 iterations. The results of the parameters were calculated using the LMunSep and LMVP methods (Table 4).

In the case of the LMunSep method, both experiments reached the maximum number of iterations; the results for 200 iterations was always closer to the LMVP method than those for 100 iterations. When the LMunSep method was allowed to run 200 iterations, it approached the results from the LMVP method; however, it never met them. Therefore, to obtain an optimal estimation value from the LMunSep method, more than 200 iterations and additional time is required.

Figure 4 presents a clearer comparison between the decomposition and fitting effect for the two models, original observation value, decomposition waveform, and fitted waveform of the received LiDAR signals. Figure 4(a) presents the LMunSep graph for the 200-iteration run, and Figure 4(b) depicts the LMVP graph.

As in the comparison of the parameters, the fitting and decomposition waveforms for the LMunSep and LMVP methods yield similar results; however, the LMVP method always yields the best fit.

4.2.2. Analysis of the Real Experiment Example Results

We analyzed the parameters in the LMunSep and LMVP method calculations to determine the evaluation coefficients (Table 5).

The two experiments using the LMunSep method failed to achieve the optimal result with the maximum number of iterations; however, the LMVP method achieved the optimal result within 14 iterations, i.e., a reduction of 14 times considering the 200-iteration case. The RMSE from the LMVP method was the lowest, whereas the RMSE from the LMunSep method using 200 instead of 100 iterations was closer to that of the LMVP method. Considering the goodness-of-fit and correlation coefficient, the LMVP method yielded values closer to 1 than the traditional method, indicating better fitting. Similarly, M_diff was lower for the improved method. In conclusion, in the real numerical experiment, the LMVP method achieved higher accuracy and better fitting than the LMunSep method.

4.2.3. Convergence Analysis in Real Experiment

The parameter iteration process depicted in Figure 5 provides a more clear comparison between the results of both models in the real experiments.

Although the LMunSep and LMVP methods converge to a similar value, the LMVP method requires fewer iterations and achieves faster convergence. For the LMunSep method, there are only minor changes in the optimal value during the later iteration steps; however, for the LMVP method, the optimal value converges in 14 iterations, achieving a higher convergence speed.

Therefore, in summary, the simulation and real experiments prove that the LMVP method yields better results than the LMunSep method and more efficient calculations.

5. Conclusions

This study focused on the special case of LiDAR signals that can be decomposed into a linear combination of multiple Gaussian function components. The linear and nonlinear parameters are separated using the SVD variable projection algorithm, and the original problem is transformed into a least-squares problem comprising only nonlinear parameters. Compared with the traditional method of parameter estimation without separation, the proposed method increased the probability of global convergence by reducing the dimension of the parameters to be estimated, especially in terms of reducing the number of iterations and computational time. In addition, the new method yielded better waveform fitting, which increased the accuracy of extracting the ground feature information from the received signal. The findings of this study are expected to expand the application of separable nonlinear least-squares methods, thereby providing a reliable solution for parameter estimation when processing LiDAR signal data.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Nature Science Foundation of China (grant no. 41774002), the Taishan Scholars Program of Shandong Province (grant no. TSXZ201509), and the Graduates Innovation Fund of Shandong University of Science and Technology (grant no. SDKDYC190207).