Abstract

This paper studies an intelligent method for recognizing underwater targets with the vision system of an unmanned underwater vehicle (UUV). The method, called kernel two-dimensional nonnegative matrix factorization (K2DNMF), is intended to improve the underwater operation capability of the UUV vision system. Our contributions can be summarized as follows: (1) K2DNMF applies the kernel method to the matrix factorization along both the column and row directions of two-dimensional image data, so that data with nonlinear structure in the original low-dimensional space are mapped into a higher-dimensional space where the structure is linear; (2) in K2DNMF, a good subspace approximation to the original data is obtained by imposing orthogonality constraints on the column basis matrix and the row basis matrix; (3) the column and row basis matrices extract the feature information of underwater target images, and an effective classifier is designed to perform underwater target recognition; (4) a series of experiments was performed on three sets of test samples collected by the UUV vision system, and the results demonstrate that K2DNMF achieves higher overall target detection accuracy than traditional underwater target recognition methods.

1. Introduction

In recent decades, increasing attention has been paid to target detection with UUV vision systems [1, 2]. Underwater target detection aims at hunting and processing the target of interest, which can help eliminate potential threats and avoid damage [3, 4]. Many scholars have long devoted themselves to UUV vision technology, and many effective methods have been developed and applied to problems in real environments [5, 6]. Among these studies, target detection based on the UUV vision system is one of the topics of greatest concern; the three-dimensional model of a UUV equipped with a vision system is shown in Figure 1. Traditional model-based target detection depends largely on prior knowledge of the target, but such knowledge is often very difficult to acquire, which limits the application of model-based methods to practical problems. This has created an urgent need for methods that work on the image data itself, that is, target detection based on algebraic methods, and has led to target detection techniques built on multivariate statistical analysis, such as principal component analysis (PCA) [7] and two-dimensional principal component analysis (2DPCA) [8, 9].

It is well known that, besides linear relations, nonlinear structures that are difficult to describe are also hidden among the variables of image data. In the past decades, the kernel method has therefore developed rapidly as a technique for processing nonlinear data. With the kernel method, the original input image data are mapped to a high-dimensional or infinite-dimensional Hilbert space, called the feature space, in which the structure of the image data is linear. In addition, by introducing suitable kernel functions, the inner product in the feature space can be calculated without evaluating the nonlinear mapping explicitly. For example, Xie and Lam proposed a single-sample face recognition method using kernel PCA [10], Sun et al. proposed an effective K2DPCA approach [11], and Eftekhari et al. proposed a face recognition algorithm based on block-wise 2D kernel PCA [12]. However, none of these methods can ensure that the obtained matrix factors are nonnegative, and because of their holistic nature they cannot extract basic components that represent local features.

To address this problem, a subspace method called nonnegative matrix factorization (NMF) [13–16] was proposed. NMF has since been applied successfully in pattern recognition and image processing. Unlike traditional matrix factorizations, the core objective of NMF is to find two nonnegative matrix factors whose product approximates the original data matrix. Because of the nonnegativity constraints on the matrix factors, the local features learned by NMF reconstruct the original image data by superposition alone, and no subtraction is needed to cancel information. However, NMF still has two obvious shortcomings in target recognition. First, each two-dimensional image matrix must be transformed into a one-dimensional image vector, which leads to very high dimensionality. Second, the matrix-to-vector transformation may lose information hidden in the two-dimensional image matrix. To solve these two problems, the 2DNMF method [17–20] was developed. 2DNMF takes into account the column and row information of the image matrix and finds nonnegative matrix factors in both directions. Compared with NMF [15], 2DNMF is therefore better in both computational efficiency and detection accuracy.
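
As an illustration of the nonnegative factorization described above, the following is a minimal sketch of the classical multiplicative update rules for NMF (Frobenius-norm objective). It is a generic textbook form, not the algorithm of this paper; the function name `nmf`, the matrix sizes, and the test data are assumptions chosen for the example.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Multiplicative updates for V ~= W @ H with W >= 0 and H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps   # nonnegative initialization
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so signs never change.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical data: a nonnegative 20 x 30 matrix of rank 3.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 30))
W, H = nmf(V, rank=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

Because both factors stay nonnegative, every column of `V` is rebuilt as an additive combination of the columns of `W`, which is the superposition property mentioned in the text.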

Although 2DNMF [17, 18] has been successfully applied to target detection, it does not perform well when the image data have strongly nonlinear characteristics. For this reason, kernel two-dimensional nonnegative matrix factorization (K2DNMF) is proposed as a nonlinear extension of standard two-dimensional nonnegative matrix factorization. This paper does not simply introduce the idea of the kernel method; we also explore the different interpretations of K2DNMF when the column basis matrix factors are restricted to have different properties. K2DNMF not only maintains the nonnegativity and low rank of the column basis matrix factor and the row basis matrix factor, but also imposes an orthogonality constraint on each of them, which leads to a good subspace approximation of the original data matrix in the feature space. In the underwater target detection phase, K2DNMF extracts the effective information of the underwater target and identifies the target with an effective classifier, thereby reducing the computational complexity. Experimental results demonstrate that, compared with traditional underwater target detection methods, K2DNMF has better feature extraction ability and higher detection accuracy on underwater target images collected by the UUV vision system.

The remainder of the paper is organized as follows. Section 2 briefly reviews the feature mapping method. Section 3 proposes the K2DNMF method and describes its algorithm. Section 4 uses underwater image data collected by the UUV vision system to evaluate the performance of K2DNMF for underwater target detection. Finally, Section 5 gives a brief conclusion.

2. Feature Mapping

The advantage of feature mapping is that it can transform a nonlinear relationship among sample data in a low-dimensional space into a linear relationship in a high-dimensional space [21]. In addition, by introducing a kernel function, one can compute the inner product in the feature space without carrying out the feature mapping explicitly. Feature mapping and kernel functions are introduced in more detail in the rest of this section.
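
The following small sketch illustrates why the explicit mapping can be avoided. For the homogeneous polynomial kernel of degree 2, the inner product of the explicitly mapped vectors equals the kernel value computed in the input space. The helper names `phi`, `dot`, and `poly_kernel` and the two sample vectors are assumptions made for the illustration.

```python
from itertools import product

def phi(x):
    """Explicit degree-2 feature map: all ordered products x_i * x_j."""
    return [xi * xj for xi, xj in product(x, repeat=2)]

def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def poly_kernel(x, y, degree=2):
    """Homogeneous polynomial kernel (x . y) ** degree."""
    return dot(x, y) ** degree

x = [1.0, 2.0, 3.0]
y = [0.5, -1.0, 2.0]
explicit = dot(phi(x), phi(y))   # inner product in the 9-dimensional feature space
implicit = poly_kernel(x, y)     # same value, computed in the 3-dimensional input space
```

Both quantities equal 20.25 here, but the kernel evaluation never forms the 9-dimensional vectors. For kernels whose feature space is infinite-dimensional, only the implicit route is available.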

Consider underwater training image samples , where each is a by matrix. We arrange the original training images into an augmented matrix that can be written as follows:

where represents the kth column vector of the augmented matrix . It is therefore easy to see that the dimension of the augmented matrix is by . Each data vector can be transferred to a higher-dimensional or even infinite-dimensional feature space by the following mapping function .

Thus, in the feature space, the augmented matrix can be denoted by . This mapping is also called kernel mapping, and the kernel matrix is defined as:

where represents the kernel function. In the feature space, the data can be standardized by mean centering and variance scaling of the kernel matrix [21].

where , and denotes the trace of a matrix.
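
The standardization step can be sketched as follows. The sketch assumes the form commonly used in kernel PCA: double centering of the kernel matrix followed by division by trace/(N − 1). The exact expressions of this paper are not reproduced here, and the function name, data, and kernel parameters are illustrative assumptions.

```python
import numpy as np

def center_and_scale_kernel(K):
    """Mean-center a kernel matrix in feature space, then scale by trace/(N-1).

    Assumed form (common in kernel PCA), not necessarily the paper's exact one.
    """
    N = K.shape[0]
    ones = np.full((N, N), 1.0 / N)
    Kc = K - ones @ K - K @ ones + ones @ K @ ones   # double centering
    return Kc / (np.trace(Kc) / (N - 1))             # variance scaling

rng = np.random.default_rng(0)
X = rng.random((5, 8))            # 8 hypothetical samples stored as columns
K = (X.T @ X + 1.0) ** 2          # polynomial kernel with illustrative parameters
K_tilde = center_and_scale_kernel(K)
```

After centering, every row and column of the kernel matrix sums to zero, which corresponds to zero-mean data in the feature space. After scaling, the trace equals N − 1.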

3. Kernel 2DNMF

Generally, algorithms for the matrix decomposition of two-dimensional image data, such as BDPCA [22, 23] and RC2DPCA [24], divide the original data space into several subspaces along both the row and column directions and attempt to find a subspace approximation of the original data along both directions. In this section, the augmented matrix is decomposed to find low-rank matrices in both the column and row directions of the high-dimensional feature space, thereby establishing a model for underwater target detection.

3.1. Column Direction Decomposition of K2DNMF

Similar to the KNMF method [25–28], consider the decomposition of the following form:

where represents the basis matrix and represents the coefficient matrix. For image feature extraction, the parameter can be chosen arbitrarily, provided that it is smaller than the parameter . In this paper, since each column vector of corresponds to a column of the image after feature mapping, the matrix is also called the column basis matrix. Furthermore, to improve the subspace approximation and reduce the computational load, and following the example of BDPCA and RC2DPCA, we require the column basis matrix to remain orthogonal within the K2DNMF framework.

Following the above analysis, the matrices and can be obtained by solving the following optimization problem:

where represents the Frobenius norm of a matrix and is the identity matrix. However, since the kernel mapping function is unknown, it is almost impossible to obtain the matrices and directly. Fortunately, if we constrain the basis vectors to lie within the column space of , i.e., , where is a coefficient, Equation (7) can be further transformed into the following form:

where is the coefficient matrix and is the trace of a matrix. The constraints in Equation (8) include both inequality constraints and equality constraints. Therefore, the Karush-Kuhn-Tucker (KKT) conditions are used to obtain the optimal solution of the objective function. Equation (8) can be further rewritten as:
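
The step from a Frobenius-norm objective to a trace expression in the kernel matrix can be checked numerically. The sketch below uses a linear kernel, so that the feature-space data are explicit. The names `Phi`, `K`, `W`, `H`, and `F` are assumptions introduced for this illustration and are not necessarily the symbols of Equations (7) and (8).

```python
import numpy as np

rng = np.random.default_rng(0)
Phi = rng.random((6, 10))          # explicit feature-space data, one column per sample
K = Phi.T @ Phi                    # kernel (Gram) matrix of the 10 samples
W = rng.random((10, 3))            # coefficients that define the basis F = Phi @ W
H = rng.random((3, 10))            # coefficient matrix of the decomposition

F = Phi @ W                        # basis constrained to the column space of Phi

# Reconstruction error computed directly in the feature space ...
direct = np.linalg.norm(Phi - F @ H, "fro") ** 2
# ... and computed from the kernel matrix alone.
via_kernel = np.trace(K - 2 * K @ W @ H + H.T @ W.T @ K @ W @ H)

# The orthogonality condition F.T @ F = I likewise becomes W.T @ K @ W = I.
gram_of_basis = W.T @ K @ W
```

The two error values agree, which shows that once the basis is restricted to the span of the mapped samples, neither the objective nor the orthogonality constraint requires the mapping itself. Only kernel evaluations are needed.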

Equation (9) is the K2DNMF objective function that captures the column-direction information of the images in the feature space. Next, we employ the Lagrange multiplier method to derive iterative solutions of Equation (9). The Lagrangian function is defined as:

where , and are the Lagrange multipliers associated with constraints and , respectively.

Setting the partial derivative of with respect to to zero yields:

where represents the partial derivative and the subscript denotes the entry of the matrix. Right-multiplying both sides of Equation (11) by and obtaining with the help of the KKT condition, we derive the updating rule for as:

where represents element-wise division of matrices. According to the KKT conditions, the optimal solution of the objective function must satisfy and . Therefore, the Lagrangian function can be redefined as and rewritten as follows:

To obtain the value of , we require that the partial derivatives of with respect to and vanish, which yields:

With the help of , Equation (15) can be simplified as:

Left-multiplying both sides of Equation (14) by and applying the two known conditions and to Equation (14), we obtain . Substituting this into Equation (12) yields the updating rule for :

Thus, if the matrix is initialized with nonnegative values, a pair of converged nonnegative matrices and can be obtained by repeated iteration. The expression of the column basis matrix can then be obtained from :

Since the feature mapping function is unknown, the column basis matrix cannot be computed explicitly; this does not, however, affect the effective expression of the underwater target feature information.
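
To make the iterative scheme concrete, the following is a simplified sketch of kernel NMF with multiplicative updates. It is a swapped-in generic variant, not the updating rules of this paper: it minimizes the trace objective without the orthogonality constraint and assumes an entrywise nonnegative kernel matrix. The function name `kernel_nmf` and all data are hypothetical.

```python
import numpy as np

def kernel_nmf(K, rank, n_iter=300, eps=1e-12, seed=0):
    """Multiplicative updates for min tr(K - 2 K W H + H.T W.T K W H), W, H >= 0.

    Assumes K is entrywise nonnegative. No orthogonality constraint is imposed,
    unlike the method described in the text.
    """
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    W = rng.random((n, rank))      # nonnegative initialization
    H = rng.random((rank, n))
    errors = []
    for _ in range(n_iter):
        W *= (K @ H.T) / (K @ W @ (H @ H.T) + eps)
        H *= (W.T @ K) / (W.T @ K @ W @ H + eps)
        errors.append(np.trace(K - 2 * K @ W @ H + H.T @ W.T @ K @ W @ H))
    return W, H, errors

rng = np.random.default_rng(1)
X = rng.random((8, 20))            # 20 hypothetical samples stored as columns
K = (X.T @ X + 1.0) ** 2           # nonnegative polynomial kernel matrix
W, H, errors = kernel_nmf(K, rank=4)
```

As in the text, nonnegative initial values remain nonnegative throughout the iterations, and the basis in the feature space is represented only implicitly through `W` and the kernel matrix.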

3.2. Row Direction Decomposition of K2DNMF

In Section 3.1, we obtained a nonnegative column basis matrix and a nonnegative coefficient matrix by the decomposition of , so that the ith sample image can be easily derived as:

where . The purpose of this section is to find the row basis matrix of K2DNMF. To this end, we construct a new matrix that contains the row-direction information of the image samples in the feature space. Using a decomposition similar to that in the column direction, the nonnegative matrix can be written as follows:

where , . Here and are the row basis matrix and its corresponding coefficient matrix, respectively. The row basis matrix is again expected to be orthogonal. Therefore, Equation (20) can be transformed into the following optimization problem:

By expanding Equation (21) according to the calculation method of Equation (8), the following optimization problem containing the double constraints of equality and inequality can be derived as follows:

Next, the Lagrangian technique is used to derive the iterative solution of Equation (22). The Lagrangian function is defined as:

where , and represent the Lagrange multipliers associated with the constraints , and , respectively. Next, setting the partial derivative of with respect to to zero yields:

where the subscript denotes the entry of the matrix. Right-multiplying both sides of Equation (24) by and applying the KKT condition , the updating rule for is obtained as:

Then, the Lagrange multiplier is determined by using the KKT condition. Thus, the Lagrangian function can be redefined as , whose form is as follows:

Moreover, requiring that the partial derivatives of with respect to and vanish, we have

Applying to Equation (28) yields:

Left-multiplying both sides of Equation (27) by and applying the two known conditions and to Equation (27), we can deduce . Substituting into Equation (25) further simplifies the updating rule for :

Now, if the matrix is initialized with nonnegative values, a pair of converged nonnegative matrices and can be obtained by repeated iteration. Here can be decomposed into aligned sub-matrices:

where is regarded as the coefficient matrix of , so can be approximated by the product of the row basis matrix and the coefficient matrix .

This completes the row-direction decomposition of K2DNMF. Combined with the column-direction decomposition described in Section 3.1, it gives the whole K2DNMF algorithm, which is shown in Table 1. The four matrix factors , , and are determined by the corresponding iteration rules, which are summarized in Table 2.

3.3. Underwater Target Detection Based on K2DNMF

Underwater target detection with the K2DNMF method involves two stages: feature extraction and feature classification of the underwater target.

Consider first the feature extraction stage. Since the orthogonal column basis matrix and row basis matrix are regarded as orthogonal projection matrices in the K2DNMF method, and are used for feature extraction. For any given new image sample , the feature matrix can be written as:

where is given by:

Mean centering and variance scaling of can be done by [21]

where , Equation (33) can be written as:
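
The treatment of a new sample can be sketched as follows, assuming the centering formula commonly used in kernel PCA for test kernel vectors. Only mean centering is shown (variance scaling is omitted), and only a one-sided projection onto a hypothetical column-direction basis is illustrated, not the full two-sided feature matrix of this section. All names and data are assumptions. A linear kernel is used so that the result can be checked against explicitly centered features.

```python
import numpy as np

def center_test_kernel(k_new, K_train):
    """Center the kernel vector of a new sample using training-set statistics."""
    return k_new - K_train.mean(axis=1) - k_new.mean() + K_train.mean()

rng = np.random.default_rng(0)
X = rng.random((4, 7))             # 7 hypothetical training samples as columns
x_new = rng.random(4)              # one new sample
K_train = X.T @ X                  # linear kernel, so the feature map is explicit
k_new = X.T @ x_new                # kernel values between training samples and x_new

k_centered = center_test_kernel(k_new, K_train)

# Reference: center the explicit features first, then take inner products.
mu = X.mean(axis=1, keepdims=True)
reference = (X - mu).T @ (x_new - mu.ravel())

W = rng.random((7, 2))             # hypothetical coefficients of a two-vector basis
features = W.T @ k_centered        # projection of the new sample onto that basis
```

The centered kernel vector matches the reference computed from explicit features, so a new sample can be projected using kernel evaluations against the training set alone.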

Next, a classifier based on the matching degree is designed to achieve underwater target detection [29]. Assume that there are training sample images in total. Using the feature extraction method of the first stage, the feature matrix of each training sample can be obtained. The distance between any two sample feature matrices and is defined as:

Then, the distance between the feature matrix of each training sample and the mean feature matrix can be obtained from Equation (38). These distances form a set, called the set of feature distances:

Suppose that a testing sample is given. Its feature matrix can be obtained from Equation (37), and the matching degree between and can then be computed to judge whether belongs to the underwater target images. The matching degree is defined as follows:

where is a parameter that can be determined from the matching degree between the maximum value in the feature distance set and . We set a threshold for the matching degree ( is also called the control limit). If the condition is satisfied, the sample is judged to be an underwater target; otherwise, no underwater target is found. The process of underwater target detection based on K2DNMF is shown in Figure 2.
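
A threshold classifier of this kind can be sketched as follows. The exact distance and matching-degree expressions of this paper are not reproduced here. The sketch assumes a Frobenius-norm distance and a hypothetical matching degree exp(−d/σ), with σ chosen so that the largest training distance maps exactly onto the threshold. All function names and data are illustrative.

```python
import math
import numpy as np

def frobenius_distance(A, B):
    """Assumed distance between two feature matrices."""
    return float(np.linalg.norm(A - B, "fro"))

def fit_detector(train_features, threshold=0.8):
    """Return the mean feature matrix and the scale sigma of the matching degree."""
    mean_feature = np.mean(train_features, axis=0)
    d_max = max(frobenius_distance(Y, mean_feature) for Y in train_features)
    sigma = -d_max / math.log(threshold)   # so that exp(-d_max / sigma) == threshold
    return mean_feature, sigma

def matching_degree(Y, mean_feature, sigma):
    """Hypothetical matching degree in (0, 1]; equals 1 at zero distance."""
    return math.exp(-frobenius_distance(Y, mean_feature) / sigma)

rng = np.random.default_rng(0)
# Hypothetical training features: small perturbations of an all-ones matrix.
train = [np.ones((3, 2)) + 0.05 * rng.standard_normal((3, 2)) for _ in range(20)]
mean_feature, sigma = fit_detector(train)

near = matching_degree(np.ones((3, 2)), mean_feature, sigma)    # target-like sample
far = matching_degree(np.zeros((3, 2)), mean_feature, sigma)    # dissimilar sample
```

With this choice every training sample has a matching degree of at least the threshold, and a test sample is accepted as a target exactly when its distance to the mean feature matrix does not exceed the largest training distance.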

4. Experiments and Analysis

In this section, the proposed K2DNMF method is applied to underwater target images collected by the UUV vision system. The experiment was conducted in the experimental pool of Best Sea Assembly and control technology Institute of Harbin Engineering University; the pool is shown in Figure 3. The compared algorithms were BDPCA [22], RC2DPCA [24], 2DNMF [18], PNMF [30], and MKNMF [31]. Among them, BDPCA, RC2DPCA, and 2DNMF are linear methods, while PNMF, MKNMF, and K2DNMF are nonlinear methods that adopt the polynomial kernel . For BDPCA and RC2DPCA, we chose the eigenvectors whose cumulative variance contribution in the column and row directions reaches 90%. The maximum number of iterations for the NMF-related methods was set to 300 and kept constant in all experiments. The number of features for PNMF and MKNMF was 200. For 2DNMF and K2DNMF, the number of column-direction features was 200 and the number of row-direction features was 160. The kernel parameters were set to , , and for K2DNMF, PNMF, and MKNMF, respectively. Finally, an efficient classifier was used to carry out underwater target detection, with the threshold set to 80% for all methods. The experiments were repeated 10 times, and the average detection accuracies and average matching degrees were recorded.
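
The 90% cumulative-variance rule used for BDPCA and RC2DPCA can be sketched as follows. The function name and the eigenvalues are hypothetical, and the sketch assumes that eigenvalues are ranked in decreasing order before accumulation.

```python
import numpy as np

def n_components_for_variance(eigenvalues, target=0.90):
    """Smallest number of leading eigenvalues whose cumulative share reaches target."""
    vals = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]   # decreasing order
    cumulative = np.cumsum(vals) / vals.sum()
    # Index of the first cumulative share >= target, converted to a count.
    return int(np.searchsorted(cumulative, target) + 1)

eigs = [5.0, 3.0, 1.5, 0.3, 0.2]   # hypothetical eigenvalues, cumulative shares 0.5, 0.8, 0.95, ...
k = n_components_for_variance(eigs)
```

Here the first two eigenvectors explain 80% of the variance and the first three explain 95%, so three eigenvectors are retained.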

The UUV vision system consists mainly of an underwater camera and underwater lighting equipment and is installed in the bow of the UUV. The underwater camera collects 24 frames of image data per second. Taking into account data storage space and processing speed, the resolution of each frame is normalized from pixel arrays to pixel arrays. The UUV vision system sampled each of six types of underwater targets for 50 consecutive seconds. One frame was collected every 0.25 seconds and stored in the underwater target image dataset, which therefore consists of 1200 frames. In addition, to better simulate underwater optical fiber cable, all targets are cylindrical in shape. The dataset contains various states of the underwater targets, and some target images in different states are shown in Figure 4. The frame images from the underwater target image dataset were randomly selected for training, while the remaining frame images were divided into three parts for testing. We stipulate that the difference between the numbers of any two types of underwater targets in the training set should not exceed 10%; otherwise, the training set is re-selected. The first part selected the frame images as testing set 1. The second part selected the frame images after testing set 1 was removed as testing set 2. The third part took the remaining frame images and added 600 frames from 15 categories of underwater nontarget images collected by the UUV vision system as testing set 3. The above process was repeated 10 times.
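
The dataset sizes quoted above follow from the sampling parameters, as the short check below shows. The variable names are introduced only for this illustration. The figure of 40 frames per nontarget category is the one stated later in this section.

```python
seconds_per_target = 50        # sampling duration for each target type
target_types = 6               # number of underwater target types
sampling_interval_s = 0.25     # one stored frame every 0.25 s, i.e. 4 frames per second

frames_per_target = int(seconds_per_target / sampling_interval_s)   # 200
target_frames = frames_per_target * target_types                     # 1200

nontarget_categories = 15      # interference categories used in testing set 3
frames_per_category = 40
nontarget_frames = nontarget_categories * frames_per_category        # 600
```

Since the camera delivers 24 frames per second and 4 frames per second are stored, one frame in six is kept.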

When the number of training samples is 1000, the average detection accuracy of each method on testing set 1 is shown in Table 3. Figure 5 shows the matching degree of the different methods for the sample data in the 7th experiment on testing set 1, where red dots indicate incorrect detections of the underwater target, green dots indicate correct detections, and the blue line is the control limit. Figure 5 and Table 3 clearly show that PNMF, MKNMF, and K2DNMF give better results than BDPCA, RC2DPCA, and 2DNMF, which indicates that the nonlinear methods outperform their linear counterparts. The best performance is achieved by the K2DNMF algorithm proposed in this paper. Compared with the other methods, it increases the detection accuracy by 9.4%, 7.2%, 5.5%, 3.8%, and 0.6%, respectively. This demonstrates that the proposed K2DNMF method extracts features more effectively than the other methods and can detect underwater targets accurately.

To demonstrate how each method recognizes targets as the number of training samples changes, different numbers of training samples (300, 400, 500, 600, 700, 800, 900, 1000) were selected from the underwater target image dataset and evaluated on testing set 2. The average detection accuracies are shown in Table 4 and plotted in Figure 6, where the numbers in parentheses are the standard deviations. Figure 6 and Table 4 show that the performance of all methods improves as the number of training samples increases. The detection accuracy of K2DNMF increases from 79.92% with 300 training images to 90.20% with 1000 training images. The detection accuracies of BDPCA, RC2DPCA, 2DNMF, PNMF, and MKNMF increase from 73.12%, 74.63%, 71.86%, 72.90%, and 78.33% with 300 training images to 81.80%, 84.00%, 84.60%, 85.80%, and 88.20% with 1000 training images, respectively. Figure 6 and Table 4 also show that when the number of training samples exceeds 50% of the samples in the image dataset, the kernel methods outperform the linear methods, that is, PNMF, MKNMF, and K2DNMF outperform BDPCA, RC2DPCA, and 2DNMF. Further observation shows that, over the selected training sample sizes, K2DNMF has the best overall detection accuracy. Figure 7 shows the distribution of the standard deviation of the detection accuracy for the different methods when the same number of training samples is used on testing set 2. It can be seen from Figure 7 that the standard deviation of the detection accuracy of K2DNMF is smaller than that of the other methods for every training sample size (300, 400, 500, 600, 700, 800, 900, and 1000). This is probably because the orthogonality of the column basis matrix and row basis matrix is considered in the objective function of K2DNMF, which is important for improving the robustness of the algorithm.
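
The accuracies quoted above for testing set 2 can be turned into gains in percentage points with a few lines of arithmetic. The dictionary names are introduced for this illustration, and the numbers are those stated in the text for 300 and 1000 training images.

```python
# Average detection accuracies (%) on testing set 2, as quoted in the text.
acc_300 = {"BDPCA": 73.12, "RC2DPCA": 74.63, "2DNMF": 71.86,
           "PNMF": 72.90, "MKNMF": 78.33, "K2DNMF": 79.92}
acc_1000 = {"BDPCA": 81.80, "RC2DPCA": 84.00, "2DNMF": 84.60,
            "PNMF": 85.80, "MKNMF": 88.20, "K2DNMF": 90.20}

# Gain of each method when going from 300 to 1000 training images.
gain = {m: round(acc_1000[m] - acc_300[m], 2) for m in acc_300}

# Margin of K2DNMF over every other method at 1000 training images.
margin_1000 = {m: round(acc_1000["K2DNMF"] - a, 2)
               for m, a in acc_1000.items() if m != "K2DNMF"}
```

K2DNMF gains 10.28 percentage points over this range, and at 1000 training images its margin over the other methods ranges from 2.0 points (MKNMF) to 8.4 points (BDPCA). With 300 training images, 2DNMF and PNMF are below BDPCA, which is consistent with the observation that the kernel methods need a sufficiently large training set.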

To better simulate underwater target detection, testing set 3 contains, in addition to the remaining frame images, 600 frames of underwater nontarget images collected by the UUV vision system as a source of interference. These frames cover fifteen categories, each consisting of 40 frames whose resolution is normalized from pixel arrays to pixel arrays. Figure 8 shows some interference images from testing set 3. The average detection results are shown in Figure 9 and Table 5. The accuracy of K2DNMF rises by 12.60% from 300 to 1000 training images, while the detection accuracies of BDPCA and RC2DPCA increase by 11.80% and 12.46%. For the NMF methods, the detection accuracies of 2DNMF, PNMF, and MKNMF increase by 15.74%, 16.89%, and 12.96%, respectively, over the same range. Figure 9 and Table 5 also show clearly that with 500, 600, 700, 800, 900, and 1000 training samples, the kernel methods outperform the linear methods in detection accuracy. That is to say, when the number of training samples is at least 41.67% of the samples in the image dataset (500 of 1200 frames), the average detection accuracies of PNMF, MKNMF, and K2DNMF are higher than those of BDPCA, RC2DPCA, and 2DNMF. Taking the results on testing set 2 into account as well, we recommend that the number of training samples exceed 50% of the image dataset if a kernel method is used for underwater target detection. In addition, over the selected training sample sizes, the proposed K2DNMF method has higher overall detection accuracy for underwater targets than the other methods. This is probably because K2DNMF preserves the structural information embedded among pixels, which is vital for target detection.
For each training sample size (300, 400, 500, 600, 700, 800, 900, 1000), the distribution of the standard deviation of the detection accuracy for each method is shown in Figure 10. Figure 10 shows that the standard deviation of the detection accuracy of K2DNMF is the smallest among all methods regardless of the training sample size, which demonstrates the stronger robustness of the K2DNMF method.
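
The percentages of the dataset quoted in this section correspond to the training sample sizes as follows. This is a simple check based on the 1200-frame target dataset, with variable names chosen for the illustration.

```python
dataset_size = 1200            # frames in the underwater target image dataset
training_sizes = [300, 400, 500, 600, 700, 800, 900, 1000]

# Share of the dataset used for training, in percent.
fractions = {n: round(100 * n / dataset_size, 2) for n in training_sizes}
```

The 41.67% figure corresponds to 500 training frames and the 50% figure to 600 training frames.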

5. Conclusion

In this paper, a new approach to underwater target detection, kernel two-dimensional nonnegative matrix factorization (K2DNMF), has been developed. The method for obtaining the matrix factors , , and in the feature space was discussed. To achieve a better expression of the original data in the feature subspace, an orthonormality constraint is imposed on the column basis matrix and the row basis matrix , in addition to guaranteeing their nonnegativity and low rank. In the target detection phase, combining the feature information of the underwater images extracted by K2DNMF with the matching degree method reduces the computational complexity to some extent. Finally, the K2DNMF method extends the range of available underwater target detection schemes. The proposed method has been evaluated on image data collected by the UUV vision system. Experimental results demonstrate that the K2DNMF-based method has good feature extraction ability and target detection performance.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is supported by the Best Sea Assembly and the Control Technology Institute. The authors would like to thank Xue Du, Juan Li, and Tianhao Jiang for providing assistance with the underwater experiments. This work is also supported in part by the National Natural Science Foundation of China under Grants 51609046 and 51709062, and in part by the Research Funds for the Underwater Vehicle Technology Key Laboratory of China under Grants 614221502061701 and 6142215180107.