Abstract

In this paper, a cooperative dynamic maneuver decision-making algorithm for multiple unmanned underwater vehicles (UUVs) is proposed based on the combination of game theory and intuitionistic fuzzy sets. Underwater environments with weak connectivity, underwater noise, and dynamic uncertainties are modeled through intuitionistic fuzzy sets, which addresses one of the main problems in making decisions underwater. Subsequently, the intuitionistic fuzzy multiattribute evaluation of a UUV maneuver strategy is conducted, and the intuitionistic fuzzy payoff matrix of the cooperative dynamic maneuver game is obtained. Thereafter, a Nash equilibrium condition that satisfies the intuitionistic fuzzy total order is proposed, and the Nash equilibrium maneuver decision-making model under a dynamic underwater environment is established. A modified particle swarm optimization method is then presented to solve the established model and find the optimal strategy. Finally, an example is used to verify the superiority of the proposed cooperative dynamic maneuver decision-making algorithm.

1. Introduction

Unmanned underwater vehicles (UUVs) are characterized by their small size, superior maneuverability, low cost, and good stealth. They can operate independently or under manned operation. Multi-UUV control algorithms, which involve cooperative formation control, cooperative navigation, and cooperative confrontation, have gained increased attention [1–5]. At present, research on multi-UUV cooperative formation and navigation is developing rapidly [2, 6]. However, studies on multi-UUV cooperative confrontation are still quite limited. Multi-UUV cooperative confrontation can be applied to marine scientific investigation and military confrontation, including underwater multitarget tracking, surveillance, and detection, effectively increasing the radius of underwater operation and reducing underwater equipment losses and casualties.

Maneuver decision-making is key to multi-UUV cooperative confrontation because it is the basic action in each confrontation step [7]. Current studies on maneuver decision-making mainly focus on unmanned aerial vehicles (UAVs) and land unmanned system (LUS) clusters, among others [8]. Many studies address single-agent control technology, but only a few address multiagent decision-making technologies. Likewise, many studies address unilateral strategy optimization, but few address bilateral game theory. Moreover, most references study path planning for a single agent. For instance, in [4], the simulations give a numerical experiment with six agents. Therefore, a more scientific and accurate real-time confrontation strategy can be formulated by introducing cooperative game theory into the maneuvering and decision-making of unmanned system clusters [9]. However, Wang et al. have only studied AUV strategies and have not discussed the number of UAVs in their study.

Compared with land or air unmanned vehicle cluster confrontation, multi-UUV cooperative game theory has some unique features. First, UUVs have a low communication rate, poor information interaction ability, and weak perception because of the weak connectivity in underwater environments, which makes it difficult to locate a UUV cluster precisely and restricts the decision-making process. Second, in the confrontation process, not only the antagonistic situation but also the cooperation between UUVs in the cluster must be considered. In addition, underwater confrontation in a real-time situation is dynamic and lasts for several rounds, which makes it more complicated.

Dai et al. used game theory to realize decision-making for noncommunicating multirobot tasks (three robots in one experiment) [10], in which the joint probability distribution of the robots was established according to the distance information, and the dynamic game process with incomplete information was presented. An approximate dynamic programming method was proposed for the one-to-one air combat maneuver problem in [11]. The discrete simulation model of UAV air combat was analyzed and validated through game theory by Poropudas et al. [12]. Suresh and Ghose jointly considered the number of UAVs, weapon configuration, and ground defense system, discussed the tactical cooperation of UAVs in ground confrontation, and proposed a four-to-four UAV grouping algorithm based on Dubins’ path [13]. The game characteristics between 22 multirobot patrol formation and patrolled objects were analyzed by Hernandez et al., and a distributed dynamic collaboration method based on game theory was proposed in [14]. Dahl et al. proposed the application of space chain scheduling to solve the cooperative game problem in three-to-three multirobot task allocation [15]. Wang et al. studied a cooperative game-based autonomous cluster aggregation strategy for the aggregation behavior of a UAV cluster in implementing reentry target-oriented cooperative surveillance [16]. For underwater confrontation, Muhammed et al. proposed different kinds of game theories for cooperation among acoustic sensor nodes and compared their performances under different conditions [17]. However, these existing studies concentrated on unmanned aerial and land system clusters, and they have not fully considered the impact of underwater environmental characteristics.

This study focuses on two key factors in underwater maneuver decision-making, namely, its weak interconnection characteristic and its dynamic confrontation process. Weak interconnection, including weak connectivity, underwater noise, and dynamic uncertainties, leads to uncertain payoffs in the maneuver decision-making process of UUVs. Classical game theory only discusses games with crisp payoffs [18]. However, in an actual underwater environment, the information provided is mostly fuzzy. If this fuzzy information is converted directly into a crisp value, real information is distorted and lost. Consequently, the maneuver decision-making algorithm loses its viability as a basis for strategy choice. Therefore, in this study, a cooperative dynamic maneuver decision-making algorithm is proposed based on intuitionistic fuzzy game theory. Underwater environments with different kinds of uncertainties are modeled through intuitionistic fuzzy sets, which addresses one of the main problems of the underwater decision-making process. Meanwhile, the intuitionistic fuzzy multiattribute evaluation of the UUV maneuver strategy is performed, and the intuitionistic fuzzy payoff matrix of the maneuver game is obtained. The Nash equilibrium condition satisfying the intuitionistic fuzzy total order is proposed, and the Nash equilibrium maneuver decision-making model under a dynamic underwater environment is established. Finally, the modified particle swarm optimization method is used to solve the established problem and find the optimal strategy. The general diagram of the maneuver decision-making process is shown in Figure 1.

The rest of this paper is organized as follows. Section 2 presents the maneuver attribute evaluation process. Section 3 provides the decision-making model based on intuitionistic fuzzy game theory. Section 4 gives the main result on the existence of the Nash equilibrium. The cooperative dynamic maneuvering strategy optimization is presented in Section 5. Section 6 shows an example of multi-UUV confrontation. Finally, conclusions are drawn in Section 7.

2. Maneuver Attribute Evaluation

To establish the fuzzy payoff matrix, the multi-UUV maneuver attributes are evaluated according to the situation information of the confronting sides. The confrontation trajectory of a multi-UUV system can be regarded as a combination of multiple maneuver actions. A UUV has seven basic maneuver actions, namely, keep the pace, speed up, speed down, left turn, right turn, pitch up, and pitch down. It should be noted that these actions might be limited according to the features of the UUV. The two confrontation sides are named “A” and “D”, respectively. The maneuver strategy sets $S_A$ and $S_D$ of sides “A” and “D” are defined as

$$S_A = \{a_1, a_2, \ldots, a_7\}, \quad S_D = \{d_1, d_2, \ldots, d_7\}, \tag{1}$$

where $a_1$ and $d_1$ denote the strategy “keep the pace,” $a_2$ and $d_2$ denote the strategy “speed up,” $a_3$ and $d_3$ denote the strategy “speed down,” $a_4$ and $d_4$ denote the strategy “left turn,” $a_5$ and $d_5$ denote the strategy “right turn,” $a_6$ and $d_6$ denote the strategy “pitch up,” and $a_7$ and $d_7$ denote the strategy “pitch down.”
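As a minimal sketch of how the seven basic actions and the resulting strategy sets might be represented in code, the following Python snippet enumerates the maneuvers and builds the joint choices of a side with several UUVs. The names `Maneuver` and `joint_strategies` are illustrative assumptions and do not come from the paper.

```python
from enum import Enum
from itertools import product


class Maneuver(Enum):
    """The seven basic maneuver actions listed in the text."""
    KEEP_PACE = 1
    SPEED_UP = 2
    SPEED_DOWN = 3
    LEFT_TURN = 4
    RIGHT_TURN = 5
    PITCH_UP = 6
    PITCH_DOWN = 7


def joint_strategies(num_uuvs):
    """Return every joint maneuver choice for a side with `num_uuvs` vehicles."""
    return list(product(Maneuver, repeat=num_uuvs))


single = joint_strategies(1)
pair = joint_strategies(2)
print(len(single), len(pair))  # 7 49
```

With two UUVs per side, as in the example of Section 6, each side has 7 × 7 = 49 joint strategies, so the payoff matrix of that example would be 49 × 49.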

Four attributes are considered in the maneuver attribute set $C$:

$$C = \{c_1, c_2, c_3, c_4\}, \tag{2}$$

where $c_1$ is the distance factor, $c_2$ is the velocity factor, $c_3$ is the deflection angle, and $c_4$ denotes the depression angle.

The main difference between multi-UUV confrontation and other confrontations with autonomous robots is the information transmission mode. Due to the submarine environment, the information in a multi-UUV confrontation process is mainly received through underwater acoustics. The shallow water acoustic channel is a channel with time-space-frequency variation [19]. It has a strong multipath interference, high environmental noise, large transmission loss, and notable Doppler shift effect [19]. Therefore, the information provided in the multi-UUV confrontation process has strong uncertainties. It is difficult to accurately quantify the extent of the threat of each side during the decision-making process [20]. Hence, in this paper, each attribute is divided into seven levels by using an intuitionistic fuzzy language. In a practical confrontation, the fuzzy language can be transformed into a certain set to participate in the decision-making process. Because the intuitionistic fuzzy set could measure the degree of fuzziness of the original information more comprehensively, the fuzzy language is transformed into intuitionistic fuzzy sets here [20, 21]. The relationship between the fuzzy language and fuzzy sets is listed in Table 1.

The importance level of the $i$th maneuver attribute factor relative to the $j$th one is obtained from experience and the practical problem at hand, as presented in Table 2 [22].

Therefore, the importance level matrix can be achieved using the following equation:where is the inverse of according to the definition of the importance level.

The threat weight of each attribute is obtained as
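The following Python sketch illustrates one way to build a reciprocal importance matrix and turn it into normalized attribute weights. It is an assumption-laden illustration: the weighting rule used here (normalized row geometric means, a common choice for pairwise comparison matrices) may differ from the paper's threat-weight equation, and the judgment values and function names are hypothetical.

```python
import numpy as np


def reciprocal_matrix(upper):
    """Build a pairwise importance matrix with a_ji = 1 / a_ij and a_ii = 1.

    `upper` maps index pairs (i, j) with i < j to the importance of
    attribute i relative to attribute j.
    """
    n = 1 + max(j for _, j in upper)
    a = np.ones((n, n))
    for (i, j), value in upper.items():
        a[i, j] = value
        a[j, i] = 1.0 / value
    return a


def geometric_mean_weights(a):
    """Normalized row geometric means (an assumed weighting rule)."""
    gm = np.prod(a, axis=1) ** (1.0 / a.shape[0])
    return gm / gm.sum()


# Hypothetical judgments for (distance, velocity, deflection angle, depression angle).
judgments = {(0, 1): 2.0, (0, 2): 3.0, (0, 3): 4.0,
             (1, 2): 2.0, (1, 3): 3.0, (2, 3): 2.0}
A = reciprocal_matrix(judgments)
w = geometric_mean_weights(A)
print(np.round(w, 3), w.sum())
```

Whatever rule is used, the resulting weights should be nonnegative and sum to 1 so that they can serve as the threat weights in the aggregation step.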

A multi-UUV confrontation model generally includes two forms, one is a pure strategy model and the other is a mixed strategy model. When the probability of one of the mixed strategies is 1, it becomes a pure strategy model. In an actual confrontation, both sides need to determine their strategies according to the dynamic information of the confrontation process and then achieve the payoff matrix of both sides. According to equation (1), the dimensions of the maneuver strategy sets and are both . Thus, the maneuver strategy of “A” is and “D” selects . The intuitionistic fuzzy set can be obtained according to Table 1 to quantitatively evaluate the chosen strategy, where and are the membership and nonmembership degrees [23]. Therefore, the fuzzy evaluation matrix under the attribute , of “A” is expressed as

Definition 1. For the intuitionistic fuzzy set , the weighted arithmetic integration factor is defined aswhere is the threat weight which satisfies .
Therefore, the fuzzy payoff matrix can be achieved through the following equation:where , .
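To make the aggregation step concrete, the sketch below applies the standard intuitionistic fuzzy weighted averaging operator to the attribute evaluations of one strategy pair. It assumes that the weighted arithmetic integration of Definition 1 takes this standard form; the function name `ifwa` and the sample evaluations and weights are hypothetical.

```python
import math


def ifwa(pairs, weights):
    """Intuitionistic fuzzy weighted averaging of (mu, nu) pairs.

    Returns (1 - prod((1 - mu_k) ** w_k), prod(nu_k ** w_k)).
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    for mu, nu in pairs:
        if not (0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0 and mu + nu <= 1.0 + 1e-12):
            raise ValueError("each pair must satisfy 0 <= mu, nu and mu + nu <= 1")
    mu_agg = 1.0 - math.prod((1.0 - m) ** w for (m, _), w in zip(pairs, weights))
    nu_agg = math.prod(n ** w for (_, n), w in zip(pairs, weights))
    return mu_agg, nu_agg


# Hypothetical evaluations of one strategy pair under the four attributes.
evaluations = [(0.7, 0.2), (0.5, 0.4), (0.6, 0.3), (0.4, 0.5)]
threat_weights = [0.4, 0.3, 0.2, 0.1]
payoff_entry = ifwa(evaluations, threat_weights)
print(payoff_entry)
```

Repeating this aggregation for every strategy pair would fill the fuzzy payoff matrix entry by entry.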

3. Decision-Making Model Based on Intuitionistic Fuzzy Game Theory

According to the above preliminaries in Section 2, a multi-UUV cooperative dynamic maneuver decision-making model is built in this section based on intuitionistic fuzzy game theory. In actual confrontation, with the change of real-time information, it is difficult for both sides to obtain each other’s strategy in advance, so it is quite hard to produce the optimal strategy. The main characteristic of game theory is that the action schemes adopted by the participants are interdependent, and the gains depend on the strategies adopted by both the participants and others. Then, the optimal solution can also be found under the condition that the information of the opponents is incomplete.

The maneuver game under an uncertain underwater environment discussed in this paper essentially belongs to the category of two-person zero-sum games [12]. Each of the confronting sides is regarded as a player in the confrontation process. Because of the uncertainty of the underwater environment and the weakness of interconnection, the players’ judgment of the situation is often ambiguous and uncertain. Therefore, the two-scale intuitionistic fuzzy set is used to solve such problems.

Let be the fuzzy payoff matrix, and the players “A” and “D” choose pure strategies with probability and , respectively, and denote and , so we call and the mixed strategies of “A” and “D.”

Then, denoteas the mixed strategy spaces of “A” and “D.” So is the intuitionistic fuzzy two-player zero-sum matrix game.

Definition 2. Under the fixed strategy , the expected return of player “A” isaccording to the algorithms of intuitionistic fuzzy sets [24].
Besides, the expected return value of player “A” isThe membership degree and the nonmembership degree of the intuitionistic fuzzy expected return represent the acceptance and rejection of the strategy by the players, respectively. Owing to the nature of two-scale conflict, the score function method is usually used to rank the intuitionistic fuzzy sets.
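The expected return under mixed strategies can be sketched as follows, assuming the standard intuitionistic fuzzy operations in which each payoff entry is weighted by the probability of its pure-strategy pair. The exact expression in [24] may differ, and the matrices `mu` and `nu` below are hypothetical.

```python
import numpy as np


def expected_return(mu, nu, x, y):
    """Intuitionistic fuzzy expected return for mixed strategies x and y.

    mu, nu: membership and nonmembership degrees of the payoff matrix.
    Returns (1 - prod((1 - mu_ij) ** (x_i * y_j)), prod(nu_ij ** (x_i * y_j))).
    """
    mu = np.asarray(mu, dtype=float)
    nu = np.asarray(nu, dtype=float)
    p = np.outer(x, y)  # probability of each pure-strategy pair
    mu_e = 1.0 - np.prod((1.0 - mu) ** p)
    nu_e = np.prod(nu ** p)
    return float(mu_e), float(nu_e)


# Hypothetical 2 x 2 payoff degrees.
mu = [[0.7, 0.4], [0.3, 0.6]]
nu = [[0.2, 0.5], [0.6, 0.3]]
print(expected_return(mu, nu, [0.5, 0.5], [0.5, 0.5]))
```

When both players use pure strategies, the expression reduces to the corresponding entry of the payoff matrix, which gives a quick sanity check of an implementation.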

Definition 3. Assume and which are the intuitionistic fuzzy sets, and which are the scores which represent the degrees of the chosen strategy, satisfying the requirements of decision, and and are the accuracies which represent the accuracies of the chosen strategy, meeting the requirements of decision. Then, the total order relation of these fuzzy sets can be achieved as follows:When , we call is less than , denoted as ;When and , we call is less than , denoted as ;When and , we call is equal to , denoted as .
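The total order of Definition 3 can be sketched in Python as below. The sketch assumes the commonly used score (membership minus nonmembership) and accuracy (membership plus nonmembership) functions; the helper names are illustrative and are not taken from the paper.

```python
def score(a):
    """Score of an intuitionistic fuzzy value a = (mu, nu): mu - nu (assumed form)."""
    return a[0] - a[1]


def accuracy(a):
    """Accuracy of an intuitionistic fuzzy value a = (mu, nu): mu + nu (assumed form)."""
    return a[0] + a[1]


def compare(a, b, tol=1e-12):
    """Return -1 if a < b, 0 if a = b, and 1 if a > b in the total order."""
    ds = score(a) - score(b)
    if abs(ds) > tol:
        return -1 if ds < 0 else 1
    dh = accuracy(a) - accuracy(b)
    if abs(dh) > tol:
        return -1 if dh < 0 else 1
    return 0


print(compare((0.5, 0.3), (0.6, 0.2)))  # lower score, so -1
print(compare((0.4, 0.2), (0.5, 0.3)))  # equal scores, lower accuracy, so -1
```

The score is compared first, and the accuracy is used only to break ties, which is what makes the relation a total order.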

Definition 4. In the intuitionistic fuzzy zero-sum game , if there exist strategy pairs , for which satisfy , we call the mixed strategy as the Nash equilibrium strategy which satisfies the intuitionistic fuzzy game .
Then, we will study the existence of the Nash equilibrium strategy in the next section.

4. Main Result Discussion

To analyze the existence of the Nash equilibrium strategy of the intuitionistic fuzzy game in equation (9), the following Lemma 1 is introduced first.

Lemma 1. (see [18]). There exists a Nash equilibrium of mixed strategy for a game , if the strategy space of the game is a closed and convex set and payoff function is continuous for any .
Based on Lemma 1, we obtain the existence of the Nash equilibrium strategy of the intuitionistic fuzzy game described by the following Theorem 1.

Theorem 1. For the intuitionistic fuzzy game , there exists a Nash equilibrium of mixed strategy.

Proof. The strategy space of an intuitionistic fuzzy game is mixed. So, for any two mixed strategy and , it implies , which means that the strategy space is a closed and convex set.
Besides, the expected return value (10) is the payoff function of the intuitionistic fuzzy game . Equation (10) is continuous for any . Based on Lemma 1, there exists a Nash equilibrium of mixed strategy for the intuitionistic fuzzy game .
This completes the proof of Theorem 1.

Remark 1. Although the existence of the Nash equilibrium of mixed strategy for the intuitionistic fuzzy game (9) can be ensured, it is difficult to obtain an analytical solution of the Nash equilibrium. Thus, most research studies try to calculate the numerical solution of the Nash equilibrium by using optimization algorithms. Based on Definition 4, the analysis of optimization algorithms is given as follows.
According to the definition of the Nash equilibrium of mixed strategy for the intuitionistic fuzzy game in Definition 4, the optimal strategy of “A” is to maximize its intuitionistic fuzzy expected return. On the other side, the optimal strategy of “D” is to minimize its loss. Therefore, according to the maximum and minimum theorem of game theory [25], the nonlinear programming model can be used here to find the optimal confrontation strategy:where is the optimal expected return, which satisfies equation (11), and is the pure strategy of with the mixed strategy of . Notations and are defined in Definition 2 and 3, respectively. Based on Definition 4, optimal expected return and optimal mixed strategy could be calculated.
Equivalently, for the mixed strategy , it haswhere is the optimal expected return, which satisfies equation (12), and is the pure strategy of with the mixed strategy of . Based on Definition 4, optimal expected return and optimal mixed strategy could be calculated.
It is difficult to obtain the optimal solutions of equations (11) and (12) analytically. Section 5 therefore shows how to solve these two optimization problems numerically.

5. Cooperative Dynamic Maneuver Strategy Optimization

The intuitionistic fuzzy payoff matrix is obtained, and the planning model is established according to the above attribute evaluation. In this section, the optimal maneuvering strategy of a multi-UUV game is obtained through the modified particle swarm optimization (MPSO) method. Variable detection vectors are added to widen the particle exploration space in the proposed MPSO method. Moreover, the learning strategy is improved to help the particles jump out of the local optimum. Assuming that the problem is in $D$-dimensional space, the velocity vector and position vector of the $i$th particle are defined as

$$V_i = (v_{i1}, v_{i2}, \ldots, v_{iD}), \quad X_i = (x_{i1}, x_{i2}, \ldots, x_{iD}).$$

The update equations of velocity and position can be expressed as

$$v_{id}^{t+1} = \omega v_{id}^{t} + c_1 r_1 \left(p_{id} - x_{id}^{t}\right) + c_2 r_2 \left(p_{gd} - x_{id}^{t}\right), \quad x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1},$$

where $\omega$ is the linearly decreasing inertia weight coefficient, $c_1, c_2$ are the acceleration coefficients, $r_1, r_2$ are random numbers generated from $[0, 1]$, $P_i = (p_{i1}, \ldots, p_{iD})$ represents the best location for the $i$th particle (individual optimum), and $P_g = (p_{g1}, \ldots, p_{gD})$ represents the best location in the whole population (global optimum).
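For reference, the baseline update described above can be sketched in Python as standard PSO with a linearly decreasing inertia weight. This is not the modified method of the paper; the parameter values (`w_start`, `w_end`, `c1`, `c2`), the function names, and the sphere test function are assumptions chosen for illustration.

```python
import numpy as np


def inertia_weight(t, t_max, w_start=0.9, w_end=0.4):
    """Inertia weight that declines linearly from w_start to w_end."""
    return w_start - (w_start - w_end) * t / t_max


def pso_step(x, v, pbest, gbest, w, c1=2.0, c2=2.0, rng=None):
    """One standard PSO velocity and position update for all particles."""
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(x.shape)
    r2 = rng.random(x.shape)
    v_new = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    return x + v_new, v_new


def minimize(f, dim, n_particles=20, t_max=100, bounds=(-5.0, 5.0), seed=0):
    """Minimize f over a box with plain PSO; returns (best position, best value)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_val = np.apply_along_axis(f, 1, x)
    gbest = pbest[np.argmin(pbest_val)].copy()
    for t in range(t_max):
        x, v = pso_step(x, v, pbest, gbest, inertia_weight(t, t_max), rng=rng)
        x = np.clip(x, lo, hi)  # keep particles inside the search box
        val = np.apply_along_axis(f, 1, x)
        better = val < pbest_val
        pbest[better] = x[better]
        pbest_val[better] = val[better]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, float(pbest_val.min())


def sphere(z):
    """Simple unimodal test function with minimum 0 at the origin."""
    return float(np.sum(z ** 2))


best_x, best_val = minimize(sphere, dim=3)
print(best_x, best_val)
```

In the setting of this paper, the fitness function would instead be derived from the programming models (11) and (12), with each particle encoding a candidate mixed strategy.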

In practice, the fitness function is generally multimodal. When a particle is trapped in a local optimum, the proposed parameter optimization algorithm should be able to change its original trajectory to adaptively explore a new solution space. To achieve this, the learning strategy is applied in the proposed MPSO method. There are two key points to be emphasized here. First, to improve the dynamic performance of PSO, a new velocity update equation is designed. Second, a backward learning strategy based on an adaptive Gauss distribution is proposed to overcome the blindness in stochastic evolutionary search, which enables particles to escape from the local optimum. It should be noted that the proposed MPSO algorithm with the learning ability does not increase the time complexity compared with the original PSO algorithm. The detailed steps of the MPSO with the learning ability are shown in Figure 2.

In recent years, many studies have observed that if the particles converge too fast, they will shrink locally in several generations [26]. This phenomenon leads to a similar search behavior among individuals and loss of population diversity. If the particles are trapped in the local region, it will be difficult to have them jump out of the local optimum because of their similar search behavior and lack of adaptive detection ability. To improve the performance of the PSO algorithm, particles should be able to adaptively change the original trajectory and explore new spaces. The problem here is how to guide the particles to move to different regions, which might become the global optimum, and explore the solution space more extensively. Therefore, in this section, an improved method with an adaptive detection vector is proposed as

The added detection vector could help the particles to cover a wider range of solutions with a larger probability through the adaptive variable detection radius :where is a random number, are the upper and lower boundaries of the problem, is a variable parameter, and represents the iteration times. The speed update equation of the algorithm shows that the group members can explore unvisited regions with high probability in the solution space. The larger detection radius enhances the exploratory behavior of the particle, enables it to leave the current region, and encourages it to search for other regions. A small detection radius enhances the development of particle optimum solutions by searching for a small area near the optimum solution. Hence, the entire feasible solution space can be covered and explored as much as possible using the velocity update equation of the adaptive variable detection vector.
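The role of the adaptive detection radius can be illustrated with the following Python sketch. The exact radius and velocity equations of the paper are not reproduced in this text, so the decay law used here (a random fraction of the search range that shrinks with the iteration count through a hypothetical exponent `k`) is purely an assumption, as are the function names.

```python
import numpy as np


def detection_radius(t, t_max, lower, upper, k=2.0, rng=None):
    """Assumed radius: random fraction of the search range, shrinking with t."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.random() * (upper - lower) * (1.0 - t / t_max) ** k


def detection_vector(dim, radius, rng=None):
    """Random perturbation with each component drawn from [-radius, radius]."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.uniform(-radius, radius, dim)


rng = np.random.default_rng(1)
for t in (0, 50, 99):
    r = detection_radius(t, 100, -5.0, 5.0, rng=rng)
    d = detection_vector(3, r, rng=rng)
    print(t, round(r, 4), np.round(d, 4))
```

Under this assumed law, early iterations tend to produce large perturbations that favor exploration, while late iterations produce small ones that favor refinement near the current optimum. Such a vector would be added to the velocity update of each particle.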

6. Example

In this section, an example is given to verify the effectiveness of the proposed decision-making algorithm. Suppose “A” and “D” are engaged in a two-vs-two underwater confrontation, which means that there are four UUVs: “A1” and “A2” on one side and “D1” and “D2” on the other. Each UUV has seven strategies according to equation (1); hence, both “A” and “D” have 49 strategies. The initial positions of “A1” and “A2” are (−400 m, 100 m, 800 m) and (−400 m, 100 m, 800 m), and those of “D1” and “D2” are (400 m, 100 m, 800 m) and (400 m, −100 m, 800 m), respectively. The velocity, deflection angle, and pitch angle of “A1,” “A2” are 23 m/s, , and and 23 m/s, , and ; the velocity, deflection angle, and pitch angle of “D1,” “D2” are 25 m/s, , and and 25 m/s, , and , correspondingly. Both sides have the same control ability, and the time interval of the confrontation steps is 5 s. It is evident that “D” possesses some advantage at the beginning. Notably, the maximum number of maneuver steps should be decided according to the effectiveness of the UUV used in the confrontation. There are 40 steps in the confrontation process, whose return values are shown in Figure 3. According to Section 5, the obtained return values show that the Nash equilibrium condition of the intuitionistic fuzzy game is satisfied. Based on Figure 3, this is a very weakly dominant strategy equilibrium. In theory, the strategy sets of “A” and “D” are the same, such that their strategy equilibrium is very weakly dominant.

To compare the confrontation performance, “A” employs the cooperative dynamic maneuver decision-making algorithm proposed in this study, and “D” employs the max-min decision-making algorithm during the multi-UUV confrontation process [25]. The three-dimensional confrontation process with five main stages is shown in Figures 4–8. The red dotted line represents the path of “A1,” the red solid line represents “A2,” and the blue dotted and solid lines represent “D1” and “D2,” respectively. The “’’ shows the initial position, and “’’ shows the current position. The confrontation ends when the return value of one side reaches the absolute advantage. For stage 1, the calculated optimal mixed strategy of “A” is presented in Table 3, and the situation is depicted in Figure 4; “D” possesses the dominant position, in which “D1” tries to attack “A1” and “D2” is moving towards “A2”. The optimal mixed strategy of “A” for stage 2 is calculated and listed in Table 4. As shown in Figure 5, “D1” and “D2” try to attack “A2,” and “A1” attempts to turn to escape. Table 5 presents the optimal mixed strategy of “A” in stage 3; in Figure 6, “D1” and “D2” continue to attempt to attack “A2,” but “A2” turns to escape, and “A1” turns to return to the confrontation. In stage 4, the optimal mixed strategy of “A” is shown in Table 6. In Figure 7, “A2” turns continuously and escapes successfully, “A1” also turns and tries to move towards “D1” and “D2,” and “D1” and “D2” turn back to “A2”. The situation changes here, in that “A” now possesses the dominant position. This is also validated in Figure 3, in which the return values change from negative to positive. In the end, both “A1” and “A2” possess the dominant positions, such that “A” achieves the absolute advantage and ends the confrontation, which is illustrated in Table 7 and Figure 8. The example validates the effectiveness of the proposed multi-UUV maneuver decision-making algorithm.

7. Conclusion

In this study, intuitionistic fuzzy sets are introduced into game theory to develop a cooperative dynamic maneuver decision-making algorithm for multi-UUV systems. The characteristics of the underwater environment, including different kinds of uncertainties, are expressed using intuitionistic fuzzy sets. The maneuver game model with intuitionistic fuzzy information is established, and the condition of the Nash equilibrium strategy is presented. Combined with the background and model characteristics, the optimal maneuver strategy is obtained using MPSO in each step of the dynamic confrontation process. Moreover, an example of a multi-UUV dynamic confrontation with several maneuver decision-making steps is used to show the superiority and effectiveness of the proposed maneuver decision-making algorithm.

Data Availability

The data used to support the findings of this study are currently under embargo, while the research findings are commercialized. Requests for data, 6 months after publication of this article, will be considered by the corresponding author.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

The work presented in this paper was a collaboration of all authors. Lu Liu contributed the idea and wrote the paper. Lichuan Zhang did the strategy optimization and reviewed the paper. Shuo Zhang analyzed the performance of the multi-UUV system. Sheng Cao made the software of the simulation.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 51979229), the Fundamental Research Funds for the Central Universities of China (nos. G2018KY0305 and G2018KY0302), China Postdoctoral Science Foundation (nos. 2019M650274 and 2019M663811), the Natural Science Foundation of Shaanxi Province (nos. 2019JQ-164 and 2020JQ-194), the Basic and Applied Research Foundation of Shandong Province (no. 2019A1515111073), and the Opening Foundation of Key Laboratory of Ocean Engineering (Shanghai Jiao Tong University, no. 1817).