Abstract

Reach-avoid game theory is a natural tool for handling conflicts among intelligent agents. Over the past decades, it has been studied under the assumptions of full state information and no time limit on the players. In this article, we extend the problem by requiring the defender to detect the attacker and by imposing a maximum operation time on the attacker. The attacker aims to reach the target region without being captured and before reaching its time limit. The defender can employ strategies to intercept the attacker only when the attacker is detected. A geometric method is proposed to solve this game qualitatively. By analyzing the geometric properties of the Apollonian circle and the detection range, we give the barrier for the case in which the attacker is initially detected, and the attacker’s shortest route that guarantees its arrival at the target region when it is initially outside the detection range. Then, a barrier that separates the game space into the two players’ winning regions is constructed based on the shortest route and the time limit of the attacker. The main contributions are that this paper makes the first attempt to introduce these two concepts simultaneously, which makes the game more practical, and that it provides the complete solution of the game in all possible situations.

1. Introduction

Differential game theory was first introduced in 1965 to analyze the interception of incoming aircraft by missiles [1]. This theory has strong aerospace connotations [28]. Conflicts and cooperation among multiple aircraft or spacecraft can be formulated as differential game problems. For example, Tang et al. [7] solved the problem of spacecraft interception and proposed switching strategies based on differential game theory. Pachter et al. [3, 4] used game theory to investigate the interception of an air threat by two attacking missiles, as well as the situation in which an attacking missile pursues a target aircraft while a defending missile tries to intercept the attacking missile. Liu et al. [6] proposed a novel cooperative guidance law for a leader-follower multiagent system based on robust multiagent differential games. Ye et al. [5] investigated a proximate satellite pursuit-evasion game and obtained the open-loop solution to the game. Isaacs introduced two kinds of differential games in his book. The first is the game of degree, in which there is a continuous payoff function that the adversarial players try to minimize and maximize, respectively. The theory of the game of degree closely parallels optimal control theory, and existing control theories can be used to solve this kind of problem [5, 6]. The second is the game of kind, in which we are only interested in which player wins the game, rather than in finding an equilibrium value of a certain function. Unlike the game of degree, there is still no mature method for solving game-of-kind problems, so many open questions in this research direction remain worth studying. In this work, we investigate a typical game of this kind, the reach-avoid game.
In a classic reach-avoid game, the attacker (or evader) tries to reach the target area while avoiding capture by its opponent. The opponent, often called the defender or the pursuer, strives to prevent the attacker from reaching its goal by intercepting it or by delaying its intrusion indefinitely. This kind of game has received a great deal of interest because it can be used to handle a wide range of problems in many realistic situations [9–16]. For example, in the military field, scenarios in which an interceptor missile intercepts an offensive missile to protect national borders, or in which an attacking bomber invades a target area protected by antiair weapons, can be formulated as reach-avoid games.

Referring to Isaacs’ book [1], the key to this kind of problem is to obtain a hypersurface in the game space, which is called the barrier. This hypersurface divides the game space into two disjoint parts: the winning region of the attacker and that of the defender. In the literature, the Hamilton–Jacobi–Isaacs (HJI) approach is well suited to low-dimensional games. By solving an HJI partial differential equation, the optimal strategies of the players can be obtained, as well as their winning regions. To date, many seminal works on reach-avoid differential games have been put forward, and many variations of this kind of game have been proposed. For example, Margellos and Lygeros [17] developed a framework for formulating reach-avoid games with nonlinear dynamics as optimal control problems. Bhattacharya et al. [18, 19] studied a visibility-based target-tracking game in the presence of a circular obstacle. The barrier was constructed by using Isaacs’ techniques, exploiting the symmetry of the environment in a reduced state space. The authors of [20] obtained the barrier of the lifeline game by integrating the HJI partial differential equations. They proposed a resultant force method for the attacker to balance the active goal of reaching the lifeline and the passive goal of avoiding capture. In the past few years, much work has focused on the reach-avoid game with multiple players [21–24]. This kind of game is a significant research topic and is also difficult to analyze because of the increasing dimension of the state space, the complex cooperative interactions among the players in the same team, and the conflicting and complicated goals of the players in different teams. Chen et al. [25, 26] provided a graph-theoretic maximum matching approach that decomposes the multiplayer reach-avoid game into several two-player subgames and merges the pairwise outcomes of the subgames into a solution of the multiplayer version.
Based on the time derivative of an appropriately defined risk metric, Selvakumar and Bakolas [27] provided a nonlinear state feedback strategy for the evader in a multiplayer reach-avoid game. Zhou et al. [21] presented a decentralized, real-time algorithm for the cooperative pursuit of a single evader by multiple pursuers, based on the idea of minimizing the area of the generalized Voronoi partition of the evader. Yan et al. [28] considered a reach-avoid game with two defenders and one attacker on a rectangular domain. In that work, they proposed a geometric approach to construct the related barrier analytically. They also fused the game of kind and the game of degree by defining payoff functions, so that the optimal strategies of the players can be obtained simultaneously. Building on their previous work, they further studied a multiplayer reach-avoid game in which tasks are assigned so as to guarantee that the largest number of evaders is intercepted, and they obtained the solution of this game between two adversarial teams with an arbitrary number of players in a general convex planar domain. The authors of [29–31] introduced a new kind of reach-avoid game with multiple homogeneous intruders and defenders, which is called the perimeter-defense game. In this game, the defenders are confined to the boundary of the target area. To handle the complexity brought by the high dimensionality of the joint state space, they decomposed the game into games of local subteams and reduced the design of the defense strategy to an assignment problem. Apart from increasing the number of players involved in the game, the so-called target-attacker-defender (TAD) game is also a significant kind of game in this research field. In a TAD game, the attacker aims to capture the target and avoid being intercepted by the defender, while the defender and the target cooperate to prevent the attacker from reaching its goal. This kind of game can be considered a special kind of reach-avoid game whose target area is movable.
Fisac and Sastry [32] investigated a pursuit-evasion-defense differential game in dynamic constrained environments through the solution of a double-obstacle HJI variational inequality. Oyler et al. [33] presented the TAD game in the presence of obstacles, which is better suited to analyzing realistic situations than the general differential game. Liang et al. [34] studied a two-pronged pursuit-evasion problem and, based on the explicit policy method, gave the complete expression of the barrier. They further studied the switching of the players’ strategies for cooperative target defense [35].

All these works have made seminal contributions to the research of the reach-avoid game. However, most of these works consider only the situation when both sides of the game have full state information of the other. In a realistic situation, the agent must be equipped with sensors to acquire information about the environment nearby and the detection ranges of the sensors are always limited. In this work, we assume that the defender is equipped with a radar with a limited detection range and the attacker has full state information of the defender, including the radius of its detection range. This assumption corresponds to the realistic military conflict in which the attacker has the initiative and the defender side has less information than its opponent when the attacker plans its attack. The defender can get access to the position and velocity information of its opponent only when the attacker is inside its detection range. This assumption makes the game more realistic than the classic games and more valuable to investigate. Furthermore, only a few works have considered the maximum operation times of the players involved in the game. For example, Yan et al. [36] considered the aforementioned constraints in the differential game. In fact, due to the limited energy of the agents, the time requirements of their missions, and other practical constraints, the operating times of the agents are always limited. Motivated by this fact, a kind of reach-avoid game with a time limit is considered in this article. Since the defenders are always deployed near the target area, which is friendly to them, and the attackers come from their bases far away from the target area, it is reasonable to assume that the defender has adequate energy to support its operation during the game. Therefore, we only consider the time limit of the attacker in this article. 
In this work, the players have simple motion in a two-dimensional plane, which is partitioned by a straight line called the target line into two disjoint parts: the target region and the play region. The objective of the attacker is to reach the target region before reaching its maximum running time and without being captured. In contrast, once the attacker is detected, the defender employs strategies to intercept the attacker or to force it to terminate in the play region. It is easy to see that if there is no constraint on the attacker, the attacker can always bypass its opponent’s detection range and ensure its victory. We therefore need to impose restrictions on the attacker. Because the time limit and the detection range make the terminal conditions considerably more complex, the classic Hamilton–Jacobi–Isaacs approach is not well suited to the problem proposed here. To overcome this difficulty, this paper introduces a geometric method that solves the new kind of reach-avoid game qualitatively and yields the complete barrier of the game. The shortest trajectories that guarantee the victory of the attacker are also provided in this article.

We divide the game into two parts. In the first part, the attacker is initially located inside the detection range of the defender; in the second part, the attacker is initially located outside it. The main contributions of this article are as follows.
(1) For the first part, we prove that the attacker cannot leave the detection range of the defender. This is a general conclusion that can be used in other games that consider a time limit and a detection range under different game setups. We also obtain the expressions of the related barrier.
(2) We investigate the possible equilibrium terminal states of the game and prove that if the attacker is located on the part of the barrier outside the detection range, it must reach the target line exactly at its time limit when employing its optimal strategy. This means that the optimal strategy of the attacker must be to move along the shortest trajectory that guarantees its arrival at the target line without being intercepted in advance. This conclusion holds in all reach-avoid games that consider a time limit and a detection range, regardless of the game setup. It shrinks the possible set of the game’s terminal states and reduces the difficulty of solving the problem.
(3) Using the properties of the circle and the form of the players’ optimal strategies obtained via the HJI differential equations, we acquire the optimal strategies and the related optimal trajectories of the attacker according to its initial location. The optimal trajectories of the attacker obtained in this work are valid when the defender is equipped with a radar whose detection range is circular or with seekers whose detection area is sectorial.
(4) By comparing the optimal trajectories with the time limit of the attacker, the barrier of the reach-avoid game with a time limit and detection range for the second part is constructed and illustrated.
Combining the barriers of the first part and the second part, the complete barrier of the reach-avoid game with a time limit and detection range is obtained.

To the best of our knowledge, this is the first work to introduce the time limit and the detection range into the reach-avoid game simultaneously. The introduction of these concepts makes the game more complex and more practical than most of the previous works in this research direction. The solution obtained in this article can be used to quickly judge the result of the game from its initial configuration and can therefore handle practical situations that have real-time requirements.

The rest of this article is organized as follows. In Section 2, the formulation of the reach-avoid game with a time limit and detection range is presented, and the game is divided into two parts: the part in which the attacker is inside the detection range and the part in which the attacker is outside the detection range. In Section 3, the possible neutral terminal states of the game are provided, and the barrier of the former part of the reach-avoid game is constructed using a geometric method. In Section 4, we prove that the optimal trajectory of the attacker is the shortest trajectory that guarantees arrival at the target line when it is outside the detection range, and we give the expressions of the optimal trajectories in all possible situations. In Section 5, the barrier of the game is provided and illustrated based on the optimal trajectories obtained and the time limit of the attacker. In Section 6, numerical simulations are provided to verify the correctness of the barrier obtained in this work. Finally, in Section 7, the conclusions and future work are summarized.

For the reader’s convenience, some of the important notations are listed with brief descriptions in Table 1. These notations are also explained in detail in the rest of the paper.

2. Problem Formulation

2.1. The Reach-Avoid Game with a Time Limit and Detection Range

In this section, the problem formulation of the reach-avoid game with a time limit and detection range is provided. The attacker and the defender move in a planar environment , as is shown in Figure 1. The planar space is separated by the target line, which is a straight line and is denoted as , into two parts: the play region and the target region . It can be seen that all the possible target areas in practical situations can be approximated by polygons whose boundaries consist of line segments. Thus, it is essential to investigate the reach-avoid game with the aforementioned game setup. In this article, both players are considered as mass points with simple motion and they are initially located in the play region . A Cartesian coordinate system is built with the origin and the -axis located on the target line . Thus, , , and . The states of the players at the time are denoted by their Cartesian coordinates and . We assume that the origin of the coordinate system is below the initial position of . This means that , and this assumption does not influence the solution to the problem. The state of the game is given by , and the state space of the game is . In this article, we assume that the players have a simple motion with speeds , respectively. The maximum speeds of the players are two constants, which are denoted as , respectively. This model and the game environment are typically encountered in the games of Isaacs and there exist many seminal works analyzing practical conflicts based on them [3, 4, 12]. The maximum speed of the defender is assumed to be larger than that of the attacker, and the speed ratio between these two players is defined as . Their control variables are their instantaneous heading angles . The dynamics of the players are expressed as follows:

The defender can respond to the attacker’s strategy instantly and the control variables are selected simultaneously and can change instantaneously.
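The simple-motion model described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `step`, the tuple representation of positions, and the argument names are assumptions introduced here and are not the paper's notation. Each player is a point mass whose control is its heading angle and whose speed is bounded by its maximum speed.

```python
import math

def step(pos, speed, heading, dt):
    """Advance a point mass with simple motion by a time increment dt.

    pos     -- (x, y) position (hypothetical representation)
    speed   -- current speed, assumed not to exceed the player's maximum speed
    heading -- instantaneous heading angle in radians (the control variable)
    dt      -- time increment; the update is exact while the heading is constant
    """
    x, y = pos
    return (x + speed * math.cos(heading) * dt,
            y + speed * math.sin(heading) * dt)

# Example: an attacker moving vertically downward (heading -pi/2) at speed 2
# for 1.5 time units loses 3 units of height.
attacker = step((1.0, 5.0), 2.0, -math.pi / 2, 1.5)
```

Because the heading can change instantaneously, a trajectory is obtained by calling `step` repeatedly with whatever heading the player's strategy selects at each instant.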

In practical application, due to the constraints like the limited energy, the time requirements of the attacker’s missions, and so on, the maximum operating time of the attacker is always limited. Since the defender is always deployed near the target area which is friendly to it, we assume that the defender has adequate energy to support its operation. Also, the defender can employ state feedback strategies against the attacker only when it has access to the position of the attacker. The defender stays still before it detects the attacker via the sensors it equipped. The shape of the sensor’s detection area, for example, the radar, is usually circular. The detection area of the defender considered in this work is shown in Figure 1. We assume that the attacker knows the detection range and location of its opponent. This assumption is reasonable since the attacker often has the initiative when it plans its attack in the conflict and the defender often has less information than its opponent in realistic situations. Also, we assume that the defender cannot foresee the intrusion to show the initiative of the attacker. Thus, when the attacker is initially outside the detection range, the defender does not know the existence of the attacker and cannot adopt any strategies against it. In this situation, and . However, if the attacker is once detected and then leaves the detection range, the defender will exhibit a random walk until it finds the attacker.

Let be the maximum operating time of . Since is a constant, the possible maximum range of is a constant and is denoted as . From the time limit and simple motion of , we can learn that the possible terminal position set of the attacker, which is called the reachable area of the attacker (RAA), is a circle with the center located at . The radius of this circle is . Let be the detection radius of . The boundary of the detection range is called the circle of detection. The condition that the defender finds the attacker is given as follows:where stands for the Euclidean norm in . The RAA and the detection range of the defender are shown in Figure 1.
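As a minimal sketch of the two geometric conditions just described, the following Python functions test whether the attacker is detected and whether a point lies in the reachable area of the attacker (RAA). The function and parameter names are hypothetical, and treating both boundaries as inclusive is an assumption of this sketch rather than something stated in the text.

```python
import math

def distance(p, q):
    """Euclidean distance between two planar points given as (x, y) tuples."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def is_detected(attacker, defender, detection_radius):
    """True if the attacker lies inside or on the circle of detection.

    Assumption: the boundary counts as detected.
    """
    return distance(attacker, defender) <= detection_radius

def in_reachable_area(point, attacker_start, max_speed, max_time):
    """True if the point lies in the RAA, a disc centred at the attacker's
    initial position with radius max_speed * max_time."""
    return distance(point, attacker_start) <= max_speed * max_time
```

For instance, with an attacker at (3, 4), a defender at the origin, and a detection radius of 5, `is_detected` returns `True` because the separation is exactly 5.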

In this article, only point capture is considered, which means that the interception happens when the separation between the attacker and the defender becomes zero. Suppose that the game terminates at the time , which cannot be larger than . The capture set of this game is given as follows:

The terminal set of the game can be denoted as follows:

The subspace is defined as follows:

If the game state terminates in , the attacker wins the game. Contrarily, the defender wins if the game state terminates in , which is

Thus, the target of the attacker is to drive the game state into , while the defender tries to force the state of the game into . Since the reach-avoid game terminates immediately when the attacker enters the , in this article, we assume that the initial position of the attacker lies in the play region to ensure that the game exists [37].
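The winning conditions above can be illustrated with a small classifier. This is a hedged sketch, not the paper's formal terminal sets: it assumes that the target line is the horizontal axis with the play region above it, it approximates point capture with a small tolerance `tol`, and the function name and return labels are invented for illustration. Boundary cases (capture on the target line, or arrival exactly at the time limit) are labelled `"neutral"`.

```python
import math

def classify_terminal(attacker, defender, t, t_max, tol=1e-9):
    """Classify a game state as 'attacker', 'defender', 'neutral', or 'ongoing'.

    Assumptions of this sketch: the target line is y = 0, the play region is
    y > 0, and capture means the separation is at most tol.
    """
    captured = math.hypot(attacker[0] - defender[0],
                          attacker[1] - defender[1]) <= tol
    reached_line = attacker[1] <= tol
    at_limit = abs(t - t_max) <= tol

    if reached_line and t <= t_max + tol:
        # Arrival coinciding with capture or with the time limit is a tie.
        return "neutral" if (captured or at_limit) else "attacker"
    if captured or t >= t_max - tol:
        # Captured, or out of time, while still in the play region.
        return "defender"
    return "ongoing"
```

A simulation loop could call `classify_terminal` after every time step and stop as soon as the result differs from `"ongoing"`.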

2.2. Main Problem

In this article, we mainly focus on solving the reach-avoid game with a time limit and detection range qualitatively. This kind of problem is known as the game of kind and it requires us to answer the question that which player wins the game with a given game space and , the initial positions of the attacker and the defender, the maximum speeds of the players and , the time limit , and the detection radius . Referring to Isaacs’ book, to solve this problem, we need to construct the hypersurface called the barrier in based on the aforementioned information. When the initial position of is fixed, the barrier degenerates into a 2-dimensional curve that can be illustrated. The barrier divides the game space into two disjoint parts: the attack dominance region (ADR) and the defense dominance region (DDR). These concepts of regions were introduced by Yan et al. [28], which are similar to the concepts of the capture zone and the escape zone in Isaacs’ book. If the game state is located in ADR, the attacker can ensure the successful arrival at without being intercepted by the defender or reaching its time limit, irrespective of the strategies that the defender employs. On the contrary, if the state of the game is located in DDR, the attacker cannot reach its goal, which ensures the victory of the defender.

When the state of the game lies on the barrier , the attacker must apply the optimal control variables and to ensure the state stays outside the DDR, while the defender must choose its optimal strategies and to prevent the state of the game from getting into the ADR. Both players adopt their optimal strategies and the game achieves an equilibrium outcome. This outcome is called the neutral state of the game and is regarded as a situation where neither the attacker nor the defenders win the game. Although it may guarantee the victory of one player, any tiny change could lead to a different result in the game. This kind of situation is also regarded as the equilibrium outcome. The set of the game’s neutral states consists of three parts and can be delineated as [37].

The first part of the set indicates that when the players both adopt their optimal strategies, they coincide with each other once the attacker reaches before reaching its time limit. The equilibrium outcome is caused by the capture. The second part indicates that the attacker reaches its goal at time without being intercepted. The equilibrium outcome is caused by its time limit here. The third part is the aggregation of the other two kinds of equilibrium outcomes. In general, the barrier is constructed based on the set of neutral states. However, it is obvious that the set of the game is complex and it is difficult to construct the barrier via Isaacs’ classic approach since sketching the boundary of the terminal set is a challenge [34].

Also, different from the traditional reach-avoid game which usually assumes that the attacker and the defender have full access to the positions of each other, in this work, the location of the attacker is unknown to the defender if the attacker stays outside the detection range. Thus, when the attacker is outside the detection range, this game degenerates into an optimization problem related to the attacker and this degeneration increases the difficulty of the game. Since there is a time limit of the attacker, the optimal strategy of the attacker is to find the shortest path toward when it remains undetected. Due to the existence of this degeneration, two situations need to be discussed.(1)The attacker is initially located inside the detection range, which means that .(2)The attacker is initially located outside the detection range, which means that .

In the following part of the article, we will discuss these two situations to form the complete solution of the reach-avoid game considered in this work.

3. The Reach-Avoid Game When the Attacker Is inside the Detection Range

In this section, we investigate the reach-avoid game with a time limit and detection range when the attacker is initially located inside the detection range. For simplicity, the time argument is omitted in the rest of the article. In this situation, both the attacker and the defender can adopt strategies against their opponent from the beginning of the game, so the game is a traditional reach-avoid game. This leads to the following theorem:

Theorem 1. When the attacker is inside the detection range, the optimal trajectories of the players are straight lines, and the attacker remains inside the detection range during the rest of the game process.

Proof. Referring to Isaacs’s book, we can easily obtain the optimal strategies of the players via the HJI approach. The attacker strives to reach the target line without being intercepted before reaching its time limit. The payoff function of the game can be expressed as follows:where and are weighting coefficients. The function represents the payoff related to the terminal state of the game, and the function represents the payoff accumulated during the game process. Since we are only interested in the outcome of the game, which is only related to the terminal state, we have . If the attacker tries to win the game, it needs to adopt proper strategies to maximize the payoff function. On the contrary, the defender strives to minimize the function . Thus, the value of the game is as follows:By adding this payoff function, the game can be considered a minimax optimization problem. From the dynamics of the players, which is equation (1), and the calculus of variations, we can obtain the Hamiltonian of this differential game, which is as follows:where is the corresponding co-state vector. Since the Hamiltonian and the dynamics of the game are decoupled in control variables and , according to the HJI approach, we have as follows:Thus, the optimal strategies of the players can be expressed as follows:Additionally, the co-state dynamics are as follows:Hence, all the co-states are constant and the optimal heading angles , remain constant during the game process. In other words, the players’ optimal trajectories are straight lines. Also, according to equation (13), when the attacker is detected, the attacker and the defender will move at their maximum constant speeds during the game process. This theorem is also obtained and utilized in many works which have similar game setups [24, 37]. In fact, when the attacker is initially located in the detection range, the original game in this work becomes the game that has already been studied by Yan et al. [36].
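The chain of reasoning in this part of the proof can be written out under an assumed notation. In the sketch below, the symbols $(x_A, y_A)$, $(x_D, y_D)$, $v_A$, $v_D$, $\theta_A$, $\theta_D$, $v_A^{\max}$, $v_D^{\max}$, and the co-state components $\lambda_{(\cdot)}$ are chosen for illustration only and need not match the paper's own symbols; the running payoff is taken to be zero, the attacker is the maximizer, and the defender is the minimizer, as stated in the text.

```latex
% Assumed simple-motion dynamics:
%   \dot{x}_i = v_i \cos\theta_i, \quad \dot{y}_i = v_i \sin\theta_i, \quad i \in \{A, D\}.
\begin{align}
H &= \lambda_{x_A} v_A \cos\theta_A + \lambda_{y_A} v_A \sin\theta_A
   + \lambda_{x_D} v_D \cos\theta_D + \lambda_{y_D} v_D \sin\theta_D, \\
% The attacker maximizes H over (v_A, \theta_A):
(\cos\theta_A^{*}, \sin\theta_A^{*})
  &= \frac{(\lambda_{x_A}, \lambda_{y_A})}{\sqrt{\lambda_{x_A}^{2} + \lambda_{y_A}^{2}}},
  \qquad v_A^{*} = v_A^{\max}, \\
% The defender minimizes H over (v_D, \theta_D):
(\cos\theta_D^{*}, \sin\theta_D^{*})
  &= -\frac{(\lambda_{x_D}, \lambda_{y_D})}{\sqrt{\lambda_{x_D}^{2} + \lambda_{y_D}^{2}}},
  \qquad v_D^{*} = v_D^{\max}, \\
% H does not depend on the positions, so the co-states are constant:
\dot{\lambda}_{x_i} &= -\frac{\partial H}{\partial x_i} = 0, \qquad
\dot{\lambda}_{y_i} = -\frac{\partial H}{\partial y_i} = 0, \qquad i \in \{A, D\}.
\end{align}
```

Since the co-states are constant, the optimal heading angles are constant as well, which is why both optimal trajectories are straight lines traversed at maximum speed.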
Referring to Isaacs book [1], the attacker and the defender ultimately coincide with each other when adopting the aforementioned strategies, irrespective of the situation that reaches the target region or its time limit. All the possible intersection points in this situation form a circle with a center at and radius of . This circle is called the Apollonian circle. Since they ultimately coincide with each other along straight lines, it is obvious that the distance between the attacker and the defender monotonically decreases to zero. Therefore, once the attacker enters the detection range, it remains inside the detection range during the rest of the game. The proof is completed.

Lemma 1 (see [1]). For a pair of straight-moving agents, the planar space can be partitioned into two disjoint parts by the related Apollonian circle. Moreover, all the points inside the circle can be reached by the attacker before the defender regardless of the attacker’s time limit. Consider the definition of the Apollonian circle: since the attacker and the defender ultimately coincide with each other when they adopt their optimal strategies obtained in Theorem 1, the attacker can never reach any point outside the Apollonian circle once it is detected. Moreover, if the Apollonian circle intersects with the target region, the attacker can ensure its victory by moving directly toward any point which is inside both the Apollonian circle and the target region, regardless of its time limit.
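The Apollonian circle used in Theorem 1 and Lemma 1 can be computed directly from the two positions and the speed ratio. The sketch below assumes that the speed ratio is the attacker's maximum speed divided by the defender's, so that it lies strictly between 0 and 1, and that the target line is the horizontal axis with the play region above it. The function names are hypothetical. The centre and radius follow from the standard locus of points whose distances to the attacker and the defender are in the ratio of their speeds.

```python
import math

def apollonius_circle(attacker, defender, speed_ratio):
    """Return (center, radius) of the Apollonian circle.

    speed_ratio -- assumed to be v_A / v_D with 0 < speed_ratio < 1.
    Points on the circle are reached by both players simultaneously when
    each moves in a straight line at maximum speed; points strictly inside
    are reached by the attacker first.
    """
    if not 0.0 < speed_ratio < 1.0:
        raise ValueError("speed_ratio must lie strictly between 0 and 1")
    k2 = speed_ratio ** 2
    cx = (attacker[0] - k2 * defender[0]) / (1.0 - k2)
    cy = (attacker[1] - k2 * defender[1]) / (1.0 - k2)
    separation = math.hypot(attacker[0] - defender[0],
                            attacker[1] - defender[1])
    radius = speed_ratio * separation / (1.0 - k2)
    return (cx, cy), radius

def circle_meets_target(center, radius):
    """True if the circle touches or crosses the target line y = 0
    (assumed geometry; the time limit is deliberately ignored here)."""
    return center[1] - radius <= 0.0
```

With an attacker at (3, 4), a defender at (0, 4), and a speed ratio of 0.5, the circle is centred at (4, 4) with radius 2, so it does not reach the target line and the capture-free straight-line dash of Lemma 1 is not available.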

From Theorem 1, we can learn that when the attacker is initially located inside the detection range, the game considered in this work is the same as the game with a time limit of the attacker only. The existence of the defender’s detection range does not influence the game process in this situation. It should be noticed that the game with a time limit only has already been studied in former work [36]. The optimal trajectories of the players are straight lines, which are obtained in both Theorem 1 and Yan et al.’s work [36], and we give the expressions without proof here. The barrier of the reach-avoid game with a time limit only, which is denoted as , is given by the following equation:when , where , ; and .

When , the barrier of the given game is as follows:

From equation (16), we can learn that if , the initial position of does not influence the barrier. Thus, in the rest of this article, we mainly focus on the situation that . Some useful conclusions were acquired in Yan’s work [36] that will be utilized to handle the problem considered in this article hereinafter, and we give these conclusions without proof. Due to the symmetry, we investigate the barrier in the condition that only. Analogous analysis can be done for the situation when .

Lemma 2. If is located on the barrier with , the defender has no influence on the attacker, and the optimal trajectory of is to move vertically downward to . In this situation, reaches at time . The equilibrium outcome of the game is caused by the attacker’s time limit only here. Furthermore, from the time limit of , the equations (16) and (17), we can find out that the space is always inside the DDR because the maximum range of the attacker is .

The barrier of the game with a time limit of the attacker only is illustrated in Figure 2. The part of the barrier when the attacker is inside the detection range can be easily obtained from equations (15) and (16), which are expressed in the following theorem.

Theorem 2. The barrier when the attacker is initially located inside the detection range is the part of the barrier inside the detection range. This part of the barrier can be expressed as follows:

Proof. This theorem is obvious and the proof is omitted here.

Also, there is a simple theorem about the expression of the barrier and the parameter .

Theorem 3. For a point on the barrier, if , . Furthermore, if satisfies thatwhere the barrier of the game considered in this article is .

Proof. It is obvious that if there is no defender, the shortest path for the attacker to get to is moving vertically downward. When and the attacker moves vertically downward, all the points on the attacker’s trajectory are outside the defender’s detection range. In other words, the defender cannot adopt any strategy against its opponent in this situation. Thus, the part of the barrier here satisfies that and the attacker reaches at . According to Lemma 2, if , the defender cannot influence the attacker when it moves vertically downward to . So, if , .
From the conclusion obtained above in this proof, we can learn that the fourth part of the equation (15) is always a part of the barrier related to the game considered in this work. According to Theorem 2, if the first three parts of the equation (15) are inside the detection range, these parts of the barrier are also the components of . Thus, in this situation, the whole barrier of the game satisfies . The condition that the first three parts are inside the detection range is that the point is inside the detection range, which can be expressed as follows:Thus, we complete the proof.

4. The Optimal Strategies of the Attacker outside the Detection Range without a Time Limit

In the remainder of the article, we investigate the part of the barrier outside the detection range. Unlike the situations discussed above, when the attacker is outside the detection range, the part of the reach-avoid game before the attacker enters the detection range is an optimization problem that involves the attacker only. Moreover, in this optimization problem, the attacker’s optimal trajectory can be any kind of curve rather than just a straight line. To obtain the barrier of the game, we first need to obtain the attacker’s optimal strategies in all possible situations. The attacker’s initial position and the parameters , , and all influence the related trajectory. According to equation (16), Lemma 2, and Theorem 3, we only need to discuss the situation when and in the following.

Let us first consider the attacker’s initial position. To simplify the problem, we divide the space outside the detection range of the defender into two different areas. The division is shown in Figure 3. The consists of the area that satisfies and the area below the detection range. The second area is above the detection range with . When is initially located in , its optimal strategy is moving vertically downward. This is the shortest trajectory to the target line and cannot detect when it moves along the related trajectory.

When is initially located in , however, it cannot move vertically downward while remaining undetected. The possible trajectories of the attacker here can be divided into two types:
(1) All parts of the attacker’s trajectory are outside the detection range. This kind of trajectory is called the bypassing trajectory.
(2) Some part of the attacker’s trajectory is inside the detection range. This kind of trajectory is called the penetrating trajectory.
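The two trajectory types can be told apart numerically. The following is a minimal sketch, assuming the trajectory is approximated by a polyline of waypoints and the detection range is a disc of radius `rho` centered at the defender; the names `classify_trajectory`, `waypoints`, `defender`, and `rho` are hypothetical and do not come from the article.

```python
import math


def point_segment_distance(c, p, q):
    """Shortest distance from point c to the line segment p-q."""
    px, py = p
    qx, qy = q
    cx, cy = c
    dx, dy = qx - px, qy - py
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: p and q coincide.
        return math.hypot(cx - px, cy - py)
    # Parameter of the orthogonal projection of c, clamped to the segment.
    t = ((cx - px) * dx + (cy - py) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(cx - (px + t * dx), cy - (py + t * dy))


def classify_trajectory(waypoints, defender, rho, tol=1e-9):
    """Return 'penetrating' if any polyline segment passes strictly inside
    the detection disc (assumed circular), otherwise 'bypassing'."""
    for p, q in zip(waypoints, waypoints[1:]):
        if point_segment_distance(defender, p, q) < rho - tol:
            return "penetrating"
    return "bypassing"


if __name__ == "__main__":
    defender, rho = (0.0, 3.0), 1.0
    print(classify_trajectory([(0.0, 5.0), (0.0, 0.0)], defender, rho))
    print(classify_trajectory([(-2.0, 5.0), (-2.0, 0.0)], defender, rho))
```

A segment that only touches the boundary of the disc is treated as bypassing here, which matches the convention that moving along the circle of detection does not count as entering it.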

As mentioned in Section 2.2, the barrier is generally constructed based on the set of the game’s neutral states in this research direction. However, the possible terminal set of the game in this work is complex, which increases the difficulty of solving the game. Thus, it is necessary to analyze and simplify the neutral terminal set to make the game solvable. For the first type, since the attacker remains undetected during the game, the defender cannot capture the attacker, and the related neutral terminal set belongs to . For the second type, we have the following theorem.

Theorem 4. If the attacker moves along a penetrating trajectory, the possible related neutral states cannot be in .

Proof. We prove this theorem by contradiction. Suppose that there is some position of the attacker from which the related optimal strategies lead to terminal states in . Since the attacker adopts strategies of the second type, according to Theorem 1, the part of the attacker’s optimal trajectory inside the detection range must be a straight line. Let be the intersection point of this optimal straight line and the boundary of the detection range. This point is called the penetration point.
As shown in Figure 4, if the optimal strategies of the players lead to the terminal states in , which means and coincide with each other on before , according to the definition of Apollonian circle, the possible terminal positions of and can only be or . and are the left and right intersection points of the target line and the -based Apollonian circle. Let be the terminal time in this situation and be the related optimal trajectory outside the detection range. We have . Notice that , . Thus, it can be easily found out that the distance between and is longer than that between and . Considering that all the points on between and are inside the -based Apollonian circle, the attacker can reach any one of these points without being intercepted, regardless of its time limit. If the terminal neutral position is , the attacker can reach any point between and along the trajectory and the straight line between and that point within . This new trajectory leads to the victory of the attacker without reaching the time limit. So, the terminal neutral position cannot be .
Next, we investigate the situation in which the terminal neutral position is . There always exists some point on the right of that satisfies . Since is located between and , the attacker can reach the target line along and line without being captured or reaching its time limit. In other words, the attacker can always find a better strategy that prevents the game from terminating in , so the possible neutral states cannot be in . The proof is complete.

Thus, the set of the game’s neutral states consists of and when is initially outside of the circle of detection. This means that if the attacker is on the barrier outside the detection range, it must reach its time limit once it reaches the target line. From this conclusion, we can learn that the optimal trajectory of the attacker must be the shortest trajectory which guarantees the arrival of the attacker at without being intercepted in advance, no matter which kind of trajectory it is. If the optimal trajectory is not the shortest, there are always some other trajectories that ensure the arrival of at within . Moreover, the related possible neutral terminal set is here, which contradicts Theorem 4. It can be seen that the shortest path which guarantees the arrival at without being captured in advance has nothing to do with the time limit. If we obtain the shortest path, the barrier can be easily constructed based on the length of the related path and the attacker’s maximum range .

To construct the barrier of the game and reduce the complexity of the investigation, we need to analyze the aforementioned attacker’s shortest paths in all possible situations first. We start with the bypassing trajectory. Here, we give a simple conclusion which will be utilized in the following part of the article.

Lemma 3. As shown in Figure 5, for a point outside a certain circle and a point on the boundary of the circle, the shortest path between them without penetrating the circle is as follows:where , and are two tangent points of which the corresponding tangent lines cross point . The point is one of the tangent points which is nearer to .
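The construction in Lemma 3 (a tangent segment followed by an arc) can be evaluated numerically. The sketch below is an illustration under stated assumptions rather than the article's own formula: the function name `shortest_path_around_circle` and its arguments are hypothetical, and the case in which the boundary point is directly visible from the outside point (so that the straight segment is already the shortest path) is added here for completeness.

```python
import math


def shortest_path_around_circle(p, q, center, radius):
    """Length of the shortest path from an outside point p to a boundary
    point q that never enters the open disc (center, radius)."""
    px, py = p[0] - center[0], p[1] - center[1]
    qx, qy = q[0] - center[0], q[1] - center[1]
    d = math.hypot(px, py)
    if d < radius:
        raise ValueError("p must lie on or outside the circle")
    # Angle at the center between p and either tangent point.
    alpha = math.acos(radius / d)
    # Angle at the center between p and q, clamped against rounding errors.
    cos_beta = (px * qx + py * qy) / (d * math.hypot(qx, qy))
    beta = math.acos(max(-1.0, min(1.0, cos_beta)))
    if beta <= alpha:
        # q is on the arc visible from p: the straight segment is feasible.
        return math.hypot(px - qx, py - qy)
    # Tangent segment to the nearer tangent point, then the arc to q.
    tangent = math.sqrt(d * d - radius * radius)
    arc = radius * (beta - alpha)
    return tangent + arc


if __name__ == "__main__":
    # Unit circle at the origin, p = (2, 0), q diametrically opposite.
    print(shortest_path_around_circle((2.0, 0.0), (-1.0, 0.0), (0.0, 0.0), 1.0))
```

For the unit circle with the outside point at distance 2 and the boundary point on the far side, the tangent length is sqrt(3) and the arc spans an angle of 2π/3.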

According to the theorems given above, we can easily obtain the optimal bypassing trajectory. Here, we give the optimal trajectory without proof.

Lemma 4. If the attacker is initially located inside shown in Figure 3, its optimal strategy is to move vertically downward. If the attacker is initially located in , its optimal strategy without penetrating the circle of detection consists of three parts. The first part is to move toward the lower of the two tangent points whose tangent lines pass through . The second part is to move along the circle of detection to point on the circle whose -coordinate is . The third part is to move vertically downward to .

Next, we investigate the penetrating trajectory. Since a part of the attacker’s optimal trajectory is inside the detection range here, when moves along the trajectory, there must be a point at which the attacker contacts the circle of detection for the first time. The coordinate of this point is denoted as . Also, there must be an intersection point of the circle of detection and the part of the optimal trajectory inside the detection range. This point is denoted as and the coordinates of can be expressed as .

It is obvious that the penetrating trajectory consists of two parts: the part before reaching and the part after reaching . Lemma 3 has already given the optimal trajectory of the former part, so we focus on finding the optimal trajectory of the latter part. Note that and may not be the same point. So, the penetrating trajectory can be further divided into two subtypes:
(1) The first touching point and the penetration point are the same. The related trajectory is called the directly penetrating trajectory.
(2) The first touching point and the penetration point are different points. The related trajectory is called the bypassing-penetrating trajectory.

It is obvious that if and are different points, the optimal path that connects these two points is the arc with a radius of . Moreover, it is clear that is below the point . So, the part of the trajectory after reaching is the combination of the arc and the part of the trajectory after reaching . The former part is simple. In what follows, we need to determine the optimal trajectory of after the attacker reaches and the location of the optimal in order to obtain the optimal trajectory of the part after it reaches .

As mentioned above, the parameters , , and all influence the related trajectory and need to be analyzed. To obtain the possible shortest penetrating trajectory, we first need to find the condition that ensures the existence of this kind of trajectory. From Lemma 4, we can learn that the point must be located in shown in Figure 3. This means that . According to Lemma 2, the space is always inside the DDR. Thus, if the attacker can enter the detection range and acquire an equilibrium outcome, should satisfy that . The conditions and cannot be met simultaneously when . So, if the attacker can move along a penetrating trajectory and reach the target line, it must hold that . This also means that if , the attacker can only move along the bypassing trajectory.

Also, we can learn from Lemma 1 that the attacker can never reach any point outside the Apollonian circle once it is detected. So, if the -based Apollonian circle does not intersect with and enters the detection range from , the attacker can never reach and the defender wins the game. Thus, the attacker can enter the detection range only when the -based Apollonian circle intersects with . This condition means that the lowest point on the related Apollonian circle is on or inside , which can be expressed as follows:

Simplifying this inequality, we have the following:
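The intersection test between the Apollonian circle and the target line can be sketched numerically as follows. This is an illustration under stated assumptions, not the article's expression: the target line is taken to be the horizontal line `y = target_y`, the attacker is assumed to be slower than the defender, and `k` denotes the attacker-to-defender speed ratio (the article's own speed-ratio convention may be the reciprocal). The function names are hypothetical.

```python
import math


def apollonian_circle(attacker, defender, k):
    """Center and radius of the circle of points X with |XA| = k * |XD|,
    where k = v_attacker / v_defender is assumed to satisfy 0 < k < 1."""
    if not 0.0 < k < 1.0:
        raise ValueError("this sketch assumes a slower attacker: 0 < k < 1")
    ax, ay = attacker
    dx, dy = defender
    denom = 1.0 - k * k
    center = ((ax - k * k * dx) / denom, (ay - k * k * dy) / denom)
    radius = k * math.hypot(ax - dx, ay - dy) / denom
    return center, radius


def circle_reaches_target_line(attacker, defender, k, target_y=0.0):
    """True if the lowest point of the Apollonian circle is on or below
    the (assumed horizontal) target line y = target_y."""
    center, radius = apollonian_circle(attacker, defender, k)
    return center[1] - radius <= target_y


if __name__ == "__main__":
    # Attacker below the defender: the circle extends past the target line.
    print(circle_reaches_target_line((0.0, 1.0), (0.0, 3.0), 0.5))
    # Attacker above the defender: the circle stays above the target line.
    print(circle_reaches_target_line((0.0, 4.0), (0.0, 3.0), 0.5))
```

In the intended use, `attacker` would be the penetration point on the circle of detection, so the distance between the two players equals the detection radius at the moment of detection.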

In the remainder of this section, we analyze the possible situations related to the parameters , , and and the attacker’s optimal trajectories in these situations. That is, we determine which kind of trajectory the attacker should choose based on the values of the parameters and provide the precise form of the related paths. From inequality (22), we have a simple conclusion.

Theorem 5. If , the attacker cannot enter the detection range without losing the game. In this situation, the attacker can only move along the bypassing trajectory to bypass the defender’s detection range.

Proof. Let us investigate the inequality (22). Similarly, we only consider the situation when . It is convenient to build a polar coordinate system to help investigate the optimal trajectory in this situation. Let the pole be located on , and the polar axis be parallel to the -axis of the Cartesian coordinate system. The coordinates of can be expressed as . The polar coordinate of is denoted as . Moreover, we have the following equation:where . From the definition of the point we can learn that the distance between and when the attacker enters the detection range is . So, the inequality (22) can be rewritten as follows:where . The function increases monotonically with the raising of . So, if , the inequality (22) is never true. Moreover, the attacker can only move along the bypassing trajectory to bypass the detection range here. This situation can be expressed as follows:which can be rewritten as follows:Thus, we complete the proof.
When there exists some point that satisfies condition (22) and , according to the proof of Theorems 1 and 4, the optimal trajectory of after it reaches must be a straight line connecting and some point on . As shown in Figure 4, this point is , the left intersection point of the related Apollonian circle and , when is on the right of . Moreover, the point is the point below when is on the left of . The optimal trajectory of depends on the location of . Let be the -coordinate of . Here, we give an important theorem.

Theorem 6. If and , the optimal trajectory of must be the penetrating trajectory when it is located in .

Proof. First, as stated above, if the attacker can move along the penetrating trajectory, must be smaller than . Assume there exists some point inside that meets the condition that the related is on the right of . When and the attacker moves along the circle of detection from to ( to in Figure 5), it must reach that before reaching the point . Moreover, the optimal trajectory inside the detection range is moving vertically downward from that point, which is shorter than moving to the point and then moving downward. This means that the bypassing trajectory is not optimal here.
From Figure 4, it can be easily found out that, if the point related to the point is on the left of , there must exist a certain that meets the condition given above. This means that when the variable satisfies the following condition,the optimal trajectory of must be the penetrating trajectory. This inequality can be rewritten as follows:According to Theorem 3, we are investigating the optimal trajectory with here. It is essential to compare the variables and to judge whether there is an interval between these two variables or not. From the condition that guarantees the existence of the penetrating trajectory, we have . This means thatIt can be seen that the variable increases monotonically with the raising of . So, we have as follows:Thus, the interval between these two variables exists and the proof is completed. The conclusion when can be obtained easily due to the symmetry and the related proof is omitted here.
Here, the attacker’s optimal trajectory when remains unsolved. This is the most complicated situation because the bypassing trajectory, the directly penetrating trajectory, and the bypassing-penetrating trajectory are all possible. We need to investigate the length of these three kinds of trajectories simultaneously. Instead of directly analyzing this complicated situation, we will discuss the two subtypes of the penetrating trajectory in the interval of shown in Theorem 6 first. It is essential to judge which one is better to obtain the attacker’s optimal trajectory, and the analysis will lead to a significant conclusion that can help analyze the unsolved situation mentioned above.

Theorem 7. In the situation that , , and , if the point satisfies that , the optimal trajectory of the attacker is a bypassing-penetrating trajectory. If the point satisfies that , the optimal trajectory of the attacker is a directly penetrating trajectory. The variable meets the following condition:

The variable is the distance between the penetrating point and the left intersection point shown in Figure 4. There exists a certain angle that satisfies

If satisfies , the optimal trajectory of the attacker inside the circle of detection is moving vertically downward.

When , if the point satisfies that , the optimal trajectory of the attacker is a bypassing-penetrating trajectory. Moreover, if the point satisfies that , the optimal trajectory of the attacker is a directly penetrating trajectory.

Proof. Let us investigate the situation that first. As stated above, if the attacker can move along a penetrating trajectory, the -based Apollonian circle should have at least one intersection point with . The point satisfies the condition that the related Apollonian circle and have only one intersection point, and the -coordinate of is . From Theorem 6, we can learn that when , the left intersection point related to the point is on the left of the point . This means that when moves clockwise from to , there must exist some point of which the related left intersection point is below it. This situation is shown in Figure 6. Thus, if the attacker moves along a bypassing-penetrating trajectory, the intersection point on this trajectory cannot be below . This means that if moves clockwise from to , the optimal trajectory inside the detection range should be the line segment connecting and the related . Consider that the -coordinate of equals that of the related , the polar angle of should satisfies thatDefining function , we havewhere . The denominator of n is always positive. Hence, if , the numerator of the function equals zero. The numerator of is the equation of a parabola related to . From equation (34), we can learn thatwhich isSubstitute the ends of the interval shown above into , we haveHence, there is only one unique satisfying equation (33). All the points satisfying meet the condition that , which means if the attacker chooses one of these points as , the optimal trajectory of the attacker inside the circle of detection is moving vertically downward.
Next, we investigate the parameter . The distance between and is as follows:We take two possible between and into consideration, the polar angle of the first is and that of the second one is , . This means that the second point is below the first one. From Lemma 3 we can learn that the optimal trajectory between these two points is an arc with a radius of . The distance of the trajectory from the first point to is and the corresponding distance of the second one is . We define the function as follows:Thus, when , the equation (39) becomesWhen , the trajectory related to the first possible is longer than that of the second point, which means that the second one is better, and vice versa. So, if there exists some point that and is above , the attacker should move clockwise from to and enters the detection range there. If is below , the attacker should enter the detection from this point, which means and are the same point. Thus, it is essential to check whether this point exists.
Substituting equation (38) into function , we havewhereIt can be easily found out from (42) that decreases monotonically with the raising of . Equation (43) can be rewritten as follows:Since and decrease monotonically with the raising of , is monotonically increasing. Consider that , we have and the denominator of equation (41) decreases monotonically.
Then, let us investigate equation (44). We define . We take the derivative of equation (44) related to we haveNotice thatSubstituting inequality (47) into equation (46), we have as follows:This means that increases monotonically, and we have . Consider that the denominator of equation (41) decreases monotonically, the equation (41) is a monotonic increasing function. Notice that , the maximum and minimum value of this function is as follows:Consider the definition . The second equation of (49) can be rewritten as follows:It can be easily found that increases monotonically with the raising of . Thus,Since , and function increases monotonically, there is only one special value that meets the following condition:Thus, we prove the existence and the uniqueness of the special value .
Next, we investigate the relationship between variables and . Assume that . If satisfies , according to the definition of the variable , the optimal trajectory of the attacker when is moving clockwise from point to and then moving vertically downward. However, from the definition of , we can find that if the attacker moves vertically downward from the point , it can guarantee its arrival at , and the latter trajectory is shorter than the former one. This contradicts the definition of . Thus, the condition always holds.
Thus, if the point satisfies that , the optimal trajectory of the attacker is a bypassing-penetrating trajectory. If the point satisfies that , the optimal trajectory of the attacker is a directly penetrating trajectory. Moreover, if satisfies that , the optimal trajectory of the attacker is moving vertically downward from there. The conclusions when can be obtained easily from symmetry and are omitted here. This completes the proof.

Now, we can analyze the situation that . With Theorem 7 obtained, this complicated situation is solvable now.

Theorem 8. In a situation where, the optimal trajectory of the attacker is the bypassing trajectory.

Proof. Let us investigate the two kinds of penetrating trajectories first. Consider the situation that , . The variables and defined in the proof of Theorem 7 meets the following condition:According to equation (50) and condition (53), the minimum value of here satisfies thatSince is a monotonic increasing function, the function is always larger than zero in this situation. This means that the directly penetrating trajectory is always longer than the bypassing-penetrating trajectory. Notice that , if the attacker’s trajectory is the bypassing-penetrating trajectory, should move clockwise from to point and then move along a line segment connecting and the left intersection point of and the related Apollonian circle. According to inequalities (27) and (28), when , the left intersection related to the point meets the condition that . From the proof of Theorem 3, we can learn that if there exists some point on the attacker’s optimal trajectory of which the -coordinate is , it needs to move vertically downward from that point. Thus, the optimal trajectory of in this situation is moving clockwise from to point and then moving vertically downward. It can be easily found out that the optimal described above is exactly the bypassing trajectory. Thus, we complete the proof.

Now, the optimal trajectories of the attacker in all possible situations are obtained. The optimal trajectories are summarized as follows:
(1) When , the optimal trajectory of the attacker is the bypassing trajectory.
(2) When and , the optimal trajectory of is the penetrating trajectory. The penetrating trajectory can be further divided into two subtypes in this situation according to Theorem 7.
(3) When and , the optimal trajectory of is the bypassing trajectory.

When , according to Theorem 3, the barrier of the game is the same as the barrier with a time limit of the attacker only. Since the solution to the problem has already been obtained, it is unnecessary to investigate the attacker’s optimal trajectory in this situation. Moreover, it is worth noticing that the optimal trajectory obtained in this section is still valid if the defender is equipped with an infrared seeker or vision sensor of which the detection area is usually a sector rather than a circle. The only difference is that we need to further compare the special points with the endpoints of the sector’s arc to build the optimal trajectory. For example, as shown in Figure 7, if the polar angle of the endpoint is larger than the corresponding and , part of the attacker’s optimal trajectory is moving vertically downward from a point rather than moving clockwise to point . And if the polar angle of the endpoint is between and , the optimal trajectory is the same as that under the condition that the defender has a circular detection range.

With the optimal trajectories in all possible situations obtained, we can now investigate the barrier of the game considered in this article.

5. Game of Kind

In this section, we give the complete solution to the reach-avoid game with time limit and detection range. The expressions of the related barrier in all possible situations are provided. The situations are classified based on the parameters , , and . Consider that the barrier when is obtained in Theorem 3 and depicted in Figure 2; in this section, we mainly focus on the situation that . Also, we only analyze the situation that , the barrier satisfying can be obtained easily via symmetry.

5.1. The Reach-Avoid Game with a Time Limit and Detection Range When

According to Section 4, we can find that the attacker can enter the detection range and reach the target area only when the penetrating point satisfies . This means that the point satisfies . Consider that the area is always inside the DDR. When , we have and there is no possible penetration point from which the attacker can enter the detection range and reach the line . Hence, the attacker can only move along the bypassing trajectory under this condition.

Additionally, we need to investigate the position relation between the circle of detection and the barrier with a time limit only to judge whether the part of the barrier with expression (17) exists. Assume that the circle of detection has intersection points with barriers. The intersection points must meet the condition that

Simplifying equation (55), we have as follows:

Since the coordinates of the points on meet the condition , the solution of equation (56) is as follows:

Substituting equation (57) into the first equation in (56), we have

Since , , and , we have . Consider that , if equation (58) has real roots, it needs to meet the condition thatwhich is

So, the barrier considered in this subsection can be divided into two situations: the one with and the one with . Both situations need to be discussed in detail.

5.1.1. The Reach-Avoid Game with a Time Limit and Detection Range When

Let us first investigate the former situation. In this situation, there exists at most one intersection point between and the circle of detection, which means the part of the barrier with expression (17) does not exist. Considering that the optimal trajectory of the attacker must be the bypassing trajectory, the complete barrier in this situation is as follows:where . The related barrier is depicted in Figure 8. Comparing this barrier with the barrier that accounts only for the time limit of , we find that the detection range expands the ADR, which means it benefits the attacker. It is obvious that if , the expression of the barrier is . The defender cannot influence the barrier under this condition.

5.1.2. The Reach-Avoid Game with a Time Limit and Detection Range When

Next, we investigate the situation when . In this situation, the circle of detection has two intersection points with , which are denoted as and . When , according to Theorem 2, the complete barrier is as follows:

The related barrier is depicted in Figure 9.

5.2. The Reach-Avoid Game with a Time Limit and Detection Range When

According to Subsection 5.1, when , the expression of the barrier is equation (61). Hence, we consider only the situations with here. Moreover, this situation can be further divided into two situations and we will analyze them individually.

5.2.1. The Reach-Avoid Game with a Time Limit and Detection Range When

Consider that , we first investigate the situation that . According to Section 4, the optimal trajectory of the attacker here is the bypassing trajectory. If an attacker is initially located on the barrier and its optimal trajectory is the bypassing trajectory, its initial position should satisfy thatwhere

The first part of the equation represents the length of the segment connecting the initial position and the related first touching point . The second part is the length of the arc connecting and point . Also, the third part is the length of moving vertically downward toward . When is initially located on the barrier, the total length of the three parts should be . According to the definition of bypassing trajectory, the segment is a segmental tangent line of the circle of detection, and the point is the related tangent point. Hence, the corresponding barrier should be as follows:where is the polar angle of the intersection points of and the circle of detection in the polar coordinate system defined in Theorem 5, the variable satisfies that

The corresponding barrier is depicted in Figure 10. The local enlarged figure of Figure 10 is illustrated in Figure 11. The part of the barrier outside the circle of detection on the left consists of two parts.
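The three-part length used to characterize this part of the barrier (tangent segment, arc, vertical descent) can be sketched numerically. The code below is an illustration under assumptions that are not taken from the article: the target line is `y = 0`, the defender sits at `(0, y_d)` with `y_d >= 0`, the detection range is a disc of radius `rho`, the attacker starts above the defender with `-rho < x <= 0`, and the arc ends at the leftmost point of the circle. All names (`bypass_length`, `within_range`, `y_d`, `rho`, `speed`, `t_max`) are hypothetical.

```python
import math


def bypass_length(attacker, y_d, rho):
    """Length of the tangent + arc + vertical-descent path for an attacker
    in the assumed region (above the defender, -rho < x <= 0, outside the
    detection disc) going around the left side of the disc to y = 0."""
    px, py = attacker[0], attacker[1] - y_d  # coordinates relative to defender
    d = math.hypot(px, py)
    if not (-rho < px <= 0.0 and py > 0.0 and d >= rho):
        raise ValueError("attacker is outside the region assumed by this sketch")
    # Part 1: straight segment to the lower tangent point.
    tangent = math.sqrt(d * d - rho * rho)
    # Part 2: arc from the tangent point to the leftmost point (-rho, y_d).
    alpha = math.acos(rho / d)                       # attacker to tangent point
    beta = math.acos(max(-1.0, min(1.0, -px / d)))   # attacker to leftmost point
    arc = rho * (beta - alpha)
    # Part 3: vertical descent from height y_d to the target line.
    return tangent + arc + y_d


def within_range(length, speed, t_max):
    """True if the path can be completed within the attacker's time limit."""
    return length <= speed * t_max


if __name__ == "__main__":
    length = bypass_length((0.0, 5.0), y_d=3.0, rho=1.0)
    print(length, within_range(length, speed=1.0, t_max=6.0))
```

Under these assumptions, an initial position lies on this part of the barrier when the returned length equals the attacker's maximum range, and inside the attacker's winning region when it is smaller.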

5.2.2. The Reach-Avoid Game with a Time Limit and Detection Range When

Next, we investigate the situation that and . It should be noticed that if the attacker enters the circle of detection with satisfying , it cannot reach successfully. Hence, the optimal trajectory of between and point is the bypassing trajectory. According to Theorem 7, the optimal trajectory of the attacker is the penetrating trajectory here. The related barrier is complicated and needs to be discussed in different conditions.

First, we consider the situation that the variables , , and satisfy . Here, if the variable meets the condition , the optimal trajectory of after arriving at is moving clockwise from point to and then moving along the segment . If an attacker is initially located on the barrier and its optimal trajectory is the penetrating trajectory in this situation, its initial position should satisfy that

The function is defined as equation (38), and the function is defined as equation (64). The optimal trajectory of each condition in expression (67) is illustrated in Figure 12.

The related barrier in this situation can be acquired by comparing the length of the optimal trajectories shown above and the time limit and is given by the following expression:

Moreover, the related barrier is depicted in Figures 13 and 14. It can be found that the part of the barrier outside the circle of detection shown in Figure 14 consists of three parts.

Then, we consider the situation that the variables , , and satisfy . In this situation, the attacker needs to move along the boundary of the circle of detection from to the point and then enter the circle of detection. However, we notice that the point is located on the barrier , and the time it takes the attacker to reach from the point is , which means cannot move from to and then reach within its time limit. Thus, in this situation, the barrier of the game satisfies , which is illustrated in Figure 2.

Now, the complete barrier of the original game with a time limit and detection range in all possible situations is provided. The situations and the related barrier are presented in Table 2.

6. Numerical Simulation

In this section, we give several numerical simulations to verify the barrier obtained in this work. As stated in Lemma 4, if the attacker is initially located in , it needs to move vertically downward. In this situation, if , it wins the game, and vice versa. The result is simple and the simulations related to this situation are omitted.

Case 1. The initial positions of the defender and the attacker are , , which is inside ADR, and , which is inside DDR. The speeds of the players are and , respectively. The maximum operating time of is 3 and the maximum detection range of is 3. Thus, the barrier of the game in this situation satisfies equation (62). The Apollonian circle related to intersects the target line while that of does not. Consequently, the attacker can ensure its winning by moving straight toward any point on the part of within the Apollonian circle, and the attacker is intercepted. The results of the games are shown in Figure 15.

Case 2. The initial positions of the defender and the attacker are and , which is inside ADR, respectively. The maximum operating time of is 4 and the maximum detection range of is 4.5. The speed of the attacker is and the speed ratio is . In this situation, the expression of the related barrier is (66). As is shown in Figure 16, the attacker moves along the bypassing trajectory and wins the game. The length of the trajectory is 3.9179, which is less than . In this situation, the attacker remains undetected during the game process. Hence, the defender cannot adopt any strategy against the attacker.

Case 3. The initial positions of the defender and the attacker are and , which is inside ADR, respectively. The maximum operating time of is 4 and the maximum detection range of is 5. The speed of the attacker is and the speed ratio is . In this situation, the expression of the related barrier is (68). Thus, the attacker should move along a bypassing-penetrating trajectory to win the game; the penetrating point is defined in equation (31). The result of the game is shown in Figure 17. After penetrating, the attacker can head for any point on the target line that lies inside the Apollonian circle related to the point and win the game. The total length of the trajectory is 3.9240, which is less than .
If the attacker enters the detection range before reaching , it may lose the game. This situation is shown in Figure 18. Although the attacker moves toward to save time, it cannot reach the target line before its time limit expires.

Case 4. The initial positions of the defender and the attacker are and , which is inside DDR. The rest of the game configuration is the same as in Case 3. As shown in Figure 19, if moves along the bypassing trajectory in this situation, it must terminate inside and lose the game. If moves along the bypassing-penetrating trajectory, it obtains the shortest path by entering the circle of detection at the point . However, as illustrated, even this shortest path is longer than the maximum range , which means the defender can guarantee a win here.
The numerical simulations given above support the correctness of the barrier constructed in this work. With the complete barrier available, an agent participating in the game can easily judge from the initial configuration whether it can win, and adopt a proper strategy against its opponent. If the attacker is located in ADR, it can guarantee its victory by moving along the optimal trajectory obtained in Section 4. If it is located in DDR, it can either abort its mission or sacrifice itself against the defender to help a teammate, if one exists, win the game.

7. Conclusion

In this article, we investigate the solutions to the reach-avoid game with a time limit and a detection range on an unbounded planar domain. This is the first attempt to add both the maximum operating time of the player and the detection range constraint to the classic reach-avoid game at the same time. The main achievements are as follows:
(1) We prove that the attacker cannot leave the detection range of the defender once it is detected, and we obtain the expressions of the barrier when the attacker is initially located inside the detection range.
(2) We investigate the possible equilibrium terminal states of the game and prove that if the attacker is located on the part of the barrier outside the detection range, it reaches the target line exactly at its time limit when it moves along its shortest possible path toward the target line. This conclusion holds generally in the reach-avoid game with a time limit and detection range; it shrinks the set of possible terminal states of the game and reduces the difficulty of solving the problem.
(3) Using the properties of the circle and the form of the players’ optimal strategies obtained via HJI differential equations, we derive the attacker’s optimal strategies and the related optimal trajectories for every possible initial location. Moreover, these trajectories remain valid when the defender is equipped with an infrared seeker or vision sensor, whose detection range is usually a sector.
(4) By comparing the length of the attacker’s optimal trajectory with its time limit, we obtain, for the first time, the expressions of the barrier with a time limit and detection range in all possible situations. All the possible barriers are also illustrated.

To the best of our knowledge, this is the first work to introduce the time limit and the detection range into the classic reach-avoid game simultaneously. These concepts make the game more complex and more practical than most previous works in this research direction. Based on the expressions of the barrier obtained in this article, the result of the game can be determined immediately once the initial configuration is given. The barrier can therefore be used to evaluate battlefield situations with critical real-time requirements. In the future, more practical and complex dynamic models, rather than simple motion, will be considered. Games in complex environments, such as those containing obstacles, will also be one of our focuses.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.