Abstract

In this paper, the issue of optimally modifying the structure of a directed network to guarantee its structural controllability is investigated. Given a directed network, in order to obtain a structurally controllable system, a framework for finding the minimum number of directed edges that need to be added to the network is proposed. After we get these edge-addition configurations, we further calculate the network cost of each optimization scheme and choose the one with the minimum cost. Our main contribution is twofold: first, we provide an algorithm able to find all optimal network modifications in polynomial time; second, we provide a way to calculate the cost of optimizing the network based on the node betweenness. Numerical simulations are given to illustrate the theoretical results.

1. Introduction

The ultimate goal of complex network research is to find effective means to control network behavior and make it serve human beings. Controllability is a basic concept in control theory, which quantifies the ability to steer a dynamical system from any initial state to any final state in finite time [1]. In the past decade, the issue of network controllability for complex dynamical systems has attracted increasing attention and has become a focal topic in interdisciplinary research [2–30]. Numerous works have been reported from rather diverse perspectives on such topics as structural controllability [2, 3]; exact controllability [18]; edge dynamics [19–21]; optimization [22–24]; control energy [25, 26]; and robustness [27, 28].

In the study of network controllability, we usually rely on the theory of structural controllability [31–37]. If there is a matrix pair that is controllable, all structurally equivalent matrix pairs are controllable except for special ill-conditioned cases [31]. Recently, those results have been applied to the controllability analysis of directed complex networks [2, 3, 16, 19, 22, 23] from a graph-theoretic perspective. Note that it is very effective to analyze network controllability by using tools developed under the background of structural control theory [31].

Optimization of the network controllability is of prime importance in real applications. Generally speaking, given a network which is structurally uncontrollable, we can make it structurally controllable through two strategies: (i) add external input signals to the original network [16] and (ii) add new edges to the network topology [23]. Wang et al. provided a method to change the structure of a complex network to make the system structurally controllable when only a single driver node was considered [22]. Zhang and Zhou considered three related problems on determining the minimal cost structural perturbations, including edge additions, edge deletions, and input deletions to make a networked system structurally controllable/uncontrollable [24]. Chen et al. proposed an approach to adding minimum directed edges to the original network so as to ensure structural controllability [23].

Motivated by the above discussions, a minimum-cost optimization method to guarantee structural controllability is investigated in this paper. It should be emphasized that, differing from [23], this work proposes a new method to optimize the network topology and thus ensure network controllability, and it also provides a way to calculate the total cost of optimizing the network. In contrast, [23] only gives a method to optimize the network topology without considering the optimization cost. Note that calculating the optimization cost is exactly the major point of this work. In [27], Zhang et al. considered the problem of network cost. Although the measurement index of edge cost was given therein, no simple and effective method to calculate the total network cost was provided. Compared with the previous works, we not only address the problem of optimizing network controllability but also propose a way to calculate the cost of optimizing the network. The main contributions of this article are as follows. (i) We propose a new method to optimize the network topology so as to ensure the network controllability. (ii) We propose an algorithm to solve the optimal edge-addition configuration problem. (iii) After getting all the edge-addition configurations, we introduce network cost measurement indexes to calculate the cost of optimizing the network, based on which we can determine the optimal edge-addition configuration with minimum cost. The results of this paper can provide both theoretical and technical guidance for the analysis and control of real complex networks. The obtained results shed some light on the transformation of a structurally uncontrollable network into a structurally controllable one at a low cost. For example, in a power network, transmission lines with the lowest cost can be set up among substations to safely and efficiently control the entire power network.

The rest of the paper is organized as follows. Section 2 introduces the notation and terminology used in this paper. Problem formulation and preliminaries on graph theory are introduced in Section 3. The main results are given in Section 4. In Section 5, a network cost index is given to determine the minimum-cost edge-addition configuration. Finally, the summary of this paper and the prospect of future research are presented in Section 6.

2. Notations

In this paper, $\mathbb{R}$ denotes the set of real numbers, $\mathbb{R}^{n}$ is the space of real $n$-vectors, and $\mathbb{R}^{n \times m}$ is the space of real $n \times m$ matrices. For a set $S$, its cardinality is denoted by $|S|$.

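A directed graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ consists of a node set $\mathcal{V}$ and an edge set $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$. Here, $(v_i, v_j) \in \mathcal{E}$ implies that there exists a directed edge from node $v_i$ to node $v_j$, and $v_i$ and $v_j$ are called the parent node and the child node, respectively. We can also say that the tail node $v_i$ is pointing toward the head node $v_j$. For a digraph $\mathcal{G}$, a directed path of length $k$ from node $v_1$ to node $v_{k+1}$ is defined as a sequence of distinct edges of the form $(v_1, v_2), (v_2, v_3), \ldots, (v_k, v_{k+1})$, in which all nodes are distinct. Here, node $v_1$ is called the beginning node and $v_{k+1}$ the end node of the directed path. A node $v_j$ is reachable from $v_i$ if there exists a directed path in $\mathcal{G}$ from $v_i$ to $v_j$. A directed graph $\mathcal{G}_s = (\mathcal{V}_s, \mathcal{E}_s)$ is a subgraph of $\mathcal{G}$ if $\mathcal{V}_s \subseteq \mathcal{V}$ and $\mathcal{E}_s \subseteq \mathcal{E}$. A directed graph is said to be strongly connected if there exists a directed path from every node to every other node. A strongly connected component (SCC) is a maximal subgraph that is strongly connected. Particularly, a source SCC has no incoming edges from another SCC.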
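The notions of SCC and source SCC defined above can be made concrete with a small sketch. The code below is only an illustration, not the procedure used in this paper: it assumes the digraph is given as a node list and a list of (parent, child) pairs, uses Kosaraju's two-pass search to find the SCCs, and the function names `strongly_connected_components` and `source_sccs` are hypothetical.

```python
from collections import defaultdict


def strongly_connected_components(nodes, edges):
    """Return the SCCs of a digraph as frozensets (Kosaraju's two-pass DFS)."""
    succ, pred = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)

    # Pass 1: record nodes in order of DFS completion on the original graph.
    order, seen = [], set()
    for root in nodes:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(succ[root]))]
        while stack:
            node, it = stack[-1]
            nxt = next((w for w in it if w not in seen), None)
            if nxt is None:
                order.append(node)
                stack.pop()
            else:
                seen.add(nxt)
                stack.append((nxt, iter(succ[nxt])))

    # Pass 2: sweep the reversed graph in decreasing completion order.
    comps, assigned = [], set()
    for root in reversed(order):
        if root in assigned:
            continue
        comp, stack = set(), [root]
        assigned.add(root)
        while stack:
            node = stack.pop()
            comp.add(node)
            for w in pred[node]:
                if w not in assigned:
                    assigned.add(w)
                    stack.append(w)
        comps.append(frozenset(comp))
    return comps


def source_sccs(nodes, edges):
    """Return the SCCs that have no incoming edge from another SCC."""
    comps = strongly_connected_components(nodes, edges)
    comp_of = {v: c for c in comps for v in c}
    has_incoming = {comp_of[v] for u, v in edges if comp_of[u] != comp_of[v]}
    return [c for c in comps if c not in has_incoming]


# Illustrative five-node digraph (not taken from the paper's figures).
example_nodes = [1, 2, 3, 4, 5]
example_edges = [(1, 2), (2, 3), (3, 2), (4, 5), (5, 4), (5, 3)]
print(source_sccs(example_nodes, example_edges))
```

In this toy graph the SCCs are {1}, {2, 3}, and {4, 5}; only {2, 3} receives edges from other SCCs, so the other two are source SCCs.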

A digraph contains a dilation if there is a subset of nodes $S$ such that the common-neighbor set of $S$, denoted by $T(S)$, has fewer nodes than $S$ itself, i.e., $|T(S)| < |S|$. Here, $T(S)$ is the set of nodes $v_j$ for which there is a directed edge from node $v_j$ to some node in $S$. Notice that a digraph contains no dilation if each node has its own independent parent node. It is intuitively plausible that a dilation is a subgraph containing a relatively large number of nodes that are “dominated” by a small number of other nodes.

3. Problem Statement and Preliminaries

Consider a linear time-invariant (LTI) networked dynamical system described by

$$\dot{x}(t) = A x(t) + B u(t), \tag{1}$$

where $x(t) \in \mathbb{R}^{n}$ is the state vector of all nodes; $u(t) \in \mathbb{R}^{m}$ is the input vector; $B \in \mathbb{R}^{n \times m}$ is the input matrix identifying the nodes that are directly controlled, and $A \in \mathbb{R}^{n \times n}$ is the adjacency matrix of the underlying network. The overall networked system described by (1) can be denoted by the matrix pair $(A, B)$.

Definition 1. Linear network (1) is said to be state controllable if, for any initial state $x(0) = x_0$ and any final state $x_f$, there exist a finite time $t_f > 0$ and an input $u(t)$, $t \in [0, t_f]$, such that $x(t_f) = x_f$.

If networked system (1) is state controllable, we can say that the matrix pair $(A, B)$ is state controllable.

Definition 2. (see [16, 31]). A linear control system $(A, B)$ is a structured system if the elements in $A$ and $B$ are either fixed zeros or independent nonzero parameters. Both matrices $A$ and $B$ are called structured matrices.

In this paper, it is assumed that we only know the structure of the matrices $A$ and $B$. This means that we know which elements in the matrices are fixed to zero and consequently which elements are nonzero free parameters.

Definition 3. A linear control system $(A, B)$ is structurally controllable if we can assign values to the nonzero parameters in $A$ and $B$ such that the resulting system is state controllable in the sense of Kalman defined in Definition 1.
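As a small worked illustration of Definition 3, recall the standard Kalman rank criterion: the pair $(A, B)$ is state controllable if and only if the controllability matrix has full row rank. The two toy systems below are illustrative assumptions and are not taken from this paper; the entry $a_{ij}$ is assumed to be nonzero when there is an edge from node $j$ to node $i$, and a single input drives node 1.

```latex
% Kalman rank criterion for an n-dimensional LTI system
\[
\operatorname{rank} \mathcal{C} = n, \qquad
\mathcal{C} = \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix}.
\]

% Chain 1 -> 2 with input on node 1: full rank for any nonzero a_{21}, b_1,
% so the structured system is structurally controllable.
\[
A = \begin{bmatrix} 0 & 0 \\ a_{21} & 0 \end{bmatrix}, \quad
B = \begin{bmatrix} b_1 \\ 0 \end{bmatrix}, \quad
\mathcal{C} = \begin{bmatrix} b_1 & 0 \\ 0 & a_{21} b_1 \end{bmatrix}, \quad
\det \mathcal{C} = a_{21} b_1^{2} \neq 0 .
\]

% Star 1 -> 2, 1 -> 3 with input on node 1: rank 2 < 3 for every parameter
% choice, so the structured system is not structurally controllable.
\[
A = \begin{bmatrix} 0 & 0 & 0 \\ a_{21} & 0 & 0 \\ a_{31} & 0 & 0 \end{bmatrix}, \quad
B = \begin{bmatrix} b_1 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathcal{C} = \begin{bmatrix} b_1 & 0 & 0 \\ 0 & a_{21} b_1 & 0 \\ 0 & a_{31} b_1 & 0 \end{bmatrix}.
\]
```

In the star example, nodes 2 and 3 share node 1 as their only parent, so the rank deficiency holds for every choice of the free parameters.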

A structured system can be represented by a directed graph whose nodes denote the (state and input) variables and whose edges indicate the connections between variables [31]. In this paper, a structured system $(A, B)$ is denoted by a directed graph $\mathcal{G}(A, B) = (\mathcal{V}, \mathcal{E})$, in which $\mathcal{V} = \mathcal{X} \cup \mathcal{U}$ is the node set and $\mathcal{E} = \mathcal{E}_{X,X} \cup \mathcal{E}_{U,X}$ is the edge set. In particular, $\mathcal{X} = \{x_1, \ldots, x_n\}$ is the set of state nodes, corresponding to the nodes in the original network; $\mathcal{U} = \{u_1, \ldots, u_m\}$ is the set of input nodes corresponding to the inputs; $\mathcal{E}_{X,X}$ is the set of edges between state nodes; and $\mathcal{E}_{U,X}$ is the set of edges between input nodes and state nodes. Throughout the paper, we suppose that any input signal is applied to only one node, referred to as a driver node. A state node being reachable means that there is a directed path from some input node to this state node. Similarly, a node set is reachable if each node in the set is reachable. Notice that, in the remainder of the paper, unless otherwise specified, reachability is only used for the state nodes.

In a digraph, an edge subset is a matching if no two edges in share a common parent node or a common child node. A matching of maximum size is called a maximum matching. The maximum matching of a digraph can be denoted by mapping the digraph to its bipartite representation. Consider a directed network , whose bipartite representation can be described by , in which and . That is, each state node of the original digraph is split into two nodes and . Here, if and if . To describe the relationship between the digraph and its bipartite graph, we use a signal-notation mapping to map directed edges from the system digraph into undirected edges of the system bipartite graph as follows: and . Also, we have that and .

Definition 4. The element in the matrix if there is a directed path from node to node . Set , . The matrix is called reachable matrix.

If only one external input is applied to node 1, then the first row of the matrix can be used to determine which nodes are unreachable.
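A reachable matrix of this kind can be computed with one breadth-first search per node. The following is a minimal sketch under stated assumptions: nodes are numbered 1 to n, the single input is applied to node 1, every node is treated as reaching itself, and the names `reachable_matrix` and `unreachable_from_node_1` are hypothetical helpers rather than notation from this paper.

```python
from collections import deque


def reachable_matrix(n, edges):
    """Return R with R[i-1][j-1] = 1 if node j is reachable from node i."""
    succ = {v: [] for v in range(1, n + 1)}
    for u, v in edges:
        succ[u].append(v)
    R = [[0] * n for _ in range(n)]
    for s in range(1, n + 1):
        R[s - 1][s - 1] = 1  # assumption: each node reaches itself
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in succ[u]:
                if not R[s - 1][v - 1]:
                    R[s - 1][v - 1] = 1
                    queue.append(v)
    return R


def unreachable_from_node_1(R):
    """Read the unreachable nodes off the zeros in the first row of R."""
    return [j + 1 for j, r in enumerate(R[0]) if r == 0]


# Illustrative five-node digraph (not one of the paper's figures).
example_edges = [(1, 2), (2, 3), (3, 2), (4, 5), (5, 4), (5, 3)]
R = reachable_matrix(5, example_edges)
print(R[0], unreachable_from_node_1(R))
```

Here nodes 4 and 5 cannot be reached from node 1, which shows up as zeros in columns 4 and 5 of the first row.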

Definition 5. The element in the matrix if edge is one of the matching edges of a maximum matching about a bipartite graph. The matrix is called maximum matching matrix.

The maximum matching of a directed graph is not unique. Therefore, the corresponding maximum matching matrix is not unique. The number of nonzero elements in the maximum matching matrix equals the number of matching edges in the maximum matching, and each row and each column has at most one nonzero element. A column $j$ consisting entirely of zeros indicates that node $j$ in the network does not have its own independent parent node.
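One possible way to obtain a maximum matching matrix is the classical augmenting-path method on the bipartite representation. The sketch below is an illustration under assumptions, not this paper's procedure: rows index parent nodes, columns index child nodes, nodes are numbered 1 to n, and the function names are hypothetical. Because maximum matchings are not unique, it returns just one of the possible matrices.

```python
def maximum_matching_matrix(n, edges):
    """Return M with M[i-1][j-1] = 1 if edge (i, j) is a matching edge."""
    succ = {u: [] for u in range(1, n + 1)}
    for u, v in edges:
        succ[u].append(v)
    parent_of = {}  # child node -> parent node matched to it

    def augment(u, visited):
        # Try to give parent u a child, re-routing earlier choices if needed.
        for v in succ[u]:
            if v in visited:
                continue
            visited.add(v)
            if v not in parent_of or augment(parent_of[v], visited):
                parent_of[v] = u
                return True
        return False

    for u in range(1, n + 1):
        augment(u, set())

    M = [[0] * n for _ in range(n)]
    for child, parent in parent_of.items():
        M[parent - 1][child - 1] = 1
    return M


def nodes_without_independent_parent(M):
    """Nodes whose column in M is entirely zero."""
    n = len(M)
    return [j + 1 for j in range(n) if all(M[i][j] == 0 for i in range(n))]


# Illustrative five-node digraph (not one of the paper's figures).
example_edges = [(1, 2), (2, 3), (3, 2), (4, 5), (5, 4), (5, 3)]
M = maximum_matching_matrix(5, example_edges)
print(nodes_without_independent_parent(M))
```

In this toy graph node 1 has no incoming edge at all, so its column is zero in every maximum matching matrix.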

Definition 6. Consider a directed network, in which only one external input signal is applied to node 1. If , , then such reachable matrix is called matrix. For example,

Obviously, if the reachable matrix of a network is a matrix, then all the state nodes in the network are reachable.

Definition 7. Consider a directed network, in which only one external input signal is applied to node 1. If the maximum matching matrix has a unique nonzero element in each column except for the first column, then such maximum matching matrix is called matrix. For example,

Obviously, if the maximum matching matrix of a network is matrix, then there is no dilation in the network.

A necessary and sufficient condition for the structural controllability of an LTI system is given as follows [31].

Lemma 1. (see [31]). The pair $(A, B)$ is structurally controllable if and only if the following two conditions are satisfied simultaneously:
(1) Every state node in the digraph is reachable from some input node
(2) The digraph contains no dilations

Then, we have the following controllability criterion.

Theorem 1. A directed network with is structurally controllable if and only if the following two conditions are satisfied simultaneously:(1)The reachable matrix of is a matrix(2)The maximum matching matrix of is a matrix
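The two conditions can be checked together in a few lines. The sketch below is an illustration built directly on the two conditions of Lemma 1, assuming a single input applied to node 1 and nodes numbered 1 to n; the function name `is_structurally_controllable` is hypothetical. Node 1 is skipped as a child in the matching step because the input node can serve as its independent parent.

```python
def is_structurally_controllable(n, edges):
    """Check reachability and absence of dilations for one input on node 1."""
    succ = {u: [] for u in range(1, n + 1)}
    for u, v in edges:
        succ[u].append(v)

    # Condition (1): every state node is reachable from the driver node 1.
    seen, stack = {1}, [1]
    while stack:
        u = stack.pop()
        for v in succ[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    if len(seen) < n:
        return False

    # Condition (2): some matching gives each of the nodes 2..n its own
    # independent parent (node 1 is matched by the input node).
    parent_of = {}

    def augment(u, visited):
        for v in succ[u]:
            if v == 1 or v in visited:
                continue
            visited.add(v)
            if v not in parent_of or augment(parent_of[v], visited):
                parent_of[v] = u
                return True
        return False

    for u in range(1, n + 1):
        augment(u, set())
    return len(parent_of) == n - 1


print(is_structurally_controllable(3, [(1, 2), (2, 3)]))  # chain
print(is_structurally_controllable(3, [(1, 2), (1, 3)]))  # star with dilation
```

The chain passes both checks, while the star fails the second one because nodes 2 and 3 compete for the single parent node 1. Adding the edge from node 2 to node 3 removes that dilation.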

In this paper, given a structurally uncontrollable directed network, we study the problem of adding the fewest edges to modify the topology so as to obtain a structurally controllable system. After we get these optimal edge-addition configurations, we need to calculate the network cost of each optimization scheme and choose the one with the minimum cost. In summary, the problem is given as follows.

Problem 1. Given the pair with , finds.t. the reachable matrix of digraph is a matrix and the maximum matching matrix is a matrix,

where denotes the number of nonzero elements in a matrix .

If is structurally controllable, we refer to the matrix as an effective perturbed matrix and to in (4) as the modified matrix. The aim of this paper is to provide a characterization of all possible modified matrices by using graph-theoretical tools and design an algorithm to obtain such a solution.

4. Network Topology Optimization to Ensure Structural Controllability

Note that the system digraph is denoted by . Therefore, given an effective perturbed matrix , we can relate a digraph to the perturbed structured system , which we denote by , where the edge set is such that if and only if . Since the matrix is closely related to the , we can rewrite Problem 1 in a different way.

Problem 2. Given the system digraph with , finds.t. the reachable matrix of the digraph is a matrix and the maximum matching matrix is a matrix.

Additionally, define a feasible edge-addition configuration as a set of directed edges that is a feasible solution of Problem 2.

The solutions to Problem 2 are given in this section. First, a definition is introduced to describe the smallest set of edges needed to achieve reachability, i.e., satisfy condition (1) in Lemma 1. Let be the system digraph. The set of state nodes can be divided into two sets based on their reachability, namely, , where is the set of reachable nodes and is the set of unreachable nodes. In addition, assume that there are source SCCs that are unreachable, whose node sets are denoted by . In order to make the nodes in these unreachable source SCCs reachable, we need to add a new edge between the reachable node and the node in the source SCC so that all the nodes in the source SCC are reachable. Moreover, since the source SCC has outgoing edges pointing to other nodes, the unreachable nodes that are connected to the source SCC will also become reachable.

Definition 8. A connected edge is an added edge from a reachable node to an unreachable node. A set made up of connected edges is called a connected edge set.

Algorithm 1 is illustrated in Figure 1. The connected edge set contains the minimum number of added edges required to ensure that all the state nodes are reachable. Obviously, the connected edge set only satisfies condition (1) in Lemma 1 and cannot by itself guarantee the structural controllability of the networked system. To ensure structural controllability of the system, the added edges must satisfy two conditions: (i) they include a set of connected edges and (ii) the “tail” node of the new edge is not used as an independent parent node in the maximum matching, while the “head” node of the edge has no independent parent node.

Input: reachable nodes sets and unreachable nodes sets
(1)Order the unreachable source SCCs:
(2)Select any edge in which is in the set of reachable nodes and is in the first source SCC
(3)Merge all reachable state nodes into a larger set (we can do it using either BFS/DFS or union-find)
(4)Call Steps 2-3 recursively until all unreachable source SCCs become reachable
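The steps above can be sketched as follows. This is a minimal illustration under assumptions, not a definitive implementation: the unreachable source SCCs are supplied by the caller as a list of node sets, the smallest-numbered eligible nodes are picked as tail and head purely for determinism (any reachable tail and any head inside the SCC would do), and a loop replaces the recursive call in Step 4. The function name `connected_edge_set` is hypothetical.

```python
def connected_edge_set(edges, reachable, unreachable_source_sccs):
    """Add one edge per unreachable source SCC until every node is reachable.

    Returns the list of added (tail, head) edges and the final reachable set.
    """
    succ = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)

    reachable = set(reachable)
    added = []
    for scc in unreachable_source_sccs:  # Step 1: process SCCs in a fixed order
        # Step 2: pick a tail among reachable nodes and a head inside the SCC.
        tail, head = min(reachable), min(scc)
        added.append((tail, head))
        # Step 3: merge everything now reachable through the new edge (DFS).
        reachable.add(head)
        stack = [head]
        while stack:
            u = stack.pop()
            for v in succ.get(u, []):
                if v not in reachable:
                    reachable.add(v)
                    stack.append(v)
    return added, reachable


# Illustrative seven-node digraph (not one of the paper's figures).
example_edges = [(1, 2), (2, 3), (3, 2), (4, 5), (5, 4), (5, 3), (6, 7)]
added, now_reachable = connected_edge_set(
    example_edges, reachable={1, 2, 3}, unreachable_source_sccs=[{4, 5}, {6}]
)
print(added, sorted(now_reachable))
```

One edge per unreachable source SCC is needed because a source SCC has no incoming edge from any other SCC, so no edge added elsewhere can make it reachable.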

Theorem 2. Consider a directed network , whose bipartite representation is denoted by . Let be a maximum matching, be a node set in which each node is not used as independent parent node, and be a node set with no independent parent nodes. A set is a feasible edge-addition configuration if and only if it contains the union of the following two sets:(1) is the set of connected edges(2)

Theorem 2 provides some feasible edge-addition configurations, but we need to find the optimal one from these configurations. Therefore, the first task is to select the optimal solution from these feasible solutions. From the above discussion, it can be found that, after determining the maximum matching of a bipartite graph, if those unmatched nodes (nodes without independent parent nodes) happen to be distributed in different source SCCs, then the added edges just meet both conditions in Lemma 1, which is exactly what is needed. To explore this situation, we introduce the following concepts.

Definition 9. Consider a directed network , whose bipartite representation is denoted by . Let be a maximum matching associated with . Moreover, let be the set of nodes in which each node has no independent parent nodes. If there is at least one node , in an unreachable source SCC, then such an unreachable source SCC is called an ideal source SCC.

Whether an unreachable source SCC is an ideal source SCC depends mainly on the specific maximum matching. Because a directed network may admit more than one maximum matching, whether a node has an independent parent node cannot be determined until a specific maximum matching is fixed.

Definition 10. The of the directed network is defined as the maximum number of ideal source SCCs in all the maximum matchings.

We can determine a maximum matching attaining using Algorithm 2.

Input: A directed network with ;
(1)Write the reachable matrix of the directed network, and determine the unreachable node set in the network by the position (column ordinal) of the 0 element in the first row.
(2)Find the unreachable source SCCs.
(3)Select the nodes located in the source SCCs from the unreachable nodes set and mark their column ordinals.
(4)By using the marked column ordinals to identify an ideal maximum matching . Its corresponding maximum matching matrix is . The column ordinals corresponding to all 0 columns in the matrix need to match the marked column ordinals as much as possible.
(5)According to Step 3, an ideal maximum matching matrix can be obtained. From the matrix , the nodes corresponding to the matching column ordinals can be found.
(6)Based on the distribution of the nodes found in Step 5 in the source SCCs, can be calculated.

We take Figure 2 as an example to illustrate Algorithm 2.

The reachable matrix corresponding to the digraph in Figure 2(a) is expressed as follows:

The unreachable node set can be determined as by the position of the 0 element in the first row of . Moreover, there are two unreachable source SCCs (red box), whose node sets are and , respectively. Then, we can label columns 3, 5, and 6 of as follows:

Figure 2(b) shows the bipartite representation of the original directed network (Figure 2(a)). In order to make the column ordinals corresponding to all 0 columns in the maximum matching matrix coincide with the marked column ordinals as much as possible, an ideal maximum matching is determined in Figure 2(c), and its corresponding maximum matching matrix is expressed as follows:

There are at most two 0 columns in that are consistent with the marked column ordinals, and the corresponding node is located in , and node is located in , so .

If all the state nodes that are not used as independent parent nodes are unreachable, then additional edges are needed to satisfy condition (1) in Lemma 1. Therefore, in this case, calculating according to Algorithm 2 does not necessarily lead to an optimal configuration of added edges. To illustrate this statement, we take Figure 3 for example.

Next, we will propose Algorithm 3 to solve Problem 2. Algorithm 3 is mainly divided into the following four steps:
Step 1. All the state nodes in the directed network are classified into a reachable node set and an unreachable node set, respectively, based on the node reachability.
Step 2. Determine the ideal maximum matching to get . If there exist some unreachable nodes that are not used as independent parent nodes in the ideal maximum matching, then we alter the matching by finding a directed path rooted at the input node.
Step 3. Add some edges to satisfy Lemma 1. These edges start at reachable nodes that are not used as independent parent nodes and end at nodes that have no independent parent nodes in unreachable source SCCs.
Step 4. If there are unreachable nodes that are not used as independent parent nodes, then we need to add a set of connected edges to ensure that both conditions of Lemma 1 are satisfied.

Input: A directed network ;
(1)All the state nodes in the network are classified into a reachable node set and an unreachable node set . Then, determine the unreachable source SCCs in the directed network .
(2)Using Algorithm 2 to get and .
(3)if , then
(4)Find an unreachable node , and thus add the edge ;
(5);
(6)else
(7)Set ;
(8)end if
(9)Obtain the unique set of disjoint directed paths in , where the beginning node of each is in some unreachable source SCCs and the end node is not used as a separate parent node;
(10)Let , are the beginning and end nodes of each path , respectively;
(11)Let ;
(12)if , then
(13)Find a reachable node ;
(14)for do
(15);
(16)
(17)end for
(18)if , then
(19);
(20);
(21)when
(22)end if

A structurally uncontrollable system contains unreachable nodes and/or dilations. Therefore, we need to optimize the network topology by adding edges to ensure structural controllability. Algorithm 3 is given to obtain the optimal edge-addition configurations that solve Problem 2.

Next, an example in Figure 4 is given to illustrate Algorithm 3.

5. Network Optimization Cost

We have solved the optimal edge-addition configuration problem; however, there are multiple potential edge-addition configurations to ensure structural controllability. From the application perspective, the lowest cost configuration is usually selected as the final optimization solution. Therefore, we present Problem 3 based on Problem 2, taking the network cost into account. In order to solve Problem 3, we introduce an edge cost measurement index to calculate the edge cost and thus obtain the cost of the whole network. In addition, we need to adopt a simple and practical method to calculate the cost of the network and determine a minimum-cost configuration to ensure the controllability based on the optimal edge-addition configuration.

Problem 3. Consider a directed network , finds.t. the new directed network contains neither unreachable nodes nor dilations. Also, the cost of the new directed network must be the lowest one.

5.1. Main Idea

Given a structurally uncontrollable directed network, the optimal edge-addition configuration is obtained by using Algorithm 3. The first step of calculating the network optimization cost is to obtain the load of each node in the network. Note that the nature of node load is consistent with the betweenness centrality of the node. The betweenness centrality of a node is the proportion of shortest paths that pass through the node among all shortest paths in the network. Intuitively, the betweenness centrality reflects the importance of the node as a “bridge.” Therefore, the initial load on each node can be denoted by its betweenness centrality [27]. We can calculate the betweenness centrality of each node with the Pajek software after importing a directed network. There is a nonlinear relationship between the load of a node and its capacity [38, 39], so we can determine the node capacity by this nonlinear relation. The cost of a node can be measured by its node capacity in the network. We take the larger of the two node capacities as the cost of the edge that connects these two nodes [40]. In this paper, we calculate the network costs of all optimal edge-addition configurations and then choose the one with the lowest network cost as the optimal edge-addition configuration.

The specific calculation process of network cost is given as follows:
Step 1. Node load can be measured by the betweenness centrality where denotes the betweenness centrality of node , denotes the number of the shortest directed paths that passes through node , and means the number of the shortest directed paths from node to node .
Step 2. There is a nonlinear relationship between node load and node capacity described by where is the capacity of node , . Since there is a positive correlation between node load and capacity, set . Thus, the node capacity is determined by
Step 3. Use the index of node capacity to measure the node cost where denotes the cost of node .
Step 4. Compare the capacities of two nodes of an edge, and take the larger one as the capacity of the edge (edge cost) where is the cost of edge .
Step 5. Calculate the network cost of each configuration according to Step 4 where denotes the cost of the whole network.
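The five steps can be prototyped without external software. The sketch below is illustrative only and rests on several assumptions that are not taken from this paper: betweenness is computed unnormalized with Brandes' algorithm on the directed, unweighted network; the load-to-capacity relation is passed in as a function, and the `example_capacity` used here is a hypothetical placeholder for the nonlinear relation of Step 2; the network cost is assumed to be the sum of the edge costs. All function names are hypothetical.

```python
from collections import deque


def betweenness(nodes, edges):
    """Unnormalized betweenness centrality of a directed, unweighted graph."""
    succ = {v: [] for v in nodes}
    for u, v in edges:
        succ[u].append(v)
    bc = {v: 0.0 for v in nodes}
    for s in nodes:
        sigma = {v: 0 for v in nodes}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in nodes}
        order = []
        queue = deque([s])
        while queue:  # BFS from s
            u = queue.popleft()
            order.append(u)
            for w in succ[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):  # accumulate dependencies back toward s
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


def network_cost(nodes, edges, capacity):
    """Return (total cost, per-edge cost) for a given load-to-capacity map."""
    load = betweenness(nodes, edges)                    # Step 1: node load
    cap = {v: capacity(load[v]) for v in nodes}         # Steps 2-3: node cost
    edge_cost = {(u, v): max(cap[u], cap[v]) for u, v in edges}  # Step 4
    return sum(edge_cost.values()), edge_cost           # Step 5 (assumed sum)


def example_capacity(load):
    # Hypothetical placeholder for the nonlinear load-capacity relation.
    return load + load ** 0.5


total, per_edge = network_cost([1, 2, 3], [(1, 2), (2, 3)], example_capacity)
print(total, per_edge)
```

To compare several edge-addition configurations, one would call `network_cost` once per modified edge list and keep the configuration with the smallest total.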

5.2. Data Processing

In Figure 4(a), the initial directed network is given. Get the optimal edge-addition configuration by Algorithm 3, and and and .

The new directed network resulting from the first configuration scheme is shown in Figure 5. Figure 6 shows the curve of the state of each node over time.

We import this new directed network into the Pajek software to calculate the betweenness centrality of each node. The original data of betweenness centrality of each node are shown in Table 1. In Table 2, we collate the data of node load, node capacity, edge cost, and network cost according to each step described in Section 5.1. Then, we get the network cost of the first configuration scheme.

The new directed network resulting from the second configuration scheme is shown in Figure 7. The original data of betweenness centrality of each node are shown in Table 3. Similarly, we can obtain the data of node load, node capacity, edge cost, and network cost, as shown in Table 4.

The new directed network resulting from the third configuration scheme is shown in Figure 8. The original data of betweenness centrality of each node are shown in Table 5. Similarly, we can obtain the data of node load, node capacity, edge cost, and network cost, as shown in Table 6.

The new directed network resulting from the fourth configuration scheme is shown in Figure 9. The original data of betweenness centrality of each node are shown in Table 7. Furthermore, we can obtain the data of node load, node capacity, edge cost, and network cost, as shown in Table 8.

Comparing the network costs of the above four configuration schemes, we choose the fourth scheme as the optimal edge-addition configuration so as to get the solution of Problem 3.

5.3. Illustrative Example

In [23], a directed network as shown in Figure 10 is considered. The authors proposed 14 edge-addition configurations, i.e., , . However, they did not tell us which one is the optimal edge-addition configuration with the lowest cost. Using the results of our work, the cost of each optimization scheme can be calculated, and finally a scheme with the lowest cost can be selected to ensure the structural controllability of the network.

6. Conclusions

In this paper, we have solved the problem of how to optimize the network topology to ensure structural controllability. Given a structurally uncontrollable directed network, Algorithm 3 presents all possible edge-addition configurations. After determining the optimal edge-addition configuration, a network cost index is given to choose the lowest cost configuration.

In the future, we can combine the two strategies of adding edges and adding external input signals to ensure network controllability and choose the scheme with the highest benefit by comparing the costs of several strategies. In addition, we can extend the results from a single directed network to the topology design of a multiplex network [29, 41] so as to ensure the structural controllability of the multiplex network.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under grant no. 61973064, the Natural Science Foundation of Hebei Province of China under grant no. F2019501126, the Natural Science Foundation of Liaoning Province of China under grant no. 2020-KF-11-03, and the Fundamental Research Funds for the Central Universities under grant no. N182304013.