Abstract

Data envelopment analysis (DEA) is a popular mathematical tool for analyzing the relative efficiency of homogeneous decision-making units (DMUs). However, existing DEA models cannot handle applications that involve imprecise data, negative data, and undesirable outputs simultaneously. We therefore introduce undesirable outputs into the modified slack-based measure (MSBM) model and propose an interval modified slack-based measure (IMSBM) model, which extends the application of interval DEA (IDEA) to fields concerned with reducing undesirable outputs. The novelty of the model is that it considers undesirable outputs while dealing with imprecise and negative data, and it is slack-based. Furthermore, the model with undesirable outputs is proven to be translation-invariant and unit-invariant. A numerical example illustrates how the lower and upper bounds of the efficiency score change after undesirable outputs are considered. The empirical results show that, without considering undesirable outputs, most of the lower bounds of the efficiency scores are overestimated for weakly efficient and inefficient DMUs. The upper bound also changes after considering undesirable outputs when the DMU is inefficient. Finally, an improved degree of preference approach is introduced to rank the DMUs.

1. Introduction

Data envelopment analysis (DEA) is a popular mathematical tool for analyzing the relative efficiency of homogeneous decision-making units (DMUs). With multiple inputs and outputs, DEA measures the relative efficiency of DMUs as the ratio of the weighted sum of outputs to the weighted sum of inputs. An efficient DMU consumes less input to produce a given amount of output or produces more output from an equal amount of input. However, the conventional CCR [1] and BCC [2] models rest on two a priori assumptions that limit their application: the input and output data must be, first, precise and, second, nonnegative.

Obtaining precise data in real-life situations is not always possible, so bounded (interval), ordinal, and ratio-bounded data are often used in applications [3, 4]. This precise data assumption can, in some cases, limit the applications of conventional DEA models. Cooper et al. [5] first introduced the imprecise (interval) DEA (IDEA) to cope with imprecise data, and many scholars have since contributed to the theoretical development of this method. Despotis and Smirlis [6] transformed a CCR model to handle interval data, and it gave a natural outcome in the form of lower and upper bounds of efficiency scores. However, the transformation was only applied to variables. Entani et al. [7] formulated dual models of the IDEA with an interval efficiency obtained from both optimistic and pessimistic viewpoints. Based on this, Wang et al. [8] developed a pair of interval models to convert ordinal preference information and fuzzy data into interval data through scale transformation and an α-level set, respectively. Wang et al. [9] further introduced a virtual anti-ideal DMU into a bounded DEA model to unify the best and the worst relative efficiencies under optimistic and pessimistic situations. However, Azizi and Jahed [10] pointed out that this assumed virtual anti-ideal DMU will make no sense when the input is zero and proposed a pair of improved IDEA models that make it possible to conduct a DEA analysis using the concepts of the best and worst relative efficiencies. Toloo et al. [11] constructed a pair of IDEA models based on pessimistic and optimistic standpoints to identify the unique status of each imprecise dual-role factor. Amir et al. [12] addressed the managerial and technical issues in allocating weights and in handling imprecise data through a total cost of ownership- (TCO-) based DEA approach. However, these models are not slack-based and can only deal with nonnegative data, indicating that these models can only measure radial efficiency with nonnegative data.

In addition to the assumption of precise data, conventional DEA models assume that all DMU inputs and outputs are nonnegative. However, this assumption does not always hold in real-life problems where losses occur, such as with profit or noninterest income. Traditionally, negative data are eliminated or transformed to positive values through data transformation [13, 14]. However, eliminating the negative data discards some DMU information, and data transformation affects the solution of the objective function. Pastor [15] was the first to use the translation invariance property of DEA models when addressing negative data, which does not require the data to be eliminated or transformed. Halme et al. [16] introduced the property to radial models for dealing with interval data, including negative data. Hatamimarbini et al. [17] developed the interval semioriented radial measure (SORM) model to evaluate efficiency in the presence of interval data without sign restrictions. Cheng et al. [18] developed a variant of radial measure (VRM) to address variables that may be negative or nonnegative for different DMUs, but the efficiencies produced by the input-oriented VRM model may be negative [19] and those from the output-oriented VRM model can be in the range of [0.5, 1] [20]. To avoid such drawbacks, Tung [20] further defined two efficiency measures for input-oriented and output-oriented VRM models. Although the models mentioned above can deal with negative data, and some are translation-invariant and/or unit-invariant, they are still not slack-based models and ignore the inefficiency caused by nonradial slacks.

Thus, most of the models reviewed above address only imprecise data or only negative data rather than both simultaneously, and none are slack-based. Tone [21] proposed a slack-based measure SBM (a) of efficiency that puts aside assumptions about proportionate changes in inputs and outputs and deals directly with the input excesses and the output shortfalls of DMUs. Lotfi et al. [22] integrated the SBM (a) model into IDEA to address interval data from the optimistic perspective and defined the upper and lower bounds of the SBM-efficiency scores to classify DMUs into three subsets. Azizi et al. [23] formulated SBM (a) models in IDEA from both optimistic and pessimistic perspectives to measure the overall performance of DMUs. These SBM (a)-based IDEA models measure the nonradial efficiency with interval data but do not consider negative data. Sharp et al. [24] introduced the idea of the range of possible improvement into the SBM (a) model and developed a modified slack-based measure (MSBM) model to evaluate DMUs with negative data. The MSBM considers input and output slacks and is translation-invariant. Tone et al. [25] proposed base point SBM (BP-SBM) models, which are consistent with ordinary SBM (a) models, to deal with negative data. Both the MSBM model and the BP-SBM models are slack-based and can handle negative data. However, they ignore imprecise data. Yang and Mo [26] considered these three characteristics simultaneously and extended the MSBM model to the interval MSBM (IMSBM) model, which is slack-based and evaluates the efficiency of DMUs with imprecise and negative data. However, the IMSBM model does not consider undesirable outputs. Tone [27] developed a new SBM (b) model from the SBM (a) model to measure efficiency in the presence of undesirable outputs. However, SBM (b) cannot yet deal with imprecise data.

This study develops the IMSBM model to address undesirable outputs, which extends the application of IDEA to fields concerned with reducing undesirable outputs, such as air pollutants, hazardous wastes, and nonperforming loans. Unlike the current IDEA models, our new IMSBM model is based on SBM (b), and thus it considers undesirable outputs and both radial and nonradial inefficiencies from the perspective of slacks. We also confirm that the new model is unit-invariant and translation-invariant. In Table 1, we compare the new IMSBM model with the other DEA models mentioned above.

The remainder of this paper is organized as follows. In Section 2, the IMSBM model with undesirable outputs is presented. Section 3 classifies DMUs into three subsets, and an improved degree of preference approach is introduced to rank the interval efficiencies. The IMSBM model with undesirable outputs is applied to evaluate the interval efficiency of Chinese city commercial banks in Section 4. The final section presents our conclusions.

2. The MSBM and IMSBM Models with Undesirable Outputs

Färe et al. [28] pointed out that the assumption of constant returns to scale (CRS) implies that any DMU can be radially expanded or contracted to form other feasible DMUs, which is inconsistent with negative data. This is not the case under variable returns to scale (VRS), so the models below are formulated under VRS.

2.1. The MSBM Model with Undesirable Outputs

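First, we extend the MSBM proposed by Sharp et al. [24] to deal with undesirable outputs. Consider a set of $n$ homogeneous units $\mathrm{DMU}_j$ ($j = 1, \ldots, n$) under analysis, each of which consumes varying amounts of $m$ different inputs to produce $s$ different outputs ($s = s_1 + s_2$), where $s_1$ is the number of good outputs and $s_2$ is the number of bad (undesirable) outputs. Specifically, $\mathrm{DMU}_j$ consumes $x_{ij}$ of each input $i$ to produce $y^g_{rj}$ of each good output $r$ and $y^b_{tj}$ of each bad output $t$. The inputs, good outputs, and bad outputs can be represented by three vectors $x \in \mathbb{R}^{m}$, $y^g \in \mathbb{R}^{s_1}$, and $y^b \in \mathbb{R}^{s_2}$, respectively. Then, the production possibility set ($P$) under the VRS assumption is defined as

$$P = \left\{ (x, y^g, y^b) \;\middle|\; x \ge \sum_{j=1}^{n} \lambda_j x_j,\; y^g \le \sum_{j=1}^{n} \lambda_j y^g_j,\; y^b \ge \sum_{j=1}^{n} \lambda_j y^b_j,\; \sum_{j=1}^{n} \lambda_j = 1,\; \lambda \ge 0 \right\}, \tag{1}$$

where $\lambda$ is the intensity vector and satisfies $\sum_{j=1}^{n} \lambda_j = 1$ under the VRS assumption. A $\mathrm{DMU}_o$ ($x_o$, $y^g_o$, $y^b_o$) is efficient in the presence of undesirable outputs if there is no vector $(x, y^g, y^b) \in P$ such that $x_o \ge x$, $y^g_o \le y^g$, and $y^b_o \ge y^b$ with at least one strict inequality [27]. When considering input and output slacks, i.e., input excesses ($s^-$), good output shortfalls ($s^g$), and undesirable output excesses ($s^b$), the production possibility set ($P$) under the VRS assumption can be defined as

$$P = \left\{ (x_o, y^g_o, y^b_o) \;\middle|\; x_o = \sum_{j=1}^{n} \lambda_j x_j + s^-,\; y^g_o = \sum_{j=1}^{n} \lambda_j y^g_j - s^g,\; y^b_o = \sum_{j=1}^{n} \lambda_j y^b_j + s^b,\; \sum_{j=1}^{n} \lambda_j = 1,\; \lambda, s^-, s^g, s^b \ge 0 \right\}. \tag{2}$$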

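We now introduce the ideal point into the MSBM model with undesirable outputs. For a given dataset, the ideal point is considered as $I = (\min_j x_{ij}, \max_j y^g_{rj}, \min_j y^b_{tj})$. Therefore, for $\mathrm{DMU}_o$, the range of possible improvement is defined as

$$R^-_{io} = x_{io} - \min_j x_{ij}, \qquad R^g_{ro} = \max_j y^g_{rj} - y^g_{ro}, \qquad R^b_{to} = y^b_{to} - \min_j y^b_{tj}. \tag{3}$$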

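Obviously, $R^-_{io}, R^g_{ro}, R^b_{to} \ge 0$. Replacing the corresponding terms in the SBM (b) model with $R^-_{io}$, $R^g_{ro}$, and $R^b_{to}$ (the ranges of possible improvement of the inputs, good outputs, and bad outputs), the MSBM model with undesirable outputs is thus

$$\min \rho_o = \frac{1 - \sum_{i=1}^{m} w_i s_i^- / R^-_{io}}{1 + \sum_{r=1}^{s_1} v_r s_r^g / R^g_{ro} + \sum_{t=1}^{s_2} u_t s_t^b / R^b_{to}} \tag{4}$$

$$\text{s.t.}\quad x_{io} = \sum_{j=1}^{n} \lambda_j x_{ij} + s_i^-,\quad y^g_{ro} = \sum_{j=1}^{n} \lambda_j y^g_{rj} - s_r^g,\quad y^b_{to} = \sum_{j=1}^{n} \lambda_j y^b_{tj} + s_t^b,\quad \sum_{j=1}^{n} \lambda_j = 1,\quad \lambda_j, s_i^-, s_r^g, s_t^b \ge 0. \tag{5}$$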

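According to Tone [29] and Cooper et al. [30], in formula (4), the minimization of the numerator can be interpreted as the MSBM-input-efficiency, that is, $\rho_I = \min \left( 1 - \sum_{i=1}^{m} w_i s_i^- / R^-_{io} \right)$. In addition, the reciprocal of the maximization of the denominator can be interpreted as the MSBM-output-efficiency, that is, $\rho_O = 1 / \max \left( 1 + \sum_{r=1}^{s_1} v_r s_r^g / R^g_{ro} + \sum_{t=1}^{s_2} u_t s_t^b / R^b_{to} \right)$. Therefore, the MSBM nonoriented efficiency can be defined as $\min \rho_o$ through multiplying $\rho_I$ by $\rho_O$, and $\min \rho_o$ is subject to $0 \le \rho_o \le 1$. In formulas (4) and (5), $s_i^-$, $s_r^g$, and $s_t^b$ are slacks in the input, good output, and bad output of $\mathrm{DMU}_o$, respectively. The weights of each input $w_i$, good output $v_r$, and bad output $u_t$ are determined subjectively by decision-makers and are subject to $\sum_{i=1}^{m} w_i = 1$, $\sum_{r=1}^{s_1} v_r + \sum_{t=1}^{s_2} u_t = 1$, $w_i \ge 0$, $v_r \ge 0$, and $u_t \ge 0$.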

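Note that when $R^-_{io} = 0$, $R^g_{ro} = 0$, or $R^b_{to} = 0$, it is assumed that the corresponding term $w_i s_i^- / R^-_{io}$, $v_r s_r^g / R^g_{ro}$, or $u_t s_t^b / R^b_{to}$ is dropped from the numerator or denominator [24].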
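The arithmetic behind the range of possible improvement and the MSBM ratio can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ideal point is taken as the smallest observed input and bad output and the largest observed good output, the score is the weighted slack ratio described above, and the function names, the three-DMU dataset, and the slacks are hypothetical. The slacks come from projecting DMU 2 onto DMU 1 (intensity vector (0, 1, 0)), which is feasible but not claimed to be optimal; a real evaluation would obtain them from the optimization model.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def improvement_ranges(X: Matrix, G: Matrix, B: Matrix, o: int
                       ) -> Tuple[List[float], List[float], List[float]]:
    """Ranges of possible improvement for DMU o (each row is one DMU)."""
    min_x = [min(col) for col in zip(*X)]  # ideal (smallest) inputs
    max_g = [max(col) for col in zip(*G)]  # ideal (largest) good outputs
    min_b = [min(col) for col in zip(*B)]  # ideal (smallest) bad outputs
    r_x = [x - lo for x, lo in zip(X[o], min_x)]
    r_g = [hi - y for y, hi in zip(G[o], max_g)]
    r_b = [b - lo for b, lo in zip(B[o], min_b)]
    return r_x, r_g, r_b


def weighted_ratio(weights, slacks, ranges) -> float:
    """Sum of w * s / R, dropping every term whose range R is zero."""
    return sum(w * s / r for w, s, r in zip(weights, slacks, ranges) if r != 0)


def msbm_score(w, v, u, s_x, s_g, s_b, r_x, r_g, r_b) -> float:
    """Slack-based ratio: (1 - input term) / (1 + good term + bad term)."""
    numerator = 1.0 - weighted_ratio(w, s_x, r_x)
    denominator = 1.0 + weighted_ratio(v, s_g, r_g) + weighted_ratio(u, s_b, r_b)
    return numerator / denominator


# Hypothetical data: 3 DMUs, 2 inputs, 1 good output (negative for DMU 0), 1 bad output.
X = [[6.0, 10.0], [6.0, 8.0], [6.0, 12.0]]
G = [[-2.0], [3.0], [1.0]]
B = [[1.5], [0.5], [2.0]]

r_x, r_g, r_b = improvement_ranges(X, G, B, o=2)
# Slacks of DMU 2 relative to DMU 1: input excesses, good shortfall, bad excess.
score = msbm_score(w=[0.5, 0.5], v=[0.5], u=[0.5],
                   s_x=[0.0, 4.0], s_g=[2.0], s_b=[1.5],
                   r_x=r_x, r_g=r_g, r_b=r_b)
print(r_x, r_g, r_b, score)
```

The first input has a zero range for every DMU in this toy dataset, so its term is skipped, which mirrors the dropping rule stated above. The negative good output of DMU 0 needs no special handling because only differences from the ideal point enter the ratio.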

2.2. The IMSBM Model with Undesirable Outputs

The IMSBM model with undesirable outputs can be defined based on the MSBM model with undesirable outputs.

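For the IMSBM model with undesirable outputs, the inputs, good outputs, and bad outputs are assumed to be interval variables denoted as $x_{ij} \in [x^L_{ij}, x^U_{ij}]$, $y^g_{rj} \in [y^{gL}_{rj}, y^{gU}_{rj}]$, and $y^b_{tj} \in [y^{bL}_{tj}, y^{bU}_{tj}]$, where $x^L_{ij}$ is the lower bound of $x_{ij}$, $x^U_{ij}$ is the upper bound of $x_{ij}$, $y^{gL}_{rj}$ is the lower bound of $y^g_{rj}$, $y^{gU}_{rj}$ is the upper bound of $y^g_{rj}$, $y^{bL}_{tj}$ is the lower bound of $y^b_{tj}$, and $y^{bU}_{tj}$ is the upper bound of $y^b_{tj}$. In this case, the ideal point in the IMSBM model with undesirable outputs is considered as $I = (\min_j x^L_{ij}, \max_j y^{gU}_{rj}, \min_j y^{bL}_{tj})$. Consequently, for $\mathrm{DMU}_o$, the range of possible improvement is defined as

$$R^-_{io} = x_{io} - \min_j x^L_{ij}, \qquad R^g_{ro} = \max_j y^{gU}_{rj} - y^g_{ro}, \qquad R^b_{to} = y^b_{to} - \min_j y^{bL}_{tj}. \tag{6}$$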

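Obviously, $R^-_{io}, R^g_{ro}, R^b_{to} \ge 0$. Therefore, the IMSBM model with undesirable outputs for $\mathrm{DMU}_o$ is defined as

$$\min \rho_o = \frac{1 - \sum_{i=1}^{m} w_i s_i^- / R^-_{io}}{1 + \sum_{r=1}^{s_1} v_r s_r^g / R^g_{ro} + \sum_{t=1}^{s_2} u_t s_t^b / R^b_{to}} \tag{7}$$

$$\text{s.t.}\quad x_{io} = \sum_{j=1}^{n} \lambda_j x_{ij} + s_i^-,\quad y^g_{ro} = \sum_{j=1}^{n} \lambda_j y^g_{rj} - s_r^g,\quad y^b_{to} = \sum_{j=1}^{n} \lambda_j y^b_{tj} + s_t^b,\quad \sum_{j=1}^{n} \lambda_j = 1,\quad \lambda_j, s_i^-, s_r^g, s_t^b \ge 0, \tag{8}$$

where $\rho_o$ is interval data denoted as $[\rho_o^L, \rho_o^U]$. Similarly, when $R^-_{io}$, $R^g_{ro}$, or $R^b_{to}$ is zero, the corresponding term is assumed to be dropped from the numerator or denominator.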

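The lower bound of the interval efficiency $\rho_o^L$ is obtained under the most unfavourable situation for $\mathrm{DMU}_o$. Thus, $\mathrm{DMU}_o$ consumes $x^U_{io}$ to produce $y^{gL}_{ro}$ and $y^{bU}_{to}$, while $\mathrm{DMU}_j$ ($j \ne o$) consumes $x^L_{ij}$ to produce $y^{gU}_{rj}$ and $y^{bL}_{tj}$. Symmetrically, the upper bound of the efficiency $\rho_o^U$ is obtained under the most favourable situation for $\mathrm{DMU}_o$. Thus, $\mathrm{DMU}_o$ consumes $x^L_{io}$ to produce $y^{gU}_{ro}$ and $y^{bL}_{to}$, while $\mathrm{DMU}_j$ ($j \ne o$) consumes $x^U_{ij}$ to produce $y^{gL}_{rj}$ and $y^{bU}_{tj}$. Therefore, models (7) and (8) interpret the IMSBM model with undesirable outputs as a whole, including the relative efficiencies under the most unfavourable and favourable situations. Subsequently, they can be divided into a pair of precise models, the lower efficiency model and the upper efficiency model. Models (9) and (10) interpret the lower efficiency under the most unfavourable situation for $\mathrm{DMU}_o$; inversely, models (11) and (12) interpret the upper efficiency under the most favourable situation [26, 31]:

$$\min \rho_o^L = \frac{1 - \sum_{i=1}^{m} w_i s_i^- / R^{-U}_{io}}{1 + \sum_{r=1}^{s_1} v_r s_r^g / R^{gU}_{ro} + \sum_{t=1}^{s_2} u_t s_t^b / R^{bU}_{to}} \tag{9}$$

$$\text{s.t.}\quad x^U_{io} = \sum_{j \ne o} \lambda_j x^L_{ij} + \lambda_o x^U_{io} + s_i^-,\quad y^{gL}_{ro} = \sum_{j \ne o} \lambda_j y^{gU}_{rj} + \lambda_o y^{gL}_{ro} - s_r^g,\quad y^{bU}_{to} = \sum_{j \ne o} \lambda_j y^{bL}_{tj} + \lambda_o y^{bU}_{to} + s_t^b,\quad \sum_{j=1}^{n} \lambda_j = 1,\quad \lambda_j, s_i^-, s_r^g, s_t^b \ge 0, \tag{10}$$

$$\min \rho_o^U = \frac{1 - \sum_{i=1}^{m} w_i s_i^- / R^{-L}_{io}}{1 + \sum_{r=1}^{s_1} v_r s_r^g / R^{gL}_{ro} + \sum_{t=1}^{s_2} u_t s_t^b / R^{bL}_{to}} \tag{11}$$

$$\text{s.t.}\quad x^L_{io} = \sum_{j \ne o} \lambda_j x^U_{ij} + \lambda_o x^L_{io} + s_i^-,\quad y^{gU}_{ro} = \sum_{j \ne o} \lambda_j y^{gL}_{rj} + \lambda_o y^{gU}_{ro} - s_r^g,\quad y^{bL}_{to} = \sum_{j \ne o} \lambda_j y^{bU}_{tj} + \lambda_o y^{bL}_{to} + s_t^b,\quad \sum_{j=1}^{n} \lambda_j = 1,\quad \lambda_j, s_i^-, s_r^g, s_t^b \ge 0, \tag{12}$$

where $R^{-U}_{io} = x^U_{io} - \min_j x^L_{ij}$, $R^{gU}_{ro} = \max_j y^{gU}_{rj} - y^{gL}_{ro}$, $R^{bU}_{to} = y^{bU}_{to} - \min_j y^{bL}_{tj}$, $R^{-L}_{io} = x^L_{io} - \min_j x^L_{ij}$, $R^{gL}_{ro} = \max_j y^{gU}_{rj} - y^{gU}_{ro}$, and $R^{bL}_{to} = y^{bL}_{to} - \min_j y^{bL}_{tj}$.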
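The split of the interval model into two precise models amounts to picking one endpoint of every interval. The sketch below builds the two precise datasets for a chosen DMU, assuming each observation is stored as a `(lower, upper)` pair; the function name `scenario`, its arguments, and the two-DMU data are hypothetical and only illustrate the selection rule for the most unfavourable and most favourable situations.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]
IntervalMatrix = Sequence[Sequence[Interval]]


def pick(data: IntervalMatrix, o: int, low_for_o: bool) -> List[List[float]]:
    """Take one endpoint per interval: DMU o gets the lower bound when
    low_for_o is True, and every other DMU gets the opposite bound."""
    rows = []
    for j, row in enumerate(data):
        take_low = low_for_o if j == o else not low_for_o
        rows.append([lo if take_low else hi for lo, hi in row])
    return rows


def scenario(IX: IntervalMatrix, IG: IntervalMatrix, IB: IntervalMatrix,
             o: int, favourable: bool):
    """Precise inputs, good outputs, and bad outputs for evaluating DMU o.

    favourable=True : DMU o uses low inputs, high good outputs, low bad outputs,
                      while the other DMUs use the opposite bounds (upper efficiency).
    favourable=False: the roles are reversed (lower efficiency).
    """
    X = pick(IX, o, low_for_o=favourable)
    G = pick(IG, o, low_for_o=not favourable)
    B = pick(IB, o, low_for_o=favourable)
    return X, G, B


# Hypothetical interval data: 2 DMUs, 1 input, 1 good output, 1 bad output.
IX = [[(1.0, 2.0)], [(3.0, 4.0)]]
IG = [[(-1.0, 1.0)], [(2.0, 5.0)]]
IB = [[(0.1, 0.3)], [(0.2, 0.4)]]

lower_case = scenario(IX, IG, IB, o=0, favourable=False)
upper_case = scenario(IX, IG, IB, o=0, favourable=True)
print(lower_case)
print(upper_case)
```

Each returned dataset contains only precise numbers, so it can be passed to any solver for the precise model; the interval score is then the pair of optimal values from the two runs.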

According to the Charnes and Cooper transformation [32] and referring to [33–35], the IMSBM model with undesirable outputs can be transformed into a linear programming form. We multiply both the numerator and the denominator of the objective function of (9) by a scalar variable $t > 0$, which does not change the objective value. By adjusting $t$ so that the denominator equals 1, the denominator can be treated as a constraint, and the objective function minimises the corresponding numerator. The lower bound of the IMSBM model with undesirable outputs is

Formula (14) is a nonlinear programming problem due to its nonlinear terms, and some definitions are needed to transform it into a linear programming problem. Assume

Obviously, , and the transformed problem is

Assuming the optimal solution of (16) and (17) to be , then, according to (15), the optimal solution of (9) and (10) can be obtained numerically as

Symmetrically, the transformed problem of (11) and (12) is

Assuming the optimal solution of (19) and (20) to be , then, according to (15), the optimal solution of (11) and (12) can be obtained numerically as

If the lower bound of is inefficient, it can be improved to become efficient by

Symmetrically, if the upper bound of is inefficient, it can be improved to become efficient by

2.3. Properties of the IMSBM Model with Undesirable Outputs

The following properties are considered the bases of designing an efficiency measure [1].

Property 1 (translation-invariant). This is critical, particularly when input-output data contain zero or negative values.

Property 2 (unit-invariant). This is considered an important property in DEA, and in general mathematical terms, this property is referred to as dimensionless.

Theorem 1. The IMSBM model with undesirable outputs is translation-invariant.

Proof. A measure is translation-invariant if and only if the model is equivalent before and after the translation [36].
Transform the input data and by the real number , transform the good output data and by the real number , and transform the bad output data and by the real number , where subjects to , subjects to , and subjects to (without loss of generality, , , and are assumed to be nonnegative). Models (9) and (10) for the translated data arewhere , , , , , and . It can be verified that , and . As , the constraints in (25) imply , , and . Therefore models (9) and (10) and (24) and (25) are equivalent problems, and thus models (9) and (10) are translation-invariant. Models (11) and (12) can similarly be proven to be translation-invariant. Therefore, the IMSBM model with undesirable outputs is translation-invariant. This proof is thus complete.

Theorem 2. The IMSBM model with undesirable outputs is unit-invariant.

Proof. Consider the input, the good output, and the bad output in the models (9) and (10), rescale both bounds of the input by multiplying it by a scalar , rescale both bounds of the good output by multiplying it by a scalar , and rescale both bounds of the bad output by multiplying it by a scalar . The ideal point is . It can be proven that , , , , , and . From the constraints, we have , , , , , and . The rescaling does not impact . Thus, models (9) and (10) are unit-invariant.
Models (11) and (12) can be similarly proven. Therefore, the IMSBM with undesirable outputs is unit-invariant. This proof is thus complete.
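Both invariance properties can be spot-checked numerically. The sketch below is not part of the proofs; it assumes that the range of an input is its distance from the column minimum and that the slack is the gap between the evaluated DMU and a convex combination of all DMUs (intensities summing to one, as under VRS). The helper name, the data, the shifts, and the scale factors are hypothetical, and only inputs are shown because good and bad outputs behave analogously.

```python
from typing import List, Sequence, Tuple


def ranges_and_slacks(X: Sequence[Sequence[float]], lam: Sequence[float], o: int
                      ) -> Tuple[List[float], List[float]]:
    """Input ranges R_i = x_io - min_j x_ij and slacks s_i = x_io - sum_j lam_j * x_ij."""
    columns = list(zip(*X))
    ranges = [x - min(col) for x, col in zip(X[o], columns)]
    combo = [sum(l * x for l, x in zip(lam, col)) for col in columns]
    slacks = [x - c for x, c in zip(X[o], combo)]
    return ranges, slacks


def ratios(slacks: Sequence[float], ranges: Sequence[float]) -> List[float]:
    """Slack-to-range ratios, the only way the data enter the objective."""
    return [s / r for s, r in zip(slacks, ranges)]


X = [[4.0, -1.0], [6.0, 3.0], [5.0, 2.0]]  # second input is negative for DMU 0
lam = [0.5, 0.0, 0.5]                      # intensities sum to 1 (VRS)
o = 1

base_r, base_s = ranges_and_slacks(X, lam, o)

# Translation: add a constant to every value of each input.
shift = [10.0, 2.0]
X_shift = [[v + d for v, d in zip(row, shift)] for row in X]
shift_r, shift_s = ranges_and_slacks(X_shift, lam, o)

# Rescaling: multiply every value of each input by a positive constant.
scale = [1000.0, 0.5]
X_scale = [[v * k for v, k in zip(row, scale)] for row in X]
scale_r, scale_s = ranges_and_slacks(X_scale, lam, o)

print(ratios(base_s, base_r), ratios(shift_s, shift_r), ratios(scale_s, scale_r))
```

Translation leaves ranges and slacks themselves unchanged because the intensities sum to one, whereas rescaling multiplies both by the same factor, so the ratios agree in all three cases.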

3. Classification and Ranking of the DMUs

The efficiency scores measured by the IMSBM model with undesirable outputs are calculated in an interval form, and thus a simple and practical approach is required to compare and rank the performance of the DMUs.

Haghighat and Khorram [37] noted that DMUs can be classified into three subsets according to the interval efficiency $[\rho_j^L, \rho_j^U]$. The first is the strictly efficient subset, with $E^{++} = \{\mathrm{DMU}_j \mid \rho_j^L = 1\}$. The second is the weakly efficient subset, with $E^{+} = \{\mathrm{DMU}_j \mid \rho_j^L < 1, \rho_j^U = 1\}$. The third is the inefficient subset, with $E^{-} = \{\mathrm{DMU}_j \mid \rho_j^U < 1\}$. Ranking the DMUs within a subset is difficult when the subset contains more than one DMU. Wang et al. [38] proposed the degree of preference approach for ranking interval data. However, although this approach is suitable for a pairwise comparison, it is less convenient in a complex system. Thus, we introduce an improved degree of preference approach to rank interval efficiency scores.
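The three-way classification can be expressed as a small helper. The sketch below assumes the usual convention that a DMU is strictly efficient when its lower bound equals one, weakly efficient when only its upper bound equals one, and inefficient otherwise; the labels `E++`, `E+`, `E-`, the tolerance, and the sample intervals are illustrative choices rather than values from the paper.

```python
def classify(lower: float, upper: float, tol: float = 1e-9) -> str:
    """Classify an interval efficiency [lower, upper] into one of three subsets."""
    if lower >= 1.0 - tol:
        return "E++"  # strictly efficient: efficient even in the worst case
    if upper >= 1.0 - tol:
        return "E+"   # weakly efficient: efficient only in the best case
    return "E-"       # inefficient: never reaches the frontier


# Hypothetical interval efficiency scores for three DMUs.
scores = {"A": (1.0, 1.0), "B": (0.8, 1.0), "C": (0.4, 0.9)}
labels = {name: classify(lo, hi) for name, (lo, hi) in scores.items()}
print(labels)
```

The tolerance guards against solver round-off, since optimal values returned by a linear programming routine rarely equal one exactly.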

Suppose there are two interval efficiencies, denoted as and . Then, the degree of preference of over () can be defined as , which reflects the interrelationship among and :

Accordingly, the degree of preference of over can be defined as

Besides the above two options in (26) and (27) ( and ), the interrelationship among and exists in the third option, that is, ( and ). It is easy to verify that if , then . According to (26) and (27), if , such that , then the following matrix that consists of is an antisymmetric matrix.where and , .

If and , then , indicating that the degree of preference satisfies transitivity [38].

According to transitivity, it can be verified that the degree of preference approach possesses the following property.

Property 3. If , , then ; inversely, if , , then ; and if , then , .
Here, and denote the sum value of the degree of preference of the different rows in matrix, that is, where and can be denoted by the vector . We can verify from the property that if , then , and if , then . Therefore, the different interval efficiency scores in subsets and can be ranked through vector due to the transitivity.

Proof. For a matrix, as , if , then and , and thus it possesses the property.
For an matrix, if , then , and . If , then (when , ), specifically.(1)If , according to (26), thenWhen , thus .
When , thus , as and , then .
Likewise, when , it can be verified that .(2)If , according to (26), thenWhen , thus , as , then .
When , thus , as , then .
Therefore, if , it can be verified that , according to the transitivity property, then , , , , . This proof is thus complete.
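Once the matrix of pairwise degrees of preference is available, the ranking step reduces to summing each row and sorting. The sketch below assumes an antisymmetric matrix whose diagonal is ignored; the function name and the three-by-three matrix are hypothetical, and its entries are made-up numbers rather than values computed from formulas (26) and (27).

```python
from typing import List, Sequence, Tuple


def rank_by_row_sum(P: Sequence[Sequence[float]]) -> Tuple[List[float], List[int]]:
    """Return the off-diagonal row sums and the rank of each row (1 = best)."""
    n = len(P)
    sums = [sum(P[i][j] for j in range(n) if j != i) for i in range(n)]
    order = sorted(range(n), key=lambda i: sums[i], reverse=True)
    ranks = [0] * n
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return sums, ranks


# Hypothetical antisymmetric preference matrix: P[i][j] == -P[j][i].
P = [
    [0.0, 1.0, 0.4],
    [-1.0, 0.0, -0.6],
    [-0.4, 0.6, 0.0],
]
sums, ranks = rank_by_row_sum(P)
print(sums, ranks)
```

A row that is fully preferred to all others collects one unit per comparison, which is consistent with the sum of 98 reported for the strictly efficient DMU among 99 banks in the application.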

4. Application to Chinese City Commercial Banks

In this section, we implement the proposed IMSBM model with undesirable outputs to evaluate the efficiency scores and the classification of Chinese city commercial banks in 2017. Our study represents the first attempt to measure the interval efficiency of these banks with both negative data and undesirable outputs.

Based on the availability of data, we evaluate the interval efficiency scores of 99 city commercial banks, each of which is associated with two inputs (staff costs (COST) and total assets (ASST)) and three outputs (noninterest income (NINT), interest income (INTE), and nonperforming loans (NPL)). To simplify the problem, the inputs and outputs are weighted equally, as presented in parentheses in Table 2. Due to space limitations, Table 2 lists only the DMUs with negative data. Across the whole sample, four DMUs have negative outputs. DMU87 (Cang Zhou bank) has negative noninterest income at both the lower and the upper bound. Three other banks (DMU66 (Xia Men bank), DMU77 (Ying Kou bank), and DMU97 (Gui Zhou bank)) have negative noninterest income at the lower bound only. The remaining 95 banks have positive inputs and outputs at both bounds and are not included in Table 2.

The resulting interval efficiency scores and corresponding classification evaluated by the IMSBM model with undesirable outputs are shown in Table 3, and those for the model without undesirable outputs are given in the two adjacent columns for comparison.

Table 3 shows that, for the IMSBM model with undesirable outputs, only DMU18 (Liang Shan Zhou bank) is strictly efficient, 82 banks are weakly efficient, and the remaining 16 are inefficient.

A comparison of the efficiency scores evaluated by the IMSBM models with and without undesirable outputs shows that the strictly efficient DMUs in the two models are the same (i.e., DMU18). However, the interval efficiency scores of the weakly efficient and the inefficient DMUs changed after considering the undesirable outputs. For the weakly efficient DMUs, the lower bound of the efficiency score decreased after considering undesirable outputs, except for DMU3, DMU22, DMU71, and DMU85, while the upper bound of the efficiency score remained unchanged. For the inefficient DMUs, the lower bound of the efficiency score also decreased after considering undesirable outputs, while the change in the upper bound was mixed. The upper bounds of the efficiency score of 8 banks increased, and for the other 13 banks, the opposite is observed. Therefore, without considering the undesirable outputs, the lower bound of the efficiency score is overestimated on the whole for weakly efficient and inefficient DMUs. In addition, the upper bound of the efficiency score of inefficient DMUs changes when the undesirable outputs are considered.

In addition to the classification, a detailed performance ranking is required. According to (26) and (27), the interrelationship among DMUs can be established through the degree of preference, which constitutes a $99 \times 99$ matrix. Due to space limitations, the matrix is not shown in this paper. The sum value of the degree of preference can then be calculated according to (29), and all of the DMUs are ranked based on this value. Table 4 reports the sum value of the degree of preference and the corresponding rank of each DMU under the IMSBM model with undesirable outputs, with the corresponding values under the model without undesirable outputs given in the adjacent two columns.

As shown in Table 4, DMU18 is strictly efficient under both models; therefore, its sum value of the degree of preference equals 98 under both models, excluding the value on the leading diagonal. Accordingly, DMU18 is ranked in the top position under both models. We can then examine the ranks of the 10 DMUs immediately below DMU18. With the IMSBM model with undesirable outputs, the relationship among the 10 DMUs is established as DMU71 ≻ DMU85 ≻ DMU2 ≻ DMU1 ≻ DMU4 ≻ DMU42 ≻ DMU47 ≻ DMU72 ≻ DMU96 ≻ DMU91. For the IMSBM model without undesirable outputs, the relationship is established as DMU71 ≻ DMU85 ≻ DMU2 ≻ DMU72 ≻ DMU1 ≻ DMU42 ≻ DMU4 ≻ DMU91 ≻ DMU70 ≻ DMU96. This indicates that considering undesirable outputs changes the ranks of the weakly efficient and inefficient DMUs.

5. Conclusion and Discussion

This study develops the IMSBM model to address undesirable outputs, which extends the application of IDEA to fields concerned with reducing undesirable outputs, such as air pollutants, hazardous wastes, and nonperforming loans. Several models in the literature have been developed to handle imprecise and/or negative data, but few handle imprecise and negative data simultaneously. These models also ignore undesirable outputs. Thus, we propose the IMSBM model with undesirable outputs. The model is novel in that it considers undesirable outputs while dealing with imprecise and negative data, and it is slack-based, which ensures that the efficiency score accounts for both radial and nonradial slacks.

This study establishes that the IMSBM model with undesirable outputs is translation-invariant and unit-invariant. The model is applied to evaluate the interval efficiency scores of Chinese city commercial banks, which are compared with those evaluated by the IMSBM model without considering undesirable outputs. The empirical results show that the IMSBM model with undesirable outputs reduces the lower bounds of the efficiency scores of the weakly efficient and inefficient DMUs on the whole. Therefore, without considering undesirable outputs, most of the lower bounds of the efficiency scores are overestimated for weakly efficient and inefficient DMUs. In addition, the model leads to changes in the upper bounds of the efficiency scores of inefficient DMUs. Finally, the interval efficiency scores are ranked with an improved degree of preference approach.

The proposed IMSBM model with undesirable outputs is formulated under VRS rather than CRS. Therefore, the interval efficiency scores evaluated by the proposed model are pure technical efficiencies (PTE). In addition, the resulting interval efficiency scores lie in the range [0, 1], and the upper bound of each cannot be greater than one, so the strictly efficient DMUs cannot be ranked among themselves. Thus, in future studies, we will focus on the IMSBM model with undesirable outputs under CRS to evaluate the technical efficiency (TE). In addition, we will develop a superefficiency model from our model to rank the strictly efficient DMUs.

Data Availability

All of the data used to support the application of the model were collected by the authors from the annual reports and audit reports of Chinese city commercial banks.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.