Pitfalls and Remedies in DEA Applications: How to Handle an Occurrence of Zero in Multipliers by Strong Complementary Slackness Conditions

This study discusses guidelines for the proper use of Data Envelopment Analysis (DEA), which has been widely used for performance analysis in the public and private sectors. DEA is equipped with Strong Complementary Slackness Conditions (SCSCs) in this study, but an application of DEA/SCSCs depends upon its careful use, as summarized in the guidelines. The guidelines consist of five suggestions. First, a data set used in DEA applications should not have a ratio variable (e.g., financial ratios) in an input(s) and/or an output(s). Second, radial DEA models under variable and constant Returns to Scale (RTS) need a special treatment for zero in a data set. Third, the DEA evaluation needs to drop an outlier. Fourth, an imprecise number (e.g., 1/3) may suffer from a round-off error because DEA needs to specify it in a precise expression to operate a computer code. Finally, when a large input or output variable may dominate other variables in DEA computation, it is necessary to normalize the data set or simply to divide each observation by its average. Such a simple treatment produces more reliable DEA results than those obtained without any data adjustment. This study also discusses how to handle an occurrence of zero in DEA multipliers by applying SCSCs. DEA/SCSCs can serve as a multiplier restriction approach without any prior information. Thus, the proposed DEA/SCSCs can provide more reliable results than a straightforward use of DEA.


Introduction
Data Envelopment Analysis (DEA) has long served as a methodology to evaluate the performance of organizations in business, economics and other areas. The applicability of DEA is not limited to research areas in social science; it extends to engineering and natural science. The father of DEA is Professor William W. Cooper (University of Texas at Austin), who worked on DEA from the beginning to 2012. The survey research [1] summarized his contributions to DEA from an academic perspective of management science and operations research, dating back to a linkage between DEA and L1 regression developed in the 18th century, and [2] documented his conceptual and philosophical contributions in accounting and economics, based upon which he developed DEA as a methodology of "social accounting" and "social economics".
In the history of DEA, many researchers (e.g., [3][4][5]) discussed the importance of incorporating SCSCs (Strong Complementary Slackness Conditions) into DEA. For example, [4] documented what SCSCs were and then proposed a use of the primal-dual interior-point method that incorporated them into the DEA algorithm. Acknowledging the importance of the interior-point method in solving DEA equipped with SCSCs, studies [6][7][8] have recently proposed a new use of SCSCs for DEA (hereafter, DEA/SCSCs) in primal-dual combined radial and non-radial models, that is, from a modeling perspective rather than an algorithmic one. Their original studies [6,7] applied DEA/SCSCs to identify a supporting hyperplane(s) on an efficiency frontier. Then, they determined the type and magnitude of RTS (Returns to Scale) based upon the upper and lower bounds of an intercept of a supporting hyperplane(s).
Acknowledging the contributions of many previous research efforts on DEA, this study observes that students, professors and other individuals who are not familiar with DEA attempt to use the methodology for applications in which data sets have unexpected conditions. For example, a data set may contain zero or negative values. Another example is a data set that contains a ratio variable in an input(s) and/or an output(s). Furthermore, a data set may have an outlier(s) and/or an imprecise number (e.g., 2/3), where 2/3 is mathematically precise but becomes imprecise in operating a computer code. Each of these cases requires a special treatment in DEA applications.
Besides such fundamental issues, this study discusses another problem: an occurrence of zero in multipliers. To the best of our knowledge, [9] first discussed the problem of zero in multipliers. To overcome the problem, DEA researchers have long discussed multiplier restriction methods such as assurance region analysis [10] and cone ratio [11]. Such approaches for multiplier restriction are very important in obtaining reliable results and related business/policy implications. However, these approaches always need prior information (e.g., previous experience and scientific evidence). In many DEA applications, it is not easy to access such prior information. Moreover, the information contains a subjective decision by a user(s). The proposed use of SCSCs can remove this subjectivity from DEA assessment. No previous study has described such a use of SCSCs for multiplier restriction in DEA.
The remainder of this study is organized as follows: Section 2 provides an overview of an appropriate use of DEA. Section 3 describes mathematical formulations of DEA/SCSCs. Section 4 provides computational comments on DEA/SCSCs. Section 5 concludes this study along with future research directions of DEA.

Problems in DEA Applications
This section reviews the use of DEA and its fundamentals before describing DEA/SCSCs for multiplier restriction. The following concerns are important in applying DEA to various performance assessments. A violation of any of them needs a special treatment, depending upon each case and upon their combinations.
(a) Ratio Variable: A data set used in DEA applications should not have a ratio variable in an input(s) and/or an output(s). A straightforward use of DEA does not function properly on ratio variables. This concern is important because the original DEA model (i.e., CCR) has the mathematical structure of total weighted outputs divided by total weighted inputs. Because DEA itself has a ratio structure between inputs and outputs, these variables should not be ratio variables. Dealing with a ratio variable requires a series of modified DEA models. This study does not discuss the problem because [12] has provided a detailed description of the computational modification regarding DEA. In a similar manner, the proposed approach, DEA/SCSCs, does not function properly on ratio variables. This indicates that a new approach needs to be developed for DEA/SCSCs to handle a ratio variable in an input(s) and/or an output(s). A future extension of DEA/SCSCs will explore this research task in another article.
(b) Zero or Negative in Data: Radial DEA models (i.e., CCR and BCC) need to treat zero in a data set specially. As mentioned previously, [9] discussed a special treatment in which a user adds a small number to zero. The treatment is practically acceptable but mathematically problematic in DEA assessment, because the radial models produce different efficiency scores with and without the treatment. Meanwhile, [8] has discussed a mathematical rationale for why the radial models cannot directly handle an occurrence of zero in a data set. According to [8], the radial models do not have the property of translation invariance. The property implies that an efficiency measure should not be influenced even if inputs and/or outputs shift in the same direction by adding or subtracting a specific real number. Their study (in Table 1) indicates that RAM (Range-Adjusted Measure) has the property of translation invariance, so that this non-radial model can handle an occurrence of zero in a data set. The same property makes RAM applicable to a negative value in data. In contrast, it is impossible to apply radial models (i.e., BCC or CCR) and their related DEA/SCSCs to analyze a data set that contains zero or negative values.
(c) Outlier in Data: If an outlier exists in a data set examined by DEA, it is necessary to drop it from the data set, because the outlier distorts the shape of the efficiency frontier so that the DEA evaluation does not produce a reliable result. See, for example, [13] for a discussion on how to handle the outlier issue in DEA.
(d) Imprecise Number: An imprecise number (e.g., 1/3) may suffer from a round-off error because we need to specify the number by a precise expression to run a computer code. The number is mathematically acceptable, but not acceptable in the operation of a computer code for DEA. One DEA investigator may use 0.3333 and another may use 0.3334. To avoid the round-off error, it is necessary to specify a data range between 0.3333 and 0.3334 in the example of 1/3. Clearly, DEA may produce different solutions depending upon which of the two rounded numbers is used. See, for example, [14] for a detailed description of how to handle such imprecise data in DEA. Thus, we need to depend upon a special treatment of the data set in DEA and DEA/SCSCs computations.
(e) Data Adjustment: When a large input or output variable dominates the other variables in terms of magnitude, the large variable dominates the computation of DEA and DEA/SCSCs. In that case, it is necessary for users to normalize the data set or simply to divide each observation by its average. Such a simple data adjustment produces more reliable DEA results than those obtained without any data adjustment.
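The data adjustment in item (e) can be illustrated with a minimal Python sketch. The function name `mean_normalize`, the variable names and the sample values are hypothetical and serve only to show the idea of dividing each observation by the average of its variable.

```python
from statistics import mean


def mean_normalize(columns):
    """Divide every observation of each variable by that variable's average.

    `columns` maps a variable name to its list of observations (one per DMU).
    """
    adjusted = {}
    for name, values in columns.items():
        avg = mean(values)
        if avg == 0:
            raise ValueError(f"variable {name!r} has a zero average")
        adjusted[name] = [v / avg for v in values]
    return adjusted


# Hypothetical data: total assets dwarf the number of employees in magnitude.
data = {
    "assets": [2_000_000.0, 4_000_000.0, 6_000_000.0],
    "employees": [10.0, 20.0, 30.0],
}
scaled = mean_normalize(data)
```

After the adjustment every variable has an average of one, so no single variable dominates the computation merely because of its unit of measurement.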
It is clear that the problems discussed above originate from DEA itself, not from SCSCs. The use of SCSCs depends only upon data sets on which DEA can function properly. Otherwise, DEA/SCSCs may produce unacceptable results (e.g., a negative efficiency score, an unbounded solution, or an infeasible solution).
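A simple screening step before running a radial model can catch several of these conditions. The following Python sketch is only an illustration under stated assumptions: the function name `screen_for_radial_dea`, the variable names and the sample data are hypothetical, and the user must declare which variables are ratios. Outliers are not checked here because their identification needs a separate analysis.

```python
def screen_for_radial_dea(columns, ratio_variables=()):
    """Return warnings for data conditions that radial DEA cannot handle directly.

    `columns` maps a variable name to its list of observations (one per DMU).
    `ratio_variables` names the variables the user knows to be ratios.
    """
    problems = []
    for name, values in columns.items():
        if name in ratio_variables:
            problems.append(f"{name}: ratio variable")
        if any(v == 0 for v in values):
            problems.append(f"{name}: contains zero")
        if any(v < 0 for v in values):
            problems.append(f"{name}: contains a negative value")
    return problems


# Hypothetical data set with one zero, one negative value and one ratio variable.
sample = {
    "labor": [5.0, 0.0, 8.0],
    "profit": [3.0, -1.0, 4.0],
    "roa": [0.1, 0.2, 0.3],
}
issues = screen_for_radial_dea(sample, ratio_variables={"roa"})
```

An empty list means that none of the screened conditions was found; it does not guarantee that the data set is free of outliers or round-off problems.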
Finally, it is important to add that DEA applications usually face an occurrence of zero, but not a negative value, in data. However, when we apply DEA to a financial data set, the data usually contain financial ratios with negative values. In that case, we need to depend upon DEA-Discriminant Analysis (DEA-DA), which has a special structure for analyzing various financial data sets containing negative ratio variables. See, for instance, [15] for a description of DEA-DA and its related applications to financial performance assessments.

DEA/SCSCs
This study starts by reviewing a radial DEA model and then extends it to SCSCs. The model used in this section is a radial model (the so-called BCC model: Banker-Charnes-Cooper) that is formulated under variable RTS. The input and output vectors of each DMU are assumed to be strictly positive column vectors, where the superscript T indicates a vector transpose. It is important to note that a data set used for DEA performance evaluation should not violate the conditions summarized in Section 2.
The important feature of DEA is that it determines the level of efficiency of the k-th DMU (Decision Making Unit) relatively, by comparing it with the other DMUs in terms of their multiple inputs and outputs. Model (1), a radial model, expresses the mathematical (input-based) structure of DEA used to measure the efficiency score θ of the k-th DMU.
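For readers who want the model in explicit form, a standard input-oriented BCC formulation that is consistent with the description in this section can be sketched as follows. The symbols are assumptions made for illustration: x_{ij} is the i-th input and g_{rj} the r-th output of the j-th DMU, with m inputs, s outputs and n DMUs. The notation of Model (1) may differ.

```latex
\begin{aligned}
\min\ & \theta \\
\text{s.t.}\ & -\sum_{j=1}^{n} x_{ij}\lambda_j + \theta x_{ik} \ge 0 \quad (i = 1,\dots,m), \\
& \sum_{j=1}^{n} g_{rj}\lambda_j \ge g_{rk} \quad (r = 1,\dots,s), \\
& \sum_{j=1}^{n} \lambda_j = 1, \\
& \lambda_j \ge 0 \ (j = 1,\dots,n), \quad \theta:\ \text{URS}.
\end{aligned}
```

This form has m + s + 1 constraints: one for each input, one for each output and one convexity constraint that reflects variable RTS.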
Here, the subscript k indicates the specific k-th DMU examined by Model (1). The scalar λ_j, often referred to as a "structural" or "intensity" variable, is used to make an analytical linkage among all DMUs in a data space. An efficiency score θ is unrestricted (URS) and is often referred to as the "technical efficiency" of the k-th DMU.
Here, d_i^x stands for the i-th input slack and d_r^g stands for the r-th output slack. Models (1) and (2) are mathematically equivalent. However, there are two differences between them. One is that Model (1) treats the slacks as slack or surplus variables, whereas Model (2) treats them as decision variables in its computation. As a result, Model (2) can incorporate SCSCs more restrictively than Model (1). The other difference is that the dual variables are non-negative in Model (1), but they are unrestricted (so positive, zero or negative) in Model (2). In addition to these two differences, it is important to mention that the original radial formulation (e.g., BCC) maintains the slack variables in the objective function. The importance of that objective function is that the dual variables always become larger than or equal to a small positive number ε, so that they are always positive.
In other words, the problem associated with Models (1) and (2) discussed in this study is that, without such a lower bound, they may produce zero in their dual variables.
To describe the dual issue discussed above, this study returns to Model (1) and formulates its dual model. In the dual, one group of variables (one for each input i) is related to the first set of constraints in Model (1), and another group (one for each output r) is related to the second set of constraints in Model (1). A further dual variable, which is unrestricted (URS), is derived from the third constraint of Model (1). DEA researchers conventionally refer to each dual variable as a "multiplier".
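As a hedged illustration, the dual of a standard input-oriented BCC model can be sketched as follows. The symbols are assumptions that may differ from the original notation: x_{ij} and g_{rj} are the i-th input and r-th output of the j-th DMU, v_i and u_r are the input and output multipliers, and σ is the unrestricted dual variable of the convexity constraint.

```latex
\begin{aligned}
\max\ & \sum_{r=1}^{s} u_r g_{rk} + \sigma \\
\text{s.t.}\ & -\sum_{i=1}^{m} v_i x_{ij} + \sum_{r=1}^{s} u_r g_{rj} + \sigma \le 0 \quad (j = 1,\dots,n), \\
& \sum_{i=1}^{m} v_i x_{ik} = 1, \\
& v_i \ge 0 \ (i = 1,\dots,m), \quad u_r \ge 0 \ (r = 1,\dots,s), \quad \sigma:\ \text{URS}.
\end{aligned}
```

Nothing in this form prevents some v_i or u_r from being zero at an optimum, which means that the corresponding input or output is ignored in the evaluation of the k-th DMU.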
In the dual formulation of Model (2), all dual variables become unrestricted. To maintain consistency between Models (1) and (2) in their dual formulations, this study imposes non-negativity on the dual variables of Model (2), as formulated in Model (3).
Model (10) is subject to all constraints in Models (1) and (3), together with an equation requiring that the objective of Model (1) be equal to that of Model (3). The last group of constraints indicates that an optimal solution obtained from Model (10) satisfies SCSCs (7)-(9). An unknown decision variable is incorporated into Model (10) in order to maintain SCSCs at optimality.
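As a hedged sketch of what such conditions look like for a standard input-oriented BCC model and its dual, the block below first states strict complementarity pair by pair and then shows one way to enforce it through a single auxiliary variable. The symbols x_{ij}, g_{rj}, v_i, u_r, σ and η are assumptions made for illustration and may differ from the notation of Models (1), (3) and (10).

```latex
% Complementary slackness (products vanish) and strict complementarity (sums are positive)
\begin{aligned}
v_i\Big(\theta x_{ik} - \sum_{j=1}^{n} x_{ij}\lambda_j\Big) &= 0, &
v_i + \theta x_{ik} - \sum_{j=1}^{n} x_{ij}\lambda_j &> 0 && (i = 1,\dots,m), \\
u_r\Big(\sum_{j=1}^{n} g_{rj}\lambda_j - g_{rk}\Big) &= 0, &
u_r + \sum_{j=1}^{n} g_{rj}\lambda_j - g_{rk} &> 0 && (r = 1,\dots,s), \\
\lambda_j\Big(\sum_{i=1}^{m} v_i x_{ij} - \sum_{r=1}^{s} u_r g_{rj} - \sigma\Big) &= 0, &
\lambda_j + \sum_{i=1}^{m} v_i x_{ij} - \sum_{r=1}^{s} u_r g_{rj} - \sigma &> 0 && (j = 1,\dots,n).
\end{aligned}

% One way to impose the strict inequalities in a combined primal-dual model
\begin{aligned}
\max\ & \eta \\
\text{s.t.}\ & \text{all primal and dual constraints}, \\
& \theta = \sum_{r=1}^{s} u_r g_{rk} + \sigma, \\
& v_i + \theta x_{ik} - \sum_{j=1}^{n} x_{ij}\lambda_j \ge \eta \quad (i = 1,\dots,m), \\
& u_r + \sum_{j=1}^{n} g_{rj}\lambda_j - g_{rk} \ge \eta \quad (r = 1,\dots,s), \\
& \lambda_j + \sum_{i=1}^{m} v_i x_{ij} - \sum_{r=1}^{s} u_r g_{rj} - \sigma \ge \eta \quad (j = 1,\dots,n), \\
& \eta \ge 0.
\end{aligned}
```

Equating the two objectives forces both the primal and the dual solutions to be optimal, so the products in the first group vanish automatically, and a positive optimal η certifies that every sum is positive. For an efficient DMU whose input and output slacks are all zero, this makes every v_i and u_r positive, which is how the conditions act as a multiplier restriction without prior information.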
It is possible to replace Model (1) with Model (2) in Model (10) to incorporate the influence of slacks more clearly. Mathematically, the two models do not differ except in the sign of the dual variables. However, they are computationally different, as mentioned previously. Furthermore, the multiplier restriction by SCSCs functions on efficient DMUs, not on inefficient DMUs. Nevertheless, only efficient DMUs constitute the efficiency frontier, based upon which DEA evaluates the performance of all DMUs.

Comments on Computation
The number of constraints determines the computational time of DEA because it determines the size of the basis matrix in linear programming. DEA/SCSCs has more side constraints than the original DEA. For example, Model (1) has m + s + 1 constraints, whereas the number of constraints in Model (10) also depends on n, where n is usually larger than the number of inputs and outputs. Consequently, the size of the basis matrix of DEA/SCSCs becomes much larger than that of DEA. In that case, as discussed in [16], a column reduction technique of linear programming becomes useful in the computation after changing Model (10) to a dual formulation. See, for example, [17,18] for a detailed discussion on special algorithms for DEA.
To overcome the computational problem of DEA/SCSCs, this study proposes two possible approaches. One is the use of the primal-dual interior-point method proposed by [4], because the method can simultaneously solve the primal part and the dual part of Model (10), so that the computation time of Model (10) is almost the same as that of Model (1). Second-Order Cone Programming (SOCP) is the most promising approach among primal-dual interior-point methods. See [19] for a description of how to use SOCP for DEA.
The other approach is to utilize network computing, proposed by [16], which connects multiple computers and synchronizes them as a single computing entity for DEA. Their study reports a large simulation study of network computing. If we apply network computing to Model (10) and ordinary linear programming software to Model (1), then the former is much faster than the latter. Thus, DEA/SCSCs has many computational options in dealing with a large data set (e.g., more than 100,000 DMUs). The combination of a primal-dual interior-point method (e.g., SOCP) and network computing equipped with a column reduction technique can very effectively solve various large DEA/SCSCs problems.
Finally, it is not necessary to use the special computational schemes discussed above when dealing with a small data set (e.g., fewer than 1,000 DMUs), because a modern computer is very efficient and fast. Computer developments have recently been much faster than algorithmic developments. Thus, the computation of DEA/SCSCs for such a data set can simply depend upon a modern computer.

Conclusion and Future Extensions
This study provided a set of guidelines for the proper use of DEA and DEA/SCSCs. DEA/SCSCs is mathematically correct, but a successful application of SCSCs depends upon careful use, as summarized in this study. All the guidelines discussed for DEA/SCSCs are applicable to the proper use of DEA as well. Besides the guidelines, this study discussed implications of SCSCs from the primal and dual aspects of DEA/SCSCs. These two aspects of SCSCs have never been explored in previous DEA studies.
Neither DEA nor DEA/SCSCs is perfect. There are many problems associated with their uses in addition to the problems discussed in this study. For example, DEA assumes that all DMUs use the same inputs and the same outputs. This underlying assumption is often unrealistic in modern business. For example, one firm uses three inputs to produce two outputs, while another firm uses four inputs to produce three outputs. Thus, different firms use different combinations of inputs and outputs, and the numbers of inputs and outputs usually differ among firms. It is impossible to apply DEA in such a business environment. That is a major problem associated with DEA, and it is therefore an important future research task for this study.
It is also true that the managers of each firm have their own learning capabilities to adjust their strategic behaviors over a dynamic time horizon. Previous studies did not pay attention to the learning capabilities of managers. This observation on business reality suggests that DEA research needs to direct itself toward a combination of DEA and artificial intelligence (e.g., an agent-based approach for complex analysis) in computer science. It can be easily envisioned that such a combined research effort will open up a new research area for DEA.
In conclusion, it is hoped that this study makes a contribution to DEA. We look forward to seeing future research extensions, as suggested in this study.
The DEA model has the following mathematical (input-based) structure to measure an efficiency score θ of the k-th DMU (Decision Making Unit,