Public Transit Performance Evaluation Using Data Envelopment Analysis and Possibilities of Enhancement


This study evaluates the operational performance of all routes of Sajha Bus Yatayat operating inside the Kathmandu valley using Data Envelopment Analysis (DEA), in terms of efficiency and effectiveness scores. This approach allows us to assess the relative performance of a transit system in the absence of historical data or prior research to compare with. To explore the possibility of enhancing performance, scenarios were created for the relatively underperforming routes and for the long route problem by changing the most important input variable and adjusting the output variables accordingly, using a regression model where relevant. Partial Least Squares (PLS) regression was used to determine the input variables most influential on the output variables. DEA was then conducted to assess the performance of all routes under these scenarios. Under the first set of scenarios, the underperforming routes other than the longest route become more efficient without a considerable negative deviation in effectiveness. The results of the second set of scenarios, for the long route problem, suggest that the longest route's performance can be enhanced significantly through proper route alignment. Scenario development and evaluation can help transit companies explore strategies for enhancing operational performance.

Share and Cite:

Sujakhu, S. and Li, W. (2020) Public Transit Performance Evaluation Using Data Envelopment Analysis and Possibilities of Enhancement. Journal of Transportation Technologies, 10, 89-109. doi: 10.4236/jtts.2020.102006.

1. Introduction

Kathmandu valley, in central Nepal, suffers from high population density because of centralization, which causes major traffic problems such as congestion, delay, accidents, high fuel demand, air pollution and noise pollution. To meet their mobility needs, people prefer private vehicles such as motorcycles over public vehicles. From 1989/90 to 2017/2018, vehicle registrations in Bagmati zone rose from 34,606 to 1,172,413, of which 12,617 are buses and 921,917 are motorcycles (74% in overall vehicle composition in the valley) [1]. The sharp rise in the number of private vehicles is the major cause of traffic problems. Effective and efficient public transit discourages private vehicle use and reduces the externalities of traffic problems. Thus, the growth of public transit should be given special attention. Evaluating the performance of a public transport system facilitates operational improvement and strategic decisions. Hence, public transit agencies need an effective method of measuring and evaluating the performance of their service, so that they can allocate scarce resources to meet demand and provide an adequate level of service under severe operational stress and financial constraints.

Public bus service should be operated efficiently and effectively, considering both the demand and the supply perspective, to make it a more favorable and attractive choice. Maintaining the balance between demand and supply can be a hard task. Reducing operation and maintenance costs tends to reduce ridership, whereas increasing ridership entails higher operational cost. Regarding effectiveness, passengers should feel satisfied that their daily travel requirements are met at low cost. As such, effectiveness can be measured by service utilization (ridership), service quality, and service satisfaction [2]. Regarding efficiency, transit operators typically aim at minimizing operational costs conditional on meeting passengers' daily travel demand. As a result, efficiency measures describe the relationship between resource inputs and produced outputs, and include indicators of overall cost efficiency, labor utilization, and vehicle utilization [2]. Effectiveness and efficiency should therefore be considered independently in a public transit system.

A public transit system involves multiple input and output variables. Data Envelopment Analysis (DEA) provides an approach for aggregating all input and output variables into a single scale to measure performance levels [3]. To date, very few studies have been conducted on the operational performance evaluation of public transit in Nepal. This study is therefore conducted to fill that gap by developing a DEA model for measuring the efficiency and effectiveness of the public transit service provided by Sajha Bus Yatayat in Kathmandu Valley.

In this paper we applied partial least squares to build a relation between the input variables and the output variables and to identify the most influential input variables. We then established a linear regression model between the most influential variables and the output variables, which is later used to create scenarios in the DEA model. The paper focuses mainly on presenting an approach that uses DEA to investigate the operational performance of public transit, identify the drawbacks of the system, and explore ways of mitigating them by creating new possible scenarios. Scenarios were created for the relatively underperforming routes and for the longest route. These scenarios indicate possible ways of improving the operational performance of the routes.

2. Literature Review

There are two approaches to assessing the performance of a transit system: comparing it to standards, or measuring and assessing relative efficiencies if no standards are available [4]. In our context no standard benchmark has been established for comparison, so the second approach is chosen. There are several parametric and non-parametric methods to measure and assess performance. Parametric techniques such as the t-test, correlation coefficients and ordinary least squares have been used by researchers. Those techniques entail certain assumptions about the functional forms of the production or cost functions, which has led researchers to widely use non-parametric techniques such as Data Envelopment Analysis (DEA) to evaluate the performance of public transit systems [5].

Barnum et al. [6] evaluated the performance of 46 bus routes of US transit systems using the DEA method with the additional perspective of environmental influences. Lao & Liu [7] applied the DEA method to compute and analyze operational efficiency and spatial effectiveness scores for each bus line based on the service costs (inputs) and benefits (outputs); their results indicate that no clear positive or negative association exists between operational efficiency and spatial effectiveness for bus lines. Hawas et al. [4] used DEA to measure and analyze the efficiency and effectiveness of the Al Ain public bus service, concluding that reducing operating hours has very little impact on the current efficiency and effectiveness measures, which may help authorities cut operating cost. Georgiadis et al. [8] used DEA to evaluate the performance of the individual bus lines composing the public transport network in Thessaloniki, Greece, and concluded that the efficiency of local bus lines is slightly better than their operational effectiveness, without indicating a clear positive or negative relationship between the two performance components. Several studies used more complex and advanced DEA methods such as Super Efficiency Data Envelopment Analysis (SEDEA) [9], Robust SEDEA [10], the Combined Efficiency Method (CEM) [11], and the analytic hierarchy process (AHP) with DEA [12]. A review of different types of DEA models was published by Adler et al. [13]. Mahmoudi et al. [14] provide a literature review and classification of the applications of DEA in transportation systems. DEA has been a widely popular and effective performance evaluation method not only in the transportation sector but also in other scientific research fields.

DEA is a non-parametric linear programming approach for relative efficiency estimation and ranking of decision-making units (DMUs) in operations research and economics. The approach was first proposed by Farrell [15] as a piecewise-linear convex hull approach to frontier estimation, and was popularized as DEA by Charnes et al. [16] in the CCR model and by Banker et al. [17] in the BCC model.

A transit system produces multiple outputs while consuming multiple inputs. There has been debate over which of those parameters define the overall performance of public transit. Generally, labor, capital and energy are used as inputs, whereas vehicle kilometers, seat kilometers and passenger kilometers are used as outputs [2] [18] [19]. Lao & Liu [7] used operation time, round-trip distance, and number of bus stops as inputs to measure operational efficiency, whereas commuters who use buses, population 65 and older and persons with disabilities were used as inputs to measure spatial effectiveness. Sakano & Obeng [20] and Sanchez [21] used fuel consumption, number of full-time workers and number of operating buses as input variables. Hawas et al. [4] used average travel time per round trip, number of vehicles, number of operators, and total number of stops per round trip as input variables. Researchers can modify the input variables according to the requirements and scope of their study, as long as they include the major operating and maintenance costs of the system.

Sanchez [21] used many output variables, such as vehicle kilometers, seating capacity, service hours, number of passengers, and average age of the fleet, to evaluate the bus service performance of Spanish transport systems. Lao & Liu [7] took the total number of passengers as the output performance indicator for measuring both operational efficiency and spatial effectiveness. Hawas et al. [4] used the total average number of passengers per day as the effectiveness measure and vehicle km per day as the efficiency measure. DEA can employ various output variables as performance indicators, depending on the scope of the study. Because the choice of input and output variables is a critical step, it deserves special attention in light of the direction of the study.

3. Methodology

3.1. Partial Least Squares (PLS)

PLS offers several advantages, such as the capability to analyze multiple responses, handle collinearity and detect outliers, and the ease of visual interpretation of the data. PLS regression first determines latent factors, a variant of principal components, to reduce the dimensionality of the independent as well as the dependent variables [22]. Partial least squares regression thus handles highly auto-correlated variables with ease. It then uses these factors as independent variables in a linear model to explain variation in the dependent variables [23].

Variable importance in the projection (VIP) is assessed to determine the importance of the different variables in the model with respect to both the dependent and the other independent variables. For a set of explanatory variables X(n, p) linked to a response y(n, 1) through the linear relationship y = α + Xβ + ε, with unknown regression parameters α and β and error term ε, VIP accumulates the importance of each variable j as reflected by the loading weight w from each component. The VIP measure $v_j$ can be expressed as [24]:

$$v_j = \sqrt{\frac{p \sum_{a=1}^{A} \left[ SS_a \left( w_{aj} / \lVert w_a \rVert \right)^2 \right]}{\sum_{a=1}^{A} SS_a}}$$

where $SS_a$ is the sum of squares explained by the $a$th component. Here, the $v_j$ weights are a measure of the contribution of each variable according to the variance explained by each PLS component, where $(w_{aj} / \lVert w_a \rVert)^2$ represents the importance of the $j$th variable; the value of $A$ is set during the loading weight calculation algorithm [25]. The VIP scores are based on a weighted sum of squares of the PLS loading weights and are calculated for each variable, taking into account the amount of explained Y variance. The VIP vector summarizes all the factors and Y-variables, thereby enabling us to identify the predictor variables that influence the prediction models. A threshold of VIP ≥ 0.8 has been suggested to discriminate between relevant and irrelevant predictors [22] [23]. The variable with the highest VIP score is considered the most influential variable. Standardized model coefficients are used to interpret the results of the regression [22]. We used XLSTAT to conduct PLS and determine the most influential input variable.
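The VIP calculation described above can be illustrated with a minimal sketch. It assumes the loading weights and the per-component explained sums of squares have already been produced by a PLS fit; the function name `vip_scores` and all numbers are hypothetical and are not the values behind the reported results, which were obtained with XLSTAT.

```python
import math


def vip_scores(weights, ss):
    """Compute VIP scores from PLS loading weights.

    weights[a][j] is the loading weight of variable j on component a.
    ss[a] is the sum of squares explained by component a.
    """
    p = len(weights[0])          # number of explanatory variables
    total_ss = sum(ss)           # total explained sum of squares
    scores = []
    for j in range(p):
        acc = 0.0
        for w_a, ss_a in zip(weights, ss):
            norm_sq = sum(w * w for w in w_a)       # squared norm of w_a
            acc += ss_a * (w_a[j] ** 2) / norm_sq   # SS_a * (w_aj / ||w_a||)^2
        scores.append(math.sqrt(p * acc / total_ss))
    return scores


# Hypothetical two-component model with four explanatory variables.
weights = [[0.6, 0.6, 0.5, 0.2], [0.1, -0.1, 0.7, 0.7]]
ss = [80.0, 15.0]
vip = vip_scores(weights, ss)

# Variables at or above the 0.8 threshold are treated as relevant.
relevant = [j for j, v in enumerate(vip) if v >= 0.8]
print(vip, relevant)
```

A useful sanity check on any implementation is that the squared VIP scores sum to the number of variables p, because the normalized squared weights of each component sum to one.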

3.2. DEA Model

DEA measures the relative efficiencies of a set of peer units called decision-making units (DMUs). DMUs require certain resources (inputs) to produce results (outputs). DEA establishes an empirical piecewise-linear production frontier to monitor the conversion of inputs into outputs, and calculates the relative efficiency of a DMU by comparing its production with the estimated production frontier. Hence, it directly compares a DMU's performance against the best practice of its peers or their combinations. A DEA model assumes no functional form relating inputs to outputs, nor does it relate the inputs and outputs of different units.

DEA models can be classified into two types based on their orientation: input-oriented and output-oriented. Input-oriented models minimize the inputs while producing at least the observed outputs. Output-oriented models maximize the outputs while consuming at most the observed input levels. Sajha Bus Yatayat has been reestablished only recently. To attract passengers towards public transit, it has to provide good service regardless of operational cost. We therefore choose the output-oriented BCC model, to maximize ridership. The BCC model is based on the Variable Return to Scale (VRS) assumption. VRS means that the estimated production frontier can pass anywhere relative to the origin in input-output space [7]. The BCC model addresses a drawback of the Constant Return to Scale (CRS) assumption: it recognizes that the productivity at the most productive scale size may not be attainable at the other scale sizes at which DMUs are operating. It estimates the pure technical efficiency of a DMU at the scale at which that DMU is operating [26].

Mathematically BCC model [17] is as follows:



Subject to





j: Index of decision-making unit (DMU), j = 1, …, J

n: Index of input, n = 1, ..., N

m: Index of output, m = 1, …, M

xnj: The nth input for the jth DMU

ymj: The mth output for the jth DMU

um,vn: Non-negative scalars (weights) for the mth output and the nth input

θk: Efficiency/Effectiveness ratio of DMUk

DMUk is designated as the target DMU. The objective function (1) maximizes the ratio of weighted outputs to weighted inputs. The weights um and vn are the decision variables, while the values of the inputs and outputs are the actual observed values. The weights are adjusted until the ratio of weighted outputs to weighted inputs is maximized for the target DMUk, with the same weights applied to all DMUs. The value of the ratio, θ, in (1) is referred to as the efficiency/effectiveness score of DMUk, where 0 ≤ θ ≤ 1. A DMU is fully efficient when θ = 1. Constraint (3) maintains the Variable Return to Scale (VRS) property of the DEA model. Constraint (4) imposes the non-negativity restrictions on the weights.
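For orientation, one standard way of writing a BCC ratio model in the notation defined above is sketched here. It is the textbook ratio form associated with [17] and may differ in layout and numbering from the formulation referred to as (1)-(4). The free scalar $u_0$, which carries the variable-returns-to-scale condition, does not appear in the symbol list above and is an assumption of this sketch.

```latex
\begin{aligned}
\max_{u,\,v,\,u_0}\quad & \theta_k = \frac{\sum_{m=1}^{M} u_m\, y_{mk} - u_0}{\sum_{n=1}^{N} v_n\, x_{nk}} \\
\text{subject to}\quad & \frac{\sum_{m=1}^{M} u_m\, y_{mj} - u_0}{\sum_{n=1}^{N} v_n\, x_{nj}} \le 1, \qquad j = 1, \dots, J, \\
& u_m \ge 0,\quad v_n \ge 0, \qquad m = 1, \dots, M,\ \ n = 1, \dots, N, \\
& u_0 \ \text{unrestricted in sign}.
\end{aligned}
```

Fixing $u_0 = 0$ removes the scale term and recovers the constant-returns-to-scale (CCR) ratio model, which is why the extra scalar is what distinguishes the VRS frontier from the CRS one.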

The Data Envelopment Analysis Computer Program (DEAP) was used to estimate the efficiency and effectiveness measures. DEAP offers three principal options [27]. We used the first option [28], the standard CRS and VRS models that involve technical and scale efficiencies. Vehicle kilometers per day and average passengers per day were used as the efficiency and effectiveness measures respectively. A scale following Lao & Liu [7] was used to classify the routes on the basis of their efficiency and effectiveness scores. A VRS technical efficiency score of 1 signifies a completely efficient or effective route. A score from 0.6 up to, but not including, 1 represents a fairly efficient or fairly effective route. A score below 0.6 signifies an inefficient or ineffective route.
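The classification scale above can be written as a small helper. This is an illustrative sketch only: the function name `classify_score`, the label strings and the rounding tolerance are assumptions, and the example scores are hypothetical rather than values from the results tables.

```python
import math


def classify_score(score, lower=0.6, tol=1e-9):
    """Map a VRS DEA score in [0, 1] to the three-level scale.

    A score of 1 is 'full', a score from `lower` up to (not including) 1
    is 'fair', and anything below `lower` is 'poor'.
    """
    if score < 0.0 or score > 1.0 + tol:
        raise ValueError("DEA scores are expected to lie in [0, 1]")
    if math.isclose(score, 1.0, abs_tol=tol):
        return "full"
    if score >= lower:
        return "fair"
    return "poor"


# Hypothetical scores for three routes.
example_scores = {"A": 1.0, "B": 0.83, "C": 0.41}
labels = {route: classify_score(s) for route, s in example_scores.items()}
print(labels)
```

The same helper applies to both the efficiency and the effectiveness score of a route, so a route can be fully efficient while being only fairly effective.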

After the efficiency score of the routes under the current baseline condition is assessed, we create scenarios for the underperforming routes and for the longest route. To create scenarios for an underperforming route, we increase the value of the input variables by 1, 2 and 3 on a trial-and-error basis. The corresponding values of the output variables are estimated with the linear regression model. The long route problem is handled by splitting and merging routes. DEA is then applied to check the performance deviations under the scenarios and to evaluate whether the scenarios are preferable.

3.3. Input and Output Variables

Selecting the input and output variables is a critical step. Generally, labor, capital and energy are used as inputs, and vehicle kilometers, seat kilometers and passenger kilometers are used as outputs [2] [18] [19]. However, because actual cost data for labor, fuel and other expenses are often lacking, researchers turn to different sets of input variables [7] [19]. There are three approaches to choosing the input and output variables, depending on the availability of data and the scope of the research: 1) separate sets of input and output variables [29], 2) separate input variables and the same output variables [7], 3) the same input variables and separate output variables [4] [19].

Karlaftis [19] used total number of vehicles, total number of employees and total annual amount of fuel as input variables for measuring both efficiency and effectiveness. Hawas et al. [4] used average travel time per round trip, number of vehicles, number of operators and total number of stops as input variables for measuring both efficiency and effectiveness. Our data availability fits the third approach: the same input variables with separate output variables.

The operation cost of a bus route is commonly represented by the number of stops [7], or by the number of stops, number of buses and average travel time [21]. Karlaftis [19] used vehicle kilometers and passenger boardings as output variables. Vehicle kilometers indicate the service produced, which can be termed efficiency. Passenger boardings represent service consumption, the measure of system effectiveness.

For our study, we use the following input and output indicators as in Table 1.

Each of the inputs represents one of the operation costs. The average number of operators per day reflects labor cost. The average number of buses denotes capital cost or investment. The average travel time per round trip and the total number of stops serve as proxies for fuel consumption cost. In our study we do not consider non-driving labor. Deadheading and service type (local or express routes) are likewise not included.

Efficiency: Vehicle km per day on each route is the output variable which indicates the measure of efficiency. It signifies the efficiency of utilization of operational and investment cost.

Effectiveness: Average passengers per day on each route is the output variable used as measure of cost effectiveness or effectiveness. We intend to maximize the ridership or effectiveness of the system in this study.

4. Case Study

4.1. Sajha Bus Yatayat

Table 1. Input and output indicators for DEA model.

Sajha Bus Yatayat restarted its inter-district bus service within the three districts of Kathmandu valley in 2013 A.D. with a strong motive to provide an affordable, efficient and safe mode of service [30]. Currently it has 45 buses, each a 55-seater with Euro 3 emission standard, operating inside the Kathmandu valley. It operates on 8 routes inside the valley from 6 am to 8 pm. In our DEA model we treat the routes as DMUs. Unlike other transit systems in the valley, it has a systematic passenger boarding and drop-off system: the buses stop only at designated stops, and passengers are required to enter through one door and exit through the other. All buses are fitted with closed-circuit cameras for safety. It is the only transit system with a ticketing system, which uses paper tickets. The ticket fare is 25 Nepali rupees. It provides three types of service: the intercity Kathmandu valley bus service, a long-distance bus service and a long-distance night bus service. Sajha Yatayat's Kathmandu valley service includes 8 routes. We label each route with a number for convenience. Table 2 shows the eight routes.

Table 2. Routes of Sajha Bus Yatayat.

4.2. Data

We include all eight routes of Sajha Bus Yatayat inside the Kathmandu valley in our study. We obtained secondary data from the Sajha Bus Yatayat office, in the form of the company's Daily Income Sheets. The data cover the first month of the Nepali calendar year 2076 B.S., and the required values were extracted from the Daily Income Sheets. The main motive for choosing the first month of the new year was to include the possible fluctuation in traffic flow that occurs around events such as the new year. The Daily Income Sheets contain the input variables number of buses and number of operators, and the output variables vehicle kilometers per day and passengers per day. The route schedules were used to extract the number of stops. Average travel time was obtained from the detailed scheduling assessment done by the transit company. For each bus, a single operator was assigned for a whole day, which makes the number of operators and the number of buses the same in our data. The numbers of buses and operators assigned varied from day to day without any pattern, so we took the average number of buses and operators. In this study we do not address factors such as vehicle type, bus size, or local versus express route. The 30 days of data were averaged for our DEA model (Table 3).

Table 3. Data for DEA model.

4.3. DEA Results for Baseline Condition

DEAP is run to obtain the efficiency score (Table 4) and effectiveness score (Table 5).

We further classify them as in Table 6 on the basis of the efficiency and effectiveness scores.

Routes 2, 3 and 6 are the most effective and efficient routes. Routes 2 and 3 have the highest outputs for both the efficiency and the effectiveness measure while consuming only an average amount of two inputs (number of stops and average travel time) compared to the other routes. Their consumption of the other two inputs (average number of operators and average number of buses), the highest among all routes, is offset by that average consumption and by their highest outputs. In other words, routes 2 and 3 are the best at balancing the demand and supply factors at an optimum level. Route 6 has the lowest input consumption of all the routes. Although its output is also the lowest, its low input consumption is sufficient to make it efficient and effective. The small number of stops on route 6 may be a major cause of low fuel consumption, or at least of low fuel consumption during acceleration and deceleration at stops. Even though the route is the shortest in length, its average travel time is not the lowest, because of the extra minutes the bus spends waiting for and picking up passengers with heavy luggage at stops. Because the route runs only one bus, that bus travels at full capacity with passengers commuting towards the airport, which helps make the route effective.

Routes 1 and 7 are efficient and fairly effective. Despite having a high number of stops, neither route has a correspondingly high number of passengers. Bungamati and Budhanilkantha on route 1 and Godawari on route 7 are not among the busiest or most commuted places. One disadvantage of route 1 is that it has the highest number of stops. The population along the route is sparse and evenly spread. To collect passengers, the route has to make more stops than other routes, which in turn results in high fuel consumption. Route 1 could easily become both an efficient and an effective route if its stops were properly managed and spaced a bit farther apart.

Table 4. Efficiency score for baseline condition.

Table 5. Effectiveness score for baseline condition.

Table 6. Classification of routes on the basis of efficiency and effectiveness score.

Routes 4, 5 and 8 are relatively underperforming compared to the other routes. These routes are the longest of all. A longer route means a high number of stops, a high travel time, and an uneven distribution of passengers throughout the route. This can explain why these routes are unable to be effective.

The results show that the system is operating fairly well, with no inefficient or ineffective routes. The only drawback is the longer routes, and even these are not in poor condition. The system is ready to serve higher demand, although higher demand would naturally require a greater number of buses and operators. In the next part of our study we check the possibilities of improving the relatively underperforming routes under different scenarios.

4.4. Partial Least Square (PLS) Regression and Linear Regression Model

In this part, we create a PLS model between the input and output variables and assess the linear relation between the most important explanatory (input) variable and the dependent (output) variables.

Table 7 shows that high collinearity exists between the average number of buses and the average number of operators (Variance Inflation Factor (VIF) greater than 10), which indicates that ordinary regression coefficients would be poorly estimated due to multicollinearity. Hence it was necessary to conduct Partial Least Squares regression to formulate the relation between the variables and to determine the magnitude of the influence of the independent variables on the dependent variables. Second-level interaction at the 95% confidence level was applied, which provided a better model than the one without interaction. Second-level interaction means that at most two variables interact at a time. The model contained three components. The first and second components explained 96.26% of the model, and the Q2 value of the first component is 72.5% (>60%), so we need not go on to components 2 and 3. Each additional component only improves on the previous one, so when a component is already good enough there is no need to go beyond it. In Table 7, the average number of buses per day and the average number of operators per day are highly correlated with both response variables.
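The multicollinearity check mentioned above rests on the usual definition of the variance inflation factor, VIF = 1 / (1 − R²), where R² comes from regressing one explanatory variable on the others. The sketch below is a minimal illustration; the function names and the example R² values are hypothetical and are not taken from Table 7.

```python
def vif_from_r_squared(r_squared):
    """Variance inflation factor for one explanatory variable.

    r_squared is the R^2 obtained by regressing that variable on all
    the other explanatory variables.
    """
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def is_collinear(r_squared, threshold=10.0):
    """Flag a variable whose VIF exceeds the common threshold of 10."""
    return vif_from_r_squared(r_squared) > threshold


# Hypothetical R^2 values: a weakly related and a strongly related variable.
print(vif_from_r_squared(0.5), is_collinear(0.5))
print(is_collinear(0.95))
```

A VIF above 10 corresponds to an R² above 0.9. When two explanatory variables are identical in the data, R² equals 1 and the VIF is unbounded, which this sketch reports as an error rather than a number.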

Figure 1 and Figure 2 show that, for both response variables, the average number of buses, the average number of operators and the average travel time were the important variables, with Variable Importance in the Projection (VIP) greater than 0.8. The average number of buses and the average number of operators were equally the most influential variables in explaining both vehicle km per day and average passengers per day. The standardized coefficients of the models show that the greater the average number of buses and operators, the greater the vehicle km per day and average passengers per day.

Table 7. Correlation matrix of all the variables (explanatory and response) used in PLS regression and results of multi-collinearity test among explanatory variables.

Figure 1. PLS analysis showing VIP and model coefficient for vehicle km per day. AB: Average number of bus per day, AO: Average number of operators per day, AT: Average travel time, TS: Total number of stops, VIP threshold: 0.8. Goodness of fit statistics shows R2 = 0.991, Standard Deviation = 49.729 and RMSE = 35.164.

Figure 2. PLS analysis showing VIP and model coefficient for average passengers per day. AB: Average number of bus per day, AO: Average number of operators per day, AT: Average travel time, TS: Total number of stops, VIP threshold: 0.8. Goodness of fit statistics shows R2 = 0.993, Standard Deviation = 217.085 and RMSE = 153.502.

In our DEA model we create scenarios by changing the values of the most influential variables (number of buses/operators) while keeping the other input variables constant. Scenarios are created for the relatively underperforming routes to explore the possibility of enhancing their performance. To estimate the new values of the output variables under these scenarios, we establish a linear regression of the outputs on the input variables. The linear regression was fitted on the 30 days of raw data, between the changing input variable, i.e. the number of buses/operators, and the output variables.

Table 8 shows the linear regression models for the relatively underperforming routes. We use these linear models to estimate the new output variable values from the changed input variables under the different cases. Although the R-squared value is low for both output variables on route 4, the model is considered valid on the basis of its p-value.
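The regression step can be illustrated with a minimal ordinary least squares sketch for a single explanatory variable. The function names and the daily observations below are hypothetical; they are not the 30 days of raw data and do not reproduce the coefficients in Table 8.

```python
def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Evaluate the fitted line at a new input value."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical daily observations for one route:
# number of buses/operators and vehicle km run on that day.
buses = [2, 3, 2, 4, 3, 2]
vehicle_km = [410.0, 600.0, 395.0, 820.0, 615.0, 402.0]

model = fit_line(buses, vehicle_km)
baseline_buses = sum(buses) / len(buses)

# Estimated output if the average number of buses is raised by 1.
print(predict(model, baseline_buses + 1))
```

Fitting one such line per route and per output variable gives the pairs of models needed to fill in the output values of each scenario.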

4.5. Scenarios Tests

In this section, we investigate whether the performance of the relatively underperforming routes, and the long route problem, can be improved under different scenarios created by changing the value of selected input parameters while keeping the values of the other inputs constant.

Scenario for relatively underperforming routes

To create the scenarios, we chose to change both the number of buses and the number of operators, as they are the most influential variables according to our PLS results. They had to be changed by the same amount, since one operator is responsible for operating one bus for the whole day. Changing the number of stops would require a detailed demographic study of the demand in each locality, which is outside the scope of this study, and average travel time is controlled by the traffic conditions on the routes. Scenarios 1, 2 and 3 represent increasing the buses and operators on routes 4, 5 and 8 respectively. In addition, the combined scenarios 1 and 2, 1 and 3, 2 and 3, and finally 1, 2 and 3 were also considered, making a total of 7 scenarios. We increased the number of buses/operators by 1, 2 and 3. The values of the output variables were calculated from the regression model (Table 8). The new data under the various increments of the changing variables are given in Table 9.
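The scenario design described above, seven route combinations crossed with three increments, can be enumerated with a short sketch. The baseline bus counts used here are hypothetical placeholders, and the function names are illustrative only.

```python
from itertools import combinations

UNDERPERFORMING_ROUTES = (4, 5, 8)
INCREMENTS = (1, 2, 3)


def build_cases(routes=UNDERPERFORMING_ROUTES, increments=INCREMENTS):
    """List every (route subset, increment) pair to be evaluated with DEA."""
    subsets = [
        subset
        for size in range(1, len(routes) + 1)
        for subset in combinations(routes, size)
    ]
    return [(subset, inc) for inc in increments for subset in subsets]


def apply_case(baseline, subset, increment):
    """Raise buses/operators on the routes in `subset`; leave others as is."""
    return {
        route: count + increment if route in subset else count
        for route, count in baseline.items()
    }


cases = build_cases()

# Hypothetical average numbers of buses/operators per route.
baseline = {4: 2.0, 5: 1.75, 8: 3.0}
print(len(cases))
print(apply_case(baseline, (4, 8), 2))
```

Each generated case keeps the other inputs fixed, so the only remaining step before rerunning DEA is to replace the two output values of the modified routes with the regression estimates.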

Table 8. Linear regression model between number of operators/buses and output variables.

DEA model was run to recalculate the new efficiency and effectiveness score for each of the scenarios under each increment of changing variables.

Figures 3-5 show that there is no change in the efficiency scores of the efficient routes 1, 2, 3, 6 and 7 in any of the scenarios. Changes to routes 4, 5 and 8 affect only their own efficiency scores, not those of the other routes. This indicates that the modified routes do not act as peer DMUs for the efficient routes and that no route has its performance influenced by the others, most likely because the other routes are already efficient.

For route 4, efficiency rises steadily as the number of buses/operators increases. This shows that increasing the number of buses/operators, i.e. increasing the capital investment, is favorable to its efficiency. Route 5 behaves in the opposite way: the larger the increment in the number of buses/operators, the lower its efficiency. Increasing the number of buses/operators is therefore not favorable for route 5. When the number of buses/operators is increased by 2 or 3, route 8 becomes an efficient route, which suggests that it would be favorable for the company to increase capital investment in this route in future.

Figures 6-8 give the change in effectiveness score under the 7 scenarios for all the routes. As in the case of efficiency, there is no change in the effectiveness scores of the effective routes 2, 3 and 6, or of the other unmodified routes, in any of the scenarios. Here too, a change in a route affects only that route's own performance. This shows that the modified routes do not act as peer DMUs for these routes and that no route has its performance influenced by the others. Under no scenario is there an improvement in the effectiveness of any route.

Table 9. New modified values for first set of scenarios on increasing the changing variable by 1, 2 and 3.

Changed variables were italicized.

Figure 3. Efficiency score for all scenarios on increment by 1.

Figure 4. Efficiency score for all scenarios on increment by 2.

Figure 5. Efficiency score for all scenarios on increment by 3.

Figure 6. Effectiveness score for all scenarios on increment by 1.

Figure 7. Effectiveness score for all scenarios on increment by 2.

Figure 8. Effectiveness score for all scenarios on increment by 3.

Route 4 is the most affected by the increment in the number of buses/operators, with a deviation in effectiveness score of at most -0.051, at an increment of 3 buses/operators. Route 8 is less affected by the changes, with a deviation of at most -0.01, at an increment of 2 buses/operators.

Considering both the efficiency and the effectiveness deviations, scenarios 1 and 3 and the combined scenario 1 and 3, with an increment of 2 or 3 buses, are relatively favorable. The reduction in effectiveness score is small compared with the gain in efficiency score. Route 8 is efficient for increments of 2 and 3 buses/operators, whereas route 4 is only about 0.05 short of full efficiency for an increment of 3. This shows that routes 4 and 8 are favorable candidates for future service expansion through additional buses and the operators to run them. Route 5, although not at its worst performance, remains in relatively poorer condition than the other routes.

Long Route Problem

A possible reason for the relative underperformance of route 5 is that it is the longest route (58 km round trip). Longer routes have more stops and a longer average round-trip travel time than shorter routes. This is compounded by an uneven distribution of passengers along the route. In addition, most riders on the route travel to one of three main destinations: Dhulikhel at one end of the route, Suryabinayak in Bhaktapur, and Ratnapark in Kathmandu at the other end. As a result, the same passengers may stay on board for much of the route. This lack of passenger exchange can reduce ridership, which means lower effectiveness.

Route 5 and route 8 share the portion of route from Ratnapark to Suryabinayak. We shorten route 5 from Ratnapark-Suryabinayak-Dhulikhel to Suryabinayak-Dhulikhel. The Ratnapark-Suryabinayak portion is then covered by route 8, which runs Swoyambhu-Ratnapark-Suryabinayak. The average number of buses/operators on route 8 is increased by 1.75, which is the average number of buses/operators operating on route 5. The other input parameters remain the same. The regression model in Table 8 is used to calculate the output variables after the 1.75 increment in the average number of buses/operators.
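As a rough sketch of how a fitted linear regression can be used to project an output variable after such an increment, the example below fits a one-variable least-squares line and evaluates it at the current input plus 1.75. The data points, the helper `fit_line`, and the choice of a single predictor are assumptions made purely for illustration; the coefficients of the Table 8 model are not reproduced here.

```python
# Hypothetical illustration of projecting an output after an input increment.
# The data below are invented; they are NOT the values behind Table 8.

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations: average buses/operators vs vehicle km per day.
avg_buses = [2.0, 3.0, 4.0, 5.0]
veh_km_per_day = [200.0, 290.0, 410.0, 500.0]

slope, intercept = fit_line(avg_buses, veh_km_per_day)

current_buses = 4.0   # hypothetical current input of the receiving route
increment = 1.75      # increment used in the long route scenario
projected = intercept + slope * (current_buses + increment)

print(f"slope={slope:.1f}, intercept={intercept:.1f}, projected={projected:.1f}")
```

The projected value would then replace the original output for the modified route before DEA is rerun.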

The Ratnapark to Dhulikhel route (58 km round trip) is shortened to Suryabinayak to Dhulikhel (32 km round trip), and the number of stops is reduced to 39. We treat this as a completely new route making the same number of trips per day, even though the shortened route could make more trips per day. The average round-trip travel time is 1.55 hr, which is much less than the reduction in distance alone would suggest. Because this portion has fewer stops, the average travel time decreases significantly, which can be a major factor in improving the efficiency of the route. We assume that the same number of buses and operators run the route. The output variables vehicle km per day and average passengers per day are calculated on the basis of average trips per day and the route length ratio, respectively (Table 10).
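One way to read this calculation is sketched below: vehicle km per day is taken as trips per day multiplied by the shortened round-trip length, and average passengers per day is scaled by the ratio of the shortened to the original route length. The route lengths (58 km and 32 km) come from the text; the trips and passenger figures and the function name `shortened_outputs` are hypothetical, and the exact procedure behind Table 10 may differ.

```python
# Route lengths are from the text; trips and passengers are hypothetical.
ORIGINAL_ROUND_TRIP_KM = 58.0   # Ratnapark-Dhulikhel
SHORTENED_ROUND_TRIP_KM = 32.0  # Suryabinayak-Dhulikhel


def shortened_outputs(trips_per_day, passengers_per_day):
    """Estimate the outputs of the shortened route.

    Assumes the same number of trips per day as before, and that
    ridership scales with the route length ratio.
    """
    vehicle_km = trips_per_day * SHORTENED_ROUND_TRIP_KM
    length_ratio = SHORTENED_ROUND_TRIP_KM / ORIGINAL_ROUND_TRIP_KM
    passengers = passengers_per_day * length_ratio
    return vehicle_km, passengers


# Hypothetical current values for the long route.
vehicle_km, passengers = shortened_outputs(trips_per_day=5.0,
                                           passengers_per_day=580.0)
print(f"vehicle km per day: {vehicle_km:.1f}")
print(f"average passengers per day: {passengers:.1f}")
```

Scaling ridership by length is a conservative simplification, since it ignores where along the route passengers actually board and alight.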

DEA was run again to calculate the new efficiency and effectiveness scores. Table 11 shows that, after the longest route is shortened, it becomes fully efficient and effective. After the split, route 5 sheds most of its stops while giving up a proportionally smaller share of its length. Not only does it drop most of the stops, but its travel time also decreases significantly. Route 8, after merging with the common portion of route 5, also becomes a fully efficient route, so splitting route 5 does not prove disadvantageous for the merged route 8. The only reason for the negative deviation in its effectiveness score is the extra 1.75 buses on the Swoyambhu to Ratnapark segment (10.6 km round trip), and this deviation is a negligible 0.009. This scenario of route shortening and route merging is therefore favorable to implement. In this scenario too, the changes affect only the performance of the modified routes themselves. This shows that these routes have no peer relationship with the other DMUs, so no route's performance is influenced by the other routes.

Table 10. New modified value for input and output variables for Route 5 and Route 8.

Table 11. New efficiency and effectiveness score for long route problem.

New fully efficient and effective scores are italicized.

5. Conclusions

In this study we used DEAP to conduct DEA in order to evaluate the operating performance of the Sajha Bus Yatayat routes within the Kathmandu valley and to explore the possibilities of improving the performance of relatively underperforming routes under different scenarios. To create the scenarios, the input variables to be changed were taken from the PLS results as the most influential variables. In this study, the number of operators and the number of buses were equally the most influential variables; this might not hold for other transit systems in which the numbers of operators and operating buses are unequal. To imitate real-case scenarios, a linear regression model was adopted in view of its high significance.

Under the scenarios for relatively underperforming routes, further expansion of service on routes 4 and 8 with additional operating buses was found to be much more favorable for their performance than the current situation. For route 5, adding new operating buses was not as favorable as for routes 4 and 8, but the route was not in the worst condition and could serve with more operating buses if required to meet demand. For the long route problem, the longest route, route 5, was shortened by splitting off the portion it shares with route 8. Having shed a larger proportion of its stops and travel time than of its length, route 5 served the remaining route completely efficiently and effectively. This scenario also proved favorable for route 8, which scored as an efficient route without a considerable negative deviation in effectiveness.

This approach can be adopted by the transit company as a reliable tool to assess the efficiency and effectiveness of its system. The transit company can explore strategies to facilitate operational performance improvement through the development of possible scenarios.


Acknowledgements

I would like to thank the office staff of Sajha Bus Yatayat for providing the data required for the study. I would also like to thank Prof. Li Wen Quan for the guidance. I would also like to acknowledge Dr. Sailesh Ranjitkar for his guidance and support.

Conflicts of Interest

The authors declare no conflicts of interest.


References

[1] Department of Transport Management (DoTM) (2019) Vehicle Registration Details. 6.
[2] Fielding, G.J., Babitsky, T.T. and Brenner, M.E. (1985) Performance Evaluation for Bus Transit. Transportation Research Part A: General, 19, 73-82.
[3] Barnum, D., McNeil, S. and Hart, J. (2007) Comparing the Efficiency of Public Transportation Subunits Using Data Envelopment Analysis. Journal of Public Transportation, 10, 1-16.
[4] Hawas, Y.E., Khan, M.B. and Basu, N. (2012) Evaluating and Enhancing the Operational Performance of Public Bus Systems Using GIS-Based Data Envelopment Analysis. Journal of Public Transportation, 15, 19-44.
[5] Zhu, J. (2003) Quantitative Models for Performance Evaluation and Benchmarking: Data Envelopment Analysis with Spreadsheets and DEA Excel Solver. Kluwer Academic Publishers, Norwell, MA.
[6] Barnum, D.T., Tandon, S. and McNeil, S. (2008) Comparing the Performance of Bus Routes after Adjusting for the Environment Using Data Envelopment Analysis. Journal of Transportation Engineering, 134, 77-85.
[7] Lao, Y. and Liu, L. (2009) Performance Evaluation of Bus Lines with Data Envelopment Analysis and Geographic Information Systems. Computers, Environment and Urban Systems, 33, 247-255.
[8] Georgiadis, G., Politis, I. and Papaioannou, P. (2014) Measuring and Improving the Efficiency and Effectiveness of Bus Public Transport Systems. Research in Transportation Economics, 48, 84-91.
[9] Chen, N., Xu, L. and Chen, Z. (2017) Environmental Efficiency Analysis of the Yangtze River Economic Zone Using Super Efficiency Data Envelopment Analysis (SEDEA) and Tobit Models. Energy, 134, 659-671.
[10] Sadjadi, S.J., Omrani, H., Abdollahzadeh, S., Alinaghian, M. and Mohammadi, H. (2011) A Robust Super-Efficiency Data Envelopment Analysis Model for Ranking of Provincial Gas Companies in Iran. Expert Systems with Applications, 38, 10875-10881.
[11] Zhang, C., Juan, Z., Luo, Q. and Xiao, G. (2016) Performance Evaluation of Public Transit Systems Using a Combined Evaluation Method. Transport Policy, 45, 156-167.
[12] Li, X., Liu, Y., Wang, Y. and Gao, Z. (2016) Evaluating Transit Operator Efficiency: An Enhanced DEA Model with Constrained Fuzzy-AHP Cones. Journal of Traffic and Transportation Engineering (English Edition), 3, 215-225.
[13] Adler, N., Friedman, L. and Sinuany-Stern, Z. (2002) Review of Ranking Methods in the Data Envelopment Analysis Context. European Journal of Operational Research, 140, 249-265.
[14] Mahmoudi, R., Emrouznejad, A., Shetab-Boushehri, S.N. and Hejazi, S.R. (2020) The Origins, Development and Future Directions of Data Envelopment Analysis Approach in Transportation Systems. Socio-Economic Planning Sciences, 69, Article ID: 100672.
[15] Farrell, M.J. (1957) The Measurement of Productive Efficiency. Journal of the Royal Statistical Society: Series A (General), 120, 253-281.
[16] Charnes, A., Cooper, W.W. and Rhodes, E. (1978) Measuring the Efficiency of Decision Making Units. European Journal of Operational Research, 2, 429-444.
[17] Banker, R.D., Charnes, A. and Cooper, W.W. (1984) Some Models for Estimating Technical and Scale Inefficiencies in Data Envelopment Analysis. Management Science, 30, 1078-1092.
[18] De Borger, B., Kerstens, K. and Costa, Á. (2002) Public Transit Performance: What Does One Learn from Frontier Studies? Transport Reviews, 22, 1-38.
[19] Karlaftis, M.G. (2004) A DEA Approach for Evaluating the Efficiency and Effectiveness of Urban Transit Systems. European Journal of Operational Research, 152, 354-364.
[20] Sakano, R., Obeng, K. and Azam, G. (1997) Subsidies and Inefficiency: Stochastic Frontier Approach. Contemporary Economic Policy, 15, 113-127.
[21] Sanchez, G. (2009) Technical and Scale Efficiency in Spanish Urban Transport: Estimating with Data Envelopment Analysis. Advances in Operations Research, 2009, Article ID: 721279.
[22] Lindgren, F., Geladi, P., Berglund, A., Sjöström, M. and Wold, S. (1995) Interactive Variable Selection (IVS) for PLS. Part II: Chemical Applications. Journal of Chemometrics, 9, 331-342.
[23] Nokels, L., Fahmy, T. and Crochemore, S. (2010) Interpretation of the Preferences of Automotive Customers Applied to Air Conditioning Supports by Combining GPA and PLS Regression. In: Vinzi, V.E., Chin, W.W., Henseler, J. and Wang, H., Eds., Handbook of Partial Least Squares: Concepts, Methods and Applications, Springer, Berlin, 775.
[24] Eriksson, L., Johansson, E. and Kettanah-Wold, N.W. (1999) Introduction to Multi- and Megavariate Data Analysis Using Projection Methods (PCA & PLS). Umetrics AB.
[25] Mehmood, T., Liland, K.H., Snipen, L. and Sæbø, S. (2012) A Review of Variable Selection Methods in Partial Least Squares Regression. Chemometrics and Intelligent Laboratory Systems, 118, 62-69.
[26] Fancello, G., Uccheddu, B. and Fadda, P. (2014) Data Envelopment Analysis (D.E.A.) for Urban Road System Performance Assessment. Procedia-Social and Behavioral Sciences, 111, 780-789.
[27] Coelli, T. (2016) A Guide to DEAP Version 2.1: A Data Envelopment Analysis (Computer) Program. CEPA Working Paper n96/08 4(1), 1-7.
[28] Fare, R., Grosskopf, S. and Lovell, C.A.K. (1994) Production Frontiers. Cambridge University Press, Cambridge.
[29] Chu, X., Fielding, G.J. and Lamar, B.W. (1992) Measuring Transit Performance Using Data Envelopment Analysis. Transportation Research Part A: Policy and Practice, 26, 223-230.
[30] Sajha Yatayat (2019) Sajha Bus Yatayat.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.