
This paper presents a comprehensive study of taxi drivers’ labor supply behavior using a new dataset of taxi drivers from China. We find strong evidence that drivers’ working hours are negatively related to their hourly rates, and this effect is both statistically and economically significant. We then estimate a discrete-choice model, showing that the probability of stopping keeps increasing as cumulative working hours increase, whereas it first increases and then decreases as the cumulative fare increases. Lastly, we estimate an asymmetric model with the income target and the working-hour target as dummy variables; the probability of stopping is significantly positively related to the income target but shows no significant relation with cumulative fare.

There is a vast body of studies in the economic literature focusing on the wage elasticity of labor supply. The neoclassical models of labor supply predict that work hours should respond positively to transitory positive wage changes, as workers intertemporally substitute labor and leisure, working more when wages are high and consuming more leisure when wages are low. While this prediction is straightforward, it is difficult to find empirical support for it. The empirical evidence has been surveyed intensively (for example, Blundell and MaCurdy, 1999), and a summary of the findings is that wage elasticities of labor supply are generally very small, often not significantly different from zero, and sometimes even negative.

One criticism of this literature is that the standard neoclassical models assume that workers can choose their work hours in response to transitory wage changes, or alternatively, can select a job with the optimal wage-hours combination from a joint distribution of jobs. However, actual wage changes are rarely transitory, so the hypothesis of intertemporal substitution must be tested jointly along with the auxiliary assumption of persistent wage shocks. As a result, the insignificant or negative wage elasticity of labor supply can plausibly be attributed to specification errors.

The ideal test of labor supply responses to transitory wage changes would use a context in which wages are relatively constant within a short period but uncorrelated across periods. In such a case, dynamic optimization models predict a positive relationship between wages and hours worked, because short-period wage changes have a negligible impact on life-cycle wealth (see, for example, MaCurdy, 1981).

For this purpose, drivers, as one group of workers, provide the most appropriate research subject.

The most apparent advantage is that drivers face wages that fluctuate within a short period due to demand shocks caused by many factors, such as weather, traffic, day-of-the-week effects, holidays, and conventions. Although rates per hour/mile/job are set, during busy periods, drivers spend less time searching for customers and jobs and thus earn a higher hourly/daily wage. The wages tend to be correlated within the short periods and uncorrelated across periods.

Another advantage of focusing on drivers is that they can choose the number of hours they work each period, unlike most workers facing fixed work hours, e.g., eight hours per day and five days per week. In sum, such a study can be easily generalized to other types of workers who have the freedom to choose work hours/days or even the targeted customers, but a necessary condition is that there exist transitory wage changes.

In this paper, we use a comprehensive dataset of taxi drivers in Chengdu, China. Our dataset overcomes the aforementioned problems of the NYC taxi driver dataset. People usually do not tip taxi drivers in China, and fare information is automatically recorded by the meters on taxis. Hence, the fares recorded are a very accurate measure of income earned in our dataset. The dataset contains over 14 thousand taxis. For each taxi, there is an observation every minute, including its location and status (with or without passengers). There are more than one billion observations in total. We further combine these minute observations into trips, and we calculate the duration and fare earned for each trip.

Based on this comprehensive dataset, we perform empirical analyses. First, we conduct an OLS linear regression of working hours on the hourly rate earned for that day. The neoclassical theory predicts the coefficient of the hourly rate to be positive, and a negative coefficient does not support the neoclassical model. Our estimation results show that the hourly rate has a negative effect on working hours, and this relationship is statistically significant after controlling for a variety of fixed effects, including taxi fixed effects, weather fixed effects, day of week fixed effects, and week fixed effects. In terms of economic significance, a one standard deviation increase in the hourly rate is associated with a decrease in working hours of around 9% of its standard deviation.

The wage elasticity of labor supply can be estimated through a simple OLS regression, and the results can show a broad picture of how the daily hours worked are correlated with hourly earning opportunities for that day. The hourly earning opportunities can be computed as a fixed daily wage rate using total fare income divided by hours worked. The regression with one observation for each shift takes the following form

$$\ln H_{it} = \eta \ln W_{it} + X_{it}\beta + \varepsilon_{it}, \qquad (1)$$

where

H_{it} represents the hours worked by driver i at day t;

W_{it} = Y_{it}/H_{it} and Y_{it} is the total fare income of driver i at day t;

X_{it} are other factors affecting labor supply;

ε_{it} is a random component with a standard normal distribution.

The parameter η measures the wage elasticity of labor supply, and neoclassical models predict η to be positive. An important econometric problem with this approach is that the estimate relies on there being significant exogenous transitory day-to-day variation in the average wage. This variation drives the accurate estimate of η. However, it is hard to see a source of legitimate variation in the average hourly wage in the real data.

Alternatively, the model of driver daily labor supply can be estimated as a survival time model in which quitting can occur at discrete points in time. Without deriving a full dynamic solution to the optimal stopping problem, a simple discrete-choice problem can be implemented empirically as a reasonable approximation.

At any point s, a driver can calculate the forward-looking expected optimal stopping point, s*. The optimal stopping point can be a function of many factors, including hours worked and expectations about future earnings possibilities, etc. If daily income effects are important, the optimal stopping point can also be a function of income earned. A driver will stop at s if s ≥ s* so that s − s* ≥ 0.

A reduced-form representation of R(s) = s – s* is

$$R_{idc}(s) = \alpha_{1} h_{s} + \alpha_{2} y_{s} + X_{idc}\beta + u_{i} + \varepsilon_{idcs}, \qquad (2)$$

where

i refers to driver;

d refers to the date;

c refers to the hour of the day;

h_{s} measures cumulative hours worked on the shift at s;

y_{s} measures cumulative income earned on the shift at s;

X_{idc} measures other determinants of the optimal stopping time.

The vector of X_{idc} includes weather, a set of fixed effects for hour of the day, day of the week, and location within a province/city. These variables are included to capture variation in earning opportunities from continuing to drive.

A driver stops driving at s if R_{idc}(s) ≥ 0. The coefficient α_{1} measures whether the probability of quitting is related to hours worked, and the coefficient α_{2} measures whether income earned is important in deciding when to quit.

After any trip p during a shift, a driver can calculate the forward-looking expected optimal stopping point. This is a function of many variables, including hours worked so far on the shift and variables affecting expectations about future earning possibilities. In addition, it could also be affected by the accumulated income in a nontraditional way: when the accumulated income is more than the reference income, there is a higher probability that the driver stops working. An empirical representation of this reference-dependent model is given as follows:

$$C_{ijp} = X_{ijp}\beta + \delta\, I[Y_{ijp} > YT_{ij}] + \gamma\, I[H_{ijp} > HT_{ij}] + \varepsilon_{ijp}, \qquad (3)$$

where

C_{ijp} represents the forward-looking expected optimal stopping point for driver i on shift j after trip p;

X_{ijp} is a vector of variables determining the optimal stopping time;

Y_{ijp} represents the cumulative income level for driver i on shift j at trip p;

I[Y_{ijp} > YT_{ij}] is an indicator equal to one if accumulated income is larger than the reference income level, and equal to zero otherwise;

H_{ijp} represents the cumulative working hours for driver i on shift j at trip p;

I[H_{ijp} > HT_{ij}] is an indicator equal to one if accumulated working hours is larger than the reference level, and equal to zero otherwise.

A positive value of δ represents the incremental probability of stopping work when the accumulated income is above the reference income level, and a positive value of γ represents the incremental probability of stopping work when the accumulated working hours are above the reference level. This model can easily be restricted to use only income or only working hours as the reference.
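To make the role of the two indicators in model (3) concrete, the following minimal sketch evaluates the deterministic part of the equation for a single trip. The function name, argument names, and all numeric values are hypothetical and chosen only for illustration; they are not estimates from this paper.

```python
def stopping_index(x_beta, cum_income, cum_hours,
                   income_target, hours_target, delta, gamma):
    """Deterministic part of model (3): X*beta + delta*I[Y > YT] + gamma*I[H > HT].

    All inputs are illustrative; x_beta stands in for the already-computed
    contribution of the control variables X_{ijp}.
    """
    above_income_target = 1 if cum_income > income_target else 0
    above_hours_target = 1 if cum_hours > hours_target else 0
    return x_beta + delta * above_income_target + gamma * above_hours_target


# Hypothetical example: income is above its target, hours are not,
# so only delta is added to the baseline index.
example = stopping_index(x_beta=0.10, cum_income=500.0, cum_hours=10.0,
                         income_target=480.0, hours_target=13.0,
                         delta=0.05, gamma=0.20)
```

Setting gamma to zero gives the income-only version of the model, and setting delta to zero gives the hours-only version.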

We use a comprehensive dataset of taxi drivers in Chengdu, China, from August 3 to August 23, 2016. The dataset contains over 14 thousand taxis. For each taxi, there is an observation every minute from 6:00 am to 11:59 pm, including its location and status (with or without passengers). There are more than one billion observations in total.

Compared to the dataset used in studying NYC taxi drivers, our dataset has some advantages. It is collected through devices in taxis, which record GPS location and fare information automatically, unlike the NYC taxi driver dataset, which involves transcribing handwritten receipts. The quality of our dataset is therefore better. Moreover, people usually do not tip taxi drivers in China. Hence, the fares recorded are a very accurate measure of income earned.

We construct a new dataset by combining these minute observations into trips. We identify a run of time slots as a trip by changes in status: a trip begins when the status changes from without passengers to with passengers and ends when it changes from with passengers to without passengers. The duration of each trip is calculated as the sum of the time slots within the trip. The distance of each trip is calculated as the sum of the distances traveled during each time slot. The speed is then calculated as the distance divided by the duration. Most importantly, we calculate the fare earned during each trip based on the following rule: the fare starts at 8 CNY, the price is 1.9 CNY for every kilometer traveled between 2 km and 10 km, and the price is 2.85 CNY per km beyond 10 km; if the speed is lower than 12 km per hour, the time counts toward waiting time, and every 5 minutes of waiting time is counted as 1 km traveled.
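The fare rule above can be sketched as a small function. This is only an illustration under stated assumptions, not the exact procedure used to build the dataset: the name `trip_fare` and the slot representation are hypothetical, a slot slower than 12 km per hour is assumed to contribute waiting-equivalent kilometers instead of its physical distance, waiting time is assumed to convert proportionally (rather than in whole 5-minute blocks), and the 8 CNY starting fare is assumed to cover the first 2 km.

```python
BASE_FARE_CNY = 8.0          # starting fare
RATE_2_TO_10_KM = 1.9        # CNY per km between 2 km and 10 km
RATE_OVER_10_KM = 2.85       # CNY per km beyond 10 km
SLOW_SPEED_KMH = 12.0        # below this speed, time counts as waiting
WAIT_MINUTES_PER_KM = 5.0    # 5 minutes of waiting count as 1 km


def trip_fare(slots):
    """Fare in CNY for one trip.

    slots: iterable of (distance_km, minutes) pairs, one per time slot,
    with minutes > 0. This layout is an assumption of the sketch.
    """
    billable_km = 0.0
    for distance_km, minutes in slots:
        speed_kmh = distance_km / (minutes / 60.0)
        if speed_kmh < SLOW_SPEED_KMH:
            # Assumption: slow slots are billed as waiting time only.
            billable_km += minutes / WAIT_MINUTES_PER_KM
        else:
            billable_km += distance_km
    mid_km = min(max(billable_km - 2.0, 0.0), 8.0)   # portion between 2 and 10 km
    far_km = max(billable_km - 10.0, 0.0)            # portion beyond 10 km
    return BASE_FARE_CNY + RATE_2_TO_10_KM * mid_km + RATE_OVER_10_KM * far_km
```

For example, twenty one-minute slots of 0.5 km each (30 km per hour) give 10 billable km, so the fare is the starting fare plus eight kilometers at the 1.9 CNY rate.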

As a robustness check, we refine the dataset and keep the information for one driver per taxi on each day. Following the standard literature, we identify driver shifts by the length of time a taxi spends without passengers. If a taxi remains without passengers for more than two hours, we treat this as a shift change, and we keep the information only for the first driver, from the beginning of the day to the time of the shift change. We acknowledge that identifying shifts accurately is difficult from an empirical perspective, and we rely on this method, which is commonly used in the taxi driver literature. We also try identifying shifts using longer time gaps, and the results are all consistent.
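The two-hour rule can be sketched as follows. The sketch assumes one status observation per minute for a single taxi-day, coded 1 with passengers and 0 without; the function name and this coding are hypothetical, and the code is an illustration of the rule rather than the actual data pipeline.

```python
def first_shift_length(status_by_minute, max_idle_minutes=120):
    """Number of leading minute observations belonging to the first driver.

    Returns the index at which the first idle spell longer than
    max_idle_minutes begins, or the full length if no such spell exists.
    """
    idle_run_start = None
    for minute, occupied in enumerate(status_by_minute):
        if occupied:
            idle_run_start = None      # any passenger trip resets the idle spell
            continue
        if idle_run_start is None:
            idle_run_start = minute    # a new idle spell starts here
        if minute - idle_run_start + 1 > max_idle_minutes:
            return idle_run_start      # shift change: keep data before the spell
    return len(status_by_minute)
```

Raising `max_idle_minutes` corresponds to the check with longer time gaps mentioned above.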

The labor supply literature consists of two major competing theories, the neoclassical theory and the reference-dependent theory. The empirical findings regarding these two theories are mixed and inconclusive. This paper presents a comprehensive study of taxi drivers’ labor supply behavior using a new dataset of taxi drivers from China. By conducting our study in a different setting from the literature, we hope to clarify the findings in the literature.

We perform a linear regression of working hours on the hourly rate earned for that day as discussed in the empirical model (1). Specifically, we regress Ln(Work Hour) on Ln(Hourly Rate) and control for a set of fixed effects. As we discussed earlier, neoclassic theory predicts the coefficient of Ln(Hourly Rate) to be positive, and a negative coefficient does not support the neoclassic model.

We first list the summary statistics of the related variables for each taxi on each day. We have a total of 197,573 taxi-day observations. The means of Total Income and Work Hour are 473.1 and 13.99, respectively. The Hourly Rate averages about 34.37, with a standard deviation of about 49.89. The large standard deviation of Hourly Rate relative to its mean indicates that we have enough variation in the main independent variable of interest. We run regressions using the log form, and the results are unchanged if we use the levels of the variables instead of logarithms. Ln(Work Hour) has a mean of 2.570 and a standard deviation of 0.506. Ln(Hourly Rate) has a mean of 3.493 and a standard deviation of 0.274. These means and standard deviations are used later to calculate the economic significance of our regression results. The mean of Weather is 1.187, indicating that there are relatively more sunny days than rainy days during our sample period.
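As a worked check of the economic significance, the standardized effect can be computed from the two standard deviations above together with the column (3) coefficient on Ln(Hourly Rate) of −0.164 from the regression table. The notation $\hat{\eta}$ for the estimated coefficient is introduced here only for this calculation.

$$\frac{\hat{\eta}\cdot \mathrm{sd}(\ln W)}{\mathrm{sd}(\ln H)} = \frac{-0.164 \times 0.274}{0.506} \approx -0.089$$

That is, a one standard deviation increase in Ln(Hourly Rate) is associated with a decrease in Ln(Work Hour) of roughly 9% of its standard deviation.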

Column (3) of

| | (1) | (2) | (3) |
|---|---|---|---|
| VARIABLES | Ln(Work Hour) | Ln(Work Hour) | Ln(Work Hour) |
| Ln(Hourly Rate) | −0.135*** | −0.157*** | −0.164*** |
| | (−7.539) | (−9.358) | (−9.351) |
| Constant | 3.052*** | 3.128*** | 3.152*** |
| | (48.955) | (53.302) | (49.364) |
| Observations | 197,049 | 197,049 | 197,049 |
| R-squared | 0.006 | 0.334 | 0.342 |
| Taxi Fixed Effects | No | Yes | Yes |
| Weather Fixed Effects | | | Yes |
| Day of Week Fixed Effects | | | Yes |
| Week Fixed Effects | | | Yes |

*, **, and *** denote statistical significance (two tailed) at the 10%, 5%, and 1% levels, respectively.

All three columns show that Ln(Hourly Rate) has a negative effect on Ln(Work Hour), and the effect is statistically significant at the 1% level. Now we consider the economic significance, using column (3) as an example. As shown in

Taken together, drivers work less when wages go up, which is clearly the opposite of what neoclassical theory predicts. Our finding is nevertheless in line with the literature studying taxi drivers’ labor supply. Farber et al. (2015) argue that the negative elasticity is not large enough, and point out that this negativity could be due to measurement or specification error, which may lead to a downward bias in the elasticity. This is possibly because daily working hours are the dependent variable while the average hourly income is the ratio of daily income to daily hours.
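The mechanism behind this downward bias can be illustrated with a small simulation. The sketch below is purely hypothetical and uses synthetic data, not the Chengdu data: true hours and true wages are drawn independently, so the true elasticity is zero, and noise is then added to observed hours. Because the observed hourly wage is income divided by observed hours, the same noise enters both sides of the regression with opposite signs. All distribution parameters and function names are assumptions of the sketch.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_division_bias(n=5000, noise_sd=0.2, seed=0):
    """Estimated elasticity when observed hours contain measurement error."""
    rng = random.Random(seed)
    ln_hours_obs, ln_wage_obs = [], []
    for _ in range(n):
        ln_hours_true = rng.gauss(2.5, 0.2)
        ln_wage_true = rng.gauss(3.5, 0.2)   # independent of hours: true elasticity is 0
        ln_income = ln_hours_true + ln_wage_true
        ln_hours = ln_hours_true + rng.gauss(0.0, noise_sd)   # mismeasured hours
        ln_hours_obs.append(ln_hours)
        ln_wage_obs.append(ln_income - ln_hours)              # wage = income / observed hours
    return ols_slope(ln_wage_obs, ln_hours_obs)


biased_elasticity = simulate_division_bias()              # clearly negative
clean_elasticity = simulate_division_bias(noise_sd=0.0)   # close to zero
```

With these illustrative parameters the estimated elasticity is strongly negative even though hours do not respond to wages at all, while it is close to zero when hours are measured without error.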

Several papers in the literature then propose fixing this problem by using various instruments, e.g., other drivers’ hourly wages on the same day. Farber et al. (2015) show that although OLS produces a negative elasticity, the elasticity becomes strongly positive once the instrumental variable is used, hence supporting the neoclassical prediction. This type of measurement error may exist due to “tips” or “imperfectly recorded and transcribed paper trip sheets” in the NYC taxi dataset used in many papers, including Farber et al. (2015).

Our dataset is almost immune to this problem for the following reasons. First, taxi drivers in China rarely receive tips, and they do not count on tips as part of their income. Second, during the sample period, all trips are recorded through meters without any manual input. When the accuracy of the dataset is not a concern, the IV method may not be a good estimation approach, because such instruments lack variation and are essentially constant across drivers and days. The instruments are therefore rather weak in terms of explanatory power.

An important econometric problem with this approach is that the estimate relies on there being significant exogenous transitory day-to-day variation in the average wage. This variation drives the accurate estimate of the coefficient. However, it is hard to see a source of legitimate variation in the average hourly wage in the real data. Hence, in the following, we examine the discrete-choice stopping model and its asymmetric effects.

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
|---|---|---|---|---|---|---|---|---|
| VARIABLES | N | mean | sd | p1 | p25 | p50 | p75 | p99 |
| Total Income | 197,573 | 473.1 | 336.2 | 27.94 | 401.0 | 480.3 | 551.6 | 902.5 |
| Work Hour | 197,573 | 13.99 | 3.527 | 1.171 | 12.93 | 14.98 | 16.19 | 17.66 |
| Hourly Rate | 197,468 | 34.37 | 49.89 | 13.96 | 29.41 | 32.83 | 36.88 | 64.23 |
| Ln(Work Hour) | 197,468 | 2.570 | 0.506 | 0.223 | 2.560 | 2.707 | 2.785 | 2.871 |
| Ln(Hourly Rate) | 197,049 | 3.493 | 0.274 | 2.748 | 3.382 | 3.492 | 3.608 | 4.163 |
| Weather | 197,573 | 1.187 | 1.210 | 0 | 0 | 1 | 1 | 4 |

The OLS linear regression produces a negative elasticity of labor supply, which is both economically and statistically significant. This result cannot be explained by the neoclassical theory. On the other hand, the reference-dependent model has a quite contrasting prediction for the elasticity, as suggested in the previous section. To check whether our OLS result is consistent with the reference-dependent model, we follow Farber et al. (2015) and model the labor supply decision of taxi drivers as a dynamic discrete choice problem, in which drivers need to decide whether to continue working after each trip. The reduced form should therefore take into account potential earnings opportunities, hours worked, income earned, and other factors that could affect preferences for work.

As suggested in Farber et al. (2015), without deriving a fully dynamic solution to the optimal stopping problem, a simple discrete-choice problem can be implemented empirically as a reasonable approximation. The optimal stopping point can be a function of many factors, including hours worked and expectations about future earnings possibilities. If daily income effects are important, the optimal stopping point can also be a function of income earned. Following our previous discussion of reference-dependent models, individuals can make decisions based on income targets, hour targets, or both. In order to identify the most relevant explanation, we examine all three models.

We first summarize all the related variables on the trip basis in

Fare Range and Hour Range are defined in detail in Panel B of

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
|---|---|---|---|---|---|---|---|---|
| VARIABLES | N | mean | sd | p1 | p25 | p50 | p75 | p99 |
| Cum Fare | 7,394,068 | 286.2 | 1.110 | 8 | 120.7 | 246.0 | 380.9 | 711.8 |
| Cum Hour | 7,394,068 | 7.778 | 4.580 | 0.220 | 3.874 | 7.574 | 11.55 | 16.76 |
| Stop Trip | 7,394,068 | 0.0267 | 0.161 | 0 | 0 | 0 | 0 | 1 |
| Fare Range | 7,394,068 | 2.124 | 1.683 | 0 | 1 | 2 | 3 | 7 |
| Hour Range | 7,394,068 | 3.393 | 2.283 | 0 | 1 | 3 | 5 | 8 |
| Weather | 7,394,068 | 1.182 | 1.203 | 0 | 0 | 1 | 1 | 4 |

| VARIABLES | N | Range |
|---|---|---|
| Fare Range Dummy 0 (Baseline) | 1,532,930 | Cum Fare < 100 |
| Fare Range Dummy 1 | 1,496,899 | 100 ≤ Cum Fare < 200 |
| Fare Range Dummy 2 | 1,430,721 | 200 ≤ Cum Fare < 300 |
| Fare Range Dummy 3 | 1,323,256 | 300 ≤ Cum Fare < 400 |
| Fare Range Dummy 4 | 992,783 | 400 ≤ Cum Fare < 500 |
| Fare Range Dummy 5 | 425,143 | 500 ≤ Cum Fare < 600 |
| Fare Range Dummy 6 | 112,513 | 600 ≤ Cum Fare < 700 |
| Fare Range Dummy 7 | 33,658 | 700 ≤ Cum Fare < 800 |
| Fare Range Dummy 8 | 46,165 | Cum Fare ≥ 800 |
| Hour Range Dummy 0 (Baseline) | 912,258 | Cum Hour < 2 |
| Hour Range Dummy 1 | 998,911 | 2 ≤ Cum Hour < 4 |
| Hour Range Dummy 2 | 1,003,267 | 4 ≤ Cum Hour < 6 |
| Hour Range Dummy 3 | 990,945 | 6 ≤ Cum Hour < 8 |
| Hour Range Dummy 4 | 942,233 | 8 ≤ Cum Hour < 10 |
| Hour Range Dummy 5 | 899,563 | 10 ≤ Cum Hour < 12 |
| Hour Range Dummy 6 | 864,135 | 12 ≤ Cum Hour < 14 |
| Hour Range Dummy 7 | 602,538 | 14 ≤ Cum Hour < 16 |
| Hour Range Dummy 8 | 180,218 | Cum Hour ≥ 16 |

and Hour Range for every two hours. If we define the ranges differently, for example, Fare Range for every 50 CNY or Hour Range for every one hour, we obtain similar results.
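The range definitions in Panel B can be written as a simple binning rule. The following is a minimal sketch with hypothetical function names; it maps cumulative fare and cumulative hours to the range index 0 to 8 and then to the eight dummy variables, with range 0 as the omitted baseline.

```python
def fare_range(cum_fare_cny):
    """Fare Range index: 100 CNY bins, with everything at or above 800 in bin 8."""
    return min(int(cum_fare_cny // 100), 8)


def hour_range(cum_hours):
    """Hour Range index: 2-hour bins, with everything at or above 16 in bin 8."""
    return min(int(cum_hours // 2), 8)


def range_dummies(range_index, top_index=8):
    """Dummies 1..top_index for a range index; the baseline (0) is all zeros."""
    return [1 if range_index == k else 0 for k in range(1, top_index + 1)]


# A trip ending with 450 CNY earned after 9.5 hours falls in
# Fare Range 4 and Hour Range 4.
example_fare_dummies = range_dummies(fare_range(450.0))
example_hour_dummies = range_dummies(hour_range(9.5))
```

Changing the bin widths (for example, 50 CNY or one hour) only requires changing the divisor and the top index.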

Given the previous discussion, the reduced form of the income dependent model can be estimated by regressing Stop Trip on Fare Range Dummies and controlling for a set of fixed effects. The estimation results are presented in

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| VARIABLES | Stop Trip | Stop Trip | Stop Trip | Stop Trip |
| Fare Range Dummy 1 | 0.001*** | 0.002*** | 0.002*** | 0.114*** |
| | (10.357) | (15.114) | (14.421) | (13.741) |
| Fare Range Dummy 2 | 0.005*** | 0.006*** | 0.006*** | 0.327*** |
| | (20.922) | (22.495) | (21.435) | (30.165) |
| Fare Range Dummy 3 | 0.017*** | 0.020*** | 0.020*** | 0.707*** |
| | (22.120) | (22.346) | (21.773) | (62.424) |
| Fare Range Dummy 4 | 0.062*** | 0.067*** | 0.067*** | 1.246*** |
| | (32.949) | (32.690) | (32.044) | (104.286) |
| Fare Range Dummy 5 | 0.130*** | 0.140*** | 0.140*** | 1.655*** |
| | (65.945) | (65.948) | (64.426) | (114.819) |
| Fare Range Dummy 6 | 0.163*** | 0.178*** | 0.178*** | 1.799*** |
| | (93.285) | (108.376) | (105.585) | (96.033) |
| Fare Range Dummy 7 | 0.140*** | 0.160*** | 0.161*** | 1.700*** |
| | (48.265) | (53.385) | (54.946) | (56.802) |
| Fare Range Dummy 8 | 0.079*** | 0.134*** | 0.134*** | 1.372*** |
| | (19.126) | (77.589) | (75.397) | (46.341) |
| Constant | 0.003*** | 0.000 | 0.003 | −2.714*** |
| | (14.420) | (1.139) | (1.530) | (−226.711) |
| Observations | 7,394,068 | 7,394,068 | 7,394,068 | 7,394,068 |
| R-squared | 0.059 | 0.068 | 0.069 | |
| Taxi Fixed Effects | No | Yes | Yes | No |
| Weather Fixed Effects | | | Yes | Yes |
| Day of Week Fixed Effects | | | Yes | Yes |
| Week Fixed Effects | | | Yes | Yes |

*, **, and *** denote statistical significance (two tailed) at the 10%, 5%, and 1% levels, respectively.

by day. In columns (1), (2), and (3), we use OLS estimation, and in column (4), we estimate a probit model.

In column (2), we include Taxi Fixed Effects, and in column (3), we include Taxi Fixed Effects, Weather Fixed Effects, Day of Week Fixed Effects, and Week Fixed Effects. The results in these two columns are similar to those reported in column (1).
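Because the probit coefficients in column (4) are on the index scale rather than the probability scale, it can help to convert them into implied stopping probabilities. The sketch below does this with the standard normal distribution function, using the column (4) constant and Fare Range coefficients from the table. It is a rough illustration only: it ignores the weather, day of week, and week fixed effects included in that column, so the implied probabilities refer to the omitted categories of those controls, and the function names are hypothetical.

```python
import math

# Column (4) probit estimates from the fare-range table.
PROBIT_CONSTANT = -2.714
PROBIT_FARE_DUMMY = {1: 0.114, 2: 0.327, 3: 0.707, 4: 1.246,
                     5: 1.655, 6: 1.799, 7: 1.700, 8: 1.372}


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def implied_stop_probability(fare_range_index):
    """Implied stopping probability for a fare range (0 is the baseline)."""
    index = PROBIT_CONSTANT + PROBIT_FARE_DUMMY.get(fare_range_index, 0.0)
    return normal_cdf(index)


implied = {k: implied_stop_probability(k) for k in range(9)}
peak_range = max(implied, key=implied.get)   # fare range with the highest probability
```

Under these simplifications, the implied probabilities rise from well below 1% at the baseline to a maximum in Fare Range 6 and decline afterwards, the same ordering as in the OLS columns.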

We can see a clear pattern: the probability of stopping increases slowly up to the fare range between 300 and 400 CNY, and then it increases sharply for the fare range between 400 and 500 CNY. The probability of stopping peaks for the fare range between 600 and 700 CNY, and then gets lower as the fare range increases further. These results are all statistically significant at the 1% level, and they hold when controlling for various fixed effects.

In column (4) of

The reference-dependent model with income target suggested that 1) if income is below the income target, drivers have a higher marginal utility of income; 2) if income is above the income target, drivers have a higher marginal utility of leisure (disutility of work). Moreover, such a change around the income target is not smooth. It implies that the probability of stopping will be lowest when income is below the income target and will be highest when income is above the target. Our results in

One possible alternative explanation of the finding in

The reduced form of the working-hour dependent model can be estimated by regressing Stop Trip on Hour Range Dummies and controlling for a set of fixed effects. The estimation results are presented in

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| VARIABLES | Stop Trip | Stop Trip | Stop Trip | Stop Trip |
| Hour Range Dummy 1 | −0.000 | 0.001*** | 0.001*** | −0.003 |
| | (−0.304) | (9.752) | (9.079) | (−0.253) |
| Hour Range Dummy 2 | 0.001*** | 0.003*** | 0.003*** | 0.118*** |
| | (7.583) | (15.234) | (14.816) | (9.161) |
| Hour Range Dummy 3 | 0.003*** | 0.005*** | 0.005*** | 0.223*** |
| | (11.990) | (17.618) | (17.005) | (17.591) |
| Hour Range Dummy 4 | 0.006*** | 0.009*** | 0.009*** | 0.396*** |
| | (19.447) | (24.642) | (24.030) | (25.468) |
| Hour Range Dummy 5 | 0.014*** | 0.017*** | 0.017*** | 0.651*** |
| | (24.398) | (22.866) | (22.465) | (60.589) |
| Hour Range Dummy 6 | 0.037*** | 0.042*** | 0.042*** | 1.019*** |
| | (21.481) | (21.857) | (21.537) | (76.420) |
| Hour Range Dummy 7 | 0.112*** | 0.121*** | 0.121*** | 1.576*** |
| | (43.464) | (42.777) | (42.363) | (147.585) |
| Hour Range Dummy 8 | 0.299*** | 0.319*** | 0.319*** | 2.261*** |
| | (80.732) | (81.297) | (82.771) | (123.185) |
| Constant | 0.003*** | −0.000* | 0.002 | −2.723*** |
| | (15.252) | (−1.823) | (0.825) | (−151.808) |
| Observations | 7,394,068 | 7,394,068 | 7,394,068 | 7,394,068 |
| R-squared | 0.109 | 0.121 | 0.121 | |
| Taxi Fixed Effects | No | Yes | Yes | No |
| Weather Fixed Effects | | | Yes | Yes |
| Day of Week Fixed Effects | | | Yes | Yes |
| Week Fixed Effects | | | Yes | Yes |

*, **, and *** denote statistical significance (two tailed) at the 10%, 5%, and 1% levels, respectively.

In column (1), we show the OLS estimation result without controlling for any fixed effects. As working hours accumulate, the probability of stopping starts to increase significantly compared to the baseline level. Specifically, compared to the baseline level of cumulative hours below 2 hours, Stop Trip first shows no significant difference for cumulative working hours between 2 hours and 4 hours; Stop Trip increases by 0.001 when the cumulative working hours are between 4 hours and 6 hours; Stop Trip increases by 0.003 when the cumulative working hours are between 6 hours and 8 hours; Stop Trip increases by 0.006 when the cumulative working hours are between 8 hours and 10 hours; Stop Trip increases by 0.014 when the cumulative working hours are between 10 hours and 12 hours; Stop Trip increases by 0.037 when the cumulative working hours are between 12 hours and 14 hours; Stop Trip increases by 0.112 when the cumulative working hours are between 14 hours and 16 hours; Stop Trip increases by 0.299 when the cumulative working hours are above 16 hours.

In column (2), we include Taxi Fixed Effects, and in column (3), we include Taxi Fixed Effects, Weather Fixed Effects, Day of Week Fixed Effects, and Week Fixed Effects. The results in these two columns are similar to those reported in column (1).

We can see a clear pattern: the probability of stopping increases slowly up to the working hour range between 12 hours and 14 hours, and then it increases sharply for the hour range between 14 hours and 16 hours. The probability of stopping peaks for the working hour range above 16 hours, and we find no evidence of a decreasing pattern as working hours increase. These results are all statistically significant at the 1% level and hold when controlling for various fixed effects.

Column (4) of

Similar to the prediction of a reference-dependent model with income target, a reference-dependent model with hour target would suggest that the individual should have a higher marginal utility of work if working hours are below the target and higher marginal utility of leisure if working hours are above the target. In terms of probability of stopping, we should expect this probability to peak around the target working time.

On the other hand, the neoclassical model predicts that as working hours accumulate, taxi drivers’ marginal utility of leisure becomes larger. Therefore, the probability of ending a shift should keep increasing.

The finding in

The income and working hour dependent model can be estimated by regressing Stop Trip on both Fare Range Dummies and Hour Range Dummies and controlling for a set of fixed effects. This model is discussed in detail in the empirical model (2). The estimation results are presented in

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| VARIABLES | Stop Trip | Stop Trip | Stop Trip | Stop Trip |
| Fare Range Dummy 1 | −0.000* | −0.001*** | −0.000*** | 0.006 |
| | (−2.100) | (−3.875) | (−3.549) | (0.700) |
| Fare Range Dummy 2 | −0.000 | −0.001*** | −0.001** | 0.027** |
| | (−1.692) | (−3.447) | (−2.778) | (2.313) |
| Fare Range Dummy 3 | −0.000 | −0.001* | −0.000 | 0.080*** |
| | (−0.665) | (−1.771) | (−0.593) | (4.020) |
| Fare Range Dummy 4 | 0.008*** | 0.008*** | 0.008*** | 0.224*** |
| | (7.464) | (8.407) | (8.295) | (9.516) |
| Fare Range Dummy 5 | 0.030*** | 0.030*** | 0.031*** | 0.358*** |
| | (20.097) | (22.230) | (22.375) | (15.190) |
| Fare Range Dummy 6 | 0.043*** | 0.046*** | 0.047*** | 0.418*** |
| | (25.992) | (29.724) | (28.142) | (19.215) |
| Fare Range Dummy 7 | 0.024*** | 0.032*** | 0.033*** | 0.334*** |
| | (10.522) | (14.003) | (15.391) | (13.901) |
| Fare Range Dummy 8 | −0.006*** | 0.011*** | 0.011*** | 0.147*** |
| | (−3.507) | (4.634) | (5.047) | (5.944) |
| Hour Range Dummy 1 | 0.000 | 0.001*** | 0.001*** | −0.006 |
| | (0.609) | (9.090) | (9.639) | (−0.505) |
| Hour Range Dummy 2 | 0.001*** | 0.003*** | 0.003*** | 0.107*** |
| | (7.764) | (12.068) | (14.696) | (7.811) |
| Hour Range Dummy 3 | 0.003*** | 0.005*** | 0.005*** | 0.192*** |
| | (10.166) | (13.072) | (16.150) | (11.761) |
| Hour Range Dummy 4 | 0.005*** | 0.008*** | 0.008*** | 0.329*** |
| | (15.376) | (18.047) | (22.430) | (14.797) |
| Hour Range Dummy 5 | 0.011*** | 0.014*** | 0.014*** | 0.525*** |
| | (21.164) | (17.160) | (20.076) | (23.463) |
| Hour Range Dummy 6 | 0.028*** | 0.033*** | 0.033*** | 0.818*** |
| | (20.575) | (18.346) | (19.836) | (36.136) |
| Hour Range Dummy 7 | 0.096*** | 0.103*** | 0.103*** | 1.306*** |
| | (40.456) | (36.993) | (38.873) | (47.798) |
| Hour Range Dummy 8 | 0.277*** | 0.294*** | 0.294*** | 1.951*** |
| | (65.605) | (66.372) | (68.561) | (55.072) |
| Constant | 0.003*** | −0.000 | 0.003 | −2.715*** |
| | (15.042) | (−1.615) | (1.013) | (−160.132) |
| Observations | 7,394,068 | 7,394,068 | 7,394,068 | 7,394,068 |
| R-squared | 0.111 | 0.122 | 0.123 | |
| Taxi Fixed Effects | No | Yes | Yes | No |
| Weather Fixed Effects | | | Yes | Yes |
| Day of Week Fixed Effects | | | Yes | Yes |
| Week Fixed Effects | | | Yes | Yes |

*, **, and *** denote statistical significance (two tailed) at the 10%, 5%, and 1% levels, respectively.

In column (1), we show the OLS estimation result without controlling for any fixed effects. Compared with the baseline of cumulative fare below 100, Stop Trip does not show a significant increase until the fare range is between 400 and 500, where it increases by 0.008; Stop Trip increases by 0.030 when the fare range is between 500 and 600; Stop Trip increases by 0.043 when the fare range is between 600 and 700; Stop Trip still increases, but by a smaller magnitude of 0.024, when the fare range is between 700 and 800; and Stop Trip decreases relative to the baseline when the fare range is above 800.

Considering the effects of working hour ranges, Stop Trip shows no significant difference for cumulative working hours between 2 hours and 4 hours; Stop Trip increases by 0.001 when the cumulative working hours are between 4 hours and 6 hours; Stop Trip increases by 0.003 when the cumulative working hours are between 6 hours and 8 hours; Stop Trip increases by 0.005 when the cumulative working hours are between 8 hours and 10 hours; Stop Trip increases by 0.011 when the cumulative working hours are between 10 hours and 12 hours; Stop Trip increases by 0.028 when the cumulative working hours are between 12 hours and 14 hours; Stop Trip increases by 0.096 when the cumulative working hours are between 14 hours and 16 hours; Stop Trip increases by 0.277 when the cumulative working hours are above 16 hours.

Comparing the result here with those in the column (1) of

In column (4) of

Overall, we find that the probability of stopping keeps getting larger at an increasing rate as working hours accumulate, but as the cumulative fare increases, it first increases and then decreases.

The reference-dependent model discussed in the previous section suggests that there are two “domains of losses”: 1) if income is below the income target, drivers have a higher marginal utility of income; 2) if hours are above the hours target, drivers have a higher marginal utility of leisure (disutility of work). This implies that the probability of stopping is lowest when income is below the income target and hours are below the hours target, and highest when income and hours are above their respective targets.

Given these implications, if the hours reference point mattered for an individual’s stopping decision, we would expect the probability of stopping to peak at some point as working hours accumulate. Our estimates show no such peak. Instead, the probability of stopping keeps increasing, as the neoclassical model predicts when the marginal disutility of work grows with hours worked. In contrast, the probability of stopping increases with cumulative fare and peaks at around 600 CNY; moreover, the sharp jump around 500 CNY is consistent with the kink that would exist at the income target.

Our estimates show that taxi drivers’ behavior is better explained by the reference-dependent model with an income target. This stands in contrast to the findings in the literature, which rely mainly on NYC taxi datasets.

To further check the robustness of our finding, we follow Crawford and Meng (2011) and Farber (2008) and estimate a reduced-form model of the stopping probability, with dummy variables that measure the incremental effects of hitting the income and hours targets. As discussed for empirical model (3), we regress Stop Trip on Income Target and Hour Target, as well as Ln(Cum Fare) and Ln(Cum Hour).

As discussed in the variable definition, Income Target takes the value of 1 if the Cum Fare is above the daily average income over the sample period, and 0 otherwise. Hour Target takes the value of 1 if the Cum Hour is above the daily average working hours over the sample period, and 0 otherwise. The sample average income target is around 480 CNY and sample average working hours are around 13 hours, and these numbers are similar to those reported as mean value of Total Income and Work Hour in

The two dummy variables indicate whether income or working hours are above their targets. A positive coefficient on either dummy is consistent with the prediction of a reference-dependent model.
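The construction of the two dummies can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the shift-level totals are hypothetical, and the targets are computed as pooled averages across all shifts, whereas the definition above does not specify here whether the averages are pooled or driver-specific. The names `shifts` and `target_dummies` are illustrative only.

```python
from statistics import mean

# Hypothetical shift-level totals used to set the targets:
# (total income in CNY, total working hours).
shifts = [(460.0, 12.5), (500.0, 13.5), (480.0, 13.0)]

# Assumption: targets are pooled sample averages over all shifts.
income_target_level = mean(income for income, _ in shifts)
hour_target_level = mean(hours for _, hours in shifts)

def target_dummies(cum_fare, cum_hour):
    """Return (Income Target, Hour Target) for one trip-level observation."""
    income_target = 1 if cum_fare > income_target_level else 0
    hour_target = 1 if cum_hour > hour_target_level else 0
    return income_target, hour_target

# A driver who has earned 500 CNY after 10 hours is above the income
# target but still below the hours target.
print(target_dummies(500.0, 10.0))
```

Each trip-level observation then carries both dummies alongside Ln(Cum Fare) and Ln(Cum Hour) as regressors.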

Taking column (3) as an example, the coefficient on Income Target is 0.045 and is significant at the 1% level. The coefficient on Hour Target is also significantly positive. However, the coefficients on Ln(Cum Fare) and Ln(Cum Hour) show a very different pattern. The coefficient on Ln(Cum Fare) is not significantly different from zero, meaning that once the effect of Income Target is taken into account, the log level of cumulative fare shows no effect on the probability of stopping. In contrast, the coefficient on Ln(Cum Hour) is 0.008 and is significant at the 1% level. This significantly positive coefficient shows that the probability of stopping always increases as working hours increase.

| VARIABLES | (1) Stop Trip | (2) Stop Trip | (3) Stop Trip |
|---|---|---|---|
| Income Target | 0.044*** | 0.045*** | 0.045*** |
| | (48.281) | (53.342) | (51.909) |
| Hour Target | 0.143*** | 0.147*** | 0.147*** |
| | (79.406) | (92.506) | (93.511) |
| Ln(Cum Fare) | −0.002** | −0.000 | −0.000 |
| | (−2.709) | (−0.349) | (−0.000) |
| Ln(Cum Hour) | 0.010*** | 0.008*** | 0.008*** |
| | (19.706) | (19.043) | (20.027) |
| Constant | 0.003 | −0.003 | −0.001 |
| | (1.225) | (−1.526) | (−0.172) |
| Observations | 7,392,146 | 7,392,146 | 7,392,146 |
| R-squared | 0.098 | 0.102 | 0.102 |
| Taxi Fixed Effects | No | Yes | Yes |
| Weather Fixed Effects | | | Yes |
| Day of Week Fixed Effects | | | Yes |
| Week Fixed Effects | | | Yes |

*, **, and *** denote statistical significance (two-tailed) at the 10%, 5%, and 1% levels, respectively.

Overall, the evidence from the asymmetric model again shows that a reference-dependent model with an income target can explain the behavior of taxi drivers, whereas there is a lack of evidence supporting a model based on a working-hour target.
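To give a sense of the magnitudes, the following Python sketch evaluates the linear index implied by the column (3) coefficients on either side of each target. It is an arithmetic illustration only: the fixed effects are ignored, the target levels of 480 CNY and 13 hours are the approximate sample averages quoted in the text, and the function name `linear_index` is hypothetical.

```python
import math

# Column (3) point estimates as reported; fixed effects are ignored here.
COEF = {
    "const": -0.001,
    "income_target": 0.045,
    "hour_target": 0.147,
    "ln_cum_fare": -0.000,
    "ln_cum_hour": 0.008,
}
INCOME_TARGET_LEVEL = 480.0  # CNY, approximate sample average from the text
HOUR_TARGET_LEVEL = 13.0     # hours, approximate sample average from the text

def linear_index(cum_fare, cum_hour):
    """Linear-probability index for stopping, without fixed effects."""
    income_target = 1.0 if cum_fare > INCOME_TARGET_LEVEL else 0.0
    hour_target = 1.0 if cum_hour > HOUR_TARGET_LEVEL else 0.0
    return (COEF["const"]
            + COEF["income_target"] * income_target
            + COEF["hour_target"] * hour_target
            + COEF["ln_cum_fare"] * math.log(cum_fare)
            + COEF["ln_cum_hour"] * math.log(cum_hour))

# Crossing the income target at fixed hours: the change comes from the dummy
# alone, because the coefficient on Ln(Cum Fare) is zero to three decimals.
income_jump = linear_index(490.0, 12.0) - linear_index(470.0, 12.0)

# Crossing the hours target at fixed fare: the dummy plus a small log-hours term.
hour_jump = linear_index(470.0, 13.5) - linear_index(470.0, 12.5)

print(round(income_jump, 3), round(hour_jump, 3))
```

The comparison shows that, around the income target, the change in the index comes entirely from the Income Target dummy, while around the hours target both the dummy and the continuous log-hours term contribute.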

We find strong evidence that the working hours of drivers are negatively related to the hourly rates, and this effect is both statistically and economically significant. We then estimate a discrete-choice model of the probability of stopping on a set of cumulative fare ranges and cumulative working-hour ranges. The evidence consistently shows that the probability of stopping keeps increasing as cumulative working hours increase, but first increases and then decreases as the cumulative fare increases. This indicates the existence of an income target in taxi drivers’ labor supply decisions. Lastly, we use the asymmetric model with the income target and working-hour target as dummy variables, and find that the probability of stopping is significantly positively related to the income target but shows no significant relation with cumulative fare. In contrast, both the working-hour target and cumulative working hours seem to be important in explaining the probability of stopping.

Overall, our results clearly reject the prediction of the neoclassical theory, as the elasticity of labor supply is significantly negative. More interestingly, among the three reference-dependent models, our results are best explained by the income-based reference model: taxi drivers seem to target certain income levels rather than total working time. This finding is quite different from the literature. For example, Crawford and Meng (2011) find that their results are more in line with the reference-dependent model with both income and hour targets. One possible explanation for the difference is that Chinese taxi drivers may view income and leisure differently from their counterparts in New York City. Such a difference may be due to culture, working conditions, and living environment.

Drivers are a preferred research subject for studying the wage elasticity of labor supply, as the results of the models above illustrate. Applying these models to the dataset of taxi drivers in Chengdu, China, also exposes differences between the existing literature and our empirical results, which calls for further study.

Drivers, of course, are not representative of the whole working population. Demographic differences aside, however, many other groups (e.g., farmers and small-business proprietors) have similar self-selected occupations with low, variable wages, long work hours, and relatively high rates of accidents. It is therefore important for these workers to plan over long horizons and to allocate their labor and investment effectively across economic and educational opportunities for themselves and their children. This calls for attention and help from educators and policy makers to improve social welfare.

The author declares no conflicts of interest regarding the publication of this paper.

Tu, Y. Q. (2020). Reference-Dependent Preferences and the Labor Supply of Chinese Drivers. Open Journal of Social Sciences, 8, 358-376. https://doi.org/10.4236/jss.2020.89028