Reward and Punishment Mechanism in a Vertical Safety Regulation System: A Transferred Prisoner's Dilemma

Under the current system of safety regulation in China, the lower a level sits in the regulatory hierarchy, the more regulatory failure occurs. Game-theoretic studies have shown that rewards and punishments, each considered separately, can compensate for regulatory failures. This study analyzed rewards and punishments simultaneously to strengthen regulatory power and offset the failure of regulation; examples are provided to compare the probability of failure under various degrees of rewards and punishments. In addition, this paper describes how the behavior of coal enterprises, miners and local governments changes. Doubling the rewards and punishments was found to reduce the possibility of failure of local government regulation by 27%; in addition, the probability of safe production in coal mining enterprises increased by 87%, and the willingness of miners to disclose information increased by 50%.


Introduction
Since the 1990s, production-, food- and medicine-related accidents have occurred frequently in China. In addition to the rapid expansion of industry through economic growth, failure of government regulation has been a major factor. Most previous studies have focused on the causes and internal factors of accidents, and have neither provided effective solutions nor suggested specific mechanisms. The purpose of this study was to provide solutions and offer suggestions. Moreover, compared with economic regulation, social regulation has received scarce attention in economics studies and has yielded fewer results, particularly regarding safety regulation (see the literature review for details).
One cause of the failure of safety regulation in China is the dispersion of regulatory power among local governments, which results in the consolidation of ownership and supervision at the provincial level. A further cause is the lack of third-party supervision: collusion between local officials and firms diverts local governments from their regulatory targets. Beyond that, information asymmetry and inherent contradictions in the Chinese regulatory system play the most prominent roles in this failure. Therefore, preventing the consolidation of ownership and increasing supervision in the current local vertical regulatory system are essential for improving the effectiveness of safety regulations.
This study analyzed the regulatory system of the coal mining industry in China and discussed how rewards and punishments influence the strategies of the central and local governments, as well as firms and workers; we considered various levels of rewards and punishments and investigated the possibilities of preventing collusion. The Bayesian Nash equilibrium in this paper shows that doubling rewards and punishments is likely to reduce collusion between local government and firms by up to 27%, and the probability of safe production in coal firms could improve by up to 87%. Combining rewards and punishments can facilitate the revelation of firm information by encouraging workers to report illicit activities.
The following two sections present a literature review and an analysis of the regulatory system of the coal mining industry in China. Section 4 describes the model of the transferred prisoner's dilemma (PD) with rewards and punishments between the central and local governments, and between workers and the central government, respectively. Section 5 provides an example based on the models introduced in Section 4, and compares the effectiveness of various regulatory mechanisms by analyzing each player's behavior. The final section presents the conclusions and suggestions.
The three innovations of this study are as follows. First, from a theoretical perspective, we transferred the classical PD game into a cooperative game by two methods. Second, we used rewards and punishments simultaneously to strengthen regulatory power and offset the failure of regulation, such as collusion or rent-seeking between local governments and enterprises. Third, we fitted data to work out a pre-regulatory mechanism for China's coal mining industry.

Literature Reviews
Several empirical studies have focused on the reasons for and effectiveness of safety regulation, such as Viscusi [1], Lewis-Beck and Alford [2], Fishback [3], and Ruser and Smith [4]. Compared with economic regulation, social regulation has received scarce attention in economics studies and has yielded fewer results, particularly regarding safety regulation. These studies found that government interventions have no significant positive effects on reducing the probability of accidents; sometimes they even exert negative effects. Stigler [5], Peltzman [6] and Becker [7] explained the reason for government failure, in what is known as capture theory. Hägg [8] showed that the conflict between private interests and public interests leads regulators to gradually lose their effectiveness in repeated games. High administrative costs restrict the frequency of inspection by regulators [9].
Conversely, other studies have argued that safety regulations are not always ineffective as long as "capture" is broken. Viscusi [3] showed that safety regulations reduced economic losses by approximately 5% to 6% between 1973 and 1983, because of the improved inspection frequency and increased punishment levels associated with OSHA (the Occupational Safety and Health Act in the United States). Ruser and Smith [4] and Scholz and Grey [10] estimated that in the early 1980s, safety regulations reduced injuries by approximately 5% to 14%. Wages and regulatory power proved to be the crucial factors of regulatory effectiveness in these studies. Therefore, production safety is determined not only by a firm's behavior but by a combination of the strategies of governments, enterprises, and workers.
Discussing the reasonability of economic regulation in a free market environment is beneficial; however, regarding social regulation, people pay more attention to the effectiveness of a mechanism's design [11]-[14]. Mainardi [15] showed, using panel data from 1985 to 2002 for 12 industrial and developing economies, that casualties declined as economic and private investment rates increased. In other words, irrespective of governmental regulation, the accident rate decreased with economic development and increasing wealth.
After 2000, Chinese researchers began investigating accidents in Chinese coal mining firms. Xiao et al. [16] showed that enhanced coal mine safety regulation reduced mortality rates markedly; however, the reduction was offset by the adverse behavior of miners in the short term. In addition, they determined that an increase in coal production reduced the death rate in the short term; in the long term, however, the effect was reversed. The relationship between the concentration of the coal industry and coal mine safety since 2002 might reflect only the effect of industry consolidation and production growth, not the influence of government behavior [17]. Xiao et al. [18] proposed a "safety regulation fluctuations" norm to reflect the instability of safety regulation in China.
Furthermore, imperfect regulation systems in China have led to collusion between local governments and firms [19] [20]. Collusion has no significant effect on safety incidents when coal mines are directly managed by the Central Government; however, it becomes relevant once enterprises are decentralized to local governments [21]. In addition, several studies have analyzed safety regulation from the perspective of game theory; for example, safety regulation has been conceptualized as a two-person static game between a firm and a government or, alternatively, as a moral hazard problem induced by incomplete information about safety investment in various enterprises [22]-[24].
In non-cooperative games, solving the prisoner's dilemma (PD) and changing the Nash equilibrium from deviation (betrayal) to collaboration relies on improving the efficiency of cooperation through reward, punishment, or a combination of both [25]-[29]. Andreoni et al. [30] experimentally compared these three methods and determined that a combination of rewards and punishments led to superior results.
The aforementioned studies demonstrated only the phenomenon of regulatory failure and offered no effective solutions. One cause of the failure of safety regulation in China is the dispersion of regulatory power among local governments, which results in the consolidation of ownership and supervision at the provincial level. A further cause is the lack of third-party supervision: collusion between local officials and firms diverts local governments from their regulatory targets. Information asymmetry and inherent contradictions in the Chinese regulatory system play the most prominent roles in this failure. Therefore, preventing the consolidation of ownership and increasing supervision in the current local vertical regulatory system are essential for improving the effectiveness of safety regulations.

Current System of Safety Regulation in the Chinese Mining Industry
Coal plays a prominent role in the Chinese economy; however, the coal industry is high-risk because of frequent and severe accidents that directly threaten workers' health and safety. In 2013, China produced 3.689 billion tons of coal, the largest output worldwide, accounting for almost one-third of global production. However, 1052 workers died in coal industry accidents in the same year. The mortality rate per 1 million tons of coal in China is 10 times higher than that in developed countries such as the United States and Australia.
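The two 2013 figures quoted above are enough to work out the mortality rate per one million tons directly. The short sketch below does that arithmetic; the implied benchmark for developed countries is only our reading of "10 times higher" as a ratio of 10 and is not a figure reported in the text.

```python
# Figures quoted in the text for 2013.
coal_output_million_tons = 3689.0  # 3.689 billion tons
deaths = 1052

# Deaths per one million tons of coal produced.
mortality_per_million_tons = deaths / coal_output_million_tons

# Assumption: "10 times higher" is read as a ratio of 10, which gives the
# implied developed-country benchmark below (not a figure from the text).
implied_benchmark = mortality_per_million_tons / 10

print(f"China 2013: {mortality_per_million_tons:.3f} deaths per million tons")
print(f"Implied benchmark: {implied_benchmark:.4f} deaths per million tons")
```

The result is roughly 0.285 deaths per million tons for 2013.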
As Figure 1 shows, the mortality rate has declined since a vertical system of safety regulation was implemented in 1999, but enterprises under various types of ownership exhibit substantial differences. State-owned enterprises typically invest more in safety and equipment maintenance than local and township firms; for that reason they exhibit the lowest mortality rate. By contrast, over-exploitation and irregularities are common in township enterprises; therefore, they exhibit a mortality rate higher than the average. The failure of regulation below the provincial level is severe.¹
The vertical regulation system of coal production in China was established on December 30, 1999. The State Council approved a third-party institution called the State Administration of Coal Mine Safety (SACMS) to regulate coal mining enterprises, and affiliated institutions of the SACMS were established in every province; since then, local coal enterprises have been simultaneously regulated by the SACMS and local governments (Figure 2). In China, affiliated institutions of the SACMS have taken responsibility for the administrative management system (right side of Figure 2); consequently, changes in safety regulation have been characterized by an interest game between the central and local governments. Specific problems are as follows.
First, ownership and regulatory power are consolidated. Because coal is a strategic resource of China, ownership of coal mines is a national concern; coal mining enterprises merely own the rights of mining and operation, which is akin to a principal-agent relationship. State-owned enterprises (with an annual output of more than 600,000 t) are under the direct jurisdiction of the State-owned Assets Supervision and Administration Commission; state-owned local coal enterprises (with an annual output below 600,000 t) are under the jurisdiction of provincial governments; township coal mines are under the jurisdiction of township governments. Therefore, the owners and supervisors of local and township coal mines are all local governments.
Second, the vertical supervision and administrative management systems intersect. Local regulatory branches are not only administered by the central government but also supervised by local governments according to the "Law of the PRC on Safe Production". Regulatory authority has been devolved to provincial and municipal governments, which means that the supervisors are themselves supervised by local governments and cannot implement regulatory measures independently and effectively; this weakens the effectiveness of vertical supervision.
Third, government departments have different targets. As the policymaker, the Central Government aims to maximize social welfare; as the owner and supervisor of coal mine enterprises, local governments aim at growth of the local economy and social stability (this is because local governments are directly affected [...]
Fourth, most of China's safety regulation concerns only technology; other standards are unclear and difficult to apply. For instance, the "Coal Mine Safety Regulation", which was revised in January 2005, specifically regulates all types of safe-mining technical indices. A further piece of legislation, the "Production Safety Incident Report and Investigation Regulation", stipulates the punishment for accidents induced by unsafe production; the penalties range from RMB 100,000 to RMB 5,000,000. Moreover, even if workers reveal violations, they cannot receive corresponding rewards on the basis of current laws. The "Coal Mine Safety Regulation" specifies that enterprises can be fined between RMB 20,000 and RMB 100,000 if they are found to have engaged in illegal behavior before an incident.⁴ Punishments are clearly asymmetric before and after incidents, which induces enterprises to gamble. Although there is considerable legislation related to coal mining production, specific rewards and punishments are not sufficiently clear.
In addition, safety regulation in China has lagged behind that of developed countries. Typically, enterprises amend internal procedures only after accidents occur, which is always too late. Therefore, a combination of pre-regulation and post-punishment is crucial. First, a clear pre-regulatory mechanism must be designed; in this study we designed a reward and punishment mechanism that is easy to implement, presented in the following section.

Punishment at Two Levels of Government
Schelling [25] and Qin [29] discussed punishment in cooperation games. In the following, let $(D_1, I_1)$ be the Central Government's strategy set: $D_1$ means that the Central Government relies on local authorities for regulation and for obtaining information; $I_1$ means that the Central Government does not trust local governments and establishes other methods of obtaining information, such as an imperial envoy.⁵ Let $(D_2, I_2)$ denote the strategy set of local governments: $D_2$ means that a local government performs its regulatory duty; $I_2$ refers to malpractice and collusion with coal firms (which amounts to the failure of the vertical system). Figure 3 presents a basic PD game, in which $S < P < R < T$ represents the revenues under the various combinations of strategies.
We defined safety and reputation as the social benefits of the Central Government; if an accident occurs, the reputation of the Central Government is affected. The payoffs corresponding to $(D_1, D_2)$ represent an effective vertical regulatory system, in which each player receives normal revenue. If the Central Government does not rely on the local government but the local government still chooses $D_2$, regulation is duplicated from the Central Government's perspective; the local government is sidelined ("built on stilts") and receives $S_2$ as its payoff. By contrast, if the local government engages in malpractice and the Central Government is misled, the payoff of the Central Government is minimal; the payoff $(P_1, P_2)$ corresponds to the strategy profile $(I_1, I_2)$. Irrespective of other conditions, $(I_1, I_2)$ constitutes the only Nash equilibrium in this game: the local government has an incentive to deviate from its original regulatory responsibilities, and the Central Government is likely to no longer trust local agencies.
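The claim that $(I_1, I_2)$ is the only Nash equilibrium can be checked by enumerating best responses. The sketch below is illustrative only: the numbers assigned to S, P, R and T are hypothetical values chosen to satisfy S < P < R < T, and the symmetric payoff table is an assumption, since the entries of Figure 3 are not reproduced in the text.

```python
from itertools import product

# Hypothetical payoffs satisfying S < P < R < T (not taken from the paper).
S, P, R, T = 0, 1, 2, 3

# payoffs[(central, local)] = (central payoff, local payoff)
# "D": rely on / perform regulation; "I": distrust / malpractice.
payoffs = {
    ("D", "D"): (R, R),
    ("D", "I"): (S, T),
    ("I", "D"): (T, S),
    ("I", "I"): (P, P),
}

def pure_nash_equilibria(game):
    """Return the profiles from which no player gains by deviating alone."""
    strategies = ("D", "I")
    equilibria = []
    for c, l in product(strategies, repeat=2):
        central_ok = all(game[(c, l)][0] >= game[(alt, l)][0] for alt in strategies)
        local_ok = all(game[(c, l)][1] >= game[(c, alt)][1] for alt in strategies)
        if central_ok and local_ok:
            equilibria.append((c, l))
    return equilibria

print(pure_nash_equilibria(payoffs))
```

With these values the only profile that survives is ("I", "I"), which matches the prisoner's dilemma structure described above.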
To correct this ineffective mechanism, let $h = (h_1, h_2)$ denote an additional punishment for deviation. The payoffs corresponding to $(D_1, D_2)$ remain unchanged. If a local government engages in malpractice while the Central Government relies on the local government, the local government must pay $h_2$ as a penalty; $h_1$ denotes the additional cost to the Central Government of obtaining information from other sources (such as the Escrow group); the local government then saves the same cost of regulation. In summary, the punishment pair $h$ modifies the PD game as shown in Figure 4.

Figure 3. Prisoner's dilemma between the two levels of government.
The punishment pair $h^*$ is intended to make the effective strategy profile $(D_1, D_2)$ an SPE (subgame perfect Nash equilibrium) in a two-stage game; therefore, three necessary conditions must be considered. (1) When the local government fulfils its responsibility, the Central Government's payoff from $D_1$ should be larger than that from $I_1$. Under a qualified punishment pair, the dominant strategy profile of the Nash equilibrium is $(D_1, D_2)$, and the payoffs of both the local and the Central Government are the largest under the new mechanism. The vertical system is more valuable under the qualified punishment pair.
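The effect of a punishment pair can be illustrated with a small numerical sketch. Because the payoff table of Figure 4 is not reproduced in the text, the code below rests on two loudly stated assumptions: the base payoffs are hypothetical values with S < P < R < T, and player i loses $h_i$ in every cell where it plays I. Only the unchanged $(D_1, D_2)$ cell and the penalty $h_2$ for malpractice under $D_1$ come from the description above.

```python
# Hypothetical base payoffs with S < P < R < T (not taken from the paper).
S, P, R, T = 0, 1, 2, 3

def punished_game(h1, h2):
    """Payoff table after punishment.

    Assumption: player i loses h_i in every cell where it plays "I".
    The (D, D) cell is unchanged, as stated in the text.
    """
    return {
        ("D", "D"): (R, R),
        ("D", "I"): (S, T - h2),
        ("I", "D"): (T - h1, S),
        ("I", "I"): (P - h1, P - h2),
    }

def cooperation_is_dominant(game):
    """True if "D" beats "I" for both players whatever the other player does."""
    central = all(game[("D", l)][0] > game[("I", l)][0] for l in ("D", "I"))
    local = all(game[(c, "D")][1] > game[(c, "I")][1] for c in ("D", "I"))
    return central and local

for h in (0.5, 1.5):
    print(h, cooperation_is_dominant(punished_game(h, h)))
```

Under these assumptions a penalty of 0.5 is too small to change behavior, whereas a penalty of 1.5 exceeds both T − R and P − S and makes D the dominant strategy for both players.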

Rewards between Workers and the Central Government
The direct victims of coal mining accidents are typically miners, who are in a weak bargaining position when accidents occur; in particular, the probability of an accident is greatly increased when the local government is captured by illegal enterprises. Thus, if information communication channels were established, miners would be able to report the illegal behavior of firms, which would indirectly strengthen the power of regulation. In addition, the Central Government should encourage reporting by offering rewards to offset the risk of reporting; otherwise "reporting" may not be the dominant strategy of miners.
The strategies of both players based on the PD game are shown in Figure 5. The Central Government decides whether to give incentives to workers ($E_1$ or $N_1$); miners choose whether to report ($E_2$ or $N_2$). The payoff is $R'_1$ for the Central Government and $R'_2$ for workers when the Central Government encourages reporting and miners report. The payoff is $S'$ for a player who chooses $E_i$ while the other player chooses $N_i$, and $P'$ when both players choose $N_i$. The ordering $S' < P' < R' < T'$ means that the payoff of the player who is defected against is minimal and, conversely, the payoff of the defecting player is maximal. No information is obtained when miners do not report, even if the Central Government builds the platform and pays the costs; however, if no encouragement is given to miners for reporting, they pay the extra cost of the Central Government's access to information. The Nash equilibrium in this game is $(N_1, N_2)$, corresponding to the payoff $(P', P')$; in that case, miners have no incentive to report because they risk being fired.

Figure 5. Prisoner's dilemma between the Central Government and workers.

Footnote 6: We use a model structure similar to that of Qin [29]; see that reference for details of the proofs.
Inducing cooperation through extra rewards from the Central Government to miners constitutes a supplementary mechanism for changing the equilibrium [27]. Let $r = (r_1, r_2)$ denote a payment pair. First, no payment is made by either player when both players choose $N_i$; thus, the payoff corresponding to $(N_1, N_2)$ remains unchanged. Next, the extra payment $r_1$ denotes the reward offered to miners to offset the cost of reporting; $r_2$ refers to the reputation of the Central Government, which arises as a social benefit when the Central Government encourages miners to report and builds an information platform. Finally, if both players cooperate, the Central Government's payoff is $R'_1 - r_1 + r_2$, whereas that of miners is $R'_2 + r_1 - r_2$. The accompanying figure shows a summary. Similarly, for the reward pair $r^*$ to make the effective strategy profile $(E_1, E_2)$ an SPE of the game, four necessary conditions must be considered:
(1) When miners report, the payoff of $E_1$ for the Central Government must be larger than that of $N_1$.
(2) To keep rewards effective, unnecessary reports must be avoided through sufficient regulation.
(3) The payoff of $E_1$ for the Central Government must be at least as large as that of $N_1$: $R'_1 - r_1 + r_2 \geq P'_1$.
(4) When miners choose to report, the feedback to the Central Government is equal to or less than the reward for reporting.
Under a qualified reward pair $r^*$, the dominant strategy profile of the Nash equilibrium is $(E_1, E_2)$, and the payoff of both players is the largest under this new mechanism. The strengthened regulation is effective under the qualified reward pair $r^*$.
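The cooperative payoffs and condition (3) can be evaluated numerically. The sketch below uses the two payoff expressions given above; the values of $R'_1$, $R'_2$, $P'_1$ and $r_2$ are hypothetical, and the check `r2 <= r1` is only our reading of the verbal statement of condition (4), not a formula taken from the paper.

```python
def cooperative_payoffs(R1, R2, r1, r2):
    """Payoffs at (E_1, E_2) as given in the text.

    r1: reward paid by the Central Government to miners.
    r2: reputation (social benefit) returned to the Central Government.
    """
    central = R1 - r1 + r2
    miners = R2 + r1 - r2
    return central, miners

def satisfies_condition_3(R1, P1, r1, r2):
    """Condition (3): R'_1 - r_1 + r_2 >= P'_1."""
    return R1 - r1 + r2 >= P1

# Hypothetical values; only r1 = 0.5 echoes the lower reward level used later.
R1 = R2 = 2.0
P1 = 1.0
r1, r2 = 0.5, 0.25

print(cooperative_payoffs(R1, R2, r1, r2))    # (central, miners)
print(satisfies_condition_3(R1, P1, r1, r2))  # condition (3)
print(r2 <= r1)                               # our reading of condition (4)
```

With these numbers the Central Government receives 1.75 and miners receive 2.25 under cooperation, so condition (3) holds.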
According to this transferred game, the reward and punishment mechanism can contribute to an efficient cooperation.We did not conduct a separate analysis of the enterprise behavior; however, collusion of local governments is directly related to the rent-seeking of firms, and whether workers report is directly related to investment in safety equipment; therefore, corporate behavior is already implicit in the aforementioned models.

Examples: Test of Various Rewards and Punishments
In this section, we provide an example involving four players, namely the Central Government, the local government, a coal mining enterprise, and coal miners, which reflects the static behavioral differences under three designed mechanisms. We assume that each player separately incurs costs and receives benefits. The objective of the Central Government is to maximize social welfare, including economic benefits from taxes and social benefits from reputation. The local government's benefits consist of taxes and possible bribes from collusion. We neglected the reputation of the local government. For simplicity, the normal profits of coal mining firms are equal to regulatory benefits, but the illegal income is double that amount if mining is conducted under unsafe conditions or with inadequate equipment. Finally, the wages of miners are half the amount of an enterprise's profits, and we ignored other behaviors of miners (such as resignations).
The regulatory cost of the Central Government is the same as that of the local government in our games, and the cost of reporting for workers is half the amount of their wage. The cost structure of coal mining enterprises is complex (Table 1).
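The proportional assumptions above pin down several quantities once one base value is fixed. The sketch below starts from the normal profit of 2 listed in Table 1 and derives the rest; the illegal income of 4 is our derivation from the "double that amount" rule, since its value is not legible in the table.

```python
# Base value listed in Table 1: normal profit of a coal mining enterprise.
normal_profit = 2.0

# Relationships stated in the text.
regulatory_benefit = normal_profit  # normal profits equal regulatory benefits
illegal_income = 2 * normal_profit  # illegal income is double the normal profit
wage = normal_profit / 2            # wages are half of the enterprise's profit
report_cost = wage / 2              # reporting costs half of the wage

print(f"regulatory benefit = {regulatory_benefit}")
print(f"illegal income     = {illegal_income}")
print(f"wage               = {wage}")
print(f"report cost        = {report_cost}")
```

The derived wage of 1 and report cost of 0.5 agree with the miner entries in Table 1.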

Table 1. Costs and benefits of each player (columns: Player, Costs, Benefits).

Consider the rewards and punishments. Three mechanisms are presented in Table 2. The first involves no punishments or rewards for the local government and workers; there are eight payoff groups corresponding to each player's strategy profiles. In the second, the Central Government offers rewards to miners and, once it receives information from miners or accident reports, punishes captured local regulators (if they engaged in malpractice) and illegal firms (for unsafe production and rent-seeking). The third is a higher-level reward and punishment mechanism, in which the amounts of reward and punishment are double those of the second.
According to conditions (1)-(3) in the previous section and Table 1, the qualified rewards and punishments satisfy (4) and (5); we therefore defined the rewards and punishments as 0.5 and 1 for the lower level, and 1 and 2 for the higher level.
Payoff profiles are shown in Table 2. Under complete information, the Nash equilibria in the first and second parts are ineffective, because the local government prefers to engage in malpractice. Although the payment pairs lie in the qualified intervals of the two-player games, the dominant strategy of a firm is unsafe production because its expected income from safe production is lower than that from unsafe production; in addition, miners have no incentive to report malpractice. In the third part, reporting is preferable to remaining silent for miners, and enterprises choose safe production. Nevertheless, the local government does not actually function in the current regulatory system. The layers of the vertical system could be reduced, and regulatory costs saved, if information were symmetric. However, the problem is precisely the asymmetry of information; the Central Government cannot directly obtain information from scattered enterprises nationwide.
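The miners' side of this comparison can be reproduced with a one-line calculation. The sketch below uses the report cost of 0.5 from Table 1 and the reward levels of 0, 0.5 and 1 for the three mechanisms; it assumes, as a simplification of ours, that wages and accident risk are the same whether or not a miner reports, so that only the reward and the reporting cost matter.

```python
report_cost = 0.5  # cost of reporting for a miner (Table 1)

# Reward offered to a reporting miner under each mechanism.
rewards = {"none": 0.0, "lower": 0.5, "higher": 1.0}

# Assumption: wage and accident risk are the same whether or not the miner
# reports, so only the reward and the reporting cost differ.
net_gain = {name: reward - report_cost for name, reward in rewards.items()}

for name, gain in net_gain.items():
    verdict = "report" if gain > 0 else "indifferent" if gain == 0 else "stay silent"
    print(f"{name}: net gain {gain:+.1f} -> {verdict}")
```

Under this simplification miners lose from reporting without a reward, are exactly indifferent at the lower level, and strictly prefer reporting only at the higher level.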
Under incomplete information, the strategic probabilities of each player are as follows. Let α denote the probability that a local government chooses $D_2$, so that 1 − α is the probability of strategy $I_2$; similarly, β is the probability of safe production by firms and 1 − β that of unsafe production; γ is the probability that miners choose strategy $E_2$ and 1 − γ that they choose $N_2$. The strategic information of the Central Government is common knowledge for the other players in the short term; therefore, we analyzed the Bayesian equilibrium of each behavior.
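A critical probability of this kind comes from equating two expected utilities. The sketch below is a generic illustration for the local government facing a firm that produces safely with probability β; the function names and the four payoff numbers are hypothetical and do not come from Table 2 or from the paper's own conditions.

```python
def expected_utility(p, payoff_if_true, payoff_if_false):
    """Expected payoff when an event occurs with probability p."""
    return p * payoff_if_true + (1 - p) * payoff_if_false

def critical_probability(u_d_safe, u_d_unsafe, u_i_safe, u_i_unsafe):
    """Probability beta of safe production at which the local government is
    indifferent between D_2 and I_2; None if no such beta lies in [0, 1]."""
    slope_gap = (u_d_safe - u_d_unsafe) - (u_i_safe - u_i_unsafe)
    if slope_gap == 0:
        return None
    beta = (u_i_unsafe - u_d_unsafe) / slope_gap
    return beta if 0 <= beta <= 1 else None

# Hypothetical payoffs for the local government (not from the paper).
beta_star = critical_probability(u_d_safe=2, u_d_unsafe=0, u_i_safe=1, u_i_unsafe=1)
print(beta_star)
print(expected_utility(beta_star, 2, 0), expected_utility(beta_star, 1, 1))
```

At the returned β both strategies yield the same expected utility; on one side of it the local government prefers $D_2$, on the other $I_2$.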
First, we assume that the Central Government chooses the strategy of the second part (the lower level of punishments and rewards) in the first stage of the game. According to Bayes' rule, in the second stage the local government must satisfy the sufficient condition that the expected utility of $D_2$ is larger than that of $I_2$. Similar sufficient conditions apply to enterprises and miners, respectively.

Table 2 (surviving payoff entries for $I_2$): (2, 2, 0, 1.5), (3, 2, 0, 1), (0, 0, −1.5, −3.5), (1, 4, 0.5, −4).

We determined the critical probabilities from conditions (6)-(8). In the same manner, we determined the critical probabilities when the Central Government chooses the strategy involving a higher level of punishments and rewards. Table 3 summarizes the Bayesian Nash equilibrium. Under the lower-level punishment mechanism, the local government exhibits a wider probability range of being captured by illegal enterprises, and miners are indifferent between reporting malpractice and remaining silent. Consequently, once enterprises increase the wage, the local government is likely to be captured and miners are likely to remain silent. Currently, coal enterprises can receive excess profits, and the production risk for workers increases significantly.
Moreover, the willingness of miners to report increases by 50% when the Central Government doubles the rewards, and the likelihood of illegal production decreases by 85% because of more severe punishment. At the same time, the range of effective supervision by local governments increases by 27%. Although the Central Government's information remains incomplete, the effectiveness of regulation can be improved to reduce the likelihood of illegal production and rent-seeking behavior; therefore, the supplementary mechanism reduces the possibility of failure of vertical regulation as far as possible.

Conclusions and Suggestions
This study investigated a reward and punishment compensation mechanism for improving the effectiveness of vertical regulation in China's coal industry. This multidimensional mechanism induces cooperation among players by allowing the central government to offer large rewards that encourage the disclosure of information; concomitantly, the government can impose punitive measures in case of collusion between local officials and companies. The main results are summarized as follows.
First, under the current vertical system in China, regulation is decentralized at the provincial level. This makes local governments susceptible to being captured by enterprises and to reducing supervision, which constitutes a fundamental reason for the failure of local regulation.
Second, by increasing the punishment for collusion and rent-seeking, the possibility of local governments being captured can be substantially reduced; consequently, enterprises are less likely to violate regulations.
Third, the willingness of miners to disclose illicit activities is directly related to the value of the reward they are eligible for; larger rewards can encourage miners to disclose information and thereby safeguard their own security in production facilities, as well as reducing their risk of being fired.
Therefore, we provide the following three suggestions, which are targeted and practicable. First, the powers of supervision and administration should be separated in the vertical regulation system. That is, the performance evaluation system of local governments should be reformed; the focus should be shifted from economic growth to multiple standards such as social security and environmental friendliness indices, to avoid overexploitation, environmental destruction and frequent accidents in the coal industry.
Second, because the lower the cost of accidents is, the higher is the possibility of illegal production methods being implemented, that's why the compensation system in case of dead and injury of miners caused by acci-

Figure 1. Mortality rates per one million tons of coal by ownership type of coal firms in China, 1990 to 2011 (%).²

Figure 2. The vertical regulation system of China's coal industry.³

Table 1 (rows for enterprises and miners).
2. Enterprises. Costs: unsafe production cost = 0.5; safe production cost = 0.5 − 1; wage = −1; rent-seeking costs = −2; punishment of illegal production = −1; punishment of collusion = −1. Benefits: normal profits = 2; illegal income = [...].
4. Miners. Costs: report cost = −0.5; loss in accident = −5. Benefits: wage = 1; reward = 0.5.

[...] workers and bear the economic cost). Local governments seek to reduce supervision to induce more coal output, because the Central Government evaluates local performance by GDP and employment indices, and in the case of accidents they are likely to attempt to conceal the truth to avoid punishment from the Central Government. Therefore, local regulatory departments that are captured by illegal enterprises are ineffective; they are likely not to fulfil the requirements of their regulatory mandate. Numerous accidents and anti-corruption cases in China have revealed evidence of regulatory capture and exposed failures in local government regulation.