In 1958 Modigliani and Miller published one of the most significant papers in finance on the cost of capital. It presented the capital structure irrelevance theorem, which states that the cost of capital is independent of capital structure. The theorem implies that there is no optimal structure and consequently denies the existence of non-optimal structures. If there are no non-optimal structures, then there is no such thing as a business firm having too much leverage or debt. We disagree with that conclusion on both theoretical and empirical grounds, supported by the evidence provided by the Financial Crisis Inquiry Commission and the basic logic behind Basel III. The model of this paper comes from Brigham and Houston’s Bigbee case and uses beta, the Hamada transformation, and a crucial interest rate function, concepts not available to M&M in 1958. We show that minimizing WACC (the weighted average cost of capital) and maximizing the stock price give identical solutions, and we show their similarity to M&M Propositions I and II. The model also shows the simple mechanism that causes non-optimality. M&M missed non-optimal structures partly because their databases from 1948 and 1953 had only low and moderate debt/equity ratios. Non-optimal behavior appears at high (double-digit) D/E ratios. We have examples of the consequences of excessive leverage not available to M&M, including RJR, Houdaille Industries, the casualties of the 2008 crisis, and others. The RJR LBO is examined in detail because it is as close to a laboratory experiment as can be expected in economics. The analysis shows how extremely high leverage put RJR on the path to bankruptcy and how the Roberts-Gerstner plan restored profitability, following the logic of Brigham’s Bigbee model.
Flashback to July 1990. RJR Nabisco’s debt is $22.90 billion and equity $0.80 billion, a debt/equity ratio of 28.6. This is an extremely high ratio.
Brigham’s Bigbee Electronics Company case ([
Modigliani and Miller actually had the modern WACC function in their 1958 paper. It is Equation (19) of footnote 27. The equation was formed as they modeled the “classic” theory of the cost of capital. They rejected Equation (19) because their empirical analysis did not fit the required curvature for optimality (and non-optimality). Unfortunately, their pre-LBO 1948 and 1953 databases, which they described as skimpy and of rather limited scope (p. 281), did not contain the double-digit D/Es at which non-optimal effects tend to appear. Their 1953 sample of 42 oil companies had an average D/E of 0.41 with a maximum of 1.70, and their 1947-8 sample of 43 electric utilities had an average D/E of 1.62 with a maximum of 3.76. In 1958 M&M did not have the advantage of observing the results of the LBO era or the disasters of 2008. They also did not have the concepts of beta, the Hamada transformation [
Modern theory says that the firm should minimize the weighted average cost of capital (Min WACC) and maximize the common stock price (Max P) and that the two solutions should be identical. The Bigbee model in Brigham and Houston [
Other investigators have had a difficult time trying to find an optimal capital structure. This is because the optimal structure is not a well defined point but a rather broad shallow range. Accordingly, we have taken the reversed approach of trying to find non-optimal structures. It is easier to find non-optimal structures than to find an optimum.
Three approaches summarize the nature of the capital structure debate. According to classical theory, the graph of WACC versus the debt ratio has a U shape; see Modigliani and Miller (1958, p. 278). This led to attempts to find an optimal capital structure at the bottom of the U. Then, in 1958, Modigliani and Miller (M&M) wrote their famous and controversial paper, “The Cost of Capital, Corporation Finance, and the Theory of Investment.”
Their key proposition was that the average cost of capital to any firm is “completely independent” of its capital structure and is equal to the capitalization rate of a pure equity stream of its class. The words “completely independent” indicate that the WACC-debt ratio graph is perfectly flat and that there is no optimal structure. This is the capital structure irrelevance theorem. M&M’s theorem has another interesting implication: not only is there no optimum, there can be no non-optimum either, because if a non-optimum exists there must be some better point. This means that there is no such thing as too much leverage: 96% debt and 4% equity is as good (or bad) as 4% debt and 96% equity. The Financial Crisis Inquiry Commission investigation [
The third version of the capital structure problem is the modern Min WACC-Max P model. Here we are not so concerned with an optimal structure because we believe the WACC function is relatively flat at low and moderate levels of debt and the optimum is actually a rather wide range. But as the debt ratio approaches extreme levels, around and above 90%, the WACC curve increases rapidly and becomes non-optimal. See
Clearly, debt and the associated interest are a problem, as any household with large debt would confirm as it experiences rising interest expense with rising debt. Lawrence Fisher (1959) found a significant (t-statistic = 17.32) quantitative relation between the debt/equity ratio and the interest rate risk premium. As a very simple example, suppose debt doubles and, as a result, the interest rate also doubles. Then interest expense (IEX) will go up four times, since IEX equals debt D times the interest rate rd. Let us look at the interest expense effect in the Bigbee case. See
D | D/E | D(000) | rd | EBIT | IEX | EBT | EAT | rp |
---|---|---|---|---|---|---|---|---|
0.00 | 0.00 | 0 | n/a | 40,000 | 0 | 40,000 | 24,000 | |
0.30 | 0.43 | 60 | 0.09 | 40,000 | 5400 | 34,600 | 20,760 | 0.03 |
0.40 | 0.67 | 80 | 0.10 | 40,000 | 8000 | 32,000 | 19,200 | 0.04 |
0.50 | 1.0 | 100 | 0.12 | 40,000 | 12,000 | 28,000 | 16,800 | 0.06 |
0.60 | 1.5 | 120 | 0.15 | 40,000 | 18,000 | 22,000 | 13,200 | 0.09 |
0.70 | 2.3 | 140 | 0.19 | 40,000 | 26,600 | 13,400 | 8040 | 0.13 |
0.80 | 4.0 | 160 | 0.24 | 40,000 | 38,400 | 1600 | 960 | 0.18 |
0.90 | 9.0 | 180 | 0.30 | 40,000 | 54,000 | −14,000 | −8400 | 0.24 |
0.95 | 19.0 | 190 | 0.3338 | 40,000 | 63,413 | −23,413 | −14,048 | 0.27 |
0.96 | 24.0 | 192 | 0.3408 | 40,000 | 65,434 | −25,434 | −15,260 | 0.28 |
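
The interest expense arithmetic in the table can be reproduced with a few lines of Python. This is a minimal sketch rather than the authors' procedure: it assumes total assets of $200,000 (so that debt equals d times $200,000, matching the D(000) column), EBIT of $40,000 and T = 0.40, takes the rd values as given, and the name `income_row` is ours.

```python
# Recompute IEX, EBT and EAT for the Bigbee rows (illustrative sketch).
TOTAL_ASSETS = 200_000   # assumed Bigbee total assets, so debt = d * 200,000
EBIT = 40_000            # fixed operating earnings
TAX_RATE = 0.40          # T in the Bigbee case

# Debt ratio d -> interest rate rd, copied from the table.
RD_BY_DEBT_RATIO = {0.30: 0.09, 0.40: 0.10, 0.50: 0.12, 0.60: 0.15,
                    0.70: 0.19, 0.80: 0.24, 0.90: 0.30}


def income_row(d, rd):
    """Return (IEX, EBT, EAT) for debt ratio d and interest rate rd."""
    debt = d * TOTAL_ASSETS
    iex = debt * rd                  # interest expense = debt times interest rate
    ebt = EBIT - iex                 # earnings before taxes
    eat = ebt * (1 - TAX_RATE)       # earnings after taxes
    return iex, ebt, eat


for d, rd in RD_BY_DEBT_RATIO.items():
    iex, ebt, eat = income_row(d, rd)
    print(f"d={d:.2f}  IEX={iex:>9,.0f}  EBT={ebt:>9,.0f}  EAT={eat:>9,.0f}")
```

Moving from d = 0.40 to d = 0.80 doubles debt from $80,000 to $160,000 while rd rises from 0.10 to 0.24, so IEX rises 4.8 times, from $8,000 to $38,400.
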
D | TA(000) | D(000) | E(000) | Shares | EPS | rcs* | P |
---|---|---|---|---|---|---|---|
0.00 | 200 | 0 | 200 | 10,000 | 2.40 | 0.12 | 20 |
0.30 | 200 | 60 | 140 | 7000 | 2.97 | 0.135 | 21.90 |
0.40 | 200 | 80 | 120 | 6000 | 3.20 | 0.144 | 22.22 |
0.50 | 200 | 100 | 100 | 5000 | 3.36 | 0.156 | 21.54 |
0.60 | 200 | 120 | 80 | 4000 | 3.30 | 0.174 | 18.97 |
0.70 | 200 | 140 | 60 | 3000 | 2.68 | 0.204 | 13.14 |
0.80 | 200 | 160 | 40 | 2000 | 0.48 | 0.264 | 1.82 |
0.90 | 200 | 180 | 20 | 1000 | −8.40 | 0.444 | −18.92 |
0.95 | 200 | 190 | 10 | 500 | −28.10 | 0.804 | −34.95 |
0.96 | 200 | 192 | 8 | 400 | −38.15 | 0.984 | −38.77 |
*rcs = rRF + rMP [1 + (1 − T) d/(1 − d)] Bu = 0.06 + 0.06 [1 + 0.6 d/(1 − d)].
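
The footnote formula and the P column can be checked in the same way. The sketch below assumes the Bigbee inputs used throughout (rRF = 0.06, rMP = 0.04, Bu = 1.50, T = 0.40, EBIT = $40,000, and 10,000 unlevered shares at $20), assumes that debt proceeds retire shares at $20, and uses P = EPS/rcs; the function names are hypothetical.

```python
# Rebuild EPS, rcs and P for a Bigbee row (illustrative sketch).
EBIT = 40_000         # operating earnings
TAX_RATE = 0.40       # T
R_RF = 0.06           # risk-free rate
R_MP = 0.04           # market risk premium, rM - rRF
BETA_U = 1.50         # unlevered beta, Bu
SHARES_0 = 10_000     # unlevered shares outstanding, Sho
PRICE_0 = 20.0        # unlevered share price, Po


def cost_of_equity(d):
    """CAPM with the Hamada-levered beta: rRF + rMP * Bu * [1 + (1 - T) d / (1 - d)]."""
    levered_beta = BETA_U * (1 + (1 - TAX_RATE) * d / (1 - d))
    return R_RF + R_MP * levered_beta


def share_price(d, rd):
    """Return (EPS, rcs, P) when debt d * Po * Sho retires shares at Po."""
    debt = d * PRICE_0 * SHARES_0
    shares = (1 - d) * SHARES_0
    eps = (EBIT - rd * debt) * (1 - TAX_RATE) / shares
    rcs = cost_of_equity(d)
    return eps, rcs, eps / rcs


for d, rd in [(0.30, 0.09), (0.40, 0.10), (0.50, 0.12), (0.90, 0.30)]:
    eps, rcs, price = share_price(d, rd)
    print(f"d={d:.2f}  EPS={eps:7.2f}  rcs={rcs:.3f}  P={price:7.2f}")
```

Among the discrete rows, the price peaks at d = 0.40 and turns negative once interest expense exceeds EBIT.
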
1988, up 4.35 times. 23,998 = 0.11 × 5518 + 0.14 × 29,100 + 0.25 × (26,420 + 25,690 + 25,159). From the rd% column the interest rate on debt averaged 15.57% (adjusting for the Feb. 9 date) compared to 11.12% in December 1988, up 40%. 15.57 = 0.11 × 11.12 + 0.14 × 13.71 + 0.25 × (14.0 + 17.3 + 18.4). IEX in 1988 was $549 million. Using the quick formula: IEX = $549 × 4.35 × 1.40 = $3343 million, which is close to the actual IEX of $3384 million. In 1989 RJR’s interest expense rose more than six times to $3384 million from $549 million, far exceeding EBIT and causing a $1.3 billion loss before taxes.
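
The quick formula can be written out as a short calculation. This is only a sketch of the arithmetic quoted above: the weights 0.11, 0.14 and 0.25 are taken from the text, and reading them as fractions of 1989 (before and after the Feb. 9 date within 1Q89, then one quarter each) is our interpretation. Variable names are ours, and dollar figures are in millions.

```python
# Quick interest-expense check for the RJR LBO year (illustrative sketch).
WEIGHTS = [0.11, 0.14, 0.25, 0.25, 0.25]          # assumed fractions of 1989
DEBT = [5518, 29_100, 26_420, 25_690, 25_159]     # debt D, $ millions
RATE_PCT = [11.12, 13.71, 14.0, 17.3, 18.4]       # interest rate rd, percent
IEX_1988 = 549                                    # 1988 interest expense, $ millions


def weighted_average(values, weights):
    """Weighted sum of values; the weights here already sum to 1."""
    return sum(v * w for v, w in zip(values, weights))


avg_debt = weighted_average(DEBT, WEIGHTS)
avg_rate = weighted_average(RATE_PCT, WEIGHTS)
debt_multiple = avg_debt / DEBT[0]        # growth in debt versus December 1988
rate_multiple = avg_rate / RATE_PCT[0]    # growth in rd versus December 1988
iex_estimate = IEX_1988 * debt_multiple * rate_multiple

print(f"average debt  = {avg_debt:,.0f}  ({debt_multiple:.2f}x)")
print(f"average rate  = {avg_rate:.2f}%  ({rate_multiple:.2f}x)")
print(f"estimated IEX = {iex_estimate:,.0f}")
```

Because interest expense is the product of debt and the interest rate, the two multiples compound: roughly 4.35 times 1.40 gives about a sixfold rise.
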
M&M do not have the interest expense effect in their model. The reason is given in the fourth paragraph of their paper: “This attempt typically takes the form of superimposing on the results of the certainty analysis the notion of a ‘risk discount’ to be subtracted from the expected yield (or a ‘risk premium’ to be added to the market rate of interest). Investment decisions are then supposed to be based on a comparison of this ‘risk adjusted’ or ‘certainty equivalent’ yield with the market rate of interest. No satisfactory explanation has yet been provided, however, as to what determines the size of the risk discount and how it varies in response to changes in other variables.” This last sentence explains why M&M did not have an interest expense effect in their analysis: they could not find a function relating the risk premium (they call it the risk discount) to capital structure and other variables. An interest rate function is a crucial part of the modern WACC function, and its absence may be a second reason M&M rejected their Equation (19). As a substitute for the missing interest rate function they used an “arbitrage” type of argument, discussed below. Fisher’s interest rate function appeared after M&M’s paper; hence it was not available for them to include in their model.
Ignoring preferred stock, as do M&M, the WACC function is:

$$\mathrm{WACC} = d\,(1 - T)\,r_d + (1 - d)\,r_{cs}$$
Period | SALES | EBIT | IEX | EBT | D | E | D/E |
---|---|---|---|---|---|---|---|
Pre LBO | |||||||
Dec. 1986 | 11,517 | 2009 | 531 | 1478 | 5591 | 5312 | 1.05 |
Dec. 1987 | 11,765 | 1915 | 454 | 1461 | 4279 | 6038 | 0.71 |
Dec. 1988 | 12,635 | 2368 | 549 | 1818 | 5518 | 5694 | 0.97 |
LBO | |||||||
1Q89 | 2926 | 402 | 561 | −159 | 29,100 | 2032 | 14.30 |
2Q89 | 3300 | 547 | 990 | −443 | 26,420 | 1778 | 14.80 |
3Q89 | 2999 | 452 | 983 | −531 | 25,690 | 1359 | 18.90 |
4Q89 | 3539 | 652 | 850 | −198 | 25,159 | 1237 | 20.30 |
YR89 | 12,764 | 2053 | 3384 | −1330 | 25,159 | 1237 | 20.30 |
1Q90 | 3204 | 602 | 830 | −228 | 22,937 | 1024 | 22.40 |
Date | Bond | BP | rd% | rRF% | rp% |
---|---|---|---|---|---|
Pre LBO | |||||
Dec. 1986 | 7 3/8s01 | 90 1/8 | 8.57 | 7.54 | 1.03 |
Dec. 1987 | 8s07 | 84 7/8 | 9.73 | 9.23 | 0.52 |
Oct. 1988 | 8s07 | 82 4/7 | 10.06 | 9.04 | 1.06 |
LBO announcement and bidding process | |||||
Dec. 1988 | 7 3/8s01 | 73 1/4 | 11.12 | 8.24 | 1.88 |
LBO | |||||
1Q89 | Anders, p. 225 | | 13.71 | 9.20 | 4.51 |
2Q89 | Anders, p. 225 | | 14.0 | 8.20 | 5.80 |
3Q89 | na14.70s07 | 85 1/4 | 17.3 | 8.32 | 9.07 |
4Q89 | na14.70s08 | 80 5/8 | 18.4 | 8.17 | 10.25 |
1Q90 | na14.70s08 | 67 1/8 | 20.7 | 8.80 | 11.98 |
2Q90 | Bond prices vary with rescue rumors. |
Source: 10Ks, 10Qs, Annual Reports, Wall Street Journal bond tables. More details in Part 2.
where d is the debt ratio (debt/debt plus equity or D/(D + E)), rd is the interest rate on debt, (1 − d) is the equity ratio, and rcs is the return on equity. T is the tax rate assumed to be 0.40 in the Bigbee case, and (1 − T) converts the before tax interest rate on debt to an after tax measure. Sometimes the debt/equity ratio or D/E is used in formulas; D/E is equal to d/(1 − d). Proposition II is:
Solving this equation for WACC yields:
Hence Proposition II is the modern WACC function missing the (1 − T) factor, a minor omission. There is no conflict at this point. The models differ when it comes to specifying the behavior of rcs and particularly rd.
The equation for finding the stock price P developed below is:
where V is the value of the company, P the price of a share of common stock, Sho the unlevered number of shares outstanding, and EBIT earnings before interest and taxes or operating earnings. Proposition I is:
Later it will be shown that D + E = P Sho so the P function and Proposition I are the same except for the missing (1 − T) factor. Again there is no conflict at this point with M&M. The process of finding Min WACC and Max P in the Bigbee case is a classical calculus problem. The first task is to find the equations for rcs and rd to be substituted into the WACC function. Then take the derivatives which should be identical. Indeed, the corrected version of the P function is the inverse of the WACC function so that which minimizes WACC maximizes P.
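Substituting for rcs: Step 1

The WACC function is:

$$\mathrm{WACC} = d\,(1 - T)\,r_d + (1 - d)\,r_{cs}$$

where T = 0.40. The capital asset pricing model (CAPM) provides an equation for rcs:

$$r_{cs} = r_{RF} + (r_M - r_{RF})\,B$$

where rRF is the risk free interest rate (0.06 in Bigbee), rM the stock market rate of return (0.10 in Bigbee), and B is beta. A shorter form is

$$r_{cs} = r_{RF} + r_{MP}\,B$$

where rMP is the market risk premium equal to rM − rRF (rMP = 0.04 in Bigbee).

Next, the Hamada transformation is used to relate beta to unlevered beta and capital structure:

$$B = B_u\left[1 + (1 - T)\,\frac{d}{1 - d}\right]$$

With d = 0.40, rcs = 0.144. Substituting rcs into the WACC function yields:

$$\mathrm{WACC} = d\,(1 - T)\,r_d + (1 - d)\left\{r_{RF} + r_{MP}\,B_u\left[1 + (1 - T)\,\frac{d}{1 - d}\right]\right\}$$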
At this point assume rd is a constant. Since rRF (0.06), rMP (0.04), Bu (1.50), and T (0.40) also are constants, WACC is a linear function of d and there is neither an interior optimum nor a non-optimum. M&M contend that the slope is zero. If the slope is positive, the optimal d is zero (a clean balance sheet). If the slope is negative (due to the tax deductibility of debt), then the optimal d is 0.99999 (it cannot be 1 because there must be at least one share of stock; otherwise no one would own the company). Given that the rcs effect in WACC is linear, if the WACC function is curved and in the Bigbee data,
With respect to the tax advantage of debt Professor Michael Jensen in 1976 asked [
Substituting for rd: Step 2
There are four alternatives. First, as noted in the interest expense section above, a year after the Cost of Capital article Lawrence Fisher [
D | rd | (1 − T) | TOTd | (1 − d)[rRF + rMP B] | B = Bu[1 + (1 − T)D/E] | TOTcs | WACC |
---|---|---|---|---|---|---|---|
0.00 | n/a | 0.6 | 0 | 1.0[0.06 + 0.04(1.500)] | 1.5(1 + 0.6 × 0) | 0.1200 | 0.1200 |
0.30 | 0.09 | 0.6 | 0.0162 | 0.7[0.06 + 0.04(1.886)] | 1.5(1 + 0.6 × 3/7) | 0.0948 | 0.1110 |
0.40 | 0.10 | 0.6 | 0.0240 | 0.6[0.06 + 0.04(2.10)] | 1.5(1 + 0.6 × 4/6) | 0.0864 | 0.1104 |
0.50 | 0.12 | 0.6 | 0.0360 | 0.5[0.06 + 0.04(2.40)] | 1.5(1 + 0.6 × 1.0) | 0.0780 | 0.1140 |
0.60 | 0.15 | 0.6 | 0.0540 | 0.4[0.06 + 0.04(2.85)] | 1.5(1 + 0.6 × 1.5) | 0.0696 | 0.1236 |
0.70 | 0.19 | 0.6 | 0.0798 | 0.3[0.06 + 0.04(3.60)] | 1.5(1 + 0.6 × 7/3) | 0.0612 | 0.1410 |
0.80 | 0.25 | 0.6 | 0.1200 | 0.2[0.06 + 0.04(5.10)] | 1.5(1 + 0.6 × 4) | 0.0528 | 0.1728 |
0.90 | 0.30 | 0.6 | 0.1620 | 0.1[0.06 + 0.04(9.60)] | 1.5(1 + 0.6 × 9) | 0.0444 | 0.2064 |
0.95 | 0.33 | 0.6 | 0.1903 | 0.05[0.06 + 0.04(18.60)] | 1.5(1 + 0.6 × 19) | 0.0402 | 0.2305 |
their fourth paragraph with this sentence, “No satisfactory explanation has yet been provided, however, as to what determines the size of the risk discount and how it varies in response to changes in other variables.” The Fisher regressions provide exactly what M&M said was missing. The key regression is:
where all variables are in common logarithms: X0 is the risk premium, X1 the coefficient of variation of after tax earnings (should have been EBIT, earnings before interest and taxes), X2 the time of solvency, X3 the D/E ratio (Fisher did the E/D ratio), and X4 size. The numbers in parentheses are standard errors. The t-statistic of D/E is 17.32. Altman’s (1968) Z-Score also found D/E to be extremely significant, but it is not in a directly usable format. Using the Hamada transformation to adjust earnings variability from earnings after taxes to EBIT, and converting back from logarithms, yields:
The second solution is to not have an rd function. In 1958 the Fisher and Altman studies had not been done. Accordingly, M&M could not find any quantitative relation of rd to capital structure so they did not include an interest rate function in their model. Suppose rd is a linear function of
Inserting Equation (13) into the WACC function yields:
Now look at M&M’s footnote 27 Equation (19) with our notation for the debt ratio d and the equity ratio (1 − d) and capital letters for Greek letters:
They have the same form, and the d²/(1 − d) terms which cause curvature are the same, except that again the (1 − T) factor is missing in Equation (19). This is the form of equation used in
The Brigham-Houston Bigbee case rd assumption. For instructional simplicity the rd function was presented in
Substituting rRF = 0.06, rMP = 0.04, Bu = 1.50, T = 0.40 yields:
This equation reproduces WACC values for Bigbee exactly for d = 0.00, 0.30, 0.40, 0.50, and 0.60 (WACCs are 0.1200, 0.1110, 0.1104, 0.1140, and 0.1236).
The optimal WACC cited from Brigham’s discrete table is 0.1104 from a d of 0.40.
Solving for the first order condition of Equation (4) yields:
The positive root of this quadratic is d = 0.36942542 giving a minimum WACC of 0.110221.
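
The first order condition can be checked numerically. The rd function itself is not reproduced here, so the sketch below assumes a quadratic, rd = 0.12 − 0.25d + 0.50d², which we fitted to the rd column of the Bigbee table (it matches the tabulated rates from d = 0.30 to d = 0.90). That fit is our assumption, not a formula quoted from the source. Under it WACC is a cubic in d and its derivative is a quadratic.

```python
import math

TAX_RATE = 0.40   # T
R_RF = 0.06       # risk-free rate
R_MP = 0.04       # market risk premium
BETA_U = 1.50     # unlevered beta

# ASSUMPTION: quadratic fit to the table's rd column, rd(d) = a0 + a1*d + a2*d**2.
RD_COEFFS = (0.12, -0.25, 0.50)


def rd(d):
    a0, a1, a2 = RD_COEFFS
    return a0 + a1 * d + a2 * d * d


def wacc(d):
    """WACC = d(1 - T)rd + (1 - d)rcs with CAPM and Hamada substituted for rcs."""
    # (1 - d) * rcs, expanded so that no division by (1 - d) is needed
    equity_part = (1 - d) * R_RF + R_MP * BETA_U * ((1 - d) + (1 - TAX_RATE) * d)
    return d * (1 - TAX_RATE) * rd(d) + equity_part


def optimal_debt_ratio():
    """Positive root of dWACC/dd = 0 under the assumed quadratic rd."""
    a0, a1, a2 = RD_COEFFS
    k = 1 - TAX_RATE
    # WACC(d) = c0 + c1*d + c2*d**2 + c3*d**3
    c1 = k * a0 - R_RF - TAX_RATE * R_MP * BETA_U
    c2 = k * a1
    c3 = k * a2
    # First order condition: c1 + 2*c2*d + 3*c3*d**2 = 0
    disc = (2 * c2) ** 2 - 4 * (3 * c3) * c1
    return (-2 * c2 + math.sqrt(disc)) / (2 * 3 * c3)


d_star = optimal_debt_ratio()
print(f"optimal d = {d_star:.8f}, minimum WACC = {wacc(d_star):.6f}")
```

With these inputs the cubic is WACC = 0.12 − 0.012d − 0.15d² + 0.30d³, which returns the tabulated WACC values at d = 0.30, 0.40, 0.50 and 0.60.
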
A feature of the Bigbee case is that the WACC function is relatively flat around the minimum. The RJR Nabisco WACC function is even flatter in the optimal range. This flatness around the minimum may account partly for the difficulty in overturning the irrelevance theorem. In pre-LBO days most firms had conservative debt ratios well below d = 0.90, where non-optimality begins to appear. Hence most observations come from the flat range, so the irrelevance theory appears to be valid. The next task is to find out whether the Max P function gives the same result. It does not, meaning that the Bigbee model has to be fixed.
The Brigham analysis that determines the stock price P begins with Line 1 of
The new stock price P is: P = EPS/rcs, where EPS is earnings per share (for simplicity, dividends per share equal earnings per share). EPS = (EBIT − rdD)(1 − T)/Sh, where EBIT = $40,000 (which is fixed for all situations), rd is 0.10 (from
As Bigbee issues debt and uses the proceeds to buy back shares, the debt D is dPoSho and shares outstanding Sh equal (1 − d) Sho. Substituting yields the P function:
With EBIT = 40,000, Po = 20, Sho = 10,000,
The derivative is messy, so Function Grapher Online by walterzorn.com was used to find the maximum P of $22.2241 at d = 0.39358. This is incorrect because the correct optimal d found by minimizing WACC is 0.36943.
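
The same maximum can be located without a graphing tool. The sketch below is our own check, not the procedure used in the paper: it evaluates P = (EBIT − rdD)(1 − T)/(Sh rcs) with D = dPoSho and Sh = (1 − d)Sho on a fine grid of debt ratios, and it assumes the quadratic rd = 0.12 − 0.25d + 0.50d² that we fitted to the Bigbee rd column.

```python
EBIT = 40_000
TAX_RATE = 0.40
R_RF, R_MP, BETA_U = 0.06, 0.04, 1.50
SHARES_0 = 10_000   # Sho
PRICE_0 = 20.0      # Po, the fixed repurchase price


def rd(d):
    # ASSUMPTION: quadratic fit to the Bigbee rd column, not quoted from the paper.
    return 0.12 - 0.25 * d + 0.50 * d * d


def price_fixed_buyback(d):
    """Share price when shares are repurchased at Po regardless of leverage."""
    debt = d * PRICE_0 * SHARES_0
    earnings = (EBIT - rd(d) * debt) * (1 - TAX_RATE)
    # (1 - d) * rcs, expanded to avoid dividing by (1 - d)
    weighted_rcs = (1 - d) * R_RF + R_MP * BETA_U * ((1 - d) + (1 - TAX_RATE) * d)
    return earnings / (SHARES_0 * weighted_rcs)


def argmax_on_grid(func, lo=0.0, hi=0.95, steps=95_000):
    """Return (d, value) at the largest value of func on an evenly spaced grid."""
    best_d, best_value = lo, func(lo)
    for i in range(1, steps + 1):
        d = lo + (hi - lo) * i / steps
        value = func(d)
        if value > best_value:
            best_d, best_value = d, value
    return best_d, best_value


d_best, p_best = argmax_on_grid(price_fixed_buyback)
print(f"max P = {p_best:.4f} at d = {d_best:.5f}")
```

A grid search is crude but transparent; any one-dimensional optimizer would serve equally well on this single-peaked function.
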
The reason that the optimal d for Max P is different than the d that minimizes WACC is the assumption that as the company levers itself the stock price P remains at $20/share regardless of the amount of debt undertaken. But if investors are informed they may not want to sell their shares back to the company if restructuring generates a higher price. The efficient markets solution is to assume that the shares are sold back to the company at the equilibrium price P instead of Po = $20. Replacing Po with P in Equation (20) yields:
Now P is on both sides of the equation. To solve for P cross multiply, divide by Sho, add Pdrd (1 − T) to both sides and then factor out P:
The term in brackets,
Since EBIT, Sho, and T are constants, P is a reciprocal function of WACC. What minimizes WACC maximizes P which is what is supposed to happen. Since EBIT = 40,000, Sho = 10,000, T = 0.40, and Min WACC = 0.110221, the Maximum price is 21.7744. The reason why this value is lower than the Brigham answer of $22.2222 is because Bigbee is repurchasing shares at a price of $21.77 rather than the bargain price of $20.00.
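
The reciprocal relation between P and WACC can be verified directly. The sketch below computes P = EBIT(1 − T)/(Sho × WACC) and confirms that this price satisfies the repurchase-at-P condition P = (EBIT − rd d P Sho)(1 − T)/[(1 − d) Sho rcs], both reconstructed from the steps described above. As before, the quadratic rd = 0.12 − 0.25d + 0.50d² is our fitted assumption, and the function names are hypothetical.

```python
EBIT = 40_000
TAX_RATE = 0.40
R_RF, R_MP, BETA_U = 0.06, 0.04, 1.50
SHARES_0 = 10_000   # Sho


def rd(d):
    # ASSUMPTION: quadratic fit to the Bigbee rd column, not quoted from the paper.
    return 0.12 - 0.25 * d + 0.50 * d * d


def weighted_equity_cost(d):
    """(1 - d) * rcs with CAPM and Hamada substituted, without dividing by (1 - d)."""
    return (1 - d) * R_RF + R_MP * BETA_U * ((1 - d) + (1 - TAX_RATE) * d)


def wacc(d):
    return d * (1 - TAX_RATE) * rd(d) + weighted_equity_cost(d)


def equilibrium_price(d):
    """Closed form: P = EBIT (1 - T) / (Sho * WACC)."""
    return EBIT * (1 - TAX_RATE) / (SHARES_0 * wacc(d))


def buyback_residual(d):
    """P minus the right side of the repurchase-at-P equation; zero if P is consistent."""
    price = equilibrium_price(d)
    debt = d * price * SHARES_0
    rhs = (EBIT - rd(d) * debt) * (1 - TAX_RATE) / (SHARES_0 * weighted_equity_cost(d))
    return price - rhs


D_STAR = 0.36942542   # Min WACC debt ratio reported in the text
print(f"P at d* = {equilibrium_price(D_STAR):.4f}")
print(f"residual = {buyback_residual(D_STAR):.2e}")
```

Because the numerator is constant, any debt ratio with a higher WACC than the minimum gives a lower price; at d = 0.40, for example, the equilibrium price is about $21.74.
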
Now Equation (23) can be converted into M&M’s Proposition I. Multiply both sides of Equation (23) by Sho. Debt D = dPSho and equity E = (1 − d)PSho, so that D + E = PSho. Hence,

$$D + E = P\,Sh_o = \frac{\mathrm{EBIT}\,(1 - T)}{\mathrm{WACC}}$$

which is Proposition I (including the tax adjustment factor).
M&M’s Proposition II is the WACC function in a different algebraic form. Proposition I is the stock price function, as shown by Equations (23) and (24). M&M had the modern curved WACC function in their Equation (19) but did not believe it. A problem with their admittedly “skimpy database of rather limited scope” was that it did not have any high D/E observations, which tend to show non-optimal behavior. In 1958 they did not have the benefit of seeing the LBO era or the excesses of 2008.
The most important theoretical reason was that M&M did not have an interest rate function relating the interest rate on corporate debt through the default risk premium to capital structure. In 1958 M&M did not have the 1959 Fisher interest model or the Altman model [
William Carlson, Conway Lackman (2016) Can Business Firms Have Too Much Leverage? M&M, RJR 1990, and the Crisis of 2008. Modern Economy, 07, 194-203. doi: 10.4236/me.2016.72021