Conviction Function: A New Decision Paradigm for Personal Financial Risk Management in the Face of Large Exogenous Shocks

This paper contributes to the limited-information literature on savings in a stochastic environment. In particular, it contributes techniques and concepts to the question of state verification (or filtering) by including signal-based learning about aggregate income shocks. As a contribution to the extant literature, a "conviction function" is introduced, which takes into account histories of past prediction errors in determining how rational agents internalize such information when making personal investment decisions. For the purpose of a more transparent illustration, a numerical rendition of the posited model is provided for five consecutive time periods. We also perform a series of Monte Carlo simulations to demonstrate how the posited approach could potentially outperform traditional forward-looking models in the presence of sudden large exogenous shocks reminiscent of the recent Global Financial Crisis.

global financial crisis (GFC), were rudely disappointed. In this article we examine the nature of an optimal retirement savings plan and the resilience of such a plan to large external shocks.
A retirement savings plan can in essence be reduced to a dynamic portfolio insurance strategy that actively allocates funds to a risky asset (or pool of risky assets) when the market is expected to move in a positive direction, and diverts funds to a low-risk asset when market returns are expected to decrease. However, a major shortcoming of such a plan is the "forward-looking" nature of these decisions. A purely prediction-dependent plan is fraught with non-negligible costs of prediction error. While auto-corrections for very small errors in prediction can be built into prediction-dependent models, large systemic shocks can jeopardize their predictive accuracy and render them unusable. Most significantly, if such shocks occur towards the end of the planning horizon (i.e. as an individual approaches retirement age), then the recovery time is often not adequate to allow the models to reach the target savings goal.
The motivation of this paper is two-fold. Firstly, we postulate a better retirement planning approach that incorporates both forward-looking and backward-looking techniques in dynamically determining the optimal path. We combine the standard dynamic programming approach that applies the Bellman principle with a Bayesian learning approach. This allows our posited approach to endogenously account for past errors in prediction. We present a numerical rendition of our posited dynamic optimization strategy for a five-period horizon via a rather large 32 × 32 event-action matrix to extricate the best possible outcome for each error in prediction made in the past periods. This illustrative method quickly becomes computationally intractable for large values of n, but n = 5 helps to adequately illustrate our case. A rational Bayesian learner should not be affected by his/her past performance, as he/she has incorporated all possible matches and mismatches between his/her signals and states and updated his/her beliefs accordingly. However, in the real world the assumption of rationality may not hold, as past performance may be considered one of the crucial factors in analyzing one's investment decisions, in particular in managing the risk portfolio. We therefore examine the case of using past performance as a tracking error and consider the implications for the Bayesian learner's decision making process. We find that when past performance has revealed a great number of mistakes, this leads the Bayesian learner into an unnecessarily conservative investment decision making process. However, the reverse is not true, i.e. if past performance has revealed that he/she did not make any large mistakes, he/she follows the Bayesian decision making process. In other words, past performance affects an individual Bayesian learner's decision making process only when he/she has made too many mistakes in the past. Indeed, our theoretical prediction is supported by the recent empirical evidence in reference [1], which examined how past performance affects mutual fund tracking error.
Secondly, a recent, real-life example of a massive, systemic shock is, of course, the GFC. Many individuals who had previously thought that their savings were well on track to fund their (for some, imminent) retirement were subjected to great disappointment and distress.
An improved understanding of the process by which such shocks affect the accumulation of retirement funds would clearly be useful. A naïve solution to the dynamic retirement planning problem for an individual would be to redirect all savings to proxy risk-free assets. However, because of the very low rates of return and the threat of inflation, this approach would not be favorable (or even practical). A better solution ought to involve a dynamic funds allocation strategy that takes into account the individual planner's prediction of the state of financial markets as well as the effect of past errors in prediction. This is what we have posited and illustrated in this paper.

Brief Review of Extant Literature
Reference [2] recognized that finding a way around the first principle problem is a key issue that lies at the heart of a quintessential social security problem. Effectively, the retirement planning problem can be viewed as a special form of the social security problem, where targeting the saving goal at retirement age and subsequently determining how to achieve that target is a crucial factor. The extant literature has mostly addressed this problem via modern portfolio theory [3] [4] or, more recently, via an optimal stopping rule with state constraints [5]. The problem is typically formulated and solved with the help of optimal control theory in either continuous time or discrete time [6]. In most cases, the continuous-time problem is much easier to solve than the discrete-time one. However, the latter technique is preferable in terms of a more transparent representation of the solution method, especially with regard to its actual application in funds management [7] [8]. In the seminal work of reference [9], the classical dynamic programming approach is embellished by incorporating a suite of key behavioral and situational factors (e.g. an individual agent's beliefs, age, nature of employment, etc.). However, these factors are incorporated via pre-set parameters, and the only stochastic element in the model is the probabilistic equation of motion (i.e. the standard Markov process).
Most of the later models were minor extensions of reference [9]'s model [8].
As summarized by reference [10], the standard model of dynamic programming assumes that the parameters are known, so that learning is complete. Indeed, it is quite possible to argue that an individual agent does not know the true state of the market, but may receive a signal of the future direction (i.e. form a prediction about the future state of the market). Such a signal plays an important role in helping the individual agent decide on an action. Given that an individual agent is rational, it may be assumed that he/she engages in Bayesian learning. In this respect, reference [11] has demonstrated how Bayesian learning can be incorporated in dynamic programming to make an optimal decision. Later, reference [12] incorporated Bayesian learning to show empirically how such learning affects asset prices. In this paper, we extend reference [11] by introducing the conviction function. A "conviction function", which is a function of past prediction errors, builds persistence, so that histories of past shocks affect the value function that agents use to make savings choices. Thus it is not just the signal (i.e. the prediction) but the conviction function in conjunction with the signal that ultimately determines the action. The conviction function is not built into classical dynamic programming; we therefore regard it as the most significant methodological contribution of our paper.
Note that, in a discrete-event case, the backward induction technique starts with the last period and works its way backwards, period by period, to the initial period to establish an optimal time path. However, in so doing, it ignores forward looking induction. A key decision-making task precedes every state variable taking on a particular value: before the true nature of a state is revealed, an individual agent has to decide whether or not to take a certain course of action given how he/she understands the signals, and also given his/her conviction function as determined by past prediction errors. This idea is similar to that of reference [13], which introduced the concept of forward looking induction in the context of sub-game perfect Nash equilibrium. However, unlike reference [13], we apply forward looking induction in a recursive way through Bayesian learning.

Utility, Strategy and Payoff
Consider an individual agent who lives for $n+1$ periods, where $n \geq 2$, and dies at period $n+2$. Thus the terminal period is $n+1$. The decision variable $c_t$ is interpreted as the choice of his consumption at period $t$ so that he can achieve the targeted level of asset $a_t$ at period $t$. We assume that $a_{n+2} = 0$ (since the agent dies at period $n+2$). Each individual agent starts with the initial asset level $a$ at period $t = 0$, i.e. our initial condition. As in the conventional literature on dynamic programming, we assume that an individual's utility function is time additive separable (hereafter, TAS) and concave with respect to the decision variable $c_t$.
Without loss of generality, we assume that an agent's utility function is as follows: $U = \sum_{t} \beta^{t} u(c_t)$, where $\beta$ is a discount factor. All individual agents are homogeneous in their preferences.
The state variable $a_t$ evolves according to the following equation of motion: $a_{t+1} = R\,(a_t - c_t)$, i.e. the next period's targeted asset level depends on the previous period's asset level less the choice of consumption at period $t$, that is, the net saving. Here, [...]. In other words, in the case of the risky asset, the return $\delta_r$ is obtained with probability $\alpha$, and a return of 0 is obtained with probability $(1 - \alpha)$. The safe asset, however, yields the return $\delta_s$ with probability 1. Clearly, $\alpha$ is very important for the risky investment, and in the next two sections we develop the conviction function in conjunction with Bayesian learning and define $\alpha$ in terms of the conviction function.

Signals and Bayesian Learning
Consider the following scenario. An individual agent does not know how well the economy will perform at the beginning of each period $t+i$; however, he/she knows with certainty that there are only two possible outcomes, $S_g$ and $S_b$ (i.e. good and bad states respectively), with equal probability. This implies that an individual's prior belief is $P(S_g) = P(S_b) = \frac{1}{2}$. Furthermore, an individual receives signals g and b at the beginning of each period. We assume that these signals are symmetric binary by nature. In other words, the state and the signal match with probability $p$, and otherwise there is a mismatched outcome with probability $1 - p$.
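The updating step implied by this set-up can be sketched in a few lines of Python. This is only an illustrative sketch under the stated assumptions (two states, a symmetric binary signal with match probability p); the function name `posterior_match` and its parameters are hypothetical and do not come from the paper.

```python
def posterior_match(p: float, prior_good: float = 0.5) -> float:
    """Posterior P(S_g | signal g) for a symmetric binary signal.

    p          -- probability that the signal matches the true state
    prior_good -- prior probability of the good state (1/2 in the text)
    """
    if not (0.0 <= p <= 1.0 and 0.0 <= prior_good <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")
    # Bayes' rule: P(S_g | g) = P(g | S_g) P(S_g) / P(g)
    numerator = p * prior_good
    evidence = numerator + (1.0 - p) * (1.0 - prior_good)
    if evidence == 0.0:
        raise ValueError("signal g has zero probability under these inputs")
    return numerator / evidence


if __name__ == "__main__":
    # With the equal prior used in the text, the posterior equals p itself.
    print(posterior_match(0.7))
```

With the equal prior assumed in the text, the posterior belief that the state is good after observing signal g is simply p, which is why the match probability p can be read directly as the strength of the signal.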

Conviction Function
Although Definition 1 explains how a decision will be made given the signal for the current period, it does not tell how a decision is made when an agent has perfect recall of the success or failure of past predictions. Indeed, past history may affect the agent's belief and make him/her less confident about his/her current signal, or the other way around. To make this clear, let us assume that the agent is at period $t+1$. He/she can observe whether his/her decision at period $t$ was right or wrong by observing the state variable at the current period (i.e. in our model [...]). In so doing, he/she will follow the following steps: At period $t$, ex-ante, an agent has received signal g with [...]; he/she might think that there is a match between his/her signal and the state of the economy's performance (following Definition 1).
However, this is true only if $E(\lambda_t) = 0$, i.e. he/she did not make any mistake in the last period. If an agent made a mistake in the last period, the current signal is not enough to build his/her confidence level to choose the same action that he/she took in the last period. In this case, an agent has to revise his/her confidence level.
Thus if he/she receives signal g, the strength of the signal g at period [...]. The agent will be confident enough that his/her signal is matched with the economy's performance if the following is true: [...] This implies that an agent will be confident enough that his/her signal is matched with the current performance of the economy if his/her signal [...]. It should be noted that [...] when the agent made a mistake at period $t$.
Hereafter, $p^c_t$ refers to an agent's confidence function about his/her signal at period $t+i$.
Given condition (2) and the process described in the above three steps, the same procedure can be generalized for $n+1$ periods, where $n \geq 2$. Suppose at period $n$ an agent receives signal g with probability $p$ and revises his/her confidence level based on whether his/her signal is matched with the current performance of the economy. From the history of past periods, the agent learns that his/her signals were correct for $k$ periods, and thus incorrect for $n-k$ periods. Therefore, at period $n$, condition (2) will be as follows: [...] (3) We are now able to define $\alpha$ in terms of $p^c_t$. Indeed, for any period $t > 0$, $\alpha$ is defined as follows: $\alpha = 1 - p^c_t$. Note that if one has made a lot of mistakes in the past, one will have a higher value of $p^c_t$ (see Proposition 1(B)); in that case, one will face a lower value of $\alpha$, i.e. the probability of reaching the next state with the higher return $\delta_r$ will be very low. If one does not make any mistakes [...]

Numerical Rendition
We present a numerical rendition of the dynamic optimization problem as posited and developed in the previous sections. For the purpose of computational tractability, we have limited an individual agent's planning horizon to five discrete time periods (six nodes), with a time-line going from $t_0$ to $t_5$.
At each node there are two possible, mutually exclusive events: either the signal matches the true state of the market (i.e. there is no prediction error) or the signal does not match the true state of the market (i.e. there is a prediction error). Also, at each node the individual agent chooses a certain course of action, and for the sake of simplicity we limit the pertinent action space to a set of two mutually exclusive choices: either the individual agent opts for a risky pool of assets or he/she opts for a riskless pool of assets. The starting amount in the retirement fund (i.e. $y_0$) is set at 1.0. Given that the agent opts for the risky portfolio, the payoff from a match is $\Delta h = 0.10$, while the payoff from a mismatch is $\Delta h = 0.00$. If, on the other hand, the agent opts for the riskless portfolio, then the payoff is 0.05 in both cases (i.e. irrespective of whether it is a match or a mismatch), as by definition this is a risk-free outcome. The riskless portfolio, if chosen in each of the five periods, would ensure a certain ending amount of 1.25 at the end of the fifth period, irrespective of whether or not the signal matched the true state (i.e. whether or not the predictions were correct) in any of the past periods. Therefore this serves as a baseline value and realistically, all targeted [...]
For each period, therefore, there may be a match or a mismatch of signal and true state for one, some or all previous periods, and a corresponding outcome depending on whether a risky or a riskless portfolio was chosen by the agent in each of those previous periods. This effectively results in an event-action matrix of dimension $2^n \times 2^n$, which, as one may construe, could become inordinately large even for a moderately large value of n and quickly approach the limits of computability. For a fairly small n = 5, we have constructed a 32 × 32 matrix representing all the possible time paths from $y_0$ to $y_5$ that the individual agent could follow.
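As a rough illustration of how such an event-action matrix can be enumerated, the following Python sketch builds all 32 event sequences and all 32 action sequences for n = 5 and computes the ending fund value for each pair. It assumes that per-period payoffs are simply added to the fund value, which is consistent with the riskless ending amount of 1.25 stated above; amounts are held in hundredths to avoid rounding noise. The function names are hypothetical and this is not the authors' own implementation.

```python
from itertools import product

N_PERIODS = 5
START = 100           # starting fund value y_0 = 1.00, in hundredths
RISKY_MATCH = 10      # risky payoff when the signal matches the state
RISKY_MISMATCH = 0    # risky payoff when there is a prediction error
SAFE = 5              # riskless payoff, match or mismatch


def ending_value(events, actions):
    """Ending fund value (in hundredths) for one event/action path.

    events[i]  is True when the signal matched the state in period i.
    actions[i] is True when the risky portfolio was chosen in period i.
    Payoffs are assumed to be additive across periods.
    """
    total = START
    for match, risky in zip(events, actions):
        if risky:
            total += RISKY_MATCH if match else RISKY_MISMATCH
        else:
            total += SAFE
    return total


def event_action_matrix(n=N_PERIODS):
    """Return (events, actions, matrix) with a 2^n x 2^n payoff matrix."""
    events = list(product([True, False], repeat=n))
    actions = list(product([True, False], repeat=n))
    matrix = [[ending_value(e, a) for a in actions] for e in events]
    return events, actions, matrix


def best_by_error_count(n=N_PERIODS):
    """Best attainable ending value for each number of prediction errors."""
    events, _, matrix = event_action_matrix(n)
    best = {}
    for e, row in zip(events, matrix):
        errors = e.count(False)
        best[errors] = max(best.get(errors, 0), max(row))
    return best


if __name__ == "__main__":
    for errors, value in sorted(best_by_error_count().items()):
        print(f"{errors} error(s): best ending value {value / 100:.2f}")
```

Under these assumptions the best attainable ending value falls by 0.05 for each additional prediction error, from 1.50 with no errors down to the riskless baseline of 1.25 with five errors.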
The results obtained from the 32 × 32 event-action matrix are summarized in Table 2. It must be noted that this numerical rendition does not incorporate a massive exogenous shock (reminiscent of the GFC) but is rather a straightforward numerical demonstration of how the dynamic time paths would pan out for all possible combinations of event and action, and it shows how learning might be organically embedded in the temporal evolution process. By incorporating the conviction level (a function of past prediction errors) into the decision process along with the signal generated in each period, an individual agent is better able to identify the optimal path into the future given his/her present location at any intermediate node. For example, if an agent is at the beginning of the fifth (i.e. terminal) period and has made only one prediction error up to that point, then he/she can reach at best 1.45 (if he/she selects the risky portfolio and there is a match of signal with true state in the last period) or, at worst, 1.40 (if he/she selects the risky portfolio and there is a mismatch of signal with true state in the last period), so he/she will always select the risky portfolio if the target $y^* \geq 1.40$. On the other hand, if the agent has made two prediction errors already, then he/she can reach 1.40 at best (if he/she selects the risky portfolio and there is a match of signal with true state in the last period), so he/she will need stronger conviction to opt for the risky portfolio in the last period.
In the next section we first present the results of Monte Carlo simulations for two different cases: a naïve case (without learning) and an alternative one (with learning). For each of these cases, we have two alternative scenarios: one with a massive exogenous shock (reminiscent of the GFC) and the other without a shock. We then discuss the results in detail.

Monte Carlo Simulations
To allow for a more realistic presentation of the implications of the theoretical model, the simulation is set up in such a manner that it is representative of an individual who saves for retirement over a 40-year working career. At the end of each month the individual observes the performance of the financial market over the last month and formulates a decision as to how the accumulated assets should be invested for the following month.
As in the theoretical formulation of the model, financial markets have only two states: high returns or low returns. In the simulation the return when the market is "high" is set at 1.0% per month, and when the market is "low" there is a negative return of 0.85% per month. The state of the market in any one period is modeled as a binomial random variable with a 50% probability of a high or low outcome. This effectively gives a slightly positive return to the overall market in the longer term. There is also a "safe portfolio", and this is set to give a return of 0.1% per month. We consider two cases: one in which the individual "remembers" his past mistakes and the other in which there is no memory of the past. Without memory of past mistakes the individual has no reason not to be optimistic and will thus invest in the risky portfolio in each period. However, if past mistakes are remembered, then the theoretical model is used to determine whether the individual invests in the risky portfolio or the safe portfolio. Without loss of generality, an initial investment of $1000 is made at the beginning of the period, and the total is reinvested each period. The results of 2000 iterations of the simulation are shown in Figure 1. The fact that there is little difference between the situations with and without memory is as expected. The recall of past mistakes does not lead to an advantage in asset selection, since it does not provide any advantage in predicting the future course of the market.
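The "slightly positive" long-run market return can be checked with a short worked calculation using only the simulation parameters stated above (1.0% in the high state, -0.85% in the low state, each with probability 0.5). The first line is the expected one-month return; the second and third lines give the compound growth over one high and one low month and the implied average monthly growth rate.

```latex
\begin{align*}
E[r_{\text{risky}}] &= 0.5 \times 1.0\% + 0.5 \times (-0.85\%) = 0.075\% \text{ per month},\\
(1 + 0.010)\,(1 - 0.0085) &= 1.001415,\\
\sqrt{1.001415} - 1 &\approx 0.0707\% \text{ per month}.
\end{align*}
```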
In order to test for the effect of an external shock to the system, the simulation is adapted so that a period of "low" returns is forced into the stream of market returns. The value of retirement savings falls for both sets of individuals because of the crisis, but the individuals with learning do better. This is not because they are better at predicting the outcome of the market, but because when they experience negative outcomes they adjust by holding the safe portfolio until they have experienced enough correct signals for them to once again invest in the risky portfolio.
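The simulation design described in this section can be sketched as follows. The market parameters (1.0%, -0.85%, 0.1% per month, a 480-month horizon, $1000 initial investment, 2000 iterations) are taken from the text. The switching rule for the investor with memory is not spelled out here, so the sketch substitutes a simple hypothetical rule: after observing a low month the investor moves to the safe portfolio and returns to the risky portfolio only after `recovery` consecutive high months. The shock position and length, the `recovery` parameter and all function names are likewise assumptions made for illustration.

```python
import random

MONTHS = 40 * 12                       # 40-year working career
HIGH, LOW, SAFE = 0.010, -0.0085, 0.001
START = 1000.0                         # initial investment in dollars


def draw_states(rng, months=MONTHS, shock=None):
    """Draw monthly market states (True = high) with probability 0.5.

    shock, if given, is a (start, length) pair of months in which the
    market is forced into the low state (hypothetical parameterisation).
    """
    states = [rng.random() < 0.5 for _ in range(months)]
    if shock is not None:
        start, length = shock
        for t in range(start, min(start + length, months)):
            states[t] = False
    return states


def final_value(states, memory, recovery=3):
    """Terminal fund value along one path of market states.

    memory=False: always hold the risky portfolio.
    memory=True:  assumed rule, not the paper's exact model: move to the
                  safe portfolio after a low month, and return to the risky
                  portfolio after `recovery` consecutive high months.
    """
    value = START
    in_risky = True
    good_streak = 0
    for high in states:
        market = HIGH if high else LOW
        # The return earned this month depends on last month's decision.
        value *= 1.0 + (market if in_risky else SAFE)
        if memory:
            if high:
                good_streak += 1
                if good_streak >= recovery:
                    in_risky = True
            else:
                good_streak = 0
                in_risky = False
    return value


def simulate(iterations=2000, shock=None, seed=0):
    """Average terminal values (no memory, with memory) over many paths."""
    rng = random.Random(seed)
    no_memory, with_memory = [], []
    for _ in range(iterations):
        states = draw_states(rng, shock=shock)
        no_memory.append(final_value(states, memory=False))
        with_memory.append(final_value(states, memory=True))
    return sum(no_memory) / iterations, sum(with_memory) / iterations


if __name__ == "__main__":
    print("no shock:  ", simulate())
    print("with shock:", simulate(shock=(420, 24)))
```

Both investors face the same simulated market path in each iteration, so any difference in average terminal value comes from the allocation rule alone rather than from different draws.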

Conclusion
The significance of our research is two-fold. Firstly, we have proposed and formally derived an extension to the classical dynamic programming model via incorporation of a Bayesian learning component. Apart from theoretically deriving this extension, we have applied it in the practical context of a retirement savings planning problem, whereby an individual agent starts with a given amount in his/her retirement fund and then tries to optimally manage it over a finite time horizon so as to attain a certain minimum target value. The model takes into consideration the risk propensity of the individual agent and, as intuition suggests, yields a trivial solution if the target minimum value is less than or equal to the maximum value attainable via riskless investing. The numerical rendition illustrates that for every past prediction error, the highest possible target end value of the retirement fund would have to be revised downward, with an associated drop in the probability of achieving the target for each additional error made.
Clearly then, if the individual agent was "learning" from the past errors and thereby becoming more and more "stringent" in terms of the minimum signal strength required to induce a selection of the risky portfolio, the riskless portfolio would be opted for more and more as the number of past errors increased.
This would effectively help to "lock in" the fund value to a certain target end value with lower and lower likelihood of any further downward adjustment.Our

Appendix 1. Example: Finite Horizon Decision Making Process and Conviction Function
In order to obtain a closed form solution we assume that [...]. Therefore, we can formulate the finite horizon dynamic problem as follows: [...] In order to solve the above problem, we assume that an individual agent is making his/her decision at period $n$. At period $n$, he/she knows for sure how many mistakes he/she made in the past $n-1$ periods. Furthermore, we assume that he/she has made $(n-1)-k$ mistakes in the past. In other words, he/she made correct decisions in $k$ periods. In these $k$ periods, he/she may have made both risky and safe investments. Suppose he/she made successful risky investments for $x$ periods and safe investments for $k-x$ periods. Thus his/her initial asset accumulation at period $n-1$ is as follows: [...] In period $n$, the value function one faces (following from Bellman's principle of optimality) is as follows: [...] (9) Similarly, in period $n-1$, the value function will be as follows: [...] (10) Equations (9) and (10) are both key to applying Bellman's principle of optimality. The solutions of Equations (9) and (10) are discussed in the next section.

Solution
One can easily solve Equations (9) and (10) with the help of Equations (5) to (8). The optimal solutions are provided as follows:

M. Cohen et al. DOI: 10.4236/tel.2018.85065. Theoretical Economics Letters.

$\delta_r$ and $\delta_s$ are the returns from the risky and safe investments respectively, such that $0 < \delta_s < \delta_r$. In our set-up, we assume that an individual can choose either $d^r_t$ or $d^s_t$, i.e. to invest his saving either in the risky asset or in the safe asset. In the case of risky asset investment, we assume that the return of the risky asset (which we have mentioned in the equation of motion) follows the Markovian probability [...]

Note 1: [...] also consider the convex combination of $d^r_t$ and $d^s_t$; however, this will not change the qualitative result of our model.

Equation (3) leads to the following proposition: Proposition 1. If $k = n$, i.e. the agent does not make any mistakes in any of the previous periods, [...] The proof follows from Equation (3), and the argument lies in those three steps. Proposition 1 explains how an agent learns from his/her past decisions and builds his/her conviction function $p^c_t$. The important part is that by introducing $p^c_t$, we analyze the role of forward looking induction in dynamic programming, which might lead to dynamic paths as opposed to the dynamic path suggested by the Bellman equation.
[...] 1(A)), in that case $\alpha = \frac{1}{2}$ (in other words, the value of $\alpha$ is exactly equal to one's Bayesian signal, i.e. $p^*_t = \frac{1}{2}$). Although the main contribution of our paper is Proposition 1, we also provide an example in the Appendix illustrating the role of the conviction function in a finite horizon decision set-up.

[...] for 1 prediction error, a 48.5% drop in the likelihood of achieving the target of 1.45; for 2 prediction errors, a 66% drop in the likelihood of achieving the target of 1.40; for 3 prediction errors, a 73% drop in the likelihood of achieving the target of 1.35; and for 4 prediction errors, a 76% drop in the likelihood of achieving the target of 1.30.

Figure 1. Value of retirement savings, with no external shock.

Note: All numbers are hypothetical and intended for illustration only; no primary or secondary data sources have been accessed for the purpose of this research.

Figure 2. Value of retirement savings, with an external shock.

Table 1. States and symmetric binary signals.

[...], an individual agent will form a posterior belief that there is a match between his/her signal and the state of the economy's performance. In this case, he/she will follow his/her signal, i.e. if he/she receives g, he/she will choose [...] signal b, the state of the economy's performance is either g [...]. If $p < p^*_t$, an individual agent will form a posterior belief that there is a mismatch between his/her signal and the state of the economy's performance. In this case, he/she will always play safe, i.e. he/she will choose $d^s_t$. Definition 1 accords with our intuition. Suppose an agent receives signal b. Given [...]
Suppose an agent receives the same signal g at period $t+1$ and, like Step 1, he/she calculates the same posterior belief. Thus, ex-ante, if an agent has received [...] $P(S_g \mid g) =$ [...] and $P(S_b \mid g) = 1$. Thus, the magnitude of error at period $t$, denoted as $E(\lambda_t)$, is as follows: $E(\lambda_t) = 1 - p$. Note that if the agent does not make a mistake, then $E(\lambda_t) = 0$.
amounts should be above this baseline. Given our starting numerical values as stated above, the five potential targets are 1.50 (best of the lot; with no past errors), 1.45 (achievable with at most one past prediction error), 1.40 (achievable with at most two past prediction errors), 1.35 (achievable with at most three past prediction errors) and 1.30 (achievable with at most four past prediction errors).

Table 2. Numerical output obtained from the posited theoretical model for n = 5. The percentage figures (in bold italicized font) represent the percentage drops in the likelihood of achieving the best outcome for the particular time path, for the highest number of past errors for which that best outcome is achievable. So, for example, for a targeted best possible outcome of 1.45 (i.e. the maximum amount an individual agent can expect to have accumulated at the end of the given planning horizon), the probability of achieving the target drops by 48.5% for a single past prediction error. For two past prediction errors, this probability drops by 100%, i.e. the target no longer remains achievable (and would have to be revised downward to the next highest number of 1.40). Again, for a targeted best possible outcome of 1.40, the probability of achieving the target drops by 66% for two past prediction errors. For three past prediction errors, this probability drops by 100%, i.e. the target no longer remains achievable (and would have to be revised downward to the next highest number of 1.35); and so on. It is interesting to note that these percentages can be seen to closely approximate the ratio