Conditional Value-at-Risk for Random Immediate Reward Variables in Markov Decision Processes
Masayuki Kageyama, Takayuki Fujii, Koji Kanefuji, Hiroe Tsubaki
DOI: 10.4236/ajcm.2011.13021

Abstract

We consider risk minimization problems for Markov decision processes. With the aim of making the risk of the random reward variable at each time as small as possible, we introduce a risk measure based on the conditional value-at-risk of the random immediate reward variables in a Markov decision process. Under this risk criterion, the risk-optimal policies are characterized by optimality equations for the discounted case and for the average case. As an application, inventory models are considered.
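
For readers unfamiliar with conditional value-at-risk (CVaR), the following is a minimal sketch of how it can be estimated from a finite sample using the minimization formula of Rockafellar and Uryasev. It is not the paper's criterion for Markov decision processes: the function name `empirical_cvar`, the treatment of the sample as losses (for example, negated rewards), and the confidence level `alpha` are all assumptions made for illustration, since the abstract does not state the sign convention used.

```python
import numpy as np


def empirical_cvar(losses, alpha):
    """Estimate CVaR at level alpha from a sample of losses.

    Uses the Rockafellar-Uryasev representation
        CVaR_alpha(L) = min_c { c + E[(L - c)^+] / (1 - alpha) },
    with the expectation replaced by a sample mean. The sample is assumed
    (hypothetically) to hold losses, e.g. negated immediate rewards.
    """
    losses = np.asarray(losses, dtype=float)
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # The objective is convex and piecewise linear in c, so its minimum
    # is attained at one of the sample points.
    candidates = np.unique(losses)
    # Mean excess loss above each candidate threshold c.
    excess = np.maximum(losses[None, :] - candidates[:, None], 0.0).mean(axis=1)
    objective = candidates + excess / (1.0 - alpha)
    return float(objective.min())


if __name__ == "__main__":
    sample = np.arange(1, 11)  # losses 1, 2, ..., 10
    print(empirical_cvar(sample, alpha=0.8))
```

For ten equally likely losses 1 through 10 and `alpha = 0.8`, the estimate coincides (up to floating-point rounding) with the mean of the worst 20% of outcomes, namely the average of 9 and 10.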

Citation

M. Kageyama, T. Fujii, K. Kanefuji and H. Tsubaki, "Conditional Value-at-Risk for Random Immediate Reward Variables in Markov Decision Processes," American Journal of Computational Mathematics, Vol. 1 No. 3, 2011, pp. 183-188. doi: 10.4236/ajcm.2011.13021.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.