[1]
|
Robbins, H. (1952) Some Aspects of the Sequential Design of Experiments. Bulletin of the American Mathematical Society, 58, 527-535. http://dx.doi.org/10.1090/S0002-9904-1952-09620-8
|
[2]
|
Freund, Y. and Schapire, R.E. (1997) A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. Journal of Computer and System Sciences, 55, 119-139. http://dx.doi.org/10.1006/jcss.1997.1504
|
[3]
|
Auer, P., Cesa-Bianchi, N., Freund, Y. and Schapire, R.E. (2002) The Non-Stochastic Multi-Armed Bandit Problem. SIAM Journal on Computing, 32, 48-77. http://dx.doi.org/10.1137/S0097539701398375
|
[4]
|
Allenberg-Neeman, C. and Neeman, B. (2004) Full Information Game with Gains and Losses. Algorithmic Learning Theory: 15th International Conference, 3244, 264-278.
|
[5]
|
Dani, V., Hayes, T.P. and Kakade, S.M. (2008) The Price of Bandit Information for Online Optimization. In: Platt, J.C., Koller, D., Singer, Y. and Roweis, S., Eds., Advances in Neural Information Processing Systems, MIT Press, Cambridge, 345-352.
|
[6]
|
Bartlett, P., Dani, V., Hayes, T., Kakade, S., Rakhlin, A. and Tewari, A. (2008) High-Probability Regret Bounds for Bandit Online Linear Optimization. Proceedings of 21st Annual Conference on Learning Theory (COLT), Helsinki.
|
[7]
|
Cesa-Bianchi, N. and Lugosi, G. (2012) Combinatorial Bandits. Journal of Computer and System Sciences, 78, 1404-1422. http://dx.doi.org/10.1016/j.jcss.2012.01.001
|
[8]
|
Uchiya, T., Nakamura, A. and Kudo, M. (2010) Algorithms for Adversarial Bandit Problems with Multiple Plays. In: Hutter, M., Stephan, F., Vovk, V. and Zeugmann, T., Eds., Algorithmic Learning Theory, Lecture Notes in Artificial Intelligence No. 6331, Springer, 375-389.
|
[9]
|
Auer, P., Cesa-Bianchi, N., Freund, Y. and Schapire, R.E. (1995) Gambling in a Rigged Casino: The Adversarial Multi-Armed Bandit Problem. Proceedings of 36th Annual Symposium on Foundations of Computer Science, Milwaukee, 322-331.
|
[10]
|
Hochbaum, D.S. (1995) Approximation Algorithms for NP-Hard Problems. PWS Publishing Company, Boston.
|
[11]
|
He, D., Chen, W., Wang, L. and Liu, T.-Y. (2013) Online Learning for Auction Mechanism in Bandit Setting. Decision Support Systems, 56, 379-386. http://dx.doi.org/10.1016/j.dss.2013.07.004
|
[12]
|
Park, C. and Lee, J. (2012) Intelligent Traffic Control Based on Multi-Armed Bandit and Wireless Scheduling Techniques. International Conference on Advances in Vehicular Systems, Technologies and Applications, Venice, 23-27.
|
[13]
|
Bertsekas, D.P. (1998) Network Optimization. Athena Scientific, Belmont.
|
[14]
|
Blum, A. and Burch, C. (2000) On-Line Learning and the Metrical Task System Problem. Machine Learning, 39, 35-58. http://dx.doi.org/10.1023/A:1007621832648
|
[15]
|
Smith, J.C. and Lim, C. (2008) Algorithms for Network Interdiction and Fortification Games. Springer Optimization and Its Applications, 17, 609-644. http://dx.doi.org/10.1007/978-0-387-77247-9_24
|
[16]
|
Vaněk, O., Jakob, M. and Pěchouček, M. (2011) Using Agents to Improve International Maritime Transport Security. IEEE Intelligent Systems, 26, 90-95. http://dx.doi.org/10.1109/MIS.2011.23
|