Michaël Castronovo - Publications (ORBi)
Castronovo, M. (2017). Offline Policy-search in Bayesian Reinforcement Learning. Unpublished doctoral thesis, Université de Liège, Liège, Belgium.
This thesis presents research contributions in the field of Bayesian Reinforcement Learning, a subfield of Reinforcement Learning where, even though the dynamics of the system are unknown, the ...
Castronovo, M., François-Lavet, V., Fonteneau, R., Ernst, D., & Couëtoux, A. (2017). Approximate Bayes Optimal Policy Search using Neural Networks. Proceedings of the 9th International Conference on Agents and Artificial Intelligence (ICAART 2017).
Peer reviewed
Bayesian Reinforcement Learning (BRL) agents aim to maximise the expected rewards collected when interacting with an unknown Markov Decision Process (MDP), while using some prior knowledge. State-of ...
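
For reference, the objective these methods target is usually written as the expectation, under the prior, of the discounted return. This is the standard Bayes-optimal formulation, not an equation quoted from the paper; the notation p_0 for the prior and gamma for the discount factor is assumed:

% Bayes-optimal BRL objective (standard textbook formulation)
% p_0 is the prior over MDPs, gamma the discount factor, r_t the reward at step t
\pi^{*} \in \arg\max_{\pi} \;
  \mathbb{E}_{M \sim p_{0}} \Big[
    \mathbb{E} \Big[ \textstyle\sum_{t=0}^{\infty} \gamma^{t} r_{t} \,\Big|\, M, \pi \Big]
  \Big]
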
Castronovo, M., Ernst, D., Couëtoux, A., & Fonteneau, R. (2016, June). Benchmarking for Bayesian Reinforcement Learning. PLoS ONE.
Peer reviewed (verified by ORBi)
In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment, using some prior knowledge accessed beforehand. Many ...
Castronovo, M., Ernst, D., & Fonteneau, R. (2014). Bayes Adaptive Reinforcement Learning versus Off-line Prior-based Policy Search: an Empirical Comparison. Proceedings of the 23rd annual machine learning conference of Belgium and the Netherlands (BENELEARN 2014).
Peer reviewed
This paper addresses the problem of decision making in unknown finite Markov decision processes (MDPs). The uncertainty about the MDPs is modeled using a prior distribution over a set of candidate MDPs. The ...
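
To make the offline prior-based policy search side of this comparison concrete, here is a minimal simulation sketch: draw candidate MDPs from an assumed Dirichlet prior, score a small parametric family of E/E strategies on them, and keep the best. All names, sizes, and the epsilon-greedy strategy class are illustrative assumptions, not the setup used in the paper:

import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS, HORIZON, GAMMA = 5, 3, 50, 0.95

def sample_mdp():
    """Draw one candidate MDP (transitions and rewards) from the assumed prior."""
    P = rng.dirichlet(np.ones(N_STATES), size=(N_STATES, N_ACTIONS))
    R = rng.uniform(0.0, 1.0, size=(N_STATES, N_ACTIONS))
    return P, R

def run_strategy(P, R, epsilon):
    """Discounted return of epsilon-greedy Q-learning over one trajectory."""
    Q = np.zeros((N_STATES, N_ACTIONS))
    s, ret = 0, 0.0
    for t in range(HORIZON):
        if rng.random() < epsilon:
            a = int(rng.integers(N_ACTIONS))      # explore
        else:
            a = int(np.argmax(Q[s]))              # exploit
        s2 = rng.choice(N_STATES, p=P[s, a])
        r = R[s, a]
        Q[s, a] += 0.1 * (r + GAMMA * Q[s2].max() - Q[s, a])
        ret += (GAMMA ** t) * r
        s = s2
    return ret

# Offline phase: score each candidate strategy on MDPs drawn from the prior,
# then deploy the winner on the real (unknown) MDP.
candidates = [0.0, 0.1, 0.3, 0.5]
scores = {
    eps: np.mean([run_strategy(*sample_mdp(), eps) for _ in range(200)])
    for eps in candidates
}
best = max(scores, key=scores.get)
print(f"best epsilon under the prior: {best}")

The Bayes-adaptive alternative in the comparison instead plans online against the posterior after each observed transition, trading this offline search cost for online computation.
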
Castronovo, M., Ernst, D., & Fonteneau, R. (2014). Apprentissage par renforcement bayésien versus recherche directe de politique hors-ligne en utilisant une distribution a priori: comparaison empirique. Proceedings of the 9èmes Journées Francophones de Planification, Décision et Apprentissage.
Peer reviewed
This article addresses the problem of sequential decision making in finite, unknown Markov decision processes (MDPs). The absence of knowledge about the MDP is modelled in the form of a ...
Castronovo, M. (2012). Learning for exploration/exploitation in reinforcement learning. Unpublished master's thesis, Université de Liège, Liège, Belgium.
We consider the problem of learning high-performance Exploration/Exploitation (E/E) strategies for finite Markov Decision Processes (MDPs) when the MDP to be controlled is assumed to be drawn from a known ...
Castronovo, M., Maes, F., Fonteneau, R., & Ernst, D. (2012). Learning exploration/exploitation strategies for single trajectory reinforcement learning. Proceedings of the 10th European Workshop on Reinforcement Learning (EWRL 2012) (pp. 1-9).
Peer reviewed
We consider the problem of learning high-performance Exploration/Exploitation (E/E) strategies for finite Markov Decision Processes (MDPs) when the MDP to be controlled is assumed to be drawn from a known ...
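
The strategy spaces considered in these two works are index-based: each candidate is a small formula that scores state-action pairs, and the space of formulas is searched offline against MDPs drawn from the known distribution. Below is a minimal sketch of such an index-based strategy class, with assumed Q-value and visit-count features; the actual feature and operator sets used in the paper are richer:

import numpy as np

def make_index_strategy(formula):
    """Turn a scoring formula f(q, n) into a greedy action-selection rule."""
    def act(Q, N, s):
        # Score every action in state s and pick the highest-scoring one.
        scores = [formula(Q[s, a], N[s, a]) for a in range(Q.shape[1])]
        return int(np.argmax(scores))
    return act

# A few hand-written members of the assumed formula space; an automated
# search would enumerate or evolve such formulas instead.
candidates = {
    "greedy":     lambda q, n: q,                           # pure exploitation
    "ucb-like":   lambda q, n: q + 1.0 / np.sqrt(1.0 + n),  # count-based bonus
    "count-only": lambda q, n: -n,                          # pure exploration
}
strategies = {name: make_index_strategy(f) for name, f in candidates.items()}

Each candidate would then be scored as in the earlier sketch, by simulating it on MDPs sampled from the distribution and retaining the best-performing formula for the single test trajectory.
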