My name is Damien Ernst. I work as an Associate Professor at the
University of Liège (Belgium), where I hold the EDF-Luminus chair on Smart Grids. I am
affiliated with the
Systems and Modeling Research Unit. I do research in control
theory with a particular emphasis on power system control problems and
reinforcement learning. [My-CV]
Contact information
Address:
Damien Ernst
University of Liège
Institut Montefiore, B28
B-4000 Liège
BELGIUM
Tel: +32 4 366 9518
Email: dernst@ulg.ac.be
Research interests
I started my research career in
power systems. My work in this field is related to
transient stability,
intelligent power system controllers,
distributed control,
security analysis, electricity markets and smart grids.
Though my interest in power systems has always been strong, my
research has gradually expanded to other fields and, in
particular, to reinforcement learning
(RL), which is a sub-area of machine learning concerned
with how an agent ought to take actions in an environment so as to
maximize some notion of long-term reward. Reinforcement learning is
also closely related to optimal control theory. One of my most
important contributions to RL is the introduction of the fitted Q iteration algorithm (FQI for short). It
reformulates the problem of learning a high-performance policy from a set
of trajectories as a sequence of standard supervised learning
problems. We have shown that by using ensembles of regression trees
(e.g., "Tree Bagging", "Extra-Trees") in the inner loop of the
fitted Q iteration process,
second-to-none performance
can be obtained (see, e.g., the research paper "Tree-based batch mode
reinforcement learning"). My most recent presentation on FQI
can be downloaded here.
Another presentation that explores different strategies for
learning high-performance policies from sets of trajectories using
supervised learning can be downloaded here.
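As a rough illustration of how batch-mode RL can be turned into a sequence of supervised learning problems, here is a minimal Python sketch of a fitted-Q-style loop. It is not the implementation from the paper: the names `fit_table_regressor` and `fitted_q_iteration` and the five-state chain problem are hypothetical, and a simple per-input averaging table stands in for the tree-based regressors (Tree Bagging, Extra-Trees) mentioned above.

```python
from collections import defaultdict


def fit_table_regressor(inputs, targets):
    """Hypothetical stand-in for a supervised learner.

    Averages the targets observed for each distinct input; a tree ensemble
    would be plugged in here for continuous state spaces.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for x, y in zip(inputs, targets):
        sums[x] += y
        counts[x] += 1
    table = {x: sums[x] / counts[x] for x in sums}
    return lambda x: table.get(x, 0.0)


def fitted_q_iteration(transitions, actions, n_iterations, gamma,
                       fit=fit_table_regressor):
    """Approximate Q from one-step transitions (state, action, reward, next_state)."""
    q = lambda state_action: 0.0  # Q_0 is identically zero
    inputs = [(s, a) for s, a, _, _ in transitions]
    for _ in range(n_iterations):
        # Regression targets: reward plus discounted best value under the previous Q.
        targets = [r + gamma * max(q((s_next, b)) for b in actions)
                   for _, _, r, s_next in transitions]
        # One standard supervised learning problem per iteration.
        q = fit(inputs, targets)
    return q


# Toy problem (an assumption for illustration): a chain of states 0..4,
# where entering or staying in state 4 yields a reward of 1.
ACTIONS = (-1, +1)


def step(state, action):
    next_state = min(max(state + action, 0), 4)
    reward = 1.0 if next_state == 4 else 0.0
    return reward, next_state


transitions = [(s, a, *step(s, a)) for s in range(5) for a in ACTIONS]
q_hat = fitted_q_iteration(transitions, ACTIONS, n_iterations=20, gamma=0.5)
greedy_policy = {s: max(ACTIONS, key=lambda a: q_hat((s, a))) for s in range(5)}
```

On this toy chain the greedy policy derived from `q_hat` moves right in every state. Any regressor exposing the same fit-then-predict behaviour could be passed through the `fit` argument.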
Touched by the suffering of the millions of people living with
HIV/AIDS, and
convinced that the tools I work with to analyze and control
complex systems could help in designing cures for these people, I
also recently decided to focus some of my research on this
disease.
Below are the links to my publications. Those are also available from
ORBI, the institutional repository of the
University of Liège.
Submitted or to appear
"Contextual multi-armed bandits for web server defense". T. Jung, S. Martin, D. Ernst and G. Leduc. To be published in the proceedings of the 2012 International Joint Conference on Neural Networks.
"SPRT for SPIT: Using the sequential probability ratio test for spam in VoIP prevention". T. Jung, D. Ernst, S. Martin, M. Nassar and
G. Leduc. To be published in the proceedings of the 6th International Conference on Autonomous Infrastructure, Management and Security.
"Contextual multi-armed bandits for the prevention of spam in VoIP networks". T. Jung, S. Martin, D. Ernst and G. Leduc. Submitted.
"Outbound SPIT filter with optimal performance guarantees". T. Jung, S. Martin, M. Nassar, D. Ernst and G. Leduc. Submitted.
"Scenario trees and policy selection for multistage stochastic programming using machine learning". B. Defourny, D. Ernst and
L. Wehenkel. Submitted. [arXiv:1112.4463]
"Min max generalization for deterministic batch mode reinforcement
learning: relaxation schemes". R. Fonteneau, D. Ernst, B. Boigelot and Q. Louveaux. Submitted.
"Mathematical modeling of HIV dynamics after antiretroviral
therapy initiation". M.J. Mhawej, C.H. Moog, F. Biafore,
D.A. Ouattara, C. Brunet-François, V. Ferre, D. Ernst, R. Fonteneau,
G.-B. Stan, F. Raffi, X. Xia. Submitted.
"The global grid". S. Chatzivasileiadis, D. Ernst and G. Andersson. Submitted.
"Cooperative control of a multi-terminal high-voltage DC network". J. Dai, Y. Phulpin, A. Sarlette and D. Ernst. Submitted.
"Direct policy search with parameterized
look-ahead trees and Gaussian process optimization". T. Jung, D. Ernst and F. Maes. Submitted.
"Batch mode reinforcement learning based on the synthesis of
artificial trajectories". R. Fonteneau, S.A. Murphy, L. Wehenkel and
D. Ernst. Submitted.
"Optimal discovery with probabilistic expert advice". S. Bubeck, D. Ernst and A. Garivier. Submitted. [arXiv:1110.5447]
"Policy search in a space of simple closed-form formulas: towards
interpretability of reinforcement learning". F. Maes, R. Fonteneau, L. Wehenkel and D. Ernst. Submitted.
"A learning procedure for sampling semantically different valid expressions". D. Lupien St-Pierre, F. Maes, D. Ernst and Q. Louveaux. Submitted.
"Généralisation min max pour l’apprentissage
par renforcement batch et déterministe : schémas de relaxation". R. Fonteneau, D. Ernst, B. Boigelot and Q. Louveaux. Submitted.
"Learning exploration/exploitation strategies for single
trajectory reinforcement learning". M. Castronovo, F. Maes, R. Fonteneau and D. Ernst. Submitted.
"Imitative learning for real-time strategy games". Q. Gemine, F. Safadi, D. Ernst and R. Fonteneau. Submitted.
"Comparison of different selection strategies in Monte-Carlo tree search for the game of Tron". P. Perick, D. Lupien St-Pierre, F. Maes and D. Ernst. Submitted.
"Learning to play K-armed bandit problems". F. Maes, L. Wehenkel
and D. Ernst. In Proceedings of the 4th International
Conference on Agents and Artificial Intelligence (ICAART 2012),
Vilamoura, Algarve, Portugal, 6-8 February 2012. (8 pages).
This paper, together with the two papers "Optimized look-ahead tree search policies" and "Automatic discovery of ranking formulas for playing with multi-armed bandits", is part of a body of work that focuses on the automatic learning of good strategies for exploration-exploitation problems in RL. In November 2011, I gave a talk at INRIA Lille entitled "Learning for exploration-exploitation in reinforcement learning. The dusk of the small formulas' reign" that presents this body of work. Download the presentation.
"Ancillary services and operation
of multi-terminal HVDC systems". Y. Phulpin and D. Ernst. In
Proceedings of the 10th International Workshop on Large-Scale
Integration of Wind Power into Power Systems as well as on
Transmission Networks for Offshore Wind Power Farms, Aarhus, Denmark,
October 25-26, 2011. (6 pages).
"Optimized look-ahead tree search policies". F. Maes, L. Wehenkel and
D. Ernst. In Proceedings of the 9th European Workshop on Reinforcement Learning (EWRL 2011), Athens, Greece, September 9-11, 2011.
"Apprentissage actif par
modification de la politique de décision courante". R.
Fonteneau, S.A. Murphy, L. Wehenkel and D. Ernst. In Proceedings
of the Sixièmes Journées Francophones de Planification, Décision et
Apprentissage pour la conduite de systèmes (JFPDA 2011), Rouen,
France, June 23-24, 2011. (14 pages).
"Approximate reinforcement learning: an overview". L. Busoniu,
D. Ernst, R. Babuska and B. De Schutter. In Proceedings of
the 2011 IEEE International Symposium on Adaptive Dynamic Programming
and Reinforcement Learning (ADPRL-11). Paris, France, April 11-15,
2011.
"Towards min max generalization in
reinforcement learning". R. Fonteneau, S.A. Murphy, L. Wehenkel
and D. Ernst. In Agents and Artificial Intelligence: International
Conference, ICAART 2010, Valencia, Spain, January 2010, Revised
Selected Papers. Series: Communications in Computer and Information
Science (CCIS), Volume 129, pp. 61-77. Editors: J. Filipe, A. Fred,
and B. Sharp. Springer, Heidelberg (2011).
"Optimal sample selection for
batch-mode reinforcement learning". E. Rachelson, F.
Schnitzler, L. Wehenkel and D. Ernst. In Proceedings of the 3rd
International Conference on Agents and Artificial Intelligence (ICAART
2011), Rome, Italy, 28-30 January 2011. (10 pages). This paper presents the OSS(N) algorithm,
where N is a parameter. I suggest choosing N=117 as the default value, in
honor of
my favourite movie character.
"Generating informative
trajectories by using bounds on the return of control policies".
R. Fonteneau, S.A. Murphy, L. Wehenkel and D. Ernst. In Proceedings
of the Workshop on Active Learning and Experimental Design 2010 (in
conjunction with AISTATS 2010), 2-page highlight paper, Chia Laguna,
Sardinia, Italy, May 16 2010.
This paper, together with the three papers "Model-free Monte
Carlo-like policy evaluation", "Inferring bounds on the performance of
a control policy from a sample of trajectories" and "A cautious
approach to generalization in reinforcement learning", is part of a body
of work on the rebuilding of
trajectories for batch-mode RL. At the 2010 NIPS workshop on "Learning and
Planning in Batch Time Series Data", I gave a talk entitled "Beyond
function approximators for batch mode reinforcement learning:
rebuilding trajectories" that presents this body of work. Download the presentation.
"Model-free Monte Carlo-like policy evaluation". R. Fonteneau, S.A. Murphy, L. Wehenkel and D. Ernst. In Proceedings of The Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2010), JMLR W&CP Volume 9, pages 217-224, Chia Laguna, Sardinia, Italy, 13-15 May 2010.
This paper was also published in the proceedings of CAp 2010 (see below). I presented this paper at the conference "Journées MAS 2010" held in Bordeaux, France. Download the presentation.
"Model-free Monte Carlo-like policy evaluation". R. Fonteneau, S.A. Murphy, L. Wehenkel and D. Ernst. In Proceedings of Conférence francophone sur l'Apprentissage Automatique (CAp) 2010, Clermont-Ferrand, France, 17-19 May 2010. (16 pages).
"A cautious approach to generalization in reinforcement learning". R. Fonteneau, S.A. Murphy, L. Wehenkel and D. Ernst. In Proceedings of the 2nd International Conference on Agents and Artificial Intelligence (ICAART 2010), Valencia, Spain, 22-24 January 2010. (10 pages).
In January 2009, I gave a presentation at TU Delft (The Netherlands) entitled
"Lower bounds in reinforcement learning: the intelligent agent dream
is getting closer". The presentation is based on this paper
and on some of my previous work on "fitted Q iteration". Download the presentation.
"Policy search with cross-entropy
optimization of basis functions". L. Busoniu, D. Ernst, R.
Babuska and B. De Schutter. In Proceedings of the 2009 IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL-09), pages 153-160. Nashville, United States, March 30 - April 2, 2009.
This paper is an extended
version of my IJTS 2007 paper.
2008
"Risk-aware decision making
and dynamic programming". B. Defourny, D. Ernst and L.
Wehenkel. Presented at the NIPS-08 Workshop on
Model Uncertainty and Risk in Reinforcement Learning, Whistler,
Canada, December 2008. (8 pages).
"Continuous-state reinforcement learning
with fuzzy approximation". L. Busoniu, D. Ernst, B. De Schutter
and R. Babuska. In Adaptive Agents and Multi-Agent Systems III, Adaptation and Multi-Agent Learning. Lecture Notes in Artificial Intelligence, Volume 4865,
K. Tuyls, A. Nowé, Z. Guessoum and D. Kudenko (Editors),
pages 27-43. Springer, 2008.
This paper
is a revised version of the ALAMAS 2007 paper.
"Consistency of fuzzy model-based
reinforcement learning". L. Busoniu, D. Ernst, R. Babuska and
B. De Schutter. In Proceedings of the 2008 IEEE International
Conference on Fuzzy Systems (FUZZ-IEEE-08), pages 518-524. Hong Kong,
China, 1-6 June 2008.
The paper "Reinforcement learning versus model predictive control: a
comparison on a power system problem" (see above) is an extended and more
mature version of this work.
"Continuous-state reinforcement
learning with fuzzy approximation". L. Busoniu, D. Ernst, R.
Babuska and B. De Schutter. In Proceedings of the 7th European Symposium on Adaptive Learning Agents and Multi-Agent Systems (ALAMAS-07), pages 21-35,
Maastricht, The Netherlands, 2-3 April 2007.
The paper "Continuous-state reinforcement learning
with fuzzy approximation" published in LNCS volume 4865 (see above) is an
extended and more mature version of this work.
"Model predictive
control and reinforcement learning as two complementary frameworks".
D. Ernst, M. Glavic, F. Capitanescu and L. Wehenkel. In
Proceedings of the 13th IFAC Workshop on Control Applications of
Optimisation, Cachan, France, 2006. (6 pages). This paper was later selected for publication in
the International Journal of Tomography and Statistics (IJTS). Note
that the paper "Reinforcement learning versus model predictive control: a
comparison on a power system problem" (see above) is an extended and more
mature version of this work.
"On multi-area security
assessment of large interconnected power systems". L. Wehenkel,
M. Glavic and D. Ernst. In Proceedings of the second Carnegie Mellon
Conference in Electric Power Systems: Monitoring, Sensing, Software
and its Valuation for the Changing Electric Power Industry, January
2006, Pittsburgh, USA. (6 pages).
"Preventive and
emergency control of power systems". L. Wehenkel, D.
Ruiz-Vega, D. Ernst, and M. Pavella. In "Real Time Stability in
Power Systems - Techniques for Early Detection of the Risk of
Blackout", Savu C. Savulescu (Ed.), Chapter 8, pages
199-232. Springer. Power Electronics and Power Systems
Series. Printed in the United States of America (2005).
The fitted Q iteration (FQI) algorithm was
first described in the paper "Iteratively extending time horizon
reinforcement learning" (see below) but this paper is the first one to
name it fitted Q iteration (or FQI for short). You can
download here
a presentation that gives a brief overview of my work on fitted Q iteration and here a presentation that discusses several strategies for using supervised learning in the context of batch-mode reinforcement learning.
"Iteratively extending time horizon
reinforcement learning". D. Ernst, P. Geurts and L. Wehenkel. In
Machine Learning: ECML 2003, 14th European Conference on Machine
Learning, Lecture Notes in Artificial Intelligence, Volume 2837, Springer,
2003, pages 96-107. Cavtat-Dubrovnik, Croatia, September 22-26, 2003.
"Closed-loop transient stability emergency
control". D. Ernst and M. Pavella. In Proceedings of the IEEE PES Winter Meeting 2000; Panel
Session: "On-Line Transient Stability Assessment and Control", January 23-27
2000, Singapore, pages 58-62.
"Transient stability-constrained optimal power flow". A.L. Bettiol, D.
Ruiz-Vega, D. Ernst, L. Wehenkel, and M. Pavella. In Proceedings of the IEEE
Power Tech'99, August 29-September 2 1999, Budapest, Hungary. (6 pages).
1998
"Transient stability-constrained
generation rescheduling". D. Ruiz-Vega, A.L. Bettiol, D. Ernst,
L. Wehenkel, and M. Pavella. In Proceedings of 1998 IREP Symposium -
Bulk Power System Dynamics and Control - IV, Santorini, Greece, pages
105-115, August 1998.
"An approach to real-time transient stability assessment and control".
M. Pavella, L. Wehenkel, A. Bettiol, and D. Ernst. IEEE PES Summer Meeting
1997; Panel Session: "Techniques for Stability Limit Search", July 1997,
Berlin, Germany. Publication IEEE : TP-138-0. (10 pages).
Teaching material
"Artificial autonomous
intelligent agent". I tried to explain to some undergraduate
students how to build autonomous reinforcement learning agents. I am
not sure I succeeded since it seems to me that the only things they
have remembered from my class are the cinematographic references given
in my teaching material (see page 6 and page 38).