PhD Students' Seminar (Séminaire des Doctorant·e·s)
Wednesday 13 March 2024 at 17:30 - Room 109
Orlane Rossini (IMAG)
Stochastic dynamic control: an approach based on semi-Markov models
Human diseases such as cancer require long-term follow-up. A patient alternates between phases of remission and relapse, and a biomarker is monitored throughout the follow-up. Its dynamics are modelled by a controlled piecewise deterministic Markov process (PDMP). The PDMP evolves in continuous time and space, it is observed through noise, and some of its parameters are unknown, which makes the control problem especially difficult. To our knowledge, there is no method to control such a PDMP, i.e. to maximise the patient's lifetime while minimising treatment cost and side effects. We restrict decisions to discrete dates, thus turning the controlled PDMP into a partially observable Markov decision process (POMDP). We present reinforcement learning (RL) methods for solving this type of problem. Reinforcement learning is a general technique that allows an agent to learn the best way to behave, for example to maximise the patient's lifetime, from repeated interactions with the environment. Model-based Bayesian RL aims to reduce the number of interactions required and to handle the exploration-exploitation trade-off. In this framework, prior information about the problem is represented in parametric form, and Bayesian inference is used to incorporate any new information about the model.
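To make the Bayesian step concrete, here is a minimal sketch of how noisy observations taken at discrete dates can update a prior over one unknown parameter. Everything in it is an illustrative assumption rather than the speaker's model: the biomarker is assumed to grow exponentially at an unknown rate `theta` during a single relapse phase, the observation noise is assumed Gaussian with known `sigma`, and the prior is a uniform distribution over a small hypothetical grid of candidate rates.

```python
import math
import random


def grid_posterior(prior, thetas, observations, x0, sigma):
    """Bayesian update of a discrete prior over the growth rate theta.

    observations is a list of (time, noisy biomarker value) pairs.
    Assumes (hypothetically) a deterministic flow x(t) = x0 * exp(theta * t)
    observed through additive Gaussian noise of standard deviation sigma.
    """
    log_post = [math.log(p) for p in prior]
    for t, y in observations:
        for i, theta in enumerate(thetas):
            mean = x0 * math.exp(theta * t)
            # Gaussian log-likelihood up to a constant shared by all thetas
            log_post[i] += -0.5 * ((y - mean) / sigma) ** 2
    # Normalise in log space for numerical stability
    m = max(log_post)
    weights = [math.exp(lp - m) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


random.seed(0)
thetas = [0.05, 0.10, 0.15, 0.20]   # hypothetical candidate growth rates
prior = [0.25] * len(thetas)        # uniform prior over the grid
true_theta, x0, sigma = 0.15, 1.0, 0.1

# Simulated noisy measurements at discrete visit dates t = 1..10
observations = [
    (t, x0 * math.exp(true_theta * t) + random.gauss(0.0, sigma))
    for t in range(1, 11)
]

posterior = grid_posterior(prior, thetas, observations, x0, sigma)
best_theta = thetas[posterior.index(max(posterior))]
```

In a model-based Bayesian RL loop, a posterior of this kind would be carried along as part of the agent's belief and refreshed after each visit; the grid is used here only to keep the inference exact and easy to read.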