
## Some recent progress for continuous-time reinforcement learning and regret analysis

By Xin Guo

Recently, reinforcement learning (RL) has attracted substantial research interest. Much of the attention and success, however, has been in the discrete-time setting. Continuous-time RL, despite its natural analytical connection to stochastic control, has remained largely unexplored, with limited progress. In particular, characterizing the sample efficiency of continuous-time RL algorithms via convergence rates remains a challenging open problem. In this talk, we will discuss some recent advances in the convergence-rate analysis of the episodic linear-convex RL problem, and report a regret bound of order $O(\sqrt{N \ln N})$ for the greedy least-squares algorithm, with $N$ the number of episodes. The approach is probabilistic, involving establishing the stability of the associated forward-backward stochastic differential equation, studying the Lipschitz stability of feedback controls, and exploiting the concentration properties of sub-Weibull random variables. In the special case of the linear-quadratic RL problem, the analysis reduces to the regularity and robustness of the associated Riccati equation and the sub-exponential properties of continuous-time least-squares estimators, which leads to a logarithmic regret.
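The episodic greedy least-squares scheme mentioned in the abstract can be sketched in a deliberately simplified, discrete-time, scalar linear-quadratic toy: each episode, the agent plans greedily via a Riccati recursion under its current parameter estimates, then re-estimates the dynamics by least squares over all collected data. Everything below (scalar dynamics, cost weights, noise levels, exploration scale) is an illustrative assumption, not the talk's continuous-time formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

# True (unknown to the learner) scalar linear dynamics: x' = a*x + b*u + noise
a_true, b_true = 0.9, 0.5
q, r = 1.0, 1.0          # quadratic stage-cost weights (illustrative)
T, N = 20, 50            # steps per episode, number of episodes

def lqr_gains(a, b):
    """Finite-horizon Riccati recursion for the scalar LQ problem,
    returning feedback gains indexed forward in time (t = 0..T-1)."""
    P, gains = q, []
    for _ in range(T):
        K = a * b * P / (r + b * b * P)
        P = q + a * a * P - (a * b * P) ** 2 / (r + b * b * P)
        gains.append(K)
    return gains[::-1]

a_hat, b_hat = 0.5, 1.0   # rough initial parameter guesses
X, U, Y = [], [], []      # regression data accumulated across episodes
for episode in range(N):
    gains = lqr_gains(a_hat, b_hat)   # greedy: plan as if estimates were true
    x = 1.0
    for t in range(T):
        # small exploration noise keeps (x, u) from being perfectly collinear
        u = -gains[t] * x + 0.1 * rng.standard_normal()
        x_next = a_true * x + b_true * u + 0.1 * rng.standard_normal()
        X.append(x); U.append(u); Y.append(x_next)
        x = x_next
    # least-squares re-estimate of (a, b) from all data so far
    Z = np.column_stack([X, U])
    (a_hat, b_hat), *_ = np.linalg.lstsq(Z, np.array(Y), rcond=None)

print(a_hat, b_hat)  # estimates should approach (a_true, b_true)
```

In this toy, the regret of the greedy policy is driven by the estimation error of $(\hat a, \hat b)$, which the talk's analysis controls via the robustness of the Riccati equation and concentration of the least-squares estimator.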

### Citation data

• DOI 10.24350/CIRM.V.19959403
• Cite this video: Guo, Xin (13/09/2022). Some recent progress for continuous-time reinforcement learning and regret analysis. CIRM. Audiovisual resource. DOI: 10.24350/CIRM.V.19959403
• URL https://dx.doi.org/10.24350/CIRM.V.19959403
