# Collection Advances in Stochastic Control and Optimal Stopping with Applications in Economics and Finance / Avancées en contrôle stochastique et arrêt optimal avec applications à l'économie et à la finance

Organizer(s): Buckdahn, Rainer ; Ferrari, Giorgio ; Grigorova, Miryana ; Quenez, Marie-Claire ; Riedel, Frank
Date(s): 12/09/2022 - 16/09/2022
Associated URL: https://conferences.cirm-math.fr/2600.html

## Some recent progress for continuous-time reinforcement learning and regret analysis

By Xin Guo

Recently, reinforcement learning (RL) has attracted substantial research interest. Much of the attention and success, however, has been in the discrete-time setting. Continuous-time RL, despite its natural analytical connection to stochastic control, remains largely unexplored, with limited progress. In particular, characterizing the sample efficiency of continuous-time RL algorithms through convergence rates remains a challenging open problem. In this talk, we will discuss some recent advances in the convergence rate analysis for the episodic linear-convex RL problem, and report a regret bound of order $O(\sqrt{N \ln N})$ for the greedy least-squares algorithm, where $N$ is the number of episodes. The approach is probabilistic: it involves establishing the stability of the associated forward-backward stochastic differential equation, studying the Lipschitz stability of feedback controls, and exploring the concentration properties of sub-Weibull random variables. In the special case of the linear-quadratic RL problem, the analysis reduces to the regularity and robustness of the associated Riccati equation and the sub-exponential properties of continuous-time least-squares estimators, which leads to a logarithmic regret.
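
To give a feel for the gap between the two regret rates quoted in the abstract, the following minimal Python sketch tabulates $\sqrt{N \ln N}$ against $\ln N$ for a few episode counts. It ignores the constants hidden in the $O(\cdot)$ notation, which the abstract does not state, so only the growth is meaningful. The function names are illustrative and do not come from the talk.

```python
import math


def sqrt_log_rate(n: int) -> float:
    """Growth rate sqrt(N ln N) (linear-convex case), constants omitted."""
    return math.sqrt(n * math.log(n))


def log_rate(n: int) -> float:
    """Growth rate ln N (linear-quadratic case), constants omitted."""
    return math.log(n)


def rate_table(episodes):
    """Return (N, sqrt(N ln N), ln N) for each episode count N."""
    return [(n, sqrt_log_rate(n), log_rate(n)) for n in episodes]


if __name__ == "__main__":
    for n, general, lq in rate_table([10, 100, 1_000, 10_000]):
        print(f"N={n:>6}  sqrt(N ln N)={general:10.1f}  ln N={lq:6.2f}")
```

Both rates are sublinear in $N$, so in either case the regret averaged over episodes tends to zero; the logarithmic rate simply does so much faster.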

### Citation data

• DOI: 10.24350/CIRM.V.19959403
• Cite this video: Guo, Xin (13/09/2022). Some recent progress for continuous-time reinforcement learning and regret analysis. CIRM. Audiovisual resource. DOI: 10.24350/CIRM.V.19959403
• URL: https://dx.doi.org/10.24350/CIRM.V.19959403

• 43:43
published on 27 September 2022

## Optimal reinsurance via BSDEs in a partially observable contagion model

By Claudia Ceci

01:00:25
published on 27 September 2022

## Entropy, energy, and optimal couplings on Wiener space

By Hans Föllmer

52:54
published on 27 September 2022

## Some recent progress for continuous-time reinforcement learning and regret analysis

By Xin Guo

38:52
published on 27 September 2022

## Dependent stopping times

By Philip Protter
