Advances in Stochastic Control and Optimal Stopping with Applications in Economics and Finance / Avancées en contrôle stochastique et arrêt optimal avec applications à l'économie et à la finance

Organizer(s) Buckdahn, Rainer; Ferrari, Giorgio; Grigorova, Miryana; Quenez, Marie-Claire; Riedel, Frank

Date(s) 12/09/2022 - 16/09/2022

Related URL https://conferences.cirm-math.fr/2600.html

Some recent progress for continuous-time reinforcement learning and regret analysis

Reinforcement learning (RL) has recently attracted substantial research interest. Much of the attention and success, however, has been in the discrete-time setting. Continuous-time RL, despite its natural analytical connection to stochastic control, remains largely unexplored, with limited progress to date. In particular, characterizing the sample efficiency of continuous-time RL algorithms through convergence rates remains a challenging open problem. In this talk, we discuss some recent advances in the convergence rate analysis for the episodic linear-convex RL problem and report a regret bound of order $O(\sqrt{N \ln N})$ for the greedy least-squares algorithm, where $N$ is the number of episodes. The approach is probabilistic: it involves establishing the stability of the associated forward-backward stochastic differential equation, studying the Lipschitz stability of feedback controls, and exploring the concentration properties of sub-Weibull random variables. In the special case of the linear-quadratic RL problem, the analysis reduces to the regularity and robustness of the associated Riccati equation and the sub-exponential properties of continuous-time least-squares estimators, which leads to a logarithmic regret.
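
To make the linear-quadratic special case in the abstract more concrete, here is a minimal sketch of an episodic greedy least-squares loop for a scalar problem. It is an illustration under stated assumptions, not the algorithm or the analysis presented in the talk: the scalar dynamics $dX_t = (a X_t + b u_t)\,dt + dW_t$, the cost $\int_0^T (q X_t^2 + r u_t^2)\,dt$, the Euler discretization, the initial guess, the small ridge term, and the clipping of the estimates to $[-2, 2]$ are all hypothetical choices made for this example. It does not compute the regret.

```python
import math
import random


def riccati(a, b, q, r, horizon, steps):
    """Integrate P' = -2aP + (b^2/r)P^2 - q backward from P(T) = 0 (explicit Euler)."""
    dt = horizon / steps
    p = [0.0] * (steps + 1)
    for k in range(steps, 0, -1):
        p[k - 1] = p[k] + dt * (2.0 * a * p[k] - (b * b / r) * p[k] ** 2 + q)
    return p


def least_squares(sxx, sxu, suu, sxy, suy, ridge=1e-8):
    """Solve the 2x2 normal equations for (a, b) in dX ~ (a X + b u) dt.

    The ridge term is an assumption of this sketch; it keeps the system solvable.
    """
    m11, m22 = sxx + ridge, suu + ridge
    det = m11 * m22 - sxu * sxu
    a_hat = (m22 * sxy - sxu * suy) / det
    b_hat = (m11 * suy - sxu * sxy) / det
    return a_hat, b_hat


def greedy_least_squares(n_episodes, a_true=0.5, b_true=1.0, q=1.0, r=1.0,
                         horizon=1.0, steps=100, x0=1.0, seed=0):
    """Run n_episodes of: plan with current estimate, observe, re-estimate."""
    rng = random.Random(seed)
    dt = horizon / steps
    a_hat, b_hat = 0.0, 0.5  # hypothetical initial guess
    sxx = sxu = suu = sxy = suy = 0.0  # running least-squares statistics
    for _ in range(n_episodes):
        # Greedy step: treat the current estimate as the truth and plan with it.
        p = riccati(a_hat, b_hat, q, r, horizon, steps)
        x = x0
        for k in range(steps):
            u = -(b_hat / r) * p[k] * x  # certainty-equivalent feedback control
            noise = math.sqrt(dt) * rng.gauss(0.0, 1.0)
            dx = (a_true * x + b_true * u) * dt + noise  # Euler-Maruyama step
            sxx += x * x * dt
            sxu += x * u * dt
            suu += u * u * dt
            sxy += x * dx
            suy += u * dx
            x += dx
        # Re-estimate (a, b) from all data collected so far.
        a_hat, b_hat = least_squares(sxx, sxu, suu, sxy, suy)
        # Clipping to a bounded set is an assumption made here for stability.
        a_hat = max(-2.0, min(2.0, a_hat))
        b_hat = max(-2.0, min(2.0, b_hat))
    return a_hat, b_hat


if __name__ == "__main__":
    print(greedy_least_squares(200))
```

No exploration noise is injected in this sketch. Because the finite-horizon Riccati gain changes over time, the state and the control are not exactly proportional along a trajectory, which is what lets the two parameters be separated by least squares in this toy setting.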

Video information

Recording date 13/09/2022
Publication date 27/09/2022
Institute CIRM
License CC BY-NC-ND
Language English
Audience Researchers, Doctoral students
Director(s) Guillaume Hennenfent
Format MP4

Citation data

DOI 10.24350/CIRM.V.19959403
Cite this video Guo, Xin (13/09/2022). Some recent progress for continuous-time reinforcement learning and regret analysis. CIRM. Audiovisual resource. DOI: 10.24350/CIRM.V.19959403
URL https://dx.doi.org/10.24350/CIRM.V.19959403

Subject area(s)

Optimization and control

All videos in the collection

43:43

published on 27 September 2022

Optimal reinsurance via BSDEs in a partially observable contagion model

By Claudia Ceci

01:00:25

published on 27 September 2022

Entropy, energy, and optimal couplings on Wiener space

By Hans Föllmer

52:54

published on 27 September 2022

Some recent progress for continuous-time reinforcement learning and regret analysis

By Xin Guo

38:52

published on 27 September 2022

Dependent stopping times

By Philip Protter

50:10

published on 27 September 2022

European and American options in a non-linear incomplete market with default

By Marie-Claire Quenez

