Learning and Optimization in Luminy - LOL2022 / Apprentissage et Optimisation à Luminy - LOL2022

Collection Learning and Optimization in Luminy - LOL2022 / Apprentissage et Optimisation à Luminy - LOL2022

Organizer(s) Boyer, Claire ; d'Aspremont, Alexandre ; Dieuleveut, Aymeric ; Moreau, Thomas ; Villar, Soledad

Date(s) 03/10/2022 - 07/10/2022

linked URL https://conferences.cirm-math.fr/2551.html

Non-convex SGD and Lojasiewicz-type conditions for deep learning

By Kevin Scaman

First-order non-convex optimization is at the heart of neural network training. Recent analyses have shown that the Polyak-Lojasiewicz condition is particularly well suited to analyzing the convergence of the training error for these architectures. In this short presentation, I will propose extensions of this condition that allow for more flexibility and more application scenarios, and show how stochastic gradient descent converges under these conditions. Then, I will show how to use these conditions to prove the convergence of the test error for simple deep learning architectures in an online setting.
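
For readers unfamiliar with the classical Polyak-Lojasiewicz (PL) condition mentioned in the abstract, the following is a minimal illustrative sketch and is not taken from the talk. It assumes the standard textbook form of the condition, 0.5 * f'(x)^2 >= mu * (f(x) - f*), and uses the one-dimensional function f(x) = x^2 + 3 sin^2(x), a common example from the optimization literature of a non-convex function that satisfies it with mu = 1/32. The function names, step size, noise level and seed are all assumptions chosen for illustration; the extended conditions proposed in the talk are not reproduced here.

```python
import math
import random

# Assumed PL constant for this toy function (value commonly quoted in the literature).
MU = 1.0 / 32.0


def f(x):
    """Non-convex toy objective with global minimum f* = 0 at x = 0."""
    return x * x + 3.0 * math.sin(x) ** 2


def grad_f(x):
    """Exact derivative of f: 2x + 6 sin(x) cos(x) = 2x + 3 sin(2x)."""
    return 2.0 * x + 3.0 * math.sin(2.0 * x)


def pl_holds(x, mu=MU):
    """Check the classical PL inequality 0.5 * f'(x)^2 >= mu * (f(x) - f*) at x."""
    return 0.5 * grad_f(x) ** 2 >= mu * f(x) - 1e-12


def sgd(x0, step=0.05, noise_std=0.1, n_steps=2000, seed=0):
    """Constant-step SGD where the gradient is corrupted by Gaussian noise."""
    rng = random.Random(seed)
    x = x0
    for _ in range(n_steps):
        noisy_grad = grad_f(x) + rng.gauss(0.0, noise_std)
        x -= step * noisy_grad
    return x


if __name__ == "__main__":
    x_final = sgd(3.0)
    print("initial loss:", f(3.0))
    print("final loss:  ", f(x_final))
```

With a constant step size, the iterates are only expected to reach a small neighborhood of the minimizer whose size depends on the noise level; decreasing step sizes are the usual way to obtain convergence to the minimum itself.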

Information about the video

Date of recording 04/10/2022
Date of publication 10/11/2022
Institution CIRM
Licence CC BY NC ND
Language English
Audience Researchers, Graduate Students
Director(s) Guillaume Hennenfent
Format MP4

Citation data

DOI 10.24350/CIRM.V.19965303
Cite this video Scaman, Kevin (04/10/2022). Non-convex SGD and Lojasiewicz-type conditions for deep learning. CIRM. Audiovisual resource. DOI: 10.24350/CIRM.V.19965303
URL https://dx.doi.org/10.24350/CIRM.V.19965303

Domain(s)

Machine Learning

Bibliography

SCAMAN, Kevin, MALHERBE, Cédric, and DOS SANTOS, Ludovic. Convergence Rates of Non-Convex Stochastic Gradient Descent Under a Generic Lojasiewicz Condition and Local Smoothness. In: International Conference on Machine Learning. PMLR, 2022. p. 19310-19327. - https://proceedings.mlr.press/v162/scaman22a.html
ROBIN, David, SCAMAN, Kevin, and LELARGE, Marc. Convergence beyond the over-parameterized regime using Rayleigh quotients. Poster. NeurIPS, 2022.

MSC codes

68T05 Learning and adaptive systems

All the collection videos

48:34

published on November 10, 2022

Private frequency estimation via projective geometry

By Jelani Nelson

35:00

published on November 10, 2022

Generalisation of some overparametrised models

By Stéphane Chrétien

35:47

published on November 10, 2022

Curiosities and counterexamples in smooth convex optimization

By Edouard Pauwels

47:22

published on November 10, 2022

Non-convex SGD and Lojasiewicz-type conditions for deep learning

By Kevin Scaman

42:41

published on November 10, 2022

Stochastic normalizing flows and the power of patches in inverse problems

By Gabriele Steidl
