Non-convex SGD and Lojasiewicz-type conditions for deep learning

By Kevin Scaman

Appears in collection: Learning and Optimization in Luminy - LOL2022

First-order non-convex optimization is at the heart of neural network training. Recent analyses showed that the Polyak-Lojasiewicz condition is particularly well suited to analyzing the convergence of the training error for these architectures. In this short presentation, I will propose extensions of this condition that allow for more flexibility and a wider range of application scenarios, and show how stochastic gradient descent converges under these conditions. I will then show how to use these conditions to prove convergence of the test error for simple deep learning architectures in an online setting.
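For context, the classical Polyak-Lojasiewicz inequality (stated here in its standard form; the generalized variants proposed in the talk are not reproduced) requires that, for some constant $\mu > 0$ and minimum value $f^\star$,

\[
  \frac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu\,\bigl(f(x) - f^\star\bigr) \qquad \text{for all } x .
\]

This condition is weaker than strong convexity (it allows non-convex landscapes and non-unique minimizers), yet under standard smoothness assumptions it already yields linear convergence of gradient descent on the training error, and convergence of SGD up to a noise-dependent term.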

Information about the video

Citation data

  • DOI 10.24350/CIRM.V.19965303
  • Cite this video Scaman, Kevin (04/10/2022). Non-convex SGD and Lojasiewicz-type conditions for deep learning. CIRM. Audiovisual resource. DOI: 10.24350/CIRM.V.19965303
  • URL https://dx.doi.org/10.24350/CIRM.V.19965303

Bibliography

  • SCAMAN, Kevin, MALHERBE, Cédric, and DOS SANTOS, Ludovic. Convergence Rates of Non-Convex Stochastic Gradient Descent Under a Generic Lojasiewicz Condition and Local Smoothness. In: International Conference on Machine Learning. PMLR, 2022. pp. 19310-19327. https://proceedings.mlr.press/v162/scaman22a.html
  • ROBIN, David, SCAMAN, Kevin, and LELARGE, Marc. Convergence beyond the over-parameterized regime using Rayleigh quotients. Poster, NeurIPS, 2022.
