Non-convex SGD and Lojasiewicz-type conditions for deep learning

By Kevin Scaman

Appears in collection : Learning and Optimization in Luminy - LOL2022 / Apprentissage et Optimisation à Luminy - LOL2022

First-order non-convex optimization is at the heart of neural network training. Recent analyses have shown that the Polyak-Lojasiewicz condition is particularly well suited to analyzing the convergence of the training error for these architectures. In this short presentation, I will propose extensions of this condition that allow for more flexibility and a wider range of application scenarios, and show how stochastic gradient descent converges under these conditions. Then, I will show how to use these conditions to prove the convergence of the test error for simple deep learning architectures in an online setting.
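
For readers unfamiliar with the condition named in the abstract, here is a minimal sketch that is not taken from the talk. A function f with minimum value f* satisfies the classical Polyak-Lojasiewicz inequality with constant mu > 0 if 0.5 * |f'(x)|^2 >= mu * (f(x) - f*) for all x. The example uses f(x) = x^2 + 3 sin^2(x), a one-dimensional function that is non-convex yet has x = 0 as its only stationary point. The constant `MU = 1/32`, the grid bounds, the starting point and the step count are illustrative assumptions; the code only checks the inequality numerically on a grid and runs plain (noiseless) gradient descent, not the stochastic variant or the generalized conditions discussed in the talk.

```python
import math

# Illustrative 1-D objective (an assumption for this sketch, not from the talk).
def f(x):
    return x * x + 3.0 * math.sin(x) ** 2

def grad_f(x):
    # d/dx [x^2 + 3 sin^2(x)] = 2x + 3 sin(2x)
    return 2.0 * x + 3.0 * math.sin(2.0 * x)

def second_derivative(x):
    # f''(x) = 2 + 6 cos(2x), which lies in [-4, 8]
    return 2.0 + 6.0 * math.cos(2.0 * x)

F_STAR = 0.0        # minimum value, attained at x = 0
MU = 1.0 / 32.0     # assumed PL constant, checked numerically below
L_SMOOTH = 8.0      # upper bound on |f''|, used to set the step size

def pl_holds_on_grid(lo=-10.0, hi=10.0, n=2001):
    """Check 0.5 * f'(x)^2 >= MU * (f(x) - F_STAR) at n grid points."""
    for i in range(n):
        x = lo + (hi - lo) * i / (n - 1)
        if 0.5 * grad_f(x) ** 2 < MU * (f(x) - F_STAR) - 1e-12:
            return False
    return True

def gradient_descent(x0, steps=200):
    """Plain gradient descent with constant step size 1 / L_SMOOTH."""
    x = x0
    values = [f(x)]
    for _ in range(steps):
        x -= grad_f(x) / L_SMOOTH
        values.append(f(x))
    return x, values

x_final, values = gradient_descent(3.0)
print("non-convex somewhere:", second_derivative(math.pi / 2) < 0)
print("PL inequality holds on grid:", pl_holds_on_grid())
print("final objective gap:", values[-1] - F_STAR)
```

The point of the sketch is that convexity is not needed for gradient descent to reach the global minimum: the second derivative is negative at x = pi/2, yet the objective gap still shrinks to zero because the gradient can only vanish at the minimizer.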

Information about the video

Citation data

  • DOI 10.24350/CIRM.V.19965303
  • Cite this video Scaman Kevin (10/4/22). Non-convex SGD and Lojasiewicz-type conditions for deep learning. CIRM. Audiovisual resource. DOI: 10.24350/CIRM.V.19965303
  • URL https://dx.doi.org/10.24350/CIRM.V.19965303

Bibliography

  • SCAMAN, Kevin, MALHERBE, Cédric, and DOS SANTOS, Ludovic. Convergence Rates of Non-Convex Stochastic Gradient Descent Under a Generic Lojasiewicz Condition and Local Smoothness. In: International Conference on Machine Learning. PMLR, 2022. p. 19310-19327. https://proceedings.mlr.press/v162/scaman22a.html
  • ROBIN, David, SCAMAN, Kevin, and LELARGE, Marc. Convergence beyond the over-parameterized regime using Rayleigh quotients. Poster. NeurIPS, 2022.