2022 - T3 - WS1 - Non-linear and high dimensional inference

Collection 2022 - T3 - WS1 - Non-linear and high dimensional inference

Organizer(s) Aamari, Eddie ; Aaron, Catherine ; Chazal, Frédéric ; Fischer, Aurélie ; Hoffmann, Marc ; Le Brigant, Alice ; Levrard, Clément ; Michel, Bertrand

Date(s) 03/10/2022 - 07/10/2022

Related URL https://indico.math.cnrs.fr/event/7545/

Convergence of Sharpness-Aware Minimization

By Peter Bartlett

We consider Sharpness-Aware Minimization (SAM), a gradient-based optimization method for deep networks that has exhibited performance improvements on image and language prediction problems. We show that when SAM is applied with a convex quadratic objective, for most random initializations it converges to a cycle that oscillates between either side of the minimum in the direction with the largest curvature, and we provide bounds on the rate of convergence. In the non-quadratic case, we show that such oscillations effectively perform gradient descent, with a smaller step-size, on the spectral norm of the Hessian. In such cases, SAM's update may be regarded as a third derivative---the derivative of the Hessian in the leading eigenvector direction---that encourages drift toward wider minima.
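The quadratic-case behaviour described in the abstract can be illustrated with a small numerical sketch. The abstract does not spell out the update rule, so the code below assumes a commonly used normalized form of SAM, w ← w − η ∇L(w + ρ ∇L(w)/‖∇L(w)‖), applied to a diagonal quadratic L(w) = ½ Σ λ_i w_i². The function names, curvatures, step size η, radius ρ, and starting point are all hypothetical choices made for illustration. For this particular update restricted to the leading direction, a short fixed-point calculation gives a two-point cycle at ±ηρλ₁/(2 − ηλ₁), which the sketch compares against.

```python
import math


def sam_step(w, curvatures, eta, rho):
    """One step of the assumed normalized SAM update on L(w) = 0.5 * sum(lam_i * w_i^2)."""
    grad = [lam * x for lam, x in zip(curvatures, w)]
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        # Exactly at the minimum the ascent direction is undefined; stay put.
        return list(w)
    # Ascent step of length rho along the normalized gradient.
    perturbed = [x + rho * g / norm for x, g in zip(w, grad)]
    # Descent step using the gradient evaluated at the perturbed point.
    return [x - eta * lam * p for x, lam, p in zip(w, curvatures, perturbed)]


def run_sam(w0, curvatures, eta, rho, steps):
    """Iterate sam_step a fixed number of times and return the final iterate."""
    w = list(w0)
    for _ in range(steps):
        w = sam_step(w, curvatures, eta, rho)
    return w


# Hypothetical problem: largest curvature along the first coordinate.
curvatures = [1.0, 0.25]
eta, rho = 0.25, 0.1

w_a = run_sam([1.0, 0.5], curvatures, eta, rho, 5000)
w_b = sam_step(w_a, curvatures, eta, rho)

# Cycle amplitude predicted by the one-dimensional fixed-point calculation.
predicted = eta * rho * curvatures[0] / (2.0 - eta * curvatures[0])

print("iterate t    :", w_a)
print("iterate t + 1:", w_b)
print("predicted amplitude along leading direction:", predicted)
```

With these settings the two consecutive iterates should sit on opposite sides of the minimum along the first coordinate, while the second coordinate shrinks toward zero. This is only a sanity check on one starting point, not a substitute for the convergence bounds stated in the talk.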

Video information

Recording date 06/10/2022
Publication date 05/07/2025
Institute IHP
License CC BY-NC-ND
Language English
Audience Researchers, PhD students
Director(s) Alexandre Duplessis, Marco Perez
Format MP4
Location IHP - Hermite Amphitheater

Citation data

DOI 10.57987/IHP.2022.T3.WS1.009
Cite this video Bartlett, Peter (06/10/2022). Convergence of Sharpness-Aware Minimization. IHP. Audiovisual resource. DOI: 10.57987/IHP.2022.T3.WS1.009
URL https://dx.doi.org/10.57987/IHP.2022.T3.WS1.009

Subject area(s)

Machine learning

All videos in the collection

Scaling ResNets in the Large-depth Regime

By Adeline Fermanian

Overcoming the curse of dimensionality with deep neural networks

By Sophie Langer

Learning on and near Low-Dimensional Subsets of the Wasserstein Manifold

By Alex Cloninger

Extrinsic and Intrinsic Operator Estimations for Manifold Learning

By John Harlim

A graph coupling view of dimension reduction

By Franck Picard

Topologically penalized regression on manifolds

By Wolfgang Polonik

Bayesian nonparametric estimation of a density living near an unknown manifold

By Judith Rousseau

Linear methods for non-linear inverse problems

By Botond Szabo

Convergence of Sharpness-Aware Minimization

By Peter Bartlett

Dimensionality reduction in reinforcement learning by randomisation

By Denis Belomestny

Stein effect for estimating many vector means: a "blessing of dimensionality" phenomenon

By Gilles Blanchard

On the use of overfitting for estimator selection

By Claire Lacour

Optimal Permutation estimation in crowdsourcing problems

By Nicolas Verzelen

What does LIME really see in images?

By Damien Garreau

A statistical analysis of an image classification problem

By Johannes Schmidt-Hieber

Learning a partial correlation graph using only a few covariance queries

By Vasiliki Velona

published on April 5, 2024

Neural networks, wide and deep, singular kernels and Bayes optimality

By Mikhail Belkin

published on April 5, 2024

Understanding the geometry of high-dimensional data through the reach

By Clément Bérenfeld

published on April 5, 2024

Manifold Learning, Explanations and Eigenflows - Part 1

By Marina Meila

published on April 5, 2024

Manifold Learning, Explanations and Eigenflows - Part 2

By Marina Meila

published on April 5, 2024

On high-dimensional Lévy-driven Ornstein-Uhlenbeck processes

By Claudia Strauch

Copyright Carmin.tv 2025