Monte Carlo guided Diffusion for Bayesian linear inverse problems
By Sylvain Le Corff
Linear and nonlinear schemes for forward model reduction and inverse problems - Lecture 1
By Olga Mula Hernandez
Appears in the collection: 2022 - T3 - WS1 - Non-Linear and High Dimensional Inference
We consider Sharpness-Aware Minimization (SAM), a gradient-based optimization method for deep networks that has exhibited performance improvements on image and language prediction problems. We show that when SAM is applied to a convex quadratic objective, for most random initializations it converges to a cycle that oscillates between either side of the minimum in the direction with the largest curvature, and we provide bounds on the rate of convergence. In the non-quadratic case, we show that such oscillations effectively perform gradient descent, with a smaller step size, on the spectral norm of the Hessian. In such cases, SAM's update may be regarded as incorporating a third derivative (the derivative of the Hessian in the leading eigenvector direction) that encourages drift toward wider minima.
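To make the oscillation claim concrete, here is a minimal sketch of the SAM update applied to a convex quadratic loss L(w) = 0.5 w^T A w. The step size eta, perturbation radius rho, curvature matrix A, and iteration count are illustrative choices, not values from the talk; the sketch only shows the qualitative behaviour described in the abstract: the iterate settles into a two-point cycle along the highest-curvature coordinate while the low-curvature coordinate decays toward the minimum.

```python
import numpy as np

def grad(A, w):
    """Gradient of the quadratic loss 0.5 * w^T A w."""
    return A @ w

def sam_step(A, w, eta=0.12, rho=0.05):
    """One SAM update: perturb the iterate toward the worst-case point within
    radius rho, then take a gradient step evaluated at that perturbed point."""
    g = grad(A, w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # normalized ascent direction
    return w - eta * grad(A, w + eps)

# Diagonal quadratic: the first coordinate carries the largest curvature.
A = np.diag([10.0, 1.0])
rng = np.random.default_rng(0)
w = rng.normal(size=2)

for _ in range(200):
    w = sam_step(A, w)

# The first coordinate flips sign between consecutive iterates (a cycle
# straddling the minimum in the top-curvature direction); the second
# coordinate has shrunk toward zero.
print("final iterate:", w)
print("next iterate: ", sam_step(A, w))
```

Running this prints two iterates whose first coordinates have opposite signs and roughly equal magnitude, matching the cycle the abstract describes for the quadratic case.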