2022 - T3 - WS1 - Non-linear and high dimensional inference

Collection 2022 - T3 - WS1 - Non-linear and high dimensional inference

Organizer(s) Aamari, Eddie ; Aaron, Catherine ; Chazal, Frédéric ; Fischer, Aurélie ; Hoffmann, Marc ; Le Brigant, Alice ; Levrard, Clément ; Michel, Bertrand

Date(s) 03/10/2022 - 07/10/2022

Linked URL https://indico.math.cnrs.fr/event/7545/


Scaling ResNets in the Large-depth Regime

By Adeline Fermanian

Deep ResNets are recognized for achieving state-of-the-art results in complex machine learning tasks. However, the remarkable performance of these architectures relies on a training procedure that needs to be carefully crafted to avoid vanishing or exploding gradients, particularly as the depth $L$ increases. No consensus has been reached on how to mitigate this issue, although a widely discussed strategy consists in scaling the output of each layer by a factor $\alpha_L$. We show in a probabilistic setting that with standard i.i.d. initializations, the only non-trivial dynamics is for $\alpha_L = 1/\sqrt{L}$ (other choices lead either to explosion or to identity mapping). This scaling factor corresponds in the continuous-time limit to a neural stochastic differential equation, contrary to a widespread interpretation that deep ResNets are discretizations of neural ordinary differential equations. By contrast, in the latter regime, stability is obtained with specific correlated initializations and $\alpha_L = 1/L$. Our analysis suggests a strong interplay between scaling and regularity of the weights as a function of the layer index. Finally, in a series of experiments, we exhibit a continuous range of regimes driven by these two parameters, which jointly impact performance before and after training.
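The effect of the scaling factor at initialization can be illustrated with a small numerical experiment. The sketch below is not the model studied in the talk: it assumes a simplified residual update $h_{k+1} = h_k + \alpha_L W_{k+1}\,\mathrm{ReLU}(h_k)$ with i.i.d. Gaussian weights of variance $1/d$, and the function name `relative_change`, the width, and the depth are hypothetical choices made for illustration. It measures how far the output moves from the input, relative to the input norm, for the three scalings mentioned in the abstract.

```python
import numpy as np


def relative_change(depth, alpha, width=64, seed=0):
    """Return ||h_L - h_0|| / ||h_0|| for a toy residual network at initialization.

    Assumed (illustrative) update: h_{k+1} = h_k + alpha * W_{k+1} @ relu(h_k),
    with a fresh weight matrix per layer whose entries are i.i.d. N(0, 1/width).
    """
    rng = np.random.default_rng(seed)
    h0 = rng.standard_normal(width)
    h = h0.copy()
    for _ in range(depth):
        # Independent weights at every layer (the "standard i.i.d." setting).
        W = rng.standard_normal((width, width)) / np.sqrt(width)
        h = h + alpha * W @ np.maximum(h, 0.0)
    return np.linalg.norm(h - h0) / np.linalg.norm(h0)


if __name__ == "__main__":
    L = 200
    scalings = {"1/L": 1.0 / L, "1/sqrt(L)": 1.0 / np.sqrt(L), "1": 1.0}
    for name, alpha in scalings.items():
        print(f"alpha_L = {name:10s} relative change = {relative_change(L, alpha):.3e}")
```

Under these assumptions, the relative change should be close to zero for $\alpha_L = 1/L$ (the network is nearly the identity), of order one for $\alpha_L = 1/\sqrt{L}$, and very large for $\alpha_L = 1$, in line with the three regimes the abstract describes for i.i.d. initializations.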

Information about the video

Date of recording 03/10/2022
Date of publication 15/09/2025
Institution IHP
Licence CC BY-NC-ND
Language English
Audience Researchers, Graduate Students, Students
Director(s) Alexandre Duplessis, Marco Perez
Format MP4
Venue IHP - Hermite Amphitheater

Citation data

DOI 10.57987/IHP.2022.T3.WS1.001
Cite this video Fermanian, Adeline (03/10/2022). Scaling ResNets in the Large-depth Regime. IHP. Audiovisual resource. DOI: 10.57987/IHP.2022.T3.WS1.001
URL https://dx.doi.org/10.57987/IHP.2022.T3.WS1.001

Domain(s)

Statistics Theory


All the collection videos

Scaling ResNets in the Large-depth Regime

By Adeline Fermanian

Overcoming the curse of dimensionality with deep neural networks

By Sophie Langer

Learning on and near Low-Dimensional Subsets of the Wasserstein Manifold

By Alex Cloninger

Extrinsic and Intrinsic Operator Estimations for Manifold Learning

By John Harlim

A graph coupling view of dimension reduction

By Franck Picard

Topologically penalized regression on manifolds

By Wolfgang Polonik

Bayesian nonparametric estimation of a density living near an unknown manifold

By Judith Rousseau

Linear methods for non-linear inverse problems

By Botond Szabo

Convergence of Sharpness-Aware Minimization

By Peter Bartlett

Dimensionality reduction in reinforcement learning by randomisation

By Denis Belomestny

Stein effect for estimating many vector means: a "blessing of dimensionality" phenomenon

By Gilles Blanchard

On the use of overfitting for estimator selection

By Claire Lacour

Optimal Permutation estimation in crowdsourcing problems

By Nicolas Verzelen

What does LIME really see in images?

By Damien Garreau

A statistical analysis of an image classification problem

By Johannes Schmidt-Hieber

Learning a partial correlation graph using only a few covariance queries

By Vasiliki Velona

published on April 5, 2024

Neural networks, wide and deep, singular kernels and Bayes optimality

By Mikhail Belkin

published on April 5, 2024

Understanding the geometry of high-dimensional data through the reach

By Clément Bérenfeld

published on April 5, 2024

Manifold Learning, Explanations and Eigenflows - Part 1

By Marina Meila

published on April 5, 2024

Manifold Learning, Explanations and Eigenflows - Part 2

By Cécile Mailler

published on April 5, 2024

On high-dimensional Lévy-driven Ornstein-Uhlenbeck processes

By Claudia Strauch

Copyright Carmin.tv 2025
