2022 - T3 - WS1 - Non-Linear and High Dimensional Inference

Collection 2022 - T3 - WS1 - Non-Linear and High Dimensional Inference

Organizer(s) Aamari, Eddie ; Aaron, Catherine ; Chazal, Frédéric ; Fischer, Aurélie ; Hoffmann, Marc ; Le Brigant, Alice ; Levrard, Clément ; Michel, Bertrand
Date(s) 03/10/2022 - 07/10/2022
linked URL https://indico.math.cnrs.fr/event/7545/

Deep ResNets are recognized for achieving state-of-the-art results in complex machine learning tasks. However, the remarkable performance of these architectures relies on a training procedure that needs to be carefully crafted to avoid vanishing or exploding gradients, particularly as the depth $L$ increases. No consensus has been reached on how to mitigate this issue, although a widely discussed strategy consists of scaling the output of each layer by a factor $\alpha_L$. We show in a probabilistic setting that, with standard i.i.d. initializations, the only non-trivial dynamics arise for $\alpha_L = 1/\sqrt{L}$; other choices lead either to explosion or to an identity mapping. In the continuous-time limit, this scaling factor corresponds to a neural stochastic differential equation, contrary to the widespread interpretation that deep ResNets are discretizations of neural ordinary differential equations. In the latter regime, by contrast, stability is obtained with specific correlated initializations and $\alpha_L = 1/L$. Our analysis suggests a strong interplay between the scaling and the regularity of the weights as a function of the layer index. Finally, in a series of experiments, we exhibit a continuous range of regimes driven by these two parameters, which jointly impact performance before and after training.
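The three regimes described in the abstract can be observed numerically at initialization. The sketch below is a minimal illustration, not the model or code from the talk: it assumes a toy linear residual update $h_{k+1} = h_k + \alpha_L V_{k+1} h_k$ with $\alpha_L = L^{-\beta}$ and independent weight matrices $V_{k+1}$ whose entries are i.i.d. $\mathcal{N}(0, 1/d)$. The function name `relative_displacement`, the width `dim`, and the choice of a linear update are all hypothetical choices made for illustration.

```python
import math
import random


def relative_displacement(depth, beta, dim=16, seed=0):
    """Return ||h_L - h_0|| / ||h_0|| for a toy linear residual network.

    Update rule (an illustrative assumption, not the talk's exact model):
        h_{k+1} = h_k + alpha_L * V_{k+1} h_k,  alpha_L = depth ** (-beta),
    where V_{k+1} is a fresh dim x dim matrix with i.i.d. N(0, 1/dim) entries.
    """
    rng = random.Random(seed)
    alpha = depth ** (-beta)
    sigma = 1.0 / math.sqrt(dim)

    h0 = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    h = list(h0)
    for _ in range(depth):
        # Each row of V is drawn on the fly, so every layer gets
        # independent weights (i.i.d. initialization across layers).
        update = [
            sum(rng.gauss(0.0, sigma) * x for x in h)
            for _ in range(dim)
        ]
        h = [x + alpha * u for x, u in zip(h, update)]

    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(h, h0)))
    return diff / math.sqrt(sum(b ** 2 for b in h0))


if __name__ == "__main__":
    depth = 1000
    for beta in (1.0, 0.5, 0.25):
        ratio = relative_displacement(depth, beta)
        print(f"alpha_L = L^-{beta}: relative displacement = {ratio:.3g}")
```

Under these assumptions, $\mathbb{E}\|h_L\|^2 = (1 + \alpha_L^2)^L \|h_0\|^2$, which tends to $\|h_0\|^2$ for $\alpha_L = 1/L$ (the output stays close to the input), to $e\,\|h_0\|^2$ for $\alpha_L = 1/\sqrt{L}$ (a non-trivial limit), and diverges for slower decay such as $\alpha_L = L^{-1/4}$. The simulation should show the same ordering, with the exact values depending on the seed and the width.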

Information about the video

Citation data

  • DOI 10.57987/IHP.2022.T3.WS1.001
  • Cite this video Fermanian, Adeline (03/10/2022). Scaling ResNets in the Large-depth Regime. IHP. Audiovisual resource. DOI: 10.57987/IHP.2022.T3.WS1.001
  • URL https://dx.doi.org/10.57987/IHP.2022.T3.WS1.001
