
Gradient descent for wide two-layer neural networks

By Francis Bach

Appears in collection: Optimization for Machine Learning

Neural networks trained to minimize the logistic (a.k.a. cross-entropy) loss with gradient-based methods are observed to perform well in many supervised classification tasks. Towards understanding this phenomenon, we analyze the training and generalization behavior of infinitely wide two-layer neural networks with homogeneous activations. We show that the limits of the gradient flow on exponentially tailed losses can be fully characterized as a max-margin classifier in a certain non-Hilbertian space of functions.
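The abstract describes the setting only in words. Below is a minimal, self-contained sketch of that setting, not code from the talk: a wide two-layer ReLU network (a positively homogeneous activation) trained on the logistic loss for binary classification with plain full-batch gradient descent. The data, the width m, the step-size scaling, and the iteration count are all illustrative assumptions.

    # Illustrative sketch (not from the talk): wide two-layer ReLU network,
    # logistic loss, full-batch gradient descent. All constants are assumptions.
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic binary classification data with labels in {-1, +1}.
    n, d = 200, 2
    X = rng.standard_normal((n, d))
    y = np.sign(X[:, 0] * X[:, 1])        # a simple nonlinear labelling rule
    y[y == 0] = 1.0

    # Two-layer network: f(x) = (1/m) * sum_j b_j * relu(a_j . x).
    m = 1000                               # large width, standing in for the infinite-width limit
    A = rng.standard_normal((m, d))        # hidden-layer weights a_j
    b = rng.standard_normal(m)             # output-layer weights b_j

    # Step size scaled with the width so the 1/m output scaling does not
    # freeze the dynamics (an illustrative choice).
    lr = 0.5 * m

    for step in range(2001):
        pre = X @ A.T                      # (n, m) pre-activations
        hidden = np.maximum(pre, 0.0)      # ReLU, positively homogeneous
        f = hidden @ b / m                 # network outputs
        margins = y * f
        # Logistic (cross-entropy) loss: mean of log(1 + exp(-y * f(x))).
        loss = np.mean(np.logaddexp(0.0, -margins))
        # dloss/df = -y * sigmoid(-y * f), averaged over samples;
        # sigmoid(-z) = 0.5 * (1 - tanh(z / 2)) is used for numerical stability.
        g = -y * 0.5 * (1.0 - np.tanh(margins / 2.0)) / n
        # Gradients for both layers.
        grad_b = hidden.T @ g / m
        grad_A = ((g[:, None] * (pre > 0.0)) * b[None, :]).T @ X / m
        b -= lr * grad_b
        A -= lr * grad_A
        if step % 500 == 0:
            acc = np.mean(np.sign(f) == y)
            print(f"step {step:4d}  loss {loss:.4f}  train accuracy {acc:.3f}")

This sketch only illustrates the training dynamics. The result stated in the abstract concerns the limit of the corresponding gradient flow: as training drives the exponentially tailed loss toward zero, the predictor is characterized as a max-margin classifier in a certain non-Hilbertian space of functions.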

Information about the video

Citation data

  • DOI 10.24350/CIRM.V.19622703
  • Cite this video BACH, Francis (12/03/2020). Gradient descent for wide two-layer neural networks. CIRM. Audiovisual resource. DOI: 10.24350/CIRM.V.19622703
  • URL https://dx.doi.org/10.24350/CIRM.V.19622703
