The Expressive Power of Large Language Models

By Gabriel Peyré

Appears in collection: 3rd Edition of Mathematics for and by Large Language Models

Large language models process vast sequences of input tokens by alternating between classical multi-layer perceptron layers and self-attention mechanisms. While the approximation capabilities of perceptrons are relatively well understood, those of attention mechanisms remain less explored. In this talk, I will compare the proof techniques and approximation results associated with these two types of layers, emphasizing key open questions that connect large language models with approximation theory in infinite-dimensional spaces representing input token distributions.
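The alternation described in the abstract can be sketched as follows. This is a minimal illustration, not the construction from the talk: it assumes a single attention head, residual connections, no masking, no positional encoding and no layer normalization, and every name in it (`self_attention`, `mlp`, `block`, `params`) is hypothetical. The final check shows that such a block is permutation-equivariant, which is one reason it is natural to treat the input as a distribution of tokens rather than as an ordered list.

```python
import numpy as np

def softmax(scores, axis=-1):
    # Subtract the row-wise max for numerical stability.
    shifted = scores - scores.max(axis=axis, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=axis, keepdims=True)

def self_attention(tokens, Q, K, V):
    # tokens has shape (n, d); every token attends to every token.
    scores = (tokens @ Q) @ (tokens @ K).T / np.sqrt(Q.shape[1])
    return softmax(scores) @ (tokens @ V)

def mlp(tokens, W1, b1, W2, b2):
    # Two-layer ReLU perceptron applied to each token independently.
    hidden = np.maximum(tokens @ W1 + b1, 0.0)
    return hidden @ W2 + b2

def block(tokens, params):
    # One attention layer followed by one MLP layer, both with residuals.
    tokens = tokens + self_attention(tokens, *params["attn"])
    return tokens + mlp(tokens, *params["mlp"])

# Illustrative sizes and random weights (assumptions, not from the talk).
rng = np.random.default_rng(0)
n, d, h = 5, 4, 8
params = {
    "attn": [rng.normal(size=(d, d)) for _ in range(3)],
    "mlp": [rng.normal(size=(d, h)), np.zeros(h),
            rng.normal(size=(h, d)), np.zeros(d)],
}
tokens = rng.normal(size=(n, d))

out = block(tokens, params)

# Shuffling the input tokens shuffles the outputs in the same way.
perm = rng.permutation(n)
out_perm = block(tokens[perm], params)
equivariant = np.allclose(out_perm, out[perm])
```

In this sketch the MLP acts on one token at a time, while the attention layer is the only place where tokens interact. That split is why the approximation properties of the two layer types are studied separately.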

Information about the video

  • Date of recording: 28/05/2026
  • Date of publication: 12/06/2026
  • Institution: IHES
  • Language: English
  • Audience: Researchers
  • Format: MP4
