Three Problems in the Mathematics of Deep Learning

By Andrew Dudzik

Appears in collection: Mathematics for and by Large Language Models

Neural networks, particularly LLMs, are notoriously poor at algorithmic tasks such as sorting, finding shortest paths, and even basic arithmetic. Across three papers, we explored the problem of "aligning" architectures to classical computer programs and showed that this question relates to familiar mathematical concepts: polynomial functors, cohomology, and higher categories.

Information about the video

  • Date of recording: 23/05/2024
  • Date of publication: 25/05/2024
  • Institution: IHES
  • Language: English
  • Audience: Researchers
  • Format: MP4

