Analysis of Deep Neural Networks with Random Tensors

By Tomohiro Hayase

Appears in the collection: 2024 - PC2 - Random tensors and related topics

Combining random matrices and multilayer perceptrons (MLPs) forms the theoretical foundation of deep neural networks (DNNs). So what role do random tensors play in deep learning? In this talk, we introduce how random tensors appear in the analysis of the MLP-Mixer. The MLP-Mixer is a type of DNN used in image processing and a simplified model of the Vision Transformer (ViT). In these models, input images are divided into tokens, arranged sequentially, and fed to the network as second-order tensors. The MLP-Mixer performs both within-token and between-token operations using MLP blocks. Despite its simple structure, which replaces the attention mechanism of ViT with MLPs, the MLP-Mixer achieves performance close to ViT's, highlighting the importance of data volume and tokenization. Specifically, this talk presents experimental results showing that high sparsity and large hidden-layer dimensions positively impact performance. To this end, we intentionally disrupt the model's structure using tensor products and random permutation matrices, verifying that these beneficial properties do not depend on the model's specific structure.
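To make the architecture concrete, below is a minimal NumPy sketch (not from the talk) of a single MLP-Mixer block acting on a tokenized input of shape (tokens, channels). An optional random permutation matrix stands in for the kind of structural disruption mentioned in the abstract; the talk's actual experimental setup, as well as layer normalization, training details, and parameter shapes, are not specified here, so everything below is an illustrative assumption.

```python
import numpy as np


def mlp(x, w1, b1, w2, b2):
    """Two-layer MLP with a tanh-based GELU approximation on the last axis."""
    h = x @ w1 + b1
    h = 0.5 * h * (1.0 + np.tanh(np.sqrt(2 / np.pi) * (h + 0.044715 * h**3)))
    return h @ w2 + b2


def mixer_block(x, params, perm=None):
    """One MLP-Mixer block on a token matrix x of shape (num_tokens, channels).

    Token mixing applies an MLP across the token axis (between-token work);
    channel mixing applies an MLP across the channel axis (within-token work).
    If `perm` is given, tokens are shuffled by a random permutation matrix to
    deliberately break the spatial structure -- a stand-in for the disruption
    experiment described in the abstract, not its exact protocol.
    LayerNorm is omitted for brevity.
    """
    if perm is not None:
        x = perm @ x  # structural disruption via a random token permutation
    # Token mixing: transpose so the MLP acts across tokens, then transpose back.
    x = x + mlp(x.T, *params["token"]).T
    # Channel mixing: the MLP acts across the channels of each token.
    x = x + mlp(x, *params["channel"])
    return x


# Toy example: 16 tokens, 32 channels; hidden widths chosen arbitrarily.
rng = np.random.default_rng(0)
T, C, Ht, Hc = 16, 32, 64, 128
params = {
    "token":   (rng.normal(0, T**-0.5, (T, Ht)),  np.zeros(Ht),
                rng.normal(0, Ht**-0.5, (Ht, T)), np.zeros(T)),
    "channel": (rng.normal(0, C**-0.5, (C, Hc)),  np.zeros(Hc),
                rng.normal(0, Hc**-0.5, (Hc, C)), np.zeros(C)),
}
perm = np.eye(T)[rng.permutation(T)]       # random permutation matrix
x = rng.normal(size=(T, C))                # one tokenized "image"
print(mixer_block(x, params, perm).shape)  # (16, 32)
```

Because both mixing steps are plain matrix multiplications, randomly initialized weights of this kind are exactly where random-matrix and random-tensor tools enter the analysis of such models.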

Video information

Citation data

  • DOI 10.57987/IHP.2024.PC2.010
  • Cite this video: Hayase, Tomohiro (15/10/2024). Analysis of Deep Neural Networks with Random Tensors. IHP. Audiovisual resource. DOI: 10.57987/IHP.2024.PC2.010
  • URL https://dx.doi.org/10.57987/IHP.2024.PC2.010

Domain(s)
