Mathematics as a Translation Task - the Importance of Training Distributions
Many mathematical problems can be framed as translation tasks: a problem, represented as a sentence in some language, is translated into its solution by a language model trained on synthetic examples. In this setting, we are free to choose the distribution of problems and solutions used to train the model. I present examples from three different experiments which suggest that this choice can make a large difference in model performance, and which provide intuition about the inner workings of transformer models.
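To make the setting concrete, here is a minimal sketch of how synthetic training pairs might be generated under two different operand distributions. The task (greatest common divisor), the digit-token encoding, the `<sep>` token, and the names `encode`, `sample_uniform`, `sample_log_uniform`, and `make_example` are all assumptions chosen for illustration; they are not taken from the experiments discussed here.

```python
import math
import random


def encode(n, base=10):
    """Encode a non-negative integer as digit tokens, most significant first."""
    digits = []
    while n > 0:
        digits.append(str(n % base))
        n //= base
    return digits[::-1] or ["0"]


def sample_uniform(rng, max_value):
    """Draw an operand uniformly from 1..max_value (mostly large numbers)."""
    return rng.randint(1, max_value)


def sample_log_uniform(rng, max_value):
    """Draw an operand whose logarithm is uniform, so every order of
    magnitude is roughly equally likely (many more small numbers)."""
    value = int(math.exp(rng.uniform(0.0, math.log(max_value))))
    return max(1, min(max_value, value))


def make_example(rng, sampler, max_value=10**6):
    """Build one (problem, solution) pair as two token sequences.
    The illustrative task is gcd(a, b); the sampler fixes the distribution."""
    a = sampler(rng, max_value)
    b = sampler(rng, max_value)
    source = encode(a) + ["<sep>"] + encode(b)
    target = encode(math.gcd(a, b))
    return source, target


rng = random.Random(0)
# Same task, same encoding, two different training distributions.
uniform_set = [make_example(rng, sample_uniform) for _ in range(1000)]
log_uniform_set = [make_example(rng, sample_log_uniform) for _ in range(1000)]
```

Both sets pose the same problem in the same language; only the sampler differs. In this sketch, training one model on each set and comparing them on a common test set would be one way to isolate the effect of the training distribution.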