Banach-Tarski Embeddings and Transformers

Joshua Maher
DOI: https://doi.org/10.48550/arXiv.2311.09387
2023-11-22
Abstract: We introduce a new construction of embeddings of arbitrary recursive data structures into high-dimensional vectors. These embeddings provide an interpretable model for the latent state vectors of transformers. We demonstrate that these embeddings can be decoded to the original data structure when the embedding dimension is sufficiently large. This decoding algorithm has a natural implementation as a transformer. We also show that these embedding vectors can be manipulated directly to perform computations on the underlying data without decoding. As an example, we present an algorithm that constructs the embedded parse tree of an embedded token sequence using only vector operations in embedding space.
Machine Learning, Computation and Language, Data Structures and Algorithms