Universal Regular Conditional Distributions via Probabilistic Transformers

Kratsios, Anastasis
DOI: https://doi.org/10.1007/s00365-023-09635-3
2023-03-28
Constructive Approximation
Abstract: We introduce a deep learning model that can universally approximate regular conditional distributions (RCDs). The proposed model operates in three phases: first, it linearizes inputs from a given metric space $\mathcal{X}$ to $\mathbb{R}^d$ via a feature map; next, a deep feedforward neural network processes these linearized features; finally, the network's outputs are transformed to the 1-Wasserstein space $\mathcal{P}_1(\mathbb{R}^D)$ via a probabilistic extension of the attention mechanism of Bahdanau et al. (Neural machine translation by jointly learning to align and translate, 2014. arXiv:1409.0473). Our model, called the probabilistic transformer (PT), can approximate any continuous function from $\mathbb{R}^d$ to $\mathcal{P}_1(\mathbb{R}^D)$ uniformly on compact sets, with quantitative rates. We identify two ways in which the PT avoids the curse of dimensionality when approximating $\mathcal{P}_1(\mathbb{R}^D)$-valued functions. The first strategy builds functions in $C(\mathbb{R}^d, \mathcal{P}_1(\mathbb{R}^D))$ that can be efficiently approximated by a PT, uniformly on any given compact subset of $\mathbb{R}^d$. In the second approach, given any function $f$ in $C(\mathbb{R}^d, \mathcal{P}_1(\mathbb{R}^D))$, we build compact subsets of $\mathbb{R}^d$ on which $f$ can be efficiently approximated by a PT.
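The three phases described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the identity feature map, the single hidden layer, the random parameters, and every function name below are assumptions made for illustration. The last step represents the output as a finite mixture of point masses whose weights come from a softmax, which is one simple way to produce a probability measure with finite first moment.

```python
import math
import random


def feature_map(x):
    """Phase 1 (hypothetical): inputs are already vectors, so just copy them."""
    return [float(v) for v in x]


def affine(u, weights, biases):
    """Compute W u + b for a row-major weight matrix."""
    return [sum(w * v for w, v in zip(row, u)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Numerically stable softmax; the result lies on the probability simplex."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(d, hidden, n_atoms, D, seed=0):
    """Random parameters for illustration only (assumed shapes and names)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

    return {
        "W1": mat(hidden, d), "b1": [0.0] * hidden,
        "W2": mat(n_atoms, hidden), "b2": [0.0] * n_atoms,
        "atoms": mat(n_atoms, D),  # locations of the point masses in R^D
    }


def probabilistic_transformer(x, params):
    """Return a discrete measure on R^D as a list of (weight, atom) pairs."""
    u = feature_map(x)                                     # phase 1
    h = [max(0.0, z) for z in affine(u, params["W1"], params["b1"])]
    logits = affine(h, params["W2"], params["b2"])         # phase 2
    weights = softmax(logits)                              # phase 3
    return list(zip(weights, params["atoms"]))


def mean_of(measure):
    """Mean of the discrete measure, computed coordinate by coordinate."""
    D = len(measure[0][1])
    return [sum(w * atom[k] for w, atom in measure) for k in range(D)]


params = init_params(d=3, hidden=8, n_atoms=5, D=2)
measure = probabilistic_transformer([0.1, -0.4, 2.0], params)
```

Here `measure` is a list of five weighted atoms in the plane, and `mean_of(measure)` gives its mean. The point of the sketch is only that the network outputs a probability measure conditioned on the input rather than a single vector.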
mathematics