Abstract: Squared tensor networks (TNs) and their generalization as parameterized computational graphs -- squared circuits -- have been recently used as expressive distribution estimators in high dimensions. However, the squaring operation introduces additional complexity when marginalizing variables or computing the partition function, which hinders their usage in machine learning applications. Canonical forms of popular TNs are parameterized via unitary matrices so as to simplify the computation of particular marginals, but cannot be mapped to general circuits since these might not correspond to a known TN. Inspired by TN canonical forms, we show how to parameterize squared circuits to ensure they encode already normalized distributions. We then use this parameterization to devise an algorithm to compute any marginal of squared circuits that is more efficient than a previously known one. We conclude by formally showing the proposed parameterization comes with no expressiveness loss for many circuit classes.

What problem does this paper attempt to address?

This paper addresses the cost of computing marginal probabilities and partition functions with squared circuits. Because of the squaring operation, marginalizing variables or computing the partition function becomes more complex, which hinders the use of squared circuits in machine learning applications.

#### Main problems

1. **High computational complexity**: The squaring operation increases the time complexity of computing marginal probabilities and partition functions, and this cost can become prohibitive at scale.
2. **Lack of a canonical form**: Popular tensor networks (TNs) simplify the computation of particular marginals by adopting canonical forms, which are parameterized via unitary matrices. These forms cannot be applied directly to squared circuits, because a general circuit might not correspond to a known TN structure.

#### Solutions

To solve these problems, the authors propose the following:

1. **Orthonormal parameterization**: By introducing orthonormal circuits, the authors ensure that squared circuits encode an already-normalized distribution. The input layers encode orthonormal functions, and the sum layers are parameterized by (semi-)unitary matrices.
   - **Definition 3 (Orthonormal circuit)**:
     - Each input layer encodes a set of orthonormal functions, that is, \(\int_{\text{dom}(X)} f_i(x) f_j^*(x) dx=\delta_{ij}\), where \(\delta_{ij}\) is the Kronecker delta.
     - Each sum layer is parameterized by a (semi-)unitary matrix \(W\in\mathbb{C}^{K_1\times K_2}\) that satisfies \(WW^\dagger = I_{K_1}\), that is, the rows of \(W\) are orthonormal.
2. **A more efficient marginalization algorithm**: Based on the properties of orthonormal circuits, a new algorithm computes any marginal with a lower time complexity than the previously known method.
   - **Theorem 1**: For a structured-decomposable orthonormal circuit \(c\), the time complexity of computing the marginal likelihood \(p(y)=\int_{\text{dom}(Z)} |c(y, z)|^2 dz\) is \(O(|\phi_Y|S + |\phi_{Y,Z}|S^2)\), where \(\phi_Y\) and \(\phi_{Y,Z}\) denote the sets of layers that depend only on \(Y\) and on both \(Y\) and \(Z\), respectively.
3. **Maintaining expressiveness**: The paper proves that orthonormal circuits lose no expressiveness for many circuit classes: a general circuit can be converted into an equivalent orthonormal circuit by a polynomial-time algorithm.
   - **Theorem 2**: For a tensorized circuit \(c\), if each input layer encodes a set of orthonormal functions, then there exists a polynomial-time algorithm that returns an equivalent orthonormal circuit \(c'\) such that \(c'(X) = Z^{-1/2} c(X)\), where \(Z=\int_{\text{dom}(X)} |c(x)|^2 dx\).

With these contributions, the paper speeds up marginalization in squared circuits without reducing the expressiveness of the model. This makes squared circuits more practical in tasks that require fast marginalization, such as lossless compression and sampling.
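The effect of the orthonormal parameterization can be checked numerically on a toy example. The sketch below is not the paper's algorithm. It assumes a two-variable circuit over a finite domain (so integrals become sums), built from two input layers and one single-output sum layer; the names `semi_unitary`, `D`, and `K` are hypothetical choices made for illustration. It shows that the squared circuit is already normalized, and that marginalizing out one variable never requires evaluating that variable's input layer.

```python
import numpy as np

def semi_unitary(rows, cols, rng):
    """Return a random complex W of shape (rows, cols), rows <= cols, with W @ W^dagger = I."""
    a = rng.normal(size=(cols, rows)) + 1j * rng.normal(size=(cols, rows))
    q, _ = np.linalg.qr(a)   # q has orthonormal columns, shape (cols, rows)
    return q.conj().T        # conjugate transpose: orthonormal rows

rng = np.random.default_rng(0)
D, K = 5, 3  # assumed domain size per variable and layer width (K <= D)

# Input layers: row i holds the i-th function evaluated on {0, ..., D-1}.
# Sums over this finite domain stand in for the integral in Definition 3.
F1 = semi_unitary(K, D, rng)  # functions of x1
F2 = semi_unitary(K, D, rng)  # functions of x2

# Single-output sum layer: a 1 x K^2 semi-unitary matrix, i.e. a unit-norm vector,
# reshaped so that W[i, j] weights the product F1[i, x1] * F2[j, x2].
W = semi_unitary(1, K * K, rng).reshape(K, K)

# Full circuit output c(x1, x2) on every joint state.
C = np.einsum("ij,ia,jb->ab", W, F1, F2)

# Partition function of the squared circuit: equals 1 up to rounding.
Z = np.sum(np.abs(C) ** 2)

# Marginal of x1 by brute force: square, then sum over all states of x2.
p_brute = np.sum(np.abs(C) ** 2, axis=1)

# Marginal of x1 using orthonormality of F2: the sum over x2 collapses to a
# Kronecker delta, so F2 is never evaluated.
p_fast = np.sum(np.abs(F1.T @ W) ** 2, axis=1)

print(np.isclose(Z, 1.0), np.allclose(p_brute, p_fast))
```

The shortcut in `p_fast` relies only on the rows of `F2` being orthonormal, which is the same property that lets layers depending solely on marginalized variables be skipped.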

On Faster Marginalization with Squared Circuits via Orthonormalization
