Machine learning with tree tensor networks, CP rank constraints, and tensor dropout

Hao Chen, Thomas Barthel
2023-05-31
Abstract: Tensor networks approximate order-$N$ tensors with a reduced number of degrees of freedom that is only polynomial in $N$ and arranged as a network of partially contracted smaller tensors. As suggested in [arXiv:2205.15296] in the context of quantum many-body physics, computation costs can be further substantially reduced by imposing constraints on the canonical polyadic (CP) rank of the tensors in such networks. Here we demonstrate how tree tensor networks (TTN) with CP rank constraints and tensor dropout can be used in machine learning. The approach is found to outperform other tensor-network based methods in Fashion-MNIST image classification. A low-rank TTN classifier with branching ratio $b=4$ reaches test set accuracy 90.3% with low computation costs. Consisting of mostly linear elements, tensor network classifiers avoid the vanishing gradient problem of deep neural networks. The CP rank constraints have additional advantages: The number of parameters can be decreased and tuned more freely to control overfitting, improve generalization properties, and reduce computation costs. They allow us to employ trees with large branching ratios, which substantially improves the representation power.
Machine Learning, Strongly Correlated Electrons
What problem does this paper attempt to address?
The paper addresses how tree tensor networks (TTN), combined with CP rank constraints and tensor dropout, can improve performance on machine learning tasks, particularly image classification. Specifically, it explores how these techniques reduce computation costs and improve generalization, and it reports a test set accuracy of 90.3% on the Fashion-MNIST dataset for a low-rank TTN classifier with branching ratio $b=4$. The paper also discusses the advantages of low-rank TTN over full-rank TTN: fewer parameters, which can be tuned more freely to control overfitting, and lower computation costs. Because the CP rank constraints keep costs low, trees with large branching ratios become practical, which increases the representation power for high-dimensional data.
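To make the parameter and cost savings concrete, the following is a minimal NumPy sketch of a single tree node whose tensor is stored in CP form. It is an illustration under stated assumptions, not the authors' implementation: the function names (`cp_node_apply`, `cp_to_full`, `dropout_mask`), the leg dimension `m`, the CP rank `r`, and the particular dropout rule (dropping rank-one terms with probability `p` and rescaling the kept ones) are all hypothetical choices made here. A node with `b` input legs and one output leg, each of dimension `m`, has `m**(b+1)` entries as a full tensor but only `(b+1)*m*r` parameters in CP form.

```python
import numpy as np

def cp_node_apply(in_factors, out_factor, inputs, mask=None):
    """Contract a CP-form node tensor with one input vector per input leg.

    in_factors: list of b arrays of shape (m, r), one per input leg (hypothetical layout)
    out_factor: array of shape (m_out, r) for the output leg
    inputs:     list of b vectors of length m
    mask:       optional length-r weights applied to the rank-one terms
    """
    weights = np.ones(out_factor.shape[1])
    for W, x in zip(in_factors, inputs):
        weights = weights * (x @ W)      # project each input onto the r components
    if mask is not None:
        weights = weights * mask         # assumed dropout acting on CP terms
    return out_factor @ weights          # combine the r terms on the output leg

def cp_to_full(in_factors, out_factor):
    """Build the dense node tensor of shape (m_out, m, ..., m), for checking only."""
    full = out_factor
    for W in in_factors:
        full = np.einsum('...k,ik->...ik', full, W)
    return full.sum(axis=-1)

def dropout_mask(rng, r, p):
    """Assumed rule: keep each rank-one term with probability 1 - p, rescale by 1/(1 - p)."""
    keep = rng.random(r) >= p
    return keep / (1.0 - p)

# Small demonstration with hypothetical sizes.
m, b, r = 4, 4, 3
rng = np.random.default_rng(0)
in_factors = [rng.normal(size=(m, r)) for _ in range(b)]
out_factor = rng.normal(size=(m, r))
inputs = [rng.normal(size=m) for _ in range(b)]

y_cp = cp_node_apply(in_factors, out_factor, inputs)

# Reference result from the dense tensor: contract the input legs one by one.
dense = cp_to_full(in_factors, out_factor)
y_dense = dense
for x in inputs:
    y_dense = np.tensordot(y_dense, x, axes=([1], [0]))

n_full = m ** (b + 1)        # 1024 entries in the full tensor
n_cp = (b + 1) * m * r       # 60 parameters in CP form
```

In this sketch the CP contraction never forms the dense tensor, so its cost grows linearly in `b` rather than exponentially, which is consistent with the summary's point that rank constraints make larger branching ratios affordable. During training one would pass `mask=dropout_mask(rng, r, p)` to `cp_node_apply`; at test time the mask is omitted.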