Efficient shallow learning mechanism as an alternative to deep learning
Ofek Tevet, Ronit D. Gross, Shiri Hodassman, Tal Rogachevsky, Yarden Tzach, Yuval Meir, Ido Kanter
DOI: https://doi.org/10.1016/j.physa.2024.129513
IF: 3.778
2024-01-13
Physica A: Statistical Mechanics and its Applications
Abstract: Deep learning architectures comprising tens or even hundreds of convolutional and fully-connected hidden layers differ greatly from the shallow architecture of the brain. Here, we demonstrate that by increasing the relative number of filters per layer of a generalized shallow architecture, the error rates decay as a power law to zero. Additionally, a quantitative method to measure the performance of a single filter shows that each filter identifies small clusters of possible output labels, with additional noise selected as labels outside the clusters. For a given generalized architecture, this average noise per filter also decays as a power law with an increasing number of filters per layer, forming the underlying mechanism of efficient shallow learning. The results are supported by the training of the generalized LeNet-3, VGG-5, and VGG-16 on CIFAR-100 and suggest an increase in the noise power-law exponent for deeper architectures. The presented underlying shallow learning mechanism calls for further quantitative examination using various databases and shallow architectures.
physics, multidisciplinary
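
The power-law decay described in the abstract can be illustrated with a minimal sketch of how such an exponent might be estimated. The sketch assumes a form error ≈ a · d^(−ρ), where d is a relative filter-count scale, and fits it by linear least squares in log-log space. The function name `fit_power_law`, the variable names, and all numbers are hypothetical placeholders chosen for illustration; they are not values or methods taken from the paper.

```python
import numpy as np


def fit_power_law(x, y):
    """Fit y ~ a * x**(-rho) by linear least squares in log-log space.

    Returns (a, rho). Hypothetical helper, not the authors' procedure.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # log y = log a - rho * log x, so the slope of the line is -rho
    slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
    return float(np.exp(intercept)), float(-slope)


# Synthetic placeholder data (NOT results from the paper):
# error rates generated from an exact power law with a = 0.5, rho = 0.4.
filter_scale = np.array([1, 2, 4, 8, 16], dtype=float)
error_rate = 0.5 * filter_scale ** (-0.4)

a_hat, rho_hat = fit_power_law(filter_scale, error_rate)
print(f"a = {a_hat:.3f}, rho = {rho_hat:.3f}")
```

With measured error rates or per-filter noise values in place of the synthetic array, the same fit would give an estimate of the corresponding exponent. A log-log fit weights relative deviations equally across scales, which is one common choice but not the only one.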