Hybrid deep additive neural networks

Gyu Min Kim, Jeong Min Jeon
2024-11-14
Abstract: Traditional neural networks (multi-layer perceptrons) have become an important tool in data science due to their success across a wide range of tasks. However, their performance is sometimes unsatisfactory, and they often require a large number of parameters, primarily due to their reliance on the linear combination structure. Meanwhile, additive regression has been a popular alternative to linear regression in statistics. In this work, we introduce novel deep neural networks that incorporate the idea of additive regression. Our neural networks share architectural similarities with Kolmogorov-Arnold networks but are based on simpler yet flexible activation and basis functions. Additionally, we introduce several hybrid neural networks that combine this architecture with that of traditional neural networks. We derive their universal approximation properties and demonstrate their effectiveness through simulation studies and a real-data application. The numerical results indicate that our neural networks generally achieve better performance than traditional neural networks while using fewer parameters.
Machine Learning
What problem does this paper attempt to address?
This paper addresses two weaknesses of traditional neural networks (multi-layer perceptrons) when modelling complex non-linear relationships: unsatisfactory performance and a large parameter count. Specifically, the authors point out the following problems:

1. **Insufficient performance**: Traditional neural networks sometimes fail to capture the complex non-linear relationships between predictor variables and response variables.
2. **Redundant parameters**: Traditional neural networks usually require a large number of parameters, which leads to high computational and memory requirements.

These problems mainly stem from the limited non-linear capacity of each node, which relies on a linear combination of its inputs. To solve them, the authors introduce a new deep neural network architecture, **Hybrid Deep Additive Neural Networks (HDANNs)**. This architecture incorporates the idea of additive regression and aims to improve computational efficiency and model performance by using simple activation functions and basis functions.

### Specific improvement measures

1. **Introducing additive structure**: Drawing on additive regression, each node depends on a sum of non-linear functions of the input variables, which lets it capture complex non-linear relationships more directly.
2. **Simplifying basis functions**: Simple yet flexible basis functions make computation and implementation easier.
3. **Hybrid architecture**: Several hybrid neural networks are proposed that combine the new additive networks with traditional neural networks, which helps avoid over-fitting.
4. **Universal approximation property**: The authors prove that the new networks have the universal approximation property for continuous functions, which supports their theoretical validity.

### Experimental verification

The authors verify the effectiveness of the proposed networks through simulation studies and a real-data application. The results show that, compared with traditional neural networks, the new networks generally achieve lower prediction error while using fewer parameters, which reduces computational cost.

In general, this paper aims to improve the performance and efficiency of neural networks by introducing additive structures and simplifying basis functions while maintaining good theoretical properties.
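
The additive structure described above, in which each node depends on a sum of non-linear functions of the inputs rather than on a linear combination, can be illustrated with a minimal NumPy sketch. The summary does not specify the paper's basis or activation functions, so everything concrete here is an assumption made for illustration: the polynomial basis, the `tanh` activation, the hybrid arrangement (one additive hidden layer followed by a linear output layer), and all function and variable names (`poly_basis`, `additive_layer`, `linear_layer`, `coef`). It is not the authors' implementation.

```python
import numpy as np


def poly_basis(x, degree):
    """Expand each input into powers 1..degree (assumed basis, for illustration).

    x: (n, d) -> (n, d, degree)
    """
    return np.stack([x ** m for m in range(1, degree + 1)], axis=-1)


def additive_layer(x, coef, bias, degree):
    """Hypothetical additive layer.

    Node k computes tanh(bias[k] + sum_j g_kj(x_j)), where each component
    function is g_kj(x_j) = sum_m coef[k, j, m] * x_j ** (m + 1).
    x: (n, d), coef: (k, d, degree), bias: (k,) -> (n, k)
    """
    feats = poly_basis(x, degree)                       # (n, d, degree)
    pre = np.einsum("ndm,kdm->nk", feats, coef) + bias  # sum over inputs and basis terms
    return np.tanh(pre)


def linear_layer(x, weight, bias):
    """Traditional MLP layer: tanh of a linear combination of the inputs.

    x: (n, d), weight: (k, d), bias: (k,) -> (n, k)
    """
    return np.tanh(x @ weight.T + bias)


rng = np.random.default_rng(0)
n, d, k, degree = 5, 3, 4, 3

x = rng.uniform(-1.0, 1.0, size=(n, d))
coef = rng.normal(scale=0.5, size=(k, d, degree))
bias = np.zeros(k)

# Assumed hybrid arrangement: additive hidden layer, then a plain linear output.
hidden = additive_layer(x, coef, bias, degree)  # (n, k)
w_out = rng.normal(size=(1, k))
y_hat = hidden @ w_out.T                        # (n, 1)
```

With `degree = 1` the assumed polynomial basis contains only the identity term, so `additive_layer` reduces to `linear_layer`. In this sketch the traditional layer is therefore a special case of the additive one.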