Deep Broad Learning - Big Models for Big Data

Nayyar A. Zaidi, Geoffrey I. Webb, Mark J. Carman, Francois Petitjean
DOI: https://doi.org/10.48550/arXiv.1509.01346
2015-09-04
Abstract: Deep learning has demonstrated the power of detailed modeling of complex high-order (multivariate) interactions in data. For some learning tasks there is power in learning models that are not only Deep but also Broad. By Broad, we mean models that incorporate evidence from large numbers of features. This is of especial value in applications where many different features and combinations of features all carry small amounts of information about the class. The most accurate models will integrate all that information. In this paper, we propose an algorithm for Deep Broad Learning called DBL. The proposed algorithm has a tunable parameter $n$ that specifies the depth of the model. It provides straightforward paths towards out-of-core learning for large data. We demonstrate that DBL learns models from large quantities of data with accuracy that is highly competitive with the state-of-the-art.
Machine Learning
What problem does this paper attempt to address?
The paper asks how, given big data, to build a classifier that is both Deep and Broad. Deep means modeling complex high-order (multivariate) interactions between features; Broad means incorporating evidence from large numbers of features. The authors propose an algorithm called DBL (Deep Broad Learning) whose tunable parameter $n$ sets the depth of the model, and they report accuracy that is highly competitive with state-of-the-art methods on large quantities of data. The motivation is that most traditional machine learning algorithms were developed on small datasets, where models must be kept simple to avoid overfitting, so they may be too restricted to exploit big data. Big data, in contrast, can support very detailed models that encode higher-order multivariate distributions. Existing deep learning models capture such complex structure, but they may not be efficient enough when the number of features is large. DBL therefore targets both aspects: it models deep interactions between features and also integrates the weak information that many individual features and feature combinations carry about the class.
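To make the two terms concrete, here is a minimal Python sketch of a count-based classifier that sums log-evidence over every combination of $n$ features. It is an illustration only and is not the authors' DBL algorithm, whose details are not given in this summary. The function names `fit_counts` and `predict`, the smoothing constant `alpha`, and the crude smoothing denominator are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations
from math import log


def fit_counts(rows, labels, n):
    """Count how often each n-feature value combination occurs with each class."""
    class_counts = Counter(labels)
    combo_counts = Counter()
    for row, y in zip(rows, labels):
        # "Deep": every subset of n features is treated as one joint feature.
        for idx in combinations(range(len(row)), n):
            values = tuple(row[i] for i in idx)
            combo_counts[(idx, values, y)] += 1
    return class_counts, combo_counts


def predict(row, class_counts, combo_counts, n, alpha=1.0):
    """Return the class with the highest summed log-evidence."""
    total = sum(class_counts.values())
    scores = {}
    for y, cy in class_counts.items():
        score = log(cy / total)  # class prior
        # "Broad": add a small piece of evidence from every feature subset.
        for idx in combinations(range(len(row)), n):
            values = tuple(row[i] for i in idx)
            # Additive smoothing; the denominator is a simplification
            # (an assumption), not a properly normalised probability.
            count = combo_counts[(idx, values, y)]
            score += log((count + alpha) / (cy + 2 * alpha))
        scores[y] = score
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Toy data: the class is the XOR of two binary features.
    rows = [(0, 0), (0, 1), (1, 0), (1, 1)] * 5
    labels = [a ^ b for a, b in rows]
    class_counts, combo_counts = fit_counts(rows, labels, n=2)
    print(predict((0, 1), class_counts, combo_counts, n=2))
```

On this XOR toy data neither feature is informative on its own, so depth $n = 1$ gives tied scores for both classes, while $n = 2$ separates them. The number of feature subsets grows combinatorially with $n$ and with the number of features, which is the cost that a deep and broad model has to manage.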