Mapping the Learning Curves of Deep Learning Networks

Yanru Jiang, Rick Dale
DOI: https://doi.org/10.1101/2024.07.01.601491
2024-07-04
Abstract: There is an important challenge in systematically interpreting the internal representations of deep neural networks. This study introduces a multi-dimensional quantification and visualization approach which can capture two temporal dimensions of a model learning experience: the "information processing trajectory" and the "developmental trajectory." The former represents the influence of incoming signals on an agent's decision-making, while the latter conceptualizes the gradual improvement in an agent's performance throughout its lifespan. Tracking the learning curves of a DNN enables researchers to explicitly identify the model appropriateness of a given task, examine the properties of the underlying input signals, and assess the model's alignment (or lack thereof) with human learning experiences. To illustrate the method, we conducted 750 runs of simulations on two temporal tasks: gesture detection and natural language processing (NLP) classification, showcasing its applicability across a spectrum of deep learning tasks. Based on the quantitative analysis of the learning curves across two distinct datasets, we have identified three insights gained from mapping these curves: nonlinearity, pairwise comparisons, and domain distinctions. We reflect on the theoretical implications of this method for cognitive processing, language models and multimodal representation.
Scientific Communication and Education
What problem does this paper attempt to address?
This paper addresses the challenge of systematically interpreting the internal representations of deep neural networks (DNNs). It introduces a multi-dimensional quantification and visualization method that captures two temporal dimensions of a model's learning process: the "information processing trajectory" and the "developmental trajectory". The former represents the influence of incoming signals on an agent's decisions; the latter describes how the agent's performance gradually improves over its lifespan. By tracking a DNN's learning curves, researchers can identify how appropriate a model is for a given task, examine the properties of the underlying input signals, and assess the model's alignment (or lack thereof) with human learning experiences. The paper illustrates the method with 750 simulation runs on two temporal tasks, gesture detection and natural language processing (NLP) classification, to show its applicability across a range of deep learning tasks. From a quantitative analysis of the learning curves across the two datasets, it identifies three insights: nonlinearity, pairwise comparisons, and domain distinctions. The authors also reflect on the method's theoretical implications for cognitive processing, language models, and multimodal representation.
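The abstract does not spell out how the two trajectories are quantified, so the following is only a minimal sketch of one way to lay them out on a single grid. It assumes (hypothetically) that a model's predicted label has been recorded after every input time step at every saved training epoch. The function name `learning_curve_grid` and the array layout are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def learning_curve_grid(preds, labels):
    """Accuracy indexed by (epoch, time step).

    preds  : array of shape (n_epochs, n_samples, n_steps) holding the
             predicted label after each input time step, for each saved
             training epoch (hypothetical layout).
    labels : array of shape (n_samples,) holding the true labels.
    """
    preds = np.asarray(preds)
    labels = np.asarray(labels)
    # Broadcast the labels across epochs and time steps, then average
    # correctness over the sample axis.
    correct = preds == labels[None, :, None]
    return correct.mean(axis=1)

# Toy example: 2 epochs, 2 samples, 3 time steps.
labels = [1, 0]
preds = [
    [[0, 0, 1], [1, 1, 0]],  # epoch 0
    [[0, 1, 1], [0, 0, 0]],  # epoch 1
]
grid = learning_curve_grid(preds, labels)

# Row of the grid: accuracy as more of the input arrives within one epoch
# (an "information processing" view).
info_processing_last_epoch = grid[-1, :]
# Column of the grid: accuracy at a fixed time step across epochs
# (a "developmental" view).
developmental_final_step = grid[:, -1]
```

Reading the grid along rows versus columns separates the two temporal dimensions, which is the distinction the summary draws; any real analysis would follow the quantification defined in the paper itself.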