Model-Size Reduction for Reservoir Computing by Concatenating Internal States Through Time

Yusuke Sakemi, Kai Morino, Timothée Leleu, Kazuyuki Aihara
DOI: https://doi.org/10.1038/s41598-020-78725-0
2020-06-11
Abstract: Reservoir computing (RC) is a machine learning algorithm that can learn complex time series from data very rapidly based on the use of high-dimensional dynamical systems, such as random networks of neurons, called "reservoirs." To implement RC in edge computing, it is highly important to reduce the amount of computational resources that RC requires. In this study, we propose methods that reduce the size of the reservoir by inputting the past or drifting states of the reservoir to the output layer at the current time step. These proposed methods are analyzed based on information processing capacity, which is a performance measure of RC proposed by Dambre et al. (2012). In addition, we evaluate the effectiveness of the proposed methods on time-series prediction tasks: the generalized Henon-map and NARMA. On these tasks, we found that the proposed methods were able to reduce the size of the reservoir up to one tenth without a substantial increase in regression error. Because the applications of the proposed methods are not limited to a specific network structure of the reservoir, the proposed methods could further improve the energy efficiency of RC-based systems, such as FPGAs and photonic systems.
Machine Learning
What problem does this paper attempt to address?
The problem this paper addresses is how to reduce the size of the reservoir in reservoir computing (RC) without significantly degrading performance, and thereby reduce the computational resources that RC requires. Specifically, the authors propose concatenating the internal states of the reservoir through time, which makes RC better suited to resource-limited settings such as edge computing.

### Background and Motivation

Reservoir computing is a machine learning algorithm that can rapidly learn complex time series from data using high-dimensional dynamical systems (such as random networks of neurons). However, conventional RC models usually require substantial computational resources, which is a challenge in edge computing, where computing power and battery capacity are limited. Reducing the computational resource requirements of RC is therefore crucial.

### Proposed Methods

The authors propose three methods to reduce the size of the reservoir:

1. **Delay-state Concatenation**:
   - Increases the effective dimension by feeding past reservoir states to the output layer at the current time step.
   - Mathematically represented as:
     \[ \hat{x}(t)=\begin{pmatrix} x(t) \\ x(t - Q) \\ \vdots \\ x(t - PQ) \end{pmatrix} \]
     where \(x(t)\) is the reservoir state at time step \(t\), \(P\) is the number of past states in the concatenation, and \(Q\) is the delay unit.
2. **Drift-state Concatenation**:
   - Introduces new reservoir states (called drift states) and feeds them to the output layer at the current time step.
   - The update formula for the drift state is:
     \[ x_{\text{drift}}(t';t)=\begin{cases} \tanh(W_{\text{drift}}x(t)), & \text{if }t' = 1,\\ \tanh(W_{\text{drift}}x_{\text{drift}}(t' - 1;t)), & \text{if }t'\geq2. \end{cases} \]
3. **Delay-state Concatenation with Transient States**:
   - Adds transient states to delay-state concatenation, so that the reservoir state is updated twice between input and output.

### Experimental Results

The authors analyze the effectiveness of these methods in terms of information processing capacity (IPC) and verify them on two time-series prediction tasks (the generalized Hénon map and NARMA). The experimental results show that these methods can reduce the size of the reservoir to as little as one-tenth of its original size without a substantial increase in regression error.

### Conclusion

With these methods, RC can significantly reduce its demand for computational resources while maintaining performance, and so adapt better to resource-constrained environments such as edge computing. In addition, because these methods do not depend on a specific network structure of the reservoir, they can be applied to various hardware implementations, such as FPGAs and photonic systems.
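
The delay-state and drift-state concatenations described under Proposed Methods can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The reservoir update `x = tanh(W x + w_in u)`, the spectral-radius rescaling, and the function names `run_reservoir`, `delay_concat`, and `drift_concat` are assumptions made for illustration. So is the choice to concatenate `P` drift states, since the summary does not say how many are used.

```python
import numpy as np


def run_reservoir(u, n_nodes, rho=0.9, seed=0):
    """Drive a small tanh reservoir with a scalar series u.

    Assumed update rule (not given in the summary):
        x(t) = tanh(W x(t-1) + w_in u(t))
    Returns an array of shape (T, n_nodes), one state per time step.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_nodes, n_nodes))
    # Rescale the recurrent weights to spectral radius rho (assumed value).
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))
    w_in = rng.uniform(-1.0, 1.0, size=n_nodes)
    x = np.zeros(n_nodes)
    states = np.empty((len(u), n_nodes))
    for t, u_t in enumerate(u):
        x = np.tanh(W @ x + w_in * u_t)
        states[t] = x
    return states


def delay_concat(states, P, Q):
    """Delay-state concatenation.

    Each output row is [x(t), x(t-Q), ..., x(t-PQ)] for t = P*Q, ..., T-1,
    so the first P*Q time steps (which lack a full history) are dropped.
    """
    T = states.shape[0]
    start = P * Q
    blocks = [states[start - p * Q : T - p * Q] for p in range(P + 1)]
    return np.hstack(blocks)


def drift_concat(states, W_drift, P):
    """Drift-state concatenation.

    Each output row is [x(t), x_drift(1;t), ..., x_drift(P;t)], where
    x_drift(1;t) = tanh(W_drift x(t)) and
    x_drift(t';t) = tanh(W_drift x_drift(t'-1;t)) for t' >= 2.
    Using P drift states is an assumption of this sketch.
    """
    blocks = [states]
    x_d = states
    for _ in range(P):
        # Row-wise form of tanh(W_drift @ x) for every time step at once.
        x_d = np.tanh(x_d @ W_drift.T)
        blocks.append(x_d)
    return np.hstack(blocks)


# Usage: a 5-node reservoir whose readout sees 15 features per time step.
u = np.sin(0.1 * np.arange(50))
states = run_reservoir(u, n_nodes=5)
X_delay = delay_concat(states, P=2, Q=3)            # shape (44, 15)
X_drift = drift_concat(states, 0.5 * np.eye(5), P=2)  # shape (50, 15)
```

In both cases the reservoir itself keeps its small size, and only the feature vector passed to the linear readout grows by a factor of `P + 1`. This is how a smaller reservoir can present the same number of features to the output layer as a larger one.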