aeon: a Python toolkit for learning from time series

Matthew Middlehurst, Ali Ismail-Fawaz, Antoine Guillaume, Christopher Holder, David Guijo Rubio, Guzal Bulatova, Leonidas Tsaprounis, Lukasz Mentel, Martin Walter, Patrick Schäfer, Anthony Bagnall
2024-06-20
Abstract: aeon is a unified Python 3 library for all machine learning tasks involving time series. The package contains modules for time series forecasting, classification, extrinsic regression and clustering, as well as a variety of utilities, transformations and distance measures designed for time series data. aeon also has a number of experimental modules for tasks such as anomaly detection, similarity search and segmentation. aeon follows the scikit-learn API as much as possible to help new users and enable easy integration of aeon estimators with useful tools such as model selection and pipelines. It provides a broad library of time series algorithms, including efficient implementations of the very latest advances in research. Using a system of optional dependencies, aeon integrates a wide variety of packages into a single interface while keeping the core framework with minimal dependencies. The package is distributed under the 3-Clause BSD license and is available at https://github.com/aeon-toolkit/aeon. This version was submitted to the JMLR journal on 02 Nov 2023 for v0.5.0 of aeon. At the time of this preprint aeon has released v0.9.0, and has had substantial changes.
Machine Learning
What problem does this paper attempt to address?
This paper addresses the lack of a unified, easy-to-use, and comprehensive Python toolkit for time series machine learning (TSML). Specifically, the aeon toolkit aims to provide:

1. **Unified Interface**: A single Python 3 library for all machine learning tasks involving time series, including forecasting, classification, extrinsic regression, and clustering.
2. **Rich Algorithm Library**: A broad library of time series algorithms, including efficient implementations of recent research advances, so that users have access to state-of-the-art methods.
3. **Modular Design**: Separate modules for each task (forecasting, classification, clustering, and so on), alongside supporting modules such as distance measures and transformations, to improve flexibility and extensibility.
4. **Compatibility and Integration**: An API that follows scikit-learn as closely as possible, so that new users can get started quickly and aeon estimators work with existing model selection and pipeline tools.
5. **Experimental Modules**: Experimental modules for tasks such as anomaly detection, similarity search, and segmentation.
6. **Dependency Management**: A core framework with minimal dependencies, with the functionality of other packages integrated through a system of optional dependencies.

Through these efforts, aeon aims to become the most comprehensive time series machine learning toolkit, promoting reproducible research and community collaboration across research fields.
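To illustrate what a scikit-learn-style interface combined with a time series distance measure can look like, here is a minimal pure-Python sketch. It is not aeon's implementation or API: `dtw_distance` and `ToyNearestNeighbourClassifier` are hypothetical names introduced only for this example, and the toy series are made up. The sketch assumes univariate, fully observed series and uses plain dynamic time warping (DTW) with a squared-error cost and no warping window.

```python
from math import inf


def dtw_distance(a, b):
    """DTW distance between two univariate series (squared-error point cost)."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


class ToyNearestNeighbourClassifier:
    """Hypothetical 1-NN classifier exposing scikit-learn-style fit/predict."""

    def __init__(self, distance=dtw_distance):
        self.distance = distance

    def fit(self, X, y):
        if len(X) != len(y):
            raise ValueError("X and y must contain the same number of cases")
        # Trailing underscores mark attributes learned in fit (scikit-learn convention).
        self.X_ = [list(x) for x in X]
        self.y_ = list(y)
        return self

    def predict(self, X):
        preds = []
        for x in X:
            # Index of the training series closest to x under the chosen distance.
            best = min(range(len(self.X_)), key=lambda i: self.distance(x, self.X_[i]))
            preds.append(self.y_[best])
        return preds


# Made-up training data: two "bump" series and two "dip" series.
X_train = [
    [0.0, 0.0, 1.0, 2.0, 1.0, 0.0],
    [0.0, 1.0, 2.0, 1.0, 0.0, 0.0],
    [2.0, 2.0, 1.0, 0.0, 1.0, 2.0],
    [2.0, 1.0, 0.0, 1.0, 2.0, 2.0],
]
y_train = ["bump", "bump", "dip", "dip"]

clf = ToyNearestNeighbourClassifier().fit(X_train, y_train)
# Both query series are time-shifted versions of the training shapes.
predictions = clf.predict([
    [0.0, 0.0, 0.0, 1.0, 2.0, 1.0],
    [2.0, 2.0, 2.0, 1.0, 0.0, 1.0],
])
print(predictions)
```

Because the estimator takes its distance function as a constructor argument and returns `self` from `fit`, swapping in another distance or chaining `fit(...).predict(...)` requires no changes to the class, which is the kind of modularity the summary describes.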