TIDMAD: Time Series Dataset for Discovering Dark Matter with AI Denoising

J. T. Fry, Aobo Li, Lindley Winslow, Xinyi Hope Fu, Zhenghao Fu, Kaliroe M. W. Pappas
2024-06-06
Abstract: Dark matter makes up approximately 85% of total matter in our universe, yet it has never been directly observed in any laboratory on Earth. The origin of dark matter is one of the most important questions in contemporary physics, and a convincing detection of dark matter would be a Nobel-Prize-level breakthrough in fundamental science. The ABRACADABRA experiment was specifically designed to search for dark matter. Although it has not yet made a discovery, ABRACADABRA has produced several dark matter search results widely endorsed by the physics community. The experiment generates ultra-long time-series data at a rate of 10 million samples per second, where the dark matter signal would manifest itself as a sinusoidal oscillation mode within the ultra-long time series. In this paper, we present TIDMAD, a comprehensive data release from the ABRACADABRA experiment including three key components: an ultra-long time-series dataset divided into training, validation, and science subsets; a carefully designed denoising score for direct model benchmarking; and a complete analysis framework which produces a community-standard dark matter search result suitable for publication as a physics paper. This data release enables core AI algorithms to extract the signal and produce real physics results, thereby advancing fundamental science. The data downloading and associated analysis scripts are available at https://github.com/jessicafry/TIDMAD
Instrumentation and Methods for Astrophysics, Machine Learning, High Energy Physics - Experiment
What problem does this paper attempt to address?
The paper addresses the question of how to detect dark matter signals in ultra-long time-series data using AI denoising, thereby advancing dark matter research. Its main objectives are:

1. **Improve the sensitivity of dark matter signal detection**:
   - Dark matter accounts for approximately 85% of the matter in the universe, but it has not yet been directly observed in any laboratory on Earth. The ABRACADABRA experiment aims to search for dark matter, but in the ultra-long time-series data it generates (10 million samples per second), a dark matter signal would usually be buried in noise from various sources.
   - To improve detection sensitivity, the researchers introduce machine-learning (ML) denoising, with the goal of extracting potential dark matter signals more effectively.
2. **Provide a comprehensive dataset and evaluation framework**:
   - The paper introduces TIDMAD (Time Series Dataset for Discovering Dark Matter with AI Denoising), an ultra-long time-series dataset generated by the ABRACADABRA experiment. The dataset is divided into training, validation, and science subsets for training and evaluating denoising algorithms.
   - TIDMAD also includes a carefully designed denoising score for directly comparing different models, and an analysis framework that produces dark matter search results meeting the standards of the physics community.
3. **Tighten dark matter limits**:
   - By applying AI denoising algorithms, researchers can set stricter dark matter limits using data collected over a relatively short time (for example, 24 hours). Although TIDMAD contains only about 1% as much data as ABRA Run 3, after denoising its limit is close to, or even exceeds, the ABRA Run 3 result.
   - This indicates that AI denoising significantly improves the sensitivity of dark matter detection, opening new possibilities for future larger-scale experiments.

In summary, by introducing AI denoising, the paper aims to improve the sensitivity of dark matter signal detection and provides a comprehensive dataset and evaluation framework for dark matter research, thereby promoting progress in the field.
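
To make the "sinusoidal oscillation within a noisy time series" picture concrete, the following is a minimal toy sketch, not the TIDMAD analysis framework or its denoising score. It injects a weak sine wave into Gaussian noise and recovers its frequency from the peak of the power spectrum. Only the sample rate comes from the abstract. The signal frequency, amplitude, noise model, record length, and the helper name `peak_frequency` are assumptions chosen for illustration.

```python
import numpy as np


def peak_frequency(samples, sample_rate_hz):
    """Return the frequency (Hz) of the strongest non-DC bin in the power spectrum."""
    power = np.abs(np.fft.rfft(samples)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate_hz)
    power[0] = 0.0  # ignore any constant (DC) offset
    return freqs[np.argmax(power)]


# Sample rate quoted in the abstract: 10 million samples per second.
SAMPLE_RATE_HZ = 10_000_000

# Simple arithmetic on that rate: samples produced in 24 hours of continuous running.
SAMPLES_PER_DAY = SAMPLE_RATE_HZ * 24 * 60 * 60

# Everything below is a hypothetical toy setup, not taken from the paper.
N_SAMPLES = 1_000_000        # 0.1 s of toy data, giving 10 Hz frequency resolution
SIGNAL_FREQ_HZ = 1_250_000   # assumed frequency of the injected sine wave
SIGNAL_AMPLITUDE = 0.05      # far below the noise standard deviation of 1.0

rng = np.random.default_rng(seed=0)
t = np.arange(N_SAMPLES) / SAMPLE_RATE_HZ
signal = SIGNAL_AMPLITUDE * np.sin(2 * np.pi * SIGNAL_FREQ_HZ * t)
noise = rng.normal(loc=0.0, scale=1.0, size=N_SAMPLES)

# The sine is invisible sample by sample, but its power concentrates in one
# frequency bin while the noise power spreads across all bins.
recovered_hz = peak_frequency(signal + noise, SAMPLE_RATE_HZ)
print(f"samples per day: {SAMPLES_PER_DAY:.3e}")
print(f"recovered peak frequency: {recovered_hz:.0f} Hz")
```

The sketch only illustrates why a longer or cleaner time series sharpens a narrow spectral peak. The noise in real detector data is not expected to be as simple as the white Gaussian noise assumed here, which is the gap that learned denoising models aim to address.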