A review of unsupervised learning in astronomy

Sotiria Fotopoulou

DOI: https://doi.org/10.1016/j.ascom.2024.100851

2024-06-25

Abstract: This review summarizes popular unsupervised learning methods, and gives an overview of their past, current, and future uses in astronomy. Unsupervised learning aims to organise the information content of a dataset, in such a way that knowledge can be extracted. Traditionally this has been achieved through dimensionality reduction techniques that aid the ranking of a dataset, for example through principal component analysis or by using auto-encoders, or simpler visualisation of a high dimensional space, for example through the use of a self organising map. Other desirable properties of unsupervised learning include the identification of clusters, i.e. groups of similar objects, which has traditionally been achieved by the k-means algorithm and more recently through density-based clustering such as HDBSCAN. More recently, complex frameworks have emerged, that chain together dimensionality reduction and clustering methods. However, no dataset is fully unknown. Thus, nowadays a lot of research has been directed towards self-supervised and semi-supervised methods that stand to gain from both supervised and unsupervised learning.

Instrumentation and Methods for Astrophysics, Machine Learning

What problem does this paper attempt to address?

This paper reviews the use of unsupervised learning methods in astronomy. Its core aim is to summarize the unsupervised learning techniques applied in the field over the past 30 years and to explore how they help extract knowledge from astronomical data. The paper first defines the goal of unsupervised learning: organizing the information content of a dataset, without explicit labels, so that knowledge can be extracted. It then outlines the traditional approaches: dimensionality reduction through Principal Component Analysis (PCA) and autoencoders (AE), and visualization of high-dimensional spaces through Self-Organizing Maps (SOM). It also discusses clustering, i.e. the identification of groups of similar objects, traditionally done with the k-means algorithm and more recently with density-based methods such as HDBSCAN. The paper notes the recent emergence of complex frameworks that chain dimensionality reduction and clustering together, as well as the growing body of work on self-supervised and semi-supervised methods, which combine the advantages of supervised and unsupervised learning. It also traces the history of machine learning applications in astronomy, from early digital astronomy through the multi-wavelength era, the period of computational mainstreaming, and the machine learning revolution, to the current meta-algorithm stage, emphasizing throughout how advances in hardware, software, and data availability have shaped astronomical research. Finally, the paper offers suggestions for future applications, based on the author's reading of the existing literature and personal experience. In summary, it is a systematic review of unsupervised learning in astronomy and of how these methods help address the field's big data challenges.
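As a rough illustration of what "chaining dimensionality reduction and clustering" can look like in practice, the sketch below projects a synthetic feature table onto two principal components and then runs k-means on the result. It is not taken from the paper: the data are randomly generated, the helper names `pca_reduce` and `kmeans` and all parameter values are assumptions made for this example, and the bare-bones NumPy implementations stand in for the library versions (for instance in scikit-learn) one would more likely use on a real catalogue.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project X onto its leading principal components (computed via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids on k distinct, randomly chosen data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Synthetic stand-in for a catalogue: two well-separated groups of objects
# in a 10-dimensional feature space (hypothetical data, not from the paper).
rng = np.random.default_rng(42)
group_a = rng.normal(loc=0.0, scale=1.0, size=(100, 10))
group_b = rng.normal(loc=8.0, scale=1.0, size=(100, 10))
features = np.vstack([group_a, group_b])

# Step 1: dimensionality reduction. Step 2: clustering in the reduced space.
embedding = pca_reduce(features, n_components=2)
labels, centroids = kmeans(embedding, k=2)
```

Here the number of clusters `k` has to be chosen in advance, which is one reason density-based methods such as HDBSCAN, mentioned in the summary above, can be attractive when the number of groups is not known.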

Machine Learning in Astronomy: a practical overview

Enabling unsupervised discovery in astronomical images through self-supervised representations

Surveying image segmentation approaches in astronomy

Surveying the reach and maturity of machine learning and artificial intelligence in astronomy

A brief review of contrastive learning applied to astrophysics

Big Universe, Big Data: Machine Learning and Image Analysis for Astronomy

Applications of AI in Astronomy

Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy

The Dawes Review 10: The impact of deep learning for the analysis of galaxy surveys

Semi-supervised classification and clustering analysis for variable stars

Unsupervised Machine Learning for Exploratory Data Analysis of Exoplanet Transmission Spectra

Scientific Data Mining in Astronomy

An Astronomical Image Content-based Recommendation System Using Combined Deep Learning Models in a Fully Unsupervised Mode

An Astronomers Guide to Machine Learning

Astronomaly at scale: Searching for anomalies amongst 4 million galaxies

A Review Based Study on Different Techniques of AI Used in Exoplanet Detection

How Do Observational Astronomers Learn to Inspect Imaging Data

Unsupervised and lightly supervised learning in particle physics