A Survey of Imbalanced Learning on Graphs: Problems, Techniques, and Future Directions

Zemin Liu, Yuan Li, Nan Chen, Qian Wang, Bryan Hooi, Bingsheng He
2023-08-29
Abstract: Graphs represent interconnected structures prevalent in a myriad of real-world scenarios. Effective graph analytics, such as graph learning methods, enables users to gain profound insights from graph data, underpinning various tasks including node classification and link prediction. However, these methods often suffer from data imbalance, a common issue in graph data where certain segments possess abundant data while others are scarce, thereby leading to biased learning outcomes. This necessitates the emerging field of imbalanced learning on graphs, which aims to correct these data distribution skews for more accurate and representative learning outcomes. In this survey, we embark on a comprehensive review of the literature on imbalanced learning on graphs. We begin by providing a definitive understanding of the concept and related terminologies, establishing a strong foundational understanding for readers. Following this, we propose two comprehensive taxonomies: (1) the problem taxonomy, which describes the forms of imbalance we consider, the associated tasks, and potential solutions; (2) the technique taxonomy, which details key strategies for addressing these imbalances, and aids readers in their method selection process. Finally, we suggest prospective future directions for both problems and techniques within the sphere of imbalanced learning on graphs, fostering further innovation in this critical area.
Machine Learning, Artificial Intelligence
### What problem does this paper attempt to solve?

This paper focuses on Imbalanced Learning on Graphs (ILoGs). Specifically, it addresses the following aspects:

1. **Imbalance Phenomenon**:
   - In practical applications, graph data often follow an imbalanced distribution, where some classes have abundant samples while others have few. Models trained on such data tend to perform well on high-resource groups but poorly on low-resource groups, which biases the learning outcomes.
2. **Types of Imbalance in Graph Learning**:
   - Because graph data are interconnected, traditional imbalanced learning methods cannot always be applied directly. The paper discusses several types of imbalance in graph data, including class imbalance at the node, edge, and graph levels, as well as structural imbalance.
3. **Review and Classification of Existing Research**:
   - The paper proposes two comprehensive taxonomies: a problem taxonomy and a technique taxonomy. The problem taxonomy describes the forms of imbalance, the associated tasks, and potential solutions; the technique taxonomy details key strategies for addressing these imbalances and helps readers choose an appropriate method.
4. **Future Research Directions**:
   - The paper also suggests future research directions for both problems and techniques in imbalanced learning on graphs, aiming to foster further innovation in this field.

Taken together, the paper aims to provide a systematic framework for imbalanced learning on graphs, helping researchers better understand and address imbalance in graph data.
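
To make the two kinds of imbalance named above more concrete, the following is a minimal sketch in plain Python. It is an illustration only and is not taken from the survey: the toy graph, the label names, and the helper functions `class_imbalance_ratio`, `inverse_frequency_weights`, and `node_degrees` are all hypothetical. It measures node-level class imbalance as the ratio between the largest and smallest class, derives inverse-frequency class weights (one common re-weighting heuristic, assumed here as an example of a possible remedy), and computes node degrees to show how a hub node can dominate the structure.

```python
from collections import Counter


def class_imbalance_ratio(labels):
    """Ratio of the largest class size to the smallest class size."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def inverse_frequency_weights(labels):
    """Per-class weight n / (k * count_c), so rarer classes get larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


def node_degrees(num_nodes, edges):
    """Degree of each node in an undirected graph given as an edge list."""
    deg = [0] * num_nodes
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


# Hypothetical toy graph: 6 nodes, class "a" is the majority, node 0 is a hub.
labels = ["a", "a", "a", "a", "b", "b"]
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (4, 5)]

ratio = class_imbalance_ratio(labels)        # class imbalance (node level)
weights = inverse_frequency_weights(labels)  # candidate loss weights per class
degrees = node_degrees(len(labels), edges)   # structural imbalance (degree skew)

print(ratio, weights, degrees)
```

In this toy setting the minority class `"b"` receives twice the weight of the majority class `"a"`, and node 0 has a much higher degree than the remaining nodes. In a real pipeline the weights would typically be passed to a weighted loss, and the degree skew would be examined over the full degree distribution.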