Explainable AI through a Democratic Lens: DhondtXAI for Proportional Feature Importance Using the D'Hondt Method

Turker Berk Donmez
2024-11-08
Abstract: In democratic societies, electoral systems play a crucial role in translating public preferences into political representation. Among these, the D'Hondt method is widely used to ensure proportional representation, balancing fair representation with governmental stability. Recently, there has been a growing interest in applying similar principles of proportional representation to enhance interpretability in machine learning, specifically in Explainable AI (XAI). This study investigates the integration of D'Hondt-based voting principles in the DhondtXAI method, which leverages resource allocation concepts to interpret feature importance within AI models. Through a comparison of SHAP (SHapley Additive exPlanations) and DhondtXAI, we evaluate their effectiveness in feature attribution within CatBoost and XGBoost models for breast cancer and diabetes prediction, respectively. The DhondtXAI approach allows for alliance formation and thresholding to enhance interpretability, representing feature importance as seats in a parliamentary view. Statistical correlation analyses between SHAP values and DhondtXAI allocations support the consistency of interpretations, demonstrating DhondtXAI's potential as a complementary tool for understanding feature importance in AI models. The results highlight that integrating electoral principles, such as proportional representation and alliances, into AI explainability can improve user understanding, especially in high-stakes fields like healthcare.
Artificial Intelligence, Digital Libraries, Machine Learning
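
For readers unfamiliar with the D'Hondt method named in the abstract, the following is a minimal Python sketch of the highest-averages rule, with feature importances treated as votes. It is an illustration only and not the DhondtXAI implementation: the function name `dhondt_seats`, the feature names, and the vote counts are all hypothetical.

```python
from typing import Dict


def dhondt_seats(votes: Dict[str, int], total_seats: int) -> Dict[str, int]:
    """Allocate seats with the D'Hondt highest-averages rule."""
    seats = {name: 0 for name in votes}
    for _ in range(total_seats):
        # Each candidate's quotient is votes / (seats already won + 1);
        # the next seat goes to the candidate with the largest quotient.
        winner = max(votes, key=lambda name: votes[name] / (seats[name] + 1))
        seats[winner] += 1
    return seats


# Hypothetical feature importances, already scaled to integer "votes".
feature_votes = {"feature_a": 500, "feature_b": 300, "feature_c": 200}
allocation = dhondt_seats(feature_votes, total_seats=10)
print(allocation)
```

With these made-up votes, the 10 seats split 5, 3, and 2, mirroring the 50/30/20 vote shares.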
### What problems does this paper attempt to solve?

This paper applies the principle of proportional representation from democratic electoral systems to the explainability of machine-learning models (Explainable AI, XAI), introducing the D'Hondt method to improve the interpretation of feature importance. Specifically, it addresses the following problems:

1. **Improving the explainability of AI models**:
   - The paper proposes a new method, DhondtXAI, which borrows the D'Hondt method from political elections to assign importance weights to features. The method is intended to reflect each feature's impact on the model prediction more fairly, and it provides an intuitive visualization resembling the distribution of parliamentary seats, helping users understand the model's decision-making process.
2. **Combining proportional representation and coalition mechanisms**:
   - Traditional feature importance methods (such as SHAP) may overlook the collective contribution of secondary features. By introducing the concept of a "coalition", DhondtXAI allows features to be grouped, which better captures complex interactions between features and helps reveal subtle influences that traditional methods may miss.
3. **Setting thresholds to filter unimportant features**:
   - DhondtXAI allows a minimum importance threshold to be set; features or coalitions below this threshold are excluded from the final explanation. This ensures that only features with a significant impact on the model prediction are highlighted, improving the clarity and readability of the explanation.
4. **Verifying the effectiveness of the new method**:
   - The paper evaluates how well DhondtXAI explains feature importance through comparative experiments against the existing SHAP method. The experiments use CatBoost and XGBoost models on breast cancer and diabetes datasets, respectively, and assess the consistency and complementarity of the two methods through statistical correlation analysis.
5. **Application areas**:
   - The research pays special attention to high-risk areas such as healthcare, where improving the transparency and explainability of AI models is crucial for users' trust and understanding.

### Formula summary

- **Information gain formula**:
  \[ \Delta I_{A,n}=I(n)-I'(n) \]
  where \(I(n)\) is the impurity of node \(n\) before the split, and \(I'(n)\) is the weighted impurity after the split.
- **Feature importance calculation formula**:
  \[ \text{Importance}_{A,\text{tree}}=\sum_{n\in N_A}\Delta I_{A,n} \]
  \[ \text{Importance}_{A,\text{ensemble}}=\frac{1}{|T|}\sum_{t\in T}\text{Importance}_{A,t} \]
- **Initial vote distribution formula**:
  \[ \text{initial\_vote}_i = \frac{\text{importance}_i}{\sum_{j\in F'}\text{importance}_j}\times V \]
- **Threshold calculation formula**:
  \[ \text{threshold\_vote}=\frac{\text{threshold}}{100}\times V \]
- **Vote redistribution formula**:
  \[ \text{redistributed\_vote}_j=\text{initial\_vote}_j+\left(\frac{\text{importance}_j}{\sum_{k\in\text{above\_threshold}}\text{importance}_k}\right)\times\sum_{