Improving Decision Sparsity

Yiyang Sun, Tong Wang, Cynthia Rudin
2024-10-27
Abstract: Sparsity is a central aspect of interpretability in machine learning. Typically, sparsity is measured in terms of the size of a model globally, such as the number of variables it uses. However, this notion of sparsity is not particularly relevant for decision-making; someone subjected to a decision does not care about variables that do not contribute to the decision. In this work, we dramatically expand a notion of decision sparsity called the Sparse Explanation Value (SEV) so that its explanations are more meaningful. SEV considers movement along a hypercube towards a reference point. By allowing flexibility in that reference and by considering how distances along the hypercube translate to distances in feature space, we can derive sparser and more meaningful explanations for various types of function classes. We present cluster-based SEV and its variant tree-based SEV, introduce a method that improves credibility of explanations, and propose algorithms that optimize decision sparsity in machine learning models.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper aims to improve decision sparsity in machine-learning models, that is, to reduce the amount of information that an individual decision depends on. The authors extend the concept of "Sparse Explanation Value" (SEV) to provide more concise and more meaningful explanations. Traditionally, sparsity is measured globally, for example by the number of variables the model uses. That global measure is not particularly relevant to individuals affected by the model's decisions, who care instead about the local sparsity of their own decision.

### The main contributions of the paper include:

1. **Extension of the SEV concept**:
   - Proposes cluster-based SEV and tree-based SEV to improve the proximity and credibility of explanations.
   - Proposes flexible-reference SEV, which adjusts the position of the reference point to further improve sparsity.
2. **New optimization algorithms**:
   - Proposes two methods, one gradient-based and one search-based, that optimize machine-learning models to minimize their average SEV without sacrificing prediction performance.
3. **Enhancing the credibility of explanations**:
   - Improves credibility by ensuring that explanations lie in high-density regions of negative-class samples.

### Specific problem descriptions:

- **Sparsity problem**: Traditional global sparsity measures (such as the number of variables used in the model) are not relevant enough for individuals subject to a decision, who care about which factors directly affect their own outcome.
- **Credibility problem of explanations**: Many counterfactual explanations are valid but unnatural because they involve unrealistic feature changes (for example, changing the credit history of a 21-year-old applicant from 3 years to 15 years).
- **Optimizing the SEV of the model**: How to train a classifier that minimizes the SEV of each query while maintaining prediction performance.

### Solutions:

- **Clustering and tree structures**: Generate multiple reference points through clustering algorithms and use tree structures to find the closest reference point, thereby improving the proximity and sparsity of explanations.
- **Flexible reference points**: Allow the reference point to move within a certain range to further reduce the SEV.
- **Optimization algorithms**: Two optimization algorithms, one gradient-based and one search-based, minimize the average SEV of the model.

These improvements enable the SEV framework to provide sparser, closer, and more credible explanations for various types of function classes, thereby improving the interpretability and transparency of machine-learning models in practical applications.
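
To make the idea of moving toward a reference point concrete, here is a minimal Python sketch. It is not the authors' implementation. It reads SEV as the fewest features of a positively classified query that must be replaced by the reference's values before the prediction flips to negative, and finds that count by brute force. The toy classifier `predict`, the two reference points (standing in for cluster centroids of negative-class samples), and the helper names `sev_minus` and `nearest_reference` are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def predict(x):
    """Toy classifier (an assumption): positive (1) when x[0] + x[1] > 5."""
    return 1 if x[0] + x[1] > 5 else 0


def nearest_reference(query, references):
    """Pick the reference point closest to the query in Euclidean distance."""
    return min(references, key=lambda r: dist(query, r))


def sev_minus(predict_fn, query, reference):
    """Brute-force search for the smallest set of features that, when copied
    from `reference` into `query`, makes `predict_fn` return 0 (negative).

    Returns (count, feature_indices, explanation_point), or None if no
    subset of features flips the prediction.
    """
    if predict_fn(query) == 0:
        return 0, (), list(query)
    n = len(query)
    for k in range(1, n + 1):                 # try the sparsest subsets first
        for subset in combinations(range(n), k):
            candidate = list(query)
            for j in subset:                  # move along hypercube edges
                candidate[j] = reference[j]
            if predict_fn(candidate) == 0:
                return k, subset, candidate
    return None


# Hypothetical reference points; both are classified negative by `predict`.
references = [[0.0, 0.0], [3.0, 1.0]]
query = [4.0, 3.0]                            # classified positive

global_result = sev_minus(predict, query, references[0])
local_ref = nearest_reference(query, references)
local_result = sev_minus(predict, query, local_ref)

print("single reference :", global_result, dist(query, global_result[2]))
print("nearest reference:", local_result, dist(query, local_result[2]))
```

In this toy case both references give an explanation that changes one feature, but the explanation built from the nearest reference lies closer to the query, which is the proximity benefit the summary attributes to using multiple reference points. The exhaustive search is exponential in the number of features and is only meant to illustrate the definition.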