Abstract: App reviews in mobile app stores contain useful information that is used to improve applications and promote software evolution. This information is processed by automatic tools that prioritize reviews. To carry out this prioritization, reviews are decomposed into features such as category and sentiment. A weighted function then assigns a weight to each feature, and a review ranking is calculated. Unfortunately, extracting category and sentiment from reviews requires at least one classifier trained on an annotated corpus, which makes the task computationally demanding. In this work, we therefore propose Shannon Entropy as a simple feature that can replace standard features. Our results show that a Shannon Entropy based ranking is better than a standard ranking according to the NDCG metric. This result is promising even if we require fairness with respect to algorithmic bias. Finally, we highlight a computational limit that appears in the search for the best ranking.

What problem does this paper attempt to address?

The main problem this paper addresses is how to extract and use the information in user feedback (especially app reviews) more effectively, in order to improve applications and support software evolution. Specifically, the paper explores the following issues:

1. **Optimal Weight Combination**:
   - What combination of feature weights produces review rankings that, according to the NDCG (Normalized Discounted Cumulative Gain) metric, are closest to the rankings manually labeled by experts?
   - The ranking score is expressed by the formula \[ R(c)=\sum_{i = 1}^{4}w_i\cdot f_i(c) \] where \(w_i\) is the weight of the \(i\)-th feature and \(f_i(c)\) is the scoring factor of the \(i\)-th feature for review \(c\).
2. **Effectiveness of Shannon Entropy as a Feature**:
   - Can Shannon entropy replace traditional features (such as category and sentiment) for review ranking? The paper tests this experimentally and finds that an entropy-based ranking performs better than one based on the traditional features.
   - The formula is \[ H(X)=-\sum_{i = 1}^{n}p(x_i)\log_2 p(x_i) \] where \(p(x_i)\) is the probability of occurrence of the character \(x_i\).
3. **Impact of Calculation Precision on Performance**:
   - When the precision of the weights is increased (for example, from two decimal places to three), do the computational resources and time required exceed what is practically feasible? The paper reports that three-decimal-place precision causes a sharp increase in the number of weight combinations, and with it a significant increase in calculation time and disk space.
4. **Algorithmic Bias and Its Mitigation**:
   - How can country-related bias produced by the ranking algorithm be detected and mitigated? The paper uses the AIF360 tool to detect bias and applies the re-weighting algorithm for mitigation, but finds that mitigation reduces the NDCG value.
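The entropy formula and the NDCG comparison above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the three sample reviews, and their expert relevance grades are hypothetical, and the NDCG variant used here (linear gain, \(\log_2\) discount) is one common definition that may differ from the one the authors used.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy of a review, in bits.

    Uses p * log2(1/p), which equals -p * log2(p) from the formula above.
    """
    if not text:
        return 0.0
    counts = Counter(text)
    n = len(text)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def dcg(relevances):
    """Discounted cumulative gain with linear gain and log2 discount."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(relevances_in_ranked_order):
    """DCG of the produced ranking divided by the DCG of the ideal ranking."""
    ideal = dcg(sorted(relevances_in_ranked_order, reverse=True))
    return dcg(relevances_in_ranked_order) / ideal if ideal > 0 else 0.0


# Hypothetical data: (review text, expert relevance grade). Not from the paper.
reviews = [
    ("aaaa", 0),
    ("App crashes when I upload a photo on Android 12", 3),
    ("good app", 1),
]

# Rank reviews by entropy alone, highest first, then score against the grades.
ranked = sorted(reviews, key=lambda r: shannon_entropy(r[0]), reverse=True)
score = ndcg([rel for _, rel in ranked])
```

The point of the sketch is that the entropy feature needs no trained classifier or annotated corpus: it is computed from character counts alone, whereas category and sentiment features would each require a model.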
### Main Conclusions

- **Advantages of Shannon Entropy as a Feature**: Shannon entropy can effectively replace traditional category and sentiment features, and is simple to calculate without the need for complex machine-learning models and annotated corpora.
- **Computational Resource Limitations**: As the weight precision increases, the computational resource requirements grow exponentially, and the search becomes infeasible at three-decimal-place precision.
- **Bias and Fairness**: Although Shannon entropy improves the ranking accuracy, there are still biases among different countries, and further research is needed on how to optimize the ranking while maintaining fairness.

These research results provide new ideas for future user-feedback processing, especially in terms of feature selection and computational efficiency.
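To give a feel for the computational limit, the sketch below counts how many weight vectors an exhaustive search would have to evaluate for four weights at a given decimal precision. The summary does not say how the paper enumerates weights, so both counting schemes are assumptions: one treats each weight as an independent grid value in \([0, 1]\), the other additionally requires the weights to sum to 1. The function names are hypothetical.

```python
import math


def free_grid_size(num_weights: int, decimals: int) -> int:
    """Number of weight vectors when each weight independently takes a value
    in {0, 10^-d, 2*10^-d, ..., 1} (assumption: no sum constraint)."""
    steps = 10 ** decimals
    return (steps + 1) ** num_weights


def constrained_grid_size(num_weights: int, decimals: int) -> int:
    """Number of weight vectors on the same grid whose entries sum to 1
    (assumption), counted with the stars-and-bars formula."""
    steps = 10 ** decimals
    return math.comb(steps + num_weights - 1, num_weights - 1)


for d in (2, 3):
    print(d, free_grid_size(4, d), constrained_grid_size(4, d))
```

Under either assumption, moving from two to three decimal places multiplies the search space by roughly three to four orders of magnitude, and every candidate needs a full ranking and NDCG evaluation, which is consistent with the reported jump in time and disk usage.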

Shannon Entropy is better Feature than Category and Sentiment in User Feedback Processing

Mining software insights: uncovering the frequently occurring issues in low-rating software applications

Novel feature selection approaches for improving the performance of sentiment classification

On the automatic classification of app reviews

Generalizing Machine Learning Evaluation through the Integration of Shannon Entropy and Rough Set Theory

Personalized Review Ranking for Improving Shopper's Decision Making: A Term Frequency based Approach

A Novel Dual of Shannon Information and Weighting Scheme

The Good, The Bad & The Ugly Features: A Meta-analysis on User Review About Food Journaling Apps

Improving Review Representations with User Attention and Product Attention for Sentiment Classification

On the Emotion of Users in App Reviews

Using Entropy for Group Sampling in Pairwise Ranking from Implicit Feedback

Feature-Level Rating System Using Customer Reviews and Review Votes

Assessing Information Transmission in Data Transformations with the Channel Multivariate Entropy Triangle

A Novel Product Ranking Approach Considering Sentiment Intensity Distribution of Online Reviews

Integrated shannon entropy and COPRAS optimal model-based recommendation framework

User Bias Removal in Review Score Prediction

User Personalization based Product Ranking using Sentimental Reviews

Perceiving University Student's Opinions from Google App Reviews

A HYBRID DEEP LEARNING APPROACH FOR SENTIMENT ANALYSIS IN PRODUCT REVIEWS

CDNB: CAVIAR-Dragonfly Optimization with Naive Bayes for the Sentiment and Affect Analysis in Social Media

Neighbour adjusted dispersive flies optimization based deep hybrid sentiment analysis framework