Enhanced Utility-Driven Data Anonymization: Leveraging AI and Machine Learning for Sensitive Data Privacy

Saidaiah Yechuri
DOI: https://doi.org/10.60087/jaigs.v1i1.229
2024-01-22
Abstract: As the transition to electronic data formats continues, ensuring privacy while maintaining the utility of sensitive data, such as medical records, remains a critical challenge. This paper introduces an enhanced utility-driven data anonymization method that leverages AI and machine learning techniques to optimize data utility during the anonymization process. Specifically, we propose integrating AI-driven feature selection to dynamically assign importance scores to attributes, improving upon traditional generalization and suppression techniques. Additionally, machine learning models are utilized to predict the impact of anonymization on data utility, enabling a more precise balance between privacy protection and research value. Our approach not only ensures compliance with k-anonymity but also integrates differential privacy mechanisms using AI to minimize information loss. Experimental results demonstrate that our method scales efficiently with large datasets, while ML-based evaluation consistently outperforms traditional methods in preserving critical data patterns essential for research and analytics. This fusion of AI and ML into the anonymization process promises a new frontier in privacy-preserving data sharing, particularly in domains like healthcare and public policy, where data utility is paramount.
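For readers unfamiliar with the k-anonymity requirement mentioned in the abstract, the following is a minimal sketch of what the property means and how generalization can help satisfy it. It is not the paper's method: the records, the `generalize_age` rule, and the choice of quasi-identifiers (`age`, `zip`) are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def generalize_age(age, width=10):
    """Map an exact age to a band such as '30-39' (hypothetical generalization rule)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values occurs at least k times."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(c >= k for c in counts.values())


# Toy records, invented for this example only.
records = [
    {"age": 31, "zip": "12345", "diagnosis": "flu"},
    {"age": 35, "zip": "12345", "diagnosis": "asthma"},
    {"age": 38, "zip": "12345", "diagnosis": "flu"},
    {"age": 42, "zip": "12399", "diagnosis": "diabetes"},
    {"age": 47, "zip": "12399", "diagnosis": "flu"},
]

# Replace each exact age with its 10-year band; other attributes are unchanged.
generalized = [{**r, "age": generalize_age(r["age"])} for r in records]

raw_ok = is_k_anonymous(records, ["age", "zip"], k=2)
generalized_ok = is_k_anonymous(generalized, ["age", "zip"], k=2)
```

With exact ages every record is unique on (`age`, `zip`), so the raw table fails the check for k = 2, while the banded version passes. The cost is coarser age information, which is the privacy-versus-utility trade-off the abstract describes.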