Exploring Artificial Intelligence Architecture in Data Cleaning Based on Bayesian Networks

Suzhen Zhang, Yuechun Wang, Qing Lv
DOI: https://doi.org/10.1155/2022/6731781
2022-09-14
Advances in Multimedia
Abstract: To further improve data cleaning and data mining techniques and to avoid the weakness of traditional Bayesian networks in expressing uncertain knowledge, a Bayesian network algorithm based on combined data cleaning and mining technology is proposed, and a manual functional data cleaning architecture based on Hadoop is constructed. The results show that, for the same amount of data, the traditional nearest neighbor sorting algorithm with a window size of 5 takes the least time, while the same algorithm with a window size of 7 always takes the longest. The nonfixed-window nearest neighbor sorting algorithm initially takes about as long as the traditional algorithm with a window size of 5. However, as the data volume increases, its running time rises rapidly until it approaches that of the traditional algorithm with a window size of 7. The algorithm therefore improves the precision of data cleaning at the expense of cleaning speed. The authors conclude that the artificial intelligence architecture based on combined data significantly improves the efficiency of the algorithm and can effectively analyze and process large data sets.
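The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general fixed-window idea it compares against: sort the records, then compare each record with the few records that follow it in sorted order. The function name `sorted_neighbor_duplicates`, the `threshold` parameter, the sort key, and the use of `difflib` string similarity are all assumptions made for illustration, not the authors' implementation. The sketch also counts comparisons to show why a larger window costs more time.

```python
from difflib import SequenceMatcher


def sorted_neighbor_duplicates(records, window=5, threshold=0.85):
    """Find likely duplicate records with a fixed sliding window.

    Hypothetical illustration: the sort key (normalized string) and the
    similarity measure (difflib ratio) are assumptions, not the paper's.
    Returns (sorted list of index pairs, number of comparisons made).
    """
    # Normalize once so sorting and matching use the same form.
    keys = [r.strip().lower() for r in records]
    # Sort record indices by key so similar records end up adjacent.
    order = sorted(range(len(records)), key=lambda i: keys[i])

    pairs = []
    comparisons = 0
    for pos, i in enumerate(order):
        # Compare only with the next (window - 1) records in sorted order.
        for j in order[pos + 1 : pos + window]:
            comparisons += 1
            if SequenceMatcher(None, keys[i], keys[j]).ratio() >= threshold:
                pairs.append((min(i, j), max(i, j)))
    return sorted(pairs), comparisons


records = ["John Smith", "Jon Smith", "Alice Wong", "john smith ", "Bob Lee"]
small_pairs, small_cost = sorted_neighbor_duplicates(records, window=3)
large_pairs, large_cost = sorted_neighbor_duplicates(records, window=5)
print(small_pairs, small_cost, large_cost)
```

A wider window examines more candidate pairs per record, which can catch duplicates that sort farther apart but raises the comparison count. A nonfixed window, as described in the abstract, would adjust this width per record rather than keeping it constant.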