Anomalies resolution and semantification of tabular data

Sumit Sharma, Sarika Jain
DOI: https://doi.org/10.1007/s11227-024-06147-0
IF: 3.3
2024-05-12
The Journal of Supercomputing
Abstract: The fast growth of the web generates a significant amount of heterogeneous information, such as images, text, audio, and video, through various applications. These applications use different layouts to represent significant information. Tabular layouts are overloaded with anomalies, which has prompted intensive research into the semantification of web content and the organization of tabular data for knowledge sharing and acquisition. These anomalies also lead to a lack of semantic representation in tabular form and pose new challenges in data modeling. In this paper, we discuss the anomalies in tabular data that pertain to ontology learning and population tasks, and we provide the semantification of tabular data. To this end, (1) we list the anomalies that pertain to semantification and resolve them along with the semantification of tabular data, and (2) we establish an algorithm that interprets the table structure into a formal representation in order to analyze and resolve the anomalies. Furthermore, the proposed approach is compared with existing approaches in terms of ontology elements, the ability to resolve anomalies, and the time complexity of ontology population.
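To illustrate what resolving anomalies and interpreting a table into a formal representation might look like, here is a minimal Python sketch. It is not the authors' algorithm: the abstract does not list the anomalies or the representation used. The three anomaly types (empty or duplicate header names, rows of the wrong length, missing cell values), the subject-predicate-object triple output, and the names `normalize_header`, `table_to_triples`, `row_N`, and `column_N` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object); an assumed target form


def normalize_header(header: List[str]) -> List[str]:
    """Resolve two illustrative header anomalies: empty and duplicate names."""
    seen: Dict[str, int] = {}
    result: List[str] = []
    for index, name in enumerate(header):
        # An empty header cell gets a positional placeholder name.
        name = name.strip() or f"column_{index}"
        count = seen.get(name, 0)
        seen[name] = count + 1
        # A repeated name gets a numeric suffix so predicates stay distinct.
        result.append(name if count == 0 else f"{name}_{count + 1}")
    return result


def table_to_triples(
    header: List[str], rows: List[List[str]]
) -> Tuple[List[Triple], List[str]]:
    """Turn a table into triples and report the anomalies found on the way."""
    columns = normalize_header(header)
    triples: List[Triple] = []
    anomalies: List[str] = []
    for row_index, row in enumerate(rows):
        if len(row) != len(columns):
            anomalies.append(
                f"row {row_index}: expected {len(columns)} cells, got {len(row)}"
            )
            # Pad short rows and truncate long ones so cells align with columns.
            row = (row + [""] * len(columns))[: len(columns)]
        subject = f"row_{row_index}"
        for column, cell in zip(columns, row):
            value = cell.strip()
            if not value:
                # A missing value is reported and produces no triple.
                anomalies.append(f"row {row_index}: missing value for '{column}'")
                continue
            triples.append((subject, column, value))
    return triples, anomalies


if __name__ == "__main__":
    found_triples, found_anomalies = table_to_triples(
        ["name", "", "name"],
        [["Ada", "36", "Lovelace"], ["Alan", ""]],
    )
    for triple in found_triples:
        print(triple)
    for anomaly in found_anomalies:
        print("anomaly:", anomaly)
```

In an ontology population setting, the row subjects and column predicates would presumably be mapped to ontology individuals and properties. That mapping is left out here because the abstract does not describe it.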
computer science, theory & methods; engineering, electrical & electronic; hardware & architecture