SEREIA: document store exploration through keywords
Ariel Afonso, Paulo Martins, Altigran da Silva
DOI: https://doi.org/10.1007/s10115-024-02151-1
IF: 2.7
2024-06-12
Knowledge and Information Systems
Abstract: The adoption of document stores, such as MongoDB or CouchDB, has increased drastically in recent years. Part of this popularity can certainly be explained by their flexibility in loading, storing, and retrieving semi-structured data on massive scales. However, adopting such systems presents challenges when exploring the data they store, since documents may not follow a single pattern and may instead present complex hierarchical and nested structures that vary from one document to another. Additionally, an analyst who wants to retrieve data may experience difficulties since she must learn the specificities of the document store's native query language. In this work, we propose SEREIA, a system that facilitates data exploration in document stores through keyword search. The user inputs an unstructured keyword-based query, and the system generates a structured query for the document store that fulfils her information needs. We evaluated SEREIA using five datasets previously used in the literature, and the results indicate that SEREIA is suitable for helping users in data exploration tasks by removing the burden of understanding the data organization of the stored documents and by automatically generating queries to explore data of interest.
computer science, information systems, artificial intelligence
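
Illustrative sketch (not from the paper): the abstract describes turning a keyword query into a structured query for a document store. The Python below is a deliberately naive illustration of that idea and is not SEREIA's actual algorithm. It scans a few sample documents for the field paths whose string values contain each keyword, then assembles a MongoDB-style filter from them. The function names (`flatten`, `keyword_paths`, `build_filter`) and the sample documents are hypothetical; only the `$and`, `$or`, `$regex`, and `$options` operators and the dotted-path notation belong to MongoDB's query language.

```python
import re
from typing import Any, Dict, Iterator, List, Tuple


def flatten(doc: Any, prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted_path, scalar_value) pairs for a nested document."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(doc, list):
        # MongoDB dot notation reaches into arrays without an index,
        # so list items keep the path of the enclosing field.
        for item in doc:
            yield from flatten(item, prefix)
    else:
        yield prefix, doc


def keyword_paths(docs: List[dict], keyword: str) -> List[str]:
    """Return the sorted field paths whose string values contain the keyword."""
    needle = keyword.lower()
    paths = set()
    for doc in docs:
        for path, value in flatten(doc):
            if isinstance(value, str) and needle in value.lower():
                paths.add(path)
    return sorted(paths)


def build_filter(docs: List[dict], keywords: List[str]) -> Dict[str, Any]:
    """Build a filter requiring every keyword to match in some known path."""
    clauses = []
    for kw in keywords:
        paths = keyword_paths(docs, kw)
        if not paths:
            continue  # keyword not seen in the sample; skipped (assumption)
        pattern = {"$regex": re.escape(kw), "$options": "i"}
        clauses.append({"$or": [{path: pattern} for path in paths]})
    return {"$and": clauses} if clauses else {}


# Hypothetical sample with two different document shapes.
SAMPLE_DOCS = [
    {"title": "Inception", "year": 2010,
     "cast": [{"name": "Leonardo DiCaprio"}]},
    {"name": "Titanic", "crew": {"director": "James Cameron"},
     "cast": [{"name": "Leonardo DiCaprio"}]},
]

query = build_filter(SAMPLE_DOCS, ["dicaprio", "cameron"])
```

The resulting dictionary could be passed as a filter to a MongoDB `find` call. A real system would also need to match keywords against attribute names, rank candidate queries, and avoid scanning the data at query time, none of which this sketch attempts.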