WikiDoMiner: Wikipedia Domain-specific Miner

Saad Ezzini, Sallam Abualhaija, Mehrdad Sabetzadeh
DOI: https://doi.org/10.48550/arXiv.2206.10218
2022-06-21
Abstract: We introduce WikiDoMiner, a tool for automatically generating domain-specific corpora by crawling Wikipedia. WikiDoMiner helps requirements engineers create an external knowledge resource that is specific to the underlying domain of a given requirements specification (RS). Being able to build such a resource is important since domain-specific datasets are scarce. WikiDoMiner generates a corpus by first extracting a set of domain-specific keywords from a given RS, and then querying Wikipedia for these keywords. The output of WikiDoMiner is a set of Wikipedia articles relevant to the domain of the input RS. Mining Wikipedia for domain-specific knowledge can be beneficial for multiple requirements engineering tasks, e.g., ambiguity handling, requirements classification, and question answering. WikiDoMiner is publicly available on Zenodo under an open-source license (DOI: https://doi.org/10.5281/zenodo.6671357).
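The two-step pipeline described in the abstract (keyword extraction, then Wikipedia querying) can be illustrated with a minimal sketch. This is not WikiDoMiner's implementation: the abstract does not say how keywords are extracted, so plain frequency counting is swapped in here as an assumption. The function names `extract_keywords` and `wikipedia_search_url`, the stopword list, and the sample requirements text are all hypothetical. The sketch only builds MediaWiki search API URLs and sends no requests.

```python
import re
from collections import Counter
from urllib.parse import urlencode

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "shall", "be",
             "is", "in", "for", "with", "each", "every"}


def extract_keywords(rs_text, top_k=5):
    """Rank lowercase word tokens by raw frequency, ignoring stopwords.

    Stand-in for the tool's keyword extraction step (hypothetical helper).
    """
    tokens = re.findall(r"[a-z][a-z-]+", rs_text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_k)]


def wikipedia_search_url(keyword, limit=5):
    """Build a MediaWiki search API URL for one keyword; no request is sent."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": keyword,
        "srlimit": limit,
        "format": "json",
    }
    return "https://en.wikipedia.org/w/api.php?" + urlencode(params)


# Hypothetical two-sentence requirements specification.
rs = ("The satellite shall transmit telemetry to the ground station. "
      "The ground station shall acknowledge each telemetry frame.")

keywords = extract_keywords(rs, top_k=3)
urls = [wikipedia_search_url(k) for k in keywords]
```

Fetching each URL and collecting the returned article titles would give a rough analogue of the domain-specific corpus the abstract describes; a real pipeline would likely need stronger keyword extraction than raw term counts.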
Software Engineering