Mining Requirements Knowledge from Collections of Domain Documents

Xiaoli Lian, Mona Rahimi, Jane Cleland-Huang, Li Zhang, Remo Ferrari, Michael Smith
DOI: https://doi.org/10.1109/RE.2016.50
2016-01-01
Abstract: When organizations enter domains that are entirely new to them, they need to invest significant time and effort to acquire domain knowledge. This typically involves searching through a broad set of domain documents, retrieving relevant ones, and analyzing the textual content in order to discover and specify pertinent requirements. Depending on the nature of the domain and the availability of documentation, this task can be extremely time-consuming and may require non-trivial human effort. Furthermore, the task must often be performed repeatedly throughout the early phases of the project. In this paper we first explore the effort needed to manually build a high-level domain model capturing the functional components. We then present MaRK (Mining Requirements Knowledge), which identifies and retrieves the documents containing descriptions of functional components in the domain model. Domain analysts can use this information to specify requirements. We introduce and evaluate an algorithm which ranks domain documents according to their relevance to a component and then highlights sections of text which are likely to contain requirements-related information. We describe our process within the context of the Positive Train Control (PTC) domain with a repository of 523 documents, representing 852 MB of data. We empirically evaluate the MaRK relevance algorithm and its ability to retrieve relevant requirements knowledge for requirements related to PTC's On-Board Unit.