Using Wikidata lexemes and items to generate text from abstract representations

Mahir Morshed
DOI: https://doi.org/10.3233/sw-243564
2024-06-15
Semantic Web
Abstract: Ninai/Udiron, a living function-based natural language generation system, uses knowledge in Wikidata lexemes and items to transform abstract representations of factual statements into human-readable text. The combined system first produces syntax trees based on those abstract representations (Ninai) and then yields sentences from those syntax trees (Udiron). The system relies on information about individual lexical units and links to the concepts those units represent, as well as rules encoded in various types of functions to which users may contribute, to make decisions about words, phrases, and other morphemes to use and how to arrange them. Various system design choices work toward using the information in Wikidata lexemes and items efficiently and effectively, making different components individually contributable and extensible, and making the overall resultant outputs from the system expectable and analyzable. These targets accompany the intentions for Ninai/Udiron to ultimately power the Wikipedia project as well as be hosted on the Wikifunctions project.
computer science, information systems, artificial intelligence, theory & methods