Learning to Answer Programming Questions with Software Documentation Through Social Context Embedding.

Jing Li,Aixin Sun,Zhenchang Xing
DOI: https://doi.org/10.1016/j.ins.2018.03.014
IF: 8.1
2018-01-01
Information Sciences
Abstract:Official software documentation provides a comprehensive overview of software usages, but not on specific programming tasks or use cases. Often there is a mismatch between the documentation and a question on a specific programming task because of different wordings. We observe from Stack Overflow that the best answers to programmers’ questions often contain links to formal documentation. In this paper, we propose a novel deep-learning-to-answer framework, named QDLinker, for answering programming questions with software documentation. QDLinker learns from the large volume of discussions in community-based question answering site to bridge the semantic gap between programmers’ questions and software documentation. Specifically, QDLinker learns question-documentation semantic representation from these question answering discussions with a four-layer neural network, and incorporates semantic and content features into a learning-to-rank schema. Our approach does not require manual feature engineering or external resources to infer the degree of relevance between a question and documentation. Through extensive experiments, results show that QDLinker effectively answers programming questions with direct links to software documentation. QDLinker significantly outperforms the baselines based on traditional retrieval models and Web search services dedicated for software documentation retrieval. The user study shows that QDLinker effectively bridges the semantic gap between the intent of a programming question and the content of software documentation.
What problem does this paper attempt to address?