APOLLO: an Optimized Training Approach for Long-form Numerical Reasoning.

Jiashuo Sun, Hang Zhang, Chen Lin, Xiangdong Su, Yeyun Gong, Jian Guo
DOI: https://doi.org/10.48550/arxiv.2212.07249
2022-01-01
Abstract: Long-form numerical reasoning in financial analysis aims to generate a reasoning program that calculates the correct answer to a given question. Previous work followed a retriever-generator framework, in which the retriever selects key facts from a long-form document and the generator produces a reasoning program based on the retrieved facts. However, these approaches treated all facts equally, without considering the different contributions of facts with and without numbers. Meanwhile, program consistency was ignored under supervised training, resulting in lower training accuracy and diversity. To solve these problems, we propose APOLLO to improve the long-form numerical reasoning framework. For the retriever, we adopt a number-aware negative sampling strategy that makes the retriever more discriminative on key numerical facts. For the generator, we design consistency-based reinforcement learning and a target program augmentation strategy, both based on the consistency of program execution results. Experimental results on the FinQA and ConvFinQA leaderboards verify the effectiveness of our proposed method, which achieves a new state of the art.
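The idea of consistency of program execution results can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a FinQA-style program syntax in which each step has the form op(arg1, arg2) and "#k" refers to the result of step k, and the names execute and consistency_reward are hypothetical. The point is only that two programs with different surface forms can count as consistent when they execute to the same value.

```python
import re

# Assumed mini-DSL (illustrative only): binary arithmetic steps,
# where "#k" refers to the result of the k-th earlier step.
OPS = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
}

# Matches one step such as "divide(#0, 5735)".
STEP = re.compile(r"(\w+)\(([^,()]+),([^,()]+)\)")


def execute(program: str) -> float:
    """Run the steps in order and return the value of the last step."""
    results = []
    for op, left, right in STEP.findall(program):
        args = []
        for token in (left.strip(), right.strip()):
            if token.startswith("#"):
                # Reference to an earlier step's result.
                args.append(results[int(token[1:])])
            else:
                args.append(float(token))
        results.append(OPS[op](*args))
    if not results:
        raise ValueError("no executable step found in program")
    return results[-1]


def consistency_reward(predicted: str, gold: str, tol: float = 1e-6) -> float:
    """Hypothetical reward: 1.0 if both programs execute to the same value."""
    try:
        same = abs(execute(predicted) - execute(gold)) <= tol
    except (KeyError, ValueError, IndexError, ZeroDivisionError):
        # Unknown operation, malformed argument, bad reference, or division by zero.
        return 0.0
    return 1.0 if same else 0.0


gold = "subtract(5829, 5735), divide(#0, 5735)"
alternative = "divide(5829, 5735), subtract(#0, 1)"
wrong = "add(5829, 5735)"

print(consistency_reward(alternative, gold))  # differently written, same result
print(consistency_reward(wrong, gold))        # different result
```

Under these assumptions, the alternative program receives the full reward even though it does not match the gold program token by token. A signal of this kind is what an execution-based objective can use and what exact-match supervision cannot.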