MT4CrossOIE: Multi-stage Tuning for Cross-lingual Open Information Extraction

Tongliang Li, Zixiang Wang, Linzheng Chai, Jian Yang, Jiaqi Bai, Yuwei Yin, Jiaheng Liu, Hongcheng Guo, Liqun Yang, Hebboul Zine El-abidine, Zhoujun Li
DOI: https://doi.org/10.1016/j.eswa.2024.124760
IF: 8.5
2024-01-01
Expert Systems with Applications
Abstract: Cross-lingual open information extraction aims to extract structured information from raw text across multiple languages. Previous work uses a shared cross-lingual pre-trained model to handle the different languages but underuses the potential of language-specific representations. In this paper, we propose an effective multi-stage tuning framework called MT4CrossOIE, designed to enhance cross-lingual open information extraction by injecting language-specific knowledge into the shared model. Specifically, the cross-lingual pre-trained model is first tuned in a shared semantic space (e.g., the embedding matrix) with the encoder fixed, and then the other components are optimized in the second stage. After enough training, we freeze the pre-trained model and tune multiple extra low-rank language-specific modules using a mixture of LoRAs for model-based cross-lingual transfer. In addition, we leverage two-stage prompting to encourage the large language model (LLM) to annotate the multilingual raw data for data-based cross-lingual transfer. The model is trained with multilingual objectives on our proposed dataset OpenIE4++ by combining the model-based and data-based transfer techniques. Experimental results on various benchmarks emphasize the importance of aggregating multiple plug-in-and-play language-specific modules and demonstrate the effectiveness of MT4CrossOIE in cross-lingual OIE.
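To make the "frozen model plus a mixture of low-rank modules" idea concrete, the following is a minimal pure-Python sketch of one linear layer. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the language-specific modules are combined, so the softmax router used here is an assumption, and the names `LoRAModule`, `MixtureOfLoRAs`, and `router` are hypothetical.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class LoRAModule:
    """One low-rank update B @ A, standing in for a language-specific module."""

    def __init__(self, d_in, d_out, rank, rng):
        # Usual LoRA initialisation: A is small random, B is zero,
        # so the module contributes nothing before training.
        self.A = [[rng.gauss(0.0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def delta(self, x):
        return matvec(self.B, matvec(self.A, x))


class MixtureOfLoRAs:
    """Frozen weight W plus a gated sum of low-rank updates.

    Assumption: gates come from a softmax router over the input;
    the abstract does not specify the combination rule.
    """

    def __init__(self, frozen_weight, num_modules, rank, seed=0):
        rng = random.Random(seed)
        self.W = frozen_weight  # never updated in this stage
        d_out, d_in = len(frozen_weight), len(frozen_weight[0])
        self.modules = [LoRAModule(d_in, d_out, rank, rng) for _ in range(num_modules)]
        self.router = [[rng.gauss(0.0, 0.02) for _ in range(d_in)]
                       for _ in range(num_modules)]

    def forward(self, x):
        out = matvec(self.W, x)               # frozen base projection
        gates = softmax(matvec(self.router, x))
        for gate, module in zip(gates, self.modules):
            for i, d in enumerate(module.delta(x)):
                out[i] += gate * d            # gated low-rank correction
        return out, gates


if __name__ == "__main__":
    layer = MixtureOfLoRAs([[1.0, 0.0], [0.0, 1.0]], num_modules=3, rank=1)
    y, gates = layer.forward([2.0, -1.0])
    print(y, gates)
```

In this sketch only the `A`, `B`, and `router` parameters would be trained while `W` stays frozen, which mirrors the final tuning stage described in the abstract; the earlier stages (tuning the embedding matrix first, then the remaining components) are not modelled here.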