Abstract: Citing properly to support concepts, claims, and arguments is one of the main requirements of any scientific text. However, manually analyzing an input text to identify potential citation contexts is time-consuming and requires considerable experience. To address this, current citation recommendation systems begin with a citation worthiness identification step, also known as citation context detection, which reduces the workload and extracts an appropriate set of citation contexts. The performance of the subsequent steps depends strongly on the results of this task. Given its importance, the current study focuses on citation worthiness identification and proposes a syntax-based learning approach that produces a more accurate continuous representation of words. By considering both word and dependency context features, the proposed model effectively captures the functional characteristics of words and alleviates the long-distance dependency and polysemy problems. Moreover, various forms of feature representation based on word and syntactic embeddings are presented in order to feed the best combination of them into a convolutional neural network-based classifier.
Extensive experimental results on the standard ACL-ARC dataset demonstrate that the proposed model significantly improves on the baselines (an increase of over 0.23 in F1 score) and outperforms the state-of-the-art methods (with relative improvements of over 14% in F1 score). These characteristics make the proposed model a suitable choice to be embedded in citation recommendation systems.
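
The abstract reports one result as an absolute increase in F1 and the other as a relative improvement, and the two are easy to conflate. The following is a minimal sketch of how each quantity is computed. The function names and the confusion counts are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall, from raw counts."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def absolute_gain(new: float, old: float) -> float:
    """Difference in score, e.g. 'an increase of 0.23 in F1'."""
    return new - old


def relative_gain(new: float, old: float) -> float:
    """Difference as a fraction of the old score, e.g. 'a 14% relative improvement'."""
    return (new - old) / old


if __name__ == "__main__":
    # Hypothetical counts for a citation-worthiness classifier (NOT the paper's data).
    baseline_f1 = f1_score(tp=30, fp=40, fn=50)
    model_f1 = f1_score(tp=60, fp=30, fn=20)
    print(f"baseline F1:   {baseline_f1:.3f}")
    print(f"model F1:      {model_f1:.3f}")
    print(f"absolute gain: {absolute_gain(model_f1, baseline_f1):.3f}")
    print(f"relative gain: {relative_gain(model_f1, baseline_f1):.1%}")
```

An absolute gain is measured in F1 points, while a relative gain depends on the score it is compared against, so the same absolute difference yields a larger relative figure against a weaker comparison system.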

Towards Fine-Grained Citation Evaluation in Generated Text: A Comparative Analysis of Faithfulness Metrics

A Comparative Analysis of Faithfulness Metrics and Humans in Citation Evaluation

ALiiCE: Evaluating Positional Fine-grained Citation Generation

Learning Fine-Grained Grounded Citations for Attributed Large Language Models

Learning to Generate Answers with Citations via Factual Consistency Models

Training Language Models to Generate Text with Citations via Fine-grained Rewards

On the Capacity of Citation Generation by Large Language Models

LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA

L-CiteEval: Do Long-Context Models Truly Leverage Context for Responding?

On A Scale From 1 to 5: Quantifying Hallucination in Faithfulness Evaluation

The Probabilities Also Matter: A More Faithful Metric for Faithfulness of Free-Text Explanations in Large Language Models

When Large Language Models Meet Citation: A Survey

WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with Citations

Evaluation of Faithfulness Using the Longest Supported Subsequence

Verifiable Generation with Subsentence-Level Fine-Grained Citations

FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models

FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"

Citation Worthiness Identification for Fine-Grained Citation Recommendation Systems

Chain-of-Thought Improves Text Generation with Citations in Large Language Models

Modeling citation worthiness by using attention-based bidirectional long short-term memory networks and interpretable models

Assessing citation integrity in biomedical publications: corpus annotation and NLP models