Abstract: Paraphrasing, conveying the same meaning in different ways, is an intrinsic part of natural languages. The research field of Automatic Paraphrasing encompasses the tasks of collecting, identifying, and generating paraphrases in an automatic or computer-aided manner. In addition, researchers have investigated the contribution of automatic paraphrasing techniques to many natural language applications, such as question answering (QA), information extraction (IE), multi-document summarization (MDS), and machine translation (MT). For example, in machine translation, paraphrases have been used for rewriting and simplifying input sentences, enlarging translation phrase tables, expanding human references for automatic evaluation, and so forth. This special section of ACM TIST is intended to cover state-of-the-art research in automatic paraphrasing. In particular, we highlight the applications of paraphrasing techniques in real-world systems, such as MT systems and search engines. Seven articles are included in the special section. One of them is about paraphrase extraction from monolingual corpora, while the other six discuss the applications of paraphrases, including paraphrasing for machine translation, sentence compression, word meaning computing, and plagiarism detection. Three articles focus on applying paraphrasing techniques to MT. These articles cover the three main research directions mentioned above, namely, source sentence rewriting, phrase table enlargement, and human reference expansion. In “Using Targeted Paraphrasing and Monolingual Crowdsourcing to Improve Translation” by Philip Resnik, Olivia Buzek, Yakov Kronrod, Chang Hu, Alexander J. Quinn, and Benjamin B. Bederson, the authors propose using crowdsourcing to enhance the translation quality of a statistical machine translation (SMT) system.
A notable advantage of the proposed method is that it requires only monolingual workers to identify target-side translation errors and supply source-side paraphrases, rather than relying on workers with bilingual expertise. The proposed solution could provide a more cost-effective approach to translation in scenarios where machine translation would be acceptable if only its quality were generally high enough. It could also vastly reduce the human effort required when bilingual translators post-edit machine translation output. In the article “Distributional Phrasal Paraphrase Generation for Statistical Machine Translation” by Yuval Marton, the author focuses on extracting paraphrases to improve the coverage of the translation model. The proposed method extracts paraphrases from large-scale monolingual corpora based on distributional similarity. The extracted paraphrases are then used to augment a translation phrase table with pairs not covered by the initial table. The novelty of the proposed method lies in its being language-independent, and hence it does not rely on bitexts for generating paraphrases or new phrase pairs. In “Generating Targeted Paraphrases for Improved Translation” by Nitin Madnani and Bonnie Dorr, the authors use automatic paraphrase generation to tune the parameters of an SMT system. Specifically, given a single reference translation, they build a paraphrase generation system that produces several semantically equivalent variants, which can then be used as additional reference translations. Experimental results on several language pairs demonstrate that the proposed approach can improve translation quality. Furthermore, this article presents

Beyond Pivot For Extracting Chinese Paraphrases

Revisiting Pivot-Based Paraphrase Generation - Language Is Not the Only Optional Pivot.

Extracting Paraphrase Patterns from Bilingual Parallel Corpora

Paraphrase Extraction from Interactive Q&A Communities

Automatic Acquisition of Context-Specific Lexical Paraphrases

Exploring Key Concept Paraphrasing Based on Pivot Language Translation for Question Retrieval

ECNU: Leveraging Word Embeddings to Boost Performance for Paraphrase in Twitter.

Neural paraphrasing by automatically crawled and aligned sentence pairs

PKU Paraphrase Bank: A Sentence-Level Paraphrase Corpus for Chinese.

Leveraging multiple MT engines for paraphrase generation

Introduction to Special Section on Paraphrasing

Paraphrasing with search engine query logs

Enriching SMT Training Data Via Paraphrasing.

Question Paraphrases for QA from Encarta Logs

Extracting paraphrases from a parallel corpus

Extract, Transform and Filling: A Pipeline Model for Question Paraphrasing Based on Template.

Learning Question Paraphrases For Qa From Encarta Logs

A Survey on Paraphrasing Technology

Exploring Diverse Expressions for Paraphrase Generation

Phrasal Paraphrase Acquisition Based on Bilingual Corpus

Extracting Paraphrases of Technical Terms from Noisy Parallel Software Corpora.