Abstract: Implied semantics is a complex language act that can appear anywhere in cyberspace. The prevalence of implied spam texts, such as implied pornography, sarcasm, and abuse hidden within novels, tweets, microblogs, or reviews, can be extremely harmful to the physical and mental health of teenagers. The non-literal meaning of implied text is hard for machine models to understand because it is highly context-sensitive and relies heavily on figurative language. In this study, inspired by human reading comprehension, we propose a novel, simple, and effective deep neural framework, called the Skim and Intensive Reading Model (SIRM), for figuring out implied textual meaning. The proposed SIRM consists of three main components: the skim reading component, the intensive reading component, and the adversarial training component. The skim reading component, a combination of several convolutional neural networks, quickly extracts n-gram features as skim (entire) information. The intensive reading component enables a hierarchical investigation of both sentence-level and paragraph-level representations, encapsulating the current (local) embedding and the contextual information (context) with a dense connection. More specifically, the contextual information includes the near-neighbor information and the skim information mentioned above. Finally, besides the common training loss function, we employ an adversarial loss function as a penalty over the skim reading component to eliminate noisy information (noise) arising from special figurative words in the training data. To verify the effectiveness, robustness, and efficiency of the proposed architecture, we conduct extensive comparative experiments on an industrial novel dataset involving implied pornography and three sarcasm benchmarks.
Experimental results indicate that (1) the proposed model, which benefits from context and local modeling and from its consideration of figurative language (noise), outperforms existing state-of-the-art solutions with comparable parameter scale and running speed; (2) the SIRM shows superior robustness to changes in parameter size; and (3) compared with ablation and addition variants of the SIRM, the final framework is efficient enough.
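
To make the skim reading idea concrete, the following is a minimal sketch of n-gram feature extraction with convolution filters of several widths followed by max-over-time pooling. It is not the authors' implementation: the function names (conv1d_max, skim_features, make_filters), the ReLU activation, the max pooling, and the random filter weights are all assumptions chosen for illustration, and the intensive reading and adversarial components are not shown.

```python
import random


def conv1d_max(embeddings, kernel, bias=0.0):
    """Slide one n-gram filter over the token sequence, apply ReLU,
    and keep the maximum response (max-over-time pooling).

    embeddings: list of token vectors, each a list of floats
    kernel:     list of n weight vectors, one per position in the n-gram
    """
    n = len(kernel)  # window width, the "n" in n-gram
    best = 0.0       # ReLU floor; also the result if the text is shorter than n
    for start in range(len(embeddings) - n + 1):
        window = embeddings[start:start + n]
        score = bias + sum(
            w * x
            for k_row, e_row in zip(kernel, window)
            for w, x in zip(k_row, e_row)
        )
        best = max(best, score)
    return best


def make_filters(widths, per_width, dim, seed=0):
    """Hypothetical helper: random filters for each n-gram width."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(w)]
        for w in widths
        for _ in range(per_width)
    ]


def skim_features(embeddings, filters):
    """One pooled feature per filter: a fixed-size summary of the whole text."""
    return [conv1d_max(embeddings, kernel) for kernel in filters]


# Toy input: three tokens with 2-dimensional embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# A hand-written bigram filter that responds to the pattern [1, 0] then [0, 1].
bigram = [[1.0, 0.0], [0.0, 1.0]]
print(conv1d_max(tokens, bigram))  # 2.0

# Unigram, bigram, and trigram filters together give the skim vector.
filters = make_filters(widths=[1, 2, 3], per_width=2, dim=2)
print(len(skim_features(tokens, filters)))  # 6
```

Because each filter is pooled over all positions, the resulting vector has the same length regardless of text length, which is what would let it serve as whole-text context for a later, more detailed reading stage.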

Making the Implicit Explicit: Implicit Content as a First Class Citizen in NLP

Visualizing and Understanding Neural Models in NLP

ImpScore: A Learnable Metric For Quantifying The Implicitness Level of Language

Toward Cultural Interpretability: A Linguistic Anthropological Framework for Describing and Evaluating Large Language Models (LLMs)

Show, Don't Tell: Uncovering Implicit Character Portrayal using LLMs

Deceptively simple: An outsider's perspective on natural language processing

Think Beyond the Word: Understanding the Implied Textual Meaning by Digesting Context, Local, and Noise

Implicit Dimension Identification in User-Generated Text with LSTM Networks

Implicit Personalization in Language Models: A Systematic Study

Implicit meta-learning may lead language models to trust more reliable sources

From Form(s) to Meaning: Probing the Semantic Depths of Language Models Using Multisense Consistency

Explicating the Implicit: Argument Detection Beyond Sentence Boundaries

Linguistic Properties Matter for Implicit Discourse Relation Recognition: Combining Semantic Interaction, Topic Continuity and Attribution

Great Service! Fine-grained Parsing of Implicit Arguments

Can Your Model Tell a Negation from an Implicature? Unravelling Challenges With Intent Encoders

Decoding Probing: Revealing Internal Linguistic Structures in Neural Language Models using Minimal Pairs

ExpressivityArena: Can LLMs Express Information Implicitly?

The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models

Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions

Latent Concept-based Explanation of NLP Models

Modelling Compositionality and Structure Dependence in Natural Language