IAD: A Benchmark Dataset and a New Method for Illegal Advertising Classification.

Zebo Liu, Kehan Li, Xu Tan, Jiming Chen
DOI: https://doi.org/10.3233/faia200331
2020-01-01
Abstract: While online advertising has become ubiquitous and a pillar of the Internet economy, a growing number of illegal ads contain misleading or deceptive content and hinder the healthy development of online advertising. Detecting illegal advertising and classifying it according to the provisions it violates is critical for legal supervision. However, because datasets are difficult to acquire and expert knowledge of advertising is lacking, benchmark datasets and methods for illegal advertising classification are scarce. In this paper, we collect and release a large-scale dataset for illegal advertising classification (called IAD, short for illegal ads), which contains the content of illegal ads and the corresponding violated provisions. Based on the IAD dataset, we further propose a novel method called IAD-Net to classify the provisions violated by illegal ads. IAD-Net mainly adopts an interactive attention-based parallel LSTM network, in which the parallel structure integrates the provision into the classification process, which is equivalent to using prior information to supervise the classification. In addition, IAD-Net introduces an auxiliary embedding layer to enhance the semantics of words in short ads, and an interactive attention mechanism to capture the relationship between the words in an ad and its legality. We conduct a comprehensive study on the IAD dataset and benchmark several previous methods as well as the proposed IAD-Net for illegal advertising classification. Experimental results demonstrate that IAD-Net achieves good accuracy and outperforms all the previous methods on the IAD dataset. We believe the proposed IAD dataset and IAD-Net will help accelerate research on illegal advertising classification.
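The abstract does not give the equations behind the interactive attention mechanism, so the following is only a minimal sketch of one common form of interactive attention between two parallel encodings, not the authors' IAD-Net. It assumes that the ad and the provision have each already been encoded into a list of hidden-state vectors (for example by two parallel LSTMs), that each side is queried by the mean-pooled states of the other side, and that dot-product scoring is used. All function names and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_pool(vectors):
    """Average a list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(states, query):
    """Weight each state by its softmax-normalised dot product with the query."""
    weights = softmax([dot(s, query) for s in states])
    dim = len(query)
    summary = [sum(w * s[i] for w, s in zip(weights, states)) for i in range(dim)]
    return summary, weights


def interactive_attention(ad_states, provision_states):
    """Assumed scheme: each side is attended using the pooled other side."""
    ad_query = mean_pool(ad_states)
    provision_query = mean_pool(provision_states)
    ad_summary, ad_weights = attend(ad_states, provision_query)
    provision_summary, provision_weights = attend(provision_states, ad_query)
    # Concatenate both summaries into one feature vector for a classifier.
    return ad_summary + provision_summary, ad_weights, provision_weights


# Toy hidden states standing in for LSTM outputs (hypothetical values).
ad_states = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
provision_states = [[2.0, 0.0], [2.0, 0.0]]
features, ad_weights, provision_weights = interactive_attention(
    ad_states, provision_states
)
```

In this toy input the first ad state points in the same direction as the provision states, so it receives the largest attention weight. In a real model the concatenated feature vector would feed a classification layer, and the scoring function would typically be learned rather than a plain dot product.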