Finding Camouflaged Needle in a Haystack?
Guoxiu He,Yangyang Kang,Zhe Gao,Zhuoren Jiang,Changlong Sun,Xiaozhong Liu,Wei Lu,Qiong Zhang,Luo Si
DOI: https://doi.org/10.1145/3331184.3331197
2019-01-01
Abstract:It is an important and urgent research problem for decentralized eCommerce services, e.g., eBay, eBid, and Taobao, to detect illegal products, e.g., unclassified pornographic products. However, it is a challenging task as some sellers may utilize and change camouflaged text to deceive the current detection algorithms. In this study, we propose a novel task to dynamically locate the pornographic products from very large product collections. Unlike prior product classification efforts focusing on textual information, the proposed model, BerryPIcking TRee MoDel (BIRD), utilizes both product textual content and buyers' seeking behavior information as berrypicking trees. In particular, the BIRD encodes both semantic information with respect to all branches sequence and the overall latent buyer intent during the whole seeking process. An extensive set of experiments have been conducted to demonstrate the advantage of the proposed model against alternative solutions. To facilitate further research of this practical and important problem, the codes and buyers' seeking behavior data have been made publicly available1.