Beating the Artificial Chaos: Fighting OSN Spam Using Its Own Templates
Tiantian Zhu, Hongyu Gao, Yi Yang, Kai Bu, Yan Chen, Doug Downey, Kathy Lee, Alok N. Choudhary
DOI: https://doi.org/10.1109/tnet.2016.2557849
2016-01-01
IEEE/ACM Transactions on Networking
Abstract: Online social networks (OSNs) are extremely popular among Internet users. However, spam originating from friends and acquaintances not only reduces the joy of Internet surfing but also causes damage to less security-savvy users. Prior countermeasures combat OSN spam from different angles. Due to the diversity of spam, hardly any existing method can independently detect the majority of OSN spam. In this paper, we empirically analyze the textual patterns of a large collection of OSN spam. An inspiring finding is that the majority (e.g., 76.4% in 2015) of the collected spam is generated with underlying templates. Based on this analysis, we propose Tangram, an OSN spam filtering system that performs online inspection on the stream of user-generated messages. Tangram extracts the templates of spam detected by existing methods and then matches messages against the templates for accurate and fast spam detection. It automatically divides OSN spam into segments and uses the segments to construct templates to filter future spam. Experimental results on Twitter and Facebook data sets show that Tangram is highly accurate and can rapidly generate templates to throttle newly emerged campaigns. Furthermore, we analyze the behavior of detected OSN spammers. We find a series of spammer properties (for example, spamming accounts are created in bursts, and a single active organization orchestrates more spam than all other spammers combined) that promise more comprehensive spam countermeasures.
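The segment-and-template idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm, only a toy under stated assumptions: messages are split on whitespace, the spam messages passed in are assumed to belong to one campaign already flagged by some other detector, and token runs are aligned with Python's standard `difflib.SequenceMatcher`. The names `build_template`, `matches`, and `min_fixed` are hypothetical and do not come from the paper.

```python
from difflib import SequenceMatcher

GAP = None  # sentinel for a variable slot; never equal to a real (string) token


def tokenize(message):
    """Assumption: lowercase whitespace tokens are a good enough unit."""
    return message.lower().split()


def merge(template, tokens):
    """Keep only the token runs shared by the template and a new message."""
    matcher = SequenceMatcher(None, template, tokens, autojunk=False)
    merged = []
    for block in matcher.get_matching_blocks():
        if block.size == 0:  # terminating dummy block
            continue
        if merged:
            merged.append(GAP)  # whatever lay between two shared runs varies
        merged.extend(template[block.a:block.a + block.size])
    return merged


def build_template(spam_messages):
    """Fold known spam from one campaign into fixed segments separated by GAP."""
    template = tokenize(spam_messages[0])
    for message in spam_messages[1:]:
        template = merge(template, tokenize(message))
    return template


def segments(template):
    """Split a template into its fixed token runs."""
    segs, current = [], []
    for tok in template:
        if tok is GAP:
            if current:
                segs.append(current)
            current = []
        else:
            current.append(tok)
    if current:
        segs.append(current)
    return segs


def find_run(tokens, run, start):
    """Index of the first occurrence of run in tokens at or after start, else -1."""
    for i in range(start, len(tokens) - len(run) + 1):
        if tokens[i:i + len(run)] == run:
            return i
    return -1


def matches(template, message, min_fixed=3):
    """True if every fixed segment appears in the message, in template order."""
    segs = segments(template)
    if sum(len(s) for s in segs) < min_fixed:  # too generic to be trusted
        return False
    tokens, pos = tokenize(message), 0
    for seg in segs:
        idx = find_run(tokens, seg, pos)
        if idx < 0:
            return False
        pos = idx + len(seg)
    return True


known_spam = [
    "Hey Alice check out this free iPhone at http://a.example/1",
    "Hey Bob check out this free iPad at http://b.example/2",
    "Hey Carol check out this free laptop at http://c.example/3",
]
template = build_template(known_spam)
print(template)
print(matches(template, "Hey Dave check out this free tablet at http://d.example/4"))
print(matches(template, "Hey Dave, lunch at noon?"))
```

In this sketch the fixed segments play the role of the template's constant parts, and each gap stands for a slot the spammer fills per message (a name, a product, a URL). Text before the first segment and after the last one is left unconstrained, which is a simplification; a production filter would need stricter anchoring and a way to group messages into campaigns first.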