Study on User's Multi-Email Addresses Extraction and Association for Large Scale Network Monitoring

Qin Tao, Zhao Dan, Niu Xiaoqiang, Li Yunzhao, Wei Qiang
2013-01-01
Abstract: Because email combines the advantages of telephone communication and postal mail, it has become an indispensable form of modern communication. Most current network applications require an email address to create an account, which links people's social characteristics to their network identity attributes. For security monitoring, it is important to track a user's behavior across different identities. This paper focuses on extracting and associating users' email identities. First, the email delivery mechanisms of different email clients are introduced, together with the communication protocols and related standards. Then, an SMTP session-based method and a TCP stream assembly-based method are proposed to analyze the traffic generated by email clients and web clients, respectively. All user characteristics are successfully extracted from most email clients and webmail sites, which account for about 60 percent of the market. Finally, the association between a user's multiple email accounts and identity is established through a basic analysis of the activity profiles of the different accounts. Analysis of actual traffic traces collected from CERNET verifies the correctness of the proposed method: the identification accuracy for multiple addresses is above 80%, which is important for monitoring user behavior.
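To make the SMTP session-based idea concrete, the following is a minimal sketch of how sender and recipient addresses might be pulled from the client side of an already reassembled SMTP session. It is not the authors' implementation: the function name `extract_envelope_addresses`, the regular expression, and the sample session are all assumptions made for illustration, and the sketch relies only on the `MAIL FROM` and `RCPT TO` envelope commands defined by the SMTP standard.

```python
import re

# Assumption: the input is the list of command lines sent by the client in one
# SMTP session, already reassembled from the TCP stream.
# The pattern matches the SMTP envelope commands "MAIL FROM:<addr>" and
# "RCPT TO:<addr>", allowing optional whitespace and trailing parameters.
_ENVELOPE = re.compile(r"^(MAIL FROM|RCPT TO):\s*<([^<>\s]*)>", re.IGNORECASE)


def extract_envelope_addresses(client_lines):
    """Return (sender, recipients) found in one SMTP session (hypothetical helper)."""
    sender = None
    recipients = []
    for line in client_lines:
        match = _ENVELOPE.match(line.strip())
        if not match:
            continue  # EHLO, DATA, message body, etc. are ignored here
        command = match.group(1).upper()
        address = match.group(2).lower()
        if not address:
            continue  # empty reverse-path "<>" used for bounce messages
        if command == "MAIL FROM":
            sender = address
        elif address not in recipients:
            recipients.append(address)
    return sender, recipients


# Illustrative session; the addresses are made up.
session = [
    "EHLO client.example.org",
    "MAIL FROM:<Alice@Example.com> SIZE=512",
    "RCPT TO:<bob@example.net>",
    "rcpt to: <carol@example.net>",
    "DATA",
]
sender, recipients = extract_envelope_addresses(session)
```

Webmail traffic does not carry these commands, which is presumably why the abstract describes a separate TCP stream assembly-based method for web clients; that part is not sketched here.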