Do Not Pull My Data for Resale

Guosai Wang, Shiyang Xiang, Yitao Duan, Ling Huang, Wei Xu
DOI: https://doi.org/10.1145/3209978.3210158
2018-01-01
Abstract: Data providers contribute profoundly to fields such as finance, economics, and academia by offering both web-based and API-based query services for specialized data. Some data users, however, are resellers who abuse the query APIs to retrieve the data and resell it for profit, which harms the data provider's interests and infringes its copyright. In this work, we define the "anti-data-reselling" problem and propose a new systematic method that combines feature engineering with machine learning models to address it. We apply our method to a real query log of over 9,000 users, with limited labels, provided by a large financial data provider, and obtain reasonable results and insightful observations, leading to real deployments.