Imitate TheWorld: A Search Engine Simulation Platform

Yongqing Gao, Guangda Huzhang, Weijie Shen, Yawen Liu, Wen-Ji Zhou, Qing Da, Yang Yu
DOI: https://doi.org/10.48550/arXiv.2107.07693
2021-08-10
Abstract: Recent e-commerce applications benefit from the growth of deep learning techniques. However, we notice that many works attempt to maximize business objectives by closely matching offline labels, following the supervised learning paradigm. This results in models that obtain high offline performance in terms of Area Under Curve (AUC) and Normalized Discounted Cumulative Gain (NDCG) but cannot consistently increase revenue metrics such as the purchase amount of users. To address these issues, we build a simulated search engine, AESim, that can properly give feedback on generated pages through a well-trained discriminator, serving as a dynamic dataset. Unlike previous simulation platforms, which lose connection with the real world, ours depends on real data from AliExpress Search: we use adversarial learning to generate virtual users and Generative Adversarial Imitation Learning (GAIL) to capture the behavior patterns of users. Our experiments also show that AESim can better reflect the online performance of ranking models than classic ranking metrics, implying that AESim can serve as a surrogate for AliExpress Search and evaluate models without going online.
Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the gap between the offline and online performance of Learning-to-Rank (LTR) models in e-commerce: models that score well in offline evaluation often fail to consistently improve business metrics, such as user purchases, once deployed. Specifically, many existing works train models to fit offline labels under the supervised learning paradigm; the resulting models score highly on offline metrics such as AUC and NDCG but do not reliably improve online business performance. To tackle this, the authors build a simulated search engine platform called AESim. A well-trained discriminator provides feedback on the pages that ranking models generate, so the platform acts as a dynamic dataset. Unlike previous simulation platforms, AESim relies on real AliExpress Search data: it uses adversarial learning to generate virtual users and Generative Adversarial Imitation Learning (GAIL) to capture user behavior patterns. The experiments show that AESim reflects the online performance of ranking models better than classic ranking metrics, suggesting that it can serve as a surrogate for AliExpress Search when evaluating models without online tests.
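For readers unfamiliar with the two offline metrics named above, the following is a minimal sketch of how AUC and NDCG are commonly computed. It is not the authors' evaluation code. The function names (`dcg`, `ndcg`, `auc`) and the toy inputs are assumptions made for illustration, and the NDCG variant shown uses linear gain (the raw relevance label) rather than the exponential gain some implementations prefer.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain of a ranked list of relevance labels.

    Uses linear gain: the item at 0-based position i contributes
    rel / log2(i + 2).
    """
    rels = relevances if k is None else relevances[:k]
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(rels))


def ndcg(relevances, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


def auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs in which the
    positive item gets the higher score; ties count as half a pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = purchased, 0 = not purchased.
print(ndcg([0, 0, 1]))             # purchased item ranked last
print(auc([1, 0, 1], [0.9, 0.2, 0.6]))
```

Both metrics are computed against fixed logged labels, so they cannot account for how users would react to a page the logged system never showed. That is the limitation the summary above attributes to purely offline evaluation.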