Mining E-Commercial Data: A Text-Rich Heterogeneous Network Embedding Approach
Weizheng Chen, Chi Liu, Jun Yin, Hongfei Yan, Yan Zhang
DOI: https://doi.org/10.1109/IJCNN.2017.7966017
2017-01-01
Abstract: Modeling and mining e-commercial data is a great challenge because the data is made up of multiple types of objects, such as products, users, comments, and tags. To model the complicated interactive relationships in e-commercial data, we propose transforming the data into a text-rich heterogeneous e-commercial network. We then propose three neural-network-based embedding algorithms, WTL (Weighted Text Learning), IBL (Identity Based Learning), and IBTSL (Identity Based Two Steps Learning), which learn embeddings from both the network structure information and the attribute identity information of heterogeneous nodes. The key idea of our models is to map all objects in the e-commercial network into the same low-dimensional vector space, which produces meaningful features for many applications, such as product classification, comment classification, product attribute forecasting, and recommendation. We compare our algorithms with existing advanced methods on a real large-scale e-commercial dataset and evaluate the effectiveness of the learned embeddings on several applications. The experimental results show that the embeddings generated by our algorithms achieve superior performance in each application.