Abstract: Detecting and monitoring competitors is fundamental for a company that wants to stay ahead in the global market. Existing studies mainly focus on mining competitive relationships within a single data source, while competitive information is usually distributed across multiple networks. Discovering the underlying patterns and using this heterogeneous knowledge to avoid a biased view of competition is a challenging problem. In this article, we study the problem of mining competitive relationships by learning across heterogeneous networks. We use Twitter and patent records as our data sources and statistically study the patterns behind competitive relationships. We find that the two networks exhibit different but complementary patterns of competition. Overall, we find that similar entities tend to be competitors, with a probability 4 times higher than chance. In the social network, we also find a 10-minute phenomenon: when two entities are mentioned by the same user within 10 minutes, the likelihood of them being competitors is 25 times higher than chance. Based on the discovered patterns, we propose a novel Topical Factor Graph Model. The model defines a latent topic layer to bridge the Twitter network and the patent network. It then employs a semi-supervised learning algorithm to classify the relationships between entities (e.g., companies or products). We test the proposed model on two real data sets, and the experimental results validate its effectiveness, with an average improvement of 46% over alternative methods. We further demonstrate that the competitive relationships inferred by our model can be applied to the job-hopping prediction problem, where they yield an average improvement of 10.7%.
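The 10-minute phenomenon is stated as a ratio against chance, which can be illustrated with a small calculation. The sketch below is not the authors' code. It assumes a hypothetical input of (user, entity, timestamp) mention records and a known set of competitor pairs, and it takes "chance" to mean the competitor rate among all possible entity pairs, which is one plausible reading of the abstract. The function names and the toy data are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

WINDOW_SECONDS = 10 * 60  # the 10-minute window named in the abstract


def co_mentioned_pairs(mentions, window=WINDOW_SECONDS):
    """Return entity pairs mentioned by the same user within `window` seconds.

    `mentions` is an iterable of (user, entity, unix_timestamp) tuples
    (a hypothetical record layout, not taken from the paper).
    """
    by_user = defaultdict(list)
    for user, entity, ts in mentions:
        by_user[user].append((ts, entity))

    pairs = set()
    for events in by_user.values():
        events.sort()  # chronological order, so t2 >= t1 below
        for (t1, e1), (t2, e2) in combinations(events, 2):
            if e1 != e2 and t2 - t1 <= window:
                pairs.add(frozenset((e1, e2)))
    return pairs


def lift(candidate_pairs, competitor_pairs, entities):
    """Competitor rate among candidate pairs divided by the rate among all pairs.

    Assumes both sets are non-empty; a real analysis would guard against
    empty inputs and report confidence intervals.
    """
    all_pairs = {frozenset(p) for p in combinations(sorted(entities), 2)}
    known = competitor_pairs & all_pairs
    hits = candidate_pairs & known
    # hit_rate / base_rate, rearranged so only one division is performed
    return (len(hits) * len(all_pairs)) / (len(candidate_pairs) * len(known))


# Toy data, invented purely for illustration.
toy_mentions = [
    ("u1", "A", 0), ("u1", "B", 300), ("u1", "C", 5000),
    ("u2", "C", 100), ("u2", "D", 200),
    ("u3", "A", 0), ("u3", "D", 10000),
]
toy_competitors = {frozenset(("A", "B"))}
toy_entities = {"A", "B", "C", "D"}

toy_pairs = co_mentioned_pairs(toy_mentions)
toy_lift = lift(toy_pairs, toy_competitors, toy_entities)
print(sorted(sorted(p) for p in toy_pairs), toy_lift)
```

A lift above 1 means co-mentioned pairs are competitors more often than an arbitrary pair; the abstract reports a value of about 25 on the real Twitter data.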

Learning Heterogeneous Coupling Relationships Between Non-IID Terms

Community Mining From Multi-Relational Networks

Heterogeneous Graph Representation for Text Mining

Learning a Probabilistic Semantic Model from Heterogeneous Social Networks for Relationship Identification

A similarity reinforcement algorithm for heterogeneous web pages

Coupling Learning of Complex Interactions

Text Classification With Heterogeneous Information Network Kernels

Information Filtering on Coupled Social Networks

Mining Hidden Community in Heterogeneous Social Networks

Scalable Community Discovery on Textual Data with Relations

Mutual Clustering on Comparative Texts via Heterogeneous Information Networks

Research on Text Similarity Measurement Hybrid Algorithm with Term Semantic Information and TF-IDF Method

A Novel Multiview Topic Model to Compute Correlation of Heterogeneous Data

Mining E-Commercial Data: A Text-Rich Heterogeneous Network Embedding Approach

User Identity Linkage across Social Networks with the Enhancement of Knowledge Graph and Time Decay Function

A Concept Similarity Based Text Classification Algorithm

Full-text Based Context-Rich Heterogeneous Network Mining Approach for Citation Recommendation

Distant Meta-Path Similarities for Text-Based Heterogeneous Information Networks

Learning to Infer Competitive Relationships in Heterogeneous Networks

Mining Competitive Relationships by Learning Across Heterogeneous Networks

Several alternative term weighting methods for text representation and classification