Representation Learning for Frequent Subgraph Mining
Rex Ying, Tianyu Fu, Andrew Wang, Jiaxuan You, Yu Wang, Jure Leskovec
2024-02-22
Abstract: Identifying frequent subgraphs, also called network motifs, is crucial in
analyzing and predicting properties of real-world networks. However, finding
large commonly-occurring motifs remains a challenging problem, due not only to
its NP-hard subroutine of subgraph counting but also to the exponential growth in
the number of possible subgraph patterns. Here we present Subgraph Pattern
Miner (SPMiner), a novel neural approach for approximately finding frequent
subgraphs in a large target graph. SPMiner combines graph neural networks,
order embedding space, and an efficient search strategy to identify network
subgraph patterns that appear most frequently in the target graph. SPMiner
first decomposes the target graph into many overlapping subgraphs and then
encodes each subgraph into an order embedding space. SPMiner then uses a
monotonic walk in the order embedding space to identify frequent motifs.
Compared to existing approaches and possible neural alternatives, SPMiner is
more accurate, faster, and more scalable. For 5- and 6-node motifs, we show
that SPMiner can almost perfectly identify the most frequent motifs while being
100x faster than exact enumeration methods. In addition, SPMiner can
reliably identify frequent 10-node motifs, which is well beyond the size limit
of exact enumeration approaches. Finally, we show that SPMiner can find large
motifs of up to 20 nodes with 10-100x higher frequency than those found by
current approximate methods.
Machine Learning, Social and Information Networks
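
The abstract describes encoding subgraphs into an order embedding space and then taking a monotonic walk through that space. The following is a minimal sketch of how such a scheme could work, not SPMiner's actual implementation. It assumes, since the abstract does not spell this out, that the subgraph relation is represented by a coordinate-wise order: a pattern is treated as contained in a neighborhood when its embedding is less than or equal to the neighborhood's embedding in every dimension. All function names and the toy 2-dimensional vectors are hypothetical; in SPMiner the embeddings would come from a trained graph neural network.

```python
from typing import List, Optional, Sequence

Vector = Sequence[float]


def order_violation(pattern: Vector, neighborhood: Vector) -> float:
    """Squared amount by which `pattern` exceeds `neighborhood` coordinate-wise.

    Zero means pattern <= neighborhood in every coordinate, which this sketch
    ASSUMES stands for "pattern is a subgraph of neighborhood".
    """
    return sum(max(0.0, p - n) ** 2 for p, n in zip(pattern, neighborhood))


def estimated_frequency(pattern: Vector, neighborhoods: List[Vector],
                        tol: float = 1e-9) -> int:
    """Count the neighborhood embeddings that dominate the pattern embedding."""
    return sum(1 for n in neighborhoods if order_violation(pattern, n) <= tol)


def monotonic_walk_step(current: Vector, candidates: List[Vector],
                        neighborhoods: List[Vector]) -> Optional[Vector]:
    """Pick the next pattern embedding in a greedy monotonic walk.

    Only candidates that dominate `current` are allowed (the pattern may only
    grow); among those, the one with the highest estimated frequency wins.
    Returns None when no candidate is a valid extension.
    """
    valid = [c for c in candidates if order_violation(current, c) == 0.0]
    if not valid:
        return None
    return max(valid, key=lambda c: estimated_frequency(c, neighborhoods))


# Hypothetical, hand-picked embeddings of four overlapping neighborhoods.
neighborhoods = [(1.0, 1.0), (2.0, 1.0), (1.0, 3.0), (3.0, 3.0)]
current = (0.5, 0.5)
# Hypothetical embeddings of patterns obtained by growing the current one.
candidates = [(1.0, 1.0), (2.0, 2.0), (0.2, 4.0)]

next_pattern = monotonic_walk_step(current, candidates, neighborhoods)
```

In this toy setup the third candidate is rejected because its first coordinate falls below the current pattern's, so it cannot represent a grown pattern under the assumed order. Of the remaining two, the walk keeps the one dominated by more neighborhood embeddings. Repeating the step until the pattern reaches the desired size gives a greedy walk; a real search would likely keep several candidates per step rather than one.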