Single Large-Scale Graph Frequent Subgraph Algorithm Based on Spark

Laihao JIANG, Zhixiang ZHU, Zichen ZHAO
DOI: https://doi.org/10.3969/j.issn.1672-9722.2019.10.006
2019-01-01
Abstract: With the rapid development of the Internet, the campus card has been widely adopted, and the data on its servers is growing rapidly. Single-machine algorithms cannot support frequent subgraph mining and growth pattern mining at this scale, so mining frequent subgraphs from a large single graph cannot be carried out on one machine. The Hadoop distributed framework, in turn, is not well suited to iterative algorithms. This paper therefore proposes FSMBUS, a distributed algorithm for mining frequent subgraphs in a single large-scale graph under the Spark framework. It generates candidate subgraphs in parallel using a suboptimal CAM Tree and returns all frequent subgraphs for a given user-defined minimum support. Experiments show that FSMBUS is an order of magnitude faster than recent single-graph algorithms, supports lower support thresholds and larger graph data, and is 2 to 4 times faster than a version of the algorithm ported to Hadoop. Analysis of our campus card data can provide a reference basis for the management and leadership of colleges and universities.
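To make the notion of "frequent for a given minimum support" concrete, the following is a minimal sketch, not FSMBUS itself. It handles only the smallest case, single-edge patterns in one undirected labeled graph, and it runs in plain Python rather than on Spark. The abstract does not say which support measure the paper uses; the sketch assumes the minimum image-based (MNI) support that is common in single-graph mining. The function name `frequent_edge_patterns` and the input layout are hypothetical.

```python
from collections import defaultdict


def frequent_edge_patterns(labels, edges, min_support):
    """Return single-edge patterns whose support is at least min_support.

    labels: dict mapping vertex -> label.
    edges: iterable of undirected (u, v) vertex pairs.
    Support is ASSUMED to be MNI: for each pattern node, count the distinct
    graph vertices it can map to, then take the minimum over pattern nodes.
    """
    # For each pattern (label_a, label_b) with label_a <= label_b, keep the
    # set of distinct graph vertices seen in each of the two pattern roles.
    images = defaultdict(lambda: (set(), set()))
    for u, v in edges:
        # Orient the edge so that the smaller label comes first.
        if labels[u] > labels[v]:
            u, v = v, u
        first, second = images[(labels[u], labels[v])]
        first.add(u)
        second.add(v)
        if labels[u] == labels[v]:
            # Symmetric pattern: either endpoint can play either role.
            first.add(v)
            second.add(u)
    support = {p: min(len(a), len(b)) for p, (a, b) in images.items()}
    return {p: s for p, s in support.items() if s >= min_support}


# Toy graph (made up for illustration): two A-B-style neighborhoods and one B-C edge.
toy_labels = {1: "A", 2: "B", 3: "A", 4: "B", 5: "C"}
toy_edges = [(1, 2), (3, 2), (3, 4), (4, 5)]
result = frequent_edge_patterns(toy_labels, toy_edges, min_support=2)
```

On the toy graph, the A-B pattern has two distinct vertices in each role and so meets a minimum support of 2, while the B-C pattern occurs once and is dropped. A full miner would then extend the surviving patterns edge by edge and recount support at each step; that iterative loop is the kind of workload for which the abstract says Hadoop is a poor fit.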