Finding Representative and Diverse Vertices within Graphs
Minlie Huang, Shouyuan Chen
2009-01-01
Abstract: In this paper, we present a novel method that optimizes the selection of representative and diverse vertices from graphs. Representative means that the selected vertices are salient and highly relevant to the topic of interest. Diverse means that the selected vertices must contain comprehensive information from different aspects, and highly similar content must be eliminated. Given a budget-limited weighted graph G(V,E), where each vertex v in V is associated with a salience score Rank(v) and a budget Bud(v), and each edge (u,v) is associated with a weight w(u,v), the goal is to find a subset of representative and diverse vertices that covers as much of the information contained in the original graph as possible, while keeping the total budget of the selected vertices within a pre-specified limit. The central idea is implemented by optimizing an objective function that takes into account both salience and diversity, using a (1-1/e)-approximation algorithm. We apply the method to several problems: (1) multi-document summarization, which extracts representative sentences from many documents to form a summary; (2) mining the most famous experts who have published papers on KDD since 2000; and (3) mining the most representative comedy actors from the IMDb website. By comparing with other popular methods, our
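
The budgeted selection problem described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the objective function, so the one used here is an assumption: each vertex u contributes Rank(u) times its strongest similarity to any selected vertex, which rewards salience and gives little credit for redundant picks. The names `coverage` and `greedy_select` and the toy data are hypothetical. The selection rule is a plain cost-scaled greedy heuristic; on its own it does not carry the (1-1/e) guarantee, which for budgeted problems of this kind typically requires an additional enumeration step.

```python
from typing import Dict, Hashable, List, Tuple

Vertex = Hashable


def coverage(selected: List[Vertex],
             rank: Dict[Vertex, float],
             weight: Dict[Tuple[Vertex, Vertex], float]) -> float:
    """Assumed objective (not taken from the paper): every vertex u adds
    Rank(u) times its best similarity to a selected vertex."""
    total = 0.0
    for u in rank:
        best = 0.0
        for s in selected:
            if s == u:
                sim = 1.0  # a selected vertex covers itself fully
            else:
                # edges are treated as undirected
                sim = weight.get((u, s), weight.get((s, u), 0.0))
            best = max(best, sim)
        total += rank[u] * best
    return total


def greedy_select(rank: Dict[Vertex, float],
                  budget: Dict[Vertex, float],
                  weight: Dict[Tuple[Vertex, Vertex], float],
                  limit: float) -> List[Vertex]:
    """Repeatedly add the affordable vertex with the largest
    objective gain per unit of budget."""
    selected: List[Vertex] = []
    spent = 0.0
    current = 0.0
    candidates = list(rank)  # insertion order keeps tie-breaking deterministic
    while candidates:
        best_v, best_ratio, best_value = None, 0.0, current
        for v in candidates:
            if spent + budget[v] > limit:
                continue  # v would exceed the budget limit
            value = coverage(selected + [v], rank, weight)
            ratio = (value - current) / budget[v]
            if ratio > best_ratio:
                best_v, best_ratio, best_value = v, ratio, value
        if best_v is None:
            break  # nothing affordable improves the objective
        selected.append(best_v)
        spent += budget[best_v]
        current = best_value
        candidates.remove(best_v)
    return selected


# Toy graph: "a" and "b" are both salient but nearly identical.
rank = {"a": 3.0, "b": 2.9, "c": 1.0}
budget = {"a": 1.0, "b": 1.0, "c": 1.0}
weight = {("a", "b"): 0.9}

print(greedy_select(rank, budget, weight, limit=2.0))  # ['a', 'c']
```

In this toy graph, "b" has the second-highest salience but is skipped in favor of "c", because "a" already covers most of what "b" would contribute. This is the salience-versus-diversity trade-off the abstract describes.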