Improved Approximation Algorithms for Bipartite Correlation Clustering
Nir Ailon, Noa Avigdor-Elgrabli, Edo Liberty, Anke van Zuylen
DOI: https://doi.org/10.1137/110848712
2012-01-01
SIAM Journal on Computing
Abstract: In this work we study the problem of bipartite correlation clustering (BCC), a natural bipartite counterpart of the well-studied correlation clustering (CC) problem [N. Bansal, A. Blum, and S. Chawla, Machine Learning, 56 (2004), pp. 89--113], also referred to as graph editing [R. Shamir, R. Sharan, and D. Tsur, Discrete Appl. Math., 144 (2004), pp. 173--182]. Given a bipartite graph, the objective of BCC is to generate a set of vertex-disjoint bicliques (clusters) that minimizes the symmetric difference to the original graph. The best-known approximation algorithm for BCC, due to Amit [N. Amit, The Bicluster Graph Editing Problem, Master's Thesis, Tel Aviv University, Tel Aviv, Israel, 2004], guarantees an $11$-approximation ratio. In this paper we present two algorithms. The first is a linear-programming-based $4$-approximation algorithm. Like the previous approximation algorithm, it requires solving a large convex problem, which becomes prohibitive even for modestly sized tasks. The second algorithm, and our main contribution, is a simple randomized combinatorial algorithm. It also achieves an expected $4$-approximation factor, and it is trivial to implement and highly scalable. The analysis extends a method developed by Ailon, Charikar, and Newman in 2008, in which a randomized pivoting algorithm was shown to give a $3$-approximation for CC. Analyzing our algorithm for BCC requires considerably more sophisticated arguments in order to take advantage of the bipartite structure. Whether it is possible to achieve (or beat) the $4$-approximation factor using a scalable and deterministic algorithm remains an open problem.
computer science, theory & methods; mathematics, applied
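The BCC objective stated in the abstract (the symmetric difference between the input graph and a set of vertex-disjoint bicliques) can be illustrated with a minimal sketch. This is not one of the paper's algorithms; it only evaluates the cost of a given clustering. The function name `bcc_cost`, the `cluster_of` mapping, and the toy graph are assumptions made for illustration, and left and right vertices are assumed to have distinct names.

```python
from itertools import product


def bcc_cost(left, right, edges, cluster_of):
    """Count left-right pairs on which the clustering disagrees with the graph.

    A pair (u, v) is an error if it is an edge whose endpoints lie in
    different clusters, or a non-edge whose endpoints share a cluster.
    `cluster_of` is a hypothetical mapping from vertex to cluster id.
    """
    edge_set = set(edges)
    cost = 0
    for u, v in product(left, right):
        same_cluster = cluster_of[u] == cluster_of[v]
        has_edge = (u, v) in edge_set
        if same_cluster != has_edge:
            cost += 1
    return cost


# Toy instance (illustrative only): a 2x2 bipartite graph missing one edge.
left = ["a1", "a2"]
right = ["b1", "b2"]
edges = [("a1", "b1"), ("a1", "b2"), ("a2", "b2")]

one_cluster = {"a1": 0, "a2": 0, "b1": 0, "b2": 0}
split = {"a1": 0, "b1": 0, "a2": 1, "b2": 1}
singletons = {"a1": 0, "a2": 1, "b1": 2, "b2": 3}

print(bcc_cost(left, right, edges, one_cluster))  # 1: add the missing edge (a2, b1)
print(bcc_cost(left, right, edges, split))        # 1: delete the edge (a1, b2)
print(bcc_cost(left, right, edges, singletons))   # 3: delete all three edges
```

An approximation algorithm for BCC would aim to return a clustering whose cost under this measure is within a constant factor of the minimum over all clusterings.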