A quantitative structure comparison with persistent similarity

Kelin Xia
DOI: https://doi.org/10.48550/arXiv.1707.03572
2017-07-12
Quantitative Methods
Abstract: Biomolecular structure comparison not only reveals evolutionary relationships, but also sheds light on biological functional properties. However, traditional definitions of structure or sequence similarity always involve superposition or alignment and are computationally inefficient. In this paper, I propose a new method called persistent similarity, which is based on a recently developed method in algebraic topology known as persistent homology. Unlike all previous topological methods, persistent homology is able to embed a geometric measurement into topological invariants, thus providing a bridge between geometry and topology. Further, with the proposed persistent Betti function (PBF), topological information derived from the persistent homology analysis can be uniquely represented by a series of continuous one-dimensional (1D) functions. In this way, any complicated biomolecular structure can be reduced to several simple 1D PBFs for comparison. Persistent similarity is then defined as the ratio of the intersection area to the union area of two corresponding PBFs. If structures have no significant topological properties, a pseudo-barcode is introduced to ensure a better comparison. Moreover, a multiscale biomolecular representation is introduced through the multiscale rigidity function. It naturally induces a multiscale persistent similarity. The multiscale persistent similarity enables an objective-oriented comparison. Stated differently, it facilitates the comparison of structures at any particular scale of interest. Finally, the proposed method is validated on four different cases. It is found that the persistent similarity describes the intrinsic similarities and differences between the structures very well.
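To make the intersection-over-union definition concrete, the following is a minimal sketch, not the paper's implementation. It assumes a barcode is given as a list of (birth, death) pairs and, as a simplification, replaces the continuous PBF with a plain step-function Betti curve that counts the bars alive at each filtration value. The function names `betti_curve`, `area`, and `persistent_similarity`, the sampling grid, and the handling of two empty curves are all illustrative assumptions; the pseudo-barcode mentioned in the abstract is not modeled.

```python
import numpy as np


def betti_curve(bars, grid):
    """Count how many (birth, death) bars are alive at each grid value.

    Assumption: this step function stands in for the continuous PBF.
    """
    bars = np.asarray(bars, dtype=float).reshape(-1, 2)
    births = bars[:, 0][:, None]
    deaths = bars[:, 1][:, None]
    alive = (births <= grid[None, :]) & (grid[None, :] < deaths)
    return alive.sum(axis=0).astype(float)


def area(values, grid):
    """Area under a sampled curve, using the trapezoidal rule."""
    return float(np.sum((values[1:] + values[:-1]) * np.diff(grid)) / 2.0)


def persistent_similarity(bars_a, bars_b, grid):
    """Ratio of intersection area to union area of two sampled curves."""
    f = betti_curve(bars_a, grid)
    g = betti_curve(bars_b, grid)
    union = area(np.maximum(f, g), grid)
    if union == 0.0:
        # Assumption: two empty curves are treated as identical.
        return 1.0
    return area(np.minimum(f, g), grid) / union


if __name__ == "__main__":
    grid = np.linspace(0.0, 4.0, 401)
    # Two hypothetical one-bar barcodes that overlap on [1, 2).
    print(persistent_similarity([(0.0, 2.0)], [(1.0, 3.0)], grid))
```

In this toy case the two bars overlap on an interval of length 1 and together cover an interval of length 3, so the printed value is roughly one third. Identical barcodes give 1, and barcodes with no overlap give 0.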