Sparse Canonical Correlation Analysis via truncated ℓ₁-norm with application to brain imaging genetics

Lei Du, Tuo Zhang, Kefei Liu, Xiaohui Yao, Jingwen Yan, Shannon L. Risacher, Lei Guo, Andrew J. Saykin, Li Shen
DOI: https://doi.org/10.1109/BIBM.2016.7822605
2016-01-01
Abstract: Discovering bi-multivariate associations between genetic markers and neuroimaging quantitative traits is a major task in brain imaging genetics. Sparse Canonical Correlation Analysis (SCCA) is a popular technique in this area because it identifies bi-multivariate relationships and performs feature selection at the same time. Existing SCCA methods impose either the ℓ₁-norm or one of its variants. The ℓ₀-norm is more desirable but remains unexplored, since ℓ₀-norm minimization is NP-hard. In this paper, we impose the truncated ℓ₁-norm to improve the performance of ℓ₁-norm based SCCA methods. We also propose two efficient optimization algorithms and prove their convergence. Experimental results, compared with two benchmark methods, show that our method identifies better and meaningful canonical loading patterns in both simulated and real imaging genetic analyses.
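
The abstract does not define the truncated ℓ₁-norm. As a rough illustration only, the sketch below assumes one common form of the penalty, the sum over entries of min(|x_i|, tau), and compares it with the ℓ₀ and ℓ₁ penalties on a toy vector. The function name `truncated_l1`, the threshold `tau`, and the toy values are assumptions made for this example; they are not taken from the paper, and the paper's exact definition and scaling may differ.

```python
def truncated_l1(x, tau):
    """Assumed form of the truncated l1 penalty: sum of min(|x_i|, tau)."""
    return sum(min(abs(v), tau) for v in x)

# Toy loading vector (hypothetical values): one zero, one small, two large entries.
x = [0.0, 0.05, -2.0, 3.0]
tau = 0.1  # hypothetical truncation threshold

l0 = sum(1 for v in x if v != 0)   # l0 penalty: number of nonzero entries
l1 = sum(abs(v) for v in x)        # l1 penalty: grows with entry magnitude
tl1 = truncated_l1(x, tau)         # entries above tau each contribute exactly tau

# Dividing by tau makes every entry above the threshold count as 1,
# which is why this penalty is often described as a surrogate for l0.
tl1_scaled = tl1 / tau
```

Under this assumed form, entries larger than `tau` are penalized by a constant rather than in proportion to their size, so large loadings are shrunk less than under a plain ℓ₁ penalty, while entries below `tau` are still penalized linearly.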