TrustGWAS: A full-process workflow for encrypted GWAS using multi-key homomorphic encryption and pseudorandom number perturbation
Meng Yang, Chuwen Zhang, Xiaoji Wang, Xingmin Liu, Shisen Li, Jianye Huang, Zhimin Feng, Xiaohui Sun, Fang Chen, Shuang Yang, Ming Ni, Lin Li, Yanan Cao, Feng Mu
DOI: https://doi.org/10.1016/j.cels.2022.08.001
IF: 11.091
2022-08-31
Cell Systems
Abstract: The statistical power of genome-wide association studies (GWASs) depends on the effective sample size. However, privacy and security concerns over individual-level genotype data pose major challenges for cross-institutional cooperation. Full-process cryptographic solutions are in demand but have not yet been provided, especially for the essential principal-component analysis (PCA). Here, we present TrustGWAS, a complete solution for secure, large-scale GWAS that recapitulates gold-standard PLINK results without compromising privacy. It supports the basic PLINK steps, including quality control (QC), linkage disequilibrium (LD) pruning, PCA, the chi-square test, the Cochran-Armitage trend test, covariate-supported logistic and linear regression, and their sequential combinations. TrustGWAS uses pseudorandom number perturbation for PCA and a multiparty scheme of multi-key homomorphic encryption for all other modules. TrustGWAS can evaluate 100,000 individuals with 1 million variants and complete the QC-LD-PCA-regression workflow within 50 h. We further discover gene loci associated with fasting blood glucose, consistent with the findings of the ChinaMAP project.
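For readers unfamiliar with one of the association tests named in the abstract, the following is a minimal plaintext sketch of the Cochran-Armitage trend test on a 2x3 case/control genotype table. It is an illustration of the standard textbook statistic only, not the TrustGWAS implementation, which the abstract says runs such modules under multi-key homomorphic encryption. The function name `cochran_armitage_trend`, the additive weights (0, 1, 2), and the example counts are assumptions made for this sketch.

```python
import math


def cochran_armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Plaintext Cochran-Armitage trend test for genotype counts.

    cases, controls: counts per genotype class (e.g. AA, Aa, aa).
    weights: per-class scores; (0, 1, 2) is the usual additive model
             (an assumption of this sketch).
    Returns (chi2, p_value), with chi2 referred to 1 degree of freedom.
    """
    k = len(weights)
    if len(cases) != k or len(controls) != k:
        raise ValueError("cases, controls and weights must have equal length")

    R = sum(cases)          # total cases
    S = sum(controls)       # total controls
    N = R + S               # total individuals
    n = [r + s for r, s in zip(cases, controls)]  # column totals

    # Trend statistic T = sum_i t_i * (r_i * S - s_i * R)
    T = sum(t * (r * S - s * R) for t, r, s in zip(weights, cases, controls))

    # Variance of T under the null hypothesis of no association
    diag = sum(t * t * ni * (N - ni) for t, ni in zip(weights, n))
    cross = sum(
        weights[i] * weights[j] * n[i] * n[j]
        for i in range(k - 1)
        for j in range(i + 1, k)
    )
    var = (R * S / N) * (diag - 2 * cross) if N > 0 else 0.0

    if var <= 0:
        # Degenerate table (e.g. a single genotype class): no evidence of trend
        return 0.0, 1.0

    chi2 = T * T / var
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: risk-allele dosage rises among cases
chi2_stat, p = cochran_armitage_trend(cases=(10, 20, 30), controls=(30, 20, 10))
```

In an encrypted setting, the counts entering this formula would presumably be aggregated across institutions under encryption rather than pooled in the clear; that part is outside the scope of this sketch.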
cell biology, biochemistry & molecular biology