Balancing Confidentiality and Sharing of Genomic and Phenotypic Data in a Clinical Research System

Daniel Veltri, Zhiwen Li, Sandhya Xirasagar, Andrew J. Oler, Xi Cheng, Ke Huang, Jason Barnett, Lingwen Zhang, Octavio Juarez-Espinosa, Yongjie Fan, Morgan Similuk, Magdalena Walkiewicz, Celine Hong, Joyce Johnson, Joshua D. Milner
DOI: https://doi.org/10.1145/3233547.3233641
2018-01-01
Abstract: We recently developed the Genomics Research Integration System (GRIS) to help NIAID investigators at the NIH leverage both phenotypic and genotypic patient data to identify causal variants for rare diseases. The project is a bioinformatics complement to an initiative to sequence exomes for all NIAID patients visiting the NIH Clinical Center. The system is designed to serve as a valuable resource for clinical genomic data annotated with standardized phenotypic terms using the Human Phenotype Ontology (Kohler et al., 2013). GRIS uses PhenoTips® (Girdea et al., 2013) to capture clinical records and family pedigrees, which are linked to genomic records stored in seqr, a genetic analysis tool developed at the Broad Institute (seqr.broadinstitute.org), to enable causal variant identification. We have customized both programs in novel ways to meet NIH encryption requirements, to link patient records across programs in a controlled manner, and to provide "tiers" of access so that individual research groups can customize users' ability to edit their patient records and view personally identifiable information (PII). A challenge faced by shared clinical data repositories is to maximize the collective research value of data through open sharing, while respecting the needs of researchers to adjust access to patient data in accordance with research goals and subject to clinical sharing guidelines. We devised a technical approach to meet the needs of sharing policies, formulated collectively by researchers and clinicians, to promote wider acceptance and usage of the system. Accordingly, we implemented a patient identifier mapping system in conjunction with automated notifications to enable transparent sharing.
Our approach may prove helpful to other hospital or clinical support systems seeking to respect the confidentiality of patient PII and early findings of individual researchers, while recognizing that data repositories are most primed for discovery (and can significantly increase return on investment) if they are open and accessible to a larger research community.
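
The abstract names three mechanisms (tiers of access, a patient identifier mapping, and automated notifications) without giving implementation details. The following Python sketch is purely illustrative of how such pieces could fit together; it is not the GRIS implementation. The tier names, the `Registry` class, its fields, and the rule that a notification is recorded whenever a user outside the owning group views a record are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import IntEnum
import uuid


class Tier(IntEnum):
    """Hypothetical access tiers; the abstract does not name the real ones."""
    DEIDENTIFIED = 1  # phenotype terms only, no PII
    VIEW_PII = 2      # may also see identifying fields
    EDIT = 3          # may see PII and edit the record


@dataclass
class PatientRecord:
    clinical_id: str   # identifier used by the clinical-record tool (assumed)
    owner_group: str   # research group that owns the record
    name: str          # stands in for PII
    hpo_terms: list    # Human Phenotype Ontology term IDs


@dataclass
class Registry:
    records: dict = field(default_factory=dict)        # clinical_id -> record
    id_map: dict = field(default_factory=dict)         # clinical_id -> shared ID
    notifications: list = field(default_factory=list)  # (group, user, shared ID)

    def add(self, record):
        """Store a record and map it to a pseudonymous shared identifier."""
        self.records[record.clinical_id] = record
        self.id_map[record.clinical_id] = "P-" + uuid.uuid4().hex[:8]
        return self.id_map[record.clinical_id]

    def view(self, clinical_id, user, user_group, tier):
        """Return the fields visible at the given tier; log cross-group access."""
        record = self.records[clinical_id]
        shared_id = self.id_map[clinical_id]
        visible = {"id": shared_id, "hpo_terms": list(record.hpo_terms)}
        if tier >= Tier.VIEW_PII:
            visible["name"] = record.name
        if user_group != record.owner_group:
            # Assumed policy: tell the owning group who looked at its record.
            self.notifications.append((record.owner_group, user, shared_id))
        return visible


registry = Registry()
registry.add(PatientRecord("C-001", "lab_a", "Jane Doe", ["HP:0002715"]))
print(registry.view("C-001", "bob", "lab_b", Tier.DEIDENTIFIED))
```

In this sketch a user from another group at the lowest tier sees only the shared identifier and phenotype terms, and the owning group receives a notification entry, which is one plausible reading of "transparent sharing" as described above.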