ABSTRACT
The human genome can reveal sensitive information and is potentially re-identifiable, which raises privacy and security concerns about sharing such data at scale. In this work, we propose a preventive approach for privacy-preserving sharing of genomic data in decentralized networks for genome-wide association studies (GWASs), which are widely used to discover associations between genotypes and phenotypes. The key components of this work are a decentralized secure network with a privacy-preserving sharing protocol, and a gene fragmentation framework that is trainable in an end-to-end manner. Our experiments on real datasets show the effectiveness of our privacy-preserving approaches, as well as significant improvements in efficiency compared with recent, related algorithms.
Index Terms
- Enabling Privacy-Preserving Sharing of Genomic Data for GWASs in Decentralized Networks