tutorial

Tutorial: Rapidly Identifying Disease-associated Rare Variants using Annotation and Machine Learning at Whole-genome Scale Online

Authors:
Alex V. Kotlar, Emory University, Atlanta, GA, USA
Thomas S. Wingo, Emory University, Atlanta, GA, USA

BCB '18: Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, August 2018, Page 558. https://doi.org/10.1145/3233547.3233666

Published: 15 August 2018

ABSTRACT

Accurately identifying disease-associated alleles from large sequencing experiments remains challenging. In this tutorial, participants will learn to analyze sequencing experiments with Bystro (https://bystro.io/), a new web application for variant annotation and filtering. Bystro is the first online, cloud-based application that makes variant annotation and filtering accessible to all researchers, even for the largest, terabyte-sized whole-genome experiments containing thousands of samples. Using its general-purpose, natural-language filtering engine, attendees will be shown how to perform quality control and identify alleles of interest. They will then be guided through exporting those variants and using them in two settings: regression, by performing rare-variant association tests in R, and classification, by training new machine learning models with Python's scikit-learn library.
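
As a rough illustration of what happens between exporting variants and running a downstream test, the sketch below keeps only rare variants and collapses them into a per-gene, per-sample count, which is the kind of burden score a rare-variant association test might take as input. It is not Bystro's export format or the tutorial's own code: the column names (gene, allele_freq, carriers), the sample and gene labels, the 0.001 frequency cutoff, and the function name rare_variant_burden are all assumptions made for this example, and the table is toy data.

```python
import csv
import io
from collections import Counter

# Toy tab-separated table standing in for an exported variant list.
# Column names and values are illustrative assumptions, not Bystro's schema.
EXPORT = """chrom\tpos\tref\talt\tgene\tallele_freq\tcarriers
chr1\t1000\tA\tG\tGENE1\t0.0004\tS1;S3
chr1\t1050\tC\tT\tGENE1\t0.2100\tS1;S2;S4
chr2\t2000\tG\tA\tGENE2\t0.0009\tS2
chr2\t2100\tT\tC\tGENE2\t0.0002\tS3;S4
"""


def rare_variant_burden(tsv_text, max_freq=0.001):
    """Count, per (gene, sample), how many rare variants the sample carries.

    max_freq is a hypothetical rarity cutoff; variants above it are skipped.
    """
    burden = Counter()
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        if float(row["allele_freq"]) > max_freq:
            continue  # common variant: excluded from the burden count
        for sample in row["carriers"].split(";"):
            burden[(row["gene"], sample)] += 1
    return burden


burden = rare_variant_burden(EXPORT)
for (gene, sample), count in sorted(burden.items()):
    print(gene, sample, count)
```

A table of such counts, joined with case/control labels, could then be passed to an association test in R or used as features for a classifier; both steps are outside this sketch.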

References

I. Ionita-Laza, S. Lee, V. Makarov, J. D. Buxbaum, and X. Lin. 2013. Sequence kernel association tests for the combined effect of rare and common variants. Am J Hum Genet Vol. 92, 6 (2013), 841-53.
M. Kircher, D. M. Witten, P. Jain, B. J. O'Roak, G. M. Cooper, and J. Shendure. 2014. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet Vol. 46, 3 (2014), 310-5.
Alex V. Kotlar, Cristina E. Trevino, Michael E. Zwick, David J. Cutler, and Thomas S. Wingo. 2017. Bystro: Rapid online variant annotation and natural-language filtering at whole-genome scale. bioRxiv (2017).

Index Terms

1. Applied computing
  1. Life and medical sciences

Published in
BCB '18: Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
August 2018
727 pages
ISBN: 9781450357944
DOI: 10.1145/3233547
General Chairs: Amarda Shehu (George Mason University, USA), Cathy Wu (University of Delaware, USA)
Program Chairs: Christina Boucher (University of Florida, USA), Jing Li (Case Western Reserve University, USA), Hongfang Liu (Mayo Clinic, USA), Mihai Pop (University of Maryland, USA)
Copyright © 2018 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
bioinformatics
machine learning
rare-variant association tests
variant classification
Qualifiers
- tutorial
Acceptance Rates
BCB '18 Paper Acceptance Rate: 46 of 148 submissions, 31%. Overall Acceptance Rate: 254 of 885 submissions, 29%.