
Free factories: from the quantum coreworld to the personal genome project

January 2009

Author: Alexander Wait Zaranek, Harvard University
Adviser: George M. Church, Harvard University

Publisher: Harvard University, Cambridge, MA, United States

ISBN: 978-1-109-06619-7

Order Number: AAI3351021

Pages: 132

Abstract

This dissertation develops technical and governance infrastructure for a "free factory" by building on parallels with free and open source software and related communities. By viewing varied technologies and people as comprising free factories—or a federation of co-operating and competing factories with certain common ideals and infrastructure—I argue many scientific questions become easier to answer.

In the first chapter, I briefly summarize the dissertation. I then describe the hardware, staff and other resources required to implement the computational aspects of a free factory with reasonable economies of scale. In the next chapter, I use the infrastructure to search for DNA and RNA editing events in more than 600 million genomic traces from ten organisms at NCBI. I find numerous examples of traces that support the existence of these phenomena and set the stage for a more comprehensive investigation. The subsequent chapter uses the same tools to analyze four individual human genomes for variants of clinical interest. This work demonstrates such analyses need not lead to costly or harmful medical workup. In the last chapter, I describe the initial data release of the Personal Genome Project. The release is derived from two gigabases of targeted sequence data from ten individuals. I investigate the quality of the data by comparison with Affymetrix 500K SNPs and discuss one variant of clinical interest. This data release—linking scientists, physicians and members of the general public—demonstrates the utility of free factories for advancing the state-of-the-art in personalized, genomic medicine.

In Appendix A, I indicate how the Quantum Coreworld—earlier work on a digital evolution system consistent with the rules of quantum information processing—could efficiently use free factories. Such projects could allow free factories to fully utilize idle resources. Finally, in Appendix B, a novel, open-source primary data analysis pipeline is used to reprocess 100 gigabytes of image data derived from the exome of a Personal Genome Project participant. This approach demonstrates a 14% increase in placeable reads, on the PGP sample, over the vendor's pipeline.

Contributors

George M Church
Harvard Medical School
Alexander Wait Zaranek
Harvard University

Recommendations

Alignment-free detection of local similarity among viral and bacterial genomes

Motivation: Bacterial and viral genomes are often affected by horizontal gene transfer observable as abrupt switching in local homology. In addition to the resulting mosaic genome structure, they frequently contain regions not found in close ...

An alignment-free method to identify candidate orthologous enhancers in multiple Drosophila genomes

Motivation: Evolutionarily conserved non-coding genomic sequences represent a potentially rich source for the discovery of gene regulatory regions such as transcriptional enhancers. However, detecting orthologous enhancers using alignment-based ...

Alignment-free estimation of nucleotide diversity

Motivation: Sequencing capacity is currently growing more rapidly than CPU speed, leading to an analysis bottleneck in many genome projects. Alignment-free sequence analysis methods tend to be more efficient than their alignment-based counterparts. ...
