ABSTRACT
Bioinformatics is a promising new field, defined as the science of developing computer databases and algorithms to speed up and enhance biological research. A working draft of the human genome was announced in June 2000, and the DNA sequence was completed in April 2003. The key issue for the next era of bioinformatics is interpreting this data for our purposes. Biological research involves huge volumes of data, such as DNA sequences, and its processing is characterized by pattern searching, complex calculations, and sorted data subsets. Pattern searching has been studied in the data mining community as a way to extract characteristics from databases, which may hold text strings and/or primary sequence information. The text strings consist of entry, organism, or gene names, and authors. Primary sequence information includes restriction enzyme cut sites, regulatory patterns ("signals"), and known functional or structural patterns (motifs). Pattern searching techniques range from simple one-dimensional text searches and regular expressions to neural networks and genetic algorithms. Visualization techniques offer visual interpretation of bioinformatics data. Cluster computing, grid computing, and/or parallel computation are used to share databases, called data banks, and to speed up searches over them. Large distributed databases need methods of inquiry and access over networks. VRML, CellML, AnatML, and FieldML have been designed for access to cluster and grid computing. Web-based and grid computing offer tools for easy data access and integration to facilitate biological research in the next era. Computer science has provided background technologies to bioinformatics in a variety of fields. We survey bioinformatics research and products and show the growing adaptation of computer science in bioinformatics.
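As a rough illustration of the simplest pattern searches mentioned in the abstract (plain text search and regular expressions over primary sequence data), the following minimal Python sketch locates restriction enzyme cut sites and a regular-expression motif in a DNA string. It is not taken from the paper: the function name `find_sites`, the sample sequences, and the TATA-like pattern `TATA[AT]A` are assumptions chosen for illustration. The only biological fact relied on is that EcoRI recognizes `GAATTC`.

```python
import re

# EcoRI recognition sequence (a well-known restriction site).
ECORI_SITE = "GAATTC"

# Hypothetical, simplified motif used only to show a regular-expression search.
TATA_LIKE_MOTIF = "TATA[AT]A"


def find_sites(sequence: str, pattern: str) -> list[int]:
    """Return 0-based start positions of every match of `pattern` in `sequence`.

    The pattern is wrapped in a lookahead so that overlapping
    occurrences are reported as well.
    """
    regex = re.compile(f"(?=({pattern}))", re.IGNORECASE)
    return [match.start() for match in regex.finditer(sequence)]


if __name__ == "__main__":
    # Illustrative sequences, not real genomic data.
    dna = "ATGGAATTCCGTTAGAATTCAA"
    print("EcoRI sites:", find_sites(dna, ECORI_SITE))
    print("Motif hits:", find_sites("CCTATAAAGG", TATA_LIKE_MOTIF))
```

A literal search such as the EcoRI example is the one-dimensional text search case. The bracketed character class shows how a regular expression generalizes it to families of related patterns. The neural-network and genetic-algorithm approaches the abstract also mentions are beyond the scope of this sketch.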
Index Terms
- Growing adaptation of computer science in Bioinformatics