ABSTRACT
The correct interpretation of tandem mass spectra is a difficult problem, even when it is limited to scoring peptides against a database. De novo sequencing is considerably harder, but critical when sequence databases are incomplete or not available. In this paper we build upon earlier work by Dancik et al. and Chen et al. to provide a dynamic programming algorithm for the de novo interpretation of spectra. Our method can handle most of the commonly occurring ions, including a, b, and y ions and their neutral losses. Additionally, we shift the emphasis away from sequencing to assigning ion types to peaks. In particular, we introduce the notion of core interpretations, which allow us to give confidence values to individual peak assignments, even in the absence of a strong interpretation. Finally, we introduce a systematic approach to evaluating de novo algorithms as a function of spectral quality. We show that our algorithm, in particular the core interpretation, is robust in the presence of measurement error and low fragmentation probability.
- V. Bafna and N. Edwards. SCOPE: a probabilistic model for scoring tandem mass spectra against a peptide database. Bioinformatics, 17 Suppl 1:S13--21, June 2001. Appeared in Intl. Conference on Intelligent Systems for Molecular Biology.
- C. Bartels. Fast algorithm for peptide sequencing by mass spectrometry. Biomedical and Environmental Mass Spectrometry, 19:363--368, 1990.
- T. Chen, M. Y. Kao, M. Tepel, J. Rush, and G.M. Church. A dynamic programming approach to de novo peptide sequencing via tandem mass spectrometry. Journal of Computational Biology, 8(6):571--83, 2001.
- V. Dancik, T. Addona, K. Clauser, J. Vath, and P.A. Pevzner. De novo peptide sequencing via tandem mass spectrometry. Journal of Computational Biology, 6:327--342, 1999.
- J. Fernandez de Cossio, J. Gonzales, and V. Besada. Protein identification using mass spectrometric information. Comput. Appl. Biosci., 11:427--434, 1995.
- J. Eng, A. McCormack, and J. Yates. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. Journal of American Society of Mass Spectrometry, 5:976--989, 1994.
- D. Fenyo, J. Qin, and B.T. Chait. Protein identification using mass spectrometric information. Electrophoresis, 19(6):998--1005, 1998.
- R.J. Johnson and K. Biemann. Computer program (seqpep) to aid in the interpretation of high-energy collision tandem mass spectra of peptides. Biomedical and Environmental Mass Spectrometry, 18:945--957, 1989.
- D.J. Lipman and W.R. Pearson. Rapid and sensitive protein similarity searches. Science, 227:1435--1441, 1985.
- M. Mann and M. Wilm. Error-tolerant identification of peptides in sequence databases by peptide sequence tags. Analytical Chemistry, 66:4390--4399, 1994.
- P. A. Pevzner. Computational Molecular Biology: An Algorithmic Approach. MIT Press, 2000.
- P.A. Pevzner, V. Dancik, and C.L. Tang. Mutation-tolerant protein identification by mass-spectrometry. In R. Shamir, S. Miyano, S. Istrail, P.A. Pevzner, and M.S. Waterman, editors, International Conference on Computational Molecular Biology (RECOMB), pages 231--236. ACM Press, 2000.
- J.A. Taylor and R.S. Johnson. Sequence database searches via de novo peptide sequencing by mass spectrometry. Rapid Communications in Mass Spectrometry, 11:1067--1075, 1997.
Index Terms
- On de novo interpretation of tandem mass spectra for peptide identification