DOI: 10.1145/1458449.1458466
short-paper

Knowledge-based gene symbol disambiguation

Published: 30 October 2008

ABSTRACT

Because there is no standard naming convention for genes and gene products, gene symbol disambiguation (GSD) has become a major challenge in mining biomedical literature. Several GSD methods have been proposed that rely on MEDLINE references to genes. However, gene databases such as Entrez Gene now provide extensive information about genes, and many biomedical ontologies, such as the UMLS Metathesaurus and Semantic Network, have been developed. These knowledge sources can be used for disambiguation. In this paper we propose a method that relies on information about gene candidates from gene databases, the contexts of gene symbols, and biomedical ontologies. We implement the method and evaluate the implementation on the BioCreAtIvE II data sets.
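The abstract does not spell out the scoring procedure, so the following is only a minimal sketch of the general idea of knowledge-based disambiguation, not the authors' method. It assumes each candidate gene has a short textual profile (here hard-coded and hypothetical; in practice such text might come from a gene database record) and picks the candidate whose profile shares the most words with the context around the ambiguous symbol. The names `CANDIDATES`, `tokenize`, `score`, and `disambiguate` are illustrative, and the ontology component mentioned in the abstract is not modeled.

```python
import re

# Hypothetical candidate profiles for one ambiguous symbol. A real system
# would build these from gene database records rather than hard-code them.
CANDIDATES = {
    "gene_A": "tumor suppressor protein involved in cell cycle regulation",
    "gene_B": "membrane transporter expressed in kidney and liver tissue",
}

STOPWORDS = frozenset({"a", "an", "and", "in", "is", "of", "the", "to", "was"})


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def score(context, profile):
    """Jaccard overlap between the context tokens and the profile tokens."""
    c, p = tokenize(context), tokenize(profile)
    if not c or not p:
        return 0.0
    return len(c & p) / len(c | p)


def disambiguate(context, candidates):
    """Return (best_candidate_or_None, scores) for the given context."""
    scores = {gene_id: score(context, prof) for gene_id, prof in candidates.items()}
    best = max(scores, key=scores.get)
    # If nothing overlaps at all, decline to choose rather than guess.
    if scores[best] == 0.0:
        return None, scores
    return best, scores


context = "The symbol was linked to cell cycle arrest and tumor growth"
best, scores = disambiguate(context, CANDIDATES)
print(best, scores)
```

Word overlap is the simplest possible stand-in for a context-to-profile similarity; a fuller approach could map words to ontology concepts before comparing, so that synonyms count as matches.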

        Published in

        DTMBIO '08: Proceedings of the 2nd international workshop on Data and text mining in bioinformatics
        October 2008
        92 pages
        ISBN: 9781605582511
        DOI: 10.1145/1458449

        Copyright © 2008 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 19 of 36 submissions, 53%
