DOI: 10.5555/1567619.1567641
research-article

Using dependency parsing and probabilistic inference to extract relationships between genes, proteins and malignancies implicit among multiple biomedical research abstracts

Published: 08 June 2006

ABSTRACT

We describe BioLiterate, a prototype software system that infers relationships between genes, proteins and malignancies from research abstracts, and that has initially been tested in the domain of the molecular genetics of oncology. The architecture uses a natural language processing (NLP) module to extract entities, dependencies and simple semantic relationships from texts. These features are then fed into a probabilistic reasoning module, which combines the semantic relationships extracted by the NLP module to form new semantic relationships. One application of this system is the discovery of relationships that are not contained in any individual abstract but are implicit in the combined knowledge of two or more abstracts.
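
As a rough illustration of the last point, the sketch below joins a relationship found in one abstract with a relationship found in another to propose a new one that neither abstract states. It is not BioLiterate's method: the `Relation` record, the `chain_relations` function, the entity names and the rule of multiplying confidences are all assumptions made for this example, with the product rule standing in for the system's probabilistic reasoning module.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Relation:
    """A semantic relationship with a confidence in [0, 1] (hypothetical format)."""
    subject: str
    predicate: str
    obj: str
    confidence: float
    source: str  # identifier of the abstract the relation came from


def chain_relations(relations):
    """Join A->B from one abstract with B->C from a different abstract.

    Assumption: the combined confidence is the product of the two inputs,
    a naive independence rule standing in for real probabilistic inference.
    """
    inferred = []
    for first, second in product(relations, repeat=2):
        if (first.obj == second.subject
                and first.source != second.source
                and first.subject != second.obj):
            inferred.append(Relation(
                subject=first.subject,
                predicate=f"{first.predicate}+{second.predicate}",
                obj=second.obj,
                confidence=first.confidence * second.confidence,
                source=f"{first.source}+{second.source}",
            ))
    return inferred


# Made-up relations, as if extracted from two separate abstracts.
extracted = [
    Relation("geneA", "upregulates", "proteinB", 0.8, "abstract1"),
    Relation("proteinB", "promotes", "malignancyC", 0.5, "abstract2"),
]

for rel in chain_relations(extracted):
    print(rel.subject, rel.predicate, rel.obj, round(rel.confidence, 2))
```

The only inferred relation links `geneA` to `malignancyC`, a connection that appears in neither input abstract on its own. A real system would need entity normalization and a principled treatment of uncertainty in place of the simple product used here.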

  • Published in

    BioNLP '06: Proceedings of the Workshop on Linking Natural Language Processing and Biology: Towards Deeper Biological Literature Analysis
    June 2006
    156 pages

    Publisher

    Association for Computational Linguistics

    United States

    Qualifiers

    • research-article

    Acceptance Rates

    BioNLP '06 Paper Acceptance Rate: 11 of 29 submissions, 38%
    Overall Acceptance Rate: 33 of 92 submissions, 36%
