ABSTRACT
We describe BioLiterate, a prototype software system that infers relationships between genes, proteins and malignancies from research abstracts and has initially been tested in the domain of the molecular genetics of oncology. The architecture uses a natural language processing (NLP) module to extract entities, dependencies and simple semantic relationships from texts, then feeds these features into a probabilistic reasoning module, which combines the extracted semantic relationships to form new ones. One application of this system is the discovery of relationships that are not contained in any individual abstract but are implicit in the combined knowledge of two or more abstracts.
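The idea of combining relationships from separate abstracts can be illustrated with a minimal sketch. It is not BioLiterate's implementation: the `Relation` type, the `infer_cross_abstract` function, the `may_influence` label, the entity names, and the rule of multiplying confidences (which assumes the two premises are independent) are all hypothetical stand-ins for the NLP output and the probabilistic reasoning module described above.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Relation:
    """A semantic relationship, as an NLP module might emit it (hypothetical shape)."""
    subject: str
    predicate: str
    obj: str
    confidence: float
    sources: frozenset  # identifiers of the abstracts that support the relation


def infer_cross_abstract(relations):
    """Chain A -> B and B -> C into A -> C when the premises come from different abstracts."""
    inferred = []
    for first, second in permutations(relations, 2):
        chains = first.obj == second.subject and first.subject != second.obj
        if chains and first.sources.isdisjoint(second.sources):
            inferred.append(Relation(
                subject=first.subject,
                predicate="may_influence",  # illustrative label only
                obj=second.obj,
                # Assumption: premises are independent, so confidences multiply.
                confidence=first.confidence * second.confidence,
                sources=first.sources | second.sources,
            ))
    return inferred


# Hypothetical relations extracted from two different abstracts.
extracted = [
    Relation("GeneA", "inhibits", "ProteinB", 0.9, frozenset({"abstract_1"})),
    Relation("ProteinB", "promotes", "TumorC", 0.8, frozenset({"abstract_2"})),
]

for rel in infer_cross_abstract(extracted):
    print(rel.subject, rel.predicate, rel.obj, round(rel.confidence, 2))
```

Neither input abstract links GeneA to TumorC directly; the link appears only once the two relations are combined, which is the kind of implicit relationship the abstract refers to. A real probabilistic reasoner would use a more careful rule than a plain product of confidences.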
Using dependency parsing and probabilistic inference to extract relationships between genes, proteins and malignancies implicit among multiple biomedical research abstracts
LNLBioNLP '06: Proceedings of the HLT-NAACL BioNLP Workshop on Linking Natural Language and Biology