Text association mining with cross-sentence inference, structure-based document model and multi-relational text mining
Publisher:
  • University of Colorado at Denver
  • Denver, CO
  • United States
ISBN:978-1-109-18848-6
Order Number:AAI3360857
Pages:131
Abstract

With the exponential growth of published documents, text mining has become a vital tool for automatically extracting information and discovering hidden knowledge. We begin this dissertation with an overview of text mining that covers key definitions, pre-processing, feature selection, text representation, and types of text mining. We then describe a fundamental text mining approach that we used to develop a chromosome-21 database. Next, we present three novel text mining techniques: (i) text association mining with cross-sentence inference, (ii) a structure-based document model, and (iii) multi-relational text mining. These techniques emphasize novel hypothesis generation, document representation, and multi-relational discovery, respectively. In text association mining with cross-sentence inference, statistical co-occurrences of terms and syntactic analysis of sentence structure are first used to find associations among key terms in documents; potential novel hypotheses are then derived from the discovered associations. The structure-based document model, by contrast, introduces two novel representations for text documents that take into account not only term frequencies and patterns of term occurrence but also the document's structural information. In our experiments, the structure-based document models outperform existing non-structure-based ones. Finally, multi-relational text mining enhances a literature-based discovery method with multi-relational data mining and Inductive Logic Programming. It aims to discover relational knowledge, in the form of frequent relational patterns and relational association rules, from disjoint sets of literature. These relational patterns and rules complement the indirect connections found by existing literature-based discovery methods and can be used for exploratory research.
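
As a rough illustration of the first technique, the sketch below counts sentence-level co-occurrences of key terms and then proposes term pairs that never co-occur directly but share an associated intermediate term. It is not the dissertation's implementation: the syntactic sentence-structure analysis is omitted, the shared-intermediate inference rule is an assumption about how hypotheses might be derived, and the function names, the `min_count` threshold, and the toy sentences and terms are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences, key_terms):
    """Count how often each pair of key terms appears in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        # Sort so that each pair is always stored in the same order.
        present = sorted(term for term in key_terms if term in tokens)
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
    return counts


def associations(counts, min_count=1):
    """Keep pairs that co-occur at least min_count times (assumed threshold)."""
    return {pair for pair, n in counts.items() if n >= min_count}


def candidate_hypotheses(assoc):
    """Propose pairs that are not directly associated but share a neighbour."""
    neighbours = {}
    for a, b in assoc:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    hypotheses = {}
    for a, c in combinations(sorted(neighbours), 2):
        if (a, c) in assoc:
            continue  # already a known direct association
        shared = neighbours[a] & neighbours[c]
        if shared:
            hypotheses[(a, c)] = sorted(shared)
    return hypotheses


# Toy data (hypothetical): two sentences that never mention both end terms.
sentences = [
    "genea regulates proteinb",
    "proteinb is linked to diseasec",
]
key_terms = {"genea", "proteinb", "diseasec"}

assoc = associations(cooccurrence_counts(sentences, key_terms))
result = candidate_hypotheses(assoc)
print(result)
```

On this toy input, the only candidate is the pair of terms that appear in different sentences, linked through the term they both co-occur with. A realistic pipeline would need proper tokenization, term normalization, and a statistical significance measure in place of the raw count threshold.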

Contributors
  • University of Colorado Denver
  • Virginia Commonwealth University
  • University of Colorado Denver
