ABSTRACT
This paper presents a transductive approach to learning ranking functions for extractive multi-document summarization. In the first stage, the proposed approach identifies topic themes within a document collection; these themes help identify two sets of sentences, one relevant and one irrelevant to a question. It then iteratively trains a ranking function over these two sets by optimizing a ranking loss and fitting a prior model built on keywords. The output of the function is used to find further relevant and irrelevant sentences, and the process repeats until a stopping criterion is met.
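The iterative procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the model, the loss, or the stopping rule, so the sketch assumes a linear scoring function, a pairwise hinge ranking loss, a squared-error term that pulls scores toward a keyword prior, and a stopping rule of "no unlabeled sentences left or a round limit reached". The names `train_ranker`, `transductive_rank`, `lam`, `k`, and `max_rounds` are hypothetical, and the feature vectors and prior scores are toy values.

```python
import random


def score(w, x):
    """Linear score of one sentence feature vector (assumed model)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_ranker(X, relevant, irrelevant, prior, lam=0.1, lr=0.1, epochs=50, seed=0):
    """Fit w with a pairwise hinge ranking loss plus a pull toward the prior.

    Assumption: prior[i] is a keyword-based relevance score in [0, 1].
    """
    rng = random.Random(seed)
    dim = len(X[0])
    w = [0.0] * dim
    pairs = [(i, j) for i in sorted(relevant) for j in sorted(irrelevant)]
    labeled = sorted(relevant) + sorted(irrelevant)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for i, j in pairs:
            # Ranking loss: a relevant sentence should outscore an
            # irrelevant one by a margin of 1; update only on violations.
            if score(w, X[i]) - score(w, X[j]) < 1.0:
                for d in range(dim):
                    w[d] += lr * (X[i][d] - X[j][d])
        for i in labeled:
            # Prior fit: gradient step on lam * (score - prior)^2 / 2.
            err = score(w, X[i]) - prior[i]
            for d in range(dim):
                w[d] -= lr * lam * err * X[i][d]
    return w


def transductive_rank(X, relevant, irrelevant, prior, k=1, max_rounds=10):
    """Alternate between training and labeling the most confident sentences."""
    relevant, irrelevant = set(relevant), set(irrelevant)
    w = train_ranker(X, relevant, irrelevant, prior)
    for _ in range(max_rounds):
        unlabeled = [i for i in range(len(X))
                     if i not in relevant and i not in irrelevant]
        if len(unlabeled) < 2 * k:  # assumed stopping criterion
            break
        ranked = sorted(unlabeled, key=lambda i: score(w, X[i]), reverse=True)
        relevant.update(ranked[:k])     # top-scored become relevant
        irrelevant.update(ranked[-k:])  # bottom-scored become irrelevant
        w = train_ranker(X, relevant, irrelevant, prior)
    return w, relevant, irrelevant


# Toy data: feature 0 loosely tracks relevance, feature 1 the opposite.
X = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9], [0.0, 1.0]]
prior = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
w, rel, irr = transductive_rank(X, relevant={0}, irrelevant={5}, prior=prior)
print(sorted(rel), sorted(irr))
```

In this sketch the two seed sets stand in for the sentences identified from topic themes, and each round widens them with the highest- and lowest-scored unlabeled sentences before retraining.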
Index Terms
- Incorporating prior knowledge into a transductive ranking algorithm for multi-document summarization