An analysis of a high-performance Japanese question answering system

Published: 01 September 2005

Abstract

Twenty-five Japanese question answering (QA) systems participated in NTCIR QAC2 subtask 1. Of these, our system, SAIQA-QAC2, performed best, with MRR = 0.607. SAIQA-QAC2 improves on our previous system, SAIQA-Ii, which achieved MRR = 0.46 on QAC1. We mainly improved the answer-type determination module and the retrieval module. In general, a fine-grained answer taxonomy improves QA performance, but it is difficult to build an accurate answer extraction module for such a taxonomy because machine learning methods require a huge training corpus and hand-crafted rules are hard to maintain. Therefore, we built a fine-grained system by using a coarse-grained named entity recognizer and the Japanese lexicon “Nihongo Goi-taikei.” Our experiments show that named entity/numerical expression recognition and word sense-based answer extraction were the main contributors to the performance. In addition, we developed a new proximity-based document retrieval module that performs better than BM25. We also compared its performance with MultiText, a conventional proximity-based retrieval method developed for QA.
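
For readers unfamiliar with the metric, the MRR (mean reciprocal rank) figures quoted above can be illustrated with a minimal sketch. It assumes the usual definition: each question scores the reciprocal of the rank of its first correct answer (0 if no returned answer is correct), and MRR is the average over all questions. The function names and the toy data are hypothetical and are not taken from the paper or from the NTCIR evaluation tools.

```python
from typing import Iterable, Sequence, Set, Tuple


def reciprocal_rank(ranked_answers: Sequence[str], gold: Set[str]) -> float:
    """Return 1/rank of the first correct answer, or 0.0 if none is correct."""
    for rank, answer in enumerate(ranked_answers, start=1):
        if answer in gold:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average the reciprocal rank over all questions.

    Each item in `runs` pairs a system's ranked answer list with the set of
    acceptable answers for one question (illustrative data layout only).
    """
    scores = [reciprocal_rank(ranked, gold) for ranked, gold in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Toy example: correct at rank 1, correct at rank 2, and no correct answer.
toy_runs = [
    (["Tokyo", "Osaka"], {"Tokyo"}),
    (["1998", "2001"], {"2001"}),
    (["NTT"], {"Toshiba"}),
]
toy_mrr = mean_reciprocal_rank(toy_runs)
```

Under this definition, a system that ranks the correct answer first for every question scores 1.0, and one that always ranks it second scores 0.5.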

Published in

ACM Transactions on Asian Language Information Processing, Volume 4, Issue 3
September 2005
138 pages
ISSN: 1530-0226
EISSN: 1558-3430
DOI: 10.1145/1111667

Copyright © 2005 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
