DOI: 10.1145/1164820.1164829

Towards genre classification for IR in the workplace

Published: 18 October 2006

ABSTRACT

Use of document genre in information retrieval systems has the potential to improve the task-appropriateness of results. However, genre classification remains a challenging problem. We describe a case study of genre classification in a software engineering workplace domain, which includes the development of a genre taxonomy and experiments in automatic genre classification using supervised machine learning. We present results based on evaluation using real-life enterprise data from this work domain.

    Published in

      ACM Other conferences
      IIiX: Proceedings of the 1st international conference on Information interaction in context
      October 2006
      187 pages
      ISBN: 1595934820
      DOI: 10.1145/1164820
      Program Chair: Ian Ruthven

      Copyright © 2006 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 21 of 45 submissions, 47%
