ABSTRACT
Use of document genre in information retrieval systems has the potential to improve the task-appropriateness of results. However, genre classification remains a challenging problem. We describe a case study of genre classification in a software engineering workplace domain, which includes the development of a genre taxonomy and experiments in automatic genre classification using supervised machine learning. We present results based on evaluation using real-life enterprise data from this work domain.