DOI: 10.1145/1963192.1963249
poster

Comparative study of clustering techniques for short text documents

Published: 28 March 2011

ABSTRACT

We compare several document clustering techniques, including K-means, an SVD-based method, and a graph-based approach, and evaluate their performance on short text data collected from Twitter. We define a measure for evaluating the cluster error of these techniques. Our observations show that the graph-based approach using affinity propagation performs best at clustering short text data, with minimal cluster error.
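
To make the setting concrete, here is a minimal sketch of one of the compared techniques, K-means, applied to a handful of tweet-length texts. It is an illustration only, not the authors' implementation: the abstract does not specify the feature representation, similarity function, or seeding, so the term-frequency vectors, cosine similarity, and farthest-first seeding used here are assumptions, as are the helper names (`vectorize`, `cosine`, `centroid`, `kmeans`) and the toy texts. The paper's cluster error measure is not described in the abstract and is not reproduced.

```python
import math
from collections import Counter


def vectorize(text):
    """Return a unit-length term-frequency vector as a sparse dict (assumed representation)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {term: c / norm for term, c in counts.items()}


def cosine(u, v):
    """Dot product of two sparse vectors; equals cosine similarity for unit-length inputs."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def centroid(vectors):
    """Mean direction of a non-empty list of sparse vectors, rescaled to unit length."""
    total = Counter()
    for vec in vectors:
        for t, w in vec.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values())) or 1.0
    return {t: w / norm for t, w in total.items()}


def kmeans(texts, k, max_iter=20):
    """Cluster short texts with K-means under cosine similarity; returns one label per text."""
    vectors = [vectorize(t) for t in texts]
    # Farthest-first seeding: a deterministic, illustrative choice (not from the paper).
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(min(vectors, key=lambda v: max(cosine(v, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each text joins its most similar center.
        new_labels = [max(range(k), key=lambda j: cosine(v, centers[j])) for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: recompute each center from its members (keep old center if empty).
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centers[j] = centroid(members)
    return labels


# Hypothetical toy data standing in for tweets.
docs = [
    "cheap flights to paris",
    "paris flights deals",
    "cheap paris hotel deals",
    "football match tonight",
    "watch football tonight",
    "football match score",
]
labels = kmeans(docs, k=2)
print(labels)  # → [0, 0, 0, 1, 1, 1]
```

The toy texts are chosen so that the two topics share no vocabulary, which is exactly the sparsity that makes short text hard in practice: two texts on the same topic that share no words have zero similarity under this representation.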

      • Published in

        WWW '11: Proceedings of the 20th international conference companion on World wide web
        March 2011
        552 pages
        ISBN:9781450306379
        DOI:10.1145/1963192

        Copyright © 2011 Authors

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
