skip to main content
article
Free Access

Classification in Networked Data: A Toolkit and a Univariate Case Study

Published:01 May 2007Publication History
Skip Abstract Section

Abstract

This paper is about classifying entities that are interlinked with entities for which the class is known. After surveying prior work, we present NetKit, a modular toolkit for classification in networked data, and a case-study of its application to networked data used in prior machine learning research. NetKit is based on a node-centric framework in which classifiers comprise a local classifier, a relational classifier, and a collective inference procedure. Various existing node-centric relational learning algorithms can be instantiated with appropriate choices for these components, and new combinations of components realize new algorithms. The case study focuses on univariate network classification, for which the only information used is the structure of class linkage in the network (i.e., only links and some class labels). To our knowledge, no work previously has evaluated systematically the power of class-linkage alone for classification in machine learning benchmark data sets. The results demonstrate that very simple network-classification models perform quite well---well enough that they should be used regularly as baseline classifiers for studies of learning with networked data. The simplest method (which performs remarkably well) highlights the close correspondence between several existing methods introduced for different purposes---that is, Gaussian-field classifiers, Hopfield networks, and relational-neighbor classifiers. The case study also shows that there are two sets of techniques that are preferable in different situations, namely when few versus many labels are known initially. We also demonstrate that link selection plays an important role similar to traditional feature selection.

References

  1. J. C. Almack. The influence of intelligence on the selection of associates. School and Society, 16: 529-530, 1922.Google ScholarGoogle Scholar
  2. A. Bernstein, S. Clearwater, and F. Provost. The relational vector-space model and industry classification. In Proceedings of the Learning Statistical Models from Relational Data Workshop at the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI), 2003.Google ScholarGoogle Scholar
  3. J. Besag. Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society, 36(2):192-236, 1974.Google ScholarGoogle Scholar
  4. J. Besag. Statistical analysis of non-lattice data. The Statistician, 24(3):179-195, 1975.Google ScholarGoogle ScholarCross RefCross Ref
  5. J. Besag. On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society, 48(3): 259-302, 1986.Google ScholarGoogle Scholar
  6. P. M. Blau. Inequality and Heterogeneity: A Primitive Theory of Social Structure. New York: Free Press, 1977.Google ScholarGoogle Scholar
  7. A. Blum and S. Chawla. Learning from labeled and unlabeled data using graph mincuts. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML), pages 19-26, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. A. Blum, J. Lafferty, R. Reddy, and M. R. Rwebangira. Semi-supervised learning using randomized mincuts. In Proceedings of the Twentyfirst International Conference on Machine Learning (ICML), pages 97-104, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. H. Bott. Observation of play activities in a nursery school. Genetic Psychology Monographs, 4: 44-88, 1928.Google ScholarGoogle Scholar
  10. Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 23(11):1222-1239, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Chakrabarti, B. Dom, and P. Indyk. Enhanced hypertext categorization using hyperlinks. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, pages 307-319, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. C. Cortes, D. Pregibon, and C. T. Volinsky. Communities of interest. In Proceedings of the Fourth International Conference on Advances in Intelligent Data Analysis (IDA), pages 105-114, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. R. G. Cowell, A. P. Dawid, S. L. Lauritzen, and D. J. Spiegelhalter. Probabilistic networks and expert systems. Springer, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. M. Craven, D. Freitag, A. McCallum, T. Mitchell, K. Nigam, and C. Y. Quek. Learning to extract symbolic knowledge from the World Wide Web. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI), pages 509-516, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. L. De Raedt, H. Blockeel, L. Dehaspe, and W. Van Laer. Three companions for data mining in first order logic. In S. Dzeroski and N. Lavrac, editors, Relational Data Mining, pages 105-139. Berlin; New York: Springer, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. I. S. Dhillon. Co-clustering documents and words using bipartite spectral graph partitioning. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 269-274, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. R. L. Dobrushin. The description of a random field by means of conditional probabilities and conditions of its regularity. Theory of Probability and its Applications, 13(2):197-224, 1968.Google ScholarGoogle ScholarCross RefCross Ref
  18. P. Domingos and M. Richardson. Mining the network value of customers. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 57-66, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. S. Dzeroski and N. Lavrac. Relational Data Mining. Berlin; New York: Springer, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. T. Fawcett and F. Provost. Adaptive fraud detection. Data Mining and Knowledge Discovery, 3: 291-316, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. T. Fawcett and F. Provost. Activity monitoring: Noticing interesting changes in behavior. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 53-62, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. P. A. Flach and N. Lachiche. Naive Bayesian classification of structured data. Machine Learning, 57:233-269, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. N. Friedman, L. Getoor, D. Koller, and A. Pfeffer. Learning probabilistic relational models. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI), pages 1300-1309, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. A. Galstyan and P. Cohen. Is guilt by association a bad thing? In Proceedings of the First International Conference on Intelligence Analysis (IA), 2005.Google ScholarGoogle Scholar
  25. S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 6:721-741, 1984.Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. W. R. Gilks, S. Richardson, and D. J. Spiegelhalter. Markov Chain Monte Carlo in Practice. Chapman & Hall/CRC, 1995.Google ScholarGoogle ScholarCross RefCross Ref
  27. D. Greig, B. Porteous, and A. Seheult. Exact maximum a posteriori estimation for binary images. Journal of the Royal Statistical Society, 51(2):271-279, 1989.Google ScholarGoogle Scholar
  28. D. D. Heckathorn and J. Jeffri. Jazz networks: Using respondent-driven sampling to study stratification in two jazz musician communities. Unpublished paper presented at American Sociological Association Annual Meeting, August 2003.Google ScholarGoogle Scholar
  29. D. Heckerman, D. M. Chickering, C. Meek, R. Rounthwaite, and C. Kadie. Dependency networks for inference, collaborative filtering, and data visualization. Journal of Machine Learning Research (JMLR), 1:49-75, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. S. Hill, F. Provost, and C. Volinsky. Network-based marketing: Identifying likely adopters via consumer networks. Statistical Science, 22(2):256-276, 2006a.Google ScholarGoogle ScholarCross RefCross Ref
  31. S. Hill, D. K. Agarwal, R. Bell, and C. Volinsky. Building an effective representation for dynamic networks. Journal of Computational & Graphical Statistics, 15(3):584-608, 2006b.Google ScholarGoogle ScholarCross RefCross Ref
  32. G. E. Hinton and T. J. Sejnowski. Learning and relearning in Boltzmann machines. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, 1: Foundations:282-317, 1986. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. J. J. Hopfield. Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences of the United States of America, 79(8): 2554-2558, 1982.Google ScholarGoogle Scholar
  34. Z. Huang, H. Chen, and D. Zeng. Applying associative retrieval techniques to alleviate the sparsity problem in collaborative filtering. ACM Transactions on Information Systems (TOIS), 22(1): 116-142, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. R. A. Hummel and S. W. Zucker. On the foundations of relaxation labeling processes. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 5(3):267-287, 1983.Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. E. Ising. Beitrag zur Theorie des Ferromagnetismus. Zeitschrift f. Physik, 31:253-258, 1925. {German}.Google ScholarGoogle ScholarCross RefCross Ref
  37. D. Jensen and J. Neville. Data mining in social networks. In National Academy of Sciences Symposium on Dynamic Social Network Modeling and Analysis, 2002a.Google ScholarGoogle Scholar
  38. D. Jensen and J. Neville. Linkage and autocorrelation cause feature selection bias in relational learning. In Proceedings of the Nineteenth International Conference on Machine Learning (ICML), pages 259-266, 2002b. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. D. Jensen, J. Neville, and B. Gallagher. Why collective inference improves relational classification. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 593-598, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. T. Joachims. Transductive learning via spectral graph partitioning. In Proceedings of the Twentieth International Conference on Machine Learning (ICML), pages 290-297, 2003.Google ScholarGoogle Scholar
  41. J. Kleinberg and É. Tardos. Approximation algorithms for classification problems with pairwise relations: Metric labeling and Markov random fields. In Proceedings of the Fortieth Annual Symposium on Foundations of Computer Science (FOCS), pages 14-23, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. R. Kohavi and G. John. Wrappers for feature subset selection. Artificial Intelligence, 97(1-2): 273-324, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. D. Koller and A. Pfeffer. Probabilistic frame-based systems. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI), pages 580-587, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. S. Kramer, N. Lavrac, and P. Flach. Propositionalization approaches to relational data mining. In S. Dzeroski and N. Lavrac, editors, Relational Data Mining, pages 262-291. Berlin; New York: Springer, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML), pages 282-289, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. P. Langley. Crafting papers on machine learning. In Proceedings of the Seventeenth International Conference on Machine Learning (ICML), pages 1207-1212, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. P. Lazarsfeld and R. K. Merton. Friendship as a social process: A substantive and methodological analysis. In M. Berger, T. Abel, and C. H. Page, editors, Freedom and Control in Modern Society, pages 18-66. Van Nostrand, 1954.Google ScholarGoogle Scholar
  48. C. P. Loomis. Political and occupational cleavages in a Hanoverian village. Sociometry, 9:316- 3333, 1946.Google ScholarGoogle ScholarCross RefCross Ref
  49. Q. Lu and L. Getoor. Link-based classification. In Proceedings of the Twentieth International Conference on Machine Learning (ICML), pages 496-503, 2003.Google ScholarGoogle Scholar
  50. S. A. Macskassy and F. Provost. A simple relational classifier. In Proceedings of the Multi-Relational Data Mining Workshop (MRDM) at the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2003.Google ScholarGoogle ScholarCross RefCross Ref
  51. S. A. Macskassy and F. Provost. Suspicion scoring based on guilt-by-association, collective inference, and focused data access. In Proceedings of the First International Conference on Intelligence Analysis (IA), 2005.Google ScholarGoogle Scholar
  52. A. McCallum, K. Nigam, J. Rennie, and K. Seymore. Automating the construction of internet portals with machine learning. Information Retrieval, 3(2):127-163, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. M. McPherson, L. Smith-Lovin, and J. M. Cook. Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27:415-444, 2001.Google ScholarGoogle Scholar
  54. K. Murphy, Y. Weiss, and M. I. Jordan. Loopy belief-propagation for approximate inference: An empirical study. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI), pages 467-475, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. J. Neville and D. Jensen. Iterative classification in relational data. In Proceedings of the Workshop on Learning Statistical Models from Relational Data at the Seventeenth National Conference on Artificial Intelligence (AAAI), pages 13-20, 2000.Google ScholarGoogle Scholar
  56. J. Neville and D. Jensen. Collective classification with relational dependency networks. In Proceedings of the Multi-Relational Data Mining Workshop (MRDM) at the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2003.Google ScholarGoogle Scholar
  57. J. Neville and D. Jensen. Dependency networks for relational data. In Proceedings of the Fourth IEEE International Conference in Data Mining (ICDM), pages 170-177, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. J. Neville and D. Jensen. Leveraging relational autocorrelation with latent group models. In Proceedings of the Fifth IEEE International Conference in Data Mining (ICDM), pages 322-329, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  59. J. Neville and D. Jensen. Relational dependency networks. Journal of Machine Learning Research (JMLR), 8(Mar):653-692, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  60. J. Neville, D. Jensen, L. Friedland, and M. Hay. Learning relational probability trees. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 625-630, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  61. J. Neville, Ö. Simsek, D. Jensen, J. Komoroske, K. Palmer, and H. Goldberg. Using relational knowledge discovery to prevent securities fraud. In Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 449-458, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  62. M. E. J. Newman. Mixing patterns in networks. Physical Review E, 67, 2003. 026126.Google ScholarGoogle Scholar
  63. J. Pearl. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  64. C. Perlich. Citation-based document classification. In Workshop on Information Technology and Systems (WITS), 2003.Google ScholarGoogle Scholar
  65. C. Perlich and F. Provost. Aggregation-based feature invention and relational concept classes. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 167-176, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  66. C. Perlich and F. Provost. Distribution-based aggregation for relational learning with identifier attributes. Machine Learning, 62(1/2):65-105, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  67. C. Perlich, F. Provost, and J. Simonoff. Tree induction vs. logistic regression: A learning-curve analysis. Journal of Machine Learning Research (JMLR), 4:211-255, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  68. A. Popescul and L. H. Ungar. Statistical relational learning for link prediction. In Proceedings of the Learning Statistical Models from Relational Data Workshop at the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI), 2003.Google ScholarGoogle Scholar
  69. R. B. Potts. Some generalized order-disorder transformations. Cambridge Philosophic Society, 48: 106-109, 1952.Google ScholarGoogle ScholarCross RefCross Ref
  70. H. M. Richardson. Community of values as a factor in friendships of college and adult women. Journal of Social Psychology, 11:303-312, 1940.Google ScholarGoogle ScholarCross RefCross Ref
  71. M. Richardson and P. Domingos. Markov logic networks. Machine Learning, 62(1/2):107-136, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  72. J. Rocchio. Relevance feedback in information retrieval. In G. Salton, editor, The SMART Retrieval System: Experiments in Automatic Document Processing, chapter 14, pages 313-323. Prentice-Hall, 1971.Google ScholarGoogle Scholar
  73. A. Rosenfeld, R. Hummel, and S. Zucker. Scene labeling by relaxation operations. IEEE Transactions on Systems, Man and Cybernetics, 6:420-433, 1976.Google ScholarGoogle ScholarCross RefCross Ref
  74. L. J. Savage. The Foundations of Statistics. John Wiley and Sons, 1954.Google ScholarGoogle Scholar
  75. E. Segal, H. Wang, and D. Koller. Discovering molecular pathways from protein interaction and gene expression data. Bioinformatics, 19:I264-I272, Jul 2003a.Google ScholarGoogle ScholarCross RefCross Ref
  76. E. Segal, R. Yelensky, and D. Koller. Genome-wide discovery of transcriptional modules from DNA sequence and gene expression. Bioinformatics, 19:I273-I282, Jul 2003b.Google ScholarGoogle ScholarCross RefCross Ref
  77. B. Taskar, P. Abbeel, and D. Koller. Discriminative probabilistic models for relational data. In Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI), pages 485-492, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  78. B. Taskar, E. Segal, and D. Koller. Probabilistic classification and clustering in relational data. In Proceedings of the Seventeenthth International Joint Conference on Artificial Intelligence (IJCAI) , pages 870-878, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  79. B. Taskar, V. Chatalbashev, and D. Koller. Learning associative Markov networks. In Proceedings of the Twentyfirst International Conference on Machine Learning (ICML), 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  80. K. Tumulty. Inside Bush's Secret Spy Net {electronic version}. Time, 167(21), 2006. Retrieved May 25, 2006, from http://www.time.com/time/archive/preview/0,10987,1194021, 00.html.Google ScholarGoogle Scholar
  81. V. N. Vapnik. The support vector method of function estimation. In J. Suykens and J. Vandewalle, editors, Nonlinear Modeling: Advanced Black-Box Techniques, pages 55-86. Kluwer, Boston, 1998a.Google ScholarGoogle Scholar
  82. V. N. Vapnik. Statistical Learning Theory. John Wiley, NY, 1998b.Google ScholarGoogle Scholar
  83. M. J. Wainwright and M. I. Jordan. Graphical models, exponential families, and variational inference. Technical Report 649, University of California, Berkeley, 2003.Google ScholarGoogle Scholar
  84. G. Winkler. Image Analysis, Random Fields and Markov Chain Monte Carlo Methods. Springer-Verlag, 2nd edition, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  85. I. H. Witten and E. Frank. Data Mining: Practical Machine Learning Tools with Java Implementations . Morgan Kaufmann, San Francisco, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  86. X. Zhu, Z. Ghahramani, and J. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. In Proceedings of the Twentieth International Conference on Machine Learning (ICML), pages 912-919, 2003.Google ScholarGoogle Scholar

Index Terms

  1. Classification in Networked Data: A Toolkit and a Univariate Case Study

            Recommendations

            Comments

            Login options

            Check if you have access through your login credentials or your institution to get full access on this article.

            Sign in

            Full Access

            • Published in

              cover image The Journal of Machine Learning Research
              The Journal of Machine Learning Research  Volume 8, Issue
              5/1/2007
              1150 pages
              ISSN:1532-4435
              EISSN:1533-7928
              Issue’s Table of Contents

              Publisher

              JMLR.org

              Publication History

              • Published: 1 May 2007
              Published in jmlr Volume 8, Issue

              Qualifiers

              • article

            PDF Format

            View or Download as a PDF file.

            PDF

            eReader

            View online with eReader.

            eReader