DOI: 10.1145/2488388.2488503
research-article

The self-feeding process: a unifying model for communication dynamics in the web

Published: 13 May 2013

ABSTRACT

How often do individuals perform a given communication activity on the Web, such as posting comments on blogs or news? Could we have a generative model to create communication events with realistic inter-event time distributions (IEDs)? Which properties should we strive to match? Current literature has seemingly contradictory results for IEDs: some studies claim good fits with power laws; others with non-homogeneous Poisson processes. Given these two approaches, we ask: which is the correct one? Can we reconcile them? We show here that, surprisingly, both approaches are correct, being corner cases of the proposed Self-Feeding Process (SFP). We show that the SFP (a) exhibits a unifying power, generating power-law tails (including the so-called "top-concavity" that real data exhibits) as well as short-term Poisson behavior; (b) avoids the "i.i.d. fallacy", which none of the prevailing models has addressed before; and (c) is extremely parsimonious, usually requiring only one parameter and, in general, at most two. Experiments conducted on eight large, diverse real datasets (e.g., YouTube and blog comments, e-mails, SMSs, etc.) reveal that the SFP mimics their properties very well.
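The abstract does not spell out the SFP update rule, so the following is only a minimal sketch of what a one-parameter, non-i.i.d. generator of inter-event times could look like. It assumes that each gap is drawn from an exponential distribution whose mean is the previous gap plus a constant `mu / e`; the function names, the starting gap, and this exact rule are assumptions for illustration, not a statement of the authors' implementation. A plain Poisson generator is included for comparison.

```python
import math
import random


def sfp_gaps(n, mu, rng=None):
    """Generate n inter-event times from a self-feeding-style process.

    ASSUMPTION (not stated in the abstract): each gap is exponentially
    distributed with mean (previous gap + mu / e), starting from gap = mu.
    """
    rng = rng or random.Random()
    gaps = []
    prev = mu  # assumed initial gap
    for _ in range(n):
        mean = prev + mu / math.e
        # expovariate takes a rate, i.e. 1 / mean
        prev = rng.expovariate(1.0 / mean)
        gaps.append(prev)
    return gaps


def poisson_gaps(n, mu, rng=None):
    """Baseline: i.i.d. exponential gaps with mean mu (homogeneous Poisson)."""
    rng = rng or random.Random()
    return [rng.expovariate(1.0 / mu) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    sfp = sorted(sfp_gaps(10_000, mu=1.0, rng=rng))
    poi = sorted(poisson_gaps(10_000, mu=1.0, rng=rng))
    # Compare the upper tail of the two samples.
    print("SFP     99th percentile gap:", sfp[int(0.99 * len(sfp))])
    print("Poisson 99th percentile gap:", poi[int(0.99 * len(poi))])
```

In this sketch each gap depends on its predecessor, so consecutive inter-event times are not independent. That dependence is the property the abstract refers to when it says the model avoids the "i.i.d. fallacy", while a single parameter (`mu`) keeps the generator parsimonious.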

Published in

WWW '13: Proceedings of the 22nd international conference on World Wide Web
May 2013
1628 pages
ISBN: 9781450320351
DOI: 10.1145/2488388

Copyright © 2013 Copyright is held by the International World Wide Web Conference Committee (IW3C2).

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

WWW '13 paper acceptance rate: 125 of 831 submissions, 15%
Overall acceptance rate: 1,899 of 8,196 submissions, 23%
