skip to main content
research-article

Efficient Multiview Maintenance under Insertion in Huge Social Networks

Published:01 March 2014Publication History
Skip Abstract Section

Abstract

Applications to monitor various aspects of social networks are becoming increasingly popular. For instance, marketers want to look for semantic patterns relating to the content of tweets and Facebook posts relating to their products. Law enforcement agencies want to track behaviors involving potential criminals on the Internet by looking for certain patterns of behavior. Music companies want to track patterns of spread of illegal music. These applications allow multiple users to specify patterns of interest and monitor them in real time as new data gets added to the Web or to a social network. In this article we develop the concept of social network view servers in which all of these types of applications can be simultaneously monitored. The patterns of interest are expressed as views over an underlying graph or social network database. We show that a given set of views can be compiled in multiple possible ways to take advantage of common substructures and define the concept of an optimal merge. Though finding an optimal merge is shown to be NP-hard, we develop the AddView to find very good merges quickly. We develop a very fast MultiView algorithm that scalably and efficiently maintains multiple subgraph views when insertions are made to the social network database. We show that our algorithm is correct, study its complexity, and experimentally demonstrate that our algorithm can scalably handle updates to hundreds of views on 6 real-world social network databases with up to 540M edges.

References

  1. Daniel J. Abadi, Adam Marcus, Samuel Madden, and Katherine J. Hollenbach. 2007. Scalable semantic web data management using vertical partitioning. In Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB’07). C. Koch, J. Gehrke, M. N. Garofalakis, D. Srivastava, K. Aberer, A. Deshpande, D. Florescu, C. Y. Chan, V. Ganti, C.-C. Kanne, W. Klas, and E. J. Neuhold Eds., ACM Press, New York, 411--422. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Matthias Brocheler, Andrea Pugliese, and Venkatramanan S. Subrahmanian. 2009. DOGMA: A disk-oriented graph matching algorithm for rdf databases. In Proceedings of the International Semantic Web Conference (ISWC’09). A. Bernstein, D. R. Karger, T. Heath, L. Feigenbaum, D. Maynard, E. Motta, and K. Thirunarayan Eds., Lecture Notes in Computer Science, vol. 5823, Springer, 97--113. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Matthias Brocheler, Andrea Pugliese, and Venkatramanan S. Subrahmanian. 2010. COSI: Cloud oriented subgraph identification in massive social networks. In Proceedings of the International Conference on Advances in Social Networks Analysis and Mining (ASONAM’10). N. Memon and R. Alhajj Eds., IEEE Computer Society, 248--255. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Matthias Brocheler, Andrea Pugliese, and Venkatramanan S. Subrahmanian. 2011. A budget-based algorithm for efficient subgraph matching on huge networks. In Proceedings of the 27th IEEE International Conference on Data Engineering Workshops (ICDEWorkshops’11). S. Abiteboul, K. Bohm, C. Koch, and K.-L. Tan Eds., IEEE, 94--99. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Jeen Broekstra, Arjohn Kampman, and Frank van Harmelen. 2003. Sesame: An architecture for storing and querying rdf data and schema information. In Spinning the Semantic Web, D. Fensel, J. A. Hendler, H. Lieberman, and W. Wahlster Eds., MIT Press, 197--222. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Lei Chen and Changliang Wang. 2010. Continuous subgraph pattern search over certain and uncertain graph streams. IEEE Trans. Knowl. Data Eng. 22, 8, 1093--1109. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. James Cheng, Yiping Ke, and Wilfred Ng. 2009. Efficient query processing on graph databases. ACM Trans. Database Syst. 34, 1. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Jiefeng Cheng, Jeffrey Xu Yu, Bolin Ding, Philip S. Yu, and Haixun Wang. 2008. Fast graph pattern matching. In Proceedings of the 24th IEEE International Conference on Data Engineering (ICDE’08). G. Alonso, J. A. Blakeley, and A. L. P. Chen Eds., IEEE, 913--922. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Michael R. Garey and David S. Johnson. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Johannes Gehrke, Paul Ginsparg, and Jon M. Kleinberg. 2003. Overview of the 2003 KDD Cup. SIGKDD Explorations 5, 2, 149--151. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Rosalba Giugno and Dennis Shasha. 2002. GraphGrep: A fast and universal method for querying graphs. In Proceedings of the 16th International Conference on Pattern Recognition (ICPR’02). 112--115.Google ScholarGoogle ScholarCross RefCross Ref
  12. Roy Goldman, Jason McHugh, and Jennifer Widom. 1999. From semistructured data to xml: Migrating the lore data model and query language. In Proceedings of the ACM SIGMOD Workshop on the Web and Databases (WebDB’99). 25--30.Google ScholarGoogle Scholar
  13. Ashish Gupta, Inderpal Singh Mumick, and Venkatramanan S. Subrahmanian. 1993. Maintaining views incrementally. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’93). P. Buneman and S. Jajodia Eds., ACM Press, New York, 157--166. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Edward Hung, Yu Deng, and Venkatramanan S. Subrahmanian. 2005. RDF aggregate queries and views. In Proceedings of the 21st International Conference on Data Engineering (ICDE’05). K. Aberer, M. J. Franklin, and S. Nishio Eds., IEEE Computer Society, 717--728. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Yiping Ke, James Cheng, and Jeffrey Xu Yu. 2010. Querying large graph databases. In Proceedings of the 15th International Conference on Database Systems for Advanced Applications (DASFAA’10). H. Kitagawa, Y. Ishikawa, Q. Li, and C. Watanabe Eds., Lecture Notes in Computer Science, vol. 5982, Springer, 487--488. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Atanas Kiryakov, Damyan Ognyanov, and Dimitar Manov. 2005. OWLIM - A pragmatic semantic repository for owl. In Proceedings of the International Conference on Web Information Systems Engineering (WISE’05). M. Dean, Y. Guo, W. Jun, R. Kaschek, S. Krishnaswamy, Z. Pan, and Q. Z. Sheng Eds., Lecture Notes in Computer Science, vol. 3807, Springer, 182--192. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Bryan Klimt and Yiming Yang. 2004. The Enron Corpus: A new dataset for email classification research. In Proceedings of the 15th European Conference on Machine Learning (ECML’04). Jean-François Boulicaut, Floriana Esposito, Fosca Giannotti, and Dino Pedreschi Eds., Lecture Notes in Computer Science, vol. 3201, Springer, 217--226.Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Geetha Manjunath, R. Badrinath, Craig Sayers, and K. S. Venugopal. 2008. Temporal views over rdf data. In Proceedings of the 17th International Conference on World Wide Web (WWW’08). 1131--1132. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Alan Mislove, Massimiliano Marcon, P. Krishna Gummadi, Peter Druschel, and Bobby Bhattacharjee. 2007. Measurement and analysis of online social networks. In Proceedings of the Internet Measurement Conference. Constantine Dovrolis and Matthew Roughan Eds., ACM, 29--42. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Thomas Neumann and Gerhard Weikum. 2009. Scalable join processing on very large rdf graphs. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’09). U. Cetintemel, S. B. Zdonik, D. Kossmann, and N. Tatbul Eds., ACM Press, New York, 627--640. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Thomas Neumann and Gerhard Weikum. 2010. The rdf-3x engine for scalable management of rdf data. VLDB J. 19, 1, 91--113. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Makoto Onizuka, Fong Yee Chan, Ryusuke Michigami, and Takashi Honishi. 2005. Incremental maintenance for materialized xpath/xslt views. In Proceedings of the 14th International Conference on World Wide Web (WWW’05). A. Ellis and T. Hagino Eds., ACM Press, New York, 671--681. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Mark Roantree, Colm Noonan, and John Murphy. 2007. Specifying and optimising xml views. In Proceedings of the 24th British National Conference on Databases (BNCOD’07). R. Cooper and J. B. Kennedy Eds., Lecture Notes in Computer Science, vol. 4587, Springer, 138--146. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Prasan Roy, S. Seshadri, S. Sudarshan, and Siddhesh Bhobe. 2000. Efficient and extensible algorithms for multi query optimization. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’00). W. Chen, J. F. Naughton, and P. A. Bernstein Eds., ACM Press, New York, 249--260. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Sherif Sakr. 2009. GraphREL: A decomposition-based and selectivity-aware relational framework for processing sub-graph queries. In Proceedings of the 14th International Conference on Database Systems for Advanced Applications (DASFAA’09). X. Zhou, H. Yokota, K. Deng, and Q. Liu Eds., Lecture Notes in Computer Science, vol. 5463, Springer, 123--137. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Sherif Sakr and Ghazi Al-Naymat. 2009. Relational processing of rdf queries: A survey. SIGMOD Rec. 38, 4, 23--28. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Mauro San Martin and Claudio Gutierrez. 2009. Representing, querying and transforming social networks with rdf/sparql. In Heraklion Proceedings of the 6th European Semantic Web Conference on The Semantic Web: Research and Applications (ESWC’09). L. Aroyo, P. Traverso, F. Ciravegna, P. Cimiano, T. Heath, E. Hyvonen, R. Mizoguchi, E. Oren, M. Sabou, and E. P. B. Simperl Eds., Lecture Notes in Computer Science, vol. 5554, Springer, 293--307. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Timos K. Sellis. 1988. Multiple-query optimization. ACM Trans. Database Syst. 13, 1, 23--52. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Timos K. Sellis and Subrata Ghosh. 1990. On the multiple-query optimization problem. IEEE Trans. Knowl. Data Engin. 2, 2, 262--266. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Sesame2. 2012. http://www.openrdf.org.Google ScholarGoogle Scholar
  31. Michael Sintek and Malte K. 2006. RDFBroker: A signature-based high-performance rdf store. In Proceedings of the Extended Semantic Web Conference (ESWC). Y. Sure and J. Domingue Eds., Lecture Notes in Computer Science, vol. 4011, Springer, 363--377. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Slawek Staworko, Iovka Boneva, and Benoit Groz. 2010. The view update problem for XML. In EDBT/ICDT Workshops (ACM International Conference Proceeding Series). F. Daniel, L. M. L. Delcambre, F. Fotouhi, I. Garrigos, G. Guerrini, J.-N. Mazon, M. Mesiti, S. Muller-Feuerstein, J. Trujillo, T. M. Truta, B. Volz, E. Waller, L. Xiong, and E. Zimanyi Eds., ACM Press, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Markus Stocker, Andy Seaborne, Abraham Bernstein, Christoph Kiefer, and Dave Reynolds. 2008. SPARQL basic graph pattern optimization using selectivity estimation. In Proceedings of the 37th International Conference on World Wide Web (WWW’08). ACM Press, New York. 595--604. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Julian R. Ullmann. 1976. An algorithm for subgraph isomorphism. J. ACM 23, 1, 31--42. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Raphael Volz, Steffen Staab, and Boris Motik. 2003. Incremental maintenance of materialized ontologies. In Proceedings of the OTM Confederated International Conference on the Move to Meaningful Internet Systems (CoopIS/DOA/ODBASE). R. Meersman, Z. Tari, and D. C. Schmidt Eds., Lecture Notes in Computer Science, vol. 2888, Springer, 707--724.Google ScholarGoogle ScholarCross RefCross Ref
  36. Kevin Wilkinson, Craig Sayers, Harumi A. Kuno, and Dave Reynolds. 2003. Efficient rdf storage and retrieval in jena2. In Proceedings of the 1st International Workshop on Semantic Web and Databases (SWDB’03). I. F. Cruz, V. Kashyap, S. Decker, and R. Eckstein Eds., 131--150.Google ScholarGoogle Scholar
  37. Shijie Zhang, Shirong Li, and Jiong Yang. 2009. GADDI: Distance index based subgraph matching in biological networks. In Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology (EDBT’09). M. L. Kersten, B. Novikov, J. Teubner, V. Polutin, and S. Manegold Eds., vol. 360, ACM Press, New York, 192--203. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Shijie Zhang, Shirong Li, and Jiong Yang. 2010. SUMMA: Subgraph matching in massive graphs. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management (CIKM’10). J. Huang, N. Koudas, G. J. F. Jones, X. Wu, K. Collins-Thompson, and A. An Eds., ACM Press, New York, 1285--1288. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Ke Zhu, Ying Zhang, Xuemin Lin, Gaoping Zhu, and Wei Wang. 2010. NOVA: A novel and efficient framework for finding subgraph isomorphism mappings in large graphs. In Proceedings of the 15th International Conference on Database Systems for Advanced Applications (DASFAA’10). H. Kitagawa, Y. Ishikawa, Q. Li, and C. Watanabe Eds., Lecture Notes in Computer Science, vol. 5981, Springer, 140--154. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Yue Zhuge and Hector Garcia-Molina. 1998. Graph structured views and their incremental maintenance. In Proceedings of the 14th International Conference on Data Engineering (ICDE’98). S. D. Urban and E. Bertino Eds., IEEE Computer Society, 116--125. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Efficient Multiview Maintenance under Insertion in Huge Social Networks

                Recommendations

                Comments

                Login options

                Check if you have access through your login credentials or your institution to get full access on this article.

                Sign in

                Full Access

                • Published in

                  cover image ACM Transactions on the Web
                  ACM Transactions on the Web  Volume 8, Issue 2
                  March 2014
                  226 pages
                  ISSN:1559-1131
                  EISSN:1559-114X
                  DOI:10.1145/2600093
                  Issue’s Table of Contents

                  Copyright © 2014 ACM

                  Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                  Publisher

                  Association for Computing Machinery

                  New York, NY, United States

                  Publication History

                  • Published: 1 March 2014
                  • Accepted: 1 October 2013
                  • Revised: 1 July 2013
                  • Received: 1 January 2013
                  Published in tweb Volume 8, Issue 2

                  Permissions

                  Request permissions about this article.

                  Request Permissions

                  Check for updates

                  Qualifiers

                  • research-article
                  • Research
                  • Refereed

                PDF Format

                View or Download as a PDF file.

                PDF

                eReader

                View online with eReader.

                eReader