Abstract
Applications to monitor various aspects of social networks are becoming increasingly popular. For instance, marketers want to look for semantic patterns relating to the content of tweets and Facebook posts relating to their products. Law enforcement agencies want to track behaviors involving potential criminals on the Internet by looking for certain patterns of behavior. Music companies want to track patterns of spread of illegal music. These applications allow multiple users to specify patterns of interest and monitor them in real time as new data gets added to the Web or to a social network. In this article we develop the concept of social network view servers in which all of these types of applications can be simultaneously monitored. The patterns of interest are expressed as views over an underlying graph or social network database. We show that a given set of views can be compiled in multiple possible ways to take advantage of common substructures and define the concept of an optimal merge. Though finding an optimal merge is shown to be NP-hard, we develop the AddView to find very good merges quickly. We develop a very fast MultiView algorithm that scalably and efficiently maintains multiple subgraph views when insertions are made to the social network database. We show that our algorithm is correct, study its complexity, and experimentally demonstrate that our algorithm can scalably handle updates to hundreds of views on 6 real-world social network databases with up to 540M edges.
- Daniel J. Abadi, Adam Marcus, Samuel Madden, and Katherine J. Hollenbach. 2007. Scalable semantic web data management using vertical partitioning. In Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB’07). C. Koch, J. Gehrke, M. N. Garofalakis, D. Srivastava, K. Aberer, A. Deshpande, D. Florescu, C. Y. Chan, V. Ganti, C.-C. Kanne, W. Klas, and E. J. Neuhold Eds., ACM Press, New York, 411--422. Google ScholarDigital Library
- Matthias Brocheler, Andrea Pugliese, and Venkatramanan S. Subrahmanian. 2009. DOGMA: A disk-oriented graph matching algorithm for rdf databases. In Proceedings of the International Semantic Web Conference (ISWC’09). A. Bernstein, D. R. Karger, T. Heath, L. Feigenbaum, D. Maynard, E. Motta, and K. Thirunarayan Eds., Lecture Notes in Computer Science, vol. 5823, Springer, 97--113. Google ScholarDigital Library
- Matthias Brocheler, Andrea Pugliese, and Venkatramanan S. Subrahmanian. 2010. COSI: Cloud oriented subgraph identification in massive social networks. In Proceedings of the International Conference on Advances in Social Networks Analysis and Mining (ASONAM’10). N. Memon and R. Alhajj Eds., IEEE Computer Society, 248--255. Google ScholarDigital Library
- Matthias Brocheler, Andrea Pugliese, and Venkatramanan S. Subrahmanian. 2011. A budget-based algorithm for efficient subgraph matching on huge networks. In Proceedings of the 27th IEEE International Conference on Data Engineering Workshops (ICDEWorkshops’11). S. Abiteboul, K. Bohm, C. Koch, and K.-L. Tan Eds., IEEE, 94--99. Google ScholarDigital Library
- Jeen Broekstra, Arjohn Kampman, and Frank van Harmelen. 2003. Sesame: An architecture for storing and querying rdf data and schema information. In Spinning the Semantic Web, D. Fensel, J. A. Hendler, H. Lieberman, and W. Wahlster Eds., MIT Press, 197--222. Google ScholarDigital Library
- Lei Chen and Changliang Wang. 2010. Continuous subgraph pattern search over certain and uncertain graph streams. IEEE Trans. Knowl. Data Eng. 22, 8, 1093--1109. Google ScholarDigital Library
- James Cheng, Yiping Ke, and Wilfred Ng. 2009. Efficient query processing on graph databases. ACM Trans. Database Syst. 34, 1. Google ScholarDigital Library
- Jiefeng Cheng, Jeffrey Xu Yu, Bolin Ding, Philip S. Yu, and Haixun Wang. 2008. Fast graph pattern matching. In Proceedings of the 24th IEEE International Conference on Data Engineering (ICDE’08). G. Alonso, J. A. Blakeley, and A. L. P. Chen Eds., IEEE, 913--922. Google ScholarDigital Library
- Michael R. Garey and David S. Johnson. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, New York. Google ScholarDigital Library
- Johannes Gehrke, Paul Ginsparg, and Jon M. Kleinberg. 2003. Overview of the 2003 KDD Cup. SIGKDD Explorations 5, 2, 149--151. Google ScholarDigital Library
- Rosalba Giugno and Dennis Shasha. 2002. GraphGrep: A fast and universal method for querying graphs. In Proceedings of the 16th International Conference on Pattern Recognition (ICPR’02). 112--115.Google ScholarCross Ref
- Roy Goldman, Jason McHugh, and Jennifer Widom. 1999. From semistructured data to xml: Migrating the lore data model and query language. In Proceedings of the ACM SIGMOD Workshop on the Web and Databases (WebDB’99). 25--30.Google Scholar
- Ashish Gupta, Inderpal Singh Mumick, and Venkatramanan S. Subrahmanian. 1993. Maintaining views incrementally. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’93). P. Buneman and S. Jajodia Eds., ACM Press, New York, 157--166. Google ScholarDigital Library
- Edward Hung, Yu Deng, and Venkatramanan S. Subrahmanian. 2005. RDF aggregate queries and views. In Proceedings of the 21st International Conference on Data Engineering (ICDE’05). K. Aberer, M. J. Franklin, and S. Nishio Eds., IEEE Computer Society, 717--728. Google ScholarDigital Library
- Yiping Ke, James Cheng, and Jeffrey Xu Yu. 2010. Querying large graph databases. In Proceedings of the 15th International Conference on Database Systems for Advanced Applications (DASFAA’10). H. Kitagawa, Y. Ishikawa, Q. Li, and C. Watanabe Eds., Lecture Notes in Computer Science, vol. 5982, Springer, 487--488. Google ScholarDigital Library
- Atanas Kiryakov, Damyan Ognyanov, and Dimitar Manov. 2005. OWLIM - A pragmatic semantic repository for owl. In Proceedings of the International Conference on Web Information Systems Engineering (WISE’05). M. Dean, Y. Guo, W. Jun, R. Kaschek, S. Krishnaswamy, Z. Pan, and Q. Z. Sheng Eds., Lecture Notes in Computer Science, vol. 3807, Springer, 182--192. Google ScholarDigital Library
- Bryan Klimt and Yiming Yang. 2004. The Enron Corpus: A new dataset for email classification research. In Proceedings of the 15th European Conference on Machine Learning (ECML’04). Jean-François Boulicaut, Floriana Esposito, Fosca Giannotti, and Dino Pedreschi Eds., Lecture Notes in Computer Science, vol. 3201, Springer, 217--226.Google ScholarDigital Library
- Geetha Manjunath, R. Badrinath, Craig Sayers, and K. S. Venugopal. 2008. Temporal views over rdf data. In Proceedings of the 17th International Conference on World Wide Web (WWW’08). 1131--1132. Google ScholarDigital Library
- Alan Mislove, Massimiliano Marcon, P. Krishna Gummadi, Peter Druschel, and Bobby Bhattacharjee. 2007. Measurement and analysis of online social networks. In Proceedings of the Internet Measurement Conference. Constantine Dovrolis and Matthew Roughan Eds., ACM, 29--42. Google ScholarDigital Library
- Thomas Neumann and Gerhard Weikum. 2009. Scalable join processing on very large rdf graphs. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’09). U. Cetintemel, S. B. Zdonik, D. Kossmann, and N. Tatbul Eds., ACM Press, New York, 627--640. Google ScholarDigital Library
- Thomas Neumann and Gerhard Weikum. 2010. The rdf-3x engine for scalable management of rdf data. VLDB J. 19, 1, 91--113. Google ScholarDigital Library
- Makoto Onizuka, Fong Yee Chan, Ryusuke Michigami, and Takashi Honishi. 2005. Incremental maintenance for materialized xpath/xslt views. In Proceedings of the 14th International Conference on World Wide Web (WWW’05). A. Ellis and T. Hagino Eds., ACM Press, New York, 671--681. Google ScholarDigital Library
- Mark Roantree, Colm Noonan, and John Murphy. 2007. Specifying and optimising xml views. In Proceedings of the 24th British National Conference on Databases (BNCOD’07). R. Cooper and J. B. Kennedy Eds., Lecture Notes in Computer Science, vol. 4587, Springer, 138--146. Google ScholarDigital Library
- Prasan Roy, S. Seshadri, S. Sudarshan, and Siddhesh Bhobe. 2000. Efficient and extensible algorithms for multi query optimization. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’00). W. Chen, J. F. Naughton, and P. A. Bernstein Eds., ACM Press, New York, 249--260. Google ScholarDigital Library
- Sherif Sakr. 2009. GraphREL: A decomposition-based and selectivity-aware relational framework for processing sub-graph queries. In Proceedings of the 14th International Conference on Database Systems for Advanced Applications (DASFAA’09). X. Zhou, H. Yokota, K. Deng, and Q. Liu Eds., Lecture Notes in Computer Science, vol. 5463, Springer, 123--137. Google ScholarDigital Library
- Sherif Sakr and Ghazi Al-Naymat. 2009. Relational processing of rdf queries: A survey. SIGMOD Rec. 38, 4, 23--28. Google ScholarDigital Library
- Mauro San Martin and Claudio Gutierrez. 2009. Representing, querying and transforming social networks with rdf/sparql. In Heraklion Proceedings of the 6th European Semantic Web Conference on The Semantic Web: Research and Applications (ESWC’09). L. Aroyo, P. Traverso, F. Ciravegna, P. Cimiano, T. Heath, E. Hyvonen, R. Mizoguchi, E. Oren, M. Sabou, and E. P. B. Simperl Eds., Lecture Notes in Computer Science, vol. 5554, Springer, 293--307. Google ScholarDigital Library
- Timos K. Sellis. 1988. Multiple-query optimization. ACM Trans. Database Syst. 13, 1, 23--52. Google ScholarDigital Library
- Timos K. Sellis and Subrata Ghosh. 1990. On the multiple-query optimization problem. IEEE Trans. Knowl. Data Engin. 2, 2, 262--266. Google ScholarDigital Library
- Sesame2. 2012. http://www.openrdf.org.Google Scholar
- Michael Sintek and Malte K. 2006. RDFBroker: A signature-based high-performance rdf store. In Proceedings of the Extended Semantic Web Conference (ESWC). Y. Sure and J. Domingue Eds., Lecture Notes in Computer Science, vol. 4011, Springer, 363--377. Google ScholarDigital Library
- Slawek Staworko, Iovka Boneva, and Benoit Groz. 2010. The view update problem for XML. In EDBT/ICDT Workshops (ACM International Conference Proceeding Series). F. Daniel, L. M. L. Delcambre, F. Fotouhi, I. Garrigos, G. Guerrini, J.-N. Mazon, M. Mesiti, S. Muller-Feuerstein, J. Trujillo, T. M. Truta, B. Volz, E. Waller, L. Xiong, and E. Zimanyi Eds., ACM Press, New York. Google ScholarDigital Library
- Markus Stocker, Andy Seaborne, Abraham Bernstein, Christoph Kiefer, and Dave Reynolds. 2008. SPARQL basic graph pattern optimization using selectivity estimation. In Proceedings of the 37th International Conference on World Wide Web (WWW’08). ACM Press, New York. 595--604. Google ScholarDigital Library
- Julian R. Ullmann. 1976. An algorithm for subgraph isomorphism. J. ACM 23, 1, 31--42. Google ScholarDigital Library
- Raphael Volz, Steffen Staab, and Boris Motik. 2003. Incremental maintenance of materialized ontologies. In Proceedings of the OTM Confederated International Conference on the Move to Meaningful Internet Systems (CoopIS/DOA/ODBASE). R. Meersman, Z. Tari, and D. C. Schmidt Eds., Lecture Notes in Computer Science, vol. 2888, Springer, 707--724.Google ScholarCross Ref
- Kevin Wilkinson, Craig Sayers, Harumi A. Kuno, and Dave Reynolds. 2003. Efficient rdf storage and retrieval in jena2. In Proceedings of the 1st International Workshop on Semantic Web and Databases (SWDB’03). I. F. Cruz, V. Kashyap, S. Decker, and R. Eckstein Eds., 131--150.Google Scholar
- Shijie Zhang, Shirong Li, and Jiong Yang. 2009. GADDI: Distance index based subgraph matching in biological networks. In Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology (EDBT’09). M. L. Kersten, B. Novikov, J. Teubner, V. Polutin, and S. Manegold Eds., vol. 360, ACM Press, New York, 192--203. Google ScholarDigital Library
- Shijie Zhang, Shirong Li, and Jiong Yang. 2010. SUMMA: Subgraph matching in massive graphs. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management (CIKM’10). J. Huang, N. Koudas, G. J. F. Jones, X. Wu, K. Collins-Thompson, and A. An Eds., ACM Press, New York, 1285--1288. Google ScholarDigital Library
- Ke Zhu, Ying Zhang, Xuemin Lin, Gaoping Zhu, and Wei Wang. 2010. NOVA: A novel and efficient framework for finding subgraph isomorphism mappings in large graphs. In Proceedings of the 15th International Conference on Database Systems for Advanced Applications (DASFAA’10). H. Kitagawa, Y. Ishikawa, Q. Li, and C. Watanabe Eds., Lecture Notes in Computer Science, vol. 5981, Springer, 140--154. Google ScholarDigital Library
- Yue Zhuge and Hector Garcia-Molina. 1998. Graph structured views and their incremental maintenance. In Proceedings of the 14th International Conference on Data Engineering (ICDE’98). S. D. Urban and E. Bertino Eds., IEEE Computer Society, 116--125. Google ScholarDigital Library
Index Terms
- Efficient Multiview Maintenance under Insertion in Huge Social Networks
Recommendations
Friendship Maintenance and Prediction in Multiple Social Networks
HT '16: Proceedings of the 27th ACM Conference on Hypertext and Social MediaDue to the proliferation of online social networks (OSNs), users find themselves participating in multiple OSNs. These users leave their activity traces as they maintain friendships and interact with other users in these OSNs. In this work, we analyze ...
Towards combating rumors in social networks: Models and metrics
Dynamic Networks and Knowledge DiscoveryRumor is a potentially harmful social phenomenon that has been observed in all human societies in all times. Social networking sites provide a platform for the rapid interchange of information and hence, for the rapid dissemination of unsubstantiated ...
Efficient multi-view maintenance in the social semantic web
WWW '12 Companion: Proceedings of the 21st International Conference on World Wide WebThe Social Semantic Web (SSW) refers to the mix of RDF data in web content, and social network data associated with those who posted that content. Applications to monitor the SSW are becoming increasingly popular. For instance, marketers want to look ...
Comments