ABSTRACT
We introduce the public-private model of graphs. In this model, we have a public graph and each node in the public graph has an associated private graph. The motivation for studying this model stems from social networks, where the nodes are the users, the public graph is visible to everyone, and the private graph at each node is visible only to the user at the node. From each node's viewpoint, the graph is just a union of its private graph and the public graph.
We consider the problem of efficiently computing various properties of the graphs from each node's point of view, with minimal amount of recomputation on the public graph. To illustrate the richness of our model, we explore two powerful computational paradigms for studying large graphs, namely, sketching and sampling, and focus on some key problems in social networks and show efficient algorithms in the public-private graph model. In the sketching model, we show how to efficiently approximate the neighborhood function, which in turn can be used to approximate various notions of centrality. In the sampling model, we focus on all-pair shortest path distances, node similarities, and correlation clustering.
Supplemental Material
- K. J. Ahn, S. Guha, and A. McGregor. Analyzing graph structure via linear measurements. In SODA, pages 459--467, 2012. Google ScholarDigital Library
- K. J. Ahn, S. Guha, and A. McGregor. Graph sketches: sparsification, spanners, and subgraphs. In PODS, pages 5--14, 2012. Google ScholarDigital Library
- N. Ailon, M. Charikar, and A. Newman. Aggregating inconsistent information: Ranking and clustering. JACM, 55(5), 2008. Google ScholarDigital Library
- R. Andersen, F. Chung, and K. Lang. Local partitioning for directed graphs using pagerank. Internet Mathematics, 5(1--2):3--22, 2008.Google Scholar
- N. Bansal, A. Blum, and S. Chawla. Correlation clustering. Machine Learning, 56(1--3):89--113, 2004. Google ScholarDigital Library
- T. Y. Berger-Wolf and J. Saia. A framework for analysis of dynamic social networks. In KDD, pages 523--528, 2006. Google ScholarDigital Library
- P. Boldi, M. Rosa, and S. Vigna. HyperANF: approximating the neighbourhood function of very large graphs on a budget. In WWW, pages 625--634, 2011. Google ScholarDigital Library
- P. Boldi and S. Vigna. In-core computation of geometric centralities with hyperball: A hundred billion nodes and beyond. In ICDM Workshops, pages 621--628, 2013. Google ScholarDigital Library
- M. Charikar. Greedy approximation algorithms for finding dense components in a graph. In APPROX, pages 84--95, 2000. Google ScholarDigital Library
- E. Cohen. Size-estimation framework with applications to transitive closure and reachability. JCSS, 55:441--453, 1997. Google ScholarDigital Library
- E. Cohen. All-distances sketches, revisited: HIP estimators for massive graphs analysis. In PODS, pages 88--99, 2014. Google ScholarDigital Library
- E. Cohen and H. Kaplan. Summarizing data using bottom-k sketches. In PODC, pages 225--234, 2007. Google ScholarDigital Library
- A. Dasgupta, R. Kumar, and T. Sarlós. On estimating the average degree. In WWW, pages 795--806, 2014. Google ScholarDigital Library
- C. Demetrescu, D. Eppstein, Z. Galil, and G. F. Italiano. Dynamic graph algorithms. In M. J. Atallah and M. Blanton, editors, Algorithms and Theory of Computation Handbook, pages 9--9. Chapman & Hall/CRC, 2010. Google ScholarDigital Library
- C. Demetrescu and G. F. Italiano. A new approach to dynamic all pairs shortest paths. JACM, 51(6):968--992, 2004. Google ScholarDigital Library
- R. Dey, Z. Jelveh, and K. W. Ross. Facebook users have become much more private: A large-scale study. In PerCom Workshops, 2012.Google ScholarCross Ref
- S. J. Hardiman and L. Katzir. Estimating clustering coefficients and size of social networks via random walk. In WWW, pages 539--550, 2013. Google ScholarDigital Library
- T. H. Haveliwala. Topic-sensitive pagerank. In WWW, pages 517--526, 2002. Google ScholarDigital Library
- J. Holm, K. de Lichtenberg, and M. Thorup. Poly-logarithmic deterministic fully-dynamic algorithms for connectivity, minimum spanning tree, 2-edge, and biconnectivity. JACM, 48(4):723--760, 2001. Google ScholarDigital Library
- G. Jeh and J. Widom. Scaling personalized web search. In WWW, pages 271--279. ACM, 2003. Google ScholarDigital Library
- L. Katz. A new status index derived from sociometric analysis. Psychometrika, 18(1):39--43, 1953.Google ScholarCross Ref
- L. Katzir, E. Liberty, and O. Somekh. Estimating sizes of social networks via biased sampling. In WWW, pages 597--606, 2011. Google ScholarDigital Library
- V. King. Fully dynamic connectivity. In Encyclopedia of Algorithms. Springer, 2008.Google ScholarCross Ref
- V. King. Fully dynamic transitive closure. In Encyclopedia of Algorithms. Springer, 2008.Google ScholarCross Ref
- J. Leskovec, D. P. Huttenlocher, and J. M. Kleinberg. Predicting positive and negative links in online social networks. In WWW, pages 641--650, 2010. Google ScholarDigital Library
- J. Leskovec, D. P. Huttenlocher, and J. M. Kleinberg. Signed networks in social media. In CHI, pages 1361--1370, 2010. Google ScholarDigital Library
- J. Leskovec, J. M. Kleinberg, and C. Faloutsos. Graph evolution: Densification and shrinking diameters. TKDD, 1(1), 2007. Google ScholarDigital Library
- A. McGregor. Graph stream algorithms: a survey. SIGMOD Record, 43(1):9--20, 2014. Google ScholarDigital Library
- C. R. Palmer, P. B. Gibbons, and C. Faloutsos. ANF: a fast and scalable tool for data mining in massive graphs. In KDD, pages 81--90, 2002. Google ScholarDigital Library
- R. Panigrahy, M. Najork, and Y. Xie. How user behavior is related to social affinity. In WSDM, pages 713--722, 2012. Google ScholarDigital Library
- M. Ripeanu, A. Iamnitchi, and I. T. Foster. Mapping the Gnutella network. IEEE Internet Computing, 6(1):50--57, 2002. Google ScholarDigital Library
- A. D. Sarma, S. Gollapudi, M. Najork, and R. Panigrahy. A sketch-based distance oracle for web-scale graphs. In WSDM, pages 401--410, 2010. Google ScholarDigital Library
- C. Tantipathananandh, T. Y. Berger-Wolf, and D. Kempe. A framework for community identification in dynamic social networks. In KDD, pages 717--726, 2007. Google ScholarDigital Library
- J. Yang and J. Leskovec. Defining and evaluating network communities based on ground-truth. In ICDM, pages 745--754, 2012. Google ScholarDigital Library
Index Terms
- Efficient Algorithms for Public-Private Social Networks
Recommendations
Optimal Distributed Submodular Optimization via Sketching
KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data MiningWe present distributed algorithms for several classes of submodular optimization problems such as k-cover, set cover, facility location, and probabilistic coverage. The new algorithms enjoy almost optimal space complexity, optimal approximation ...
Composable core-sets for diversity and coverage maximization
PODS '14: Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systemsIn this paper we consider efficient construction of "composable core-sets" for basic diversity and coverage maximization problems. A core-set for a point-set in a metric space is a subset of the point-set with the property that an approximate solution ...
Budget Management Strategies in Repeated Auctions
WWW '17: Proceedings of the 26th International Conference on World Wide WebIn online advertising, advertisers purchase ad placements by participating in a long sequence of repeated auctions. One of the most important features advertising platforms often provide, and advertisers often use, is budget management, which allows ...
Comments