ABSTRACT
Measuring distance or some other form of proximity between objects is a standard data mining tool. Connection subgraphs were recently proposed as a way to demonstrate proximity between nodes in networks. We propose a new way of measuring and extracting proximity in networks called "cycle free effective conductance"(CFEC). Our proximity measure can handle more than two endpoints, directed edges, is statistically well-behaved, and produces an effectiveness score for the computed subgraphs. We provide an efficien talgorithm. Also, we report experimental results and show examples for three large network data sets: a telecommunications calling graph, the IMDB actors graph, and an academic co-authorship network.
- A.-L. Barabasi and R. Albert. Emergence of scaling in random networks. Science, 286:509--512, October 1999.Google ScholarCross Ref
- I. Bhattacharya and L. Getoor. Relational clustering for multi-type entity resolution. In Proc. 11th ACM SIGKDD Workshop on Multi Relational Data Mining, pages 3--11, 2005. Google ScholarDigital Library
- B. Bollobas. Modern Graph Theory. Springer-Verlag, 1998.Google ScholarCross Ref
- U. Brandes and D. Fleischer. Centrality measures based on current flow. In Proc. 22nd Symp. Theoretical Aspects of Computer Science (STACS '05), pages 533--544. LNCS 3404, Springer-Verlag, 2005. Google ScholarDigital Library
- J. D. Cohen. Drawing graphs to convey proximity: an incremental arrangement method. ACM Transactions on Computer-Human Interaction, 4(3):197--229, 1997. Google ScholarDigital Library
- T. H. Cormen, C. L. Leiserson, and R. L. Rivest. Introduction to Algorithms. McGraw-Hill and the MIT Press, 1990. Google ScholarDigital Library
- DBLP computer science bibliography. dblp.uni-trier.de.Google Scholar
- P. G. Doyle and J. L. Snell. Random Walks and Electrical Networks. The Mathematical Association of America, 1984. http://arxiv.org/abs/math.PR/0001057.Google ScholarCross Ref
- C. Faloutsos, K. S. McCurley, and A. Tomkins. Fast discovery of connection subgraphs. In Proc. 10th ACM SIGKDD conference, pages 118--127, 2004. Google ScholarDigital Library
- G. Flake, S. Lawrence, and C. L. Giles. Efficient identification of web communities. In Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 150--160, Boston, MA, August 20-23 2000. Google ScholarDigital Library
- M. R. Garey and D. S. Johnson. Computers and Intractability, A Guide to the Theory of NP-Completeness. W. H. Freeman and Company, New York, 1979. Google ScholarDigital Library
- D. Gibson, J. M. Kleinberg, and P. Raghavan. Inferring Web Communities from Link Topology. In Proceedings of the 9th ACM Conference on Hypertext and Hypermedia, pages 225--234, Pittsburgh, Pennsylvania, June 1998. Google ScholarDigital Library
- E. Hadjiconstantinou and N. Christofides. An efficient implementation of an algorithm for finding k shortest simple paths. Networks, 34:88--101, 1999.Google Scholar
- Internet Movie Database. www.imdb.com.Google Scholar
- M. M. John Hershberger and S. Suri. Finding the k shortest simple paths: A new algorithm and its implementation. In Proc. 5th Workshop on Algorithm Engineering and Experimentation (ALENEX '05), pages 26--36. SIAM, 2003.Google Scholar
- N. Katoh, T. Ibaraki, and H. Mine. An efficient algorithm for k shortest simple paths. Networks, 12:411--427, 1982.Google Scholar
- K. Lang. Finding good nearly balanced cuts in power law graphs. Technical report, Yahoo Research Labs, 2004.Google Scholar
- D. Liben-Nowell and J. M. Kleinberg. The link prediction problem for social networks. In CIKM, pages 556--559. ACM, 2003. Google ScholarDigital Library
- A. Popescul and L. H. Ungar. Statistical relational learning for link prediction. In Proc. Workshop on Learning Statistical Models from Relational Data (IJCAI '03), 2003.Google Scholar
- W. E. Winkler. The state of record linkage and current research problems. Technical report, Statistical Research Division, U.S. Bureau of the Census, 1999.Google Scholar
Index Terms
- Measuring and extracting proximity in networks
Recommendations
Measuring and extracting proximity graphs in networks
Measuring distance or some other form of proximity between objects is a standard data mining tool. Connection subgraphs were recently proposed as a way to demonstrate proximity between nodes in networks. We propose a new way of measuring and extracting ...
Proximity and average eccentricity of a graph
Let G=(V,E) be a connected graph on n vertices. The proximity@p(G) of G is the minimum average distance from a vertex of G to all others. The eccentricitye(v) of a vertex v in G is the largest distance from v to another vertex, and the average ...
Nordhaus-Gaddum relations for proximity and remoteness in graphs
The transmission of a vertex in a connected graph is the sum of all distances from that vertex to the others. It is said to be normalized if divided by n-1, where n denotes the order of the graph. The proximity of a graph is the minimum normalized ...
Comments