skip to main content
10.1145/3219819.3220040acmotherconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
research-article
Open Access

SpotLight: Detecting Anomalies in Streaming Graphs

Published:19 July 2018Publication History

ABSTRACT

How do we spot interesting events from e-mail or transportation logs? How can we detect port scan or denial of service attacks from IP-IP communication data? In general, given a sequence of weighted, directed or bipartite graphs, each summarizing a snapshot of activity in a time window, how can we spot anomalous graphs containing the sudden appearance or disappearance of large dense subgraphs (e.g., near bicliques) in near real-time using sublinear memory? To this end, we propose a randomized sketching-based approach called SpotLight, which guarantees that an anomalous graph is mapped 'far' away from 'normal' instances in the sketch space with high probability for appropriate choice of parameters. Extensive experiments on real-world datasets show that SpotLight (a) improves accuracy by at least 8.4% compared to prior approaches, (b) is fast and can process millions of edges within a few minutes, (c) scales linearly with the number of edges and sketching dimensions and (d) leads to interesting discoveries in practice.

Skip Supplemental Material Section

Supplemental Material

eswaran_detecing_anomalies.mp4

mp4

391.1 MB

References

  1. 2018. NYC Taxi &Limousine Corporation - Trip Record Data. http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml.Google ScholarGoogle Scholar
  2. Charu C. Aggarwal, Yuchen Zhao, and Philip S. Yu . 2011. Outlier detection in graph streams. In ICDE. IEEE, 399--409. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Leman Akoglu, Mary McGlohon, and Christos Faloutsos . 2010. oddball: Spotting Anomalies in Weighted Graphs. In PAKDD, Vol. Vol. 6119. Springer, 410--421. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Leman Akoglu, Hanghang Tong, and Danai Koutra . 2015. Graph based anomaly detection and description: a survey. Data Min. Knowl. Discov. Vol. 29, 3 (2015), 626--688. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Alex Beutel, Wanhong Xu, Venkatesan Guruswami, Christopher Palow, and Christos Faloutsos . 2013. CopyCatch: stopping group attacks by spotting lockstep behavior in social networks WWW. ACM, 119--130. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Reinhard Diestel . 2012. Graph Theory, 4th Edition. Graduate texts in mathematics, Vol. Vol. 173. Springer.Google ScholarGoogle Scholar
  7. Dhivya Eswaran, Stephan Günnemann, Christos Faloutsos, Disha Makhija, and Mohit Kumar . 2017. ZooBP: Belief Propagation for Heterogeneous Networks. PVLDB Vol. 10, 5 (2017), 625--636. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Timothy La Fond, Jennifer Neville, and Brian Gallagher . 2014. Anomaly Detection in Dynamic Networks of Varying Size. CoRR Vol. abs/1411.3749 (2014).Google ScholarGoogle Scholar
  9. Sudipto Guha, Nina Mishra, Gourav Roy, and Okke Schrijvers . 2016. Robust Random Cut Forest Based Anomaly Detection on Streams ICML, Vol. Vol. 48. JMLR.org, 2712--2721. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Keith Henderson, Tina Eliassi-Rad, Christos Faloutsos, Leman Akoglu, Lei Li, Koji Maruhashi, B. Aditya Prakash, and Hanghang Tong . 2010. Metric forensics: a multi-level approach for mining volatile graphs KDD. ACM, 163--172. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Bryan Hooi, Hyun Ah Song, Alex Beutel, Neil Shah, Kijung Shin, and Christos Faloutsos . 2016. FRAUDAR: Bounding Graph Fraud in the Face of Camouflage KDD. ACM, 895--904. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Tsuyoshi Idé and Hisashi Kashima . 2004. Eigenspace-based anomaly detection in computer systems KDD. ACM, 440--449. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Meng Jiang, Alex Beutel, Peng Cui, Bryan Hooi, Shiqiang Yang, and Christos Faloutsos . 2015. A General Suspiciousness Metric for Dense Blocks in Multimodal Data ICDM. IEEE, 781--786. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Danai Koutra, Neil Shah, Joshua T. Vogelstein, Brian Gallagher, and Christos Faloutsos . 2016. DeltaCon: Principled Massive-Graph Similarity Function with Attribution. TKDD Vol. 10, 3, 28:1--28:43. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Richard Lippmann, Robert K. Cunningham, David J. Fried, Isaac Graf, Kris R. Kendall, Seth E. Webster, and Marc A. Zissman . 1999. Results of the DARPA 1998 Offline Intrusion Detection Evaluation Recent Advances in Intrusion Detection.Google ScholarGoogle Scholar
  16. Fei Tony Liu, Kai Ming Ting, and Zhi-Hua Zhou . 2008. Isolation Forest. In ICDM. IEEE, 413--422.Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Emaad A. Manzoor, Sadegh M. Milajerdi, and Leman Akoglu . 2016. Fast Memory-efficient Anomaly Detection in Streaming Heterogeneous Graphs KDD. ACM, 1035--1044. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Andrew McGregor . 2014. Graph stream algorithms: a survey. SIGMOD Record Vol. 43, 1 (2014), 9--20. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Andrew McGregor, David Tench, Sofya Vorotnikova, and Hoa T. Vu . 2015. Densest Subgraph in Dynamic Graph Streams. In MFCS, Vol. Vol. 9235. Springer, 472--482.Google ScholarGoogle Scholar
  20. Tomás Pevný . 2016. Loda: Lightweight on-line detector of anomalies. Machine Learning Vol. 102, 2 (2016), 275--304. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. B. Aditya Prakash, Ashwin Sridharan, Mukund Seshadri, Sridhar Machiraju, and Christos Faloutsos . 2010. EigenSpokes: Surprising Patterns and Scalable Community Chipping in Large Graphs PAKDD, Vol. Vol. 6119. Springer, 435--448. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Stephen Ranshous, Steve Harenberg, Kshitij Sharma, and Nagiza F Samatova . 2016. A Scalable Approach for Outlier Detection in Edge Streams Using Sketch-based Approximations. In SDM. SIAM, 189--197.Google ScholarGoogle Scholar
  23. Stephen Ranshous, Shitian Shen, Danai Koutra, Steve Harenberg, Christos Faloutsos, and Nagiza F Samatova . 2015. Anomaly detection in dynamic networks: a survey. Wiley Interdisciplinary Reviews: Computational Statistics Vol. 7, 3 (2015), 223--247. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Jitesh Shetty and Jafar Adibi . 2004. The Enron email dataset database schema and brief statistical report. Information sciences institute technical report, University of Southern California Vol. 4, 1 (2004), 120--128.Google ScholarGoogle Scholar
  25. Kijung Shin, Bryan Hooi, and Christos Faloutsos . 2016. M-Zoom: Fast Dense-Block Detection in Tensors with Quality Guarantees ECML/PKDD, Vol. Vol. 9851. Springer, 264--280. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Kumar Sricharan and Kamalika Das . 2014. Localizing anomalous changes in time-evolving graphs SIGMOD. ACM, 1347--1358. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Jimeng Sun, Christos Faloutsos, Spiros Papadimitriou, and Philip S. Yu . 2007. GraphScope: parameter-free mining of large time-evolving graphs KDD. ACM, 687--696. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Jimeng Sun, Dacheng Tao, and Christos Faloutsos . 2006. Beyond streams and graphs: dynamic tensor analysis KDD. ACM, 374--383. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Ke Wu, Kun Zhang, Wei Fan, Andrea Edwards, and Philip S. Yu . 2014. RS-Forest: A Rapid Density Estimator for Streaming Anomaly Detection ICDM. IEEE, 600--609. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Yiming Yang and Xin Liu . 1999. A Re-Examination of Text Categorization Methods. In SIGIR. ACM, 42--49. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Weiren Yu, Charu C Aggarwal, Shuai Ma, and Haixun Wang . 2013. On anomalous hotspot discovery in graph streams. In ICDM. IEEE, 1271--1276.Google ScholarGoogle Scholar

Index Terms

  1. SpotLight: Detecting Anomalies in Streaming Graphs

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Other conferences
        KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
        July 2018
        2925 pages
        ISBN:9781450355520
        DOI:10.1145/3219819

        Copyright © 2018 Owner/Author

        This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives International 4.0 License.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 19 July 2018

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        KDD '18 Paper Acceptance Rate107of983submissions,11%Overall Acceptance Rate1,133of8,635submissions,13%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader