research-article

Public Access

NetWalk: A Flexible Deep Embedding Approach for Anomaly Detection in Dynamic Networks

Authors:
Wenchao Yu

University of California, Los Angeles, Los Angeles, CA, USA

University of California, Los Angeles, Los Angeles, CA, USA
View Profile

,
Wei Cheng

NEC Laboratories America, Inc., Princeton, NJ, USA

NEC Laboratories America, Inc., Princeton, NJ, USA
View Profile

,
Charu C. Aggarwal

IBM Research AI, Yorktown Height, NY, USA

IBM Research AI, Yorktown Height, NY, USA
View Profile

,
Kai Zhang

Temple University, Philadelphia, PA, USA

Temple University, Philadelphia, PA, USA
View Profile

,
Haifeng Chen

NEC Laboratories America, Inc., Princeton, NJ, USA

NEC Laboratories America, Inc., Princeton, NJ, USA
View Profile

,
Wei Wang

University of California, Los Angeles, Los Angeles, CA, USA

University of California, Los Angeles, Los Angeles, CA, USA
View Profile

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data MiningJuly 2018Pages 2672–2681https://doi.org/10.1145/3219819.3220024

Published:19 July 2018Publication History

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

Pages 2672–2681

ABSTRACT

Massive and dynamic networks arise in many practical applications such as social media, security and public health. Given an evolutionary network, it is crucial to detect structural anomalies, such as vertices and edges whose "behaviors'' deviate from underlying majority of the network, in a real-time fashion. Recently, network embedding has proven a powerful tool in learning the low-dimensional representations of vertices in networks that can capture and preserve the network structure. However, most existing network embedding approaches are designed for static networks, and thus may not be perfectly suited for a dynamic environment in which the network representation has to be constantly updated. In this paper, we propose a novel approach, NetWalk, for anomaly detection in dynamic networks by learning network representations which can be updated dynamically as the network evolves. We first encode the vertices of the dynamic network to vector representations by clique embedding, which jointly minimizes the pairwise distance of vertex representations of each walk derived from the dynamic networks, and the deep autoencoder reconstruction error serving as a global regularization. The vector representations can be computed with constant space requirements using reservoir sampling. On the basis of the learned low-dimensional vertex representations, a clustering-based technique is employed to incrementally and dynamically detect network anomalies. Compared with existing approaches, NetWalk has several advantages: 1) the network embedding can be updated dynamically, 2) streaming network nodes and edges can be encoded efficiently with constant memory space usage, 3). flexible to be applied on different types of networks, and 4) network anomalies can be detected in real-time. Extensive experiments on four real datasets demonstrate the effectiveness of NetWalk.

Supplemental Material

yu_netwalk_approach.mp4

mp4

306.8 MB

Download

References

Charu C Aggarwal . 2013. Outlier Analysis. Springer. Google Scholar
Charu C Aggarwal, Yuchen Zhao, and S Yu Philip . 2010. On Clustering Graph Streams.. In SDM. SIAM, 478--489.Google Scholar
Charu C Aggarwal, Yuchen Zhao, and S Yu Philip . 2011. Outlier detection in graph streams. In ICDE. IEEE, 399--409. Google ScholarDigital Library
Nir Ailon, Ragesh Jaiswal, and Claire Monteleoni . 2009. Streaming k-means approximation. In NIPS. 10--18. Google ScholarDigital Library
Leman Akoglu and Christos Faloutsos . 2013. Anomaly, event, and fraud detection in large network datasets WSDM. ACM, 773--774. Google ScholarDigital Library
Leman Akoglu, Mary McGlohon, and Christos Faloutsos . 2010. Oddball: Spotting anomalies in weighted graphs. In PAKDD. Springer, 410--421. Google ScholarDigital Library
Leman Akoglu, Hanghang Tong, and Danai Koutra . 2015. Graph based anomaly detection and description: a survey. Data Mining and Knowledge Discovery Vol. 29, 3 (2015), 626--688. Google ScholarDigital Library
Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Jauvin . 2003. A neural probabilistic language model. JMLR Vol. 3, Feb (2003), 1137--1155. Google ScholarDigital Library
Shiyu Chang, Wei Han, Jiliang Tang, Guo-Jun Qi, Charu C Aggarwal, and Thomas S Huang . 2015. Heterogeneous network embedding via deep architectures SIGKDD. ACM, 119--128. Google ScholarDigital Library
Wei Cheng, Kai Zhang, Haifeng Chen, Guofei Jiang, and Wei Wang . 2016. Ranking Causal Anomalies via Temporal and Dynamical Analysis on Vanishing Correlations SIGKDD. Google ScholarDigital Library
Graham Cormode and S Muthukrishnan . 2005. An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms Vol. 55, 1 (2005), 58--75. Google ScholarDigital Library
Aaron Defazio, Francis R. Bach, and Simon Lacoste-Julien . 2014. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives. In NIPS. 1646--1654. Google ScholarDigital Library
William Eberle and Lawrence Holder . 2007. Anomaly detection in data represented as graphs. Intelligent Data Analysis Vol. 11, 6 (2007), 663--689. Google ScholarDigital Library
Jing Gao, Feng Liang, Wei Fan, Chi Wang, Yizhou Sun, and Jiawei Han . 2010. On community outliers and their efficient detection in information networks SIGKDD. ACM, 813--822. Google ScholarDigital Library
Aditya Grover and Jure Leskovec . 2016. node2vec: Scalable Feature Learning for Networks. (2016).Google Scholar
Manish Gupta, Jing Gao, Charu C Aggarwal, and Jiawei Han . 2014 a. Outlier Detection for Temporal Data: A Survey. TKDE Vol. 9, 26 (2014), 2250--2267.Google Scholar
Manish Gupta, Jing Gao, Yizhou Sun, and Jiawei Han . 2012 a. Community trend outlier detection using soft temporal pattern mining ECML/PKDD. Springer, 692--708.Google Scholar
Manish Gupta, Jing Gao, Yizhou Sun, and Jiawei Han . 2012 b. Integrating community matching and outlier detection for mining evolutionary community outliers. In SIGKDD. ACM, 859--867. Google ScholarDigital Library
Manish Gupta, Arun Mallya, Subhro Roy, Jason HD Cho, and Jiawei Han . 2014 b. Local Learning for Mining Outlier Subgraphs from Network Datasets SDM. 73--81.Google Scholar
Piotr Indyk and Rajeev Motwani . 1998. Approximate nearest neighbors: towards removing the curse of dimensionality ACM Symposium on Theory of Computing. ACM, 604--613. Google ScholarDigital Library
Jure Leskovec, Jon Kleinberg, and Christos Faloutsos . 2007. Graph evolution: Densification and shrinking diameters. TKDD Vol. 1, 1 (2007), 2. Google ScholarDigital Library
Omer Levy and Yoav Goldberg . 2014. Neural word embedding as implicit matrix factorization NIPS. 2177--2185. Google ScholarDigital Library
Emaad A Manzoor, Sadegh Momeni, Venkat N Venkatakrishnan, and Leman Akoglu . 2016. Fast Memory-efficient Anomaly Detection in Streaming Heterogeneous Graphs KDD. Google ScholarDigital Library
Ryan McConville, Weiru Liu, and Paul Miller . 2015. Vertex clustering of augmented graph streams. SDM (2015).Google Scholar
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean . 2013 a. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013).Google Scholar
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean . 2013 b. Distributed representations of words and phrases and their compositionality NIPS. 3111--3119. Google ScholarDigital Library
Andrew Ng . 2011. Sparse autoencoder. CS294A Lecture notes Vol. 72, 2011 (2011), 1--19.Google Scholar
Tore Opsahl and Pietro Panzarasa . 2009. Clustering in weighted networks. Social networks (2009), 155--163.Google Scholar
Bryan Perozzi and Leman Akoglu . 2016. Scalable anomaly ranking of attributed neighborhoods SDM. SIAM, 207--215.Google Scholar
Bryan Perozzi, Leman Akoglu, Patricia Iglesias Sánchez, and Emmanuel Müller . 2014 a. Focused Clustering and Outlier Detection in Large Attributed Graphs SIGKDD. 1346--1355. Google ScholarDigital Library
Bryan Perozzi, Rami Al-Rfou, and Steven Skiena . 2014 b. Deepwalk: Online learning of social representations SIGKDD. ACM, 701--710. Google ScholarDigital Library
Stephen Ranshous, Steve Harenberg, Kshitij Sharma, Nagiza F Samatova, et almbox. . 2016. A Scalable Approach for Outlier Detection in Edge Streams Using Sketch-based Approximations. In SDM.Google Scholar
Stephen Ranshous, Shitian Shen, Danai Koutra, Steve Harenberg, Christos Faloutsos, and Nagiza F Samatova . 2015. Anomaly detection in dynamic networks: a survey. Wiley Interdisciplinary Reviews: Computational Statistics Vol. 7, 3 (2015), 223--247. Google ScholarDigital Library
David E Rumelhart, Geoffrey E Hinton, Ronald J Williams, et almbox. . 1988. Learning representations by back-propagating errors. Cognitive modeling Vol. 5, 3 (1988), 1.Google Scholar
Jimeng Sun, Huiming Qu, Deepayan Chakrabarti, and Christos Faloutsos . 2005. Neighborhood formation and anomaly detection in bipartite graphs ICDM. IEEE, 8--17.Google Scholar
Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan, and Qiaozhu Mei . 2015. Line: Large-scale information network embedding. In WWW. ACM, 1067--1077. Google ScholarDigital Library
Fei Tian, Bin Gao, Qing Cui, Enhong Chen, and Tie-Yan Liu . 2014. Learning Deep Representations for Graph Clustering. AAAI. 1293--1299. Google ScholarDigital Library
Ulrike Von Luxburg . 2007. A tutorial on spectral clustering. Statistics and computing Vol. 17, 4 (2007), 395--416. Google ScholarDigital Library
Daixin Wang, Peng Cui, and Wenwu Zhu . 2016. Structural deep network embedding. In SIGKDD. ACM, 1225--1234. Google ScholarDigital Library
Wenchao Yu, Charu C Aggarwal, and Wei Wang . 2017. Temporally factorized network modeling for evolutionary network analysis WSDM. ACM, 455--464. Google ScholarDigital Library
Wenchao Yu, Guangxiang Zeng, Ping Luo, Fuzhen Zhuang, Qing He, and Zhongzhi Shi . 2013. Embedding with autoencoder regularization. In ECML/PKDD. Springer, 208--223. Google ScholarDigital Library
Wayne W Zachary . 1977. An information flow model for conflict and fission in small groups. Journal of Anthropological Research (1977), 452--473.Google Scholar
Yuchen Zhao and Philip Yu . 2013. On graph stream clustering with side information. In SDM. SIAM, 139--150.Google Scholar
Lekui Zhou, Yang Yang, Xiang Ren, Fei Wu, and Yueting Zhuang . 2018. Dynamic Network Embedding by Modeling Triadic Closure Process. (2018).Google Scholar

Index Terms

NetWalk: A Flexible Deep Embedding Approach for Anomaly Detection in Dynamic Networks
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
        Anomaly detection
        Dimensionality reduction and manifold learning
2. Theory of computation
  1. Design and analysis of algorithms
    1. Graph algorithms analysis
      1. Dynamic graph algorithms

Recommendations

Dynamic network embedding survey
Abstract
Since many real world networks are evolving over time, such as social networks and user-item networks, there are increasing research efforts on dynamic network embedding in recent years. They learn node representations from a sequence ...
Read More
D2NE: Deep Dynamic Network Embedding
Advanced Data Mining and Applications
Abstract
In this paper, we propose a new approach named D2NE, short for Deep Dynamic Network Embedding, to learn the vertex representations for dynamic networks. The algorithm utilizes the graph attention mechanism to refresh embeddings efficiently, in ...
Read More
Learning and Updating Node Embedding on Dynamic Heterogeneous Information Network
WSDM '21: Proceedings of the 14th ACM International Conference on Web Search and Data Mining

Heterogeneous information networks consist of multiple types of edges and nodes, which have a strong ability to represent the rich semantics underpinning network structures. Recently, the dynamics of networks has been studied in many tasks such as ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2018
2925 pages
ISBN:9781450355520
DOI:10.1145/3219819
General Chairs:
Yike Guo
Imperial College London
,
Faisal Farooq
IBM
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 19 July 2018
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
anomaly detection
clique embedding
deep autoencoder
dynamic network embedding
Qualifiers
- research-article
Conference

Acceptance Rates
KDD '18 Paper Acceptance Rate107of983submissions,11%Overall Acceptance Rate1,133of8,635submissions,13%
More
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 202
  Total Citations
  View Citations
- 8,788
  Total Downloads
- Downloads (Last 12 months)1,291
- Downloads (Last 6 weeks)181
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

NetWalk: A Flexible Deep Embedding Approach for Anomaly Detection in Dynamic Networks

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

ABSTRACT

Supplemental Material

References

Cited By

Index Terms

Recommendations

Dynamic network embedding survey

D2NE: Deep Dynamic Network Embedding

Learning and Updating Node Embedding on Dynamic Heterogeneous Information Network

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

NetWalk: A Flexible Deep Embedding Approach for Anomaly Detection in Dynamic Networks

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

ABSTRACT

Supplemental Material

References

Cited By

Index Terms

Recommendations

Dynamic network embedding survey

D2NE: Deep Dynamic Network Embedding

Learning and Updating Node Embedding on Dynamic Heterogeneous Information Network

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media