ABSTRACT
The recent boom of weblogs and social media has made it increasingly important to identify suspicious users with unusual behavior, such as spammers or fraudulent reviewers. A typical spamming strategy is to employ multiple dummy accounts to collectively promote a target, be it a URL or a product. Consequently, these suspicious accounts exhibit coherent anomalous behavior that is identifiable as a collection. In this paper, we propose the concept of Coherent Anomaly Collection (CAC) to capture this kind of collection, and put forward an efficient algorithm to simultaneously find the top-K disjoint CACs together with their anomalous behavior patterns. Compared with existing approaches, our new algorithm can find disjoint anomaly collections with coherent extreme behavior without having to specify either their number or their sizes. Results on real Twitter data show that our approach discovers meaningful and informative hashtag spammer groups of various sizes that are hard to detect with clustering-based methods.
Index Terms
- Mining coherent anomaly collections on web data