DOI: 10.1145/3205455.3205612
research-article

Benchmarking evolutionary computation approaches to insider threat detection

Published: 02 July 2018

ABSTRACT

Insider threat detection is a challenging problem for companies and organizations because the malicious actions are performed by authorized users. The data are also highly skewed: the extreme class imbalance makes it very difficult to adapt learning algorithms to the real-world context. In this work, applications of genetic programming (GP) and stream active learning are evaluated for insider threat detection. Linear GP with lexicase/multi-objective selection is employed to address the problem under a stationary data assumption, and streaming GP is employed to address it under a non-stationary data assumption. Experiments conducted on a publicly available corporate data set show that the approaches can cope with extreme class imbalance, stream learning, and adaptation to the real-world context.
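
For readers unfamiliar with lexicase selection, the following is a minimal Python sketch of the general technique, not the configuration used in the paper. The function name `lexicase_select`, the 0/1 error matrix, and the three-candidate toy population are assumptions made purely for illustration. The sketch shows why the method is attractive under class imbalance: a candidate that is the only one to handle a rare case correctly can still be chosen as a parent, even if its aggregate error is poor.

```python
import random


def lexicase_select(population, errors, rng=random):
    """Pick one parent via lexicase selection (illustrative sketch).

    population: list of candidate programs (any objects).
    errors: errors[i][j] is the error of candidate i on training case j;
            lower is better (here 0 = correct, 1 = misclassified).
    rng: source of randomness exposing shuffle() and choice().
    """
    candidates = list(range(len(population)))
    cases = list(range(len(errors[0])))
    rng.shuffle(cases)  # a fresh random case ordering for each selection event

    for case in cases:
        # Keep only the candidates with the best error on this case.
        best = min(errors[i][case] for i in candidates)
        candidates = [i for i in candidates if errors[i][case] == best]
        if len(candidates) == 1:
            break

    # Any remaining ties are broken uniformly at random.
    return population[rng.choice(candidates)]


# Toy example (hypothetical data): case 2 plays the role of a rare
# malicious instance that only candidate "B" classifies correctly.
population = ["A", "B", "C"]
errors = [
    [0, 0, 1],  # A: correct on the two common cases only
    [1, 1, 0],  # B: correct on the rare case only
    [1, 1, 1],  # C: wrong everywhere
]
rng = random.Random(0)
picks = [lexicase_select(population, errors, rng) for _ in range(200)]
```

In this toy setting "B" is selected whenever the rare case comes first in the shuffled ordering, while "C" is filtered out under every ordering. Selection by total error alone would prefer "A" over "B" every time.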


Published in

GECCO '18: Proceedings of the Genetic and Evolutionary Computation Conference
July 2018
1578 pages
ISBN: 9781450356183
DOI: 10.1145/3205455

Copyright © 2018 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 1,669 of 4,410 submissions, 38%
