DOI: 10.1145/2487788.2487942 (WWW Conference Proceedings)

Effective analysis, characterization, and detection of malicious web pages

Published: 13 May 2013

ABSTRACT

The steady evolution of the Web has paved the way for miscreants to exploit vulnerabilities and embed malicious content into web pages. Upon a visit, malicious web pages steal sensitive data, redirect victims to other malicious targets, or seize control of the victim's system to mount future attacks. Approaches to detecting malicious web pages have been reactively effective against special classes of attacks such as drive-by downloads. However, the prevalence and complexity of attacks by malicious web pages remain worrisome. The main challenges in this problem domain are (1) fine-grained capturing and characterization of attack payloads, (2) evolution of web page artifacts, and (3) flexibility and scalability of detection techniques in a fast-changing threat landscape. To this end, we propose a holistic approach that leverages static analysis, dynamic analysis, machine learning, and evolutionary searching and optimization to effectively analyze and detect malicious web pages. We do so by introducing novel features that capture a fine-grained snapshot of malicious web pages, characterizing malicious web pages holistically, and applying evolutionary techniques to fine-tune learning-based detection models as attack payloads evolve. In this paper, we present the key intuition and details of our approach, the results obtained so far, and future work.
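The abstract does not spell out how evolutionary search is used to tune a detection model, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a toy setting: pages are vectors of binary features, the "model" flags a page if any selected feature is present, and a small genetic algorithm searches for the feature subset with the best accuracy. The feature names, the synthetic labeling rule, and all function names are hypothetical.

```python
import random

# Hypothetical feature names, chosen for illustration only (not from the paper).
FEATURES = ["has_hidden_iframe", "long_url", "obfuscated_js", "many_images", "redirects"]


def make_pages(rng, n=200):
    """Generate synthetic (features, label) pairs under an assumed ground truth."""
    pages = []
    for _ in range(n):
        x = [rng.randint(0, 1) for _ in FEATURES]
        # Assumption: a page is malicious if it has a hidden iframe or obfuscated JS.
        y = 1 if (x[0] or x[2]) else 0
        pages.append((x, y))
    return pages


def predict(mask, x):
    """Flag a page as malicious if any selected feature is present."""
    return 1 if any(m and v for m, v in zip(mask, x)) else 0


def fitness(mask, pages):
    """Accuracy of the masked rule on the given pages."""
    correct = sum(predict(mask, x) == y for x, y in pages)
    return correct / len(pages)


def evolve(pages, rng, pop_size=20, generations=30, mutation_rate=0.1):
    """Evolve a feature mask with elitism, one-point crossover, and bit-flip mutation."""
    n = len(FEATURES)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, pages), reverse=True)
        history.append(fitness(pop[0], pages))
        elite = pop[: pop_size // 2]  # the best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randint(1, n - 1)
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        pop = elite + children
    best = max(pop, key=lambda m: fitness(m, pages))
    return best, fitness(best, pages), history


if __name__ == "__main__":
    pages = make_pages(random.Random(0))
    best, score, _ = evolve(pages, random.Random(1))
    selected = [name for name, bit in zip(FEATURES, best) if bit]
    print("selected features:", selected, "accuracy:", score)
```

Because the best half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next. A real system would presumably evaluate candidates on held-out data and re-run the search as new attack samples arrive, but those details are outside what the abstract states.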
