DOI: 10.1145/1177080.1177102
Article

Is sampled data sufficient for anomaly detection?

Published: 25 October 2006

ABSTRACT

Sampling techniques are widely used for traffic measurements at high link speeds to conserve router resources. Traditionally, sampled traffic data is used for network management tasks such as traffic matrix estimation, but recently it has also been used in numerous anomaly detection algorithms, as security analysis becomes increasingly critical for network providers. While the impact of sampling on traffic engineering metrics such as flow size and mean rate is well studied, its impact on anomaly detection remains an open question.

This paper presents a comprehensive study on whether existing sampling techniques distort traffic features critical for effective anomaly detection. We sampled packet traces captured from a Tier-1 IP backbone using four popular methods: random packet sampling, random flow sampling, smart sampling, and sample-and-hold. The sampled data is then used as input to detect two common classes of anomalies: volume anomalies and port scans. Since it is infeasible to enumerate all existing solutions, we study three representative algorithms: a wavelet-based volume anomaly detection algorithm and two portscan detection algorithms based on hypothesis testing. Our results show that all four sampling methods introduce fundamental bias that degrades the performance of the three detection schemes; however, the degradation curves are very different. We also identify the traffic features critical for anomaly detection and analyze how they are affected by sampling. Our work demonstrates the need for better measurement techniques, since anomaly detection operates on a drastically different information region, which is often overlooked by existing traffic accounting methods that target heavy-hitters.
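As a rough illustration of the four sampling methods named in the abstract, the sketch below implements simplified, packet-count versions of each as they are commonly described in the measurement literature. It is not the paper's implementation: the function names, the synthetic trace, and the parameters `p` and `z` are assumptions chosen for illustration, and sample-and-hold is shown per packet rather than per byte.

```python
import random
from collections import Counter


def random_packet_sampling(packets, p, rng):
    """Keep each packet independently with probability p."""
    counts = Counter()
    for flow in packets:
        if rng.random() < p:
            counts[flow] += 1
    return counts


def random_flow_sampling(packets, p, rng):
    """Keep each flow with probability p; a kept flow retains all its packets."""
    decision = {}
    counts = Counter()
    for flow in packets:
        if flow not in decision:
            decision[flow] = rng.random() < p
        if decision[flow]:
            counts[flow] += 1
    return counts


def smart_sampling(flow_sizes, z, rng):
    """Keep a flow of size x with probability min(1, x / z)."""
    return {f: x for f, x in flow_sizes.items()
            if rng.random() < min(1.0, x / z)}


def sample_and_hold(packets, p, rng):
    """Once a flow is sampled (probability p per packet), count all its later packets."""
    counts = Counter()
    for flow in packets:
        if flow in counts or rng.random() < p:
            counts[flow] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical trace: one heavy flow plus 200 single-packet, scan-like flows.
    packets = ["heavy"] * 1000 + [f"probe-{i}" for i in range(200)]
    rng.shuffle(packets)
    true_sizes = Counter(packets)

    results = {
        "random packet": random_packet_sampling(packets, 0.01, rng),
        "random flow": random_flow_sampling(packets, 0.01, rng),
        "smart": smart_sampling(true_sizes, 100, rng),
        "sample-and-hold": sample_and_hold(packets, 0.01, rng),
    }
    for name, sampled in results.items():
        print(f"{name}: {len(sampled)} of {len(true_sizes)} flows seen")
```

In a toy trace like this one, the heavy flow tends to survive sampling while most single-packet flows disappear, which is the kind of small-flow information that portscan detection depends on.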


Published in

IMC '06: Proceedings of the 6th ACM SIGCOMM conference on Internet measurement
October 2006
356 pages
ISBN: 1595935614
DOI: 10.1145/1177080

Copyright © 2006 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall Acceptance Rate: 277 of 1,083 submissions, 26%
