Article

Is sampled data sufficient for anomaly detection?

Authors:
Jianning Mai, University of California, Davis, Davis, CA
Chen-Nee Chuah, University of California, Davis, Davis, CA
Ashwin Sridharan, Sprint Advanced Technology Labs, Burlingame, CA
Tao Ye, Sprint Advanced Technology Labs, Burlingame, CA
Hui Zang, Sprint Advanced Technology Labs, Burlingame, CA

IMC '06: Proceedings of the 6th ACM SIGCOMM conference on Internet measurement
October 2006
Pages 165–176
https://doi.org/10.1145/1177080.1177102

Published: 25 October 2006

ABSTRACT

Sampling techniques are widely used for traffic measurement at high link speeds to conserve router resources. Traditionally, sampled traffic data has been used for network management tasks such as traffic matrix estimation, but recently it has also been used in numerous anomaly detection algorithms, as security analysis becomes increasingly critical for network providers. While the impact of sampling on traffic engineering metrics such as flow size and mean rate is well studied, its impact on anomaly detection remains an open question.

This paper presents a comprehensive study of whether existing sampling techniques distort traffic features critical for effective anomaly detection. We sampled packet traces captured from a Tier-1 IP backbone using four popular methods: random packet sampling, random flow sampling, smart sampling, and sample-and-hold. The sampled data is then used as input to detect two common classes of anomalies: volume anomalies and port scans. Since it is infeasible to enumerate all existing solutions, we study three representative algorithms: a wavelet-based volume anomaly detection algorithm and two portscan detection algorithms based on hypothesis testing. Our results show that all four sampling methods introduce fundamental bias that degrades the performance of the three detection schemes; however, the degradation curves are very different. We also identify the traffic features critical for anomaly detection and analyze how they are affected by sampling. Our work demonstrates the need for better measurement techniques, since anomaly detection operates on a drastically different information region, which is often overlooked by existing traffic accounting methods that target heavy-hitters.
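To make the difference between two of the sampling methods named in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a packet is represented only by its flow identifier, and the synthetic trace (a few heavy flows plus many single-packet flows standing in for scan probes), the function names, and the sampling rate are all hypothetical choices made for illustration. Smart sampling and sample-and-hold are not shown.

```python
import random
from collections import Counter


def random_packet_sampling(packets, p, rng):
    """Keep each packet independently with probability p."""
    return [flow_id for flow_id in packets if rng.random() < p]


def random_flow_sampling(packets, p, rng):
    """Keep each flow with probability p; a kept flow retains all its packets."""
    decisions = {}  # flow_id -> True/False, decided on the flow's first packet
    kept = []
    for flow_id in packets:
        if flow_id not in decisions:
            decisions[flow_id] = rng.random() < p
        if decisions[flow_id]:
            kept.append(flow_id)
    return kept


def flow_sizes(packets):
    """Map each flow identifier to its packet count."""
    return Counter(packets)


def synthetic_trace(rng):
    """Hypothetical trace: 10 heavy flows of 1000 packets, 1000 one-packet flows."""
    packets = [f"heavy-{i}" for i in range(10) for _ in range(1000)]
    packets += [f"probe-{i}" for i in range(1000)]
    rng.shuffle(packets)
    return packets


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so repeated runs give the same output
    trace = synthetic_trace(rng)
    rate = 0.01  # assumed sampling rate, chosen only for illustration

    for name, sampler in [("packet", random_packet_sampling),
                          ("flow", random_flow_sampling)]:
        sizes = flow_sizes(sampler(trace, rate, rng))
        heavy = sum(1 for f in sizes if f.startswith("heavy-"))
        probes = sum(1 for f in sizes if f.startswith("probe-"))
        print(f"{name} sampling: {heavy} heavy flows seen, "
              f"{probes} single-packet flows seen, "
              f"{sum(sizes.values())} packets kept")
```

In this sketch, packet sampling thins every flow, so the heavy flows are very likely to appear but with reduced packet counts, while flow sampling keeps or drops whole flows and preserves the size of each flow it keeps. In both cases a single-packet flow survives only with probability equal to the sampling rate, which illustrates how small flows can vanish from the sampled data.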

Index Terms

Is sampled data sufficient for anomaly detection?
1. General and reference
  1. Cross-computing tools and techniques
    1. Design
2. Networks
  1. Network services
    1. Network monitoring

Published in
IMC '06: Proceedings of the 6th ACM SIGCOMM conference on Internet measurement
October 2006
356 pages
ISBN: 1595935614
DOI: 10.1145/1177080
General Chairs:
Jussara Almeida, Federal University of Minas Gerais, Brazil
Virgilio Almeida, Federal University of Minas Gerais, Brazil
Program Chair:
Paul Barford, University of Wisconsin-Madison, USA
Copyright © 2006 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
anomaly detection
portscan
sampling
volume anomaly
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 277 of 1,083 submissions, 26%

Article Metrics
- Total Citations: 169
- Total Downloads: 1,477
- Downloads (last 12 months): 34
- Downloads (last 6 weeks): 0