DOI: 10.1145/3278532.3278563

Advancing the Art of Internet Edge Outage Detection

Published: 31 October 2018

ABSTRACT

Measuring the reliability of edge networks in the Internet is difficult due to the size and heterogeneity of networks, the rarity of outages, and the difficulty of finding vantage points that can accurately capture such events at scale. In this paper, we use logs from a major CDN, detailing hourly request counts from address blocks. We discovered that in many edge address blocks, devices collectively contact the CDN every hour, over weeks and months. We establish that a sudden temporary absence of these requests indicates a loss of Internet connectivity of those address blocks, events we call disruptions.

We develop a disruption detection technique and present broad and detailed statistics on 1.5M disruption events over the course of a year. Our approach reveals that disruptions do not necessarily reflect actual service outages, but can be the result of prefix migrations. Major natural disasters are clearly represented in our data, as expected; however, a large share of detected disruptions correlate well with planned human intervention during scheduled maintenance intervals, and are thus unlikely to be caused by external factors. Cross-evaluating our results, we find that current state-of-the-art active outage detection over-estimates the occurrence of disruptions in some address blocks. Our observations of disruptions, service outages, and different causes for such events yield implications for the design of outage detection systems, as well as for policymakers seeking to establish reporting requirements for Internet services.
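
The core signal described in the abstract (an address block that normally contacts the CDN every hour suddenly and temporarily falls silent) can be illustrated with a minimal sketch. This is not the paper's detection technique, whose details are not given here; it is a simplified illustration under stated assumptions. The function name `find_disruptions`, the parameter `min_baseline_hours`, and the rule that a block must show an unbroken run of active hours before a silent period counts are all hypothetical choices made for this example.

```python
from typing import List, Tuple


def find_disruptions(
    hourly_counts: List[int],
    min_baseline_hours: int = 168,  # hypothetical: one week of continuous activity
) -> List[Tuple[int, int]]:
    """Return (start, end) hour indices of temporary silent periods.

    A silent period is a run of zero-request hours. It is reported only if
    (a) it follows at least `min_baseline_hours` consecutive active hours, and
    (b) requests resume afterwards, so the absence is temporary.
    `end` is exclusive: it is the first hour with requests again.
    """
    disruptions: List[Tuple[int, int]] = []
    active_run = 0  # consecutive hours with at least one request
    i = 0
    n = len(hourly_counts)
    while i < n:
        if hourly_counts[i] > 0:
            active_run += 1
            i += 1
            continue
        # Found a zero-request hour: consume the whole silent run.
        start = i
        while i < n and hourly_counts[i] == 0:
            i += 1
        recovered = i < n  # activity resumed before the series ended
        if active_run >= min_baseline_hours and recovered:
            disruptions.append((start, i))
        active_run = 0  # the baseline must be re-established after any gap
    return disruptions


if __name__ == "__main__":
    # Toy series for one address block: 6 active hours, 3 silent, 6 active, 2 silent.
    counts = [5] * 6 + [0] * 3 + [4] * 6 + [0] * 2
    print(find_disruptions(counts, min_baseline_hours=4))
```

In the toy series, the trailing silent run is not reported because requests never resume within the observed window, so it cannot yet be called temporary. As the abstract notes, a silent period found this way is only a candidate: it may reflect a prefix migration or scheduled maintenance rather than a service outage.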


        Published in

        IMC '18: Proceedings of the Internet Measurement Conference 2018
        October 2018
        507 pages
        ISBN: 9781450356190
        DOI: 10.1145/3278532

        Copyright © 2018 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        Overall Acceptance Rate: 277 of 1,083 submissions, 26%
