DOI: 10.1145/2517349.2522727
research-article
Open Access

Do not blame users for misconfigurations

Published: 03 November 2013

ABSTRACT

Like software bugs, configuration errors are one of the major causes of today's system failures. Many configuration issues manifest themselves in ways similar to software bugs, such as crashes, hangs, and silent failures. This leaves users clueless and forced to report to developers for technical support, wasting not only users' but also developers' precious time and effort. Unfortunately, unlike with software bugs, many software developers take a much less active, responsible role in handling configuration errors because "they are users' faults."

This paper argues that software developers should take an active role in handling misconfigurations. It also takes a concrete first step towards this goal by providing tooling support to help developers improve their configuration design and harden their systems against configuration errors. Specifically, we build a tool, called Spex, to automatically infer configuration requirements (referred to as constraints) from software source code, and then use the inferred constraints to: (1) expose misconfiguration vulnerabilities (i.e., bad system reactions to configuration errors, such as crashes, hangs, and silent failures); and (2) detect certain types of error-prone configuration design and handling.

We evaluate Spex with one commercial storage system and six open-source server applications. Spex automatically infers a total of 3800 constraints for more than 2500 configuration parameters. Based on these constraints, Spex further detects 743 various misconfiguration vulnerabilities and at least 112 error-prone constraints in the latest versions of the evaluated systems. To this day, 364 vulnerabilities and 80 inconsistent constraints have been confirmed or fixed by developers after we reported them. Our results have influenced the Squid Web proxy project to improve its configuration parsing library towards a more user-friendly design.
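
To make the notion of a constraint concrete, the following is a minimal, hypothetical sketch and not Spex's implementation. It assumes a constraint records a parameter's expected data type and value range, and it shows a program checking a setting against that constraint and returning a pinpointing error message rather than crashing or failing silently. The names `Constraint`, `check`, and the parameter `worker_threads` with its range are illustrative assumptions, not taken from the paper or from any evaluated system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Constraint:
    """Hypothetical constraint: an integer parameter with an inclusive range."""
    name: str
    minimum: int
    maximum: int


def check(constraint: Constraint, raw_value: str) -> Optional[str]:
    """Return None if the setting satisfies the constraint, else an error message."""
    try:
        value = int(raw_value)
    except ValueError:
        # Data-type violation: name the parameter and the offending value.
        return f"{constraint.name}: expected an integer, got {raw_value!r}"
    if not constraint.minimum <= value <= constraint.maximum:
        # Value-range violation: report the allowed range explicitly.
        return (f"{constraint.name}: {value} is outside the allowed range "
                f"[{constraint.minimum}, {constraint.maximum}]")
    return None


# Illustrative parameter and range (assumptions for this sketch only).
threads = Constraint(name="worker_threads", minimum=1, maximum=64)

for setting in ("8", "0", "eight"):
    problem = check(threads, setting)
    print(setting, "->", problem or "ok")
```

In this sketch a valid setting yields no message, while an out-of-range or non-numeric setting yields a message that names the parameter and the violated requirement, which is the kind of reaction the abstract contrasts with crashes, hangs, and silent failures.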

Supplemental Material

d2-03-tianyin-xu.mp4 (mp4, 1.1 GB)

Published in

SOSP '13: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
November 2013
498 pages
ISBN: 9781450323888
DOI: 10.1145/2517349

Copyright © 2013 Owner/Author

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 131 of 716 submissions, 18%
