Abstract
We present a sound method for clustering alarms from static analyzers. Our method clusters alarms by discovering sound dependencies between them such that if the dominant alarms of a cluster turns out to be false, all the other alarms in the same cluster are guaranteed to be false. We have implemented our clustering algorithm on top of a realistic buffer-overflow analyzer and proved that our method reduces 45% of alarm reports. Our framework is applicable to any abstract interpretation-based static analysis and orthogonal to abstraction refinements and statistical ranking schemes.
- Gogul Balakrishnan, Sriram Sankaranarayanan, Franjo Ivančić, Ou Wei, and Aarti Gupta. 2008. SLR: Path-sensitive analysis through infeasible-path detection and syntactic language refinement. In Proceedings of the 15th International Symposium on Static Analysis (SAS’08). Springer-Verlag, Berlin, 238--254. Google ScholarDigital Library
- Thomas Ball, Mayur Naik, and Sriram K. Rajamani. 2003. From symptom to cause: Localizing errors in counterexample traces. In Proceedings of the 30th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’03). ACM, New York, NY, 97--105. Google ScholarDigital Library
- Mike Beaven and Ryan Stansifer. 1993. Explaining type errors in polymorphic languages. ACM Lett. Program. Lang. Syst. 2, 1--4 (March 1993), 17--30. 1057-4514 Google ScholarDigital Library
- Sam Blackshear, Bor-Yuh Evan Chang, and Manu Sridharan. 2013. Thresher: Precise refutations for heap reachability. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’13). ACM, New York, NY, 275--286. Google ScholarDigital Library
- Bruno Blanchet, Patrick Cousot, Radhia Cousot, Jérome Feret, Laurent Mauborgne, Antoine Miné, David Monniaux, and Xavier Rival. 2003. A static analyzer for large safety-critical software. In Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation (PLDI’03). ACM, New York, NY, 196--207. Google ScholarDigital Library
- Olaf Chitil. 2001. Compositional explanation of types and algorithmic debugging of type errors. In Proceedings of the 6th ACM SIGPLAN International Conference on Functional Programming (ICFP’01). ACM, New York, NY, 193--204. Google ScholarDigital Library
- Cristina Cifuentes, Christian Hoermann, Nathan Keynes, Lian Li, Simon Long, Erica Mealy, Michael Mounteney, and Bernhard Scholz. 2009. BegBunch: Benchmarking for C bug detection tools. In Proceedings of the 2nd International Workshop on Defects in Large Software Systems: Held in Conjunction with the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2009) (DEFECTS’09). ACM, New York, NY, 16--20. Google ScholarDigital Library
- Patrick Cousot and Rahida Cousot. 1992. Abstract interpretation and application to logic programs. J. Log. Program. 13, 2--3 (July 1992), 103--179. 0743-1066 Google ScholarDigital Library
- P. Cousot, P. Ganty, and J.-F. Raskin. 2007. Fixpoint-guided abstraction refinements. In Proceedings of the 14th International Symposium on Static Analysis, SAS’07, G. Filé and H. Riis Nielson (Eds.). Springer, Berlin, 333--348. Google ScholarDigital Library
- Vijay D’Silva, Leopold Haller, Daniel Kroening, and Michael Tautschnig. 2012. Numeric bounds analysis with conflict-driven learning. In Proceedings of the 18th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS’12). Springer-Verlag, Berlin, 48--63. Google ScholarDigital Library
- Dominic Duggan and Frederick Bent. 1995. Explaining type inference. In Science of Computer Programming. 37--83. Google ScholarDigital Library
- Alex Groce and Willem Visser. 2003. What went wrong: Explaining counterexamples. In Proceedings of the 10th International Conference on Model Checking Software (SPIN’03). Springer-Verlag, Berlin, 121--136. http://dl.acm.org/citation.cfm?id=1767111.1767119 Google ScholarDigital Library
- Bhargav S. Gulavani and Sriram K. Rajamani. 2006. Counterexample driven refinement for abstract interpretation. In Tools and Algorithms for the Construction and Analysis of Systems, Holger Hermanns and Jens Palsberg (Eds.). Lecture Notes in Computer Science, Vol. 3920. Springer, Berlin, 474--488. Google ScholarDigital Library
- Bhargav S. Gulavani, Supratik Chakraborty, Aditya V. Nori, and Sriram K. Rajamani. 2008. Automatically refining abstract interpretations. In Proceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS’08/ETAPS’08). Springer-Verlag, Berlin, 443--458. Google ScholarDigital Library
- Gregory F. Johnson and Janet A. Walz. 1986. A maximum-flow approach to anomaly isolation in unification-based incremental type inference. In Proceedings of the 13th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL’86). ACM, New York, NY, 44--57. Google ScholarDigital Library
- Yungbum Jung, Jaehwang Kim, Jaeho Shin, and Kwangkeun Yi. 2005. Taming false alarms from a domain-unaware C analyzer by a bayesian statistical post analysis. In Proceedings of the 12th International Conference on Static Analysis (SAS’05). Springer-Verlag, Berlin, 203--217. Google ScholarDigital Library
- Heejung Kim, Yungbum Jung, Sunghun Kim, and Kwankeun Yi. 2011. MeCC: Memory comparison-based clone detector. In Proceedings of the 33rd International Conference on Software Engineering (ICSE’11). ACM, New York, NY, 301--310. Google ScholarDigital Library
- Youil Kim, Jooyong Lee, Hwansoo Han, and Kwang-Moo Choe. 2010. Filtering false alarms of buffer overflow analysis using SMT solvers. Inf. Softw. Technol. 52, 2 (Feb. 2010), 210--219. 0950-5849 Google ScholarDigital Library
- Ted Kremenek, Ken Ashcraft, Junfeng Yang, and Dawson Engler. 2004. Correlation exploitation in error ranking. In Proceedings of the 12th ACM SIGSOFT Twelfth International Symposium on Foundations of Software Engineering (SIGSOFT’04/FSE-12). ACM, New York, NY, 83--93. Google ScholarDigital Library
- Ted Kremenek and Dawson Engler. 2003. Z-ranking: Using statistical analysis to counter the impact of static analysis approximations. In Proceedings of the 10th International Conference on Static Analysis (SAS’03). Springer-Verlag, Berlin, 295--315. Google ScholarDigital Library
- Wei Le and Mary Lou Soffa. 2010. Path-based fault correlations. In Proceedings of the Eighteenth ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE’10). ACM, New York, NY, 307--316. Google ScholarDigital Library
- Woosuk Lee, Wonchan Lee, and Kwangkeun Yi. 2012. Sound non-statistical clustering of static analysis alarms. In Proceedings of the 13th International Conference on Verification, Model Checking, and Abstract Interpretation (VMCAI’12). Springer-Verlag, Berlin, 299--314. Google ScholarDigital Library
- Percy Liang, Omer Tripp, and Mayur Naik. 2011. Learning minimal abstractions. In Proceedings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’11). ACM, New York, NY, 31--42. Google ScholarDigital Library
- Ravi Mangal, Xin Zhang, Aditya V. Nori, and Mayur Naik. 2015. A user-guided approach to program analysis. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering (ESEC/FSE 2015). ACM, New York, NY, 462--473. Google ScholarDigital Library
- MathWorks. 2015. Polyspace Embedded Software Verification. Retrieved from http://www.mathworks.com/products/polyspace/index.html.Google Scholar
- Laurent Mauborgne and Xavier Rival. 2005. Trace partitioning in abstract interpretation based static analyzers. In Proceedings of the 14th European Conference on Programming Languages and Systems (ESOP’05). Springer-Verlag, Berlin, 5--20. Google ScholarDigital Library
- Microsoft. 2015. Code Contracts. Retrieved from http://msdn.microsoft.com/en-us/devlabs/dd491992.aspx.Google Scholar
- Antoine Miné. 2006. The octagon abstract domain. Higher Order Symbol. Comput. 19, 1 (March 2006), 31--100. 1388-3690 Google ScholarDigital Library
- Hakjoo Oh, Kihong Heo, Wonchan Lee, Woosuk Lee, Daejun Park, Jeehoon Kang, and Kwangkeun Yi. 2014. Global sparse analysis framework. ACM Trans. Program. Lang. Syst. 36, 3, Article 8 (Sept. 2014), 44 pages. 0164-0925 Google ScholarDigital Library
- Hakjoo Oh, Kihong Heo, Wonchan Lee, Woosuk Lee, and Kwangkeun Yi. 2012. Design and implementation of sparse global analyses for C-like languages. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’12). ACM, New York, NY, 229--238. Google ScholarDigital Library
- Xavier Rival. 2005. Understanding the origin of alarms in ASTRÉE. In Proceedings of the 12th International Conference on Static Analysis (SAS’05). Springer-Verlag, Berlin, 303--319. Google ScholarDigital Library
- ROPAS. 2017. The Sparrow Static Analyzer. Retrieved from https://github.com/ropas/sparrow.Google Scholar
- F. Tip and T. B. Dinesh. 2001. A slicing-based approach for locating type errors. ACM Trans. Softw. Eng. Methodol. 10, 1 (Jan. 2001), 5--55. 1049-331X Google ScholarDigital Library
- Mitchell Wand. 1986. Finding the source of type errors. In Proceedings of the 13th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL’86). ACM, New York, NY, 38--43. Google ScholarDigital Library
Index Terms
- Sound Non-Statistical Clustering of Static Analysis Alarms
Recommendations
Sound non-statistical clustering of static analysis alarms
VMCAI'12: Proceedings of the 13th international conference on Verification, Model Checking, and Abstract InterpretationWe present a sound method for clustering alarms from static analyzers. Our method clusters alarms by discovering sound dependencies between them such that if the dominant alarm of a cluster turns out to be false (respectively true) then it is assured ...
Pushdown control-flow analysis for free
POPL '16Traditional control-flow analysis (CFA) for higher-order languages introduces spurious connections between callers and callees, and different invocations of a function may pollute each other's return flows. Recently, three distinct approaches have been ...
Machine-learning-guided selectively unsound static analysis
ICSE '17: Proceedings of the 39th International Conference on Software EngineeringWe present a machine-learning-based technique for selectively applying unsoundness in static analysis. Existing bug-finding static analyzers are unsound in order to be precise and scalable in practice. However, they are uniformly unsound and hence at ...
Comments