ABSTRACT
Recent strategies to improve the observable resilience of applications require the ability to classify the vulnerabilities of individual application components (e.g., data structures, instructions) and then selectively apply protection mechanisms to the critical ones. To facilitate this vulnerability classification, it is important to have accurate, quantitative techniques that can be applied uniformly and automatically across real-world applications. Traditional methods cannot effectively quantify vulnerability because they lack a holistic view of system resilience and come with prohibitive evaluation costs. In this paper, we introduce a data-driven, practical methodology to analyze these application vulnerabilities using a novel resilience metric: the data vulnerability factor (DVF). DVF integrates knowledge of both the application and the target hardware into its calculation. To calculate DVF, we extend a performance modeling language to provide a structured, fast modeling solution. We evaluate our methodology on six representative computational kernels; we demonstrate the significance of DVF by quantifying the impact of algorithm optimization on vulnerability and the effectiveness of specific hardware protection mechanisms.