As device geometries shrink and processor clock frequencies rise, the incidence of hardware transient errors increases. At the same time, computer architectures such as pipelined, superscalar and Very Long Instruction Word (VLIW) processors use increasing degrees of instruction-level resource parallelism to achieve performance goals. Full utilization of this parallelism is difficult to achieve and sustain, which leaves resources idle.
This thesis explores the use of such idle resources for concurrent error detection in processors employing instruction-level resource parallelism. The focus is on detecting errors in program control flow and program data. The approach is experimental: a commercial VLIW processor, the Multiflow TRACE 14/300, is selected as the target processor, and its resource utilization during execution of 11 scientific benchmark programs is examined. The evaluation reveals that resource utilization is low, and the fundamental factors limiting it are identified. These factors indicate that significant idle resources are likely to exist across a wide range of applications, both for the TRACE 14/300 and for other processors employing a significant amount of instruction-level parallelism.
A methodology called Available Resource-driven Control-flow monitoring (ARC) is developed to use idle processor resources for detecting transient control-flow errors. It is unique in that the monitoring computation's resource use is tailored to the idle resources available in the application processor. An algorithm for implementing the ARC-based monitoring computation is presented, together with results characterizing its error detection properties. The results demonstrate that ARC is highly effective in using the idle resources of a processor to achieve concurrent error detection at a very low cost in performance overhead.
Finally, a technique for detecting errors in program data, called Algorithm-Based Fault Tolerance (ABFT), is applied to the TRACE 14/300. ABFT is found to detect a high percentage of data errors, but the degree to which it can make use of idle resources varies considerably with the application. Overall, the results demonstrate that concurrent error detection techniques can significantly reduce their hardware and performance overhead by using idle resources in processors employing instruction-level resource parallelism, while still achieving effective error coverage.
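As an illustration of the kind of data check ABFT performs, the sketch below shows the well-known checksum form of ABFT for matrix multiplication. It is not taken from the thesis and says nothing about how the checks were scheduled on the TRACE 14/300; the function names (`matmul`, `with_column_checksum`, `with_row_checksum`, `check`) and the tolerance parameter are assumptions made for this example only.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def with_column_checksum(a):
    """Append one extra row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Append one extra column holding the sum of each row of b."""
    return [row + [sum(row)] for row in b]


def check(c_full, tol=1e-9):
    """Return (bad_rows, bad_cols) whose checksums do not match.

    c_full is the product of a column-checksum matrix and a
    row-checksum matrix, so its last column and last row should
    equal the row sums and column sums of the data part.
    """
    n = len(c_full) - 1       # number of data rows
    m = len(c_full[0]) - 1    # number of data columns
    bad_rows = [i for i in range(n)
                if abs(sum(c_full[i][:m]) - c_full[i][m]) > tol]
    bad_cols = [j for j in range(m)
                if abs(sum(c_full[i][j] for i in range(n)) - c_full[n][j]) > tol]
    return bad_rows, bad_cols


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c_full = matmul(with_column_checksum(a), with_row_checksum(b))
clean = check(c_full)

# Simulate a transient error in one element of the result.
c_full[0][1] += 1
faulty = check(c_full)
```

In this sketch a single corrupted element shows up as one mismatching row and one mismatching column, which locates the error. The checksum arithmetic is extra work that is independent of the main computation, which is the sort of work that could in principle be placed on otherwise idle functional units.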
Index Terms
- Exploitation of instruction-level parallelism for detection of processor execution errors