Rising process, voltage, and temperature (PVT) variations at advanced process nodes make it increasingly difficult to meet aggressive performance targets under strict power budgets. Traditional adaptive techniques that compensate for PVT variations need safety margins and cannot respond to rapid environmental changes. In this thesis, we present a novel voltage management technique, called Razor, which eliminates worst-case safety margins through in situ detection and correction of variation-induced delay errors. In Razor, we use a delay-error-tolerant flip-flop on critical paths to scale the supply voltage down to the point of first failure of a die for a given frequency. Thus, all margins due to global and local PVT variations are eliminated, resulting in significant energy savings. In addition, the supply voltage can be scaled even lower than the point of first failure into the sub-critical region, deliberately tolerating a targeted error rate and thereby providing additional energy savings. Thus, in the context of Razor, a timing error is not a catastrophic system failure but a trade-off between the overhead of error correction and the additional energy savings due to sub-critical operation. The error rate is monitored and the supply voltage is tuned to achieve the targeted error rate.
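The error-rate-driven voltage tuning described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the exponential error model `error_rate`, the point of first failure `v_fail`, the fixed step size, and the simple raise-or-lower control rule.

```python
import math

def error_rate(vdd, v_fail=1.50, steepness=40.0):
    """Hypothetical error model (an assumption, not measured data):
    no errors at or above the point of first failure, then an
    exponentially rising error rate below it, capped at 1."""
    if vdd >= v_fail:
        return 0.0
    return min(1.0, 1e-6 * math.exp(steepness * (v_fail - vdd)))

def tune_voltage(v_start, target_rate, steps=200, step=0.005,
                 v_min=0.9, v_max=1.8):
    """Step the supply voltage toward the targeted error rate.

    If the monitored error rate exceeds the target, raise the voltage;
    otherwise lower it to save energy. The loop settles into a small
    oscillation around the voltage that yields the target rate.
    """
    vdd = v_start
    for _ in range(steps):
        measured = error_rate(vdd)
        if measured > target_rate:
            vdd = min(v_max, vdd + step)   # too many errors: back off
        else:
            vdd = max(v_min, vdd - step)   # margin available: scale down
    return vdd

final_vdd = tune_voltage(v_start=1.8, target_rate=1e-3)
print(f"settled near {final_vdd:.3f} V")
```

In this toy model the controller settles below `v_fail`, that is, in the sub-critical region, because a nonzero error rate is tolerated in exchange for a lower supply voltage.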
We developed two techniques, called RazorI and RazorII, for implementing Razor-based voltage tuning in microprocessors. The RazorI approach achieves error detection by double-sampling the critical-path output at different points in time and comparing both samples. A global recovery signal overwrites the earlier, speculative sample with the later sample and restores the pipeline to its correct state. We implemented RazorI error detection and correction in a 64-bit processor in 0.18 µm technology and obtained 50% energy savings over the worst case at 120 MHz. However, the efficacy of the RazorI technique for high-performance processors is undermined by its reliance on a metastability detector and a potentially timing-critical pipeline recovery path.
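As a behavioral illustration of double sampling, the following sketch models one critical-path endpoint. It is a simplification under stated assumptions: the names `razor1_sample`, `clock_period`, and `shadow_delay` are hypothetical, and the model ignores metastability and short-path (hold) constraints that a real circuit must handle.

```python
from dataclasses import dataclass

@dataclass
class SampleResult:
    main: int        # speculative sample taken at the clock edge
    shadow: int      # later sample taken after shadow_delay
    error: bool      # True when the two samples disagree
    corrected: int   # value held after recovery

def razor1_sample(old_value, new_value, arrival_time,
                  clock_period, shadow_delay):
    """Model one double-sampled endpoint (hypothetical helper).

    arrival_time is when new_value settles, measured from the
    launching clock edge. Data arriving after the shadow sample is
    outside what this simple model can detect.
    """
    if arrival_time > clock_period + shadow_delay:
        raise ValueError("arrival is later than the shadow sample")
    # The main sample sees the new value only if it arrived in time.
    main = new_value if arrival_time <= clock_period else old_value
    # The delayed sample always sees the settled value in this model.
    shadow = new_value
    error = main != shadow
    # Recovery overwrites the speculative sample with the later one.
    corrected = shadow if error else main
    return SampleResult(main, shadow, error, corrected)

on_time = razor1_sample(0, 1, arrival_time=0.9, clock_period=1.0, shadow_delay=0.5)
late = razor1_sample(0, 1, arrival_time=1.2, clock_period=1.0, shadow_delay=0.5)
print(on_time.error, late.error, late.corrected)
```

Note that a late arrival whose value happens to equal the old value is not flagged, since both samples agree and the speculative value is already correct.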
The RazorII approach addresses this issue by recovering from delay errors through a conventional architectural-replay mechanism. Error detection in RazorII occurs by flagging spurious transitions at critical-path endpoints. Furthermore, RazorII also detects soft errors (SER) in logic and registers. We implemented a RazorII-enabled 64-bit processor in 0.13 µm technology and obtained 33% power savings over the worst case. SER tolerance was demonstrated with radiation experiments.
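The combination of transition-based detection and replay can be sketched behaviorally as follows. This is only an illustration under assumptions: the detection window after the clock edge, the function names, and the representation of each instruction as a list of attempts are all hypothetical and are not taken from the thesis.

```python
def has_spurious_transition(transition_times, window_start, window_end):
    """Flag any endpoint transition inside the detection window.

    Times are relative to the capturing clock edge; negative times
    are ordinary transitions that settle before the edge. The window
    bounds are assumed values for this sketch.
    """
    return any(window_start < t <= window_end for t in transition_times)

def run_with_replay(program, window=(0.0, 0.3)):
    """Commit each instruction, replaying it when an error is flagged.

    program is a list of instructions; each instruction is a list of
    attempts, and each attempt is a list of transition times observed
    at a critical-path endpoint. Returns (committed, replays).
    """
    committed = replays = 0
    for attempts in program:
        for transitions in attempts:
            if has_spurious_transition(transitions, *window):
                replays += 1       # discard the result and re-execute
                continue
            committed += 1         # clean attempt: commit the result
            break
    return committed, replays

program = [
    [[-0.2]],            # settles before the edge: no error
    [[0.1], [-0.1]],     # late transition, then a clean replay
    [[-0.3, -0.05]],     # two early transitions: no error
]
print(run_with_replay(program))
```

Because recovery is handled by replaying the instruction at the architectural level, the endpoint in this model only needs to raise a flag; it does not have to restore the correct value itself.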