Multiple-context processors have been proposed as an architectural technique to mitigate the effects of large memory latency in multiprocessors. We examine two schemes for implementing multiple-context processors. The first scheme switches between contexts only on a cache miss, while the other interleaves the contexts on a cycle-by-cycle basis. Both schemes provide the capability for a single context to fully utilize the pipeline. We show that cycle-by-cycle interleaving of contexts provides a performance advantage over switching contexts only at a cache miss. This advantage results from the context interleaving hiding pipeline dependencies and reducing the context switch cost. In addition, we show that while the implementation of the interleaved scheme is more complex, the complexity is not overwhelming. As pipelines get deeper and operate at lower percentages of peak performance, the performance advantage of the interleaved scheme is likely to justify its additional complexity.
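The difference between the two schemes can be illustrated with a toy cycle-level model. This is only a sketch under stated assumptions, not the simulator or the parameters used in the paper: the `Context` class, the `simulate` function, and every numeric value (miss interval, miss latency, dependency stall, and a 7-cycle blocked switch cost) are hypothetical and chosen for illustration. The model also assumes that a miss in the interleaved scheme costs the pipeline no extra cycles beyond making that context unavailable.

```python
from dataclasses import dataclass


@dataclass
class Context:
    remaining: int        # instructions still to issue
    issued: int = 0       # instructions issued so far
    ready_at: int = 0     # earliest issue cycle allowed by pipeline dependencies
    miss_until: int = 0   # cycle at which an outstanding cache miss returns


def simulate(scheme, n_contexts=4, instrs=1000, miss_every=25, miss_latency=50,
             dep_every=4, dep_stall=2, blocked_switch_cost=7):
    """Return issue-slot utilization (instructions per cycle) for a toy workload.

    All parameters are illustrative assumptions, not values from the paper.
    """
    if scheme not in ("blocked", "interleaved"):
        raise ValueError("scheme must be 'blocked' or 'interleaved'")
    ctxs = [Context(remaining=instrs) for _ in range(n_contexts)]
    cycle, current = 0, 0
    while any(c.remaining > 0 for c in ctxs):
        # Round-robin order of the other contexts, starting after `current`.
        others = [(current + k) % n_contexts for k in range(1, n_contexts)]
        if scheme == "interleaved":
            # Any context that is neither waiting on a miss nor inside a
            # dependency bubble may take this cycle's issue slot.
            pick = next((i for i in others + [current]
                         if ctxs[i].remaining > 0
                         and ctxs[i].miss_until <= cycle
                         and ctxs[i].ready_at <= cycle), None)
        else:
            c = ctxs[current]
            if c.remaining == 0 or c.miss_until > cycle:
                # Blocked scheme: leave the current context only on a miss
                # (or when it has finished), paying the full switch cost.
                nxt = next((i for i in others
                            if ctxs[i].remaining > 0
                            and ctxs[i].miss_until <= cycle), None)
                if nxt is not None:
                    current = nxt
                    cycle += blocked_switch_cost
                    continue
                pick = None  # nothing runnable: the pipeline idles
            else:
                # Dependency bubbles are not hidden: the pipeline stalls.
                pick = current if c.ready_at <= cycle else None
        if pick is not None:
            c = ctxs[pick]
            c.remaining -= 1
            c.issued += 1
            # Every dep_every-th instruction delays its successor by dep_stall.
            bubble = dep_stall if c.issued % dep_every == 0 else 0
            c.ready_at = cycle + 1 + bubble
            # Every miss_every-th instruction misses in the cache.
            if c.issued % miss_every == 0:
                c.miss_until = cycle + miss_latency
            current = pick
        cycle += 1
    return n_contexts * instrs / cycle


if __name__ == "__main__":
    for scheme in ("blocked", "interleaved"):
        print(scheme, round(simulate(scheme), 3))
```

With a single context the two schemes behave identically in this model. With several contexts, the interleaved scheme fills dependency bubbles with instructions from other contexts and avoids the per-miss switch penalty, which is the qualitative effect the abstract describes. The absolute numbers depend entirely on the assumed parameters.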