This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today. In this edition, the authors bring their trademark method of quantitative analysis not only to high-performance desktop machine design, but also to the design of embedded and server systems. They have illustrated their principles with designs from all three of these domains, including examples from consumer electronics, multimedia and Web technologies, and high-performance computing.
Cited By
- Zhong W, Li J, Niu N and Fu F Algorithm analysis of MCU automatic trimming The 2nd International Conference on Computing and Data Science, (1-6)
- Li Y, Phanishayee A, Murray D and Kim N Doing more with less Proceedings of the Workshop on Hot Topics in Operating Systems, (119-127)
- Fosse T, Tisi M, Bousse E, Mottu J and Sunyé G Towards platform specific energy estimation for executable domain-specific modeling languages Proceedings of the 22nd International Conference on Model Driven Engineering Languages and Systems, (314-317)
- Dong X, Shen Z, Criswell J, Cox A and Dwarkadas S Spectres, virtual ghosts, and hardware support Proceedings of the 7th International Workshop on Hardware and Architectural Support for Security and Privacy, (1-9)
- Kurth A, Capotondi A, Vogel P, Benini L and Marongiu A HERO Proceedings of the 2nd Workshop on AutotuniNg and aDaptivity AppRoaches for Energy efficient HPC Systems, (1-6)
- Multanen J, Viitanen T, Jääskeläinen P and Takala J (2018). Instruction Fetch Energy Reduction with Biased SRAMs, Journal of Signal Processing Systems, 90:11, (1519-1532), Online publication date: 1-Nov-2018.
- Hammari E, Kjeldsberg P and Catthoor F (2018). Runtime Precomputation of Data-Dependent Parameters in Embedded Systems, ACM Transactions on Embedded Computing Systems, 17:3, (1-21), Online publication date: 31-May-2018.
- Altaf M and Wood D (2015). LogCA: A Performance Model for Hardware Accelerators, IEEE Computer Architecture Letters, 14:2, (132-135), Online publication date: 1-Jul-2015.
- Childers B, Yang J and Zhang Y Achieving Yield, Density and Performance Effective DRAM at Extreme Technology Sizes Proceedings of the 2015 International Symposium on Memory Systems, (78-84)
- Jurkiewicz T and Mehlhorn K (2015). On a Model of Virtual Address Translation, ACM Journal of Experimental Algorithmics, 19, (1-28), Online publication date: 3-Feb-2015.
- Xiang P, Yang Y, Mantor M, Rubin N and Zhou H Revisiting ILP designs for throughput-oriented GPGPU architecture Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, (121-130)
- Tu C, Hsu H, Chen J, Chen C and Hung S (2014). Performance and power profiling for emulated Android systems, ACM Transactions on Design Automation of Electronic Systems, 19:2, (1-25), Online publication date: 1-Mar-2014.
- Dossis M A Floating-Point Paradigm for High-level Synthesis Proceedings of the 18th Panhellenic Conference on Informatics, (1-6)
- Stewin P A Primitive for Revealing Stealthy Peripheral-Based Attacks on the Computing Platform's Main Memory Proceedings of the 16th International Symposium on Research in Attacks, Intrusions, and Defenses - Volume 8145, (1-20)
- Ge R, Feng X and Sun X SERA-IO Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), (204-211)
- Alvarez L, Vilanova L, Gonzalez M, Martorell X, Navarro N and Ayguade E Hardware-software coherence protocol for the coexistence of caches and local memories Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, (1-11)
- Park S, Gupta S, Mojumder N, Raghunathan A and Roy K Future cache design using STT MRAMs for improved energy efficiency Proceedings of the 49th Annual Design Automation Conference, (492-497)
- Stewin P and Bystrov I Understanding DMA malware Proceedings of the 9th international conference on Detection of Intrusions and Malware, and Vulnerability Assessment, (21-41)
- Velev M and Gao P Automatic formal verification of multithreaded pipelined microprocessors Proceedings of the International Conference on Computer-Aided Design, (679-686)
- Thiyagalingam J, Goodman D, Schnabel J, Trefethen A and Grau V (2011). On the usage of GPUs for efficient motion estimation in medical image sequences, Journal of Biomedical Imaging, 2011, (1-15), Online publication date: 1-Jan-2011.
- Gilroy M, Irvine J and Atkinson R (2011). RAID 6 Hardware Acceleration, ACM Transactions on Embedded Computing Systems, 10:4, (1-17), Online publication date: 1-Nov-2011.
- Schoeberl M, Korsholm S, Kalibera T and Ravn A (2011). A Hardware Abstraction Layer in Java, ACM Transactions on Embedded Computing Systems, 10:4, (1-40), Online publication date: 1-Nov-2011.
- Caparrós Cabezas V and Stanley-Marbell P Parallelism and data movement characterization of contemporary application classes Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures, (95-104)
- Yuan F, Wright S, Eder K and May D Managing complexity through abstraction Proceedings of the 13th international conference on Formal methods and software engineering, (585-600)
- Chen W, Wang Z, Dou Q and Wang Y A novel chaining approach to indirect control transfer instructions Proceedings of the IFIP WG 8.4/8.9 international cross domain conference on Availability, reliability and security for business, enterprise and health information systems, (309-320)
- Habgood K and Arel I Revisiting Cramer's rule for solving dense linear systems Proceedings of the 2010 Spring Simulation Multiconference, (1-8)
- Torbert S, Vishkin U, Tzur R and Ellison D Is teaching parallel algorithmic thinking to high school students possible? Proceedings of the 41st ACM technical symposium on Computer science education, (290-294)
- Vaidyanathan N, Billionniere E and Collofello J (2010). A preliminary comparative survey of computer architecture courses across the nation's top schools, Journal of Computing Sciences in Colleges, 25:4, (193-202), Online publication date: 1-Apr-2010.
- Jin Z, Pittman R and Forin A Reconfigurable custom floating-point instructions (abstract only) Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays, (287-287)
- Velev M and Gao P A method for debugging of pipelined processors in formal verification by correspondence checking Proceedings of the 2010 Asia and South Pacific Design Automation Conference, (619-624)
- Schoeberl M, Preußer T and Uhrig S The embedded Java benchmark suite JemBench Proceedings of the 8th International Workshop on Java Technologies for Real-Time and Embedded Systems, (120-127)
- Pesterev A, Zeldovich N and Morris R Locating cache performance bottlenecks using data profiling Proceedings of the 5th European conference on Computer systems, (335-348)
- Cabodi G, Lavagno L, Murciano M, Kondratyev A and Watanabe Y (2010). Speeding-up heuristic allocation, scheduling and binding with SAT-based abstraction/refinement techniques, ACM Transactions on Design Automation of Electronic Systems, 15:2, (1-34), Online publication date: 1-Feb-2010.
- Amir A and Levy A String rearrangement metrics Algorithms and Applications, (1-33)
- Velev M and Gao P Method for formal verification of soft-error tolerance mechanisms in pipelined microprocessors Proceedings of the 12th international conference on Formal engineering methods and software engineering, (355-370)
- Amir A, Eisenberg E, Keller O, Levy A and Porat E Approximate string matching with stuck address bits Proceedings of the 17th international conference on String processing and information retrieval, (395-405)
- La Fratta P and Kogge P Models for generating locality-tuned traveling threads for a hierarchical multi-level heterogeneous multicore Proceedings of the 7th ACM international conference on Computing frontiers, (227-236)
- Schwartz-Narbonne D, Chan C, Mahajan Y and Malik S Supporting RTL flow compatibility in a microarchitecture-level design framework Proceedings of the 7th IEEE/ACM international conference on Hardware/software codesign and system synthesis, (343-352)
- Wang B, Yao Y, Himmelspach J, Ewald R and Uhrmacher A Experimental analysis of logical process simulation algorithms in JAMES II Winter Simulation Conference, (1167-1179)
- Murase M, Shimizu K, Plouffe W and Sakamoto M Effective implementation of the cell broadband engine™ isolation loader Proceedings of the 16th ACM conference on Computer and communications security, (303-313)
- Ferri B and Ferri A (2009). Reconfiguration of IIR filters in response to computer resource availability, ACM Transactions on Embedded Computing Systems, 9:1, (1-25), Online publication date: 1-Oct-2009.
- El-Shobaky S, El-Mahdy A and El-Nahas A Automatic vectorization using dynamic compilation and tree pattern matching technique in Jikes RVM Proceedings of the 4th workshop on the Implementation, Compilation, Optimization of Object-Oriented Languages and Programming Systems, (63-69)
- Bilardi G, Ekanadham K and Pattnaik P (2009). On approximating the ideal random access machine by physical machines, Journal of the ACM, 56:5, (1-57), Online publication date: 1-Aug-2009.
- Sirowy S, Sheldon D, Givargis T and Vahid F (2009). Virtual microcontrollers, ACM SIGBED Review, 6:1, (1-8), Online publication date: 1-Jan-2009.
- Moreto M, Cazorla F, Ramirez A, Sakellariou R and Valero M (2009). FlexDCP, ACM SIGOPS Operating Systems Review, 43:2, (86-96), Online publication date: 21-Apr-2009.
- Williams S, Waterman A and Patterson D (2009). Roofline, Communications of the ACM, 52:4, (65-76), Online publication date: 1-Apr-2009.
- Sahoo S, Shekhar C, Kodali S, Asati A and Gupta A (2009). Dual channel addition based FFT processor architecture for signal and image processing, International Journal of High Performance Systems Architecture, 2:1, (35-45), Online publication date: 1-Dec-2009.
- Amir A, Aumann Y, Kapah O, Levy A and Porat E (2009). Approximate string matching with address bit errors, Theoretical Computer Science, 410:51, (5334-5346), Online publication date: 1-Nov-2009.
- Le G and Shi Y (2009). Access region cache with register guided memory reference partitioning, Journal of Systems Architecture: the EUROMICRO Journal, 55:10-12, (434-445), Online publication date: 1-Oct-2009.
- Xu L (2008). A modular approach to language engineering using XML and inexpensive robots, Journal of Computing Sciences in Colleges, 23:5, (133-141), Online publication date: 1-May-2008.
- Kiselyov O, Byrd W, Friedman D and Shan C Pure, declarative, and constructive arithmetic relations (declarative pearl) Proceedings of the 9th international conference on Functional and logic programming, (64-80)
- Pirzadeh H and Dubé D VEP Proceedings of the 1st ACM workshop on Virtual machine security, (9-18)
- Bungo J (2008). The use of compiler optimizations for embedded systems software, XRDS: Crossroads, The ACM Magazine for Students, 15:1, (8-15), Online publication date: 1-Sep-2008.
- Koo H and Mishra P Specification-based compaction of directed tests for functional validation of pipelined processors Proceedings of the 6th IEEE/ACM/IFIP international conference on Hardware/Software codesign and system synthesis, (137-142)
- Chowdhury R and Ramachandran V Cache-efficient dynamic programming algorithms for multicores Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures, (207-216)
- Middha B, Simpson M and Barua R (2008). MTSS, ACM Transactions on Embedded Computing Systems, 7:4, (1-37), Online publication date: 1-Jul-2008.
- Wang W, Wang Q, Wei W and Liu D Modeling and evaluating heterogeneous memory architectures by trace-driven simulation Proceedings of the 2008 workshop on Memory access on future processors: a solved problem?, (369-376)
- He B and Luo Q (2008). Cache-oblivious databases, ACM Transactions on Database Systems, 33:2, (1-42), Online publication date: 1-Jun-2008.
- Xu L (2008). Language engineering in the context of a popular, inexpensive robot platform, ACM SIGCSE Bulletin, 40:1, (43-47), Online publication date: 29-Feb-2008.
- Xu L Language engineering in the context of a popular, inexpensive robot platform Proceedings of the 39th SIGCSE technical symposium on Computer science education, (43-47)
- Schoeberl M (2008). A Java processor architecture for embedded real-time systems, Journal of Systems Architecture: the EUROMICRO Journal, 54:1-2, (265-286), Online publication date: 1-Jan-2008.
- Khanli L and Analoui M (2008). An approach to grid resource selection and fault management based on ECA rules, Future Generation Computer Systems, 24:4, (296-316), Online publication date: 1-Apr-2008.
- Amir A, Aumann Y, Kapah O, Levy A and Porat E Approximate String Matching with Address Bit Errors Proceedings of the 19th annual symposium on Combinatorial Pattern Matching, (118-129)
- Xu L (2007). Project the wiki way, Journal of Computing Sciences in Colleges, 22:6, (109-116), Online publication date: 1-Jun-2007.
- Shacham A, Bergman K and Carloni L The case for low-power photonic networks on chip Proceedings of the 44th annual Design Automation Conference, (132-135)
- Murphy R and Kogge P (2007). On the Memory Access Patterns of Supercomputer Applications, IEEE Transactions on Computers, 56:7, (937-945), Online publication date: 1-Jul-2007.
- Sugihara M, Ishihara T and Murakami K Task scheduling for reliable cache architectures of multiprocessor systems Proceedings of the conference on Design, automation and test in Europe, (1490-1495)
- Rhod E, Lisbôa C and Carro L A low-SER efficient core processor architecture for future technologies Proceedings of the conference on Design, automation and test in Europe, (1448-1453)
- Verma S, Harris I and Ramineni K Interactive presentation: Automatic generation of functional coverage models from behavioral verilog descriptions Proceedings of the conference on Design, automation and test in Europe, (900-905)
- Dominguez A, Nguyen N and Barua R Recursive function data allocation to scratch-pad memory Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems, (65-74)
- Koc H, Kandemir M, Ercanli E and Ozturk O Reducing off-chip memory access costs using data recomputation in embedded chip multi-processors Proceedings of the 44th annual Design Automation Conference, (224-229)
- Ali A, Johnsson L and Subhlok J Scheduling FFT computation on SMP and multicore systems Proceedings of the 21st annual international conference on Supercomputing, (293-301)
- Nesbit K, Laudon J and Smith J (2007). Virtual private caches, ACM SIGARCH Computer Architecture News, 35:2, (57-68), Online publication date: 9-Jun-2007.
- Nesbit K, Laudon J and Smith J Virtual private caches Proceedings of the 34th annual international symposium on Computer architecture, (57-68)
- Sasanka R, Li M, Adve S, Chen Y and Debes E (2007). ALP, ACM Transactions on Architecture and Code Optimization, 4:1, (3-es), Online publication date: 1-Mar-2007.
- Li X and Parashar M (2007). Hybrid Runtime Management of Space-Time Heterogeneity for Parallel Structured Adaptive Applications, IEEE Transactions on Parallel and Distributed Systems, 18:9, (1202-1214), Online publication date: 1-Sep-2007.
- Qin X (2007). Design and analysis of a load balancing strategy in data grids, Future Generation Computer Systems, 23:1, (132-137), Online publication date: 1-Jan-2007.
- Yang H, Ziavras S and Hu J (2007). Reconfiguration support for vector operations, International Journal of High Performance Systems Architecture, 1:2, (89-97), Online publication date: 1-Oct-2007.
- Marescaux T, Brockmeyer E and Corporaal H The Impact of Higher Communication Layers on NoC Supported MP-SoCs Proceedings of the First International Symposium on Networks-on-Chip, (107-116)
- Lin T, Lin H, Chao C, Liu C and Jen C (2006). A Compact DSP Core with Static Floating-Point Arithmetic, Journal of VLSI Signal Processing Systems, 42:2, (127-138), Online publication date: 1-Feb-2006.
- Andrews J and Baker N (2006). Xbox 360 System Architecture, IEEE Micro, 26:2, (25-37), Online publication date: 1-Mar-2006.
- Yang X and Vaidya N (2006). A Wireless MAC Protocol Using Implicit Pipelining, IEEE Transactions on Mobile Computing, 5:3, (258-273), Online publication date: 1-Mar-2006.
- Zheng K, Che H, Wang Z, Liu B and Zhang X (2006). DPPC-RE, IEEE Transactions on Computers, 55:8, (947-961), Online publication date: 1-Aug-2006.
- Kwak J, Jhang S and Jhon C Accuracy enhancement by selective use of branch history in embedded processor Proceedings of the 6th international conference on Computational Science - Volume Part IV, (979-986)
- Kwak J and Jhon C Recovery logics for speculative update global and local branch history Proceedings of the 21st international conference on Computer and Information Sciences, (258-266)
- Bariamis D, Iakovidis D and Maroulis D Dedicated hardware for real-time computation of second-order statistical features for high resolution images Proceedings of the 8th international conference on Advanced Concepts For Intelligent Vision Systems, (67-77)
- Dolev S and Haviv Y (2006). Self-Stabilizing Microprocessor, IEEE Transactions on Computers, 55:4, (385-399), Online publication date: 1-Apr-2006.
- Cérin C, Koskas M, Fkaier H and Jemni M (2006). Sequential in-core sorting performance for a SQL data service and for parallel sorting on heterogeneous clusters, Future Generation Computer Systems, 22:7, (776-783), Online publication date: 1-Aug-2006.
- Zhu Y and Jiang H (2006). CEFT, Journal of Parallel and Distributed Computing, 66:2, (291-306), Online publication date: 1-Feb-2006.
- Mendes J, Coutinho L and Martins C Web memory hierarchy learning and research environment Proceedings of the 2006 workshop on Computer architecture education: held in conjunction with the 33rd International Symposium on Computer Architecture, (5-es)
- Gill G, Hansen J and Singh M Loop pipelining for high-throughput stream computation using self-timed rings Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design, (289-296)
- Bellens P, Perez J, Badia R and Labarta J CellSs Proceedings of the 2006 ACM/IEEE conference on Supercomputing, (86-es)
- Gilbert J and Abrahamson D Adaptive object code compression Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems, (282-292)
- Adams K and Agesen O (2006). A comparison of software and hardware techniques for x86 virtualization, ACM SIGARCH Computer Architecture News, 34:5, (2-13), Online publication date: 20-Oct-2006.
- Adams K and Agesen O (2006). A comparison of software and hardware techniques for x86 virtualization, ACM SIGPLAN Notices, 41:11, (2-13), Online publication date: 1-Nov-2006.
- Adams K and Agesen O (2006). A comparison of software and hardware techniques for x86 virtualization, ACM SIGOPS Operating Systems Review, 40:5, (2-13), Online publication date: 20-Oct-2006.
- Adams K and Agesen O A comparison of software and hardware techniques for x86 virtualization Proceedings of the 12th international conference on Architectural support for programming languages and operating systems, (2-13)
- Bardine A, Bechini A, Foglia P and Prete C (2005). Analysis of embedded video coder systems, ACM SIGARCH Computer Architecture News, 34:1, (71-76), Online publication date: 1-Mar-2006.
- Chiyonobu A and Sato T (2005). Energy-efficient instruction scheduling utilizing cache miss information, ACM SIGARCH Computer Architecture News, 34:1, (65-70), Online publication date: 1-Mar-2006.
- Yue Y, Lin C and Tan Z (2005). NPCryptBench, ACM SIGARCH Computer Architecture News, 34:1, (49-56), Online publication date: 1-Mar-2006.
- Chen W, Bhansali S, Chilimbi T, Gao X and Chuang W Profile-guided proactive garbage collection for locality optimization Proceedings of the 27th ACM SIGPLAN Conference on Programming Language Design and Implementation, (332-340)
- Chen W, Bhansali S, Chilimbi T, Gao X and Chuang W (2006). Profile-guided proactive garbage collection for locality optimization, ACM SIGPLAN Notices, 41:6, (332-340), Online publication date: 11-Jun-2006.
- Koo H and Mishra P Test generation using SAT-based bounded model checking for validation of pipelined processors Proceedings of the 16th ACM Great Lakes symposium on VLSI, (362-365)
- Ou S, Lin T, Huang C, Kuo Y, Chao C, Liu C and Jen C A 52mW 1200MIPS compact DSP for multi-core media SoC Proceedings of the 2006 Asia and South Pacific Design Automation Conference, (118-119)
- Heffernan M, Wilken K and Shobaki G Data-Dependency Graph Transformations for Superblock Scheduling Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, (77-88)
- Velev M Formal Verification of Pipelined Microprocessors with Delayed Branches Proceedings of the 7th International Symposium on Quality Electronic Design, (296-299)
- Sugihara M, Ishihara T, Muroyama M and Hashimoto K A Simulation-Based Soft Error Estimation Methodology for Computer Systems Proceedings of the 7th International Symposium on Quality Electronic Design, (196-203)
- Velev M Using Abstraction for Efficient Formal Verification of Pipelined Processors with Value Prediction Proceedings of the 7th International Symposium on Quality Electronic Design, (51-56)
- Billerbeck B and Zobel J (2006). Efficient query expansion with auxiliary data structures, Information Systems, 31:7, (573-584), Online publication date: 1-Nov-2006.
- Koukis E and Koziris N Memory and Network Bandwidth Aware Scheduling of Multiprogrammed Workloads on Clusters of SMPs Proceedings of the 12th International Conference on Parallel and Distributed Systems - Volume 1, (345-354)
- Vedantham R, Zhuang Z and Sivakumar R (2006). Hazard avoidance in wireless sensor and actor networks, Computer Communications, 29:13-14, (2578-2598), Online publication date: 1-Aug-2006.
- Zhang C, Zhou H, Zhang M and Xing Z An architectural leakage power reduction method for instruction cache in ultra deep submicron microprocessors Proceedings of the 11th Asia-Pacific conference on Advances in Computer Systems Architecture, (588-594)
- Amir A Asynchronous pattern matching Proceedings of the 17th Annual conference on Combinatorial Pattern Matching, (1-10)
- Wang F, Zhang S, Feng D, Jiang H, Zeng L and Lv S A hybrid scheme for object allocation in a distributed object-storage system Proceedings of the 6th international conference on Computational Science - Volume Part IV, (396-403)
- Di Blas A, Dahle D, Diekhans M, Grate L, Hirschberg J, Karplus K, Keller H, Kendrick M, Mesa-Martinez F, Pease D, Rice E, Schultz A, Speck D and Hughey R (2005). The UCSC Kestrel Parallel Processor, IEEE Transactions on Parallel and Distributed Systems, 16:1, (80-92), Online publication date: 1-Jan-2005.
- Roy A, Panda S, Kumar R and Chakrabarti P (2005). A framework for systematic validation and debugging of pipeline simulators, ACM Transactions on Design Automation of Electronic Systems, 10:3, (462-491), Online publication date: 1-Jul-2005.
- Gunawi H, Agrawal N, Arpaci-Dusseau A, Arpaci-Dusseau R and Schindler J (2005). Deconstructing Commodity Storage Clusters, ACM SIGARCH Computer Architecture News, 33:2, (60-71), Online publication date: 1-May-2005.
- Velev M Automatic formal verification of liveness for pipelined processors with multicycle functional units Proceedings of the 13th IFIP WG 10.5 international conference on Correct Hardware Design and Verification Methods, (97-113)
- Gunawi H, Agrawal N, Arpaci-Dusseau A, Arpaci-Dusseau R and Schindler J Deconstructing Commodity Storage Clusters Proceedings of the 32nd annual international symposium on Computer Architecture, (60-71)
- Haga S, Reeves N, Barua R and Marculescu D (2005). Dynamic functional unit assignment for low power, The Journal of Supercomputing, 31:1, (47-62), Online publication date: 1-Jan-2005.
- Sinha R and Zobel J (2005). Using random sampling to build approximate tries for efficient string sorting, ACM Journal of Experimental Algorithmics, 10, (2.10-es), Online publication date: 31-Dec-2005.
- Papathanasiou A and Scott M Aggressive prefetching Proceedings of the 10th conference on Hot Topics in Operating Systems - Volume 10, (6-6)
- Bardine A, Bechini A, Foglia P and Prete C Analysis of embedded video coder systems Proceedings of the 2005 workshop on MEmory performance: DEaling with Applications, systems and architecture, (71-76)
- Chiyonobu A and Sato T Energy-efficient instruction scheduling utilizing cache miss information Proceedings of the 2005 workshop on MEmory performance: DEaling with Applications, systems and architecture, (65-70)
- Yue Y, Lin C and Tan Z NPCryptBench Proceedings of the 2005 workshop on MEmory performance: DEaling with Applications, systems and architecture, (49-56)
- Rao W, Orailoglu A and Karri R Fault tolerant nanoelectronic processor architectures Proceedings of the 2005 Asia and South Pacific Design Automation Conference, (311-316)
- Velev M Comparison of schemes for encoding unobservability in translation to SAT Proceedings of the 2005 Asia and South Pacific Design Automation Conference, (1056-1059)
- Hasan J and Vijaykumar T (2005). Dynamic pipelining, ACM SIGCOMM Computer Communication Review, 35:4, (205-216), Online publication date: 1-Oct-2005.
- Zennaro M and Sengupta R Distributing synchronous programs using bounded queues Proceedings of the 5th ACM international conference on Embedded software, (325-334)
- Hasan J and Vijaykumar T Dynamic pipelining Proceedings of the 2005 conference on Applications, technologies, architectures, and protocols for computer communications, (205-216)
- Gulati A and Varman P Lexicographic QoS scheduling for parallel I/O Proceedings of the seventeenth annual ACM symposium on Parallelism in algorithms and architectures, (29-38)
- Johnson J, Krandick W and Ruslanov A Architecture-aware classical Taylor shift by 1 Proceedings of the 2005 international symposium on Symbolic and algebraic computation, (200-207)
- Zhong L and Jha N Energy efficiency of handheld computer interfaces Proceedings of the 3rd international conference on Mobile systems, applications, and services, (247-260)
- Schaumont P, Lai B, Qin W and Verbauwhede I Cooperative multithreading on embedded multiprocessor architectures enables energy-scalable design Proceedings of the 42nd annual Design Automation Conference, (27-30)
- Zhang C, Vahid F, Yang J and Najjar W (2005). A way-halting cache for low-energy high-performance systems, ACM Transactions on Architecture and Code Optimization, 2:1, (34-54), Online publication date: 1-Mar-2005.
- Lin T, Chao C, Liu C, Hsiao P, Chen S, Lin L, Liu C and Jen C A unified processor architecture for RISC & VLIW DSP Proceedings of the 15th ACM Great Lakes symposium on VLSI, (50-55)
- Hashempour H and Lombardi F (2005). Application of Arithmetic Coding to Compression of VLSI Test Data, IEEE Transactions on Computers, 54:9, (1166-1177), Online publication date: 1-Sep-2005.
- Petrou D, Gibson G and Ganger G Scheduling speculative tasks in a compute farm Proceedings of the 2005 ACM/IEEE conference on Supercomputing
- Vuletic M, Pozzi L and Ienne P (2005). Seamless Hardware-Software Integration in Reconfigurable Computing Systems, IEEE Design & Test, 22:2, (102-113), Online publication date: 1-Mar-2005.
- Heffernan M and Wilken K (2005). Data-Dependency Graph Transformations for Instruction Scheduling, Journal of Scheduling, 8:5, (427-451), Online publication date: 1-Oct-2005.
- Schoeberl M Design and Implementation of an Efficient Stack Machine Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 3 - Volume 04
- Datta A, Bhunia S, Mukhopadhyay S, Banerjee N and Roy K Statistical Modeling of Pipeline Delay and Design of Pipeline under Process Variation to Enhance Yield in sub-100nm Technologies Proceedings of the conference on Design, Automation and Test in Europe - Volume 2, (926-931)
- Herruzo E, Mesones A, Benavides J, Plata O and Zapata E Distributed architecture system for computer performance testing Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics, (140-147)
- Athanasaki E, Kourtis K, Anastopoulos N and Koziris N Tuning blocked array layouts to exploit memory hierarchy in SMT architectures Proceedings of the 10th Panhellenic conference on Advances in Informatics, (600-610)
- Butt A, Johnson T, Zheng Y and Hu Y Kosha Proceedings of the 2004 ACM/IEEE conference on Supercomputing
- Togawa N, Tachikake K, Miyaoka Y, Yanagisawa M and Ohtsuki T Instruction set and functional unit synthesis for SIMD processor cores Proceedings of the 2004 Asia and South Pacific Design Automation Conference, (743-750)
- Candea G, Cutler J and Fox A (2004). Improving availability with recursive microreboots, Performance Evaluation, 56:1-4, (213-248), Online publication date: 1-Mar-2004.
- Kumar R, Tullsen D, Ranganathan P, Jouppi N and Farkas K Single-ISA Heterogeneous Multi-Core Architectures for Multithreaded Workload Performance Proceedings of the 31st annual international symposium on Computer architecture
- Zhang Y and Chakrabarty K Task Feasibility Analysis and Dynamic Voltage Scaling in Fault-Tolerant Real-Time Embedded Systems Proceedings of the conference on Design, automation and test in Europe - Volume 2
- Velev M Exploiting Signal Unobservability for Efficient Translation to CNF in Formal Verification of Microprocessors Proceedings of the conference on Design, automation and test in Europe - Volume 1
- Chen M, Accardi A, Kiciman E, Lloyd J, Patterson D, Fox A and Brewer E Path-based failure and evolution management Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation - Volume 1, (23-23)
- Deng Y and Maly W 2.5D system integration Proceedings of the 2004 Asia and South Pacific Design Automation Conference, (450-455)
- Velev M Using positive equality to prove liveness for pipelined microprocessors Proceedings of the 2004 Asia and South Pacific Design Automation Conference, (316-321)
- Velev M Efficient translation of boolean formulas to CNF in formal verification of microprocessors Proceedings of the 2004 Asia and South Pacific Design Automation Conference, (310-315)
- Johnson T, Eigenmann R and Vijaykumar T (2004). Min-cut program decomposition for thread-level speculation, ACM SIGPLAN Notices, 39:6, (59-70), Online publication date: 9-Jun-2004.
- Johnson T, Eigenmann R and Vijaykumar T Min-cut program decomposition for thread-level speculation Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation, (59-70)
- Vuletić M, Pozzi L and Ienne P Virtual memory window for application-specific reconfigurable coprocessors Proceedings of the 41st annual Design Automation Conference, (948-953)
- Netto E, Azevedo R, Centoducatte P and Araujo G Multi-profile based code compression Proceedings of the 41st annual Design Automation Conference, (244-249)
- Velev M Efficient formal verification of pipelined processors with instruction queues Proceedings of the 14th ACM Great Lakes symposium on VLSI, (92-95)
- Lin T, Lin H, Chao C, Liu C and Jen C A compact DSP core with static floating-point unit & its microcode generation Proceedings of the 14th ACM Great Lakes symposium on VLSI, (57-60)
- Metzgen P A high performance 32-bit ALU for programmable logic Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays, (61-70)
- Branovic I, Giorgi R and Martinelli E WebMIPS Proceedings of the 2004 workshop on Computer architecture education: held in conjunction with the 31st International Symposium on Computer Architecture, (19-es)
- Bečvář M Teaching basics of instruction pipelining with HDLDLX Proceedings of the 2004 workshop on Computer architecture education: held in conjunction with the 31st International Symposium on Computer Architecture, (16-es)
- Chihaia I and Gross T An analytical model for software-only main memory compression Proceedings of the 3rd workshop on Memory performance issues: in conjunction with the 31st international symposium on computer architecture, (107-113)
- Smolens J, Gold B, Kim J, Falsafi B, Hoe J and Nowatzyk A (2004). Fingerprinting, ACM SIGOPS Operating Systems Review, 38:5, (224-234), Online publication date: 1-Dec-2004.
- Smolens J, Gold B, Kim J, Falsafi B, Hoe J and Nowatzyk A (2004). Fingerprinting, ACM SIGARCH Computer Architecture News, 32:5, (224-234), Online publication date: 1-Dec-2004.
- Smolens J, Gold B, Kim J, Falsafi B, Hoe J and Nowatzyk A (2004). Fingerprinting, ACM SIGPLAN Notices, 39:11, (224-234), Online publication date: 1-Nov-2004.
- Kumar R, Tullsen D, Ranganathan P, Jouppi N and Farkas K (2004). Single-ISA Heterogeneous Multi-Core Architectures for Multithreaded Workload Performance, ACM SIGARCH Computer Architecture News, 32:2, (64), Online publication date: 2-Mar-2004.
- Smolens J, Gold B, Kim J, Falsafi B, Hoe J and Nowatzyk A Fingerprinting Proceedings of the 11th international conference on Architectural support for programming languages and operating systems, (224-234)
- Berekovic M, Moch S and Pirsch P (2003). A scalable, clustered SMT processor for digital signal processing, ACM SIGARCH Computer Architecture News, 32:3, (62-69), Online publication date: 1-Jun-2004.
- Biswas S, Simpson M and Barua R Memory overflow protection for embedded systems using run-time checks, reuse and compression Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems, (280-291)
- Mathew B, Davis A and Parker M A low power architecture for embedded perception Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems, (46-56)
- Citron D, Haber G and Levin R Reducing program image size by extracting frozen code and data Proceedings of the 4th ACM international conference on Embedded software, (297-305)
- Sica F, Coelho C, Nacif J, Foster H and Fernandes A Exception handling in microprocessors using assertion libraries Proceedings of the 17th symposium on Integrated circuits and system design, (55-59)
- Choi K, Soma R and Pedram M Dynamic voltage and frequency scaling based on workload decomposition Proceedings of the 2004 international symposium on Low power electronics and design, (174-179)
- Zhang C, Vahid F, Yang J and Najjar W A way-halting cache for low-energy high-performance systems Proceedings of the 2004 international symposium on Low power electronics and design, (126-131)
- Oliver J, Akella V and Chong F Efficient orchestration of sub-word parallelism in media processors Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures, (225-234)
- Fields B, Bodík R, Hill M and Newburn C (2004). Interaction Cost, IEEE Micro, 24:6, (57-61), Online publication date: 1-Nov-2004.
- Samavi S, Shirani S, Karimi N and Deen M (2004). A Pipeline Architecture for Processing of DNA Microarrays Images, Journal of VLSI Signal Processing Systems, 38:3, (287-297), Online publication date: 1-Nov-2004.
- Yim K, Lee J, Kim J, Kim S and Koh K A space-efficient on-chip compressed cache organization for high performance computing Proceedings of the Second international conference on Parallel and Distributed Processing and Applications, (952-964)
- Aggarwal A Single FU bypass networks for high clock rate superscalar processors Proceedings of the 11th international conference on High Performance Computing, (319-332)
- Liu G, Xia F, Yang X, Zhou H, Zhao H and Deng Y The design and performance analysis of embedded parallel multiprocessing system Proceedings of the First international conference on Embedded Software and Systems, (210-215)
- Tachikake K, Togawa N, Miyaoka Y, Choi J, Yanagisawa M and Ohtsuki T A hardware/software partitioning algorithm for SIMD processor cores Proceedings of the 2003 Asia and South Pacific Design Automation Conference, (135-140)
- Egan C, Steven G, Quick P, Anguera R, Steven F and Vintan L (2003). Two-level branch prediction using neural networks, Journal of Systems Architecture: the EUROMICRO Journal, 49:12-15, (557-570), Online publication date: 1-Dec-2003.
- Fields B, Bodík R, Hill M and Newburn C Using Interaction Costs for Microarchitectural Bottleneck Analysis Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
- Denning P Virtual memory Encyclopedia of Computer Science, (1832-1835)
- Udayakumaran S and Barua R Compiler-decided dynamic memory allocation for scratch-pad based embedded systems Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems, (276-286)
- Goodwin D and Petkov D Automatic generation of application specific processors Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems, (137-147)
- Dufour B, Driesen K, Hendren L and Verbrugge C (2003). Dynamic metrics for java, ACM SIGPLAN Notices, 38:11, (149-168), Online publication date: 26-Nov-2003.
- Dufour B, Driesen K, Hendren L and Verbrugge C Dynamic metrics for java Proceedings of the 18th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, (149-168)
- Chen M and Olukotun K (2003). The Jrpm system for dynamically parallelizing Java programs, ACM SIGARCH Computer Architecture News, 31:2, (434-446), Online publication date: 1-May-2003.
- Kozyrakis C and Patterson D (2003). Overcoming the limitations of conventional vector processors, ACM SIGARCH Computer Architecture News, 31:2, (399-409), Online publication date: 1-May-2003.
- Ernst D, Hamel A and Austin T (2003). Cyclone, ACM SIGARCH Computer Architecture News, 31:2, (253-263), Online publication date: 1-May-2003.
- Chen M and Olukotun K The Jrpm system for dynamically parallelizing Java programs Proceedings of the 30th annual international symposium on Computer architecture, (434-446)
- Kozyrakis C and Patterson D Overcoming the limitations of conventional vector processors Proceedings of the 30th annual international symposium on Computer architecture, (399-409)
- Ernst D, Hamel A and Austin T Cyclone Proceedings of the 30th annual international symposium on Computer architecture, (253-263)
- Aziz A, Prakash A and Ramachandran V A near optimal scheduler for switch-memory-switch routers Proceedings of the fifteenth annual ACM symposium on Parallel algorithms and architectures, (343-352)
- Becvar M, Pluhacek A and Danecek J DOP Proceedings of the 2003 workshop on Computer architecture education: Held in conjunction with the 30th International Symposium on Computer Architecture, (4-es)
- Cornea M, Harrison J and Tang P Intel® Itanium® floating-point architecture Proceedings of the 2003 workshop on Computer architecture education: Held in conjunction with the 30th International Symposium on Computer Architecture, (3-es)
- Berekovic M, Moch S and Pirsch P A scalable, clustered SMT processor for digital signal processing Proceedings of the 2003 workshop on MEmory performance: DEaling with Applications, systems and architecture, (62-69)
- Velev M and Bryant R (2003). Effective use of boolean satisfiability procedures in the formal verification of superscalar and VLIW microprocessors, Journal of Symbolic Computation, 35:2, (73-106), Online publication date: 1-Feb-2003.
- Venkateswaran N and Chandramouli C General purpose processor architecture for modeling stochastic biological neuronal assemblies Proceedings of the 5th international conference on Evolvable systems: from biology to hardware, (387-397)
- Song D, Heywood M and Zincir-Heywood A A linear genetic programming approach to intrusion detection Proceedings of the 2003 international conference on Genetic and evolutionary computation: Part II, (2325-2336)
- Pisharath J and Choudhary A An integrated approach to reducing power dissipation in memory hierarchies Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems, (88-97)
- Brorsson M MipsIt Proceedings of the 2002 workshop on Computer architecture education: Held in conjunction with the 29th International Symposium on Computer Architecture, (12-es)
- Herath J, Ramnath S, Herath A and Herath S An active learning environment for intermediate computer architecture courses Proceedings of the 2002 workshop on Computer architecture education: Held in conjunction with the 29th International Symposium on Computer Architecture, (8-es)
- Miyaoka Y, Kataoka Y, Togawa N, Yanagisawa M and Ohtsuki T Area/delay estimation for digital signal processor cores Proceedings of the 2001 Asia and South Pacific Design Automation Conference, (156-161)
- Rhea S, Wells C, Eaton P, Geels D, Zhao B, Weatherspoon H and Kubiatowicz J (2001). Maintenance-Free Global Data Storage, IEEE Internet Computing, 5:5, (40-49), Online publication date: 1-Sep-2001.
- Vajracharya S and Grunwald D Loop re-ordering and pre-fetching at run-time Proceedings of the 1997 ACM/IEEE conference on Supercomputing, (1-13)
- Szymanski T (1997). Design Principles for Practical Self-Routing Nonblocking Switching Networks with O(N · log N) Bit-Complexity, IEEE Transactions on Computers, 46:10, (1057-1069), Online publication date: 1-Oct-1997.
- Gupta R Analysis of operation delay and execution rate constraints for embedded systems Proceedings of the 33rd annual Design Automation Conference, (601-604)
- Wulf W and McKee S (1995). Hitting the memory wall, ACM SIGARCH Computer Architecture News, 23:1, (20-24), Online publication date: 1-Mar-1995.
- Kuga M, Murakami K and Tomita S (1991). DSNS (dynamically-hazard-resolved statically-code-scheduled, nonuniform superscalar), ACM SIGARCH Computer Architecture News, 19:4, (14-29), Online publication date: 1-Jul-1991.