Bulldog: a compiler for VLIW architectures
Author: John R. Ellis
Publisher: MIT Press, 55 Hayward St., Cambridge, MA, United States
ISBN: 978-0-262-05034-0
Published: 01 May 1986
Pages: 320
Abstract

No abstract available.

Cited By

  1. Shah N, Meert W and Verhelst M DPU-v2: Energy-Efficient Execution of Irregular Directed Acyclic Graphs Proceedings of the 55th Annual IEEE/ACM International Symposium on Microarchitecture, (1288-1307)
  2. Cao H, Guo S, Hao J, Xia Y and Xu J (2022). Superblock-based performance optimization for Sunway Math Library on SW26010 many-core processor, The Journal of Supercomputing, 78:4, (4827-4849), Online publication date: 1-Mar-2022.
  3. Simar R and Tatge R (2021). How VLIWs Were Adopted as Digital Signal Processors, IEEE Micro, 41:6, (121-128), Online publication date: 1-Nov-2021.
  4. Fujiki D, Mahlke S and Das R Duality cache for data parallel acceleration Proceedings of the 46th International Symposium on Computer Architecture, (397-410)
  5. Lu C, Shih W, Wu C and Lee J (2014). Achieving spilling-friendly register file assignment for highly distributed register files, The Journal of Supercomputing, 69:3, (1342-1362), Online publication date: 1-Sep-2014.
  6. Kim N and Krall A Integrated modulo scheduling and cluster assignment for TI TMS320C64x+ architecture Proceedings of the 11th Workshop on Optimizations for DSP and Embedded Systems, (25-32)
  7. Goel N, Kumar A and Panda P (2014). Shared-port register file architecture for low-energy VLIW processors, ACM Transactions on Architecture and Code Optimization, 11:1, (1-32), Online publication date: 1-Feb-2014.
  8. Porpodas V and Cintra M CAeSaR Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems, (1-10)
  9. Beg M and Beek P (2013). A constraint programming approach for integrated spatial and temporal scheduling for clustered architectures, ACM Transactions on Embedded Computing Systems (TECS), 13:1, (1-23), Online publication date: 1-Aug-2013.
  10. Huang Y, Zhao M and Xue C WCET-aware re-scheduling register allocation for real-time embedded systems with clustered VLIW architecture Proceedings of the 13th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory for Embedded Systems, (31-40)
  11. Huang Y, Zhao M and Xue C (2012). WCET-aware re-scheduling register allocation for real-time embedded systems with clustered VLIW architecture, ACM SIGPLAN Notices, 47:5, (31-40), Online publication date: 18-May-2012.
  12. Ghandour W, Akkary H and Masri W (2012). Leveraging Strength-Based Dynamic Information Flow Analysis to Enhance Data Value Prediction, ACM Transactions on Architecture and Code Optimization, 9:1, (1-33), Online publication date: 1-Mar-2012.
  13. Zhang X, Wu H and Xue J An efficient heuristic for instruction scheduling on clustered vliw processors Proceedings of the 14th international conference on Compilers, architectures and synthesis for embedded systems, (35-44)
  14. Gupta S, Feng S, Ansari A and Mahlke S Erasing Core Boundaries for Robust and Configurable Performance Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, (325-336)
  15. Ghandour W, Akkary H and Masri W The potential of using dynamic information flow analysis in data value prediction Proceedings of the 19th international conference on Parallel architectures and compilation techniques, (431-442)
  16. Purnaprajna M, Porrmann M, Rueckert U, Hussmann M, Thies M and Kastens U (2010). Runtime Reconfiguration of Multiprocessors Based on Compile-Time Analysis, ACM Transactions on Reconfigurable Technology and Systems (TRETS), 3:3, (1-25), Online publication date: 1-Sep-2010.
  17. Brogioli M and Cavallaro J Compiler driven architecture design space exploration for DSP workloads Proceedings of the 43rd Asilomar conference on Signals, systems and computers, (221-225)
  18. Drozdov A and Novikov S (2009). A program auto-parallelizer based on the component technology of optimizing compiler construction, Programming and Computing Software, 35:6, (321-339), Online publication date: 1-Nov-2009.
  19. Park H, Fan K, Mahlke S, Oh T, Kim H and Kim H Edge-centric modulo scheduling for coarse-grained reconfigurable architectures Proceedings of the 17th international conference on Parallel architectures and compilation techniques, (166-176)
  20. Fan K, Park H, Kudlur M and Mahlke S Modulo scheduling for highly customized datapaths to increase hardware reusability Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization, (124-133)
  21. Santana O, Ramirez A and Valero M (2007). Enlarging Instruction Streams, IEEE Transactions on Computers, 56:10, (1342-1357), Online publication date: 1-Oct-2007.
  22. Hsu J, Wu Y, Lin X and Chung Y SCRF Proceedings of the 9th international conference on Parallel Computing Technologies, (525-536)
  23. Chu M and Mahlke S (2007). Code and data partitioning for fine-grain parallelism, ACM SIGPLAN Notices, 42:7, (161-164), Online publication date: 13-Jul-2007.
  24. Chu M and Mahlke S Code and data partitioning for fine-grain parallelism Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems, (161-164)
  25. Terechko A and Corporaal H (2007). Inter-cluster communication in VLIW architectures, ACM Transactions on Architecture and Code Optimization (TACO), 4:2, (11-es), Online publication date: 1-Jun-2007.
  26. Codina J, Sanchez J and Gonzalez A Virtual Cluster Scheduling Through the Scheduling Graph Proceedings of the International Symposium on Code Generation and Optimization, (89-101)
  27. Aleta A, Codina J, Gonzalez A and Kaeli D Heterogeneous Clustered VLIW Microarchitectures Proceedings of the International Symposium on Code Generation and Optimization, (354-366)
  28. Coons K, Chen X, Burger D, McKinley K and Kushwaha S (2006). A spatial path scheduling algorithm for EDGE architectures, ACM SIGPLAN Notices, 41:11, (129-140), Online publication date: 1-Nov-2006.
  29. Coons K, Chen X, Burger D, McKinley K and Kushwaha S A spatial path scheduling algorithm for EDGE architectures Proceedings of the 12th international conference on Architectural support for programming languages and operating systems, (129-140)
  30. Nagpal R and Srikant Y Compiler-assisted leakage energy optimization for clustered VLIW architectures Proceedings of the 6th ACM & IEEE International conference on Embedded software, (233-241)
  31. Park H, Fan K, Kudlur M and Mahlke S Modulo graph embedding Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems, (136-146)
  32. Coons K, Chen X, Burger D, McKinley K and Kushwaha S (2006). A spatial path scheduling algorithm for EDGE architectures, ACM SIGARCH Computer Architecture News, 34:5, (129-140), Online publication date: 20-Oct-2006.
  33. Coons K, Chen X, Burger D, McKinley K and Kushwaha S (2006). A spatial path scheduling algorithm for EDGE architectures, ACM SIGOPS Operating Systems Review, 40:5, (129-140), Online publication date: 20-Oct-2006.
  34. Hammond S and Lacey D Loop transformations in the ahead-of-time optimization of java bytecode Proceedings of the 15th international conference on Compiler Construction, (109-123)
  35. Chu M and Mahlke S Compiler-directed Data Partitioning for Multicluster Processors Proceedings of the International Symposium on Code Generation and Optimization, (208-220)
  36. Lakshmi K, Sreedhar D, Raman E and Shankar P Integrating a new cluster assignment and scheduling algorithm into an experimental retargetable code generation framework Proceedings of the 12th international conference on High Performance Computing, (518-527)
  37. Tang Y, Deng K, Cao H and Zhou X Trace-Based runtime instruction rescheduling for architecture extension Proceedings of the Second international conference on Embedded Software and Systems, (4-15)
  38. Reshadi M, Gorjiara B and Gajski D Utilizing Horizontal and Vertical Parallelism with a No-Instruction-Set Compiler for Custom Datapaths Proceedings of the 2005 International Conference on Computer Design, (69-76)
  39. Reshadi M and Gajski D A cycle-accurate compilation algorithm for custom pipelined datapaths Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis, (21-26)
  40. Ishii N, Ogi H, Mochizuki T and Iwata K Parallelism improvements of software pipelining by combining spilling with rematerialization Proceedings of the 9th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part I, (820-826)
  41. Stephenson M and Amarasinghe S Predicting Unroll Factors Using Supervised Classification Proceedings of the international symposium on Code generation and optimization, (123-134)
  42. Nagarajan R, Kushwaha S, Burger D, McKinley K, Lin C and Keckler S Static Placement, Dynamic Issue (SPDI) Scheduling for EDGE Architectures Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques, (74-84)
  43. Chu M, Fan K, Ravindran R and Mahlke S (2004). Cost-Sensitive Partitioning in an Architecture Synthesis System for Multicluster Processors, IEEE Micro, 24:3, (10-20), Online publication date: 1-May-2004.
  44. Kudlur M, Fan K, Chu M, Ravindran R, Clark N and Mahlke S FLASH Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
  45. Song L and Kavi K (2004). What can we gain by unfolding loops?, ACM SIGPLAN Notices, 39:2, (26-33), Online publication date: 1-Feb-2004.
  46. Chu M, Fan K and Mahlke S Region-based hierarchical operation partitioning for multicluster processors Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation, (300-311)
  47. George L and Blume M Taming the IXP network processor Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation, (26-37)
  48. Chu M, Fan K and Mahlke S (2003). Region-based hierarchical operation partitioning for multicluster processors, ACM SIGPLAN Notices, 38:5, (300-311), Online publication date: 9-May-2003.
  49. George L and Blume M (2003). Taming the IXP network processor, ACM SIGPLAN Notices, 38:5, (26-37), Online publication date: 9-May-2003.
  50. Rau B and Fisher J Instruction-level parallelism Encyclopedia of Computer Science, (883-887)
  51. Lee W, Puppin D, Swenson S and Amarasinghe S Convergent scheduling Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture, (111-122)
  52. Larsen S, Witchel E and Amarasinghe S Increasing and Detecting Memory Address Congruence Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques, (18-29)
  53. Aletà A, Codina J, Sánchez F, González A and Kaeli D Exploiting Pseudo-Schedules to Guide Data Dependence Graph Partitioning Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques, (281-290)
  54. Gibert E, Sánchez J and González A An interleaved cache clustered VLIW processor Proceedings of the 16th international conference on Supercomputing, (210-219)
  55. Krishnamurthy G, Granston E and Stotzer E Affinity-based cluster assignment for unrolled loops Proceedings of the 16th international conference on Supercomputing, (107-116)
  56. Zalamea J, Llosa J, Ayguadé E and Valero M Modulo scheduling with integrated register spilling for clustered VLIW architectures Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture, (160-169)
  57. Aletà A, Codina J, Sánchez J and González A Graph-partitioning based instruction scheduling for clustered processors Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture, (150-159)
  58. Chen S and Fuchs W (2001). Compiler-Assisted Multiple Instruction Word Retry for VLIW Architectures, IEEE Transactions on Parallel and Distributed Systems, 12:12, (1293-1304), Online publication date: 1-Dec-2001.
  59. Buss M, Azevedo R, Centoducatte P and Araujo G Tailoring pipeline bypassing and functional unit mapping to application in clustered VLIW architectures Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems, (141-148)
  60. Mark W and Proudfoot K Compiling to a VLIW fragment pipeline Proceedings of the ACM SIGGRAPH/EUROGRAPHICS workshop on Graphics hardware, (47-56)
  61. Goossens G, Van Praet J, Lanneer D, Geurts W, Kifli A, Liem C and Paulin P Embedded software in real-time signal processing systems Readings in hardware/software co-design, (433-451)
  62. Shiue W Retargetable compilation for low power Proceedings of the ninth international symposium on Hardware/software codesign, (254-259)
  63. Mattson P, Dally W, Rixner S, Kapasi U and Owens J (2000). Communication scheduling, ACM SIGOPS Operating Systems Review, 34:5, (82-92), Online publication date: 1-Dec-2000.
  64. Mattson P, Dally W, Rixner S, Kapasi U and Owens J (2000). Communication scheduling, ACM SIGARCH Computer Architecture News, 28:5, (82-92), Online publication date: 1-Dec-2000.
  65. Sánchez J and González A Modulo scheduling for a fully-distributed clustered VLIW architecture Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture, (124-133)
  66. Mattson P, Dally W, Rixner S, Kapasi U and Owens J Communication scheduling Proceedings of the ninth international conference on Architectural support for programming languages and operating systems, (82-92)
  67. Mattson P, Dally W, Rixner S, Kapasi U and Owens J (2000). Communication scheduling, ACM SIGPLAN Notices, 35:11, (82-92), Online publication date: 1-Nov-2000.
  68. Sánchez J and González A Instruction scheduling for clustered VLIW architectures Proceedings of the 13th international symposium on System synthesis, (41-46)
  69. Sánchez J and González A The Effectiveness of Loop Unrolling for Modulo Scheduling in Clustered VLIW Architectures Proceedings of the 2000 International Conference on Parallel Processing
  70. Suga A and Matsunami K (2000). Introducing the FR500 Embedded Microprocessor, IEEE Micro, 20:4, (21-27), Online publication date: 1-Jul-2000.
  71. Chen G and Smith M Reorganizing global schedules for register allocation Proceedings of the 13th international conference on Supercomputing, (408-416)
  72. Rau B, Kathail V and Aditya S (1999). Machine-Description Driven Compilers for EPIC and VLIW Processors, Design Automation for Embedded Systems, 4:2-3, (71-118), Online publication date: 1-Mar-1999.
  73. Banerjia S, Sathaye S, Menezes K and Conte T (1998). MPS, IEEE Transactions on Computers, 47:12, (1382-1397), Online publication date: 1-Dec-1998.
  74. Özer E, Banerjia S and Conte T Unified assign and schedule Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture, (308-315)
  75. Nystrom E and Eichenberger A Effective cluster assignment for modulo scheduling Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture, (103-114)
  76. Chang P, Mahlke S, Chen W, Warter N and Hwu W IMPACT 25 years of the international symposia on Computer architecture (selected papers), (408-417)
  77. Gabbay F and Mendelson A (1998). The effect of instruction fetch bandwidth on value prediction, ACM SIGARCH Computer Architecture News, 26:3, (272-281), Online publication date: 1-Jun-1998.
  78. Gabbay F and Mendelson A The effect of instruction fetch bandwidth on value prediction Proceedings of the 25th annual international symposium on Computer architecture, (272-281)
  79. Liao S, Devadas S, Keutzer K, Tjiang S and Wang A (1998). Code Optimization Techniques in Embedded DSP Microprocessors, Design Automation for Embedded Systems, 3:1, (59-73), Online publication date: 1-Jan-1998.
  80. Gabbay F and Mendelson A Can program profiling support value prediction? Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, (270-280)
  81. Koseki A, Fukazawa Y and Komatsu H A Register Allocation Technique Using Register Existence Graph Proceedings of the international Conference on Parallel Processing, (404-411)
  82. Moreno J, Moudgill M, Ebcioğlu K, Altman E, Hall C, Miranda R, Chen S and Polyak A (1997). Simulation/evaluation environment for a VLIW processor architecture, IBM Journal of Research and Development, 41:3, (287-302), Online publication date: 1-May-1997.
  83. Abraham S, Kathail V and Deitrich B Meld scheduling Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture, (308-321)
  84. Mudge T (1996). Strategic directions in computer architecture, ACM Computing Surveys (CSUR), 28:4, (671-678), Online publication date: 1-Dec-1996.
  85. Franklin M and Sohi G (1996). ARB, IEEE Transactions on Computers, 45:5, (552-571), Online publication date: 1-May-1996.
  86. Chang M and Lai F (1996). Efficient Exploitation of Instruction-Level Parallelism for Superscalar Processors by the Conjugate Register File Scheme, IEEE Transactions on Computers, 45:3, (278-293), Online publication date: 1-Mar-1996.
  87. Koseki A, Komatsu H and Fukazawa Y A register allocation technique using guarded PDG Proceedings of the 10th international conference on Supercomputing, (270-277)
  88. Natarajan B and Schlansker M Spill-free parallel scheduling of basic blocks Proceedings of the 28th annual international symposium on Microarchitecture, (119-124)
  89. Schlansker M and Kathail V Critical path reduction for scalar programs Proceedings of the 28th annual international symposium on Microarchitecture, (57-69)
  90. Luk C Memory disambiguation for general-purpose applications Proceedings of the 1995 conference of the Centre for Advanced Studies on Collaborative research
  91. Moon S and Carson S (1995). Generalized Multiway Branch Unit for VLIW Microprocessors, IEEE Transactions on Parallel and Distributed Systems, 6:8, (850-862), Online publication date: 1-Aug-1995.
  92. Inoue K, Shintani Y, Kamada E and Shonai T (1995). A Performance and Cost Analysis of Applying Superscalar Method to Mainframe Computers, IEEE Transactions on Computers, 44:7, (891-902), Online publication date: 1-Jul-1995.
  93. Liao S, Devadas S, Keutzer K, Tjiang S and Wang A Storage assignment to decrease code size Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation, (186-195)
  94. Liao S, Devadas S, Keutzer K, Tjiang S and Wang A (1995). Storage assignment to decrease code size, ACM SIGPLAN Notices, 30:6, (186-195), Online publication date: 1-Jun-1995.
  95. Lavery D, Chang P, Mahlke S, Chen W and Hwu W (1995). The Importance of Prepass Code Scheduling for Superscalar and Superpipelined Processors, IEEE Transactions on Computers, 44:3, (353-370), Online publication date: 1-Mar-1995.
  96. Liao S, Devadas S, Keutzer K, Tjiang S and Wang A Code optimization techniques for embedded DSP microprocessors Proceedings of the 32nd annual ACM/IEEE Design Automation Conference, (599-604)
  97. Li C, Chen S, Fuchs W and Hwu W (1995). Compiler-Based Multiple Instruction Retry, IEEE Transactions on Computers, 44:1, (35-46), Online publication date: 1-Jan-1995.
  98. Schlansker M, Kathail V and Anik S Height reduction of control recurrences for ILP processors Proceedings of the 27th annual international symposium on Microarchitecture, (40-51)
  99. Chiueh T Sunder Proceedings of the 1994 ACM/IEEE conference on Supercomputing, (488-496)
  100. Goossens G, Bolsens I, Lin B and Catthoor F Design of heterogeneous ICs for mobile and personal communication systems Proceedings of the 1994 IEEE/ACM international conference on Computer-aided design, (524-531)
  101. Chen S, Fuchs W and Hwu W An Analytical Approach to Scheduling Code for Superscalar and VLIW Architectures Proceedings of the 1994 International Conference on Parallel Processing - Volume 01, (285-292)
  102. Ebcioglu K, Groves R, Kim K, Silberman G and Ziv I VLIW compilation techniques in a superscalar environment Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation, (36-48)
  103. Ebcioglu K, Groves R, Kim K, Silberman G and Ziv I (1994). VLIW compilation techniques in a superscalar environment, ACM SIGPLAN Notices, 29:6, (36-48), Online publication date: 1-Jun-1994.
  104. Lanneer D, Cornero M, Goossens G and De Man H Data routing Proceedings of the 7th international symposium on High-level synthesis, (17-22)
  105. Malloy B, Lloyd E and Soffa M (1994). Scheduling DAG's for Asynchronous Multiprocessor Execution, IEEE Transactions on Parallel and Distributed Systems, 5:5, (498-508), Online publication date: 1-May-1994.
  106. Nakatani T and Ebcioglu K (1993). Making Compaction-Based Parallelization Affordable, IEEE Transactions on Parallel and Distributed Systems, 4:9, (1014-1029), Online publication date: 1-Sep-1993.
  107. Warter N, Mahlke S, Hwu W and Rau B Reverse If-Conversion Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation, (290-299)
  108. Warter N, Mahlke S, Hwu W and Rau B (1993). Reverse If-Conversion, ACM SIGPLAN Notices, 28:6, (290-299), Online publication date: 1-Jun-1993.
  109. Chatterjee S, Gilbert J, Schreiber R and Teng S Automatic array alignment in data-parallel programs Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, (16-28)
  110. O'Donnell C High Level Compiling for Low Level Machines Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism, (309-320)
  111. Menez G, Auguin M, Boeri F and Carrière C Contribution of Compilation Techniques to the Synthesis of Dedicated VLIW Architectures Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism, (217-228)
  112. de Dinechin B StaCS Proceedings of the 25th annual international symposium on Microarchitecture, (282-291)
  113. Sweany P and Beaty S Dominator-path scheduling Proceedings of the 25th annual international symposium on Microarchitecture, (260-263)
  114. Kiyohara T and Gyllenhaal J Code scheduling for VLIW/superscalar processors with limited register files Proceedings of the 25th annual international symposium on Microarchitecture, (197-201)
  115. de Dinechin B (1992). StaCS, ACM SIGMICRO Newsletter, 23:1-2, (282-291), Online publication date: 10-Dec-1992.
  116. Sweany P and Beaty S (1992). Dominator-path scheduling, ACM SIGMICRO Newsletter, 23:1-2, (260-263), Online publication date: 10-Dec-1992.
  117. Kiyohara T and Gyllenhaal J (1992). Code scheduling for VLIW/superscalar processors with limited register files, ACM SIGMICRO Newsletter, 23:1-2, (197-201), Online publication date: 10-Dec-1992.
  118. Watts T, Soffa M and Gupta R Techniques for integrating parallelizing transformations and compiler-based scheduling methods Proceedings of the 1992 ACM/IEEE conference on Supercomputing, (830-839)
  119. Mahlke S, Chen W, Gyllenhaal J and Hwu W Compiler code transformations for superscalar-based high performance systems Proceedings of the 1992 ACM/IEEE conference on Supercomputing, (808-817)
  120. Baker H (1992). Inlining semantics for subroutines which are recursive, ACM SIGPLAN Notices, 27:12, (39-46), Online publication date: 1-Dec-1992.
  121. Menez G, Auguin M, Boéri F and Carrière C A partitioning algorithm for system-level synthesis Proceedings of the 1992 IEEE/ACM international conference on Computer-aided design, (482-487)
  122. Fisher J and Freudenberger S (1992). Predicting conditional branch directions from previous runs of a program, ACM SIGPLAN Notices, 27:9, (85-95), Online publication date: 1-Sep-1992.
  123. Fisher J and Freudenberger S Predicting conditional branch directions from previous runs of a program Proceedings of the fifth international conference on Architectural support for programming languages and operating systems, (85-95)
  124. Silberman G and Ebcioğlu K An architectural framework for migration from CISC to higher performance platforms Proceedings of the 6th international conference on Supercomputing, (198-215)
  125. Wallace D Low level scheduling using the hierarchical task graph Proceedings of the 6th international conference on Supercomputing, (72-81)
  126. De Gloria A and Faraboschi P Instruction-level parallelism in Prolog Proceedings of the 19th annual international symposium on Computer architecture, (224-233)
  127. Keckler S and Dally W Processor coupling Proceedings of the 19th annual international symposium on Computer architecture, (202-213)
  128. De Gloria A and Faraboschi P (1992). Instruction-level parallelism in Prolog, ACM SIGARCH Computer Architecture News, 20:2, (224-233), Online publication date: 1-May-1992.
  129. Keckler S and Dally W (1992). Processor coupling, ACM SIGARCH Computer Architecture News, 20:2, (202-213), Online publication date: 1-May-1992.
  130. Waldspurger C, Hogg T, Huberman B, Kephart J and Stornetta W (1992). Spawn, IEEE Transactions on Software Engineering, 18:2, (103-117), Online publication date: 1-Feb-1992.
  131. Baker H (1991). Precise instruction scheduling without a precise machine model, ACM SIGARCH Computer Architecture News, 19:6, (4-8), Online publication date: 1-Dec-1991.
  132. Chang P, Chen W, Mahlke S and Hwu W Comparing static and dynamic code scheduling for multiple-instruction-issue processors Proceedings of the 24th annual international symposium on Microarchitecture, (25-33)
  133. Bird P and Pleban U A semantics-directed partitioning of a processor architecture Proceedings of the 1991 ACM/IEEE conference on Supercomputing, (702-709)
  134. Corporaal H and Mulder H MOVE: a framework for high-performance processor design Proceedings of the 1991 ACM/IEEE conference on Supercomputing, (692-701)
  135. Bakewell H, Quammen D and Wang P (1991). Mapping concurrent programs to VLIW processors, ACM SIGPLAN Notices, 26:7, (21-27), Online publication date: 1-Jul-1991.
  136. Chang P, Mahlke S, Chen W, Warter N and Hwu W (1991). IMPACT, ACM SIGARCH Computer Architecture News, 19:3, (266-275), Online publication date: 1-May-1991.
  137. Chang P, Mahlke S, Chen W, Warter N and Hwu W IMPACT Proceedings of the 18th annual international symposium on Computer architecture, (266-275)
  138. Bakewell H, Quammen D and Wang P Mapping concurrent programs to VLIW processors Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming, (21-27)
  139. Gupta R and Soffa M (1991). Compile-Time Techniques for Improving Scalar Access Performance in Parallel Memories, IEEE Transactions on Parallel and Distributed Systems, 2:2, (138-148), Online publication date: 1-Apr-1991.
  140. Keller W (1991). Automated generation of code using backtracking parsers for attribute grammars, ACM SIGPLAN Notices, 26:2, (109-117), Online publication date: 2-Jan-1991.
  141. Berlin A and Weise D (1990). Compiling Scientific Code Using Partial Evaluation, Computer, 23:12, (25-37), Online publication date: 1-Dec-1990.
  142. Shih L Microprogramming heritage of RISC design Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture, (275-280)
  143. Shieh J and Papachristou C An instruction reorderer for pipelined computers Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture, (135-142)
  144. Kenyon P, Agrawal P and Seth S High-level microprogramming Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture, (97-106)
  145. Nicolau A and Potasman R Realistic scheduling Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture, (69-79)
  146. Gupta R A fine-grained MIMD architecture based upon register channels Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture, (28-37)
  147. Gupta R, Epstein M and Whelan M The design of a RISC based multiprocessor chip Proceedings of the 1990 ACM/IEEE conference on Supercomputing, (920-929)
  148. Colwell R, Hall W, Joshi C, Papworth D, Rodman P and Tornes J Architecture and implementation of a VLIW supercomputer Proceedings of the 1990 ACM/IEEE conference on Supercomputing, (910-919)
  149. Heggy B and Soffa M Architectural support for register allocation in the presence of aliasing Proceedings of the 1990 ACM/IEEE conference on Supercomputing, (730-739)
  150. Tirumalai P, Lee M and Schlansker M Parallelization of loops with exits on pipelined architectures Proceedings of the 1990 ACM/IEEE conference on Supercomputing, (200-212)
  151. Berlin A Partial evaluation applied to numerical computation Proceedings of the 1990 ACM conference on LISP and functional programming, (139-150)
  152. Gloria A (1990). VISA: A variable instruction set architecture, ACM SIGARCH Computer Architecture News, 18:2, (76-84), Online publication date: 1-May-1990.
  153. Gupta R and Soffa M (1990). Region Scheduling, IEEE Transactions on Software Engineering, 16:4, (421-431), Online publication date: 1-Apr-1990.
  154. Patel M A design representation for high level synthesis Proceedings of the conference on European design automation, (374-379)
  155. Gupta R (1990). Employing register channels for the exploitation of instruction level parallelism, ACM SIGPLAN Notices, 25:3, (118-127), Online publication date: 1-Mar-1990.
  156. Gupta R Employing register channels for the exploitation of instruction level parallelism Proceedings of the second ACM SIGPLAN symposium on Principles & practice of parallel programming, (118-127)
  157. Ryder B, Landi W and Pande H (1990). Profiling an Incremental Data Flow Analysis Algorithm, IEEE Transactions on Software Engineering, 16:2, (129-140), Online publication date: 1-Feb-1990.
  158. Palem K and Simons B Scheduling time-critical instructions on RISC machines Proceedings of the 17th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, (270-280)
  159. Pollock L and Soffa M (1989). An Incremental Version of Iterative Data Flow Analysis, IEEE Transactions on Software Engineering, 15:12, (1537-1549), Online publication date: 1-Dec-1989.
  160. Dietz H, Schwederski T, O'Keefe M and Zaafrani A Static synchronization beyond VLIW Proceedings of the 1989 ACM/IEEE conference on Supercomputing, (416-425)
  161. Shieh J and Papachristou C (1989). On reordering instruction streams for pipelined computers, ACM SIGMICRO Newsletter, 20:3, (199-206), Online publication date: 1-Aug-1989.
  162. Mulder H and Portier R (1989). Cost-effective design of application specific VLIW processors using the SCARCE framework, ACM SIGMICRO Newsletter, 20:3, (35-42), Online publication date: 1-Aug-1989.
  163. Shieh J and Papachristou C On reordering instruction streams for pipelined computers Proceedings of the 22nd annual workshop on Microprogramming and microarchitecture, (199-206)
  164. Mulder H and Portier R Cost-effective design of application specific VLIW processors using the SCARCE framework Proceedings of the 22nd annual workshop on Microprogramming and microarchitecture, (35-42)
  165. Hwu W and Chang P (1989). Achieving high instruction cache performance with an optimizing compiler, ACM SIGARCH Computer Architecture News, 17:3, (242-251), Online publication date: 1-Jun-1989.
  166. Jouvelot P and Dehbonei B A unified semantic approach for the vectorization and parallelization of generalized reductions Proceedings of the 3rd international conference on Supercomputing, (186-194)
  167. Chang P and Hwu W Control flow optimization for supercomputer scalar processing Proceedings of the 3rd international conference on Supercomputing, (145-153)
  168. Hwu W and Chang P Achieving high instruction cache performance with an optimizing compiler Proceedings of the 16th annual international symposium on Computer architecture, (242-251)
  169. Gupta R The fuzzy barrier: a mechanism for high speed synchronization of processors Proceedings of the third international conference on Architectural support for programming languages and operating systems, (54-63)
  170. Dehnert J, Hsu P and Bratt J Overlapped loop support in the Cydra 5 Proceedings of the third international conference on Architectural support for programming languages and operating systems, (26-38)
  171. Gupta R (1989). The fuzzy barrier: a mechanism for high speed synchronization of processors, ACM SIGARCH Computer Architecture News, 17:2, (54-63), Online publication date: 1-Apr-1989.
  172. Dehnert J, Hsu P and Bratt J (1989). Overlapped loop support in the Cydra 5, ACM SIGARCH Computer Architecture News, 17:2, (26-38), Online publication date: 1-Apr-1989.
  173. Rau B, Yen D, Yen W and Towie R (1989). The Cydra 5 Departmental Supercomputer, Computer, 22:1, (12-26, 28-30, 32-35), Online publication date: 1-Jan-1989.
  174. Daper J Compiling on horizon Proceedings of the 1988 ACM/IEEE conference on Supercomputing, (51-52)
  175. ACM
    Ebcioğlu K (1988). A compilation technique for software pipelining of loops with conditional jumps, ACM SIGMICRO Newsletter, 19:3, (36-41), Online publication date: 1-Sep-1988.
  176. Colwell R, Nix R, O'Donnell J, Papworth D and Rodman P (1988). A VLIW Architecture for a Trace Scheduling Compiler, IEEE Transactions on Computers, 37:8, (967-979), Online publication date: 1-Aug-1988.
  177. Hwu W and Chang P Exploiting parallel microprocessor microarchitectures with a compiler code generator Proceedings of the 15th Annual International Symposium on Computer architecture, (45-53)
  178. Lewis D A programmable hardware accelerator for compiled electrical simulation Proceedings of the 25th ACM/IEEE Design Automation Conference, (172-177)
  179. ACM
    Hanen C Optimizing horizontal microprograms for vectorial loops with timed petri nets Proceedings of the 2nd international conference on Supercomputing, (466-477)
  180. ACM
    Hwu W and Chang P (1988). Exploiting parallel microprocessor microarchitectures with a compiler code generator, ACM SIGARCH Computer Architecture News, 16:2, (45-53), Online publication date: 17-May-1988.
  181. ACM
    Natour I On the control dependence in the program dependence graph Proceedings of the 1988 ACM sixteenth annual conference on Computer science, (510-519)
  182. Chandross J, Jagadish H and Asthana A The trap as a control flow mechanism Proceedings of the 21st annual workshop on Microprogramming and microarchitecture, (50-52)
  183. ACM
    Ebcioğlu K A compilation technique for software pipelining of loops with conditional jumps Proceedings of the 20th annual workshop on Microprogramming, (69-79)
  184. ACM
    Colwell R, Nix R, O'Donnell J, Papworth D and Rodman P (1987). A VLIW architecture for a trace scheduling compiler, ACM SIGARCH Computer Architecture News, 15:5, (180-192), Online publication date: 1-Nov-1987.
  185. ACM
    Colwell R, Nix R, O'Donnell J, Papworth D and Rodman P A VLIW architecture for a trace scheduling compiler Proceedings of the second international conference on Architectural support for programming languages and operating systems, (180-192)
  186. ACM
    Colwell R, Nix R, O'Donnell J, Papworth D and Rodman P (1987). A VLIW architecture for a trace scheduling compiler, ACM SIGPLAN Notices, 22:10, (180-192), Online publication date: 1-Oct-1987.
  187. ACM
    Colwell R, Nix R, O'Donnell J, Papworth D and Rodman P (1987). A VLIW architecture for a trace scheduling compiler, ACM SIGOPS Operating Systems Review, 21:4, (180-192), Online publication date: 1-Oct-1987.
Contributors
  • Yale University

Reviews

Frank Lawrence Friedman

This book represents the publication of Ellis's PhD dissertation, which received the 1985 ACM Doctoral Dissertation Award (the fourth such award in the series initiated by the ACM in 1982). The importance of the dissertation is perhaps best summarized by John White, the Chairman of the 1985 ACM Doctoral Dissertation Award Subcommittee, in the Series Foreword presented at the beginning of the book: John Ellis was judged to have made an outstanding contribution to the problem of exploiting the parallelism available in emerging multiprocessor architectures. Ellis's thesis is that high-quality compilers can be written for VLIW (Very Long Instruction Word) computers. These machines drive multiple parallel RISCs (Reduced Instruction Set Computers) with a single instruction stream. Each instruction, however, is long enough to command all of the RISCs at once. Exploiting the parallelism available in VLIW machines can be very difficult if done by hand. Thus, Ellis developed a compiler that incorporates a number of optimizations based on trace scheduling and memory reference and memory bank disambiguation. Trace scheduling increases parallelism by scheduling several basic blocks at once. The compiler increases parallelism further by unrolling loops and by using disambiguation algorithms to tell at least some of the time when vector and bank references cannot collide. Very Long Instruction Word (VLIW) computers are reduced-instruction-set machines with a large number of parallel, pipelined functional units, but a single thread of control. They offer much promise for order-of-magnitude increases in processing speed, but are impossible to program without a high-level language compiler. By building a working compiler for such languages for use in compiling ordinary scientific programs, Ellis has demonstrated the practicality of combining old and new compiling technology to successfully address the problem of efficient use of parallel architectures.
Ellis uses several new compilation techniques, such as trace scheduling (to identify more parallelism), memory-reference and memory-bank disambiguation (to increase memory bandwidth), and new code-generation algorithms, to build his compiler. The dissertation includes an in-depth discussion of these techniques and algorithms and a discussion of the results of preliminary experiments testing the compiler and various aspects of VLIW architectures. The results show that for many scientific applications, VLIW architectures may be built which can achieve an order-of-magnitude improvement in efficiency over current machines. The ability to build compiler systems that produce code for the efficient use of parallel architectures is a critical and largely missing link in the quest to successfully use these architectures in the variety of application domains for which they are intended. The book does indeed make an outstanding contribution, and is must reading for computer researchers and practitioners involved in the compiling and parallel architecture areas. In fact, owing largely to the beauty of the prose and the painstaking attention to clear, concise statements of motivation, goals, methods, and results, this thesis belongs in every computer scientist's library.
