ABSTRACT
We describe the design issues in an implementation of the ESA/390 architecture based on binary translation to a very long instruction word (VLIW) processor. During binary translation, complex ESA/390 instructions are decomposed into instruction “primitives”, which are then scheduled onto a wide-issue machine. The aim is to achieve high instruction-level parallelism by combining the increased scheduling and optimization opportunities that binary translation software can exploit with the efficiency of long instruction word architectures. A further aim is to study the feasibility of a common execution platform for different instruction set architectures, such as ESA/390, RS/6000, AS/400 and the Java Virtual Machine, so that multiple systems can be built around it.
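The decompose-then-schedule idea can be illustrated with a small sketch. It is not the translator described in the paper: the primitive names (`addr`, `load`, `add`, `setcc`), the temporary-register naming, and the greedy in-order scheduler are all assumptions made for illustration. The sketch splits a toy version of the ESA/390 `A R1,D2(X2,B2)` (Add) instruction into four primitives and packs primitives into fixed-width bundles while respecting register dependences. It ignores memory ordering, exceptions, and register renaming.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primitive:
    """A hypothetical RISC-like primitive: dest = op(srcs, imm)."""
    op: str
    dest: str
    srcs: tuple
    imm: int = 0


def decompose_add(r1, x2, b2, d2, tag):
    """Split a toy 'A R1,D2(X2,B2)' into primitives (illustrative only)."""
    addr = f"t{tag}_addr"   # temporary holding the effective address
    val = f"t{tag}_val"     # temporary holding the loaded operand
    return [
        Primitive("addr", addr, (x2, b2), d2),   # addr = x2 + b2 + d2
        Primitive("load", val, (addr,)),         # val = mem[addr]
        Primitive("add", r1, (r1, val)),         # r1 = r1 + val
        Primitive("setcc", "cc", (r1,)),         # condition code from result
    ]


def schedule(prims, width):
    """Greedy in-order packing into bundles of at most `width` primitives.

    Assumes registers are read before they are written within a bundle, so a
    write may share a bundle with an earlier read of the same register (WAR),
    but must come strictly after an earlier write (RAW and WAW).
    """
    bundles = []
    last_write = {}   # register -> bundle index of its latest write
    last_read = {}    # register -> bundle index of its latest read
    for p in prims:
        earliest = 0
        for s in p.srcs:
            if s in last_write:
                earliest = max(earliest, last_write[s] + 1)    # RAW
        if p.dest in last_write:
            earliest = max(earliest, last_write[p.dest] + 1)   # WAW
        if p.dest in last_read:
            earliest = max(earliest, last_read[p.dest])        # WAR
        cycle = earliest
        while cycle < len(bundles) and len(bundles[cycle]) >= width:
            cycle += 1                                         # bundle is full
        while len(bundles) <= cycle:
            bundles.append([])
        bundles[cycle].append(p)
        for s in p.srcs:
            last_read[s] = max(last_read.get(s, 0), cycle)
        last_write[p.dest] = cycle
    return bundles


# Two independent Add instructions: A R1,0(R2,R3) and A R4,8(R5,R6).
prims = decompose_add("R1", "R2", "R3", 0, 0) + decompose_add("R4", "R5", "R6", 8, 1)
bundles = schedule(prims, width=4)
for i, bundle in enumerate(bundles):
    print(i, [f"{p.op}->{p.dest}" for p in bundle])
```

Under these assumptions the eight primitives fit in five bundles at width 4, against eight bundles at width 1. The two `setcc` primitives cannot share a bundle because both write the single condition code, which shows how a serial architectural resource can limit the parallelism a scheduler can extract.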
Index Terms
- Binary translation and architecture convergence issues for IBM System/390