Index Terms
- How to solve the current memory access and data transfer bottlenecks: at the processor architecture or at the compiler level