
The Vector-Thread Architecture

Published: 02 March 2004

ABSTRACT

The vector-thread (VT) architectural paradigm unifies the vector and multithreaded compute models. The VT abstraction provides the programmer with a control processor and a vector of virtual processors (VPs). The control processor can use vector-fetch commands to broadcast instructions to all the VPs or each VP can use thread-fetches to direct its own control flow. A seamless intermixing of the vector and threaded control mechanisms allows a VT architecture to flexibly and compactly encode application parallelism and locality, and a VT machine exploits these to improve performance and efficiency. We present SCALE, an instantiation of the VT architecture designed for low-power and high-performance embedded systems. We evaluate the SCALE prototype design using detailed simulation of a broad range of embedded applications and show that its performance is competitive with larger and more complex processors.
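The two control mechanisms described in the abstract can be illustrated with a toy software model. This is only a minimal sketch under stated assumptions, not the SCALE design or its instruction set: the names `ControlProcessor`, `VirtualProcessor`, `vector_fetch`, and the convention that a block returns the next block to model a thread-fetch are all hypothetical, and the VPs are run one after another rather than in parallel.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for a fetched instruction block: it runs on one VP and
# may return another block, which models that VP issuing a thread-fetch.
Block = Callable[["VirtualProcessor"], Optional[Callable]]


@dataclass
class VirtualProcessor:
    index: int
    regs: Dict[str, int] = field(default_factory=dict)

    def run(self, block: Optional[Block]) -> None:
        # Keep following thread-fetches until the VP has no next block.
        while block is not None:
            block = block(self)


class ControlProcessor:
    def __init__(self, num_vps: int) -> None:
        self.vps: List[VirtualProcessor] = [
            VirtualProcessor(i) for i in range(num_vps)
        ]

    def vector_fetch(self, block: Block) -> None:
        # Broadcast the same block to every VP (sequentially in this toy model).
        for vp in self.vps:
            vp.run(block)


def init(vp: VirtualProcessor) -> None:
    # Pure vector-style work: every VP does the same thing, then stops.
    vp.regs["n"] = vp.index + 1
    vp.regs["steps"] = 0
    return None


def collatz_step(vp: VirtualProcessor) -> Optional[Block]:
    # Data-dependent loop: each VP decides on its own whether to
    # thread-fetch this block again, so VPs run different trip counts.
    n = vp.regs["n"]
    if n == 1:
        return None
    vp.regs["n"] = n // 2 if n % 2 == 0 else 3 * n + 1
    vp.regs["steps"] += 1
    return collatz_step


cp = ControlProcessor(num_vps=4)
cp.vector_fetch(init)
cp.vector_fetch(collatz_step)
steps = [vp.regs["steps"] for vp in cp.vps]
print(steps)
```

In this sketch the control processor issues only two vector-fetches, while the per-VP loop lengths differ because each VP follows its own chain of thread-fetches. That is the kind of intermixing of vector and threaded control the abstract refers to, shown here purely at the level of control flow.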

  • Published in

    ISCA '04: Proceedings of the 31st annual international symposium on Computer architecture
    June 2004
    373 pages
    ISBN: 0769521436
    • ACM SIGARCH Computer Architecture News, Volume 32, Issue 2
      ISCA 2004
      March 2004
      373 pages
      ISSN: 0163-5964
      DOI: 10.1145/1028176

    Publisher

    IEEE Computer Society

    United States

    Acceptance Rates

    ISCA '04 paper acceptance rate: 31 of 217 submissions, 14%
    Overall acceptance rate: 543 of 3,203 submissions, 17%
