DOI: 10.1145/223982.224439

Next cache line and set prediction

Published: 01 May 1995

ABSTRACT

Accurate instruction fetch and branch prediction is increasingly important on today's wide-issue architectures. Fetch prediction is the process of determining the next instruction to request from the memory subsystem. Branch prediction is the process of predicting the likely outcome of branch instructions. Several researchers have proposed very effective fetch and branch prediction mechanisms, including branch target buffers (BTBs) that store the target addresses of taken branches. An alternative approach fetches the instruction following a branch by using an index into the cache instead of a branch target address. We call such an index a next cache line and set (NLS) predictor. An NLS predictor is a pointer into the instruction cache, indicating the target instruction of a branch.

In this paper we examine the use of NLS predictors for efficient and accurate fetch and branch prediction. Previous studies associated each NLS predictor with a cache line and provided only one-bit conditional branch predictors. Our study examines the use of NLS predictors with highly accurate two-level correlated conditional branch architectures. We examine the performance of decoupling the NLS predictors from the cache line and storing them in a separate tag-less memory buffer. Our results show that the decoupled architecture performs better than associating the NLS predictors with the cache line, that the NLS architecture benefits from reduced cache miss rates, and that it is particularly effective for programs containing many branches. We also provide an in-depth comparison between the NLS and BTB architectures, showing that the NLS architecture is a competitive alternative to the BTB design.
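The contrast the abstract draws between a tagged BTB (which stores target addresses) and a decoupled, tag-less NLS buffer (which stores pointers into the instruction cache) can be illustrated with a small sketch. This is not the paper's design: the class names, the `line` and `set_` fields, the table sizes, and the PC-to-index hash are all assumptions chosen for illustration, and the conditional branch predictor is left out entirely.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class NLSEntry:
    """A pointer into the instruction cache (field names are assumed)."""
    line: int   # cache line predicted to hold the branch target
    set_: int   # which set of an associative cache holds that line


class TaglessNLSTable:
    """Hypothetical decoupled NLS buffer: indexed by PC, no tags stored."""

    def __init__(self, entries: int = 1024, inst_bytes: int = 4) -> None:
        self.entries = entries
        self.inst_bytes = inst_bytes
        self.table: List[Optional[NLSEntry]] = [None] * entries

    def _index(self, pc: int) -> int:
        # Assumed hash: low-order bits of the instruction word address.
        return (pc // self.inst_bytes) % self.entries

    def update(self, pc: int, line: int, set_: int) -> None:
        self.table[self._index(pc)] = NLSEntry(line, set_)

    def predict(self, pc: int) -> Optional[Tuple[int, int]]:
        # No tag check: any PC that maps to this slot gets the stored pointer.
        entry = self.table[self._index(pc)]
        return None if entry is None else (entry.line, entry.set_)


class SimpleBTB:
    """Hypothetical direct-mapped BTB: stores a tag and a full target address."""

    def __init__(self, entries: int = 256, inst_bytes: int = 4) -> None:
        self.entries = entries
        self.inst_bytes = inst_bytes
        self.table: List[Optional[Tuple[int, int]]] = [None] * entries

    def _index_and_tag(self, pc: int) -> Tuple[int, int]:
        word = pc // self.inst_bytes
        return word % self.entries, word // self.entries

    def update(self, pc: int, target: int) -> None:
        index, tag = self._index_and_tag(pc)
        self.table[index] = (tag, target)

    def predict(self, pc: int) -> Optional[int]:
        index, tag = self._index_and_tag(pc)
        entry = self.table[index]
        if entry is not None and entry[0] == tag:
            return entry[1]   # hit: full target address
        return None           # miss: tag mismatch or empty slot


if __name__ == "__main__":
    nls = TaglessNLSTable(entries=4)
    btb = SimpleBTB(entries=4)
    nls.update(0x1000, line=5, set_=1)
    btb.update(0x1000, target=0x2000)
    # 0x1010 maps to the same slot as 0x1000 in both four-entry tables.
    print("NLS:", nls.predict(0x1000), nls.predict(0x1010))
    print("BTB:", btb.predict(0x1000), btb.predict(0x1010))
```

In this sketch the tag-less table always returns whatever pointer occupies the indexed slot, so two branches that alias share a prediction, whereas the tagged BTB reports a miss for the aliasing branch. Each NLS entry also holds only a short cache index rather than a full address, which is the kind of storage trade-off the abstract's comparison concerns.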


Published in

ISCA '95: Proceedings of the 22nd annual international symposium on Computer architecture
July 1995, 426 pages
ISBN: 0897916980
DOI: 10.1145/223982

ACM SIGARCH Computer Architecture News, Volume 23, Issue 2
Special Issue: Proceedings of the 22nd annual international symposium on Computer architecture (ISCA '95)
May 1995, 412 pages
ISSN: 0163-5964
DOI: 10.1145/225830

Copyright © 1995 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 543 of 3,203 submissions, 17%
