Micropipelines

Published: 01 June 1989

Abstract

The pipeline processor is a common paradigm for very high speed computing machinery. Pipeline processors provide high speed because their separate stages can operate concurrently, much as different people on a manufacturing assembly line work concurrently on material passing down the line. Although the concurrency of pipeline processors makes their design a demanding task, they can be found in graphics processors, in signal processing devices, in integrated circuit components for doing arithmetic, and in the instruction interpretation units and arithmetic operations of general purpose computing machinery.

Because I plan to describe a variety of pipeline processors, I will start by suggesting names for their various forms. Pipeline processors, or more simply just pipelines, operate on data as it passes along them. The latency of a pipeline is a measure of how long it takes a single data value to pass through it. The throughput rate of a pipeline is a measure of how many data values can pass through it per unit time.
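As a rough illustration of the difference between these two measures, the following sketch uses a deliberately simplified model that is not taken from the lecture: each stage has a fixed delay, latency is the sum of the stage delays, and steady-state throughput is limited by the slowest stage. The function names and the delay values are hypothetical.

```python
def latency(stage_delays):
    """Time for a single value to pass through every stage in turn."""
    return sum(stage_delays)


def throughput(stage_delays):
    """Values per unit time in steady state, assuming the slowest
    stage is the bottleneck (a simplifying assumption)."""
    return 1.0 / max(stage_delays)


# Hypothetical per-stage delays in nanoseconds.
delays_ns = [2.0, 5.0, 3.0]

total_latency_ns = latency(delays_ns)      # sum of all stage delays
values_per_ns = throughput(delays_ns)      # reciprocal of the slowest stage
```

In this model, adding stages raises latency but leaves throughput unchanged as long as the slowest stage stays the same.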

Pipelines both store and process data; the storage elements and processing logic in them alternate along their length. I will describe pipelines in their complete form later, but first I will focus on their storage elements alone, stripping away all processing logic. Stripped of all processing logic, any pipeline acts like a series of storage elements through which data can pass.

Pipelines can be clocked or event-driven, depending on whether their parts act in response to some widely-distributed external clock, or act independently whenever local events permit. Some pipelines are inelastic; the amount of data in them is fixed. The input rate and the output rate of an inelastic pipeline must match exactly. Stripped of any processing logic, an inelastic pipeline acts like a shift register. Other pipelines are elastic; the amount of data in them may vary. The input rate and the output rate of an elastic pipeline may differ momentarily because of internal buffering. Stripped of all processing logic, an elastic pipeline becomes a flow-through first-in-first-out memory, or FIFO. FIFOs may be clocked or event-driven; their important property is that they are elastic.
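The contrast between inelastic and elastic storage can be sketched in software. The following is a minimal behavioural model, not a description of any circuit in the lecture; the class names `ShiftRegister` and `ElasticFifo`, and the bounded capacity, are assumptions made for illustration.

```python
from collections import deque


class ShiftRegister:
    """Inelastic: always holds exactly `length` items.
    Every shift takes one value in and pushes one value out."""

    def __init__(self, length, fill=None):
        self.cells = deque([fill] * length, maxlen=length)

    def shift(self, value):
        oldest = self.cells[0]
        self.cells.append(value)  # maxlen discards the oldest item
        return oldest


class ElasticFifo:
    """Elastic: occupancy varies between 0 and `capacity`.
    Input and output proceed independently when there is room or data."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()

    def can_put(self):
        return len(self.items) < self.capacity

    def can_get(self):
        return len(self.items) > 0

    def put(self, value):
        if not self.can_put():
            raise OverflowError("FIFO full")
        self.items.append(value)

    def get(self):
        if not self.can_get():
            raise IndexError("FIFO empty")
        return self.items.popleft()
```

In this model the shift register's input and output rates match by construction, since one call does both. The FIFO accepts several `put` calls before any `get`, which is the momentary rate mismatch that internal buffering allows.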

I assign the name micropipeline to a particularly simple form of event-driven elastic pipeline with or without internal processing. The micro part of this name seems appropriate to me because micropipelines contain very simple circuitry, because micropipelines are useful in very short lengths, and because micropipelines are suitable for layout in microelectronic form.

I have chosen micropipelines as the subject of this lecture for three reasons. First, micropipelines are simple and easy to understand. I believe that simple ideas are best, and I find beauty in the simplicity and symmetry of micropipelines. Second, I see confusion surrounding the design of FIFOs. I offer this description of micropipelines in the hope of reducing some of that confusion.

The third reason I have chosen my subject addresses the limitations imposed on us by the clocked-logic conceptual framework now commonly used in the design of digital systems. I believe that this conceptual framework or mind set masks simple and useful structures like micropipelines from our thoughts, structures that are easy to design and apply given a different conceptual framework. Because micropipelines are event-driven, their simplicity is not available within the clocked-logic conceptual framework. I offer this description of micropipelines in the hope of focusing attention on an alternative transition-signalling conceptual framework.

We need a new conceptual framework because the complexity of VLSI technology has now reached the point where design time and design cost often exceed fabrication time and fabrication cost. Moreover, most systems designed today are monolithic and resist mid-life improvement. The transition-signalling conceptual framework offers the opportunity to build up complex systems by hierarchical composition from simpler pieces. The resulting systems are easily modified. I believe that the transition-signalling conceptual framework has much to offer in reducing the design time and cost of complex systems and increasing their useful lifetime. I offer this description of micropipelines as an example of the transition-signalling conceptual framework.

Until recently only a hardy few used the transition-signalling conceptual framework for design because it was too hard. It was nearly impossible to design the small circuits of 10 to 100 transistors that form the elemental building blocks from which complex systems are composed. Moreover, it was difficult to prove anything about the resulting compositions. In the past five years, however, much progress has been made on both fronts. Charles Molnar and his colleagues at Washington University have developed a simple way to design the small basic building blocks [9]. Martin Rem's "VLSI Club" at the Technical University of Eindhoven has been working effectively on the mathematics of event-driven systems [6, 10, 11, 19]. These emerging conceptual tools now make transition signalling a lively candidate for widespread use.

References

  1. Chaney, T.J., and Molnar, C.E. Anomalous behavior of synchronizer and arbiter circuits. IEEE Trans. Comput. C-22, 4 (Apr. 1973), 421-422.
  2. Clark, W.A. Macromodular computer systems. In Proceedings of the Spring Joint Computer Conference, AFIPS, April 1967.
  3. Clark, W.A., and Molnar, C.E. Macromodular computer systems. Computers in Biomedical Research, Vol. 4, R. Stacy and B. Waxman, Eds., Academic Press, New York, 1974, 45-85.
  4. Dally, W.J., and Seitz, C.L. Deadlock-free message routing in multiprocessor interconnection networks. IEEE Trans. Comput. 36, 5 (May 1987), 547-553.
  5. Dill, D.L., Nowick, S.M., and Sproull, R.F. Specification and automatic verification of self-timed queues. Computer Systems Laboratory Report, Stanford University, 1988.
  6. Ebergen, J.C. Translating programs into delay-insensitive circuits. Ph.D. dissertation, Eindhoven University of Technology, 1987.
  7. Levy, J.V. Buses, the skeleton of computer structures. In Computer Engineering, C.G. Bell, J.C. Mudge, and J.E. McNamara, Eds., Digital Press, 1978.
  8. Miller, R.E. "Sequential Circuits", Chapter 10. In Switching Theory, Vol. 2, Wiley, NY, 1965.
  9. Molnar, C.E., Fang, T.P., and Rosenberger, F.U. Synthesis of delay-insensitive modules. In Proceedings of the 1985 Chapel Hill Conference on VLSI, H. Fuchs, Ed., Computer Science Press, 1985.
  10. Rem, M., van de Snepscheut, J.L.A., and Udding, J.T. Trace theory and the definition of hierarchical components. In Proceedings of the Caltech Conference on VLSI, 1983.
  11. Rem, M. Trace theory and systolic computations. In Proc. PARLE (Parallel Architectures and Languages Europe), Vol. 1, J.W. deBakker, A.J. Nijman, and P.C. Treleaven, Eds., Springer-Verlag, 1987, pp. 14-34.
  12. Rosenberger, F.U., Molnar, C.E., Chaney, T.J., et al. Q-modules: Locally clocked delay-insensitive modules. IEEE Trans. Comput. 37, 9 (Sept. 1988), 1005-1018.
  13. Seitz, C.L. System Timing. In Introduction to VLSI Systems, C.A. Mead and L.A. Conway, Eds., Addison-Wesley, 1980.
  14. Sproull, R.F., and Sutherland, I.E. A clipping divider. FJCC 1968, Thompson Books, Washington, D.C., 765.
  15. Sutherland, I.E., and Hodgman, G.W. Reentrant polygon clipping. Commun. ACM 17, 1 (Jan. 1974), 32-42.
  16. Sutherland, I.E. Asynchronous queue system. U.S. Patent 4,679,213, July 7, 1987.
  17. Sutherland, I.E. Asynchronous first-in-first-out register structure. U.S. Patent pending.
  18. Sutherland, I.E. Asynchronous pipelined data processing system. U.S. Patent pending.
  19. Udding, J.T. A formal model for defining and classifying delay-insensitive circuits and systems. J. Distrib. Comptg. 1, 1986, 197-204.

        Reviews

        Harry Frederick Jordan

        Ivan Sutherland's Turing Award lecture is important reading for computer designers. As used in this work, a micropipeline is a powerful combination of the concepts of pipelining, asynchronous sequential logic, and transition signaling. Along with these three main concepts, micropipelines incorporate some aspects of the “programming” method of designing hardware and of using a minimal set of functional modules for control. Sutherland does a good job of presenting the ideas as an integrated design philosophy instead of as a bag of design tricks or as a design toolkit, which is often difficult for authors of hardware design papers.

        Asynchronous design has been of interest to the computer engineering community since the work of Muller and others on the ILLIAC II computer. Sutherland credits Muller's work [1] as one root of micropipelines (though his name is accidentally misspelled in the references). Another important root is the work of Molnar, Clark, and others on macromodules [2]. Macromodules exploited differences in coaxial cable transmission speeds to guarantee data arrival in advance of control signals, while Sutherland employs VLSI layout geometry for the same purpose.

        Transition signaling is thoroughly mixed with asynchronous design in this paper and is not always properly distinguished as a concept. An example is the praise for transition signaling because it yields the ability to do proofs of functionality, whereas these proof techniques are possible in any form of asynchronous design, whether edges or pulses are used to represent events.

        The most ubiquitous root, and perhaps the most important, is pipelining. Pipelined control, which is introduced first, reminds me of the control delay style of building computer control units that Hill and Peterson advocate in their 1973 text [3]. Pipelining techniques are becoming more and more important as circuit speeds crowd the speed-of-light limit. They are also being applied at lower and lower levels of design. Systolic arrays that pipeline large operations such as floating-point multiply-add fit naturally with carry-save versions of multipliers. This paper presents a coherent methodology for doing pipelined design from the bottom up, while retaining the ability to modularize functions and have them interface consistently with other functional modules.

        A cautionary note on the future applications of this methodology is that asynchronous logic, once in the days of ILLIAC II and again with the macromodules, enjoyed a brief period of intense interest and then gave way to synchronous techniques in the interests of enhanced speed with available technology. I hope the happy combination of asynchronous design with transition signaling and pipelining will make asynchronous design a mainline, rather than a peripheral, design technique.

        • Published in

          Communications of the ACM  Volume 32, Issue 6
          June 1989
          92 pages
          ISSN:0001-0782
          EISSN:1557-7317
          DOI:10.1145/63526

          Copyright © 1989 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States
