skip to main content
article
Free Access

Sequentiality and prefetching in database systems

Published:01 September 1978Publication History
Skip Abstract Section

Abstract

Sequentiality of access is an inherent characteristic of many database systems. We use this observation to develop an algorithm which selectively prefetches data blocks ahead of the point of reference. The number of blocks prefetched is chosen by using the empirical run length distribution and conditioning on the observed number of sequential block references immediately preceding reference to the current block. The optimal number of blocks to prefetch is estimated as a function of a number of “costs,” including the cost of accessing a block not resident in the buffer (a miss), the cost of fetching additional data blocks at fault times, and the cost of fetching blocks that are never referenced. We estimate this latter cost, described as memory pollution, in two ways. We consider the treatment (in the replacement algorithm) of prefetched blocks, whether they are treated as referenced or not, and find that it makes very little difference. Trace data taken from an operational IMS database system is analyzed and the results are presented. We show how to determine optimal block sizes. We find that anticipatory fetching of data can lead to significant improvements in system operation.

References

  1. 1 AHo, A.V,, DENNING, P.J., AND ULLMAN, J.D. Principles of optimal page replacement. J. ACM 18, 1 (Jan. 1971), 80-93. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. 2 BAER, J.L., AND SAGER, G.R. Dynamic improvement of locality in virtual memory systems. IEEE Trans. Software Eng. SE-2, 1 (March 1976), 54-62.Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. 3 BARD, Y. Characterization of program paging in a time-sharing environment. IBM J. Res. Develop. 17, 3 (Sept. 1973), 387-393.Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. 4 COrtBATO, F.J. A paging experiment with the Multics system. In In Honor of P.M. Morse, M.I,T. Press, Cambridge, Mass., 1969, pp. 217-228.Google ScholarGoogle Scholar
  5. 5 DATE, C.J. An Introduction to Data Base Systems. Addison-Wesley, Reading, Mass,, 1975. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. 6 DENNING, P.J. The working set model for program behavior. Comm. ACM I1, 5 (May 1968), 323-333. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. 7 GAV~,R, D.P., LAVENBERG, S.S., AND PRICE, T.G. JR. Exploration analysis of access path length data for a data base management system. IBM J. Res. Develop. 20, 5 (Sept. 1976), 449-464.Google ScholarGoogle Scholar
  8. 8 GOLD, D.E., AND KUCK, D.J. A model for masking rotational latency by dynamic disk allocation. Comm. ACM 17, 5 (May 1974), 278-288. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. 9 HELD, G.D., STONEBRAKER, M.R., AND WON(}, E. INGRES--a relational data base system. Proc. AFIPS 1975 NCC, AFIPS Press, Montvale, N.J., pp. 409-416.Google ScholarGoogle Scholar
  10. 10 IBM CORP. Information Management System/360, Version 2, General information Manual. Form GH20-0765-3, IBM Corp. Tech. Pub. Dept., Palo Alto, Calif., 1973.Google ScholarGoogle Scholar
  11. 11 IBM CORP. Information Management System/360, Version 2, Application Programming Reference Manual. Form SH20-0912, IBM Corp., Palo Alto, Calif., Nov. i973.Google ScholarGoogle Scholar
  12. 12 IBM CORP. Information Management System/360, Version 2, System Programming Reference Manual. Form SH20-0911, IBM Corp., Palo Alto, Calif., Sept. 1974.Google ScholarGoogle Scholar
  13. 13 JOSEPH, M. An analysis of paging and program behavior. Comptr. J. 13, 1 (Feb. 1970), 48-54.Google ScholarGoogle ScholarCross RefCross Ref
  14. 14 LAVENBERG, S.S., AND SCHEDLER, G,S. A queueing model of the DL/I component of IMS. Res. Rep. RJ 1561, IBM Res. Lab., San Jose, Calif., April 1975. Republished as {15}.Google ScholarGoogle Scholar
  15. 15 LAVENBERC, S.S., AND SHEDLER, G.S. Stochastic modelling of processor scheduling with application to data base management systems. IBM J. Res. Develop. 20, 5 (Sept. 1976), 437-448.Google ScholarGoogle Scholar
  16. 16 LEwis, P.A.W., ANt) Sn~.DLErt, G.S. Statistical analysis of transaction processing in a data base system. Res. Rep. RJ 1629, IBM Res. Lab., San jose, Calif., Sept. 1975. Republished as: Statistical analysis of non-stationary series of events in a data base system. IBM J. Res. Develop. 20, 5 (Sept. 1976), 465-482.Google ScholarGoogle Scholar
  17. 17 MATTSON, R., GECSEI, J., SLUTZ, D.R., AND TRAIGER, I.L. Evaluation techniques for storage hierarchies. IBM Syst. J. 2 (I970), 78-117. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. 18 RAGAZ, N., AND RODRIGUEZ-ROSELL, J. Empirical studies of storage management in a data base system. Res. Rep. RJ 1834, IBM Res. Lab., San Jose, Calif., Oct. 1976.Google ScholarGoogle Scholar
  19. 19 RITCHIE, D.M., AND THOMPSON, K. The UNIX time-sharing system. Comm. ACM 17, 7 (July 1974), 365-375. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. 20 RODRIGUEZ-ROSELL, J. Empirical data reference behavior in data base systems. Computer 9, 11 (Nov. 1976), 9-13.Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. 21 RODR{GUEZ-ROSELL, J., AND HILDEBRAND, D. A framework for evaluation of data base systems. Res. Rep. RJ 1587, IBM Res. Lab., San Jose, Calif., May 1975. Also in Proc. Int. Comput. Syrup., Antibes, France, June 1975.Google ScholarGoogle Scholar
  22. 22 SMITH, A.J. A locality model for disk reference patterns. Proc. IEEE Comptr. Soc. Conf., San Francisco, Feb. 1975, 109-112.Google ScholarGoogle Scholar
  23. 23 SMITH, A.J. Analysis of a locality model for disk reference patterns. Proc. Second Conf. Inform. Sci. and Syst., Johns Hopkins U., Baltimore, Md., April 1976, 593-601.Google ScholarGoogle Scholar
  24. 24 SMITH, A.J. Sequential program prefetching in memory hierarchies. April 1977; submitted for publication. To appear, Computer.Google ScholarGoogle Scholar
  25. 25 TRIVEDI, K.S. Prepaging and applications to array algorithms. IEEE Trans. Comptrs. C-25, 9 (Sept. 1976), 915-921.Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. 26 TRIV}~DI, K.S. Prepaging and applications to the STAR-100 computer. Proc. Symp. High Performance Comptr. and Algorithm Organization, Champaign, Ill., April 1977.Google ScholarGoogle Scholar
  27. 27 TRIVF.DI, K.S. An analysis of prepaging. Comptr. Sci. Rep. CS-1977-7, Duke U., Durham, N.C., Aug. 1977.Google ScholarGoogle Scholar
  28. 28 TRIVED1, K.S. On the paging performance of array algorithms. IEEE Trans. Comptrs. C-26, i0 (Oct. 1977), 938-947.Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. 29 TUgL, W.G. JR. An analysis of buffer paging in virtual storage systems. Res. Rep. RJ 1421, IBM Res. Lab., San Jose, Calif., 1974.Google ScholarGoogle Scholar
  30. 30 TU~L, W.G. Jm, AN~ RODamUEZ-ROSELL, J. A methodology for the evaluation of data base systems. Res. Rep. RJ 1668, IBM Res. Lab., San Jose, Calif., Oct. 1975.Google ScholarGoogle Scholar

Index Terms

  1. Sequentiality and prefetching in database systems

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader