skip to main content
article
Free Access

The design and implementation of a log-structured file system

Published:01 February 1992Publication History
Skip Abstract Section

Abstract

This paper presents a new technique for disk storage management called a log-structured file system. A log-structured file system writes all modifications to disk sequentially in a log-like structure, thereby speeding up both file writing and crash recovery. The log is the only structure on disk; it contains indexing information so that files can be read back from the log efficiently. In order to maintain large free areas on disk for fast writing, we divide the log intosegmentsand use a segment cleaner to compress the live information from heavily fragmented segments. We present a series of simulations that demonstrate the efficiency of a simple cleaning policy based on cost and benefit. We have implemented a prototype log-structured file system called Sprite LFS; it outperforms current Unix file systems by an order of magnitude for small-file writes while matching or exceeding Unix performance for reads and large writes. Even when the overhead for cleaning is included, Sprite LFS can use 70% of the disk bandwidth for writing, whereas Unix file systems typically can use only 5–10%.

References

  1. 1 BAKER, H. G. List processing in real time on a serial computer. A.I. Working Paper 139, MIT-AI Lab, Boston, MA, Apr. 1977.Google ScholarGoogle Scholar
  2. 2 BAKER, M. G., HARTMAN, J. H., KUPFER, M~ D., SHIRRIFF, K. W., AND OUSTERHOUT, J. K. Measurements of a distributed file system. In Proceedings of the 13th Symposium on Operating Systems Principles (Pacific Grove, Calif., Oct. 1991), ACM. pp. 198-212. Google ScholarGoogle Scholar
  3. 3 CHANG, A., MERGEN, M. F., RADER, R. K~, ROBERTS, J. A., AND PORTER, S.L. Evolution of storage facilities in AIX Version 3 for RISC System/6000 processors. IBM J. Res. Dev. 34, 1 (Jan. 1990), 105-109. Google ScholarGoogle Scholar
  4. 4 DEWITT, D. J., KATZ, R. H., OLKEN, F., SHAPIRO. L. D., STONEBRAKER M. R., AND WOOD, D. Implementation techniques for main memory database systems. In Proceedings of SIGMOD 1984 (Jun. 1984), pp. 1-8. Google ScholarGoogle Scholar
  5. 5 FINLAYSON, R. S. AND CHERITON, D. R. Log files: An extended file service exploiting write-once storage. In Proceedings of the 11th Sympostum on Operating Systems Principles (Austin, Tex., Nov. 1987), ACM, pp. 129-148. Google ScholarGoogle Scholar
  6. 6 GRAY, J. Notes on data base operating systems. In Operating Systems, An Advanced Course. Springer-Verlag, Berlin, 1979, pp~ 393-481. Google ScholarGoogle Scholar
  7. 7 HAaMANN, R.B. A crash recovery scheme for a memory-resident database system, IEEE Trans. Comput. C-35, 9 (Sept. 1986), 000-000. Google ScholarGoogle Scholar
  8. 8 HAGMANN, R.B. Reimplementing the cedar file system using logging and group commit. In Proceedings of the 11th Symposium on Operating Systems Principles (Austin, Tex., Nov. 1987), pp. 155-162. Google ScholarGoogle Scholar
  9. 9 KAZAR, M. L, LEVERETT, B. W., ANDERSON, O. T, APOSTOLIDES, V., BOTTOS, B. A CHUTANI, S., EVERHART, C. F., MASON, W. A., Tu, S. T., AND ZAYAS, E. R. Decorum file system architectural overview. In Proceedings of the USENIX 1990 Summer Conference (Anaheim, Calif., Jun. 1990), pp. 151-164.Google ScholarGoogle Scholar
  10. 10 LAZOWSKA, E. D., ZAHORJAN, J., CHERITON, D. R., AND ZWAENEPOEL, W. File access performance of diskless workstations. Trans. Comput. Syst. 4, 3 (Aug. 1986), 238-268. Google ScholarGoogle Scholar
  11. 11 LIEBERMAN, H AND HEWITT, C. A real-time garbage collector based on the lifetimes of objects. Commun. ACM. 26, 6 (June 1983), 419-429. Google ScholarGoogle Scholar
  12. 12 McKumcK, M. K. A fast file system for Unix. Trans. Comput. Syst. 2, 3 (Aug. 1984), 181-197. Google ScholarGoogle Scholar
  13. 13 McKusIcK, M K., JoY, W. N., LEFFLER~ S. J., AND FABRY, R. S. Fsck--The UNIX file system check program. Unix System Manager's Manual--4.3 BSD Virtual VAX-11 Version, USENIX, Berkeley, Calif., Apr. 1986.Google ScholarGoogle Scholar
  14. 14 McVoY, L. AND KL~mAN, S. Extent-like performance from a UNIX file system. In Proceedings of the USENIX 1991 Winter Conference (Dallas, Tex., Jan. 1991).Google ScholarGoogle Scholar
  15. 15 OKI, B. M., LISKOV, B. H., AND SCHEIFLER, R.W. Reliable object storage to support atomic actions. In Proceedings of the lOth Symposium on Operating Systems Principles (Orcas Island, Wash., Dec. 1985) ACM, pp. 147-159. Google ScholarGoogle Scholar
  16. 16 OUSTERHOU% J. K. Why aren't operating systems getting faster as fast as hardware? In Proceedings of the USENIX 1990 Summer Conference (Anaheim, Calif., Jun. 1990), pp. 247-256.Google ScholarGoogle Scholar
  17. 17 OUSTERHOUT, J. K, CHERENSON, A, R., DOUGLIS, F., NELSON, M N, AND WELCH, B.B. The Sprite Network Operating System. IEEE Comput. 21, 2 (Feb. 1988), 23 36. Google ScholarGoogle Scholar
  18. 18 OUSTERHOUT, J. K, DA COSTA, H., HARRISON, D., KUNZE, J A., KUPFER, M., AND THOMPSON, J. G A trace-dmven analysis of the Unix 4.2 BSD file system. In Proceedings of the lOth Symposium on Operating Systems Principles (Orcas Island, Wash., 1985), ACM, pp. 15-24. Google ScholarGoogle Scholar
  19. 19 PATTERSON, D. A., Gmsor4, G., AND KATZ, R. H A case for redundant arrays of inexpensive disks (RAID). ACM SIGMOD 88, pp. 109-116, (Jun. 1988). Google ScholarGoogle Scholar
  20. 20 REED, D. AND SVOBODOVA, L. SWALLOW: A distributed data storage system for a local network. In Local Networks for Computer Communications. North-Holland, 1981, pp. 355-373.Google ScholarGoogle Scholar
  21. 21 SANDSERG, R. Design and implementation of the Sun network filesystem. In Proceedings of the USENIX 1985 Summer Conference (Portland, Ore , Jun. 1985) pp. 119-130.Google ScholarGoogle Scholar
  22. 22 SALEM, K. AND GARCIA-MOLINA, H. Crash recovery mechamsms for main storage database systems. CS-TR-034-86, Princeton Univ., Princeton, NJ 1986.Google ScholarGoogle Scholar
  23. 23 SATYANARAYANAN, M A study of file sizes and functional lifetimes. In Proceedings of the 8th Symposium on Operating Systems Principles (Pacffic Grove, Calif., Dec. 1981), ACM, pp. 96-108. Google ScholarGoogle Scholar
  24. 24 SELTZER, M. I., CHEN, P. M., AND OUSTERHOUT, J. K Disk scheduling revisited. In Proceedlngs of the Winter 1990 USENIX Techmcal Conference (Jan. 1990), pp.Google ScholarGoogle Scholar

Index Terms

  1. The design and implementation of a log-structured file system

                      Recommendations

                      Reviews

                      Charles N. Schroeder

                      This well-written paper is of particular interest at this time since it deals with a much-improved disk storage management system for UNIX-based systems. The system described shows an order of magnitude increase in performance over traditional log systems; it is of particular interest to system administrators of UNIX-based systems. As indicated in the abstract, “the log-structured file system writes modifications to disk sequentially in a log-like structure, thereby speeding up both file writing and crash recovery. The log is the only structure on disk; it contains indexing information so files can be read back from the log efficiently.” In order to maintain large free areas on the disk for fast writing, the log is divided into segments and a segment cleaner is used to compress fragmented segments. The authors describe a prototype of the log-structured file system and give performance comparisons to current UNIX file systems. The results are impressive and indicate that this system has definite potential for increasing the performance of file systems.

                      Access critical reviews of Computing literature here

                      Become a reviewer for Computing Reviews.

                      Comments

                      Login options

                      Check if you have access through your login credentials or your institution to get full access on this article.

                      Sign in

                      Full Access

                      PDF Format

                      View or Download as a PDF file.

                      PDF

                      eReader

                      View online with eReader.

                      eReader