Abstract
The widely studied I/O and ideal-cache models were developed to account for the large difference in costs to access memory at different levels of the memory hierarchy. Both models are based on a two-level memory hierarchy with a fixed-size fast memory (cache) of size M, and an unbounded slow memory organized in blocks of size B. The cost measure is based purely on the number of block transfers between the primary and secondary memory; all other operations are free. Many algorithms have been analyzed in these models, and indeed these models predict the relative performance of algorithms much more accurately than the standard Random Access Machine (RAM) model. The models, however, require specifying algorithms at a very low level: the user must carefully lay out their data in arrays in memory and manage their own memory allocation.
We present a cost model for analyzing the memory efficiency of algorithms expressed in a simple functional language. We show how some algorithms written in standard forms, using just lists and trees (no arrays) and requiring no explicit memory layout or memory management, are efficient in the model. We then describe an implementation of the language and show provable bounds for mapping the cost in our model to the cost in the ideal-cache model. These bounds imply that purely functional programs based on lists and trees, with no special attention to any details of memory layout, can be asymptotically as efficient as the carefully designed imperative I/O-efficient algorithms. For example, we describe an $O\left(\frac{n}{B}\log_{M/B}\frac{n}{B}\right)$ cost sorting algorithm, which is optimal in the ideal-cache and I/O models.
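To give a feel for the sorting bound quoted above, read as O((n/B) log_{M/B}(n/B)) block transfers, the following minimal sketch evaluates it next to the cost of a single sequential scan. The function names and the sample values of n, M, and B are illustrative assumptions, not taken from the paper, and constant factors are ignored.

```python
import math


def scan_cost(n: int, B: int) -> int:
    """Block transfers needed to read n contiguous items once: ceil(n / B)."""
    return math.ceil(n / B)


def sort_bound(n: int, M: int, B: int) -> float:
    """Evaluate (n/B) * log_{M/B}(n/B), dropping constant factors.

    Assumption: the logarithmic factor is clamped to at least 1, the usual
    convention for inputs small enough that one pass over the data suffices.
    """
    if M <= B:
        raise ValueError("the cache must hold more than one block (M > B)")
    blocks = n / B   # input size measured in blocks
    fanout = M / B   # number of blocks that fit in the cache
    passes = math.log2(blocks) / math.log2(fanout)
    return blocks * max(1.0, passes)


if __name__ == "__main__":
    # Hypothetical parameters: 2^30 items, a cache of 2^20 items, blocks of 2^10 items.
    n, M, B = 2**30, 2**20, 2**10
    print("scan:", scan_cost(n, B))
    print("sort:", sort_bound(n, M, B))
```

With these hypothetical parameters the logarithmic factor is 2, so the bound is only about twice the cost of scanning the input, which illustrates why the base M/B of the logarithm matters.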
Index Terms
- Cache efficient functional algorithms