
A Relational Theory of Locality

Published: 20 August 2019
Abstract

Locality is a common concept in many areas of program and system analysis and optimization, and it has been defined and measured in many ways. This article aims to formally establish relations between these previously disparate types of locality. It categorizes locality definitions into three groups and shows whether and how they can be interconverted. For the footprint, a recent metric, it gives a new measurement algorithm that is asymptotically more time- and space-efficient than previous approaches. Using the conversion relations, the new algorithm derives, with the same efficiency, the different locality metrics developed and used in program analysis, memory management, and cache design.
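To make the footprint metric concrete, the sketch below computes it by brute force. It assumes the definition commonly used in the footprint literature: the footprint for window length x is the average number of distinct data items accessed over all length-x windows of a trace. The function name `footprint` and the string-based trace are illustrative choices, and this is not the article's measurement algorithm, which the abstract describes only as asymptotically more efficient than previous approaches.

```python
from collections import Counter


def footprint(trace, x):
    """Average number of distinct items over all length-x windows of trace.

    Assumed definition (not taken from the abstract itself): the footprint is
    the mean working-set size across every window of x consecutive accesses.
    This sliding-window version costs O(n) time for a single window length.
    """
    n = len(trace)
    if not 1 <= x <= n:
        raise ValueError("window length must be between 1 and len(trace)")

    # Occurrence counts for the items inside the current window.
    counts = Counter(trace[:x])
    total = len(counts)  # distinct items in the first window

    for i in range(x, n):
        # Slide the window one step: trace[i] enters, trace[i - x] leaves.
        counts[trace[i]] += 1
        leaving = trace[i - x]
        counts[leaving] -= 1
        if counts[leaving] == 0:
            del counts[leaving]
        total += len(counts)

    return total / (n - x + 1)


if __name__ == "__main__":
    trace = "aabb"  # each character stands for one accessed data item
    for x in range(1, len(trace) + 1):
        print(x, footprint(trace, x))
```

Calling the function once per window length, as in the usage loop, takes quadratic time in total for a full footprint curve; that cost is the kind of overhead a more efficient measurement algorithm would aim to avoid.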


        • Published in

          ACM Transactions on Architecture and Code Optimization, Volume 16, Issue 3
          September 2019
          347 pages
          ISSN: 1544-3566
          EISSN: 1544-3973
          DOI: 10.1145/3341169
          Copyright © 2019 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 20 August 2019
          • Revised: 1 June 2019
          • Accepted: 1 June 2019
          • Received: 1 August 2018
          Qualifiers

          • research-article
          • Research
          • Refereed