Persistent unfairness arising from cache residency imbalance

Authors:
Dave Dice, Oracle Labs, Burlington, MA, USA
Virendra J. Marathe, Oracle Labs, Burlington, MA, USA
Nir Shavit, MIT, Cambridge, MA, USA

SPAA '14: Proceedings of the 26th ACM symposium on Parallelism in algorithms and architectures, June 2014, Pages 82–83, https://doi.org/10.1145/2612669.2612703

Published: 21 June 2014

ABSTRACT

We describe a counter-intuitive performance phenomenon relevant to concurrency research. On a modern multicore system with a shared last-level cache, a set of concurrently running identical threads that loop, each accessing the same quantity of distinct thread-private data, can suffer significant relative progress imbalance. If one thread, or a small subset of the threads, manages to transiently enjoy higher cache residency than the other threads, that thread will tend to iterate faster and keep more of its data resident, thus increasing the odds that it will continue to run faster. This emergent behavior tends to be stable over surprisingly long periods.
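
The feedback loop described above can be explored with a toy model. The sketch below is not the authors' experiment and makes no claim about real hardware. It assumes a single shared LRU cache, uniformly random accesses to each thread's private lines, and fixed hit and miss costs. All names and parameters (`simulate`, `imbalance`, `miss_cost`, and so on) are hypothetical. Whether imbalance appears, and how strongly, depends on the chosen parameters.

```python
import heapq
import random
from collections import OrderedDict


def simulate(num_threads=4, lines_per_thread=64, cache_lines=128,
             hit_cost=1, miss_cost=20, total_time=100_000, seed=1):
    """Toy model: identical threads touch random lines of their own private
    data through one shared LRU cache. Returns accesses issued per thread."""
    rng = random.Random(seed)
    cache = OrderedDict()              # (thread, line) keys, least recent first
    issued = [0] * num_threads
    # Event queue of (time of the thread's next access, thread id).
    ready = [(0, t) for t in range(num_threads)]
    heapq.heapify(ready)
    while ready:
        now, t = heapq.heappop(ready)
        if now >= total_time:
            continue                   # measurement interval over: thread retires
        key = (t, rng.randrange(lines_per_thread))
        if key in cache:
            cache.move_to_end(key)     # hit: refresh recency
            cost = hit_cost
        else:
            cache[key] = None          # miss: install the line
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used line
            cost = miss_cost
        issued[t] += 1
        # A thread that hits becomes ready again sooner, so it touches its
        # data more often and keeps more of it resident.
        heapq.heappush(ready, (now + cost, t))
    return issued


def imbalance(issued):
    """Ratio of the fastest thread to the slowest; 1.0 means perfectly fair."""
    return max(issued) / min(issued)


if __name__ == "__main__":
    counts = simulate()
    print(counts, imbalance(counts))
```

Varying `cache_lines` relative to `num_threads * lines_per_thread`, or the gap between `hit_cost` and `miss_cost`, gives a rough way to see when the model's threads stay balanced and when they diverge.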

Index Terms

Persistent unfairness arising from cache residency imbalance
1. Computing methodologies
  1. Parallel computing methodologies
    1. Parallel programming languages
2. Software and its engineering
  1. Software notations and tools
    1. General programming languages
      1. Language types
        1. Parallel programming languages


Published in
SPAA '14: Proceedings of the 26th ACM symposium on Parallelism in algorithms and architectures
June 2014
356 pages
ISBN: 9781450328210
DOI: 10.1145/2612669
General Chair: Guy Blelloch, Carnegie Mellon University, USA
Program Chair: Peter Sanders, Karlsruhe Institute of Technology, Germany
Copyright © 2014 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
caches
concurrency
multicore
threads
Qualifiers
- abstract
Conference

Acceptance Rates
SPAA '14 Paper Acceptance Rate: 30 of 122 submissions, 25%
Overall Acceptance Rate: 447 of 1,461 submissions, 31%