ABSTRACT
This paper presents an analysis of how Linux's performance has evolved over the past seven years. Unlike recent works that focus on OS performance in terms of scalability or service of a particular workload, this study goes back to basics: the latency of core kernel operations (e.g., system calls and context switching). To our surprise, the study shows that the performance of many core operations has worsened or fluctuated significantly over the years. For example, the select system call is 100% slower than it was just two years ago. An in-depth analysis shows that over the past seven years, core kernel subsystems have been forced to accommodate an increasing number of security enhancements and new features. These additions not only steadily add overhead to core kernel operations but also frequently introduce extreme slowdowns of more than 100%. In addition, simple misconfigurations have severely impacted kernel performance. Overall, we find that most of the slowdowns can be attributed to 11 changes.
Some forms of slowdown are avoidable with more proactive engineering. We show that it is possible to patch two security enhancements (from the 11 changes) to eliminate most of their overheads. In fact, several features have been introduced to the kernel unoptimized or insufficiently tested and then improved or disabled long after their release.
Our findings also highlight that it is both feasible and important for Linux users to actively configure their systems to achieve an optimal balance between performance, functionality, and security: we discover that 8 out of the 11 changes can be avoided by reconfiguring the kernel, and the other 3 can be disabled through simple patches. By disabling the 11 changes with the goal of optimizing performance, we speed up Redis, Apache, and Nginx benchmark workloads by as much as 56%, 33%, and 34%, respectively.
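To make "latency of a core kernel operation" concrete, the following is a minimal sketch of a microbenchmark that times the select system call on a single empty pipe. It is not the paper's benchmark suite: the function names `median_latency_ns` and `select_on_pipe_latency_ns`, the iteration counts, and the use of the median are all assumptions made for illustration. Because it is written in Python, each sample also includes interpreter overhead, so the numbers are only meaningful when compared across kernels on the same machine and the same Python build.

```python
import os
import select
import statistics
import time


def median_latency_ns(operation, iterations=10_000, warmup=1_000):
    """Return the median wall-clock time of one call to `operation`, in ns.

    Hypothetical helper: warm-up calls are discarded so that caches and
    lazily initialized state do not skew the measured samples.
    """
    for _ in range(warmup):
        operation()
    samples = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        operation()
        samples.append(time.perf_counter_ns() - start)
    return statistics.median(samples)


def select_on_pipe_latency_ns(iterations=10_000):
    """Time select() polling one empty pipe with a zero timeout.

    The zero timeout makes select() return immediately, so each sample
    covers kernel entry, the readiness check, and the return path
    (plus Python call overhead).
    """
    read_fd, write_fd = os.pipe()
    try:
        return median_latency_ns(
            lambda: select.select([read_fd], [], [], 0), iterations
        )
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(f"select: {select_on_pipe_latency_ns():.0f} ns per call (median)")
```

Running the same script under two kernel versions, or under one kernel booted with different configurations, gives a rough relative comparison of the kind the abstract describes; a lower-level implementation would be needed for absolute figures.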