- Anurag Acharya, Mustafa Uysal, and Joel Saltz. 1998. Active Disks: Programming Model, Algorithms and Evaluation. In Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VIII). 81--91.Google ScholarDigital Library
- J. Ahn, S. Hong, S. Yoo, O. Mutlu, and K. Choi. 2015. A scalable processing-in-memory accelerator for parallel graph processing. In 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA). 105--117. Google ScholarDigital Library
- J. Ahn, S. Yoo, O. Mutlu, and K. Choi. 2015. PIM-enabled instructions: A low-overhead, locality-aware processing-in-memory architecture. In 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA). 336--348.Google Scholar
- B. Akin, F. Franchetti, and J. C. Hoe. 2016. HAMLeT Architecture for Parallel Data Reorganization in Memory. IEEE Micro 36, 1 (Jan 2016), 14--23. Google ScholarDigital Library
- H. Asghari-Moghaddam, A. Farmahini-Farahani, K. Morrow, J. H. Ahn, and N. S. Kim. 2016. Near-DRAM Acceleration with Single-ISA Heterogeneous Processing in Standard Memory Modules. IEEE Micro 36, 1 (Jan 2016), 24--34. Google ScholarDigital Library
- J. L. Baer. 1976. Multiprocessing Systems. IEEE Trans. Comput. 25, 12 (Dec. 1976), 1271--1277. Google ScholarDigital Library
- R. Balasubramonian, J. Chang, T. Manning, J. H. Moreno, R. Murphy, R. Nair, and S. Swanson. 2014. Near-Data Processing: Insights from a MICRO-46 Workshop. IEEE Micro 34, 4 (July 2014), 36--42. Google ScholarCross Ref
- Francisco J Ballesteros, Noah Evans, Charles Forsyth, Gorka Guardiola, Jim McKie, Ron Minnich, and Enrique Soriano-Salvador. 2012. Nix: A case for a manycore system for cloud computing. Bell Labs Technical Journal 17, 2 (2012), 41--54.Google ScholarDigital Library
- Antonio Barbalace and Anthony Iliopoulos. 2017. Address Space and Executable Formats, Such Old Topics!. In Proceedings of the 7th Workshop on Multicore and Rack-scale Systems (MaRS'17).Google Scholar
- Antonio Barbalace, Rob Lyerly, Christopher Jelesnianski, Anthony Carno, Ho-ren Chuang, and Binoy Ravindran. 2017. Breaking the Boundaries in Heterogeneous-ISA Datacenters. In Proceedings of the 22th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '17). Google ScholarDigital Library
- Antonio Barbalace, Binoy Ravindran, and David Katz. 2014. Popcorn: a Replicated-kernel OS Based on Linux. In In Proceedings of Ottawa Linux Symposium (OLS '14).Google Scholar
- Antonio Barbalace, Marina Sadini, Saif Ansary, Christopher Jelesnianski, Akshay Ravichandran, Cagil Kendir, Alastair Murray, and Binoy Ravindran. 2015. Popcorn: Bridging the Programmability Gap in heterogeneous-ISA Platforms. In Proceedings of the Tenth European Conference on Computer Systems (EuroSys '15). 29:1--29:16. Google ScholarDigital Library
- Luiz Barroso, Mike Marty, David Patterson, and Parthasarathy Ranganathan. 2017. Attack of the Killer Microseconds. Commun. ACM 60, 4 (March 2017), 48--54. Google ScholarDigital Library
- Andrew Baumann, Paul Barham, Pierre-Evariste Dagand, Tim Harris, Rebecca Isaacs, Simon Peter, Timothy Roscoe, Adrian Schupbach, and Akhilesh Singhania. 2009. The Multikernel: A New OS Architecture for Scalable Multicore Systems. In Proceedings of the ACM SIGOPS 22Nd Symposium on Operating Systems Principles (SOSP '09). 29--44. Google ScholarDigital Library
- Tony M. Brewer. 2010. Instruction Set Innovations for the Convey HC-1 Computer. IEEE Micro 30, 2 (March 2010), 70--79. Google ScholarDigital Library
- P. Cappelletti. 2015. Nonvolatile memory evolution and revolution. In 2015 IEEE International Electron Devices Meeting (IEDM). 10.1.1--10.1.4. Google ScholarCross Ref
- CCIX Consortium. 2017. Cache Coherent Interconnect for Accelerators (CCIX). http://www.ccixconsortium.com/. (2017).Google Scholar
- David Chisnall. 2014. There's No Such Thing As a General-purpose Processor. Queue 12, 10 (Oct. 2014).Google ScholarDigital Library
- Computer Systems Laboratory at Sungkyunkwan University. 2016. The OpenSSD Project. http://www.openssd-project.org/. (2016).Google Scholar
- Henry Cook, Miquel Moreto, Sarah Bird, Khanh Dao, David A. Patterson, and Krste Asanovic. 2013. A Hardware Evaluation of Cache Partitioning to Improve Utilization and Energy-efficiency While Preserving Responsiveness. In Proceedings of the 40th Annual International Symposium on Computer Architecture (ISCA '13). 308--319. Google ScholarDigital Library
- Jonathan Corbet. 2012. Toward better NUMA scheduling. https://lwn.net/Articles/486858/. (march 2012).Google Scholar
- Brett D. Fleisch and Mark Allan A. Co. 1997. Workplace Microkernel and OS: A Case Study. (1997).Google Scholar
- Vladimir Davydov. 2015. idle memory tracking. https://lwn.net/Articles/643578/. (2015).Google Scholar
- Matthew DeVuyst, Ashish Venkat, and Dean M. Tullsen. 2012. Execution Migration in a heterogeneous-ISA Chip Multiprocessor. In Proceedings of the Seventeenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XVII). 261--272. Google ScholarDigital Library
- Jeff Draper, Jacqueline Chame, Mary Hall, Craig Steele, Tim Barrett, Jeff LaCoss, John Granacki, Jaewook Shin, Chun Chen, Chang Woo Kang, Ihn Kim, and Gokhan Daglikoca. 2002. The Architecture of the DIVA Processing-in-memory Chip. In Proceedings of the 16th International Conf on Super computing (ICS '02). 14--25. Google ScholarDigital Library
- Hadi Esmaeilzadeh, Emily Blem, Renee St. Amant, Karthikeyan Sankaralingam, and Doug Burger. 2011. Dark Silicon and the End of Multicore Scaling. In Proceedings of the 38th Annual International Symposium on Computer Architecture (ISCA '11).365--376. Google ScholarDigital Library
- Paolo Faraboschi, Kimberly Keeton, Tim Marsland, and Dejan Milojicic. 2015. Beyond Processor-centric Operating Systems. In Proceedings of the 15th USENIX Conference on Hot Topics in Operating Systems (HOTOS'15).Google ScholarDigital Library
- Brad Fitzpatrick. 2017. memcached - a distributed memory object caching system. http://memcached.org/. (2017).Google Scholar
- Mingyu Gao, Grant Ayers, and Christos Kozyrakis. 2015. Practical Near-Data Processing for In-Memory Analytics Frameworks. In Proceedings of the 2015 International Conference on Parallel Architecture and Compilation (PACT) (PACT '15). 113--124. Google ScholarDigital Library
- Alain Gefflaut, Trent Jaeger, Yoonho Park, Jochen Liedtke, Kevin J. Elphinstone, Volkmar Uhlig, Jonathon E. Tidswell, Luke Deller, and Lars Reuther. 2000. The SawMill Multiserver Approach. In Proceedings of the 9th Workshop on ACM SIGOPS European Workshop: Beyond the PC: New Challenges for the Operating System (EW 9). 109--114. Google ScholarDigital Library
- Gen-Z Consortium. 2017. Gen-Z -- A New Approach to Data Access. http://genzconsortium.org/. (2017).Google Scholar
- Simon Gerber, Gerd Zellweger, Reto Achermann, Kornilios Kourtis, Timothy Roscoe, and Dejan Milojicic. 2015. Not Your Parents' Physical Address Space. In Proceedings of the 15th USENIX Conference on Hot Topics in Operating Systems (HOTOS'15).Google ScholarDigital Library
- B. Gerofi, M. Takagi, A. Hori, G. Nakamura, T. Shirasawa, and Y. Ishikawa. 2016. On the Scalability, Performance Isolation and Device Driver Transparency of the IHK/McKernel Hybrid Lightweight Kernel. In 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS). 1041--1050. Google ScholarCross Ref
- M. J. Gonzalez and C. V. Ramamoorthy. 1972. Parallel Task Execution in a Decentralized System. IEEE Trans. Comput. C-21, 12 (Dec 1972). Google ScholarDigital Library
- Boncheol Gu, Andre S. Yoon, Duck-Ho Bae, Insoon Jo, Jinyoung Lee, Jonghyun Yoon, Jeong-Uk Kang, Moonsang Kwon, Chanho Yoon, Sangyeun Cho, Jaeheon Jeong, and Duckhyun Chang. 2016. Biscuit: A Framework for Near-data Processing of Big Data Workloads. In Proceedings of the 43rd International Symposium on Computer Architecture (ISCA '16). 153--165. Google ScholarDigital Library
- Maurice Herlihy and Nir Shavit. 2008. The Art of Multiprocessor Programming. Morgan Kaufmann Publishers Inc.Google Scholar
- D. Katz, A. Barbalace, S. Ansary, A. Ravichandran, and B. Ravindran. 2015. Thread Migration in a Replicated-Kernel OS. In 2015 IEEE 35th International Conference on Distributed Computing Systems. 278--287. Google ScholarCross Ref
- Sangman Kim, Seonggu Huh, Yige Hu, Xinya Zhang, Emmett Witchel, Amir Wated, and Mark Silberstein. 2014. GPUnet: Networking Abstractions for GPU Programs. In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI'14). 201--216.Google Scholar
- Ben Leslie. 2006. GrailOS: A micro-kernel based, multi-server, multi-personality operating system. In Proceedings of the 2nd International Workshop on Object Systems and Software Architectures. Victor Harbor, South Australia, Australia.Google Scholar
- Tong Li, Dan Baumberger, David A. Koufaty, and Scott Hahn. 2007. Efficient Operating System Scheduling for Performance-asymmetric Multi-core Architectures. In Proceedings of the 2007 ACM/IEEE Conf. on Supercomputing (SC '07). Google ScholarDigital Library
- Felix Xiaozhu Lin, Zhen Wang, and Lin Zhong. 2014. K2: A Mobile Operating System for Heterogeneous Coherence Domains. In Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '14). 285--300. Google ScholarDigital Library
- Gang Lu, Jianfeng Zhan, Xinlong Lin, Chongkang Tan, and Lei Wang. 2016. On Horizontal Decomposition of the Operating System. CoRR abs/1604.01378 (2016).Google Scholar
- Marco Minutoli, S Kuntz, Antonino Tumeo, and P Kogge. 2015. Implementing radix sort on emu 1. In In the 3rd Workshop on Near-Data Processing (WoNDP), Waikiki, Hawaii.Google Scholar
- Sparsh Mittal. 2016. A Survey of Techniques for Architecting and Managing Asymmetric Multicore Processors. ACM Comput. Surv. 48, 3 (Feb. 2016), 45:1--45:38.Google Scholar
- Jeffrey C. Mogul, Andrew Baumann, Timothy Roscoe, and Livio Soares. 2011. Mind the Gap: Reconnecting Architecture and OS Research. In Proceedings of the 13th USENIX Conference on Hot Topics in Operating Systems (HotOS'13).Google Scholar
- R. Nair, S. F. Antao, C. Bertolli, P. Bose, J. R. Brunheroto, T. Chen, C. Y. Cher, C. H. A. Costa, J. Doi, C. Evangelinos, B. M. Fleischer, T. W. Fox, and et al. 2015. Active Memory Cube: A processing-in-memory architecture for exascale systems. IBM Journal of Research and Development 59, 2/3 (March 2015), 17:1--17:14.Google ScholarDigital Library
- Edmund B. Nightingale, Orion Hodson, Ross McIlroy, Chris Hawblitzel, and Galen Hunt. 2009. Helios: Heterogeneous Multiprocessing with Satellite Kernels. In Proceedings of the ACM SIGOPS 22Nd Symposium on Operating Systems Principles (SOSP '09). 221--234.Google ScholarDigital Library
- G.J. Nutt. 1977. A Parallel Processor Operating System Comparison. IEEE Transactions on Software Engineering 3, undefined (1977), 467--475.Google Scholar
- OpenCAPI Consortium. 2017. Welcom to OpenCAPI Consortium. http://opencapi.org/. (2017).Google Scholar
- David Patterson, Thomas Anderson, Neal Cardwell, Richard Fromm, Kimberly Keeton, Christoforos Kozyrakis, Randi Thomas, and Katherine Yelick. 1997. A case for intelligent RAM. IEEE Micro 17, 2 (1997), 34--44. Google ScholarDigital Library
- Bharath Pichai, Lisa Hsu, and Abhishek Bhattacharjee. 2014. Architectural Support for Address Translation on GPUs: Designing Memory Management Units for CPU/GPUs with Unified Address Spaces. In Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '14). 743--758. Google ScholarDigital Library
- Jason Power, Mark D Hill, and David A Wood. 2014. Supporting x86-64 address translation for 100s of GPU lanes. In High Performance Computer Architecture (HPCA), 2014 IEEE 20th International Symposium on. IEEE, 568--578.Google ScholarCross Ref
- C. Ramey. 2011. TILE-Gx100 ManyCore processor: Acceleration interfaces and architecture. In 2011 IEEE Hot Chips 23 Symposium (HCS). 1--21.Google ScholarCross Ref
- redislab. 2017. redis -- open source data object store. http://redis.io. (2017).Google Scholar
- Tamara Schmitz. 2014. The Rise of Serial Memory and the Future of DDR. (2014).Google Scholar
- Sudharsan Seshadri, Mark Gahagan, Sundaram Bhaskaran, Trevor Bunker, Arup De, Yanqin Jin, Yang Liu, and Steven Swanson. 2014. Willow: A User-programmable SSD. In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI'14). 67--80.Google ScholarDigital Library
- Mark Silberstein, Bryan Ford, Idit Keidar, and Emmett Witchel. 2013. GPUfs: Integrating a File System with GPUs. In Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '13). 485--498. Google ScholarDigital Library
- Hayden Kwok-Hay So. 2007. Borph: An Operating System for Fpga-based Re configurable Computers. Ph.D. Dissertation. Advisor(s) Brodersen, Robert.Google Scholar
- Hung-Wei Tseng, Qianchen Zhao, Yuxiao Zhou, Mark Gahagan, and Steven Swanson. 2016. Morpheus: Creating Application Objects Efficiently for Heterogeneous Computing. 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA) 00 (2016), 53--65.Google Scholar
- Jan van Lunteren. 2016. Programmable Near-Memory Acceleration on ConTutto. (2016).Google Scholar
- David Wentzlaff and Anant Agarwal. 2009. Factored Operating Systems (Fos): The Case for a Scalable Operating System for Multicores. SIGOPS Oper. Syst. Rev. 43, 2 (April 2009), 76--85. Google ScholarDigital Library
- Robert W. Wisniewski, Todd Inglett, Pardo Keppel, Ravi Murty, and Rolf Riesen. 2014. mOS: An Architecture for Extreme-scale Operating Systems. In Proceedings of the 4th International Workshop on Runtime and Operating Systems for Supercomputers (ROSS '14). 2:1--2:8.Google ScholarDigital Library
- Wm. A. Wulf and Sally A. McKee. 1995. Hitting the Memory Wall: Implications of the Obvious. SIGARCH Comput. Archit. News 23, 1 (March 1995).Google ScholarDigital Library
- Yi-Ping You, Hen-Jung Wu, Yeh-Ning Tsai, and Yen-Ting Chao. 2015. VirtCL: A Framework for OpenCL Device Abstraction and Management. In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2015). 161--172. Google ScholarDigital Library
- N. Zilberman, Y. Audzevich, G. A. Covington, and A. W. Moore. 2014. NetFPGA SUME: Toward 100 Gbps as Research Commodity. IEEE Micro 34, 5 (Sept 2014), 32--41. Google ScholarCross Ref
Index Terms
- It's Time to Think About an Operating System for Near Data Processing Architectures
Recommendations
Thoth, a portable real-time operating system
Thoth is a real-time operating system which is designed to be portable over a large set of machines. It is currently running on two minicomputers with quite different architectures. Both the system and application programs which use it are written in a ...
The Linux Operating System
The enormous consumer market for IBM PCs and compatibles has made them affordable. Now, with a free operating system called Linux, these inexpensive machines can be converted into powerful workstations for teaching, research, and software development. ...
Comments