Abstract
Complex storage stacks that provide data compression, indexing, and analytics help derive insights from the massive amounts of data generated today. It is challenging, however, to perform this computation while fully utilizing the underlying storage media: although storage servers with large core counts are widely available, single-core performance and memory bandwidth per core grow more slowly than the core count per die. Computational storage offers a promising solution to this problem by utilizing dedicated compute resources along the storage processing path. We present DeltaFS Indexed Massive Directories (IMDs), a new approach to computational storage. DeltaFS IMDs harvest available (i.e., not dedicated) compute, memory, and network resources on the compute nodes of an application to perform computation on data. We demonstrate the efficiency of DeltaFS IMDs by using them to dynamically reorganize the output of a real-world simulation application across 131,072 CPU cores. DeltaFS IMDs speed up reads by 1,740× while only slightly slowing down the writing of data during simulation I/O for in situ data processing.