Research article
DOI: 10.1145/2834976.2834979

Tackling the reproducibility problem in storage systems research with declarative experiment specifications

Published: 15 November 2015

ABSTRACT

Validating experimental results in the field of storage systems is a challenging task, mainly due to the many changes in software and hardware that computational environments go through. Determining if an experiment is reproducible entails two separate tasks: re-executing the experiment and validating the results. Existing reproducibility efforts have focused on the former, envisioning techniques and infrastructures that make it easier to re-execute an experiment. In this position paper, we focus on the latter by analyzing the validation workflow that an experiment re-executioner goes through. We notice that validating results is done on the basis of experiment design and high-level goals, rather than exact quantitative metrics. Based on this insight, we introduce a declarative format for specifying the high-level components of an experiment as well as describing generic, testable conditions that serve as the basis for validation. We present a use case in the area of distributed storage systems to illustrate the usefulness of this approach.
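The full declarative format is defined in the paper itself; as a rough illustration of the idea summarized above, the Python sketch below shows how an experiment's high-level components and generic, testable validation conditions might be written down and checked against results from a re-execution. Everything here is a hypothetical placeholder rather than the paper's actual format: the field names, the choice of system and workload, and the throughput numbers are invented for the example.

```python
# Hypothetical sketch (not the paper's actual format): a declarative experiment
# specification whose validation criteria are generic, testable conditions over
# observed results, rather than exact quantitative metrics to be matched.

def per_node_throughput_is_stable(results, tolerance=0.10):
    """Per-node throughput stays within `tolerance` of the smallest-cluster baseline."""
    baseline = results[0]["throughput_mb_s"] / results[0]["num_storage_nodes"]
    return all(
        abs(r["throughput_mb_s"] / r["num_storage_nodes"] - baseline) / baseline <= tolerance
        for r in results
    )

def throughput_grows_with_cluster_size(results):
    """Aggregate throughput does not decrease as storage nodes are added."""
    ordered = sorted(results, key=lambda r: r["num_storage_nodes"])
    return all(a["throughput_mb_s"] <= b["throughput_mb_s"]
               for a, b in zip(ordered, ordered[1:]))

# High-level, declarative description of the experiment (hypothetical fields and values).
spec = {
    "name": "scalability-of-read-throughput",
    "system": {"storage": "ceph", "version": "0.94"},
    "workload": {"benchmark": "sequential-read", "clients": 16},
    "independent_variable": "num_storage_nodes",
    # Testable conditions encoding the experiment's goals; a re-execution is
    # considered validated if every condition holds for the new results.
    "conditions": [
        ("per-node throughput within 10% of baseline", per_node_throughput_is_stable),
        ("aggregate throughput scales with cluster size", throughput_grows_with_cluster_size),
    ],
}

def validate(spec, results):
    """Evaluate every declared condition against results from a re-execution."""
    return [(description, check(results)) for description, check in spec["conditions"]]

# Made-up measurements standing in for a re-executed run.
observed = [
    {"num_storage_nodes": 1, "throughput_mb_s": 58.0},
    {"num_storage_nodes": 2, "throughput_mb_s": 113.0},
    {"num_storage_nodes": 4, "throughput_mb_s": 221.0},
]

for description, passed in validate(spec, observed):
    print("PASS" if passed else "FAIL", "-", description)
```

The point of the sketch is the one made in the abstract: the conditions capture the experiment's design and goals (here, scalability behavior), so a re-execution on different hardware can still be judged successful even when the raw numbers differ.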

Published in

PDSW '15: Proceedings of the 10th Parallel Data Storage Workshop
November 2015, 59 pages
ISBN: 978-1-4503-4008-3
DOI: 10.1145/2834976

Copyright © 2015 ACM


Publisher

Association for Computing Machinery
New York, NY, United States



Acceptance Rates

PDSW '15 paper acceptance rate: 9 of 25 submissions (36%)
Overall acceptance rate: 17 of 41 submissions (41%)
