ABSTRACT
Exponential increases in data volumes and velocities are overwhelming finite human capabilities. Continued progress in science and engineering demands that we automate a broad spectrum of currently manual research data manipulation tasks, from data transfer and sharing to acquisition, publication, and analysis. These needs are particularly evident in large-scale experimental science, in which researchers are typically granted short periods of instrument time and must maximize experiment efficiency as well as output data quality and accuracy. To address the need for automation, which is pervasive across science and engineering, we present our experiences using trigger-action programming to automate a real-world scientific workflow. We evaluate our methods by applying them to a neuroanatomy application in which a synchrotron is used to image centimeter-scale mouse brains at sub-micrometer resolution. In this use case, data are acquired in real time at the synchrotron and automatically passed through a complex automation flow that involves reconstruction using HPC resources, human-in-the-loop coordination, and finally data publication and visualization. We describe the lessons learned from these experiences and outline the design for a new research automation platform.