Introduction to CONNJUR Workflow Builder and YesWorkflow
2017 Summer Workshop: June 29, 2017
Workflows (Wikipedia)
• A workflow consists of an orchestrated and repeatable pattern of business activity enabled by the systematic organization of resources into processes that transform materials, provide services, or process information. It can be depicted as a sequence of operations, declared as work of a person or group, an organization of staff, or one or more simple or complex mechanisms.
• From a more abstract or higher-level perspective, workflow may be considered a view or representation of real work. The flow being described may refer to a document, service or product that is being transferred from one step to another.
Workflows (Examples)
• On the first day, Frank described an iterative “workflow” in which a spectroscopist converts Varian/Bruker data into nmrPipe format, resolves ambiguities, performs preliminary processing, resolves phasing, reprocesses, and iterates until done.
• Bertram Ludäscher: ASAP
  • Automate computation
  • Scalable
  • Adaptable for reuse
  • Provenance: capture processing history and data lineage
Kepler
Ludäscher, Bertram, Ilkay Altintas, Chad Berkley, Dan Higgins, Efrat Jaeger, Matthew Jones, Edward A. Lee, Jing Tao, and Yang Zhao. 2006. “Scientific Workflow Management and the Kepler System.” Concurrency and Computation: Practice and Experience 18 (10): 1039–65.
CONNJUR Workflow Builder
M. Fenwick, G. Weatherby, J. Vyas, C. Sesanker, T. O. Martyn, H. J. Ellis, and M. R. Gryk (2015). CONNJUR Workflow Builder: a software integration environment for spectral reconstruction. J Biomol NMR, 62, 313-26.
Provenance Types (Michael Wilde, Argonne Labs)
• Prospective provenance: the specification of the workflow's procedure calls and data dependencies (acqu, workflow)
• Retrospective provenance: the recordings of when and where each procedure ran, and how each invocation behaved (acqus, reconstruction)
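The distinction can be made concrete with a minimal Python sketch. The step names (`convert`, `reconstruct`), the toy implementations, and the log format are all assumptions made for illustration; none of them come from CONNJUR or YesWorkflow. The declared plan plays the role of prospective provenance, and the log written while the plan runs plays the role of retrospective provenance.

```python
import time

# Prospective provenance: the plan, declared before anything runs.
# Step names and dependencies here are hypothetical.
PLAN = [
    {"step": "convert", "inputs": ["fid"], "outputs": ["pipe_data"]},
    {"step": "reconstruct", "inputs": ["pipe_data"], "outputs": ["spectrum"]},
]


def run_plan(plan, implementations, initial_data):
    """Execute each planned step and record what actually happened."""
    data = dict(initial_data)
    log = []  # retrospective provenance: one record per invocation
    for spec in plan:
        func = implementations[spec["step"]]
        args = [data[name] for name in spec["inputs"]]
        started = time.time()
        results = func(*args)
        finished = time.time()
        for name, value in zip(spec["outputs"], results):
            data[name] = value
        log.append({
            "step": spec["step"],
            "inputs": list(spec["inputs"]),
            "outputs": list(spec["outputs"]),
            "started": started,
            "finished": finished,
        })
    return data, log


# Toy stand-ins for real processing tools (assumed, not real NMR code).
IMPLEMENTATIONS = {
    "convert": lambda fid: (fid.upper(),),
    "reconstruct": lambda pipe_data: (pipe_data[::-1],),
}

final_data, run_log = run_plan(PLAN, IMPLEMENTATIONS, {"fid": "abc"})
```

`PLAN` can be inspected without running anything, whereas `run_log` only exists after execution and differs from run to run in its timestamps.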
YesWorkflow
Annotation system for prospective provenance from scripts
Reproducibility
Replicability vs. reproducibility, or is it the other way around?
Reproducibility, replicability, reusability, repeatability
Reproducibility (Dagstuhl Working Group)
PRIMAD
• Platform: portability vs. reproducibility (OS or hardware platforms)
• Research Objective (goal of computation)
• Implementation (Fast Fourier Transform)
• Method (Fourier Transform)
• Actors (Dagstuhl group defines agent as human)
• Data (data used in study)
Rauber, A., Braganholo, V., Dittrich, J., et al. (2016). PRIMAD – Information gained by different types of reproducibility. In Reproducibility of Data-Oriented Experiments in e-Science (Dagstuhl Seminar 16041). Freire, J., Fuhr, N., & Rauber, A., editors.
Gryk, M. & Ludäscher, B. (2017). Workflows and provenance: Towards information science solutions for the natural sciences. Library Trends, in press.
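One way to read PRIMAD is as a checklist of what changed between an original study and a rerun. The sketch below is a rough Python illustration under that reading; the field values (workstation, spectroscopist names, file name) are invented for the example and the comparison function is not part of the PRIMAD paper.

```python
# The six PRIMAD components, in the order the acronym spells them.
PRIMAD = ("platform", "research_objective", "implementation",
          "method", "actor", "data")


def changed_components(original, rerun):
    """Return the PRIMAD components whose values differ between two runs."""
    return [key for key in PRIMAD if original.get(key) != rerun.get(key)]


# Hypothetical descriptions of two runs of the same study.
original = {
    "platform": "Linux workstation",
    "research_objective": "reconstruct HNCO spectrum",
    "implementation": "Fast Fourier Transform",
    "method": "Fourier Transform",
    "actor": "spectroscopist A",
    "data": "hnco.fid",
}
rerun = dict(original, platform="macOS laptop", actor="spectroscopist B")

differences = changed_components(original, rerun)
```

Here the rerun varies only the platform and the actor, so a matching result speaks to portability across machines and people rather than to a different method or data set.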
Metadata Definition
• Definition 1: Data about Data
Workflow -> Provenance -> Reproducibility
All rely on the capture of metadata
• Definition 1: Data about Data
• Definition 2: Metadata As Surrogate
YesWorkflow
#@BEGIN main
#@IN raw_turkey @URI store:shnucks_turkey
#@OUT cooked_turkey @URI plate:delicious_turkey
  #@BEGIN survey_guests
  #@OUT food_allergies
  #@END survey_guests
  #@BEGIN brining
  #@IN raw_turkey @URI store:shnucks_turkey
  #@PARAM seasonings
  #@OUT brined_turkey
  #@END brining
  #@BEGIN weighing
  #@IN brined_turkey
  #@OUT weighed_turkey
  #@OUT weight
  #@END weighing
  #@BEGIN stuffing
  #@IN weighed_turkey
  #@IN stuffing_ingredients
  #@IN food_allergies
  #@OUT stuffed_turkey
  #@END stuffing
  #@BEGIN baking
  #@PARAM weight
  #@PARAM temperature
  #@PARAM duration
  #@IN stuffed_turkey
  #@OUT cooked_turkey @URI plate:delicious_turkey
  #@END baking
#@END main
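To show how annotations like these can yield a dataflow graph, here is a simplified Python sketch. It is not the YesWorkflow tool and does not use its API: it handles only the `@BEGIN`, `@END`, `@IN`, `@OUT` and `@PARAM` keywords, ignores `@URI`, and assumes that a block's output connects to any other block's input or parameter with the same name. The function names `parse_blocks` and `dataflow_edges` are made up for this example.

```python
import re

# Matches tags such as "#@BEGIN name" or "#@IN name"; @URI is ignored.
TAG = re.compile(r"#\s*@(BEGIN|END|IN|OUT|PARAM)\s+(\S+)", re.IGNORECASE)


def parse_blocks(script_text):
    """Collect inputs, params and outputs for each @BEGIN/@END block."""
    blocks = {}
    stack = []  # names of the currently open blocks, innermost last
    for line in script_text.splitlines():
        match = TAG.search(line)
        if not match:
            continue
        keyword, name = match.group(1).upper(), match.group(2)
        if keyword == "BEGIN":
            blocks[name] = {"in": [], "param": [], "out": []}
            stack.append(name)
        elif keyword == "END":
            stack.pop()
        elif stack:
            blocks[stack[-1]][keyword.lower()].append(name)
    return blocks


def dataflow_edges(blocks, top_level="main"):
    """Link each block that produces a name to every block that consumes it."""
    producers = {}
    for block, ports in blocks.items():
        if block != top_level:
            for name in ports["out"]:
                producers[name] = block
    edges = []
    for block, ports in blocks.items():
        if block == top_level:
            continue
        for name in ports["in"] + ports["param"]:
            if name in producers:
                edges.append((producers[name], name, block))
    return edges


EXAMPLE = """
#@BEGIN main
#@IN raw_turkey @URI store:shnucks_turkey
#@OUT cooked_turkey @URI plate:delicious_turkey
#@BEGIN brining
#@IN raw_turkey @URI store:shnucks_turkey
#@PARAM seasonings
#@OUT brined_turkey
#@END brining
#@BEGIN weighing
#@IN brined_turkey
#@OUT weighed_turkey
#@OUT weight
#@END weighing
#@END main
"""

blocks = parse_blocks(EXAMPLE)
edges = dataflow_edges(blocks)
```

Each edge is a `(producer, data_name, consumer)` triple, so `brined_turkey` links the `brining` block to the `weighing` block. Because the tags live in comments, the same scan works on a shell, Python, or R script.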
YesWorkflow
echo 'Converting from Varian to NMRPipe format'
var2pipe -in ./fid \
  -xN 1024 -yN 128 -zN 64 \
  -xT 512 -yT 64 -zT 32 \
  -xMODE Complex -yMODE States-TPPI -zMODE Rance-Kay \
  -xSW 12000 -ySW 6000 -zSW 2000 \
  -xOBS 599.5694 -yOBS 125.768 -zOBS 60.7438 \
  -xCAR 4.772 -yCAR 45 -zCAR 119 \
  -xLAB 1H -yLAB 13C -zLAB 15N \
  -ndim 3 -aq2D States \
  -out ./data/hnco%03d.pipe -verb -ov
sleep 1
echo 'Transforming 3-dimensional NMR data!'
echo 'Processing F3/F1 dimensions first'
xyz2pipe -in data/hnco%03d.pipe -x -verb \
| nmrPipe -fn SOL -mode 1 -fl 16 -fs 1 -poly \
| nmrPipe -fn CBF -last 12 \
| nmrPipe -fn SP -off 0.39 -end 0.98 -pow 2 -size 512 -c 0.5 \
| nmrPipe -fn ZF -size 2048 \
| nmrPipe -fn FT -verb \
| nmrPipe -fn PS -p0 0 -p1 0 -di \
| nmrPipe -fn EXT -left -sw -verb \
| nmrPipe -fn TP \
| nmrPipe -fn SP -off 0.39 -end 0.98 -pow 2 -size 64 -c 0.5 \
| nmrPipe -fn ZF -size 256 \
| nmrPipe -fn FT -verb \
| nmrPipe -fn PS -p0 0 -p1 0 -di \
| pipe2xyz -out ft/hnco%03d.ft2 -y
xyz2pipe -in ft/hnco%03d.ft2 -z -verb \
| nmrPipe -fn SP -off 0.39 -end 0.98 -pow 2 -size 32 -c 0.5 \
| nmrPipe -fn ZF -size 128 \
| nmrPipe -fn FT -neg -verb \
| nmrPipe -fn PS -p0 0 -p1 0 -di \
| pipe2xyz -out ft/hnco%03d.ft3 -z
bash vs. csh – the shell wars
#! /bin/csh
# My Processing Script
nmrPipe -in mydata.pipe \
| nmrPipe -fn SOL -mode 1 -fl 16 -fs 1 -poly \
| nmrPipe -fn CBF -last 12 \
| nmrPipe -fn GMB -lb -7 -gb 0.1 -size 512 -c 0.5 \
| nmrPipe -fn ZF -size 2048 \
| nmrPipe -fn FT -verb \
| nmrPipe -fn PS -p0 68.6 -p1 -34.8 -di \
| nmrPipe -fn EXT -left -sw -verb \
| nmrPipe -fn TP \
| nmrPipe -fn LP -fb -ord 30 -x1 2 -xn 128 -pred 64 -fix -fixMode 1 -after \
| nmrPipe -fn SP -off 0.39 -end 0.98 -pow 2 -size 192 -c 0.5 \
| nmrPipe -fn ZF -size 256 \
| nmrPipe -fn FT -verb \
| nmrPipe -fn PS -p0 -9.0 -p1 20.0 -di \
   -out mydata.ft2 -ov
Points to notice:
• # = comment
• Which nmrPipe tool? nmrPipe, var2pipe, xyz2pipe, etc.
• Watch for trailing spaces!
• Syntax?
• Filenames and filesystems
How much of the above information is NMR related? How much is related to nmrPipe or the computer we are using?
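The trailing-space warning matters because a backslash continues a shell line only when it is the very last character; if a space or tab follows it, the command ends there and the next line runs on its own. A small Python check for this mistake might look like the sketch below. The function name `find_broken_continuations` and the shortened sample script are assumptions for illustration.

```python
import re

# A backslash followed by spaces or tabs at the end of a line.
BROKEN_CONTINUATION = re.compile(r"\\[ \t]+$")


def find_broken_continuations(script_text):
    """Return 1-based line numbers where whitespace follows a trailing backslash."""
    return [
        number
        for number, line in enumerate(script_text.splitlines(), start=1)
        if BROKEN_CONTINUATION.search(line)
    ]


# Shortened sample script; line 2 has a space after its backslash.
SCRIPT = (
    "nmrPipe -in mydata.pipe \\\n"
    "| nmrPipe -fn SOL -mode 1 -fl 16 -fs 1 -poly \\ \n"
    "| nmrPipe -fn FT -verb \\\n"
    "   -out mydata.ft2 -ov\n"
)

bad_lines = find_broken_continuations(SCRIPT)
```

Running a check like this before executing a processing script points to the offending line directly, which is easier than spotting invisible whitespace by eye.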
Functionality and Function Order!
#! /bin/csh
# My Processing Script
nmrPipe -in mydata.pipe \
| nmrPipe -fn SOL -mode 1 -fl 16 -fs 1 -poly \
| nmrPipe -fn CBF -last 12 \
| nmrPipe -fn GMB -lb -7 -gb 0.1 -size 512 -c 0.5 \
| nmrPipe -fn ZF -size 2048 \
| nmrPipe -fn FT -verb \
| nmrPipe -fn PS -p0 68.6 -p1 -34.8 -di \
| nmrPipe -fn EXT -left -sw -verb \
| nmrPipe -fn TP \
| nmrPipe -fn LP -fb -ord 30 -x1 2 -xn 128 -pred 64 -fix -fixMode 1 -after \
| nmrPipe -fn SP -off 0.39 -end 0.98 -pow 2 -size 192 -c 0.5 \
| nmrPipe -fn ZF -size 256 \
| nmrPipe -fn FT -verb \
| nmrPipe -fn PS -p0 -9.0 -p1 20.0 -di \
   -out mydata.ft2 -ov
What procedures do we use to massage our data? What procedures do we use to transform our data?
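The order of functions can be read off a script mechanically. As a minimal sketch, the Python below joins the continued lines, splits the pipeline on `|`, and collects the value after each `-fn` flag. The function name `function_order` and the shortened sample script are assumptions for illustration, and the splitting is deliberately naive (it would break on a `|` inside a quoted argument).

```python
def function_order(script_text):
    """List the value of each -fn flag in the order the stages appear."""
    # Drop comment lines (including the #! line), then join continued lines.
    lines = [
        line for line in script_text.splitlines()
        if not line.lstrip().startswith("#")
    ]
    joined = " ".join(line.rstrip().rstrip("\\") for line in lines)
    order = []
    for stage in joined.split("|"):
        tokens = stage.split()
        if "-fn" in tokens:
            index = tokens.index("-fn")
            if index + 1 < len(tokens):
                order.append(tokens[index + 1])
    return order


# Shortened version of the processing script above.
SCRIPT = """#! /bin/csh
# My Processing Script
nmrPipe -in mydata.pipe \\
| nmrPipe -fn SOL -mode 1 -fl 16 -fs 1 -poly \\
| nmrPipe -fn ZF -size 2048 \\
| nmrPipe -fn FT -verb \\
| nmrPipe -fn PS -p0 68.6 -p1 -34.8 -di \\
   -out mydata.ft2 -ov
"""

order = function_order(SCRIPT)
```

The resulting list separates what is done, and in which order, from the flags, file names and shell syntax that surround it.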