A Better Interface Between Scientists and Data Reduction Software

  1. A Better Interface Between Scientists and Data Reduction Software
     B. Nikolic
     Astrophysics Group, Cavendish Laboratory, University of Cambridge
     http://www.mrao.cam.ac.uk/~bn204/
     ALMA Software Development Workshop, NRAO/Charlottesville, October 2011

  2. Outline
     ◮ Introduction
     ◮ Examples and Derived Proposals
         ◮ Operation Folding
         ◮ Dependency tracking
         ◮ Intermediate product tracking
         ◮ Data reduction branches
     ◮ Theory
     ◮ Implementing a better interface
     ◮ Summary

  3. Aims for this talk
     This talk will aim to convince you:
     ◮ We can make data reduction significantly easier, faster and more reliable
     ◮ We can do this relatively easily
     ◮ This is likely to be worth doing
     I would like to take home:
     ◮ Feedback on the ideas and approach presented
     ◮ Is it worth doing?
     ◮ Good project for the ALMA development programme?

  4. Data reduction, software, and pipelines
     ◮ This talk is relevant to data reduction like we currently do for aperture synthesis interferometry:
         1. Environments like CASA or AIPS+Python wrappers
         2. Iterative flagging/calibration/imaging
         3. Large datasets, expensive to move around and expensive to process
         4. Mostly about command interface ≡ Language
     ◮ A fully commissioned pipeline that delivers reduced calibrated data will remove the need to develop this for ALMA! (No interaction, no interface needed!)
     ◮ However, some of the ideas may be useful in the development of a pipeline too

  5. Automatised data reduction/Human decision making
     We can’t solve all data reduction problems, so let’s give everybody an easy interface to solve them themselves (1)
     (1) This particular paraphrasing was inspired by slides from one of David Nolen’s presentations

  6. Automatised data reduction/Human decision making
     ◮ This is how I would approach reducing a significant quantity of observations from ALMA
     ◮ All the ‘tools’ are there, but linking them up is too time consuming/difficult
     ◮ Currently mostly just ideas, with little implementation yet
     ◮ Aiming to develop a small, scaled-back prototype implementation for analysis of WVR testing data
     ◮ A full implementation would likely require funding from the ALMA development programme

  7. A better interface can be:
     1. Faster
         ◮ Less wall-clock time
         ◮ Less scientist’s time
         ◮ Fewer computational resources
     2. More reliable
         ◮ Fewer opportunities for user error
         ◮ Easier to make fully repeatable
         ◮ Easier to review by reading the script
     3. More communicable
         ◮ The data reduction script can be used to communicate what needs to be done to other people as well as to the computer

  8. Why?
     ◮ Much more data/observations/spectral lines/fields per radio astronomer! Can we keep up?
     ◮ Barriers to understanding and doing aperture synthesis must be minimised: ‘we’ll do it for you’ is not a solution
     ◮ In some aperture synthesis experiments there is no single ‘right’ way of doing the reduction, so peers must be able to easily repeat and adjust our reduction
     ◮ With the new generation of telescopes it is much cheaper to move data reduction ‘scripts’/‘recipes’ and products than the visibility data

  9. Straw-man requirements
     1. Commands should be designed to best communicate to other scientists what needs to be done
     2. Trying out different parameters/commands should be easy and efficient; the interface should recognise that there is no single ‘correct’ result
     3. Concise
     4. Efficient, fast

  12. Simple flagging-based example
     Note:
     ◮ I use flagging here for illustration only
     ◮ Similar principles apply to many other operations

  13. Flagging fragment
     A fragment of an ALMA data reduction script:

         # Python/CASA
         vis = "mydata.ms"
         flagdata(vis=vis, autocorr=True)
         flagdata(vis=vis, mode='shadow', diameter=12.0)
         flagdata(vis=vis, antenna='DV04')

     This likely causes three complete iterations through the data. Why:
     ◮ The interface is fully procedural
     ◮ Each flagdata only knows about itself: it doesn’t know it is followed by another similar command
     If Input/Output limited ⇒ big performance penalty
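The size of the penalty can be estimated with a back-of-the-envelope model. The symbols are assumptions introduced here for illustration: $S$ is the size of the data set, $B$ the sustained read bandwidth, and $k$ the number of commands that each make one complete pass. Caching and compute time are ignored.

```latex
T_{\text{separate}} \approx k\,\frac{S}{B}, \qquad
T_{\text{single pass}} \approx \frac{S}{B}, \qquad
\frac{T_{\text{separate}}}{T_{\text{single pass}}} \approx k
```

With purely illustrative values $S = 1\,\mathrm{TB}$ and $B = 100\,\mathrm{MB/s}$, one pass takes about $10^{4}\,\mathrm{s}$ (roughly 2.8 hours), so the $k = 3$ flagging commands above would take about 8.3 hours when run separately.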

  14. Operation folding ‘by hand’
     Compare to the following hypothetical command:

         # Python / Something like CASA
         vis = "mydata.ms"
         flagdata(vis, [{'autocorr': True},
                        {'mode': 'shadow', 'diameter': 12.0},
                        {'antenna': 'DV04'}])

     ◮ All three operations have been ‘folded’ into a single command
     ◮ flagdata can execute all of them in a single iteration through the data set
     Drawbacks:
     1. The user must decide what commands to fold and when
     2. The interaction differs between issuing single commands and writing a script
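The saving from folding can be shown with a toy model. This is a minimal sketch, not CASA code: `CountingRows`, `flag_one_by_one` and `flag_folded` are hypothetical names, the rows are made-up records, and the two lambdas only stand in for two of the flagging criteria.

```python
import copy


class CountingRows:
    """Hypothetical data-set wrapper that counts how many rows are read."""

    def __init__(self, rows):
        self.rows = rows
        self.reads = 0

    def __iter__(self):
        for row in self.rows:
            self.reads += 1
            yield row


def flag_one_by_one(data, predicates):
    """Procedural style: one complete pass per flagging criterion."""
    for predicate in predicates:
        for row in data:
            if predicate(row):
                row["flag"] = True


def flag_folded(data, predicates):
    """Folded style: every criterion is tested on each row in a single pass."""
    for row in data:
        if any(predicate(row) for predicate in predicates):
            row["flag"] = True


rows = [
    {"ant1": "DV01", "ant2": "DV01", "flag": False},  # autocorrelation
    {"ant1": "DV01", "ant2": "DV04", "flag": False},  # involves DV04
    {"ant1": "DV02", "ant2": "DV03", "flag": False},  # should stay unflagged
]
predicates = [
    lambda r: r["ant1"] == r["ant2"],            # stands in for autocorr=True
    lambda r: "DV04" in (r["ant1"], r["ant2"]),  # stands in for antenna='DV04'
]

separate = CountingRows(copy.deepcopy(rows))
folded = CountingRows(copy.deepcopy(rows))
flag_one_by_one(separate, predicates)
flag_folded(folded, predicates)

print(separate.reads, folded.reads)  # 6 3
print(separate.rows == folded.rows)  # True
```

Both styles produce the same flags; only the number of row reads differs, which is the quantity that matters when the reduction is I/O limited.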

  15. Folding multiple operations?
     But maybe there is also a benefit in combining the application of calibration and flagging?

         # Python / Something like CASA
         vis = "mydata.ms"
         gencommand(vis, [{'op': 'flagdata', 'autocorr': True},
                          {'op': 'flagdata', 'mode': 'shadow', 'diameter': 12.0},
                          {'op': 'flagdata', 'antenna': 'DV04'},
                          {'op': 'applycal', 'caltable': ['myvis.bpass',
                                                          'myvis.W']}
                         ])

     It is clear where this is going:

         # Python / Something like CASA
         gencommand('myscript.py')

     Back to square one!
     ⇒ The ‘script’ must be in a non-procedural language
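A generic command of this kind amounts to a small dispatcher over per-row steps. The following is a minimal sketch under stated assumptions: `toy_gencommand`, the operation names `flag_antenna` and `apply_gain`, and the row layout are all hypothetical, and the single scalar gain is a drastic simplification of applying calibration tables.

```python
def make_flag_antenna(spec):
    """Build a per-row step that flags rows involving spec['antenna']."""
    def step(row):
        if spec["antenna"] in (row["ant1"], row["ant2"]):
            row["flag"] = True
    return step


def make_apply_gain(spec):
    """Build a per-row step that scales the visibility by spec['gain']."""
    def step(row):
        row["vis"] *= spec["gain"]
    return step


# Hypothetical operation names; real commands would have many more parameters.
STEP_BUILDERS = {
    "flag_antenna": make_flag_antenna,
    "apply_gain": make_apply_gain,
}


def toy_gencommand(rows, ops):
    """Run every requested operation on each row in one pass over the rows."""
    steps = [STEP_BUILDERS[spec["op"]](spec) for spec in ops]
    for row in rows:
        for step in steps:
            step(row)


rows = [
    {"ant1": "DV01", "ant2": "DV04", "vis": 1.0, "flag": False},
    {"ant1": "DV02", "ant2": "DV03", "vis": 2.0, "flag": False},
]
toy_gencommand(rows, [
    {"op": "flag_antenna", "antenna": "DV04"},
    {"op": "apply_gain", "gain": 0.5},
])
print(rows[0]["flag"], rows[1]["flag"], rows[1]["vis"])  # True False 1.0
```

Folding works in this toy because each step touches only the current row. An operation that needs statistics over the whole data set could not be folded this way without extra passes.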

  16. Proposal
     Operations are automatically re-ordered and folded to optimise performance:

         # Python/CASA
         vis = "mydata.ms"
         flagdata(vis=vis, autocorr=True)
         flagdata(vis=vis, mode='shadow', diameter=12.0)
         flagdata(vis=vis, antenna='DV04')

     ⇒ Automatic translation (‘re-writing’) ⇒

         # Python / Something like CASA / User does not see this
         vis = "mydata.ms"
         flagdata(vis, [{'autocorr': True},
                        {'mode': 'shadow', 'diameter': 12.0},
                        {'antenna': 'DV04'}])

     ⇒ Execution!
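One way such a re-writing step might look is deferred execution: commands are recorded rather than run, and adjacent compatible commands are merged before anything touches the data. This is a minimal sketch of that idea, not the implementation proposed in the talk. `DeferredSession` and its methods are hypothetical, and the sketch only folds adjacent calls; re-ordering would additionally need dependency information between commands.

```python
class DeferredSession:
    """Hypothetical recorder: commands are queued, not executed immediately."""

    def __init__(self):
        self.queue = []

    def flagdata(self, vis, **params):
        """Record a flagging request in the same form the user would type it."""
        self.queue.append(("flagdata", vis, params))

    def folded(self):
        """Merge adjacent calls of the same command on the same data set."""
        plan = []
        for name, vis, params in self.queue:
            if plan and plan[-1][0] == name and plan[-1][1] == vis:
                plan[-1][2].append(params)      # fold into the previous command
            else:
                plan.append((name, vis, [params]))
        return plan


session = DeferredSession()
vis = "mydata.ms"
session.flagdata(vis=vis, autocorr=True)
session.flagdata(vis=vis, mode="shadow", diameter=12.0)
session.flagdata(vis=vis, antenna="DV04")

plan = session.folded()
print(len(plan))   # 1
print(plan[0][2])
```

The user still writes three ordinary procedural calls; the folded plan holds a single `flagdata` entry with three parameter sets, which a back end could then execute in one pass.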
