
The ALMA archive: Mark Lacy, Data Services Lead, NAASC, NRAO



  1. The ALMA archive. Mark Lacy, Data Services Lead, NAASC, NRAO. NA ALMA Development 2016

  2. Motivation
     • Papers from archival data are important. Publications using archival data account for half of the HST and Chandra publications each year.
     • ALMA archival papers are picking up; some examples (in high-z extragalactic science):
       – Fujimoto et al. 2016: faint end of ALMA source counts to 0.02 mJy.
       – Silva et al. 2015: excess of submm sources around bright WISE-selected AGN.
       – Oteo et al. 2016: number counts from calibrator fields.
       – Can save many hours of observation time spent looking at deep fields.
       – Removes issues of field-to-field variations.
     • Can point the way for some future observations without needing the TAC to be “brave” and approve an ambitious/risky proposal.
     • Could allow entire projects to be constructed, or supplemented with small amounts of new ALMA data, e.g. for a student thesis.

  3. Goals of the Archive
     • Provide access to data.
       – Traditional search/download limited by bandwidth.
       – Need to move to server-side tools for the largest ALMA datasets.
     • Provide rich metadata to allow complicated queries/data mining.
       – ALMA has a lot of data (~200 TB today), but with very low information content (~10^-4 %).
       – To change it to “Big Data” that we can mine using data science techniques, we need to extract the information from the noise. (Specific examples: source and line lists.)
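As a back-of-envelope illustration of the figures quoted above, the sketch below converts the ~200 TB archive size and ~10^-4 % information content into an absolute volume. It assumes the percentage can be read literally as a fraction of the data volume, which the slide does not state; the function name and the decimal unit convention (1 TB = 1e12 bytes) are choices made for this example only.

```python
# Figures quoted on the slide (approximate).
ARCHIVE_SIZE_TB = 200.0        # ~200 TB of data in the archive
INFO_FRACTION_PERCENT = 1e-4   # ~10^-4 % information content


def information_bytes(archive_tb: float, info_percent: float) -> float:
    """Estimate the information content in bytes.

    Assumes decimal units (1 TB = 1e12 bytes) and treats the quoted
    percentage as a plain fraction of the data volume.
    """
    return archive_tb * 1e12 * (info_percent / 100.0)


info = information_bytes(ARCHIVE_SIZE_TB, INFO_FRACTION_PERCENT)
print(f"~{info / 1e6:.0f} MB of information in ~{ARCHIVE_SIZE_TB:.0f} TB of data")
```

Under that reading, the extracted products (such as source and line lists) would amount to roughly 200 MB, which is small enough to query and mine directly.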

  4. Short-term Archive Developments
     • You may have noticed “collapsed rows” in the latest release.
       – These were needed as a prerequisite for the ingest of individual pipeline products in Cycle 4.
       – Tests of this will begin in September.
       – Current product “tar blobs” will be phased out.
     • Coming soon (during Cycle 4):
       – Footprints via Aladin Lite.
       – RMS values.
       – Upload of target lists to search on.

  5. Current projects
     • Access to data, even very large files
       – CARTA: remote visualization capability (Erik Rosolowsky).
       – Pipeline Processing Interface (PPI): remote pipeline runs to deliver calibrated measurement sets and/or images (NRAO initiative; this talk).
     • From “a lot of data” to “Big Data”
       – ADMIT: enhanced metadata production (Peter Teuben).

  6. PPI
     • The Pipeline Processing Interface will allow reruns of the ALMA pipeline. Two modes:
       – Apply existing archival calibration tables to ALMA raw data to produce calibrated measurement sets.
         • Useful if you just need calibrated uv-data.
       – Run the current ALMA pipeline version to produce calibrated measurement sets and/or images.
         • Useful for running data with a new version of the pipeline.
         • Will also allow for some pipeline parameter tweaks (the set of parameters will expand with time).
     • Initially available as part of the new NRAO archive access tool; will also be made available to all ARCs as an add-on to the request handler.

  7. RH/OODT infrastructure for the PPI
     • The PPI uses a modified ALMA request handler (RH) and “Object Oriented Data Technology” (OODT) to kick off jobs on the cluster.
     • This can be generalized to other tasks, e.g. analysis tools, visualization tasks, etc.
     • So it provides a framework for server-side deployment of software from the ALMA Development program.

  8. Creating rich metadata
     • Accurate prior information is crucial:
       – Source positions & spatial extents.
       – Source velocities/redshifts.
       – (Targeted lines.)
     • Some of this is supplied by the PI in the OT, some not.
       – Need to supplement with 3rd-party sources (SIMBAD, NED).
       – But there is a problem of validation, and of how to update.
       – We are working on it, but slowly.
     • Source/clump finders in 2D and 3D (Jeff’s talk).
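To illustrate why prior redshifts matter for line metadata, here is a minimal sketch that uses the standard relation ν_obs = ν_rest / (1 + z) to check whether a targeted line falls inside a spectral window. The function names and the window edges are hypothetical and chosen for this example; the CO(3-2) rest frequency (about 345.796 GHz) is a standard value, and nothing here describes how the archive itself computes this.

```python
def observed_frequency_ghz(rest_ghz: float, z: float) -> float:
    """Observed frequency of a line with rest frequency rest_ghz at redshift z."""
    if z <= -1.0:
        raise ValueError("redshift must be greater than -1")
    return rest_ghz / (1.0 + z)


def line_in_window(rest_ghz: float, z: float, lo_ghz: float, hi_ghz: float) -> bool:
    """True if the redshifted line lies within [lo_ghz, hi_ghz]."""
    return lo_ghz <= observed_frequency_ghz(rest_ghz, z) <= hi_ghz


CO_3_2_REST_GHZ = 345.796        # CO J=3-2 rest frequency (approximate)
WINDOW_GHZ = (114.0, 116.0)      # hypothetical spectral window edges

nu_obs = observed_frequency_ghz(CO_3_2_REST_GHZ, z=2.0)
covered = line_in_window(CO_3_2_REST_GHZ, 2.0, *WINDOW_GHZ)
print(f"CO(3-2) at z=2.0 is observed at {nu_obs:.3f} GHz; in window: {covered}")
```

With a validated redshift attached to each source, a check like this could be run over every spectral window in the archive to populate a searchable line list; with a wrong redshift, the same check silently gives the wrong answer, which is why validation of 3rd-party priors matters.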

  9. What else might we need/want?
     • Predictive searches (“because you selected this dataset you may also be interested in…”) (Barrientos, JAO).
     • Cutout server (for extracting small pieces of large cubes).
     • VO interoperability, e.g.
       – Search multiwavelength archives for other data on an ALMA field.
       – Upload ALMA pointings/footprints to e.g. DS9 or TOPCAT (SAMP).
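The core of a cutout request can be sketched as an index calculation: given the cube dimensions and a requested centre and half-width per axis, work out the clipped pixel ranges and how many bytes actually need to be served. This is only an illustrative sketch; the function names, the (channel, y, x) axis order, the cube size and the 4-byte pixel size are all assumptions, not a description of any ALMA service.

```python
from typing import Sequence, Tuple


def cutout_slices(shape: Sequence[int], center: Sequence[int],
                  half_width: Sequence[int]) -> Tuple[slice, ...]:
    """Return one slice per axis covering center +/- half_width, clipped to the cube."""
    slices = []
    for n, c, h in zip(shape, center, half_width):
        start = max(c - h, 0)
        stop = min(c + h + 1, n)
        if start >= stop:
            raise ValueError("requested cutout lies outside the cube")
        slices.append(slice(start, stop))
    return tuple(slices)


def cutout_bytes(slices: Sequence[slice], bytes_per_pixel: int = 4) -> int:
    """Number of bytes in the cutout, assuming fixed-size pixels."""
    total = bytes_per_pixel
    for s in slices:
        total *= s.stop - s.start
    return total


# Hypothetical cube with axes (channel, y, x); the x range runs off the edge
# of the cube and is clipped.
shape = (2000, 4096, 4096)
region = cutout_slices(shape, center=(1000, 2048, 4090), half_width=(50, 32, 32))
full_bytes = 4 * shape[0] * shape[1] * shape[2]
print(region)
print(f"cutout: {cutout_bytes(region)} bytes of {full_bytes} in the full cube")
```

The returned slices can index any array-like object that supports slicing, for example a memory-mapped cube, so that only the requested region is read from disk. In this example the cutout is about 1 MB out of roughly 134 GB.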

  10. Summary
     • We strongly encourage development proposals for the ALMA archive.
     • Do need to work closely with the Archive working group and the software development team in Garching to ensure successful integration of products/services.
