PROOF as a Service on the Cloud: a Virtual Analysis Facility based on the CernVM ecosystem
Dario Berzano, R. Meusel, G. Lestaris, I. Charalampidis, G. Ganis, P. Buncic, J. Blomer (CERN PH-SFT)
CHEP2013 - Amsterdam, 15.10.2013 - http://chep2013.org/contrib/308
A cloud-aware analysis facility
• IaaS: admins provide virtual clusters on geographically distributed, independent cloud providers
• SaaS: the user's workflow does not change
Virtual Analysis Facility → analysis cluster on the cloud in one click
A cloud-aware analysis facility
Clouds can be a troubled environment
• Resources are diverse → like the Grid, but at the virtual machine level
• Virtual machines are volatile → they might appear and disappear without notice
Building a cloud-aware application for HEP
• Scale promptly when resources vary → no prior pinning of the data to process to the workers
• Deal smoothly with crashes → automatic failover and clear recovery procedures
Usual Grid workflow (static job pre-splitting) ≠ cloud-aware
PROOF is cloud-aware
PROOF: the Parallel ROOT Facility
• Based on unique advanced features of ROOT
• Event-based parallelism
• Automatic merging and display of results
• Runs on batch systems and the Grid with PROOF on Demand
PROOF is interactive
• Constant control and feedback of the attached resources
• Data is not preassigned to the workers → pull scheduler
• NEW: workers can be dynamically attached to a running process
Interactivity is what makes PROOF cloud-aware
PROOF is cloud-aware
PROOF on Demand (PoD, http://pod.gsi.de): runs PROOF on top of batch systems
• Zero configuration → no system-wide installation
• Sandboxing → user crashes don't propagate to others
• Self-servicing → users can restart their own PROOF server
• Advanced scheduling → leverages the policies of the underlying WMS
A typical PoD session is sketched below.
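As an illustration, a PoD session on top of HTCondor looks roughly like the following sketch. The pod-* option names are quoted from memory of the PoD command-line interface and should be checked against `pod-submit --help`; the worker count and the RMS plugin name are assumptions.

```python
# Hedged sketch of a typical PROOF-on-Demand session on top of HTCondor.
import subprocess
import time

def run(cmd):
    """Run a shell command and return its stdout, stripped."""
    return subprocess.check_output(cmd, shell=True).decode().strip()

run("pod-server start")              # per-user PoD server: no system-wide install
run("pod-submit -r condor -n 20")    # ask the underlying WMS for 20 PoD workers

# Wait until at least a handful of workers have registered with the PoD server
while int(run("pod-info -n") or "0") < 5:
    time.sleep(10)

print("PROOF connection string:", run("pod-info -c"))
```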
PROOF is cloud-aware
Adaptive workload: a very granular (up to per-event) pull architecture copes with a nonuniform workload distribution.
• Each worker asks the master's packet generator for the next packet, processes it, and asks again until all workers are done.
[Figures: packets per worker (faster workers process more packets); worker activity stop time, with all workers completing within ~20 s of each other (mean 2287 s, RMS 16.61 s) → uniform completion time; query processing time distribution.]
A toy sketch of the pull scheduler follows.
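The effect of pull scheduling can be illustrated with a few lines of plain Python. This is a toy model of the packetizer concept, not PROOF code: the worker names and speeds are made up.

```python
# Toy illustration of PROOF's pull-based packetizer (not actual PROOF code):
# the master hands out small packets on request, so the per-worker load adapts
# automatically to the speed of each worker and all workers finish together.
N_EVENTS, PACKET_SIZE = 100000, 1000
speeds = {"fast-worker": 3.0, "slow-worker": 1.0}   # relative processing speeds
busy   = {w: 0.0 for w in speeds}                   # simulated busy time per worker
events = {w: 0   for w in speeds}                   # events processed per worker

for first_event in range(0, N_EVENTS, PACKET_SIZE): # the master's packet generator
    w = min(busy, key=busy.get)                     # the least busy worker pulls next
    busy[w]   += PACKET_SIZE / speeds[w]            # time spent on this packet
    events[w] += PACKET_SIZE

for w in sorted(speeds):
    print(w, "processed", events[w], "events, done at t =", round(busy[w], 1))
# The fast worker processes ~3x the events, yet both finish at almost the same time.
```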
PROOF is cloud-aware
Dynamic addition of workers (NEW in ROOT v5.34.10): new workers can join and offload a running process.
[Diagram: the initially available workers register with the master and are initialized in bulk; new workers auto-register while the query runs, undergo a deferred initialization, then start processing alongside the others.]
PROOF dynamic workers
The user requests N workers.
Old workflow:
• Wait until at least one worker becomes available: a bunch of workers is started
• Run the full analysis on those workers only
• The other workers become available gradually, but can only be used at the next run
New workflow:
• Wait until "some" workers are ready and run the analysis
• Additional workers join the processing as they become available
Result: minimal latency and optimal resource usage (a PyROOT sketch follows).
See the ATLAS use case: http://chep2013.org/contrib/256
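A hedged PyROOT sketch of the new workflow is shown below: the session is opened on whatever PoD workers are ready, processing starts, and workers submitted later can join the running query thanks to the dynamic-workers feature. The tree name, dataset path and selector are placeholders.

```python
# Hedged PyROOT sketch of the "new workflow" with PoD and dynamic workers
# (ROOT >= 5.34.10). Tree name, dataset path and selector are placeholders.
import subprocess
import ROOT

conn = subprocess.check_output("pod-info -c", shell=True).decode().strip()
proof = ROOT.TProof.Open(conn)        # connect to whatever workers are ready now

chain = ROOT.TChain("events")                            # placeholder tree name
chain.Add("root://eos.example.com//data/sample/*.root")  # placeholder dataset
chain.SetProof()                      # route TChain::Process() through PROOF
chain.Process("MySelector.C+")        # placeholder TSelector

# Meanwhile, "pod-submit -r condor -n 20" from another shell can add workers:
# they auto-register with the master and offload the running process.
```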
PROOF dynamic workers
[Figure: number of available workers vs time [s]. Measured time taken for 100 Grid jobs, requested at the same time, to start at various ATLAS Grid sites (CERN, CNAF, ROMA1, NAPOLI, MILANO).]
See the ATLAS talk: http://chep2013.org/contrib/256
PROOF dynamic workers
[Figure: actual time to results [s] vs total required computing time [s], comparing Grid batch jobs (with the ideal number of workers) to PROOF with pull scheduling and dynamic workers; analytically derived from the actual startup latency measurements.]
• Batch jobs: results are collected only when the late workers have finished (latencies and dead times)
• PROOF with dynamic workers: all job time is spent computing (never idle, no latencies)
PROOF is up to 30% more efficient on the same computing resources by design (analytical upper limit).
The virtual analysis facility
Components: PROOF, PoD, HTCondor, Elastiq, µCernVM, CernVM Online (contextualization, authn/authz), CernVM-FS.
• What: a cluster of µCernVMs with HTCondor → one head node plus a scalable number of workers
• How: contextualization configured on the Web → simple web interface: http://cernvm-online.cern.ch
• Who: so easy that it can even be created by end users → you can have your personal analysis facility
• When: scales up and down automatically → optimal usage of resources, fundamental when you pay for them!
The virtual analysis facility
The VAF leverages the CernVM ecosystem and HTCondor:
• µCernVM: SLC6-compatible OS on demand → see previous talk: http://chep2013.org/contrib/213
• CernVM-FS: HTTP-based cached FUSE filesystem → both the OS and the experiments' software are downloaded on demand
• CernVM Online: safe context GUI and repository → see previous talk: http://chep2013.org/contrib/185
• HTCondor: light and stable workload management system → workers auto-register to the head node: no static resource configuration (a sketch of the worker-side configuration follows)
The full stack of components is cloud-aware.
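The worker-side HTCondor configuration that enables auto-registration is small; a contextualization step could render something like the sketch below. The head-node address is a placeholder, and the exact knobs depend on the HTCondor version and on the security setup; real deployments should use proper authentication (e.g. a pool password) instead of host-based trust.

```python
# Hedged sketch: render a minimal HTCondor worker configuration so that the
# node auto-registers with the head node. HEAD_NODE is a placeholder filled in
# by the contextualization; security settings are deliberately simplified.
HEAD_NODE = "vaf-head.example.com"   # placeholder, provided by the context

condor_config = """
CONDOR_HOST = {head}
DAEMON_LIST = MASTER, STARTD
ALLOW_WRITE = {head}, $(IP_ADDRESS)
START = TRUE
""".format(head=HEAD_NODE)

with open("/etc/condor/config.d/50-vaf-worker.conf", "w") as f:
    f.write(condor_config)
```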
Elastiq queue monitor
Elastiq is a Python app that monitors the HTCondor queue and scales the cluster up or down:
• Jobs waiting in the queue for too long trigger a scale-up → new VMs are started
• Idle VMs are shut down → scale-down
• VMs are requested from the cloud controller through its EC2 interface (credentials are given securely in the context), or through CernVM Cloud, an experimental meta cloud controller that accepts scale requests and translates them to multiple clouds
Elastiq can be used on any HTCondor cluster and has a trivial configuration; a sketch of the underlying logic follows. Code available at http://bit.ly/elastiq
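The core of such a queue monitor can be sketched in a few lines. This is an illustration of the logic, not elastiq's actual code or configuration: the thresholds are made up, and start_vm()/stop_vm() are placeholders for the EC2 calls elastiq performs.

```python
# Hedged sketch of the scale-up/scale-down logic implemented by a queue monitor
# like elastiq. Thresholds are illustrative; start_vm()/stop_vm() are
# placeholders for the cloud (EC2) calls.
import subprocess
import time

WAITING_THRESHOLD_S = 300   # jobs idle longer than this trigger a scale-up
CHECK_EVERY_S = 60

def count_jobs(constraint):
    """Count the HTCondor jobs matching a ClassAd constraint."""
    out = subprocess.check_output(
        ["condor_q", "-constraint", constraint, "-format", "%d\n", "ClusterId"])
    return len(out.splitlines())

def start_vm():          # placeholder for an EC2 RunInstances call
    print("scale up: starting one more worker VM")

def stop_vm(hostname):   # placeholder for an EC2 TerminateInstances call
    print("scale down: shutting down idle VM", hostname)

while True:
    # JobStatus == 1 means "idle": jobs waiting too long trigger a scale-up
    waiting_too_long = count_jobs(
        "JobStatus == 1 && (time() - EnteredCurrentStatus) > %d" % WAITING_THRESHOLD_S)
    if waiting_too_long > 0:
        start_vm()
    # The symmetric scale-down step (finding worker nodes with no running jobs
    # via condor_status and shutting them down) is omitted here for brevity.
    time.sleep(CHECK_EVERY_S)
```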
Elastic cloud computing in action
Context creation with CernVM Online: http://cernvm-online.cern.ch
• Create a new special context
• Customize a few options
• Get the generated user-data, which is then passed to the cloud at VM creation (as sketched below)
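Once the user-data has been generated, the head node can be started on any EC2-compatible cloud. Below is a hedged boto sketch: the endpoint, port, path, credentials, image id and instance type are all placeholders for your own cloud.

```python
# Hedged sketch: boot the VAF head node on an EC2-compatible cloud, passing it
# the user-data generated with CernVM Online. Endpoint, port, path, credentials,
# image id and instance type are placeholders.
from boto.ec2.connection import EC2Connection
from boto.ec2.regioninfo import RegionInfo

region = RegionInfo(name="mycloud", endpoint="ec2.example.com")  # placeholder
conn = EC2Connection(
    aws_access_key_id="MY_ACCESS_KEY",        # placeholder credentials
    aws_secret_access_key="MY_SECRET_KEY",
    region=region, is_secure=True, port=8773, path="/services/Cloud")

user_data = open("vaf-head-context.txt").read()   # downloaded from CernVM Online

conn.run_instances(
    image_id="ami-00000000",        # placeholder µCernVM image id on your cloud
    instance_type="m1.large",       # placeholder flavour
    user_data=user_data,
    min_count=1, max_count=1)
```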
Elastic cloud computing in action
Screencast: http://youtu.be/fRq9CNXMcdI
µCernVM+PROOF startup latency
Measured the delay before the requested resources become available.
Target clouds:
• Small: OpenNebula @ INFN Torino
• Large: OpenStack @ CERN (Agile)
Note: this is not a comparison of cloud infrastructures; only the µCernVM+PROOF latencies are measured.
Test conditions:
• µCernVMs use an HTTP caching proxy → precaching via a dummy boot
• The µCernVM image is only 12 MB → image transfer time is negligible
• VMs are deployed when resources are available → rules out delays and errors due to lack of resources
Measuring the latency due to:
• µCernVM boot time
• HTCondor automatic registration of the new nodes
• PoD and PROOF reaction time
µCernVM+PROOF startup latency
Compatible results: the latency is ~6 minutes from scratch.
[Figure: time to wait for workers [m:ss] on CERN OpenStack and Torino OpenNebula.]
Measured the time elapsed between the PoD workers' request and their availability (pod-info -l); 10 VMs were started in the test.
Conclusions
Every VAF layer is cloud-aware:
• PROOF+HTCondor deal with the "elastic" addition and removal of workers
• µCernVM is very small and fast to deploy
• CernVM-FS downloads only what is needed
Consistent configuration of solid and independent components:
• No login needed to configure: everything is done via the CernVM Online context
• PROOF+PoD also work dynamically on the Grid
• Elastiq can scale any HTCondor cluster; it is not PROOF-specific
• Existing components were reused wherever possible
Thank you for your attention!