e-Science Introduction Eric Yen e-Science Workshop, March 2011
Outline • Workshop Overview • e-Science Basics • Landscape of e-Science • Application Development Concept • Security Infrastructure • Exemplar Applications
e-Science Workshop Overview • Objectives • Help user communities take advantage of the global DCI – World Wide Grid • Foster close collaboration among regional user communities and with the Grid community • Target Audience • Both users and Grid/e-Science engineers • This is also a good opportunity for novices to understand e-Science, application development, related technology and the collaboration. • Two workshops, on Natural Disaster Mitigation and on Life Science, are arranged.
e-Science Basics • “e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it. ... e-Science will change the dynamic of the way science is undertaken.” • By John Taylor, former Director General of Research Councils UK • Vision: a globally connected scholarly community promoting the highest quality scientific research • e-Science refers to either computationally intensive science or data intensive science that is carried out in a highly distributed computing environment. • WLCG, EGEE, EGI, TeraGrid, OSG, EUAsiaGrid, ...
e-Infrastructure/Cyberinfrastructure • Driven by Data Deluge • Turning data into insight and a knowledge base efficiently • Open, consistent and well-designed data formats, interfaces, protocols and quality code • Searchability, accessibility and sustainability • Resources and tools are shared across disciplines • Enable Service-Oriented Science • “scientific research enabled by distributed networks of interoperating services” • New e-Infrastructure is required to host both the data and services
Data Centric Sciences • Data Deluge: moving toward the exascale era • Data is inherently distributed • Data is produced in large quantities • Data is produced at a very high rate • Data is needed by many people • More complicated data management required • Data has complex interrelations • Data has many free parameters • Data Integration • Co-Scheduling, Streaming, Caching, & Replication • Mass Collaboration • Large Scale Computing
The Changing Nature Of Research
• Thousand years ago: Description of natural phenomena
• Last few hundred years: Newton's laws, Maxwell's equations ...
• Last few decades: Simulation of complex phenomena
• Today and the Future (e-Science): Unify theory, experiment and simulation with large multidisciplinary data • Using data exploration and data mining (from instruments, sensors, humans ...) • Distributed communities
Terabyte to Petabyte (2008)
• RAM time to move: 2.5 minutes (terabyte); ~2 days (petabyte)
• 1GB WAN move time: 10 minutes (terabyte); 6 days (petabyte)
• Disk cost: 2 disks = $200 (terabyte); 2000 disks + 42 units (SATA) + 5 racks = $500000 (petabyte)
• Disk power: 20 Watts (terabyte); 50 Kilowatts (petabyte)
• Disk weight: 2 Kg (terabyte); 5.5 Tonnes (petabyte)
• Disk footprint: inside machine (terabyte); 4 m² (petabyte)
Source: P. Kunst et al., ADSSS, 2009
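The WAN rows of the table can be sanity-checked with simple arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes), which the slide does not state. It derives the sustained throughput implied by each quoted move time. The function names are illustrative only.

```python
# Assumption: decimal units; the slide does not say whether TB/PB are decimal or binary.
TB = 10**12
PB = 10**15


def implied_throughput(size_bytes, seconds):
    """Sustained rate (bytes/s) needed to move size_bytes in the given time."""
    return size_bytes / seconds


def transfer_time(size_bytes, bytes_per_second):
    """Time in seconds to move size_bytes at a constant rate."""
    return size_bytes / bytes_per_second


# Move times quoted in the table: 10 minutes per terabyte, 6 days per petabyte.
tb_rate = implied_throughput(TB, 10 * 60)
pb_rate = implied_throughput(PB, 6 * 24 * 3600)

print(f"terabyte row implies {tb_rate / 1e9:.2f} GB/s")
print(f"petabyte row implies {pb_rate / 1e9:.2f} GB/s")

# At the terabyte-row rate, a petabyte would take about a week.
pb_days = transfer_time(PB, tb_rate) / 86400
print(f"1 PB at the terabyte-row rate: {pb_days:.2f} days")
```

Both rows imply a sustained rate of somewhat under 2 GB/s, so the jump from minutes to days comes from the 1000-fold growth in volume, not from a slower link.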
Distributed Computing Infrastructure for e-Science • Enabling collaboration to realize that the whole is greater than the sum of its parts • WWG realized the global e-Infrastructure to share resources over the Internet • Cloud offers versatile granularity and new usage patterns to the DCI services • Granularity: service-oriented layers in infrastructure, platform, software, data, network, etc. • Usage pattern: on-demand elasticity • More user-customized and user-controlled environment on remote resources Source: Mário Campolargo, European Commission - DG INFSO, OGF 23, Barcelona, June 2008
The Grid • “a software infrastructure that enables flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources.” • Foster, Kesselman and Tuecke • Features • No central control • Production quality • Open standards and open architecture
Data-Driven Multiscale Collaborations for Complexity: Great Challenges of 21st Century • Multiscale Collaborations • General Relativity, Particles, Geosciences, Bio, Social... • And all combinations... • Science and Society being transformed by CI and Data • Completely new methodologies • “The End of Science” (as we know it) • CI plays central role • No community can attack challenges • Technical, CS, social issues to solve • Places requirements on computing, software, networks, tools, etc. *Small groups still important! Source: Ed Seidel
e-Infrastructures underpinning a creativity machine… “We humans have built a creativity machine. It’s the sum of three things: a few hundred million of computers, a communication system connecting those computers, and some millions of human beings using those computers and communications.” Vernor Vinge (Nature, Vol 440, March 2006)
Future perspectives for e-Infrastructures • e-Infrastructures in transition • Towards infrastructure-as-a-service • From connectivity and grids to an integrated offer involving networks, data, all computing and software • Progressive and disparate involvement of users • Governance and financial models in evolution • What role for innovation? • More emphasis on Scientific Data Infrastructures • International dimension continues to be important • Enabling open Science, research and innovation
EGI-InSPIRE • Integrated Sustainable Pan-European Infrastructure for Researchers in Europe • [...] leads [...] countries in Asia to join • A 4-year project with €25M EC contribution – Project cost €70M – Activity cost ~€330M • EGI – European Grid Initiative • Deploying Technology Innovation: distributed computing continues to evolve (Grids, Desktops, Virtualisation, Clouds) • Enabling Software Innovation: provide reliable persistent technology platforms (today: tools built on gLite/UNICORE/ARC) • Supporting Research Innovation: infrastructure for data driven research; support for international research (e.g. ESFRI) www.egi.eu EGI-InSPIRE RI-261323
e-Science in Asia • Diversity • Geographically large and culturally diverse in nature • Level of scientific collaboration often reflected by network connectivity • The region as a whole traditionally inexperienced in regional collaboration • Grids and Clouds in Asia • Inhomogeneous Grids and Clouds with limited operations experience, making collaboration difficult. • Why e-Science in Asia? • Global infrastructure is establishing quickly • Take advantage of sharing and collaboration to bridge the gap between Asia and the world • To address the challenge of regional cooperation • EGEE/EUAsiaGrid have helped building the unseen Regional Collaboration. One hopes many others will happen soon!
Enabling Grids for E-sciencE • Excellent Progress • Exceeded Expectations • [...] role was key for the project success • Application domains: Computational Chemistry • Social Science • Bioinformatics and Biomedical • High Energy Physics • Mitigation of natural disasters
Features of Distributed Applications • Interoperability: Work across multiple distributed resources • Distributed Scale-Out: Utilize multiple distributed resources concurrently • Extensibility: Support new patterns/abstractions, programming models, functionality & infrastructure • Adaptivity: Respond to fluctuations of dynamic resource and availability of dynamic data • Simplicity: Accommodate distributed concerns at different levels easily Source: SAGA
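The "Distributed Scale-Out" feature can be illustrated with a minimal local sketch. It is not SAGA's API: the site names and the `run_on_resource` function are hypothetical stand-ins, and local threads play the role of remote resources. It only shows the pattern of splitting work, running the pieces concurrently, and combining the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_resource(resource, chunk):
    """Hypothetical stand-in for a remote job: sums its chunk locally."""
    return sum(chunk)


def scale_out(resources, data):
    """Split data round-robin over the resources and process chunks concurrently."""
    n = len(resources)
    chunks = [data[i::n] for i in range(n)]
    with ThreadPoolExecutor(max_workers=n) as pool:
        partials = list(pool.map(run_on_resource, resources, chunks))
    # Combine the partial results returned by each resource.
    return sum(partials)


# Hypothetical site names; any list of identifiers would do.
total = scale_out(["site-a", "site-b", "site-c"], list(range(100)))
print(total)
```

In a real deployment each call would submit a job to a remote site, and the Adaptivity feature would add retry or failover when a site becomes unavailable.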
Middleware Services for Grid App
• Example applications: Seismic Wave Propagation Simulation and Hazard Mapping; Weather Simulation; HEP; Drug Discovery
• Need to explore in more detail the requirements and scientific workflows
• Application: GUI, CLI or Portal, application packages, together with client services
• Collective (application-specific): Application-specific services, such as checkpointing, job management, failover, staging, distributed data discovery and backup, workflow engine, customized services, etc.
• Collective (generic): Resource discovery, resource brokering, system monitoring, community authorization, certificate revocation
• Resource: Access to computation, data; access to information about resource matchmaking, system structure, status, and performance.
• Connectivity: Communication (IP), service discovery, authentication, authorization, delegation
• Fabric: Storage system, computers, networks, code repositories, catalogs
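The resource discovery and brokering named in the generic Collective layer can be sketched as a simple matchmaking step. This is a toy model under stated assumptions, not the interface of any real middleware. The `Resource` and `Job` types, their fields, and the "most free CPUs wins" policy are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    """Hypothetical view of a site as published by an information service."""
    name: str
    free_cpus: int
    free_storage_gb: int


@dataclass
class Job:
    """Hypothetical job requirements."""
    cpus: int
    storage_gb: int


def discover(resources: List[Resource], job: Job) -> List[Resource]:
    """Resource discovery: keep only the sites that satisfy the job's requirements."""
    return [r for r in resources
            if r.free_cpus >= job.cpus and r.free_storage_gb >= job.storage_gb]


def broker(resources: List[Resource], job: Job) -> Optional[Resource]:
    """Resource brokering: pick the matching site with the most free CPUs, if any."""
    return max(discover(resources, job), key=lambda r: r.free_cpus, default=None)


sites = [
    Resource("site-a", free_cpus=4, free_storage_gb=500),
    Resource("site-b", free_cpus=16, free_storage_gb=200),
    Resource("site-c", free_cpus=32, free_storage_gb=50),
]

chosen = broker(sites, Job(cpus=8, storage_gb=100))
print(chosen.name if chosen else "no matching resource")
```

Here `site-c` has the most CPUs but too little storage, so the broker settles on `site-b`. A production broker would also weigh queue length, data locality, and authorization.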
Grid and Cloud Logical Architecture
• Application domains: Earth Sciences, Environ. Change, Social Science, Life Science
• Dynamic Computing Model (Application Environment)
• EMI Stacks API
• Distributed Resource Management & Services: Job Management Service, Data Service
• VM & Dynamic Resource Management
• Hardware Fabric
• Security, Information, Accounting & Monitor