The African Open Science Platform: ICT Infrastructure in Support of Data Sharing. Presented by Ina Smith, Project Manager, African Open Science Platform, Academy of Science of South Africa (ASSAf). WACREN 2018 Conference, 15 March 2018
Data Driven World
Square Kilometre Array (SKA) • Data collection on a massive scale • Telescope array to consist of 250,000 radio antennas between Australia & SA • Investment in machine learning and artificial intelligence software tools to enable data analysis • 400+ engineers and technicians in infrastructure, fibre optics, data collection • Supercomputers to process data (IBM) • To come: supercomputer with 3x the power of the world’s current fastest computer (Tianhe-2) to cope with SKA data
“Construction of the SKA is due to begin in 2018 and finish sometime in the middle of the next decade. Data acquisition will begin in 2020, requiring a level of processing power and data management know-how that outstretches current capabilities. Astronomers estimate that the project will generate 35,000-DVDs-worth of data every second. This is equivalent to ‘the whole world wide web every day,’ said Fanaroff.”
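To make the quoted estimate easier to compare with network and storage capacities, the sketch below converts "35,000 DVDs every second" into bytes. The DVD capacity is an assumption (a single-layer disc of 4.7 GB, decimal units); the quote does not say which disc size the astronomers had in mind, so the result is an order-of-magnitude illustration only.

```python
# Back-of-envelope conversion of the quoted SKA estimate into a data rate.
DVDS_PER_SECOND = 35_000        # figure quoted in the text
BYTES_PER_DVD = 4.7e9           # ASSUMPTION: single-layer DVD, 4.7 GB (decimal)
SECONDS_PER_DAY = 24 * 60 * 60


def ska_bytes_per_second(dvds_per_second=DVDS_PER_SECOND,
                         bytes_per_dvd=BYTES_PER_DVD):
    """Return the estimated raw data rate in bytes per second."""
    return dvds_per_second * bytes_per_dvd


def to_terabytes(num_bytes):
    """Convert bytes to decimal terabytes (1 TB = 1e12 bytes)."""
    return num_bytes / 1e12


def to_exabytes(num_bytes):
    """Convert bytes to decimal exabytes (1 EB = 1e18 bytes)."""
    return num_bytes / 1e18


if __name__ == "__main__":
    rate = ska_bytes_per_second()
    print(f"Rate: {to_terabytes(rate):.1f} TB/s")
    print(f"Daily volume: {to_exabytes(rate * SECONDS_PER_DAY):.1f} EB/day")
```

Under this assumption the rate works out to roughly 164.5 TB per second, or about 14.2 EB per day. A dual-layer disc size would roughly double both figures.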
H3ABioNet (H3Africa) 30 institutions, 15 African countries, 2 partners outside Africa
• African genomic research; central node at University of Cape Town • Using NetMap to monitor connectivity • Data transfer: Africa Globus Online (668,622 files transferred between Rhodes University & UCT; 140 TB data transferred from USA to SA) • Challenges: slow & unstable Internet, unreliable power supply, continent-wide obsolete computer infrastructure that varies from medium-scale server infrastructure to a small number of workstations with multiple operating systems, lack of centralized, secure data storage • Other: database of participants (H3APRDB, REDCap), data analysis incl. Galaxy, Job Management System, eBiokits, REDCap, WebProtege, pipelines for data execution, data repository (European Genome-Phenome Archive)
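The connectivity challenge listed above can be illustrated with a rough transfer-time estimate for the 140 TB moved from the USA to SA. This is a minimal sketch: the link speeds and the `efficiency` parameter are hypothetical values chosen for illustration, not measurements from H3ABioNet, and the function name `transfer_days` is ours.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def transfer_days(data_bytes, link_bits_per_second, efficiency=1.0):
    """Estimate days needed to move data_bytes over a link.

    efficiency is the fraction of nominal bandwidth actually sustained
    (1.0 = ideal; real long-haul links usually achieve less).
    """
    if link_bits_per_second <= 0:
        raise ValueError("link_bits_per_second must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = data_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    data_bytes = 140e12  # 140 TB from the text, taken as decimal terabytes
    # ASSUMPTION: illustrative nominal link speeds, not actual NREN figures.
    for label, bps in [("100 Mbit/s", 100e6),
                       ("1 Gbit/s", 1e9),
                       ("10 Gbit/s", 10e9)]:
        print(f"{label}: {transfer_days(data_bytes, bps):.1f} days")
```

Even on an ideal, fully utilised 1 Gbit/s link the transfer takes about 13 days, which is why unstable connectivity and power interruptions weigh so heavily on genomic data sharing.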
Open Science Defined “Open Science is the practice of science in such a way that others can collaborate and contribute, where research data, lab notes and other research processes are freely available, under terms that enable reuse, redistribution and reproduction of the research and its underlying data and methods.” - FOSTER Project, funded by the European Commission
Benefits of open data • Provide evidence for research conducted • Collaboration advances science, discovery • Predict trends & informed decisions • Drive development, service delivery • More entrepreneurs – using data in innovative ways, create jobs • Have potentially far more outcomes when open, higher impact • Democratising research & data towards achieving 2030 Sustainable Development Goals
Open Data, Open Science & Research Lifecycle (Foster)
[Figure: research data lifecycle annotated with Repositories, Policy & Infrastructure, Plan, Tools, Gold/Green OA, Repositories. Original Research Data Lifecycle image from University of California, Santa Cruz http://guides.library.ucsc.edu/datamanagement/]
“Several open science activities are underway across Africa, but a great deal will be gained if, in the context of developing inter-regional links, these activities were to be coordinated and developed through such a coordinating initiative.” - CODATA
Open Data Repositories (re3data - 16)
http://opendatabarometer.org/?_year=2016&indicator=ODB
https://index.okfn.org/place/#map
African Open Science Platform http://africanopenscience.org.za/ • Platform = opportunity to engage in dialogue, create awareness, connect all, provide continental view • Funded by SA Dept. of Science & Technology through National Research Foundation • 3 years (1 Nov. 2016 – 31 Oct. 2019) • Managed by Academy of Science of South Africa (ASSAf) • Through ASSAf hosting ICSU Regional Office for Africa (ICSU ROA) • Direction from CODATA
Accord on Open Data in a Big Data World • Proposes comprehensive set of principles • FAIR Principles • Values of open data in emerging scientific culture of big data • Need for an international framework • Provides framework & plan for African data science capacity mobilization initiative • Call to Endorse
Key Stakeholders • Global Network of Science Academies (IAP) • International Council for Science (ICSU) • The World Academy of Sciences (TWAS) • Research Data Alliance (RDA) • NRENs (Internet Service Providers for Education) • Association of African Universities (AAU) • Network of African Science Academies (NASAC) • African Research Councils (incl. DIRISA, funders) • African Universities • African Governments • Other
Database Experts & Initiatives 800+
Landscape Survey: Countries & Initiatives 567
Concentration of Activities
Click to view Initiatives/Country https://www.targetmap.com/viewer.aspx?reportId=56245 Please note: this is just a preview and data still to be cleaned and updated and corrected.
AOSP Focus Areas • Capacity Building • Policy • Incentives • Infrastructure
Infrastructure Framework • Purpose: Create awareness & guide development of a cyber-infrastructure strategy & action plan, promote policies & strategies • NRENs – Level 6: Elaborated Service Offering (An NREN Capability Maturity Model – Duncan Greaves, 2015, Tertiary Education Network) • Richly connected at high speed to many other networks/resources • Deep culture of collaboration
Proposed NREN Service Catalogue in support of Data • Grid & cloud computing resources/middleware – access: • Scientific applications, complex data sets, computing facilities • User controlled light paths, videoconferencing, federated identity services, security, data storage and archives, connecting e-resources e.g. electron & astronomical microscopes, medical imaging, simulators, sensor networks, accelerators, supercomputers, state-of-the-art affordable bandwidth on demand, computing power, capacity building, dedicated point-to-point Internet Protocol circuits, data storage (data centres)
• Disciplines: Engineering, IT, Economics, Physics, Biology, Environmental Studies, Public Health, Town Planning (Smart Cities), Population Studies • Research Areas: Climate change, environmental impact, extreme weather events, biodiversity, food security, malaria, infectious diseases and pandemics
Data in Africa • Tunisian Computing Centre el Khawarizmi manages Data Centre • Kenya Education Network (KENET) provides access to domain names, data center, cloud computing & science gateways, capacity building, security services • Data Intensive Research Initiative for South Africa (DIRISA) – component of SA National Cyber-Infrastructure System • Open Data for Africa platform (African Development Bank (AfDB)) – to boost access to quality data for managing & monitoring development results in African countries, incl. African Action Plan 2063 & 2030 SDGs
• High Performance Computing (HPC): Botswana, Lesotho, Mozambique, SA, Tanzania, Zambia, Zimbabwe • South Africa: Data Intensive Research Cloud Infrastructure Initiatives – ARC, SADIRC, Ilifu (cloud for researchers working in astronomy and bioinformatics in Western Cape & research data management system)
Africa Data Consensus Study • Adopted in March 2015 at High Level Conference on Data Revolution • Strategy for implementing data revolution in Africa • Plan of action to be guided by United Nations Economic Commission for Africa (UNECA), African Union Commission (AUC), African Development Bank (AfDB), supported by UN Development Programme (UNDP), UN Population Fund (UNFPA) • Implemented in collaboration with partner institutions from public & private sectors, civil society organisations
SADC Cyber-Infrastructure Framework • Towards strategy and action plan, implementation plan and governance structure • Support strategic plans on Science, Technology, Innovation • Guide on creating an enabling environment to harness science, technology and innovation • Impact socio-economic development & industrialization • Enhance education in developing & using technologies • Support collaborative research development & innovation
• Cyber-infrastructure is a key driver for a knowledge-based economy • Comprises technologies, skills, people and policies which support generation, analysis, transport, sharing, stewardship of information (incl. data) • Framework provides Roadmap towards Cyber-infrastructure Strategy
Services offered by UbuntuNet NRENs [Source: Colin Wright SADC/ET-ST1/1/2016/11 Document]
Components • Research and Education Networks (RENs) • Computation resources & services (HPC etc) • Data – tools & facilities to enable efficient data driven discoveries, technologies, innovations • HR-capacity development to enable: • CI specialists to roll-out services & infrastructure • Beneficiaries to fully benefit from CI services • Policies to enable optimum establishment & utilization of CI