www.tugraz.at
WISSEN · TECHNIK · LEIDENSCHAFT
Science 2.0 VU
Big Science, e-Science and E-Infrastructures + Bibliometric Network Analysis
Elisabeth Lex, KTI, TU Graz
WS 2015/16
Agenda
• Repetition from last time: altmetrics / altmetrics in practice
• Big Data and Science
• E-Science
• E-Infrastructures
• Bibliometric Network Analysis
• Your Assignment!
Altmetrics (repetition)
"Altmetrics is the creation and study of new metrics based on the Social Web for analyzing and informing scholarship" - Altmetrics Manifesto, http://altmetrics.org/about
• Aggregated from many sources (e.g. Twitter, Mendeley, GitHub, SlideShare, ...)
• Article Level Metrics (ALM)
  • Multidimensional suite of transparent and established metrics at article level
Examples for Altmetrics sources (repetition)
• Usage
  • Views, downloads, ...
• Captures
  • Bookmarks, readers, ...
• Mentions
  • Blog posts, news stories, Wikipedia articles, comments, reviews
• Social Media
  • Tweets, Google+, Facebook likes, shares, ratings
• Citations
  • Web of Science, Scopus, Google Scholar, ...
Examples: Altmetric.com
Source: http://www.altmetric.com/details.php?domain=www.altmetric.com&citation_id=843656
Lessons learned (repetition)
• Alternative ways to assess the impact of various scientific outputs
• No common understanding of altmetrics yet
  • What do they really express?
  • Are they useful, and for which part of the research process?
• Not necessarily "better" metrics
  • E.g. Gamification
• Can help to get an overview of a research field
  • Visualizations based on altmetrics
Modern Science: What has changed?
• 150 years later: searching for new particles such as the Higgs boson with the Large Hadron Collider
• Built in collaboration with over 10,000 scientists and engineers from over 100 countries and hundreds of universities and laboratories
• Housed in a tunnel 27 km in circumference, as deep as 175 m, near Geneva
Motivation
• The Internet and science disciplines (e.g. physical sciences, biological sciences, medicine, and engineering) generate large and complex datasets (Big Data)
  • These require more advanced database and architectural support
• A "new kind of research methodology" has emerged: the fourth paradigm of scientific exploration (Hey, 2007)
  • Based on statistical exploration of large amounts of data
http://www.ksi.mff.cuni.cz/astropara/
Data intensive scientific discovery
http://research.microsoft.com/en-us/collaboration/fourthparadigm/4th_paradigm_book_complete_lr.pdf
Example: Big Data in Science - European Exascale Projects
Exascale computing: computers capable of at least one exaflops (10^18 floating point operations per second)
→ Not yet achieved, currently 10^15
http://exascale-projects.eu
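To put the two orders of magnitude on this slide in proportion, here is a small calculation. The workload size is a made-up assumption chosen only for illustration; the two rates are the ones quoted on the slide.

```python
# Rates quoted on the slide, in floating point operations per second.
EXAFLOPS = 10**18   # exascale target
PETAFLOPS = 10**15  # level the slide gives as "currently"

def speedup(target_flops, current_flops):
    """How many times faster the target rate is than the current one."""
    return target_flops / current_flops

def runtime_seconds(total_operations, flops):
    """Idealised runtime: total work divided by sustained rate."""
    return total_operations / flops

# Hypothetical workload of 10^21 operations (an assumption, not from the slides).
workload = 10**21

print(speedup(EXAFLOPS, PETAFLOPS))          # 1000.0
print(runtime_seconds(workload, PETAFLOPS))  # 1000000.0 seconds
print(runtime_seconds(workload, EXAFLOPS))   # 1000.0 seconds
```

The idealised runtime ignores communication and I/O overheads, which in practice keep real machines below their peak rate.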
Publications as Big Data
Cross-Journal Recommendation based on Click Streams [Bollen et al., 2009]
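As a rough illustration of how click streams could drive cross-journal recommendation, the sketch below counts how often readers move from one journal to another within a session and recommends the most frequent successors. This is a simplified stand-in, not the method of Bollen et al.; the session data and journal names are hypothetical.

```python
from collections import Counter, defaultdict

def transition_counts(sessions):
    """Count journal-to-journal moves within each reader session."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            if current != following:  # ignore clicks that stay in the same journal
                counts[current][following] += 1
    return counts

def recommend(counts, journal, k=2):
    """Return up to k journals most often visited right after `journal`."""
    return [name for name, _ in counts.get(journal, Counter()).most_common(k)]

# Hypothetical click-stream sessions: each list is the sequence of journals one reader visited.
sessions = [
    ["Journal A", "Journal B", "Journal C"],
    ["Journal A", "Journal B"],
    ["Journal A", "Journal C"],
    ["Journal B", "Journal C"],
]

counts = transition_counts(sessions)
print(recommend(counts, "Journal A", k=1))  # ['Journal B']
```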
e-Science
• Large scale science (since 1999)
• Data-driven discovery
• Focus on computationally intensive science and how to tackle it using highly distributed environments in a collaborative manner
• Powerful computers: supercomputers, High Performance Computing (HPC), Grid, ...
• Distributed Computing
• Powerful research infrastructures: "e-infrastructures", grids, clouds
http://www.anandtech.com/show/6421/inside-the-titan-supercomputer-299k-amd-x86-cores-and-186k-nvidia-gpu-cores/3
Supercomputers
• Large, expensive systems, usually housed in a single room, in which multiple processors are connected by a fast local network
• Suited for highly complex, real-time applications and simulation
Pros: data can move between processors rapidly → all processors can work together on the same tasks
Cons: expensive to build and maintain; do not scale well, e.g. adding more processors is challenging
http://www.wikihow.com/Build-a-Supercomputer
http://www.top500.org/lists/2014/06/
Distributed Computing
• Systems in which processors are not necessarily located close to one another (they can even be housed on different continents) but are connected via the Internet or other networks
• Pros: much less expensive than supercomputers
• Cons: slower than supercomputers
Example: Hadoop
• Ecosystem of tools for processing big data
• Simple computational model
  • Two-stage method for processing large amounts of data
  • Design an algorithm that operates on one chunk of the data in two stages (a Map and a Reduce stage); MapReduce automatically distributes that algorithm across the cluster → complexity is hidden in the framework
http://hadoop.apache.org
http://architects.dzone.com/articles/how-hadoop-mapreduce-works
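The two-stage model described above can be imitated in a few lines of plain Python. This is a single-process sketch of the programming model only, not Hadoop's API: the function names are illustrative, and the in-memory `shuffle` step stands in for the grouping and distribution that the framework performs across the cluster.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: turn one chunk of text into (key, value) pairs."""
    for word in chunk.lower().split():
        yield word, 1

def shuffle(pairs):
    """Group values by key; in Hadoop the framework does this between the two stages."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine all values that share a key."""
    return key, sum(values)

def run_job(chunks):
    """Run map over every chunk, group by key, then reduce each group."""
    pairs = (pair for chunk in chunks for pair in map_phase(chunk))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

# Each string plays the role of one chunk of a larger dataset.
print(run_job(["big data big science", "data science"]))
# {'big': 2, 'data': 2, 'science': 2}
```

The author of the job writes only `map_phase` and `reduce_phase`; everything in `run_job` and `shuffle` corresponds to work the framework takes over.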
Hadoop in eScience: Example: Astronomical Image Processing
• Large telescopes survey the sky over a prolonged period of time
• Large Synoptic Survey Telescope (LSST)
  • Under construction
  • Will capture half of the sky over 10 years
  • 30 TB of data every night
  • ~60 PB in 10 years
• Astronomers pick out faint objects for study by capturing multiple images of the same area and combining them ("coaddition")
• Challenge: how to organize and process all the resulting data
http://www.lsst.org/lsst/
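The two LSST volume figures can be related with a little arithmetic. The sketch below uses only the numbers from the slide and assumes decimal units (1 PB = 1000 TB); the slide does not say how the nightly and ten-year figures are reconciled, so the implied number of observing nights is a derived value, not a stated fact.

```python
TB_PER_PB = 1000  # assumption: decimal units

def implied_nights(total_pb, tb_per_night):
    """Number of nights needed to reach total_pb at tb_per_night."""
    return total_pb * TB_PER_PB / tb_per_night

def volume_pb(nights, tb_per_night):
    """Total volume in PB after the given number of nights."""
    return nights * tb_per_night / TB_PER_PB

nights = implied_nights(total_pb=60, tb_per_night=30)
print(nights)                 # 2000.0 nights in total
print(nights / 10)            # 200.0 nights per year over 10 years
print(volume_pb(3650, 30))    # 109.5 PB if data were taken on all 3650 nights
```

Under these assumptions, the two slide figures agree if data is recorded on roughly 200 nights per year rather than every night.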
Using Hadoop to help with image coaddition
http://escience.washington.edu/get-help-now/astronomical-image-processing-hadoop
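Coaddition fits the map/reduce pattern naturally: the map step keys each exposure by the sky region it covers, and the reduce step combines all exposures of the same region. The sketch below is a deliberately simplified illustration, not the pipeline from the linked project: it assumes exposures are already aligned and of equal size, uses a plain pixel-wise mean, and the `tile` and `pixels` field names are hypothetical.

```python
from collections import defaultdict

def map_exposure(exposure):
    """Map: key an exposure by the sky tile it covers."""
    return exposure["tile"], exposure["pixels"]

def coadd(images):
    """Reduce: pixel-wise mean of equally sized, already aligned images."""
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [
        [sum(image[r][c] for image in images) / n for c in range(cols)]
        for r in range(rows)
    ]

def run_coaddition(exposures):
    """Group exposures by tile, then coadd each group."""
    by_tile = defaultdict(list)
    for exposure in exposures:
        tile, pixels = map_exposure(exposure)
        by_tile[tile].append(pixels)
    return {tile: coadd(images) for tile, images in by_tile.items()}

# Hypothetical 2x2 exposures: two of tile "t1", one of tile "t2".
exposures = [
    {"tile": "t1", "pixels": [[1.0, 2.0], [3.0, 4.0]]},
    {"tile": "t1", "pixels": [[3.0, 2.0], [1.0, 0.0]]},
    {"tile": "t2", "pixels": [[5.0, 5.0], [5.0, 5.0]]},
]
print(run_coaddition(exposures)["t1"])  # [[2.0, 2.0], [2.0, 2.0]]
```

A real pipeline would also have to register the images against each other and weight them by quality before combining; those steps are omitted here.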
Virtual Science Environments
• Not only HPC but also the sharing of knowledge and data is becoming a requirement for scientific discovery
  • Provide useful mechanisms to facilitate this sharing
  • Preserve and organize research data
→ Virtual Science Environments: "virtual environments in which researchers work together through ubiquitous, trusted and easy access to services for scientific data, computing and networking, enabled by e-Infrastructures"
Defining e-Infrastructures
European e-Infrastructure Reflection Group (e-IRG):
"The term e-Infrastructure refers to this new research environment in which all researchers, whether working in the context of their home institutions or in national or multinational scientific initiatives, have shared access to unique or distributed scientific facilities (including data, instruments, computing and communications), regardless of their type and location in the world."
http://www.e-irg.eu/about-e-irg.html
e-Infrastructures - Goals
• Opening access to knowledge through reliable, distributed and participatory data e-infrastructures
• Cost-effective infrastructures for preservation and curation for re-use of data
• Persistent availability of information, and linking people and data through flexible and robust digital identifiers
• Interoperability for consistency of approaches on global data exchange (e.g. standards)
• Enabling trust through authentication and authorisation mechanisms
http://cordis.europa.eu/fp7/ict/e-infrastructure/docs/framework-for-action-in-h2020_en.pdf
Example: e-Infrastructure OpenAIRE
• The European Open Access Data Infrastructure for Scholarly and Scientific Communication
• Functionality:
  • Harvests and stores information about publications from various repositories (OAI-PMH)
  • Enables searching for publications and related information (e.g. funding, ...)
  • Provides a list of OA repositories that can be used to store publications
    • Orphan repository
  • Shows statistics of stored data
https://www.openaire.eu
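To make the harvesting step more concrete, here is a minimal sketch of how an OAI-PMH client might build a `ListRecords` request and read titles from a response. The verb `ListRecords`, the `metadataPrefix` value `oai_dc` and the XML namespaces come from the OAI-PMH and Dublin Core specifications. The base URL, the record identifier and the sample response are hypothetical, and no network request is made. This is not OpenAIRE's own harvester.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces used by OAI-PMH responses carrying Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc", **extra):
    """Build a ListRecords request URL for a repository's OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix, **extra}
    return f"{base_url}?{urlencode(params)}"

def parse_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        results.append((identifier, title))
    return results

# Hypothetical response, shortened to the elements used above.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repo.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://repo.example.org/oai"))
# https://repo.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc
print(parse_records(SAMPLE))
# [('oai:repo.example.org:1', 'Sample record')]
```

A full harvester would additionally follow the protocol's `resumptionToken` to page through large result sets.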
OpenAIRE - Applications
Example: e-Infrastructures Austria 1/2
http://www.e-infrastructures.at
Example: e-Infrastructures Austria 2/2
Take away message
• Big Science / e-Science: data-driven, large scale science
• Supercomputers and distributed computing
• Virtual research environments
• e-Infrastructures
Bibliometric Network Analysis
Bibliometrics
• Quantitative study of all kinds of bibliographic data
  • Patterns of authorship, publications, citations
  • E.g. citation analysis of research outputs/publications
• Assess research impact of individuals, groups, institutions
• Measuring by author (h-index), article (PLOS), or publication (Journal Impact Factor)
• Measure of output, not quality (quantitative, not qualitative!)
• Other measures could include funding received, number of patents, awards granted, or qualitative measures such as peer review
17/04/2015 Maynooth University
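Two of the measures named above are simple enough to sketch in code. The h-index is the largest h such that an author has h papers with at least h citations each. The two-year Journal Impact Factor divides the citations received in a year by items from the two preceding years by the number of citable items published in those two years. The citation counts below are made-up illustrative values.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

def impact_factor(citations_to_prior_two_years, citable_items_prior_two_years):
    """Two-year impact factor: citations this year to items from the two
    preceding years, divided by the citable items published in those years."""
    return citations_to_prior_two_years / citable_items_prior_two_years

# Hypothetical citation counts for one author's five papers.
print(h_index([10, 8, 5, 4, 3]))   # 4
print(h_index([25, 8, 5, 3, 3]))   # 3
# Hypothetical journal: 200 citations to 80 citable items.
print(impact_factor(200, 80))      # 2.5
```

The second h-index example shows why these are measures of output rather than quality: one very highly cited paper barely changes the result.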