CISE Overview and Big Data

Suzi Iacono, CISE Directorate, National Science Foundation. SI^2 Workshop, January 17, 2013.


  1. CISE Overview and Big Data. Suzi Iacono, CISE Directorate, National Science Foundation. SI^2 Workshop, January 17, 2013. Image credit: Exploratorium.

  2. Economic Impact of IT
     • Growth of the IT industry, coupled with productivity gains across the entire economy, has had an enormous impact.
     • IT industries accounted for 25% of US economic growth since 1995.
       – In 2010, IT industries grew 16% and contributed 5% to overall US GDP.
     • Use and production of IT accounted for ~2/3 of the post-1995 growth in labor productivity.
     • The IT sector generates jobs: IT jobs grew 125x faster than employment as a whole between 2001 and 2011, and in 2011, IT workers earned 74% more than the average worker.
     • IT diversifies regional economies to include idea-driven “creative” industries.
     Sources: NRC (2009), Assessing the Impacts of Changes in the IT R&D Ecosystem; NRC (2012), Continuing Innovation in Information Technology; ITIF (2012), Looking for Jobs? Look to IT in 2010 and Beyond.

  3. CISE Directorate
     • Divisions
       – Computing and Communication Foundations (CCF): Susanne Hambrusch
       – Computer and Network Systems (CNS): Keith Marzullo
       – Information and Intelligent Systems (IIS): Howard Wactlar
     • CISE Core Programs (70%)
       – CCF: Algorithmic Foundations; Communication and Information Foundations; Software and Hardware Foundations
       – CNS: Computer Systems Research; Networking Technology and Systems
       – IIS: Human-Centered Computing; Information Integration and Informatics; Robust Intelligence
     • CISE Cross-Cutting Programs and Cross-Foundation Programs (30%)

  4. CISE Directorate
     • Divisions
       – Computing and Communication Foundations (CCF): Susanne Hambrusch
       – Computer and Network Systems (CNS): Keith Marzullo
       – Information and Intelligent Systems (IIS): Howard Wactlar
       – Office of Cyberinfrastructure (OCI): Alan Blatecky
     • CISE Core Programs (70%)
       – CCF: Algorithmic Foundations; Communication and Information Foundations; Software and Hardware Foundations
       – CNS: Computer Systems Research; Networking Technology and Systems
       – IIS: Human-Centered Computing; Information Integration and Informatics; Robust Intelligence
     • CISE Cross-Cutting Programs and Cross-Foundation Programs (30%)

  5. Word cloud created from CISE FY 2011 award titles.

  6. Who is the CISE community? PI and Co-PI Departments for FY 2011 Awards Funded by CISE
     • Computer Science & Information Science & Computer Engineering (CISE), 65%
     • Sciences & Humanities, 21%
     • Engineering (excluding Computer Engineering), 11%
     • Interdisciplinary Centers, 3%

  7. Research Frontiers
     • Smart Systems: Sensing, Analysis and Decision
     • Expanding the Limits of Computation
     • Data Explosion
     • Augmenting Human Capabilities
     • Secure Cyberspace
     • Universal Connectivity

  8. Advances in information technologies are transforming the fabric of our society, and data represents a transformative new currency for science, engineering, education and commerce. Image credit: CCC and SIGACT CATCS

  9. Where do the data come from? Why do we have a national initiative?

  10. The Big Data Landscape I: Big Science
      • Science gathers data at an ever-increasing rate across all scales and complexities of natural phenomena
      • Sloan Digital Sky Survey in 2000 collected more data in its first few weeks than had been amassed in the entire history of astronomy
        – Within a decade, over 140 terabytes of information collected
      • Large Hadron Collider generates scores of petabytes a year
      • The proposed Large Synoptic Survey Telescope (3.3 gigapixel digital camera) will generate 40 terabytes of data nightly
      • By 2015, the world will generate the equivalent of approximately 93 million Libraries of Congress

  11. The Big Data Landscape II: Smart Sensing, Reasoning and Decision-making
      • Emergency Response
      • Environment Sensing. Situation Awareness: Humans as sensors feed multi-modal data streams. Credit: Photo by US Geological Survey
      • Pervasive Computing. Agent diagram: Percepts (sensors), Agent (Reasoning), Actions (controllers)
      • Social Informatics
      • People-Centric Sensing: Public Sensing, Social Sensing, Personal Sensing
      • Smart Health Care: Sense, Identify, Assess, Intervene, Evaluate
      Source: Sajal Das, Keith Marzullo

  12. The Big Data Landscape III: New Paradigms for Communications
      • Remarkable Pace of Innovation, 1988 to Today: email, VoIP, video, blogs, social networks, mobile

  13. The Big Data Landscape IV: The Long Tail of Science
      • Hundreds of thousands of scientists and engineers work individually or in small, distributed, disconnected groups, all generating data that collectively represent an enormous, largely untapped scientific resource
        – From running simulations, experiments, etc.
      • Making heterogeneous data across many areas of science more homogeneous could lead to breakthroughs across all areas of science and engineering
      • Estimated 40 exabytes of unique new information generated worldwide in 2010
      • Only 5% of the information created is “structured,” however, in a standard format of words or numbers; the rest is unstructured text, voice, images, etc.

  14. How Big is Big?
      • “Big Data”: “Datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.” McKinsey Global Institute, Big data: The next frontier for innovation, competition, and productivity, May 2011.
      Image credit: Sigrid Knemeyer

  15. …Not Just Volumes of Data
      • The science of big data is not just about volumes and velocity of data, but also
        – Heterogeneity and diversity
          • Levels of granularity
          • Media formats
          • Scientific disciplines
        – Complexity
          • Uncertainty
          • Incompleteness
          • Representation types

  16. Why is Big Data Important?
      • Critical to transforming how science is done and to accelerating the pace of discovery in almost every science and engineering discipline
      • Transformative implications for commerce and economy
      • Potential for addressing some of society’s most pressing challenges
      Image credit: Chi Birmingham

  17. Paradigm Shift: from Hypothesis-driven to Data-driven Discovery
      • The Fourth Paradigm: Data-Intensive Scientific Discovery (2009, Microsoft Corporation). http://research.microsoft.com/en-us/collaboration/fourthparadigm/
      • http://www.sciencemag.org/site/special/data/
      • The Economist, The data deluge and how to handle it: A 14-page special report (Feb 25, 2010). http://www.economist.com/node/15579717

  18. The Age of Data: From Data to Knowledge to Action
      • Data-driven discovery is revolutionizing scientific exploration and engineering innovations
      • Automatic extraction of new knowledge about the physical, biological and cyber world continues to accelerate
      • Multi-cores, concurrent and parallel algorithms, virtualization and advanced server architectures will enable data mining and machine learning, and discovery and visualization of Big Data

  19. Potential for Transformational Science & Engineering: From Data to Knowledge to Action
      • Integrate discipline (or media format…) specific data and examine it for relationships
        – Disaster informatics
          • 3D toxic fume images
          • Simulations of gas spread
          • Maps of census concentrations
          • First responder on-the-ground findings
          • Evacuation routing

  20. Examples of Research Challenges
      • More data are being collected than we can store
        – Analyze the data as it becomes available
        – Decide what to archive and what to discard
      • Many data sets are too large to download
        – Analyze the data wherever it resides
      • Many data sets are too poorly organized to be usable
        – Better organize and retrieve data
      • Many data sets are heterogeneous in type, structure, semantics, organization, granularity, accessibility …
        – Integrate and customize access to federate data
      • Utility of data is limited by our ability to interpret and use it
        – Extract and visualize actionable knowledge
        – Evaluate results
      • Large and linked datasets may be exploited to identify individuals
        – Design management and analysis with built-in privacy-preserving characteristics

  21. A National Imperative
      • PCAST calls on the Federal government to increase R&D investments for collecting, storing, preserving, managing, analyzing, and sharing the increasing quantities of data.
      • Furthermore, PCAST observed that the potential to gain new insights … to move from data to knowledge to action has tremendous potential to transform all areas of national priority.
      Source: PCAST (December 2010), “Report to the President and Congress: Designing a Digital Future…”, a periodic congressionally-mandated review of the Federal Networking and Information Technology Research and Development (NITRD) Program.
