Jupyter Trends in 2018 Paco Nathan @pacoid
Jupyter provides a rich set of extensible, re-usable building blocks, expressed through various open protocols, APIs, and standards. These combine for a wide variety of use cases, as an extensible software architecture for interactive computing with data. Over the past year since JupyterCon 2017, we’ve noted three distinct trends emerging:
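To make "open protocols" concrete: any client can drive any kernel over the Jupyter messaging protocol. A minimal sketch, assuming the jupyter_client and ipykernel packages are installed; the code and timeouts are illustrative, not from the talk:

    # hedged sketch: drive a Python kernel over the Jupyter messaging protocol
    from jupyter_client import KernelManager

    km = KernelManager(kernel_name="python3")
    km.start_kernel()
    kc = km.client()
    kc.start_channels()
    kc.wait_for_ready(timeout=60)      # flushes startup traffic on IOPub

    # send code for execution, then read the published output messages
    kc.execute("print(40 + 2)")
    while True:
        msg = kc.get_iopub_msg(timeout=10)
        if msg["msg_type"] == "stream":
            print(msg["content"]["text"])          # -> 42
        if (msg["msg_type"] == "status"
                and msg["content"]["execution_state"] == "idle"):
            break                                  # execution finished

    kc.stop_channels()
    km.shutdown_kernel()

The same protocol is what lets notebooks, consoles, IDEs, and grading tools interoperate without knowing anything about each other.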
1/ We’ve seen large organizations adopt Jupyter for their analytics infrastructure, in a “leapfrog” effect over commercial offerings. Many people hired out of universities already know how to write ML apps in Jupyter, and those without coding backgrounds can learn rapidly via Jupyter. Why spend money re-training your staff to use proprietary frameworks when more effective means are available?
2/ An emerging trend disrupts the past 15-20 years of software engineering practice: hardware > software > process. Hardware is now evolving more rapidly than software, which in turn is evolving more rapidly than effective process. Jupyter helps “future-proof” efforts during this period of chaos and rapid evolution. BTW, that dovetails quite nicely with cloud services.
A recent interview with Andrew Feldman, founder/CEO of Cerebras Systems, gives a good overview of the blossoming area of specialized hardware for machine learning, edge computing, decentralization, etc.: https://www.oreilly.com/ideas/specialized-hardware-for-deep-learning-will-unleash-innovation
3/ As we see enterprises, governments, universities, etc., roll out interactive computing at scale, the organizational challenges arise next: practices regarding collaboration, data privacy, ethics, security, compliance, etc. Jupyter addresses critical needs which Silicon Valley hadn’t previously focused on enough. Watch the highly regulated environments; that’s where the rapid evolution in open source is happening.
O’Reilly did a recent study of ML adoption in the enterprise, with 8,000+ respondents worldwide, which provides relevant insights: https://www.oreilly.com/ideas/5-findings-from-oreilly-machine-learning-adoption-survey-companies-should-know
An even larger challenge looms. We’re here now, 29 years after Tim Berners-Lee created the WWW, 55 years after Ted Nelson invented hypertext, 73+ years after Vannevar Bush (and Jorge Luis Borges) first described it. Online media expands, while the business of print media has all but tanked (except when it isn’t). Science, given its “publish or perish” onus, has become a vast and scattered library of “digital paper” – all neatly indexed by keyword search and wiki entries…
Those pioneers dreamt of entirely new ways for us to collaborate, to extend our shared understanding. However, they hadn’t dreamt of trolling and harassment… Russian bot swarms… climate science attacked due to lack of reproducible papers… ML leveraged to polarize public animosity… cyberthreats holding hospital IT for ransom… plus other ways of befouling scientific advances, online media, etc. While we’re talking about open source, these are exploits: attempts to undermine open society.
Karl Popper, however, warned about precisely that: “non-reproducible single occurrences are of no significance to science,” as explored in The Logic of Scientific Discovery (1934) and later in The Open Society and Its Enemies (1945). If you have not studied the latter in detail, you should.
Check out astrophysics research applied to analyze and detect cyberthreats in media, e.g., work by Steve Kramer, et al.: https://www.oreilly.com/ideas/identifying-viral-bots-and-cyborgs-in-social-media
Eight decades later, we inherit a blend of what both Bush and Popper had scried from the rubble and ashes of WWII. Reproducibility in science, and, importantly, the closely related aspect of falsifiability, become foremost concerns. To wit: unmitigated power craves universal statements for its own whims; however, universal statements can be disproven by singular events.
Reproducible science has close analogues in other fields on which, as we find, an open society depends:
▪ data science – vital for any organization that depends on analytics, as the key to shared, accountable judgement
▪ machine learning – interpretation, verification, transparency, ethics
▪ software engineering – continuous integration (CI/CD), testability, security audits, reliability for critical infrastructure (see the sketch after this list)
▪ teaching – to help instructors manage the scaffolding needed to make course materials more engaging, immediately hands-on; to give learners confidence and direct experience
▪ journalism – how we demonstrate tangible, quantifiable evidence about what might otherwise be dismissed as ephemeral reports
Q: where else?
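One concrete bridge between these fields is treating notebooks as testable artifacts in CI. A hedged sketch using the papermill package; the notebook name and the sample_size parameter are hypothetical, for illustration only:

    # hedged sketch: run a notebook headlessly as a CI smoke test
    # assumes `pip install papermill` and a notebook whose first cell
    # is tagged "parameters" so papermill can inject values into it
    import papermill as pm

    pm.execute_notebook(
        "analysis.ipynb",         # hypothetical input notebook
        "analysis_out.ipynb",     # executed copy, outputs captured
        parameters={"sample_size": 1000},
    )
    # papermill raises PapermillExecutionError if any cell fails,
    # so the CI job fails loudly instead of shipping a stale notebook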
BTW, reproducible workflows in machine learning are notoriously difficult, for a variety of reasons: e.g., the stochastic nature of training models, non-deterministic floating-point math on GPUs, etc. A new category of tooling approaches reproducible ML workflows in innovative ways, including:
▪ Biome by Recognai
▪ PEDL by Determined AI
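To make the difficulty concrete: even the baseline mitigation of pinning random seeds only narrows the gap. A minimal sketch in Python; the library choices here are mine, not from the talk:

    # minimal sketch: pin the obvious sources of randomness in an ML run
    # note: this narrows but does not eliminate non-determinism, e.g.
    # GPU floating-point reductions can still vary from run to run
    import os
    import random
    import numpy as np

    SEED = 42
    os.environ["PYTHONHASHSEED"] = str(SEED)   # hash randomization
    random.seed(SEED)                          # Python's built-in RNG
    np.random.seed(SEED)                       # NumPy's global RNG

    # ML frameworks keep their own RNG state; e.g., with PyTorch one
    # would also call torch.manual_seed(SEED) and enable its
    # deterministic-algorithms flags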
Meanwhile, there’s a compelling dynamic in which both reproducible science and open source are necessary for collaboration at scale. Both disciplines have much to learn from each other. Ultimately, much of our program at JupyterCon 2018 is about what these disciplines, collected here now, must learn from each other. Let’s work together to discover and articulate that part about “where else?”
Thank you.
Publications, interviews, conference summaries… https://derwen.ai/paco @pacoid