
Scientist meets web dev: how Python became the language of data (Gaël Varoquaux)



  1. Scientist meets web dev: how Python became the language of data. Gaël Varoquaux

  2. Very diverse community. This talk: a reflection on what we have in common, Python. I am talking about things you don’t understand (my science) and things I don’t understand (web dev).

  3. I actually did a PhD in quantum physics. Hence I think I qualify as a “scientist”.

  4. I now do computer science for neuroscience. Try to link neural activity to thoughts and cognition.

  5. I now do computer science for neuroscience. Try to link neural activity to thoughts and cognition. We attack it as a machine learning problem. Python software: nilearn.

  6. On the way, we created a machine-learning library: scikit-learn.

  7. Data science with Python is hot. Huge success. Cool. Data science is THE thing.

  8. Data science with Python is hot. Huge success. Cool. Data science is THE thing. Python is the go-to language. How did it happen? We built scikit-learn, others built pandas, etc., but these were built on solid foundations.

  9. Section 1: Scientists come from Jupiter. And web devs from Saturn? And sysadmins from Neptune?

  10. We’re different: strings vs. numbers (in arrays); databases vs. arrays (of numbers); object-oriented programming vs. arrays; flow control vs. arrays. A bit of a culture gap.

  11. Let’s do something together: sort EuroPython site. 205 talks: How OpenStack makes Python better (and vice-versa); Introduction to aiohttp; So you think your Python startup is worth $10 million...; SQLAlchemy as the backbone of a Data Science company; Learn Python The Fun Way; Scaling Microservices with Crossbar.io; If you can read this you don’t need glasses. Let’s find some common topics with data science.

  12. Let’s do something together: sort EuroPython site. 205 talks (same titles as on slide 11). Let’s find some common topics with data science. [Figure: talk abstracts overlaid at different angles, mostly illegible. Legible excerpt: “Anyone who has used Python to search text for substring patterns has at least heard of the regular expression module. Many of us use it extensively for parsers and lexers, [...]”]

  13. Let’s do something together: sort EuroPython site. 205 talks (same titles as on slide 11). Let’s find some common topics with data science. import urllib2, bs4. [Figure: talk abstracts overlaid at different angles, mostly illegible. Legible excerpt: “Anyone who has used Python to search text for substring patterns has at least heard of the regular expression module. Many of us use it extensively for parsers and lexers, [...]”] import sklearn, wordcloud.

  14. Let’s do something together: sort EuroPython site. Crawl the schedule to get a list of titles and URLs; crawl the talk pages to retrieve abstract and tags. bs4: Beautiful Soup, matching on the DOM tree.
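The matching step on this slide can be sketched as follows. This is a minimal illustration, not the speaker's crawler: the HTML snippet, the `talk` class name, and the `extract_talks` helper are hypothetical, since the slides do not show the real EuroPython page structure. The slides import `urllib2`, which is Python 2; under Python 3 the download would go through `urllib.request`, and it is left out here so the sketch runs offline.

```python
from bs4 import BeautifulSoup

# Hypothetical schedule markup; the real page layout is not shown in the slides.
SCHEDULE_HTML = """
<ul class="schedule">
  <li><a class="talk" href="/talks/aiohttp">Introduction to aiohttp</a></li>
  <li><a class="talk" href="/talks/fun-way">Learn Python The Fun Way</a></li>
  <li><a class="sponsor" href="/sponsors">Sponsors</a></li>
</ul>
"""


def extract_talks(html):
    """Return (title, url) pairs for every link marked as a talk."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all walks the DOM tree and keeps <a> tags whose class is "talk".
    return [
        (link.get_text(strip=True), link["href"])
        for link in soup.find_all("a", class_="talk")
    ]


talks = extract_talks(SCHEDULE_HTML)
print(talks)
```

The same kind of matching, applied to each talk page, would pull out the abstract and the tags.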

  15. Let’s do something together: sort EuroPython site. Crawl the schedule to get a list of titles and URLs; crawl the talk pages to retrieve abstract and tags. bs4: Beautiful Soup, matching on the DOM tree. Vectorize. Sample abstracts: “Anyone who has used Python to search text for substring patterns has at least heard of the regular expression module. Many of us use it extensively for parsers and lexers, [...]” and “The py.test tool presents a rapid and simple way to write tests for your Python code. This training gives a quick introduction with exercises into some distinguishing features.” Term: Freq. a: 20; can: 10; code: 4; is: 14; module: 3; profiling: 2; performance: 1; Python: 9; the: 18.

  16. Let’s do something together: sort EuroPython site. Crawl the schedule to get a list of titles and URLs; crawl the talk pages to retrieve abstract and tags. bs4: Beautiful Soup, matching on the DOM tree. Vectorize (same sample abstracts as on slide 15). Term: Freq / All docs. a: 20 / 1321; can: 10 / 540; code: 4 / 208; is: 14 / 964; module: 3 / 123; profiling: 2 / 7; performance: 1 / 6; Python: 9 / 191; the: 18 / 1450.

  17. Let’s do something together: sort EuroPython site. Crawl the schedule to get a list of titles and URLs; crawl the talk pages to retrieve abstract and tags. bs4: Beautiful Soup, matching on the DOM tree. Vectorize (same sample abstracts as on slide 15). Term: Freq / All docs / Ratio. a: 20 / 1321 / .015; can: 10 / 540 / .018; code: 4 / 208 / .019; is: 14 / 964 / .014; module: 3 / 123 / .023; profiling: 2 / 7 / .286; performance: 1 / 6 / .167; Python: 9 / 191 / .047; the: 18 / 1450 / .012. TF-IDF in scikit-learn: sklearn.feature_extraction.text.TfidfVectorizer
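The Ratio column on slide 17 divides a term's count in one abstract by its count over all abstracts. The sketch below recomputes it from the numbers on the slide to show which terms stand out. It is a simplified stand-in for TF-IDF: scikit-learn's `TfidfVectorizer` uses a smoothed, logarithmic inverse document frequency by default, so its weights would differ from this plain ratio.

```python
# (term, count in this abstract, count across all abstracts), copied from slide 17.
counts = [
    ("a", 20, 1321),
    ("can", 10, 540),
    ("code", 4, 208),
    ("is", 14, 964),
    ("module", 3, 123),
    ("profiling", 2, 7),
    ("performance", 1, 6),
    ("Python", 9, 191),
    ("the", 18, 1450),
]

# Plain ratio: frequent-everywhere words ("the", "a") get a small weight,
# words concentrated in this abstract get a large one.
ratios = {term: freq / all_docs for term, freq, all_docs in counts}

# Rank terms from most to least specific to this abstract.
ranked = sorted(ratios, key=ratios.get, reverse=True)
print(ranked[:3])  # → ['profiling', 'performance', 'Python']
```

Common words such as "the" and "a" have the highest raw counts but end up at the bottom of the ranking, which is the point of the weighting.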

  18. Let’s do something together: sort EuroPython site. [Figure: term-document matrix. One axis is labeled “documents”; the other lists the terms performance, profiling, Python, module, the, code, is, can, a. The cells hold small integers, many of them zero.]

  19. Let’s do something together: sort EuroPython site. [Figure: term-document matrix. One axis is labeled “documents”; the other lists the terms performance, profiling, Python, module, the, code, is, can, a. The cells hold small integers, many of them zero.] Can be a sparse matrix.
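To illustrate the sparse term-document matrix, here is a small sketch with scikit-learn. The three abstracts are hypothetical stand-ins, not the real EuroPython texts, and the choice of `CountVectorizer` next to `TfidfVectorizer` is an assumption made to show raw counts alongside the weighted version. Both vectorizers return a SciPy sparse matrix with one row per document and one column per term.

```python
from scipy import sparse
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Hypothetical stand-ins for talk abstracts.
abstracts = [
    "profiling python code for performance",
    "introduction to aiohttp and asynchronous python",
    "sqlalchemy as the backbone of a data science company",
]

# Rows are documents, columns are terms.
counts = CountVectorizer().fit_transform(abstracts)  # raw term counts
tfidf = TfidfVectorizer().fit_transform(abstracts)   # TF-IDF weights

n_docs, n_terms = counts.shape
# Fraction of cells that are non-zero; only these are stored.
density = counts.nnz / (n_docs * n_terms)

print(sparse.issparse(counts), counts.shape, f"density={density:.2f}")
```

With hundreds of abstracts and thousands of distinct terms, most cells are zero, so storing only the non-zero entries saves a lot of memory.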
