Open-source without headaches
Edwin Dalmaijer (@esdalmaijer)
20 November 2018
Wait, isn’t open source a Good Thing?
• To science, open-source is unequivocally good
  • Tools are free and open to public scrutiny
  • Examples: PyGaze, PsychToolbox, EEGLAB, SPM
So what about those headaches?
• To a scientist, open-source is a distraction
  • Publishing open code requires additional time and effort
  • Open code is not rewarded in systematic ways
  • More open code => fewer papers => lower grant chances
• But what if you publish a paper on your code?
  • Unlike a paper, software requires continued effort
  • Unlike authors, people join and leave development teams
  • Psychophysics toolbox (Matlab): Brainard (1997): 11578 citations; Kleiner et al. (2007): 1954 citations
  • PsychoPy (Python): Peirce (2007, 2009): 2471 citations; Peirce et al.: under review!
• But doesn’t your toolbox get you exposure?
  • Important for early career, but doesn’t get you fellowships
  • How many PIs are ‘methods people’?
Does it really take up that much time?
“When I try to run your Python script in OpenSesame / Unity / [other non-Python tool], it doesn’t work!”
“Hi, I need help with this other software you didn’t develop!”
“What kind of data quality would be achievable with my webcam? (See attached image of my face.)”
“Your code didn’t work, what should I do?”
Does it really take up that much time?
• Continuous work on development
  • Bug fixes, new features, dependencies change
• Continuous work on support
  • If people use your tools, they’ll ask questions
• Communities are hard to build, and require critical mass that most science projects just don’t have
Is supporting open developers important?
• Three omnipresent packages
• About 90 million downloads
• Estimated cost over $21 million
• Just 15 active maintainers!
(Kelle Cruz, AstroPy, May 2018)
We need to reward software contributions
• Science relies on crucial open software
  • Without these, most of us couldn’t do our jobs
• The current system punishes developers
  • Matthew effect: less time for papers => fewer grants
  • Low pay, even lower job security
• We need to adjust academic reward structures
  • Citations to associated papers are not enough
  • More stable positions for open-source developers?
  • Include software overhead in grants?
Post-soapbox usefulness
Two types of code among researchers
• Script: analysis pipeline
  • Usually written in one long file
  • Pretty specific to one project
  • Usually not particularly useful to other people
  • analysis_final2-October 2018.m
• Libraries: set of more general functions
  • Importable to scripts from a central place
  • Combine functions for particular purposes
  • Tend to be useful to other people
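As a minimal sketch of the difference, assume a general-purpose function that would live in a library file such as eeg_functions.py, and a project-specific script that imports it. Both parts are shown in one block here so that it runs as-is; the function name `baseline_correct` and the sample values are hypothetical, made up for illustration only.

```python
# --- Library part: would live in a file such as eeg_functions.py ---

def baseline_correct(samples, n_baseline):
    """Subtract the mean of the first n_baseline samples from every sample.

    samples     -- list of numbers (one signal)
    n_baseline  -- how many samples at the start count as the baseline
    """
    if n_baseline < 1 or n_baseline > len(samples):
        raise ValueError("n_baseline must be between 1 and len(samples)")
    baseline = sum(samples[:n_baseline]) / n_baseline
    return [s - baseline for s in samples]


# --- Script part: would live in a file such as analysis_script_v3.py ---
# In a real project the script would start with:
#     from eeg_functions import baseline_correct

if __name__ == "__main__":
    # Project-specific values; these are invented example numbers.
    trial = [2.0, 4.0, 3.0, 9.0, 11.0]
    corrected = baseline_correct(trial, n_baseline=3)
    print(corrected)
```

The function knows nothing about this particular study, so other scripts (and other people) can reuse it; the script part is where the project-specific choices go.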
What do you hate in other people’s code?
• No README
• No docstrings
• Unhelpful commenting
• Unclear variable names
• All files reference each other
What is a good open-source project?
• Clearly documented
  • README, function descriptions, and EXCESSIVE comments
• Sensible structure
  • File structure and folders neatly organised
  • Sensible file names
• Easy to find and to download
  • For example through GitHub, GitLab, BitBucket, or OSF
• Not dependent on hidden code
  • Sensible dependencies; don’t use obscure homebrew
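To illustrate what "clearly documented" can look like in practice, here is a small before/after sketch. Both functions do the same thing; the names `f` and `exclude_slow_responses`, and the reaction-time use case, are hypothetical examples rather than code from any real project.

```python
# Hard to read: no docstring, unclear names, no comments.
def f(a, b):
    return [x for x in a if x < b]


# Easier to read: same behaviour, but documented and clearly named.
def exclude_slow_responses(reaction_times_ms, max_rt_ms):
    """Return only the reaction times faster than a cut-off.

    Arguments
    reaction_times_ms -- list of reaction times in milliseconds
    max_rt_ms         -- cut-off in milliseconds; slower trials are dropped

    Returns
    A new list with the reaction times below max_rt_ms.
    """
    # Keep trials strictly below the cut-off; the input list is not modified.
    return [rt for rt in reaction_times_ms if rt < max_rt_ms]
```

A reader of the second version can tell what goes in, what comes out, and in which units, without having to run the code.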
Start with a sensible folder structure
• 2018 Super Amazing Study
  • analysis
    • data
      • pp01.tsv
      • pp01.cnt
      • ...
    • analysis_script_v3.py
    • eeg_functions.py
    • motion_tracking.py
  • experiment
    • constants.py
    • experiment_v4.py
    • custom_functions.py
  • literature
  • writing
Add a README to every project
• 2018 Super Amazing Study
  • README.md
  • analysis
    • data
      • pp01.tsv
      • pp01.cnt
      • ...
    • analysis_script_v3.py
    • eeg_functions.py
    • motion_tracking.py
  • experiment
    • constants.py
    • experiment_v4.py
    • custom_functions.py
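If you start projects often, the empty skeleton can be created by a short script. The sketch below assumes the folder names from the example structure; the function name `create_project_skeleton` and the README placeholder text are hypothetical, and the demo writes into a temporary directory so that nothing on your disk is touched.

```python
import tempfile
from pathlib import Path


def create_project_skeleton(root):
    """Create the example folders and a README.md stub under root."""
    root = Path(root)
    for folder in ("analysis/data", "experiment", "literature", "writing"):
        # parents=True also creates "analysis" before "analysis/data".
        (root / folder).mkdir(parents=True, exist_ok=True)
    readme = root / "README.md"
    if not readme.exists():
        # Placeholder text only; a real README describes the project.
        readme.write_text(
            "# {}\n\nWhat is this project, and how do I run it?\n".format(root.name)
        )
    return root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = create_project_skeleton(Path(tmp) / "2018 Super Amazing Study")
        for path in sorted(project.rglob("*")):
            print(path.relative_to(project))
```

Running the function twice is harmless: existing folders are left alone and an existing README.md is not overwritten.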
Creating a new repository on GitHub
Open folder in terminal / command prompt
cd "/home/documents/2018 Super Amazing Study"
• Initialise a Git repository: git init
• Add all current files to the repository: git add .
• Schedule files to be uploaded: git commit -m "first commit"
• Connect the GitHub repository: git remote add origin https://github.com/esdalmaijer/2018_Super_Amazing_Study.git
• Upload committed files to GitHub repo: git push origin master
Edit, add, commit, push; repeat!
• Change something in the project folder...
• Then run the magic words!
  git add .
  git commit -m "description"
  git push origin master
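If you find yourself typing the same three commands many times a day, they can be wrapped in a small helper. This is only a sketch: the function names `publish_commands` and `publish` are hypothetical, the remote and branch default to the `origin` and `master` used above, and by default the helper performs a dry run that prints the commands without executing anything.

```python
import shlex
import subprocess


def publish_commands(message, remote="origin", branch="master"):
    """Return the add, commit and push commands as argument lists."""
    return [
        ["git", "add", "."],
        ["git", "commit", "-m", message],
        ["git", "push", remote, branch],
    ]


def publish(message, dry_run=True):
    """Print the three commands; only execute them if dry_run is False."""
    for command in publish_commands(message):
        print(shlex.join(command))
        if not dry_run:
            # check=True raises an error at the first command that fails,
            # so a failed commit is never followed by a push.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run: prints the commands, does not touch any repository.
    publish("description")
```

Call `publish("your message", dry_run=False)` from inside the project folder to actually run the commands; this requires Git to be installed and the remote to be set up already.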
GitHub Desktop has a GUI instead
• Some people don’t like the command line
  • Everyone has their preferences, don’t be embarrassed
• GitHub Desktop is a graphical alternative
  • Available on Windows and on OS X
Principles of Object-Oriented Programming
• Class (blueprint)
• Instance (realised object)
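The blueprint/instance distinction can be sketched in a few lines of Python. The `Participant` class, its methods, and the reaction times are hypothetical examples chosen for illustration.

```python
class Participant:
    """Blueprint: describes what every participant has and can do."""

    def __init__(self, number):
        self.number = number
        self.reaction_times = []  # each instance gets its own list

    def add_trial(self, reaction_time):
        """Store the reaction time of one trial."""
        self.reaction_times.append(reaction_time)

    def mean_reaction_time(self):
        """Return the average reaction time over all stored trials."""
        if not self.reaction_times:
            raise ValueError("no trials recorded yet")
        return sum(self.reaction_times) / len(self.reaction_times)


# Instances: two realised objects built from the same blueprint.
pp01 = Participant(1)
pp02 = Participant(2)

pp01.add_trial(400)
pp01.add_trial(600)
pp02.add_trial(500)

print(pp01.mean_reaction_time())
print(pp02.reaction_times)
```

The class is written once; every instance carries its own data, so adding trials to `pp01` leaves `pp02` untouched.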