git – An Introduction
Prevent this:
What is git?
● A version control system for tracking changes to (text) files in git repositories.
● Note that the files do not have to be source code; any plain-text file will do.
● This does not include *.doc and *.docx files!
Getting started
● A repository is any directory, on your machine or any other, that we have told git about.
Some terms
● In git, changes to a file can reside in three areas:
  – The local working directory, which holds the files we work on. Nothing unusual here.
  – The index: a staging area where the state of a file is recorded for the next commit, but not yet saved in a commit.
  – The HEAD: a pointer to the last commit.
Some terms
● A commit is a persistently saved state (snapshot) of the repository.
● Essentially, using git means moving changes between the three areas: add moves them from the working directory to the index, commit moves them from the index into a new commit.
● However, GitHub Desktop (kind of) skips the add command.
The Workflow
● Create or edit a file.
● Add+commit it to the repository.
● As long as a new file has not been added to the repository, git does not care about it.
● As soon as it's in the repo, we can see the changes made to it.
● Only committed changes are saved persistently and can be undone!
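The workflow above can be tried out safely in a throwaway directory. The following is a minimal sketch, not part of the original slides: it drives the git command line from Python and assumes git is installed and on the PATH. The helper `git()`, the function `demo_workflow()`, the file name `notes.txt` and the demo identity are all made up for illustration.

```python
import subprocess
import tempfile
from pathlib import Path

# Placeholder identity (an assumption) so that committing works on a machine
# where git has not been configured yet.
IDENTITY = ["-c", "user.name=Demo User", "-c", "user.email=demo@example.com"]


def git(cwd, *args):
    """Run one git command inside `cwd` and return what it printed."""
    done = subprocess.run(["git", *IDENTITY, *args], cwd=cwd, check=True,
                          capture_output=True, text=True)
    return done.stdout


def demo_workflow(repo):
    """Take one file from 'unknown to git' to 'committed', then edit it again."""
    repo = Path(repo)
    status = {}
    git(repo, "init", "-q")                       # tell git about the directory
    notes = repo / "notes.txt"
    notes.write_text("first draft\n")
    status["created"] = git(repo, "status", "--porcelain")
    git(repo, "add", "notes.txt")                 # working directory -> index
    status["added"] = git(repo, "status", "--porcelain")
    git(repo, "commit", "-q", "-m", "Add notes")  # index -> new commit (HEAD)
    status["committed"] = git(repo, "status", "--porcelain")
    notes.write_text("second draft\n")            # edit the now tracked file
    status["edited"] = git(repo, "status", "--porcelain")
    return status


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for step, output in demo_workflow(tmp).items():
            print(f"{step:10} {output!r}")
```

In the short status format, `??` marks a file git does not track yet, `A` a newly added file, an empty output a clean repository, and `M` a tracked file with changes.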
Logs and Reverts
● Only committed changes are saved persistently and can be undone!
Remote repositories
● git natively supports synchronization between repositories.
● Non-local repositories are simply called remotes or remote repositories.
● The most common way of setting up a remote is to clone a remote repository:
  – This automatically registers the cloned repository as a remote.
Remote repositories
● Synchronization does not happen automatically!
● Instead, we push and pull our changes!
Remote repositories
[Diagram: User 1, User 2 and User 3 each push to and pull from a central GitLab server.]
● We stay in sync by keeping one central repository up to date and fetching changes from there!
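The push and pull cycle can be reproduced on a single machine. The sketch below is an illustration under stated assumptions, not the setup from the slides: a bare repository in a local directory stands in for the GitLab server, two clones play User 1 and User 2, and git is assumed to be installed. All names (`git()`, `demo_sync()`, `central.git`, `data.txt`) are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path

# Placeholder identity (an assumption) so commits work without prior setup.
IDENTITY = ["-c", "user.name=Demo User", "-c", "user.email=demo@example.com"]


def git(cwd, *args):
    """Run one git command inside `cwd` and return what it printed."""
    done = subprocess.run(["git", *IDENTITY, *args], cwd=cwd, check=True,
                          capture_output=True, text=True)
    return done.stdout


def demo_sync(root):
    """Show that user 2 only sees user 1's commit after push and pull."""
    root = Path(root)
    seed = root / "seed"
    seed.mkdir()
    git(seed, "init", "-q")
    (seed / "data.txt").write_text("line from the seed\n")
    git(seed, "add", "data.txt")
    git(seed, "commit", "-q", "-m", "Initial commit")

    # A bare repository (history only, no working files) plays the server.
    central = str(root / "central.git")
    git(root, "clone", "-q", "--bare", str(seed), central)

    # Cloning registers central.git as the remote called "origin".
    git(root, "clone", "-q", central, "user1")
    git(root, "clone", "-q", central, "user2")
    user1, user2 = root / "user1", root / "user2"

    with (user1 / "data.txt").open("a") as handle:
        handle.write("line from user 1\n")
    git(user1, "commit", "-q", "-a", "-m", "Add a line")  # add+commit in one go
    git(user1, "push", "-q", "origin", "HEAD")            # local -> central

    before_pull = (user2 / "data.txt").read_text()
    git(user2, "pull", "-q")                              # central -> local
    after_pull = (user2 / "data.txt").read_text()
    return before_pull, after_pull


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        before, after = demo_sync(tmp)
        print("user 2 before pull:", before.splitlines())
        print("user 2 after pull: ", after.splitlines())
```

Nothing reaches User 2 until User 1 has pushed and User 2 has pulled, which is exactly the manual synchronization described above.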
Merge conflicts
● Can we get out of sync?
● Obviously and unfortunately, yes!
● git is quite good at merging changes to the same file!
  – Changing different lines in the same file normally causes no problems.
● Changing the same line does. Which change should be preferred?
  – This needs manual resolution!
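The rule "different lines merge, the same line conflicts" can be illustrated with a toy model. This is not git's actual merge algorithm, only a sketch under a strong simplifying assumption: lines are changed but never added or removed, so all versions have the same length. The function name `merge_lines` and the marker labels `ours` and `theirs` are chosen for illustration.

```python
def merge_lines(base, ours, theirs):
    """Toy three-way merge of two edited copies (`ours`, `theirs`) of `base`.

    Simplifying assumption: all three versions have the same number of lines.
    Returns the merged lines and the numbers of the lines that conflict.
    """
    if not len(base) == len(ours) == len(theirs):
        raise ValueError("this toy merge needs equally long versions")
    merged, conflicts = [], []
    for number, (b, o, t) in enumerate(zip(base, ours, theirs), start=1):
        if o == t:        # both sides agree (same change, or no change at all)
            merged.append(o)
        elif o == b:      # only the other side changed this line
            merged.append(t)
        elif t == b:      # only our side changed this line
            merged.append(o)
        else:             # both sides changed it differently: a human decides
            merged += ["<<<<<<< ours", o, "=======", t, ">>>>>>> theirs"]
            conflicts.append(number)
    return merged, conflicts


if __name__ == "__main__":
    base = ["Title", "Results are good.", "The end."]
    ours = ["Title", "Results are very good.", "The end."]
    theirs = ["Title", "Results are good.", "THE END."]
    clash = ["Title", "Results are bad.", "The end."]
    print(merge_lines(base, ours, theirs))  # different lines: merges cleanly
    print(merge_lines(base, ours, clash))   # same line: conflict on line 2
```

In a real conflict, git writes similar markers into the affected file and waits for you to pick or combine the two versions before committing.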
Merge conflicts
● Also, binary files do not follow the concept of lines.
● Hence, binary files cannot be merged automatically!
● *.doc, *.docx, *.xls, *.pdf, *.png, *.jpg, compiled executables (*.exe), etc. are binary formats, not text files! They are a poor fit for git. (Instead, you may try LaTeX, which is text-file based.)
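A rough way to tell the two kinds of file apart is to look for NUL bytes, which practically never occur in text files. The sketch below is a heuristic for illustration only; git applies a comparable check to the beginning of a file, but the 8000-byte sample size used here should be treated as an assumption. The function name `looks_binary` is made up.

```python
def looks_binary(data: bytes, sample_size: int = 8000) -> bool:
    """Guess whether `data` is binary: is there a NUL byte near the start?"""
    return b"\x00" in data[:sample_size]


if __name__ == "__main__":
    latex_source = "\\documentclass{article}\n".encode("utf-8")
    png_start = b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"  # first bytes of a PNG
    print(looks_binary(latex_source))
    print(looks_binary(png_start))
```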
Let's create a merge conflict!
Extended Workflow
● Edit files and regularly add+commit your changes, ideally as logically separate commits.
● Before leaving work or switching to something else, push your changes!
● If the push fails, you may have to pull changes first and possibly resolve some merge conflicts.
  – Hence, don't push in a hurry! Push early, so that you have time to resolve conflicts.
● When done, push the resolved merge!
.gitignore
● As said before, some files are unsuitable for handling with git. We can tell git to ignore files by listing them in the .gitignore file.
● We can ignore single files, file patterns and directories.
● If such files are needed anyway, try to create them automatically from files that are in the repository!
  – E.g. LaTeX compilation scripts (excluding the compiled PDF), gnuplot or matplotlib scripts.
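The three kinds of entry can be checked with `git check-ignore`, which reports whether a path is ignored. The sketch below is an illustration, assuming git is installed; the file names and the function `demo_ignore()` are hypothetical and not taken from the slides.

```python
import subprocess
import tempfile
from pathlib import Path


def demo_ignore(repo):
    """Write a small .gitignore and ask git which example files it ignores."""
    repo = Path(repo)
    subprocess.run(["git", "init", "-q"], cwd=repo, check=True)
    (repo / ".gitignore").write_text(
        "secrets.cfg\n"   # a single file
        "*.pdf\n"         # a file pattern
        "build/\n"        # a whole directory
    )
    names = ["report.tex", "report.pdf", "secrets.cfg", "build/report.log"]
    (repo / "build").mkdir()
    for name in names:
        (repo / name).write_text("placeholder\n")

    ignored = {}
    for name in names:
        # With -q, `git check-ignore` prints nothing and exits with status 0
        # if the path is ignored and with status 1 if it is not.
        done = subprocess.run(["git", "check-ignore", "-q", name], cwd=repo)
        ignored[name] = done.returncode == 0
    return ignored


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name, is_ignored in demo_ignore(tmp).items():
            print(f"{name:18} {'ignored' if is_ignored else 'tracked normally'}")
```

Here the LaTeX source stays under version control, while the compiled PDF, the credentials file and everything below `build/` are left out.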
Branches
● Assume you want to make major changes or add a new feature, which may temporarily break functionality.
  – Adding such changes to the master (or production) branch would confront all other users with defective software.
● Branches provide isolated lines of development, where you can do basically whatever you want!
● Later, you can merge your changes into the master branch.
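The isolation a branch provides can be observed in a throwaway repository. This is a minimal sketch, assuming git is installed; the branch name `new-feature`, the file `app.txt` and the helper functions are hypothetical. Depending on the git configuration, the default branch may be called master or main, so the sketch asks git for its name instead of assuming one.

```python
import subprocess
import tempfile
from pathlib import Path

# Placeholder identity (an assumption) so commits work without prior setup.
IDENTITY = ["-c", "user.name=Demo User", "-c", "user.email=demo@example.com"]


def git(cwd, *args):
    """Run one git command inside `cwd` and return what it printed."""
    done = subprocess.run(["git", *IDENTITY, *args], cwd=cwd, check=True,
                          capture_output=True, text=True)
    return done.stdout


def demo_branch(repo):
    """Develop on a branch, then merge it back into the default branch."""
    repo = Path(repo)
    app = repo / "app.txt"
    git(repo, "init", "-q")
    app.write_text("stable\n")
    git(repo, "add", "app.txt")
    git(repo, "commit", "-q", "-m", "Working version")
    default = git(repo, "rev-parse", "--abbrev-ref", "HEAD").strip()

    git(repo, "checkout", "-q", "-b", "new-feature")  # create and switch
    app.write_text("stable\nexperimental\n")
    git(repo, "commit", "-q", "-a", "-m", "Work in progress")

    git(repo, "checkout", "-q", default)              # back to the default branch
    untouched = app.read_text()                       # the feature is not here
    git(repo, "merge", "-q", "new-feature")           # bring the feature in
    merged = app.read_text()
    return untouched, merged


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        untouched, merged = demo_branch(tmp)
        print("default branch before merge:", untouched.splitlines())
        print("default branch after merge: ", merged.splitlines())
```

Until the merge, everyone working on the default branch keeps seeing only the stable version.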
Best Practices
● (Try to) never push non-working code!
  – Ideally, a git repository's master branch always works out of the box!
  – If you know beforehand that something won't work, use your own branch!
  – Document your work and label non-working features!
Best Practices
● Don't develop in your production environment!
  – Regarding e.g. large simulations: you run simulations/calculations on powerful servers. However, you should not develop on those servers.
  – Instead, develop your programs on your local machine. Use small test examples to check functionality. Even better, use unit testing!
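A unit test is a small test example written down as code, so that it can be re-run after every change. The sketch below uses Python's built-in `unittest` module; the function `mean()` is a hypothetical stand-in for a piece of your own simulation code.

```python
import unittest


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("mean() needs at least one value")
    return sum(values) / len(values)


class MeanTest(unittest.TestCase):
    def test_small_example(self):
        # A tiny input whose result is easy to verify by hand.
        self.assertEqual(mean([1, 2, 3, 4]), 2.5)

    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            mean([])
```

Saved as, say, `test_stats.py`, the tests run in a fraction of a second on a local machine with `python -m unittest test_stats.py`, long before the code is sent to a server.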
Best Practices
● Don't develop in your production environment!
  – Why not?
  – Having your code work on two different machines ensures at least some portability and reproducibility!
  – This will most probably also force you to add setup and installation scripts (creating Python venvs, installing modules, etc.).
  – This will also keep you from hard-coding paths in your code!
Best Practices
● Try not to hard-code!
  – Hard-coding means writing e.g. authentication parameters, data paths, etc. directly into the code.
  – Instead, use config files, which hold this information and are read at program start.
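One possible way to do this in Python is the standard `configparser` module, which reads INI-style files. The following is a sketch, not a prescribed layout: the section names, keys, values and the function `load_settings()` are all invented for illustration.

```python
import configparser
import tempfile
from pathlib import Path

# Example content of a config file; every value here is made up.
EXAMPLE_CONFIG = """\
[paths]
data_dir = /scratch/demo/data

[server]
user = demo
port = 8022
"""


def load_settings(path):
    """Read machine-specific settings from an INI file at program start."""
    parser = configparser.ConfigParser()
    with open(path) as handle:        # fails loudly if the file is missing
        parser.read_file(handle)
    return {
        "data_dir": Path(parser["paths"]["data_dir"]),
        "user": parser["server"]["user"],
        "port": parser.getint("server", "port"),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config_file = Path(tmp) / "settings.ini"
        config_file.write_text(EXAMPLE_CONFIG)
        print(load_settings(config_file))
```

The code then stays identical on every machine and only the config file differs. A config file that holds credentials is a typical candidate for the .gitignore.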
Best Practices
● Documentation
  – Information on how to install and set up your software should be delivered with the repository.
  – The README.md file is the perfect place for such information!
Best Practices
● Don't hack!
  – For most problems there is a good, durable, elegant but somewhat tedious solution on the one hand, and a quick but dirty one on the other.
  – Always (try to) go for the good and durable one!
  – Even for small projects, you never know when you will need to extend functionality.
  – Undoing the hack and implementing a robust solution costs more time than doing it correctly from the beginning!
Best Practices
● Don't hack!
  – Following standard procedures also improves your chances of getting help from colleagues or the web!
  – However, it can be quite hard to know what the "best" solution to a (possibly very individual) problem is...