Version Control

Why do you need version control?

  • It helps you manage your code and most of your other files (it is like track changes on steroids: it applies to all files in a folder).

  • It allows you to trace back your steps: if something breaks, you can figure out what happened.

  • NO MORE thesis_final_final_SERIOUSLYFINAL.Rmd

Even better:

  • a good version control system allows you to collaborate and share!

  • a good version control system facilitates experimentation!

What is git?

  • Distributed Version Control system written by Linus Torvalds (of Linux fame)

  • Allows you to log updates, branch your work (so you can experiment without losing the original!), and keep all backups, while efficiently using your storage

  • Gives the user a lot of control over what to track, and adds a narrative to changes (‘commit messages’)

  • Current standard for code

  • Open Source software written for the command line…

  • … but many GUI-clients exist nowadays, and most coding IDEs have built-in git.

Your turn: starting with git

  1. We have created a repository from a template repository on GitHub.

  2. After that, we cloned the repository to our local PC and moved some of our files into the repository.

  3. Navigate to your project folder in a terminal.

cd [path/to/project_folder]
  4. Verify that your project folder is a git repository by locating the .git folder:
ls -a
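
If you would rather let git answer the question itself, the check can be sketched as below. This is only an illustration: it runs in a throwaway folder (the `demo` variable is made up for this example) so it is safe to try anywhere. In your own project, only the `git rev-parse` line is needed.

```shell
# Create a throwaway repository so the example does not touch your project.
demo=$(mktemp -d)              # hypothetical scratch folder
cd "$demo" || exit 1
git init -q

# The hidden .git folder is what makes this folder a repository.
ls -a

# git can confirm it too: this prints "true" inside a repository.
inside=$(git rev-parse --is-inside-work-tree)
echo "$inside"
```

Outside a repository the same command fails with an error message, which is a quick way to notice that you are in the wrong folder.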

Your turn: Git history

Use Git log to check the history of your project:

git log
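
The full log can get long. A minimal sketch of two shorter views, run in a scratch repository (the folder, file name, identity, and commit messages are invented for the example):

```shell
# Build a scratch repository with two commits to look at.
demo=$(mktemp -d)                        # hypothetical scratch folder
cd "$demo" || exit 1
git init -q
git config user.name "Demo User"         # local identity, for this demo only
git config user.email "demo@example.com"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "add notes"

echo "second" >> notes.txt
git commit -q -a -m "extend notes"

# One line per commit: abbreviated hash plus the commit message.
git --no-pager log --oneline

# Only the most recent commit, with a summary of the files it touched.
git --no-pager log -1 --stat

# Count the commits reachable from the current one.
count=$(git rev-list --count HEAD)
echo "$count commits so far"
```

The `--no-pager` option only keeps the output from opening in a pager; in daily use plain `git log --oneline` is fine.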

Alternatively, you can view the history of your project on GitHub. Let’s see what a mature project looks like in this repository of the happygitwithr book:

Git History

Git History: Compare!

Your turn: Useful git commands

git status

Git status shows you the status of your repository. It tells you which files are created, modified, or deleted relative to the last git snapshot (aka commit) of your project.

git diff

Git diff shows you the actual line-by-line changes in the files that you have modified but not yet staged. To compare everything, staged or not, against the last snapshot, use git diff HEAD.
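
To see how the two commands complement each other, here is a small sketch in a scratch repository. The file names and identity are assumptions made for the example, not part of your template.

```shell
# Scratch repository with one committed file.
demo=$(mktemp -d)                        # hypothetical scratch folder
cd "$demo" || exit 1
git init -q
git config user.name "Demo User"         # local identity, for this demo only
git config user.email "demo@example.com"

echo "alpha" > data.txt
git add data.txt
git commit -q -m "add data"

# Change a tracked file and create a new, untracked one.
echo "beta" >> data.txt
echo "draft" > scratch.txt

# status answers "which files changed?"
# (" M" marks a modified file, "??" an untracked one).
state=$(git status --short)
echo "$state"

# diff answers "what changed inside the tracked files?"
git --no-pager diff

# Just the names of the tracked files with unstaged changes.
changed=$(git diff --name-only)
echo "$changed"
```

Note that scratch.txt shows up in the status but not in the diff: git has no earlier version of an untracked file to compare against.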

git --help

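This command lists the most commonly used git commands. For the full list, use git help -a; for details on one command, use git help followed by its name (for example, git help commit).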

Your turn: Making changes

  1. Once you have moved your files into the git repository on your PC, add them all to the staging area:
git add *
  2. Commit all the files in the staging area to your repository:
git commit -m "add my files"
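
It can help to look at the staging area before you commit. The sketch below does that in a scratch repository, and also shows one thing to be aware of: the shell expands `*` and skips hidden files, whereas `git add .` picks them up. The file names are invented for the example.

```shell
# Scratch repository with one regular file and one hidden file.
demo=$(mktemp -d)                        # hypothetical scratch folder
cd "$demo" || exit 1
git init -q
git config user.name "Demo User"         # local identity, for this demo only
git config user.email "demo@example.com"

echo "x <- 1" > analysis.R
echo "option = 1" > .hidden_config       # hypothetical dotfile

# The shell expands * to the non-hidden files only.
git add *
staged_star=$(git diff --cached --name-only)
echo "staged after 'git add *': $staged_star"

# 'git add .' stages everything in the current folder, dotfiles included.
git add .
staged_dot=$(git diff --cached --name-only)
echo "staged after 'git add .':"
echo "$staged_dot"

# Commit whatever is in the staging area.
git commit -q -m "add my files"
```

`git diff --cached --name-only` lists what is currently staged, so you can check it matches what you meant to commit.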

Your turn: pushing to GitHub

You can now push the content of your local repository to the one on GitHub:

git push

Congrats, you have made your first push and your scripts are online!

Take a look at your online repository. Who is the author of your commits? If it is not you, you can configure git to use your identity (make sure GitHub knows this email address):


git config --global user.name "Your Name"
git config --global user.email "your@email.com"
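
You can also check the author locally, without going to GitHub. A minimal sketch, assuming a scratch repository: here the identity is set without `--global`, so it applies to that one repository only and leaves your real settings alone. The name and email are the placeholders from above.

```shell
# Scratch repository so your real configuration stays untouched.
demo=$(mktemp -d)                        # hypothetical scratch folder
cd "$demo" || exit 1
git init -q

# Without --global, the identity is stored for this repository only.
git config user.name "Your Name"
git config user.email "your@email.com"

echo "hello" > readme.txt
git add readme.txt
git commit -q -m "first commit"

# Show the author name and email recorded in the latest commit.
author=$(git log -1 --format='%an <%ae>')
echo "$author"
```

Changing the configuration only affects new commits; commits you already made keep the author they were created with.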

Your workflow

  1. Add the changed file(s) to the staging area, check what is staged, commit the changes, and push:
git add src/filename.py anotherfile.txt
git add config/configfile.json
git status
git commit -m "My commit message"
git push

NB: don’t forget your commit message (see what happens if you do!).

Image credit: Software Carpentries

Summary

Useful git commands:

git log
git log --oneline
git status
git diff
git diff HEAD~3
git --help

.gitignore

The .gitignore file in your template lists the files and patterns that git will not track. Note that it only affects untracked files: a file that is already tracked stays tracked.

For example, if you do not want to track a file .DS_Store (always present on my mac), you enter a line like this in your .gitignore file:

.DS_Store

Similarly, you can ensure all output in a folder will not be tracked:

results/

Or all files with a certain extension:

*.dat

NB: There is a .gitkeep file in your template – this does not do the opposite to .gitignore, but is instead used as a placeholder for folders: git does not track empty folders…
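
To check that your patterns do what you expect, you can ask git directly. The sketch below builds a scratch repository with the three patterns above and a few invented files (script.py, raw.dat, the figures folder are all assumptions for the example).

```shell
# Scratch repository with the three patterns from this slide.
demo=$(mktemp -d)                        # hypothetical scratch folder
cd "$demo" || exit 1
git init -q
printf '%s\n' '.DS_Store' 'results/' '*.dat' > .gitignore

# Some example files: two should be ignored, one should not.
mkdir results
echo "1,2,3" > results/output.csv
echo "0 1" > raw.dat
echo "print('hi')" > script.py

# Ignored files do not appear in the status at all.
git status --short

# check-ignore exits with 0 if a path is ignored, 1 if it is not.
git check-ignore -q raw.dat && dat_ignored=yes || dat_ignored=no
git check-ignore -q results/output.csv && csv_ignored=yes || csv_ignored=no
git check-ignore -q script.py && py_ignored=yes || py_ignored=no
echo "raw.dat: $dat_ignored, results/output.csv: $csv_ignored, script.py: $py_ignored"

# An empty folder is invisible to git until it contains a file.
mkdir figures
before=$(git status --short -- figures)
touch figures/.gitkeep
after=$(git status --short -- figures)
echo "before: '$before'  after: '$after'"
```

`git check-ignore -v` followed by a path additionally reports which line of which .gitignore file matched, which is handy when a file is ignored and you do not know why.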

Your turn!

  1. Continue moving your files into the file template.

  2. Add, commit, and push all files you want to track! (Do you want to move a tracked file within a git repository? git mv path/to/file.svg newpath/file.svg and don’t forget to commit!)

  3. Are there (temporary) files you do not wish to track? Add them to the .gitignore file. Consider a .gitignore template for your language: examples on this github repo.

  4. Continue editing your code, and add/commit/push your changes. Can you do it from your IDE?

  5. Experiment with editing and committing on github itself. You can then ‘download’ your code to your local repository using git pull.

  6. What happens if you edit the same file online and locally, and try to push/pull?
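
For the file move mentioned in step 2, a minimal sketch in a scratch repository (the folder and file names are made up for the example):

```shell
# Scratch repository with one tracked figure.
demo=$(mktemp -d)                        # hypothetical scratch folder
cd "$demo" || exit 1
git init -q
git config user.name "Demo User"         # local identity, for this demo only
git config user.email "demo@example.com"

echo "<svg/>" > file.svg
git add file.svg
git commit -q -m "add figure"

# The target folder must exist before moving a file into it.
mkdir -p figures
git mv file.svg figures/file.svg

# The move is already staged; status reports it as a rename.
git status --short

# Don't forget to commit the move.
git commit -q -m "move figure into figures/"

# List the files git is tracking now.
tracked=$(git ls-files)
echo "$tracked"
```

Using `git mv` instead of moving the file in your file browser saves you a step: the move is staged straight away.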

Enjoy, and git responsibly!

  • Commits should be atomic: self-contained ‘units’ of change.

    • DO: edit/add an .svg and add it to your .Rmd presentation in the same commit
    • DON’T: edit for a full day and put this in a single commit (or worse: forget to…)
  • Commits should have informative messages so you (and others) can trace your steps

  • Track most files; .gitignore those you don’t want to track.

  • Explore new ideas with branches, keep a stable version on main
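
The last point can be sketched as follows, again in a scratch repository. The branch name `experiment` and the file are invented for the example, and the name of the stable branch is looked up rather than assumed, since it may be main or master depending on your git settings.

```shell
# Scratch repository with one stable commit.
demo=$(mktemp -d)                        # hypothetical scratch folder
cd "$demo" || exit 1
git init -q
git config user.name "Demo User"         # local identity, for this demo only
git config user.email "demo@example.com"

echo "stable" > model.txt
git add model.txt
git commit -q -m "stable version"

# Remember the name of the branch we started on (main or master).
base=$(git rev-parse --abbrev-ref HEAD)

# Create a branch for the experiment and switch to it.
git checkout -q -b experiment            # hypothetical branch name
echo "wild idea" >> model.txt
git commit -q -a -m "try a wild idea"

# Switch back: the stable branch never saw the experiment.
git checkout -q "$base"
stable_content=$(cat model.txt)
echo "$stable_content"

# The experiment is still there, on its own branch.
git branch --list
```

If the idea works out, `git merge experiment` brings it into the stable branch; if not, the branch can simply be left alone or deleted.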

Do you want to learn more?
