Using Git for Data Transfer on SURF Research Cloud
Git is a free, open-source version control system commonly used for managing code and data repositories. This guide shows you how to transfer data from GitHub or GitLab repositories to your SURF Research Cloud workspace.
Utrecht University’s RDM support team offers workshops if you want to learn more about Git and version control.
When to Use Git
Git is ideal for:
- Code repositories: Scripts, notebooks, and source code
- Collaborative projects: Sharing and syncing work with team members
Git is not recommended for research data, although there is nothing wrong with tracking small, non-sensitive data files with Git.
Prerequisites
- A GitHub or GitLab account with access to the desired repository
- Experience with the command-line interface (CLI) is helpful. If you’re new to the command line, check out this introductory workshop by Utrecht University’s Digital Competence Center: Introduction to Bash.
Git commands are run from the command line. Follow the instructions below to open a terminal in your workspace.
Open a Terminal
- Python Workbench/CLI workspaces: Open a terminal
- Jupyter Notebook/VRE Lab: Click + in the file browser → Select Terminal
- Windows workspaces: Use PowerShell or Command Prompt
- Desktop workspaces: Open the Terminal application
Quick Start: Clone a Repository
All workspaces come with git pre-installed, so you can start cloning repositories right away. You can check the version of git installed by running:
git --version
Step 1: Get the Repository URL
From GitHub:
- Go to your repository on GitHub
- Click the green Code button
- Copy the HTTPS URL (e.g., https://github.com/username/repository.git)
From GitLab:
- Go to your repository on GitLab
- Click the Clone button
- Copy the HTTPS URL
Step 2: Clone the Repository
In the terminal, first navigate to the folder where you want to put the repository (e.g. inside your storage volume) and run:
git clone https://github.com/username/repository.git
# Replace the URL with your actual repository URL.
This creates a folder with all your repository files.
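As a sketch of what Step 2 does, the commands below clone into a folder name of your choosing and then show which remote the clone is linked to. So that the example runs without network access or credentials, a local bare repository (`remote.git`, a hypothetical stand-in) takes the place of the GitHub/GitLab URL; the folder name `my-project` is also only illustrative.

```shell
# Work in a throwaway directory so nothing in your workspace is touched.
workdir="$(mktemp -d)"
cd "$workdir"

# Stand-in for the remote (assumption): on the workspace you would use
# your own https://github.com/username/repository.git URL instead.
git init --quiet --bare remote.git

# An optional second argument sets the name of the folder that is created.
# The stand-in is empty, so git may warn about cloning an empty repository.
git clone --quiet remote.git my-project

# Confirm the clone exists and see which remote it is linked to.
cd my-project
git remote -v
```

Without the second argument, `git clone` names the folder after the repository (here that would be `remote`).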
Updating Your Data
If the repository is updated on GitHub/GitLab, you can pull those changes with the git pull command. First make sure you are in your repository folder:
git pull
Uploading Changes Back (Requires Personal Access Token)
If you’ve made changes and want to push them back:
git add .
git commit -m "Description of changes"
git push
GitHub no longer accepts account passwords for Git operations over HTTPS, and GitLab requires a token once two-factor authentication (2FA) is enabled, so you’ll need to use a Personal Access Token instead of your password when pushing.
- GitHub: Create a Personal Access Token
- GitLab: Create a Personal Access Token
When prompted for a password, enter your token instead.
If your repository is part of the UU GitHub organization, you also have to authorize your Personal Access Token for single sign-on (SSO).
Don’t share your Personal Access Token. Treat it like your password.
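The pull and push steps can be tried end to end without touching a real repository. The sketch below is an illustration under stated assumptions: a local bare repository (`remote.git`) stands in for GitHub/GitLab, so no token prompt appears, and the folder names, file name, and identity values are placeholders.

```shell
# Scratch area with a local bare repository standing in for GitHub/GitLab
# (assumption: on the workspace the remote is your HTTPS repository URL).
workdir="$(mktemp -d)"
cd "$workdir"
git init --quiet --bare remote.git

# First clone: make a change, commit it, and push it.
git clone --quiet remote.git workspace-copy
cd workspace-copy
# Placeholder identity, stored only in this repository's own config.
git config user.name "Example User"
git config user.email "user@example.org"
echo "first result" > notes.txt
git add .
git commit --quiet -m "Add notes"
# A brand-new branch has no upstream yet, so name it on the first push;
# afterwards a plain 'git push' is enough.
git push --quiet -u origin HEAD

# Second clone, e.g. a collaborator or another workspace.
cd "$workdir"
git clone --quiet remote.git other-copy

# Push a further change from the first clone...
cd "$workdir/workspace-copy"
echo "second result" >> notes.txt
git commit --quiet -am "Update notes"
git push --quiet

# ...and fetch it in the second clone with 'git pull'.
cd "$workdir/other-copy"
git pull --quiet
cat notes.txt
```

Against a real GitHub/GitLab remote, the `git push` steps are where you would be asked for your username and Personal Access Token.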
Common Git Commands
| Command | Purpose |
|---|---|
| `git clone <url>` | Download a repository |
| `git pull` | Get latest changes from remote |
| `git status` | Check what files have changed |
| `git add <file>` | Stage specific file for commit |
| `git add .` | Stage all changes for commit |
| `git commit -m "message"` | Save staged changes with a message |
| `git push` | Upload commits to remote repository |
| `git log` | View commit history |
| `git branch` | List branches |
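To see how several of these commands fit together, here is a short practice session in a scratch repository. It is a sketch only: `git init` (not listed above) is used just to create the scratch repository, and the file names and identity values are placeholders.

```shell
# Create a scratch repository in a temporary folder.
workdir="$(mktemp -d)"
cd "$workdir"
git init --quiet demo
cd demo
# Placeholder identity, stored only in this repository's own config.
git config user.name "Example User"
git config user.email "user@example.org"

echo "print('hello')" > analysis.py
echo "draft" > notes.txt

git status --short          # both files are listed as untracked (??)
git add analysis.py         # stage one specific file
git commit --quiet -m "Add analysis script"
git add .                   # stage everything that is left
git commit --quiet -m "Add notes"

git log --oneline           # one line per commit, newest first
git branch                  # the current branch is marked with *
```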
Using Git in JupyterLab Interface (GUI Alternative)
Workspaces that support JupyterLab (Jupyter Notebook, VRE Lab) include a built-in Git extension that provides a graphical interface for Git operations, making it easier for users who prefer not to use the command line.
Accessing the Git UI
- In JupyterLab, look for the Git icon in the left sidebar
- Click it to open the Git panel.
If you don’t see it, you can install the extension with pip install jupyterlab-git.
Available Features
The Git UI allows you to:
- View changes: See modified, added, and deleted files
- Stage/unstage files: Click the + or - icons next to files
- Commit changes: Enter a commit message and click “Commit”
- Push/pull: Use the cloud icons to sync with remote repository
- View history: See commit history and branches
- Switch branches: Create and switch between branches
This provides a user-friendly alternative to command-line Git operations, especially useful for beginners.
Authentication: The same Personal Access Token requirements apply when pushing through the JupyterLab Git UI.
Troubleshooting
git: command not found
- Git is not installed.
- Try installing it with:
sudo apt install git
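Before installing anything, you can confirm whether the shell can find Git at all. The check below is a minimal sketch; the suggested install command assumes a Debian/Ubuntu-based workspace where you have sudo rights, and `git_state` is just an illustrative variable name.

```shell
# 'command -v' succeeds if the shell can find a program with that name.
if command -v git >/dev/null 2>&1; then
    git_state="installed"
    git --version
else
    git_state="missing"
    echo "git not found; on Debian/Ubuntu try: sudo apt install git"
fi
```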
Authentication failed
- Check that your Personal Access Token is still valid (for example, that it has not expired)
- Ensure you have access to the repository
- If you have 2FA enabled, use a Personal Access Token instead of your password
- If your repository is under the UU GitHub organization, ensure your Personal Access Token is authorized for single sign-on (SSO).
Repository not found
- Verify the URL is correct (e.g. https and not ssh)
- If the repository is private, ensure you have permission to access it
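For a repository that is already cloned, you can inspect the URL Git uses and change it if needed. The sketch below runs in a scratch repository; the `username/repository` URLs are the placeholders used in this guide, not real addresses.

```shell
# Scratch repository so the example does not touch your own clone.
workdir="$(mktemp -d)"
cd "$workdir"
git init --quiet demo
cd demo

# Suppose the remote was added with an SSH-style URL (placeholder names).
git remote add origin git@github.com:username/repository.git
git remote get-url origin

# Switch the same remote to the HTTPS form used in this guide.
git remote set-url origin https://github.com/username/repository.git
git remote get-url origin
```

In your own clone, only the `git remote get-url origin` and `git remote set-url origin <url>` lines are needed.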
Large files cause errors
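One way to spot files that may be too large before you commit them is to list everything above a size threshold. The sketch below is illustrative only: the 1 MB threshold and the file names are assumptions chosen so the example runs quickly. As a reference point, GitHub rejects pushes containing files larger than 100 MiB.

```shell
# Scratch folder with one larger file and one small file.
workdir="$(mktemp -d)"
cd "$workdir"
head -c 3000000 /dev/zero > big_dataset.bin   # about 3 MB of zeros
echo "small" > notes.txt

# Illustrative threshold in kilobytes (assumption); raise it to suit
# the limits of your hosting service.
limit_kb=1024

# List files larger than the threshold, skipping Git's own .git folder.
large_files="$(find . -type f -size +"${limit_kb}"k ! -path './.git/*')"
echo "$large_files"
```

Run the `find` line from the top of your repository folder before `git add .` to see which files you may want to leave out.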
Tips
GitHub user documentation can be found here: GitHub Docs
Git documentation can be found here: Git Documentation