12  Git and GitHub on Hazel

Job scripts, config files, and helper utilities are code, and like all code, they benefit from version control. Tracking them in git and pushing to GitHub answers questions that come up constantly in research computing: What version of this script produced these results? What did I change last week that broke things? How do I share this pipeline with a collaborator?

This chapter is not a full git tutorial. It assumes you have a GitHub account and know roughly what a commit is. It focuses on the parts that matter for running git on Hazel: setting up SSH-key authentication with GitHub, choosing where on the cluster your repositories live, and a workflow that keeps your laptop and the cluster in sync.

Important

Git operations that talk to GitHub (clone, push, pull, fetch) need internet access. On Hazel, that means login nodes only — compute nodes are firewalled off from the outside world. Run git from a login node, then submit jobs once your code is in place.
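A job script can guard against accidentally calling GitHub from a compute node. The sketch below is one way to do that, under the assumption that Hazel jobs run under the LSF scheduler, which sets LSB_JOBID inside a job's environment. The function names in_batch_job and safe_git_pull are hypothetical helpers, not part of git or Hazel.

```shell
#!/bin/sh
# Assumption: the scheduler is LSF, which exports LSB_JOBID inside every job.
in_batch_job() {
    [ -n "${LSB_JOBID:-}" ]
}

# Hypothetical wrapper: pull only when we are not inside a job.
safe_git_pull() {
    if in_batch_job; then
        echo "Inside job ${LSB_JOBID}: skipping 'git pull' (no internet on compute nodes)" >&2
        return 1
    fi
    git pull "$@"
}
```

If the scheduler assumption does not hold on your system, swap the test in in_batch_job for whatever reliably distinguishes a compute node from a login node.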

12.1 Git in Sixty Seconds

If you’ve used git before, this is the cheat sheet. If you haven’t, this is enough to follow along — supplement with the Pro Git book when you have time.

Command                              What it does
git clone <url>                      Download a repository from GitHub to a new local directory
git status                           Show what’s changed, staged, or untracked in your working copy
git add <file>                       Stage changes in <file> for the next commit
git commit -m "msg"                  Record staged changes as a new commit with message msg
git push                             Upload local commits to GitHub
git pull                             Download new commits from GitHub and merge into your branch
git log --oneline -n 10              Show the last ten commits, one per line
git diff                             Show unstaged changes line-by-line
git branch                           Show your current branch and other existing branches
git checkout <branch>                Switch to a different branch
git checkout -b <new_branch_name>    Create a new branch and switch to it

A typical edit-and-share cycle is git add → git commit → git push. Pulling other people’s changes (or your own from another machine) is git pull.
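To try the cycle without touching a real project, the following sketch runs it end to end in a scratch directory. A local bare repository stands in for GitHub, so no network or account is needed. The file name run_job.sh and the demo identity are placeholders.

```shell
#!/bin/sh
set -eu

# Scratch area; a local bare repository plays the role of GitHub.
work=$(mktemp -d)
git init --quiet --bare "$work/remote.git"

git clone --quiet "$work/remote.git" "$work/project"
cd "$work/project"

# Identity for this demo repository only (placeholder values).
git config user.name  "Demo User"
git config user.email "demo@example.com"

echo '#!/bin/bash' > run_job.sh
git status --short               # run_job.sh is listed as untracked
git add run_job.sh
git commit --quiet -m "Add job script"
git push --quiet origin HEAD     # upload the current branch to the stand-in remote

git log --oneline -n 1
```

With a real GitHub remote the commands are the same; only the clone URL changes.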

12.2 One-Time Setup on Hazel

The first time you use git on the cluster, tell it who you are. These two values get baked into every commit you make from this account.

$ git config --global user.name  "Your Name"
$ git config --global user.email "you@ncsu.edu"

Use the same email address that’s attached to your GitHub account so commits show up under your profile.

Tip

Set a sane default editor for commit messages so git doesn’t drop you into vi unexpectedly:

$ git config --global core.editor "nano"

12.3 Connecting Hazel to GitHub with SSH Keys

GitHub stopped accepting passwords for git operations in 2021. The two remaining options are SSH keys and personal access tokens; SSH keys are the better fit for HPC. They don’t expire, don’t need to be re-typed, and work cleanly inside scripts.

The idea: generate a keypair on Hazel, give the public half to GitHub, keep the private half in your home directory. When you git push, GitHub checks that you hold the matching private key and lets you in.

12.3.1 Step 1: Generate a Keypair

From a Hazel login node:

$ ssh-keygen -t ed25519 -C "you@ncsu.edu — hazel"

Press Enter to accept the default file location (~/.ssh/id_ed25519). When prompted for a passphrase, you can leave it blank: the key is already protected by the file system permissions on your home directory, and a passphrase will block automated pulls unless you also run ssh-agent to unlock the key.

Note

The -C comment is a label, not a security setting. Use something like "hazel" or "unityid@hazel" so that when you look at your GitHub keys page you can tell which key came from which machine.
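The same keypair can be created without answering prompts, which is handy when setting up several accounts. This is a minimal sketch: make_key is a hypothetical helper, and the demo writes to a scratch directory rather than your real ~/.ssh. The -N "" option sets the empty passphrase discussed above, and -f names the output file.

```shell
#!/bin/sh
set -eu

# Create an ed25519 keypair at $1 with comment $2, unless one already exists.
make_key() {
    key_path="$1"
    label="$2"
    if [ -e "$key_path" ]; then
        echo "Key already exists at $key_path; leaving it alone"
        return 0
    fi
    mkdir -p "$(dirname "$key_path")"
    chmod 700 "$(dirname "$key_path")"      # directory readable by owner only
    ssh-keygen -q -t ed25519 -C "$label" -N "" -f "$key_path"
}

# Demo against a scratch directory, not the real ~/.ssh.
demo_dir=$(mktemp -d)
make_key "$demo_dir/.ssh/id_ed25519" "unityid@hazel"
ls -l "$demo_dir/.ssh"
```

On Hazel itself the call would be make_key "$HOME/.ssh/id_ed25519" "unityid@hazel". The existence check means an existing key is never overwritten.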

12.3.2 Step 2: Copy the Public Key

Print the public half — it’s the one ending in .pub:

$ cat ~/.ssh/id_ed25519.pub
ssh-ed25519 AAAAC3Nz...truncated...K3w== you@ncsu.edu — hazel

Select the entire line and copy it.

Warning

Never share ~/.ssh/id_ed25519 (the file without .pub). That’s the private key — anyone with a copy can act as you on any system you’ve added the matching public key to.

12.3.3 Step 3: Add the Key to GitHub

  1. Go to https://github.com/settings/keys
  2. Click New SSH key
  3. Title: Hazel (or whatever helps you identify it later)
  4. Key type: Authentication Key
  5. Paste the public key into the Key field and click Add SSH key
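After adding the key, the GitHub keys page lists a SHA256 fingerprint for it. Printing the fingerprint of the local public key gives you something to compare against, which confirms that the right key was pasted. The sketch below uses a throwaway key so it can run anywhere; key_fingerprint is a hypothetical helper name.

```shell
#!/bin/sh
set -eu

# Print only the fingerprint field (for example "SHA256:...") of a public key file.
key_fingerprint() {
    ssh-keygen -lf "$1" | awk '{print $2}'
}

# Demo with a throwaway key in a scratch directory.
demo_dir=$(mktemp -d)
ssh-keygen -q -t ed25519 -C "demo" -N "" -f "$demo_dir/id_ed25519"
key_fingerprint "$demo_dir/id_ed25519.pub"
```

On Hazel, run key_fingerprint ~/.ssh/id_ed25519.pub and compare the result with the value shown next to the key on GitHub.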

12.3.4 Step 4: Test the Connection

$ ssh -T git@github.com
Hi yourusername! You've successfully authenticated, but GitHub does not provide shell access.

That message, including the “does not provide shell access” part, means everything works. The first time you connect you’ll be asked to verify GitHub’s host fingerprint; compare it with the fingerprints GitHub publishes in its documentation, then type yes.
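One detail matters if you script this check: ssh -T git@github.com exits with status 1 even when authentication succeeds, because GitHub closes the session without giving a shell. A script therefore has to look at the message text rather than the exit code. A minimal sketch, with hypothetical function names:

```shell
#!/bin/sh
# Succeed if the text looks like GitHub's authentication greeting.
is_github_greeting() {
    case "$1" in
        *"successfully authenticated"*) return 0 ;;
        *) return 1 ;;
    esac
}

# The real check. Defined but not called here, because it needs network access.
check_github_ssh() {
    # BatchMode=yes makes ssh fail instead of prompting, so scripts never hang.
    reply=$(ssh -T -o BatchMode=yes git@github.com 2>&1) || true
    is_github_greeting "$reply"
}
```

Because BatchMode also suppresses the host-fingerprint question, run the interactive test above once before relying on check_github_ssh.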

12.3.5 Step 5: Use SSH URLs When You Clone

GitHub shows two URL styles on every repository page. For SSH-key auth to work, you need the SSH form:

# SSH (works with your key)
$ git clone git@github.com:yourusername/your_project.git

# HTTPS (will prompt for credentials Hazel can't supply)
$ git clone https://github.com/yourusername/your_project.git

If you’ve already cloned a repo over HTTPS and want to switch, update its remote in place:

$ cd your_project
$ git remote set-url origin git@github.com:yourusername/your_project.git
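If several existing clones need switching, the URL rewrite can be automated. The helper below is a sketch that handles only https://github.com/ URLs and passes anything else through unchanged; https_to_ssh is a hypothetical name.

```shell
#!/bin/sh
# Convert https://github.com/OWNER/REPO(.git) to git@github.com:OWNER/REPO.git
https_to_ssh() {
    url="$1"
    case "$url" in
        https://github.com/*)
            path=${url#https://github.com/}   # drop the scheme and host
            path=${path%/}                    # drop a trailing slash, if any
            path=${path%.git}                 # drop .git so it is not doubled
            echo "git@github.com:${path}.git"
            ;;
        *)
            echo "$url"                       # already SSH, or not GitHub HTTPS
            ;;
    esac
}

https_to_ssh "https://github.com/yourusername/your_project.git"
# prints git@github.com:yourusername/your_project.git
```

Inside a clone, it can be combined with the command above: git remote set-url origin "$(https_to_ssh "$(git remote get-url origin)")".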

12.4 Where to Put Repositories on Hazel

Each of the storage spaces on Hazel has a different fit for git:

Location               Use it for                                          Avoid for
/home/[UnityID]        Personal dotfiles, small one-off scripts            Anything that pulls in data or generates large outputs (1 GB fills fast)
/rs1/researchers/...   Project repositories that pair scripts with data    Repositories that don’t need durable storage
/share/$GROUP/$USER    Throwaway clones for testing                        Anything you’d be sad to lose (scratch is wiped after 30 days)

For the project layout introduced in Best Practices for Job Scripts (configs/, scripts/, logs/, data/, results/), /rs1/researchers/... is usually the right home: configs and job scripts are tracked in git, while data/ and results/ are gitignored and live alongside.
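The layout and ignore rules described above can be sketched as follows. A scratch directory stands in for the real location under /rs1/researchers/..., and the file names are placeholders. git check-ignore confirms which paths git will skip.

```shell
#!/bin/sh
set -eu

# Scratch stand-in for a project directory on research storage.
proj="$(mktemp -d)/your_project"
mkdir -p "$proj"
cd "$proj"
mkdir -p configs scripts logs data results
git init --quiet

# Keep bulky data and outputs out of version control.
printf '%s\n' 'data/' 'results/' > .gitignore

touch data/input.csv results/output.txt scripts/run_job.sh
git check-ignore -q data/input.csv     && echo "data/ is ignored"
git check-ignore -q results/output.txt && echo "results/ is ignored"
git status --short      # lists .gitignore and scripts/, but not data/ or results/
```

Commit .gitignore itself so that every clone of the repository applies the same rules.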

12.6 First-Time Checklist

Before running git operations on Hazel for the first time:
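The local parts of the setup from Sections 12.2 and 12.3 can be verified with a short script. This is a sketch rather than an official checklist: check and run_checklist are hypothetical helpers, and nothing in it contacts GitHub, so the connection test from Step 4 still has to be run by hand.

```shell
#!/bin/sh
# Run a command quietly and report OK or MISSING next to a label.
check() {
    label="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        echo "OK       $label"
    else
        echo "MISSING  $label"
    fi
}

# Optional argument: path to the private key (defaults to the one from Step 1).
run_checklist() {
    key="${1:-$HOME/.ssh/id_ed25519}"
    check "git is installed"    git --version
    check "user.name is set"    git config --global --get user.name
    check "user.email is set"   git config --global --get user.email
    check "private key exists"  test -f "$key"
    check "public key exists"   test -f "$key.pub"
}

run_checklist
```

Any line marked MISSING points back to the section that covers that step.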

12.7 Resources