Things I forget: install git lfs and initialize in repo


November 4, 2018

git lfs is great for including (fairly) large files in git repositories. Since the entire history of files is saved, it prevents large files from blowing up the repo. I’m not sure why it isn’t installed by default with git. Anyway, I always forget how to use it.

Installing git lfs

Each computer accessing a repo with lfs enabled needs to have it installed, otherwise expected files may not appear and the user could (potentially) screw the repo up by deleting the wrong things. More instructions here, but for OS X I just run:

brew install git-lfs
git lfs install

Note that this only needs to be run once per computer, not per repo.

Enabling git lfs on a repo and adding files to it

For each repo, I need to tell git lfs which files or kinds of files to track. Suppose I want to use lfs for any csv files:

git lfs track '*.csv' '*.fst' '*.Rds' '*.gz' "*.zip"
git add .gitattributes '*.csv' '*.fst' '*.Rds' '*.gz' "*.zip"

First line tells lfs which patterns to track, second adds .gitattributes and all of the relevant files (if necessary) to the commit.

Checking in on git lfs

git lfs ls-files: Show a list of tracked files

Cleaning up an LFS repo

I’m frequently finding myself at the limits of my GitHub LFS storage space. Here’s how I use BFG to do that. The following code clones a bare repo, drops anything bigger than 10m in the history, and pushes the update to GitHub.

{bash, eval = FALSE} git clone --mirror git:// java -jar bfg.jar --strip-blobs-bigger-than 10M some-big-repo.git cd some-big-repo.git git reflog expire --expire=now --all && git gc --prune=now --aggressive git push

To run this, you need to know which repo is eating up your LFS storage. That’s actually not straightforward, since (as far as I can tell) there’s no quick way to see the storage for all of your repos.