Borg Backup from a git user's perspective Sun, Jun 6 2021 PM
In early 2021 I was poring over various backup, synchronization, and management solutions. There is rarely a single best tool, just a plethora of options, each with its own trade-offs, limitations, frustrations, and feature sets. Anyway, I stumbled across some great open source tools that I thought were genuinely superb at what they do, and I wanted to share them across a few blog entries...
I have always wanted a backup tool that is basically just git but with encryption. Borg backup achieves this goal for me. That is, borg is not git, but it can snapshot a directory, which gives us a form of version control. It also has an excellent command line interface, and features like encryption, compression, and remote access. There are already a lot of nice guides out there, so rather than write another, I will try to introduce borg to a prolific git user.
Using Borg
Get the package from apt (for Ubuntu users)
sudo apt-get install borgbackup
Make a new directory to store backups
The borg init command is a lot like "git init --bare". A bare repository in git has no source files checked out in a working tree; it only holds the files needed to store the repository, history and all. A borg repository is the same kind of thing.
As for encryption, there are several options. The "--encryption=repokey" option basically means encrypt with a key that is stored inside the repository. Borg will ask for a passphrase to protect that key (just like ssh-keygen). This passphrase must be provided every time the borg command is used on the repository.
mkdir -p my_archive_dir
borg init --encryption=repokey my_archive_dir
Create a snapshot in the archive
Snapshots are created using the borg create command. This is a lot like combining "git add" and "git commit" into one command. Any number of files and folders can be passed in, and there are extensive options for excluding files and patterns.
Note that "my_snapshot_name" is just the name of the snapshot. Borg has no concept of a tree of commits; it's just a collection of independent snapshots. So most guides recommend including a sortable date string in the snapshot name.
borg create my_archive_dir::my_snapshot_name dir1/ dir2/ file1.txt file2.txt etc
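As a rough sketch of the naming advice above, a small shell helper can build a sortable snapshot name and pass a couple of exclude patterns. The function names, the "laptop" prefix, and the example patterns are placeholders I made up for illustration; --exclude itself is a regular borg create option.

```shell
#!/bin/sh
# Build a snapshot name that sorts chronologically: <prefix>-<ISO-like timestamp>.
# "snapshot_name" is a hypothetical helper, not part of borg.
snapshot_name() {
    printf '%s-%s\n' "$1" "$(date +%Y-%m-%dT%H:%M:%S)"
}

# Create a snapshot of the given paths, skipping two example patterns.
# Usage: create_snapshot my_archive_dir laptop dir1/ dir2/ file1.txt
create_snapshot() {
    cs_repo=$1
    cs_prefix=$2
    shift 2
    borg create \
        --exclude '*.pyc' \
        --exclude '*.tmp' \
        "${cs_repo}::$(snapshot_name "$cs_prefix")" "$@"
}
```

Borg can also expand placeholders on its own, so something like borg create 'my_archive_dir::{hostname}-{now:%Y-%m-%d}' dir1/ should give a similar result without the helper.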
List the available snapshots
Use the borg list command to show the available snapshots, much like "git log".
borg list my_archive_dir
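The list command also accepts a single snapshot, in which case it prints the files stored inside it, a bit like "git ls-tree -r" on a commit. A minimal sketch; the wrapper name is made up for illustration.

```shell
# List the files stored in one snapshot (roughly "git ls-tree -r <commit>").
# "list_snapshot_files" is a hypothetical wrapper, not a borg command.
# Usage: list_snapshot_files my_archive_dir my_snapshot_name
list_snapshot_files() {
    borg list "${1}::${2}"
}
```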
Extract the snapshots
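Now we can use the borg extract command to retrieve files from a snapshot. Remember this isn't changing the state of the archive directory. Borg always extracts into the current working directory, so first change into the directory where the snapshot should be extracted to, preferably a new one. Any paths given after the snapshot name select which files to extract; they are not an output location.
mkdir -p some_output_dir
cd some_output_dir && borg extract ../my_archive_dir::my_snapshot_name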
The git equivalent of this is to use git archive to export a commit as a tarball and then unpack that tarball. Example: "git archive --format=tar my_snapshot_name | (cd some_output_dir && tar xf -)"
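Since borg extract writes into the current working directory, and any paths after the snapshot name only select what gets restored, a small wrapper can make restores less error prone. This is a sketch under the assumption that the repository is a local directory; the function and variable names are hypothetical.

```shell
# Restore a snapshot (or just some paths from it) into a destination directory.
# Usage: restore_snapshot my_archive_dir my_snapshot_name some_output_dir [paths...]
restore_snapshot() {
    # Resolve the repository to an absolute path before changing directory.
    # This assumes a local repository, not one reached over ssh.
    rs_repo=$(cd "$1" && pwd) || return 1
    rs_archive=$2
    rs_dest=$3
    shift 3
    mkdir -p "$rs_dest" || return 1
    # The subshell keeps the caller's working directory unchanged.
    # Any remaining arguments limit the restore to those paths.
    (cd "$rs_dest" && borg extract "${rs_repo}::${rs_archive}" "$@")
}
```

For example, restore_snapshot my_archive_dir my_snapshot_name some_output_dir dir1/file1.txt would bring back only that one file.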
Mount the archive
Now my favorite option here is to mount the archive. Borg uses FUSE to mount a virtual filesystem where each snapshot in the archive is a top level directory in the mount, so you can just browse and copy from individual snapshots.
borg mount my_archive_dir /mnt
# you should see your files
ls /mnt/my_snapshot_name
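When I only need one file or folder back, the mount workflow can be wrapped up as mount, copy, unmount. The sketch below uses a temporary mount point instead of /mnt; the function name and variables are made up, while borg mount and borg umount are the real commands.

```shell
# Copy one path out of a snapshot by mounting the archive temporarily.
# Usage: copy_from_snapshot my_archive_dir my_snapshot_name dir1 ./restored_dir1
copy_from_snapshot() {
    cfs_repo=$1
    cfs_archive=$2
    cfs_path=$3
    cfs_dest=$4
    # Use a fresh, empty directory as the mount point.
    cfs_mnt=$(mktemp -d) || return 1
    borg mount "$cfs_repo" "$cfs_mnt" || { rmdir "$cfs_mnt"; return 1; }
    # Each snapshot appears as a top level directory inside the mount.
    cp -R "$cfs_mnt/$cfs_archive/$cfs_path" "$cfs_dest"
    cfs_status=$?
    # Always unmount and clean up, even if the copy failed.
    borg umount "$cfs_mnt"
    rmdir "$cfs_mnt"
    return "$cfs_status"
}
```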
Other mentionables
The remote server
There is no equivalent of pushing to a remote server. Instead, borg lets you init the repo directly on a remote machine, because "my_archive_dir" can point to a machine reachable over ssh. I wasn't a fan of having the tool run over the network. My solution is to use borg on a local filesystem and to just rsync the resulting archive directory.
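The rsync step can be sketched roughly like this. The host name "backuphost" and the remote path are placeholders, and the function name is hypothetical. It is probably wise to run this only when no borg command is working on the repository.

```shell
# Mirror the local borg repository directory to another machine with rsync.
# Usage: sync_repo my_archive_dir user@backuphost:backups/my_archive_dir/
sync_repo() {
    # -a preserves permissions and timestamps; --delete removes remote files
    # that no longer exist locally, so the remote copy stays an exact mirror.
    # The trailing slash on the source copies the directory's contents.
    rsync -a --delete "${1%/}/" "$2"
}
```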
Scripting backups
Borg backup is highly scriptable. Many settings, such as the repository location (BORG_REPO) and the passphrase (BORG_PASSPHRASE), can be supplied as environment variables instead of command line arguments. Rather than chaining together huge strings of commands, you can write very neat and readable backup scripts. Or just keep that pesky passphrase dialog away.
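A minimal sketch of such a script, assuming the passphrase lives in a file called ~/.borg-passphrase (a placeholder path; keep it readable only by you). BORG_REPO and BORG_PASSCOMMAND are environment variables borg reads; the function name and paths are made up.

```shell
#!/bin/sh
# Hypothetical backup script; repository path and file names are placeholders.
# Usage: run_backup /path/to/my_archive_dir dir1/ dir2/ file1.txt
run_backup() {
    # With BORG_REPO set, the repository can be left out of each command.
    export BORG_REPO="$1"
    # BORG_PASSCOMMAND makes borg run a command that prints the passphrase,
    # so no interactive prompt appears.
    export BORG_PASSCOMMAND="cat $HOME/.borg-passphrase"
    shift
    # {hostname} and {now} are placeholders that borg expands itself.
    borg create "::{hostname}-{now:%Y-%m-%dT%H:%M:%S}" "$@" &&
        borg list
}
```

Setting BORG_PASSPHRASE directly also works, but then the passphrase itself ends up in the script or the environment.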
Large backups
Having used other backup tools on a large directory set, I can say that borg is surprisingly fast. Not only did it get the job done, but the resulting archive was substantially smaller than expected. Future backups should also be small because borg de-duplicates data across snapshots, so unchanged content is only stored once.
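To see how much space the snapshots really take, borg can report original, compressed, and de-duplicated sizes. A small sketch; the function names are hypothetical wrappers around borg info and the --stats option of borg create.

```shell
# Print size statistics for the whole repository.
# Usage: repo_stats my_archive_dir
repo_stats() {
    borg info "$1"
}

# Create a snapshot and print statistics for it when it finishes.
# Usage: create_with_stats my_archive_dir my_snapshot_name dir1/ dir2/
create_with_stats() {
    cws_repo=$1
    cws_archive=$2
    shift 2
    borg create --stats "${cws_repo}::${cws_archive}" "$@"
}
```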