
I keep most of my data in the cloud, but there are various files on my local workstation that need to be backed up. For this, I use rdiff-backup and S3 with s3fs. This setup gives me a mirror of my current files in the cloud, plus a history of changes that I can keep for as long as I want.

If you don’t want to use S3 or s3fs, you can skip to the rdiff-backup section, as that will work with any file system path, whether or not it is backed by an S3 bucket.

Set up S3

Creating an AWS account and S3 bucket is outside the scope of this how-to. Suffice it to say, you’ll need an S3 bucket along with an access key and secret that can read and write to it.

On your Linux system, you’ll need to install the AWS CLI: https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2-linux.html

Once installed, run aws configure to provide your access key and secret.

I’ve noticed that things work much better if your CLI’s default region (I use us-east-1) is the region where your S3 bucket lives. When these are different, some tools have issues, s3fs included.
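A rough sketch of how you could check for a region mismatch up front. The function name same_region and the bucket name in the usage line are made up for illustration, and the AWS variable exists only so the command can be swapped out. It assumes your key is allowed to call get-bucket-location on the bucket.

```shell
# Sketch: compare the AWS CLI default region with a bucket's region.
# AWS can be overridden to point at a different aws binary.
AWS="${AWS:-aws}"

same_region() {
    bucket="$1"
    cli_region=$("$AWS" configure get region)
    bucket_region=$("$AWS" s3api get-bucket-location \
        --bucket "$bucket" --query LocationConstraint --output text)
    # Buckets in us-east-1 report no location constraint, which the
    # CLI prints as "None" in text output.
    case "$bucket_region" in
        None|null|"") bucket_region="us-east-1" ;;
    esac
    if [ "$cli_region" = "$bucket_region" ]; then
        echo "OK: CLI and bucket are both in $cli_region"
    else
        echo "Mismatch: CLI uses $cli_region, bucket is in $bucket_region" >&2
        return 1
    fi
}

# Usage (hypothetical bucket name):
#   same_region my-backup-bucket
```

If the two differ, the simplest fix is usually to run aws configure again and set the default region to the bucket's region.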

I’ve also found that S3 allows you to put periods in your bucket names, but this confuses many S3 clients, including s3fs. There are work-arounds, but if you can choose your S3 bucket name, don’t put a period in it.
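For reference, here is a minimal sketch of mounting the bucket with s3fs. It is an assumption about a typical setup rather than the exact command used for this post: the bucket name is hypothetical, the mount_bucket helper is made up, and the credentials are assumed to live in an s3fs password file (one line of ACCESS_KEY:SECRET_KEY, permissions 600). Depending on your s3fs version it may also be able to pick up the credentials you gave to aws configure, so check the s3fs man page.

```shell
# Sketch: mount an S3 bucket with s3fs. The bucket name in the usage
# line is hypothetical; /mnt/backup is the backup destination used in
# this post. S3FS_PASSWD is an s3fs password file (ACCESS_KEY:SECRET_KEY).
S3FS_PASSWD="${S3FS_PASSWD:-$HOME/.passwd-s3fs}"

mount_bucket() {
    bucket="$1"
    mount_point="$2"
    region="${3:-us-east-1}"
    # Periods in bucket names confuse s3fs, so warn early.
    case "$bucket" in
        *.*) echo "warning: bucket name '$bucket' contains a period" >&2 ;;
    esac
    mkdir -p "$mount_point" || return 1
    s3fs "$bucket" "$mount_point" \
        -o passwd_file="$S3FS_PASSWD" \
        -o endpoint="$region"
}

# Usage:
#   mount_bucket my-backup-bucket /mnt/backup us-east-1
# Unmount with:
#   fusermount -u /mnt/backup
```

Passing the region explicitly keeps s3fs pointed at the same region as the bucket, which avoids the mismatch problems described above.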

Install rdiff-backup

rdiff-backup is a well-known Python script that provides a mirror backup and stores changes over time. By mirror backup, I mean that the backup destination is a fully accessible copy of the files you’re backing up. A drawback of this is that the full backup is not compressed, but the benefit is super easy access to your most recent backup files.

rdiff-backup is also somewhat unique in that it stores incremental backup data separately. Rather than starting with a full backup, which gets older over time, and storing newer incomplete incremental backups on top of that over time, it appends historical changes in a sort of journal. I prefer this method because you always have a single full backup you can easily browse and restore from, while changes over time can grow as long as you want them to. You can truncate them whenever you want with a simple command without impacting your current full backup.

On Arch, just run pacman -S rdiff-backup to install it. I gather it’s available on most distributions.

Create include and exclude lists

You can use the --include and --exclude command line parameters if you only have a few exclusions, but I have a fairly large list, so I used files. I’ve got a file named exclude.txt listing the full file system paths of all the files and folders I want to exclude. I also have an include.txt file to force some subfolders of an excluded folder to be included. For example:

exclude.txt:

   /home/bagaag/Downloads
   /home/bagaag/.mozilla
   /home/bagaag/.minecraft

include.txt:

   /home/bagaag/.minecraft/saves
   /home/bagaag/.minecraft/screenshots

This lets me back up the saves and screenshots folders in .minecraft but leave the rest of that folder excluded.
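If you only have a couple of rules, the same idea can be written inline. The sketch below is illustrative only: the function name is made up, and the RDIFF_BACKUP variable exists just so the binary can be swapped out. The include is listed before the exclude because rdiff-backup applies the first rule that matches a file.

```shell
# Sketch: the include/exclude rules given directly on the command line.
RDIFF_BACKUP="${RDIFF_BACKUP:-rdiff-backup}"

backup_with_inline_rules() {
    # Keep .minecraft/saves, skip the rest of .minecraft,
    # and back up everything else under the home folder.
    "$RDIFF_BACKUP" \
        --include /home/bagaag/.minecraft/saves \
        --exclude /home/bagaag/.minecraft \
        /home/bagaag /mnt/backup
}

# Usage:
#   backup_with_inline_rules
```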

Run rdiff-backup

I keep my backup commands in a bash script aliased to backup on my system, and I’ve gotten in the habit of running backup before installing updates, which on Arch Linux is nearly daily. My laptop is off when I’m not using it, so this works better for me than a cron job.

Here are the commands in my backup script:

  rdiff-backup --include-filelist /home/bagaag/scripts/include.txt --exclude-filelist /home/bagaag/scripts/exclude.txt /home/bagaag /mnt/backup
  rdiff-backup --remove-older-than 3M /mnt/backup

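In the first command, the include list comes before the exclude list on purpose: rdiff-backup applies the first rule that matches a file, so the includes have to be seen before the broader excludes. The second command removes history in the backup older than 3 months; the current mirror is not affected. If more than one backup session falls outside that window, rdiff-backup will ask you to add --force before it deletes them.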
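Here is one way the whole script could look with a safety check added. This is a sketch, not the exact script behind this post: the run_backup function and the mountpoint check are additions, made on the assumption that /mnt/backup is an s3fs mount and that you would rather abort than write the backup to the local disk when the bucket is not mounted.

```shell
#!/bin/bash
# Sketch of a backup wrapper using the paths from this post.
RDIFF_BACKUP="${RDIFF_BACKUP:-rdiff-backup}"
SRC=/home/bagaag
DEST=/mnt/backup
LISTS=/home/bagaag/scripts

run_backup() {
    # Abort if the bucket is not mounted, so the backup never lands
    # in an ordinary local folder by accident.
    if ! mountpoint -q "$DEST"; then
        echo "backup: $DEST is not a mount point, aborting" >&2
        return 1
    fi
    # Mirror the home folder, includes first, then excludes.
    "$RDIFF_BACKUP" \
        --include-filelist "$LISTS/include.txt" \
        --exclude-filelist "$LISTS/exclude.txt" \
        "$SRC" "$DEST" || return 1
    # Trim history older than 3 months.
    "$RDIFF_BACKUP" --remove-older-than 3M "$DEST"
}

# Usage:
#   run_backup
```

Stopping after a failed backup means the history is only trimmed once a fresh mirror has been written.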

Restoring from backup

You can restore the most recent version of anything by simply copying files from your backup folder. It’s that easy. If you want to go back in time, you can use --restore-as-of and the other restore options rdiff-backup makes available. There are lots of options.
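As a rough illustration of going back in time, the sketch below wraps two rdiff-backup options: --list-increments to see which points in time are available, and --restore-as-of to pull out an older copy. The function names and the Documents path in the usage lines are hypothetical. Newer rdiff-backup releases also offer a subcommand-style interface, so check the man page for your version.

```shell
# Sketch: list available backup times and restore an older copy.
RDIFF_BACKUP="${RDIFF_BACKUP:-rdiff-backup}"

# Show the points in time that can be restored from.
list_backups() {
    "$RDIFF_BACKUP" --list-increments "${1:-/mnt/backup}"
}

# Restore a file or folder as it was at a given time, for example
# 10D for ten days ago. Refuses to touch a target that already exists.
restore_as_of() {
    when="$1"
    source_path="$2"
    target="$3"
    if [ -e "$target" ]; then
        echo "restore: $target already exists" >&2
        return 1
    fi
    "$RDIFF_BACKUP" --restore-as-of "$when" "$source_path" "$target"
}

# Usage (hypothetical paths):
#   list_backups /mnt/backup
#   restore_as_of 10D /mnt/backup/Documents /tmp/Documents-10-days-ago
```

Restoring into a fresh folder, rather than over your live files, lets you compare the old and current versions before replacing anything.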

Check out <rdiff-backup.nongnu.org/examples.html> for a nice walk through the main features of rdiff-backup.