Using btrfs send and Snapper for fast incremental backups

#1 Post by pylkko »

I have been exploring the use of snapper and btrfs with Debian sid, and wrote things down to remember them better. Then I thought I might as well post my notes on the forum in case someone else can use them or add to them.

Notice: always use the latest kernel with btrfs. It is probably not a good idea to follow anything here on Jessie. Currently RAID5/6 is broken and should not be used, but the patch is in for kernel 4.12 (linux-next), so with 4.12 or later it might be safe again.

Why?

Traditionally, a system backup meant making an image file of the entire disk. This takes a long time and, if done often, takes up a lot of space, which is probably why many people neglected it entirely. With tools like rsync it is possible to copy only the files that have changed since last time. However, rsync needs to crawl the entire system to find the changes (slow), and when it sees a file that was renamed or moved to another path, it removes the corresponding file at the backup location and copies it over again. With btrfs send it is possible to send the difference between two states of a disk, so that only the changed data and metadata are sent. The filesystem already knows what changed, so there is no crawling, and a file moved to another path is not copied again; it is just pointed to in a new way. So, in theory, you could, if you ever wanted to, create a full backup of a 500 GB disk every second.

Automating snapshot creation with Snapper

Installing snapper allows you to make automatic archival timeline snapshots (for example a snapshot every hour) or a pair of snapshots before and after apt usage (every time you purge, install, upgrade or dist-upgrade). That means that you don’t have to make the snapshots manually; it will be done automatically in the background as often as you want. You can also set it to delete them after some specific time has passed. By default, snapper makes read-only (archival) snapshots in the subvolume .snapshots. If you do not want automatic snapshots, you can disable them in the conf and make snapshots manually (with snapper or with btrfs-progs). The snapper command "create" makes a snapshot of the current situation. Snapshots made manually with btrfs-progs need to be made read-only (btrfs subvolume snapshot -r) or they will not be "sendable". Because a btrfs snapshot does not include other subvolumes, when you make a snap of root using Snapper the .snapshots subvolume will not be included. Which is good, because otherwise you would get an infinite regress (a snapshot containing its own snapshot containing its own snapshot, etc.).
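As a rough illustration of the settings described above: the timeline behaviour lives in the snapper config file, which for a config named root is usually /etc/snapper/configs/root. The config name and all values below are assumptions for illustration, not recommendations. The file uses shell variable syntax:

```shell
# Fragment of /etc/snapper/configs/root (illustrative values only)
TIMELINE_CREATE="yes"        # make hourly timeline snapshots; "no" disables them
TIMELINE_CLEANUP="yes"       # let snapper delete old timeline snapshots
TIMELINE_LIMIT_HOURLY="24"   # keep the last 24 hourly snapshots
TIMELINE_LIMIT_DAILY="7"     # keep one snapshot per day for 7 days
TIMELINE_LIMIT_WEEKLY="0"
TIMELINE_LIMIT_MONTHLY="0"
TIMELINE_LIMIT_YEARLY="0"
```

With TIMELINE_CREATE set to "no", a manual snapshot can still be taken with something like snapper -c root create --description "before editing conffiles".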

You can always list snapshots with

Code: Select all

snapper ls
and they can also be removed manually.

Code: Select all

snapper delete {number}
Since you can always mount a snapshot or use it like a directory, you can easily go back to past states of the disk and recover deleted or changed files. You can even use snapper to keep track of what changes happened in which files as we will see later.

Incremental backups using the automatically made snapshots

It is important to realize that a snapshot is only a snapshot and not a backup. By storing many snapshots on your computer, you can always roll back to a previous state, or just revert individual file changes, but the snapshots are located on the same physical disk, and disks can fail. Thankfully you can also use btrfs to send snapshots to external locations (even over the network). Snapper cannot do this alone (yet), but for now you can take the snapshots that snapper has made and send them manually with btrfs-progs. First, prepare the location that will accept the backups: take another drive (e.g. a USB HDD) with a btrfs partition and mount it at your chosen location (in this example /location/of/backup/). Then create a subvolume in that btrfs partition. As root:

Code: Select all

btrfs subvolume create /location/of/backup/.snapshots
I use this particular path because, for simplicity, we want to use the same path name as Snapper uses, but you can choose another name if you wish. Then make a directory for the backup:

Code: Select all

mkdir /location/of/backup/.snapshots/1
Now, send the first snapshot of the drive

Code: Select all

btrfs send /.snapshots/1/snapshot | btrfs receive /location/of/backup/.snapshots/1
This will take some time because it is essentially sending the entire subvolume over (in this example, the subvolume contains the entire root system). Next, make some changes to the system: install or delete packages, or something. Let’s say you install inkscape, and you have a default snapper root configuration. Snapper has now made another snapshot (number 2) for you. (If you don't want to install anything just to test this, you could also touch an empty file called “file” and make a snapshot manually.) You can now send the difference to the USB drive with

Code: Select all

mkdir /location/of/backup/.snapshots/2

Code: Select all

btrfs send -p /.snapshots/1/snapshot /.snapshots/2/snapshot | btrfs receive /location/of/backup/.snapshots/2
This should take a few seconds as it only sends the difference between 1 and 2 to the USB drive.
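The two sends above can be wrapped in a small helper. This is only a sketch, not something that Snapper or btrfs-progs ships: the function names send_snapshot and run, and the SRC, DEST and DRY_RUN variables, are made up for this example, and the paths are assumed to contain no spaces. With DRY_RUN=1 it only prints what it would run.

```shell
#!/bin/sh
# Sketch: send Snapper snapshot NUM to the backup disk, incrementally if a
# parent number is given. SRC and DEST mirror the paths used in this post.
SRC="${SRC:-/.snapshots}"
DEST="${DEST:-/location/of/backup/.snapshots}"

# run CMD: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$1"
    else
        sh -c "$1"
    fi
}

# send_snapshot NUM [PARENT]
send_snapshot() {
    num="$1"
    parent="${2:-}"
    run "mkdir -p $DEST/$num" || return 1
    if [ -n "$parent" ]; then
        # incremental: only the difference between PARENT and NUM is sent
        run "btrfs send -p $SRC/$parent/snapshot $SRC/$num/snapshot | btrfs receive $DEST/$num"
    else
        # full send of the whole subvolume
        run "btrfs send $SRC/$num/snapshot | btrfs receive $DEST/$num"
    fi
}
```

For example, DRY_RUN=1 send_snapshot 2 1 prints the mkdir line and the incremental btrfs send line for snapshot 2 with snapshot 1 as parent; leave out the second argument for a full send. Run it as root without DRY_RUN to actually send.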

After this you can navigate to the USB drive’s path .snapshots/1/snapshot/ and you will see a copy of your entire root subvolume. The one in .snapshots/1/snapshot has the root system as it was before the install of inkscape. The one in .snapshots/2/snapshot has the same root subvolume after the install. These two snapshots will take up approximately as much space as one (and not two), since the data they share is stored only once, plus the difference data (which, added on top of snapshot 1, leads to snapshot 2). If you now delete the first snapshot, you will still have the entire disk on the USB drive, but only the later state (with inkscape, in our example) and not the first state without it. You can send the difference between any two snapshots, as long as the parent snapshot still exists on both the source and the backup drive. So, for example, you can have your computer make automatic snapshots every hour for 500 cycles, store snapshot number 1 on the USB drive, later send snapshot number 500, and have your machine automatically delete the 498 in between. That way you always have rollback options at hand while also having full backups at a lower frequency.
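To pick the parent for the next incremental send, you need the newest snapshot that exists on both sides. A minimal sketch, assuming the numbered Snapper layout used in this post (NUMBER/snapshot on both the source and the backup). The function name latest_common is made up for this example, and it only checks that the directories exist, not that an earlier receive finished cleanly:

```shell
# latest_common SRC DEST
# Print the highest snapshot number that has a "snapshot" entry under both
# SRC/<n>/ and DEST/<n>/. Returns non-zero if there is none.
latest_common() {
    src="$1"
    dest="$2"
    best=""
    for d in "$src"/*/; do
        n="$(basename "$d")"
        # skip anything that is not a plain number (also an unmatched glob)
        case "$n" in
            ''|*[!0-9]*) continue ;;
        esac
        if [ -d "$src/$n/snapshot" ] && [ -d "$dest/$n/snapshot" ]; then
            # compare numerically, so that 10 counts as newer than 2
            if [ -z "$best" ] || [ "$n" -gt "$best" ]; then
                best="$n"
            fi
        fi
    done
    [ -n "$best" ] && echo "$best"
}
```

Something like latest_common /.snapshots /location/of/backup/.snapshots would then print the number to use as the -p parent.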

You will notice that these backed-up subvolumes do not contain .snapshots themselves, because a snapshot does not include other subvolumes. This also means that for any part of the drive that you do not want to back up (for example var or tmp), you should create a subvolume of its own.

Because the differences often use so little space (presuming that you do not uninstall half of your system and reinstall it between every snapshot), you could easily make a full system backup every week and store one year's worth on a small USB flash stick.

Seeing diffs

Snapper allows you to see differences between files in two snapshots. So, let’s say you have two snapshots and you forgot why. You can issue:

Code: Select all

snapper status 1..2
From this you will notice that some files have changed (they will be listed with a little "c" in front of them). In my case I can see that one file that has changed is dpkg.log

Code: Select all

snapper diff 1..2 /var/log/dpkg.log
and it will display a difference for that file in a way that is familiar to us from for example git:

Code: Select all

+++ /.snapshots/2/snapshot/var/log/dpkg.log	2016-11-07 13:06:29.368868818 +0200
@@ -4536,3 +4536,51 @@
 2016-11-07 11:05:45 trigproc libc-bin:amd64 2.24-5 <none>
 2016-11-07 11:05:45 status half-configured libc-bin:amd64 2.24-5
 2016-11-07 11:05:46 status installed libc-bin:amd64 2.24-5
+2016-11-07 13:06:09 startup archives unpack
+2016-11-07 13:06:09 install libpoppler64:amd64 <none> 0.48.0-2
+2016-11-07 13:06:09 status triggers-pending libc-bin:amd64 2.24-5
+2016-11-07 13:06:09 status half-installed libpoppler64:amd64 0.48.0-2
+2016-11-07 13:06:10 status unpacked libpoppler64:amd64 0.48.0-2
+2016-11-07 13:06:10 status unpacked libpoppler64:amd64 0.48.0-2
+2016-11-07 13:06:11 upgrade inkscape:amd64 0.91-11+b1 0.91-11+b2
+2016-11-07 13:06:11 status half-configured inkscape:amd64 0.91-11+b1
+2016-11-07 13:06:11 status unpacked inkscape:amd64 0.91-11+b1
+2016-11-07 13:06:11 status half-installed inkscape:amd64 0.91-11+b1

...
+2016-11-07 13:06:28 configure inkscape:amd64 0.91-11+b2 <none>
+2016-11-07 13:06:28 status unpacked inkscape:amd64 0.91-11+b2
+2016-11-07 13:06:28 status half-configured inkscape:amd64 0.91-11+b2
+2016-11-07 13:06:29 status installed inkscape:amd64 0.91-11+b2
We can see that the log file has changed, and from the added lines marked with a plus sign we can see that inkscape was installed between the two snapshots. If you don’t specify a file, you get the difference for every single file, so the output can be gigantic. This is really useful for "bisecting" a problem. If you make a pre-snapshot before mucking around with, let's say, video conffiles and you later notice that something is not right, you can go back and see which files you changed and what content inside these files you changed.
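When the status listing is long, it can help to filter it before running diff on individual files. A small sketch, assuming the status format described in man snapper (a short status string whose first character is c for changed content, + for created or - for deleted, followed by a space and the path). The function name changed_files is made up for this example:

```shell
# changed_files: read "snapper status" output on stdin and print only the
# paths whose content changed (status string starts with "c").
changed_files() {
    sed -n 's/^c[^ ]* //p'
}
```

It would be used as snapper status 1..2 | changed_files.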

Undoing individual changes or recovering an entire snapshot

A snapshot can be traversed like a path and mounted like a drive. So, if you need only one file that you accidentally deleted you can just open the snap and copy it back. You can also list changes and undo them. So with the command "diff" you see what changed and then you issue:

Code: Select all

snapper -v undochange 1..2 /some/file
Sometimes you might want to recover much more, like the entire system. You can always boot into a Snapper-made snapshot by specifying it in the bootloader (if you have one, or otherwise in EFI). Basically, you go to grub and enter

Code: Select all

rootflags=subvol=.snapshots/1/snapshot
on the kernel boot line, and you are booted into your root system without Inkscape. You can also boot into an old snapshot and make it the default one again.
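After booting this way it is worth checking which subvolume actually ended up mounted as /. A minimal sketch that pulls the subvol= entry out of a mount option string. The function name subvol_from_opts is made up for this example, and feeding it from findmnt (part of util-linux) is an assumption about your system:

```shell
# subvol_from_opts: read a comma-separated mount option string on stdin
# and print the value of its subvol= option (subvolid= is ignored).
subvol_from_opts() {
    tr ',' '\n' | sed -n 's/^subvol=//p'
}
```

Used as findmnt -no OPTIONS / | subvol_from_opts, it should print the snapshot path you booted into. The default subvolume itself can be inspected with btrfs subvolume get-default /.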

These two strategies have downsides. One is that Snapper snapshots (and ones that can be “sent”) are read-only. This is a small problem, since you can manually change a snapshot back to rw (read/write). Another is that once the default subvolume has changed, the bootloader is no longer dealing with subvol=0 for / but with subvolid=0/.snapshots/2/snapshot. This means that a part of the btrfs tree is above your mount point and is now inaccessible. This may not be a problem, but in theory you will be wasting (a small amount of) disk space. Thirdly, when you bring the snapshot back, the subvolume .snapshots will be just an empty “path” and you cannot keep on using snapper as you did before without recreating the subvolume. So the methods for mounting a snapshot as root discussed above can probably be said to work for an emergency situation where you need to go back to how things were right away.

But in order to fully get back to where you were, you would need to boot into the old snapshot (number 1, without Inkscape, in our example) and do

Code: Select all

snapper rollback
This will 1) create a read-only (archival) snapshot of the current default subvolume (the one with Inkscape, which you do not want), 2) create a read-write snapshot of the currently mounted snapshot (or of any number that you give to the command, see man snapper) and 3) set the btrfs default subvolume to that new read-write snapshot. If you use fstab you might need to adjust it because of the subvolid=0/.snapshots/2/snapshot (although I believe that systemd will find and mount everything without fstab nowadays, and this depends on what btrfs layout you used). At this point, if you want to continue to use Snapper you will need to remove the .snapshots subvolume and set it up again.

Another solution is to manually copy/snapshot the old snapshot onto the new one. A snapshot taken of a read-only btrfs snapshot is read-write (unless you pass -r). Just mount the good snapshot and btrfs snapshot or rsync everything back to the real root. See this post for details on that: https://bbs.archlinux.org/viewtopic.php?id=194491 This is also what the btrfs wiki on kernel.org suggests doing when recovering. They just delete the subvolume (they don't even make a backup of it) and then snapshot from the recovery snapshot without even mounting it, like so:

Code: Select all

# btrfs subvolume delete /root/btrfs-top-lvl/home
# btrfs subvolume snapshot /root/btrfs-top-lvl/snapshots/home/2015-01-01 /root/btrfs-top-lvl/home
source: https://btrfs.wiki.kernel.org/index.php ... _Snapshots

This method is perhaps a bit slower in bringing the system back to its original state, but it does it more thoroughly and I think that in a way it is conceptually much simpler.

Yet another way is to simply make a new subvolume on the drive (keep it empty, as a sort of placeholder) and, if all goes bad and you need to recover from a USB snapshot, snapshot the backed-up one to that placeholder, edit fstab and boot it. Since the USB drive is a separate filesystem, in practice this means bringing the snapshot back with btrfs send and btrfs receive first. This "just works". It is always, of course, possible to copy or move files from the backup, so in case the hard drive physically breaks entirely, it is possible to create a new partition of any type and move the files from the read-only snapshot on the USB storage to it, just as from any other drive/partition.
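A rough sketch of such a last-resort restore, printed as a plan rather than executed. Everything named here is an assumption for illustration: TOP is a hypothetical mount point of the internal disk's top-level btrfs subvolume, restore is a directory created there to receive into, and root-restored is the name chosen for the new writable root. Because the USB drive is a separate filesystem, the backup has to come back through btrfs send and btrfs receive before it can be snapshotted:

```shell
# restore_plan NUM: print the commands that would bring backup snapshot NUM
# back onto the internal disk and turn it into a writable root subvolume.
BACKUP="${BACKUP:-/location/of/backup/.snapshots}"
TOP="${TOP:-/mnt/btrfs-top}"   # hypothetical mount of the top-level subvolume

restore_plan() {
    num="$1"
    echo "mkdir -p $TOP/restore"
    # the received read-only subvolume keeps its source name, "snapshot"
    echo "btrfs send $BACKUP/$num/snapshot | btrfs receive $TOP/restore"
    # a snapshot of a read-only snapshot is writable unless -r is given
    echo "btrfs subvolume snapshot $TOP/restore/snapshot $TOP/root-restored"
}
```

Review the output of restore_plan 2, run the lines as root, and then point fstab (or rootflags=subvol=) at root-restored.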

These are but some of the advantages of btrfs. Other things it allows for include software RAID/auto healing/storage pools.

Much credit belongs to Head_on_a_Stick for helping me get a handle on btrfs.

EDIT added after Debian 9.0 was released:
It is also worth mentioning that there is now also a tool in Debian called btrbk (0.24) (https://github.com/digint/btrbk) which can do automatic incremental snapshots to a USB disk. I personally just don't see the point in automating that part of the process, whereas automating snapshot creation on the system in use does seem worthwhile. Sending the archival snapshots to the external disk is something that I do more rarely, and I want to check manually that the backup works anyway. But perhaps someone else will find use for that tool.
