These are archived pages, most of them dating back to 2007-2012. This content might not be relevant or accurate anymore.

Encrypted backup with rsync (eCryptfs, LUKS)

⚠️ Do not use eCryptfs! ⚠️

  1. It’s slow.
  2. It’s really slow (compared to the LUKS default).
  3. Really slow. It limited data writes to 10 MiB/s on an Intel Atom.
  4. Filename encryption imposes a limit on filename length (I believe the limit is over 100 characters, but I managed to trigger it).

It is good enough to use, but even an empty or small rsync update can take much longer (I am not sure where the bottleneck is). Also, dm-crypt is supposed to parallelize encryption since Linux 2.6.38. From what I’ve seen, eCryptfs can be more than 3 times slower, and LUKS could probably do even better than my numbers since I am using a slow drive. Go for LUKS.
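To sanity-check how fast your CPU can encrypt at all, newer cryptsetup releases (1.6.0 and later, so likely newer than what this setup used) ship a built-in benchmark. A minimal sketch; it measures in-memory crypto throughput only, so it shows the CPU ceiling rather than end-to-end disk speed:

# benchmark the kernel crypto API in memory (no disk I/O involved)
cryptsetup benchmark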

So, I have been able to push empty-update times for a large tree (20G) down to a few minutes on this CPU:

model name	: Intel(R) Atom(TM) CPU  330   @ 1.60GHz
cpu MHz		: 1599.800
bogomips	: 3199.94

And these drives (the LUKS one is the slower of the two):

# LUKS

hdparm -Tt /dev/sdb
 Timing cached reads:   1012 MB in  2.00 seconds = 505.87 MB/sec
 Timing buffered disk reads: 116 MB in  3.03 seconds =  38.31 MB/sec

# eCryptfs

hdparm -Tt /dev/sda
 Timing cached reads:   1306 MB in  2.00 seconds = 652.71 MB/sec
 Timing buffered disk reads: 258 MB in  3.02 seconds =  85.44 MB/sec

LUKS setup

# turn off swap so we don't leak keys
swapoff -a

# make sure we have loaded kernel modules
modprobe dm_mod

# format partition to ext4, type "YES"
luksformat -t ext4 /dev/<device>
cryptsetup luksOpen /dev/<device> backup
cryptsetup status backup
#  /dev/mapper/backup is active:
#    cipher:  aes-cbc-essiv:sha256

mount -t ext4 /dev/mapper/backup /backup
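luksformat is a small helper script (shipped with Debian-based cryptsetup packages, as far as I remember); if it is not available, a rough equivalent using cryptsetup and mkfs directly would be (a sketch, <device> is a placeholder as above):

# create the LUKS container (confirm with uppercase "YES"), open it and make the filesystem
cryptsetup luksFormat /dev/<device>
cryptsetup luksOpen /dev/<device> backup
mkfs.ext4 /dev/mapper/backup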

And we’re good to go.
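When the backup is finished, the setup is reversed roughly like this to lock the volume again:

# unmount, close the LUKS mapping and re-enable swap
umount /backup
cryptsetup luksClose backup
swapon -a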

Optimizing large file backup

If we don’t care about the amount of data sent and we have a fast network (like Gbit Ethernet), then we are better off sending large (binary) files whole: rsync -aPW (-W is --whole-file, which skips the delta-transfer algorithm). We need to build a list of those files first (a 100M threshold is quite reasonable according to my tests):

Example:

DEST=user@host:/path/to/backup/
BASENAME=/home/user

# rsync -W for big files -- more efficient on gbit
umask 077
FILES=/tmp/backup-100M
find "$BASENAME/" -size +100M | sed "s%^$BASENAME%%" >| "$FILES"

rsync -W --stats --super --rsh="ssh -c arcfour" -aHPv --numeric-ids --delete --files-from="$FILES" --exclude-from=/home/user/backup-exclude ~user/ --delete-excluded $DEST/home/
rm -f "$FILES"

Then, as step two, we do the full backup. You might need to tweak the excludes to work with both commands.

rsync --stats --super --rsh="ssh -c arcfour" -aHPv --numeric-ids --block-size=131072 --no-i-r --delete --exclude-from=/home/user/backup-exclude --delete-excluded ~user/ $DEST/home/
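Both passes can be glued together into one small script; a sketch, assuming the same paths and exclude file as above:

#!/bin/sh
# two-pass backup: whole-file copy of big files first, then a normal delta run
set -e
DEST=user@host:/path/to/backup/
BASENAME=/home/user
EXCLUDES=/home/user/backup-exclude

umask 077
FILES=$(mktemp /tmp/backup-100M.XXXXXX)
find "$BASENAME/" -size +100M | sed "s%^$BASENAME%%" > "$FILES"

# pass 1: files over 100M, sent whole (-W)
rsync -W --stats --super --rsh="ssh -c arcfour" -aHPv --numeric-ids --delete \
	--files-from="$FILES" --exclude-from="$EXCLUDES" --delete-excluded "$BASENAME/" "$DEST/home/"

# pass 2: the full tree with the usual delta transfer
rsync --stats --super --rsh="ssh -c arcfour" -aHPv --numeric-ids --block-size=131072 \
	--no-i-r --delete --exclude-from="$EXCLUDES" --delete-excluded "$BASENAME/" "$DEST/home/"

rm -f "$FILES"

Running the big-file pass first means the second pass finds those files already up to date (size and mtime match) and only has to deal with the rest.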

Large file benchmark

rsync does a full MD5 check after an update, so something like this is always executed:

md5sum 4G  23.35s user 6.08s system 61% cpu 47.809 total

Here are some tests (without filesystem encryption) for large file transfers. Note that I am only changing the first byte in a file full of zeroes.

# over SSH with arcfour, throughput peaks at over 30M/s

# create test file
dd if=/dev/zero of=100M bs=100M count=1

time rsync 100M destination:~/
#rsync 100M destination:~/  2.09s user 1.03s system 81% cpu 3.828 total

printf '\x01' | dd conv=notrunc bs=1 count=1 of=100M
time rsync 100M destination:~/
#rsync 100M destination:~/  1.21s user 0.03s system 40% cpu 3.085 total

The same pattern for the 500M and 4G files (first line is the initial copy, second is the resend after a one-byte change):

#rsync 500M destination:~/  8.48s user 3.57s system 85% cpu 14.178 total
#rsync 500M destination:~/  3.36s user 0.16s system 17% cpu 19.917 total

#rsync 4G destination:~/  70.20s user 31.31s system 79% cpu 2:07.15 total
#rsync 4G destination:~/  27.67s user 1.46s system 14% cpu 3:25.80 total