For quite some time now I've been taking a pretty big risk with a lot of files at home. I have a Linux box that runs Samba in order to share files with my Windows machines. These Windows machines map that share as their "S" drive. Once upon a time, this 120GB drive had a twin in the box with it, and everything that was on one drive was automatically placed on the other drive (using software RAID 1). Years ago, that mirror was broken and never restored. So all my data was sitting on one hard drive (this included all of Haylee's pictures, a bunch of other pictures, all my MP3 files, and a few other random files). I could always dig my CDs out and re-rip them to get the MP3 files back, but Haylee's pictures were a little harder to replace. I mitigated this risk by occasionally going in and burning those pictures to a DVD, but that required some manual work, and it left anything created more recently than the last backup open to loss.
A couple of months ago I moved Haylee's picture site off the server in our spare bedroom and onto DreamHost. This helped with the risk to Haylee's pictures because most of them had also been uploaded to that site. But not all of our pictures are on that site, and it didn't do anything for the other types of files on the S drive. To make matters worse, some of the stuff on that S drive isn't really for public consumption. Probably the most personal thing on there is our Quicken backup. While it's not the end of the world if somebody gets ahold of that, it's still not something you generally want "out there." So what I needed was a secure and automated way to back up all my files. I had toyed with the idea of using an old DLT drive I had lying around, but getting that up and running was a lot of work, and the DLT drive isn't exactly quiet. Due to some other recent developments, we moved the computer out of its own room and into the living room, and a loud tape drive really doesn't work in that environment. So I needed something slightly different.
Enter GnuPG, DreamHost’s Personal Backup feature, and some perl scripting magic.
First things first, we need to get GnuPG set up. This allows us to encrypt all the files before they're shipped off to DreamHost. All we need to do there is run
# gpg --gen-key
And answer all the questions. Most of the time, if a default is offered you can take it (though you may want to generate the largest key possible). Also, you'll want to leave the passphrase blank because this key is going to be used as part of an automated process. GnuPG will warn you that this is a bad idea, but there are times when it's necessary. Then, just to test things out, let's try to encrypt a file (I'm using Haylee.jpg as a sample):
# gpg -e -r "Jacob Steenhagen (Backup)" Haylee.jpg
Now when looking at the directory listing we see that Haylee.jpg.gpg has been created with pretty much the exact same file size:
# ll -h
total 3.9M
-rw-r--r-- 1 root root 2.0M 2009-02-05 14:23 Haylee.jpg
-rw-r--r-- 1 root root 2.0M 2009-02-05 14:24 Haylee.jpg.gpg
Now, we delete the original file (it’s just a test copy) and decrypt it with:
# gpg Haylee.jpg.gpg
Then open the file in an Image viewer… sure enough, we were able to cleanly encrypt and decrypt the file.
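As an aside, if you'd rather not click through gpg's questions interactively, GnuPG can also generate keys unattended from a parameter file. The sketch below is an assumption on my part rather than part of my setup: the field names come from GnuPG's batch key generation support, the name and email just mirror the key shown later in this post, and on the 1.4 series simply leaving out a Passphrase: line should give you an unprotected key (newer GnuPG versions want an explicit %no-protection directive instead).

# cat > backup-key.params << 'EOF'
%echo Generating an unattended backup key
Key-Type: DSA
Key-Length: 1024
Subkey-Type: ELG-E
Subkey-Length: 4096
Name-Real: Jacob Steenhagen
Name-Comment: Backup
Name-Email: jacob@steenhagen.us
Expire-Date: 0
%commit
EOF
# gpg --batch --gen-key backup-key.params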
OK, now that we know we have GnuPG working, let's configure DreamHost to accept our backups. Once logged into DreamHost's panel, go to Users -> Backups User. There will be a randomly assigned username there as well as two boxes asking for a password. You can choose a password or have DreamHost generate one for you. We're only using the password during this setup phase, so it's nothing you have to remember. Since the password has to exist even though we won't be using it long term, you may as well let DreamHost generate it; that way it's sufficiently random that it shouldn't be guessable.
Hit the “Activate Backup User!” button and make note of the username and password on the next screen.
Now, on the left side, click the "Backup Users" link again. This will tell you the server you've been assigned to; make note of that too for use in a couple of minutes.
While DreamHost is getting the user set up on their end, let's generate another public/private key pair, this time for when we want to scp files to DreamHost's backup server without being stopped for a password:
# ssh-keygen -t rsa
Go ahead and just press Enter at all three prompts (again, this is going to run unattended, so we don't want a passphrase getting in the way). Now let's prep our authorized_keys file for sending to DreamHost.
# cat .ssh/id_rsa.pub > authorized_keys
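Optionally, you can also drop an entry into root's ~/.ssh/config so the backup username and key get picked up automatically whenever you poke at the server by hand. This is standard OpenSSH configuration, nothing DreamHost-specific; the dh-backup alias and the b123456 username below are placeholders for whatever you were assigned:

# cat >> /root/.ssh/config << 'EOF'
Host dh-backup
    HostName backup.dreamhost.com
    User b123456
    IdentityFile /root/.ssh/id_rsa
EOF

With that in place, "scp Haylee.jpg dh-backup:" does the same thing as the longer test command further down.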
Hopefully by now the backup user is ready. Remember your backup server from above? It’s time to use it.
# ftp backup.dreamhost.com
Now, when it asks for the username and password, use the ones we just created above.
ftp> mkdir .ssh
ftp> cd .ssh
ftp> put authorized_keys
ftp> bye
We've now told SSH on DreamHost's backup server that it should accept a login using the RSA key we just generated. Let's test it out (of course, you need to use the right username):
# scp Haylee.jpg b123456@backup.dreamhost.com:
If it says that it uploaded 100% of the file, it’s a success; we can now forget that ugly password DreamHost gave us. If not, there’s some troubleshooting to be done that’s outside the scope of this document.
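That said, the two things I'd check first (general OpenSSH troubleshooting, not anything specific to DreamHost) are whether the key is actually being offered and whether the uploaded files have the permissions sshd likes. A verbose connection attempt shows the former; the backup server may not hand you an interactive shell, but the authentication messages printed before it disconnects are what you're after:

# ssh -v -i /root/.ssh/id_rsa b123456@backup.dreamhost.com

For the latter, sshd generally insists that .ssh be mode 700 and authorized_keys be mode 600, and files uploaded over FTP don't always land that way. Whether DreamHost's FTP server honors SITE CHMOD is an assumption on my part, but if it does, something like this from a fresh ftp session should tighten things up:

ftp> quote SITE CHMOD 700 .ssh
ftp> quote SITE CHMOD 600 .ssh/authorized_keys
ftp> bye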
OK, we're almost there! We just need the Perl script that's going to encrypt our data and ship it off to DreamHost:
#!/usr/bin/perl
use strict;
use POSIX qw(strftime);

# "config" options
my $bu_root   = "/home/share/smb-shared";                  # directory tree to back up
my $recip_key = 'Jacob Steenhagen (Backup)';               # GnuPG key to encrypt to
my $bu_dest   = 'jsteenhagen@steenhagen.us:backup/share';  # rsync destination (user@host:path)
my $bu_mtime  = "/home/share/backup/mtime";                # where each file's last-seen mtime is remembered
my $bu_gpg    = "/home/share/backup/gpg";                  # where encrypted copies are staged before upload

# Supporting subs.... see below for live code.
sub recurse_dir {
    my ($dir) = @_;
    opendir(DIR, $dir);
    my @files = readdir(DIR);
    closedir(DIR);
    foreach my $file (@files) {
        next if $file eq '.';
        next if $file eq '..';
        next if $file eq '.backup';
        next if $file eq '.gnupg';
        my $rel_file = "$dir/$file";
        $rel_file =~ s/^$bu_root\///;
        if (-d "$dir/$file") {
            if (!-d "$bu_mtime/$rel_file") {
                mkdir "$bu_mtime/$rel_file";
            }
            if (!-d "$bu_gpg/$rel_file") {
                mkdir "$bu_gpg/$rel_file";
            }
            recurse_dir("$dir/$file");
        }
        else {
            # Not a dir, must be a file
            my $old_mtime = "";
            if (-e "$bu_mtime/$rel_file.mtime") {
                open(OLD_MTIME, "$bu_mtime/$rel_file.mtime");
                ($old_mtime) = <OLD_MTIME>;
                close(OLD_MTIME);
            }
            my @file_info = stat("$dir/$file");
            my $new_mtime = $file_info[9];
            if ($old_mtime ne $new_mtime) {
                # Times don't match, file has changed. Encrypt it.
                print "$dir/$file is new or has changed...\n";
                my @gpg_opts = ('--batch', '--encrypt',
                                # '--armor', '-z', '8',
                                '-o', "$bu_gpg/$rel_file.gpg",
                                '--recipient', $recip_key,
                                "$dir/$file");
                unlink "$bu_gpg/$rel_file.gpg";
                system('gpg', @gpg_opts);
                open(NEW_MTIME, ">$bu_mtime/$rel_file.mtime");
                print NEW_MTIME $new_mtime;
                close(NEW_MTIME);
            }
        }
    }
}

### Live code below
print "---- Starting Backup - ";
print strftime("%a %e-%b-%Y at %H:%M:%S", localtime)."\n";
recurse_dir("$bu_root");
my @rsync_opts = ('-av', "$bu_gpg/", $bu_dest);
system('rsync', @rsync_opts);
print "---- Ending Backup - ";
print strftime("%a %e-%b-%Y at %H:%M:%S", localtime)."\n";
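One thing worth noting (my reading of the script, not something it spells out): the two staging directories have to exist before the first run, since recurse_dir only creates subdirectories underneath them:

# mkdir -p /home/share/backup/mtime /home/share/backup/gpg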
The first time you run this script it will take quite a while. It could take days depending on your Internet connection speed and how much data you have. I'd recommend the first run be done manually, either at the console or inside a screen session, so that a broken SSH connection doesn't terminate the process. After that, you can schedule it to run via cron. This can be done either by putting the backup script in /etc/cron.daily or as part of root's crontab. Note that if you put it in /etc/cron.daily you'll need to edit the HOME variable in /etc/crontab or set $ENV{'HOME'} in the Perl script; otherwise GnuPG won't be able to find your keys.
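As a concrete example of the crontab route (the script path, schedule, and log file here are placeholders, not what I actually use), open root's crontab with crontab -e and add a line along these lines to run the backup every night at 2am, with HOME set so GnuPG can find its keyring:

0 2 * * * HOME=/root /usr/local/bin/dreamhost-backup.pl >> /var/log/dreamhost-backup.log 2>&1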
WARNING: You need to back up your GnuPG private key, but you cannot use DreamHost for this. In fact, you don't want to back this key up anywhere that anybody other than you has access. Anybody who has this key will be able to decrypt your files. Conversely, if you lose this key, you won't be able to decrypt your files; they'll just be gigs of useless data. That is, after all, the whole point of encrypting them in the first place. To make a backup of your key, you'll need to run:
# gpg --list-secret-keys
/root/.gnupg/secring.gpg
------------------------
sec   1024D/61A2AF55 2009-01-20
uid                  Jacob Steenhagen (Backup) <jacob@steenhagen.us>
ssb   4096g/90B15060 2009-01-20
See where it says "61A2AF55" in my output? We'll need that for the next command:
# gpg -ao backup-private.key --export-secret-keys 61A2AF55
You should now save backup-private.key. I have three copies of mine. One is on my Linux server being used to encrypt/decrypt my files, the second is safely stored on removable media, and the third is printed out and safely filed away. If I ever get to the point of needing the third it’s going to be painful to enter, but it’s better than not having the option available.
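For completeness, getting the key back into GnuPG on a fresh machine is just an import; this is plain gpg usage, and the file name is simply whatever you called the export above:

# gpg --import backup-private.key
# gpg --list-secret-keys

Once the secret key shows up in that listing, gpg can decrypt the backed-up files again.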
It might be nice to also have a backup of your SSH private key, but it’s not the end of the world if that one gets destroyed. You can always just use DreamHost’s panel to reset your password and log in with that.
This script isn't perfect. It doesn't deal with moves or renames very well; it actually treats them as new files. The rsync portion of the script intentionally doesn't remove locally deleted files, so if you move files around a lot you could end up with a lot of duplicated data on the backup server. You can always log in manually and delete any old files you no longer need, but beyond that the script does nothing to compensate.
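If the duplication ever gets out of hand, one option (my suggestion, not something the script does) is to clear the stale .gpg copies out of the local staging directory and then let rsync mirror those deletions to DreamHost. Running with -n first shows what would be removed before you commit to it:

# rsync -avn --delete /home/share/backup/gpg/ jsteenhagen@steenhagen.us:backup/share
# rsync -av --delete /home/share/backup/gpg/ jsteenhagen@steenhagen.us:backup/share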
For my purposes, this method certainly works well enough. In fact, I’ve already had to use it once to restore my entire S drive! More on that at a later date.
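Until I write that up, here's roughly what a restore could look like under this scheme; this is a sketch rather than the exact commands I used, and restore-gpg is just a scratch directory name. Pull the encrypted tree back down with rsync, then walk it and decrypt each file into place:

# rsync -av jsteenhagen@steenhagen.us:backup/share/ /home/share/restore-gpg/
# cd /home/share/restore-gpg
# find . -name '*.gpg' | while read f; do mkdir -p "/home/share/smb-shared/$(dirname "$f")"; gpg -o "/home/share/smb-shared/${f%.gpg}" -d "$f"; done

(The loop will trip over filenames containing newlines, but for a share full of pictures and MP3s that's unlikely to matter.)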
Very informative post. I lost some pictures of a trip to Hawaii that I would love to get back. Unfortunately, I was in the same situation as you and only had the one hard drive with everything on it :(. My solution was just to buy an external hard drive. I then wrote a batch script that copied files from certain folders over to the external drive. Your method is very good too, though, since my way just backs up certain files.
Man, well done! I’ll need to look into doing this myself for my dreamhosted domains 🙂
I like the idea of using a server as a backup for my files. I have a website hosted on a server (it is wind powered with backup); my site is small and I have a lot of room on it. I was thinking of FTP'ing my home PC files onto it. It makes sense to me. Does this seem like a good proposal to you?
Backups are an essential part of keeping your content safe. I hadn't heard of DreamHost until today and was looking for a guide such as this. Thank you.
Jacob,
Thanks for the work.
I’m trying to run the script, but it gives syntax error at the following line:
($old_mtime) = >old_mtime”
Perl v5.10.0
linux 2.6.26.6-79.fc9.i686
Great Stuff
Really nice stuff.
Oh, I have a site on a US server and they f*cked me up! For two weeks I waited for a miracle and then they said – "We have a problem here" – and no backups! Just for the future – people, be safe and careful.
And it works :) Everyone should try it!
Got errors 🙁 🙁
Hi Jake,
Looks like a cool script. Could you please explain what the variables at the start are for?
my $bu_root = "/home/share/smb-shared"; <— what you want to backup
my $recip_key = 'Jacob Steenhagen (Backup)'; <— GPG key to use
my $bu_dest = 'jsteenhagen@steenhagen.us:backup/share'; <— Destination
my $bu_mtime = "/home/share/backup/mtime"; <— What's this?
my $bu_gpg = "/home/share/backup/gpg"; <— Is this the location of the gpg program?
Backups are essential. Luckily I have never had to use backups at home, but one time a hard drive failed. Luckily it was a RAID setup, so there was no data loss or recovery needed at the time. – s