Backup roundup (part 1)

No doubt we all know people who have lost data without having made backups — in my years working at a large university, I’ve seen everything from a floppy disk apparently containing the sole copy of someone’s thesis to a hard drive containing gigabytes of course materials and email. For the first case, the answer is obvious: simply keeping a second copy on another disk, or on the network, would have avoided the problem entirely. For entire hard drives, the problem is far harder: even burning backups to DVD becomes cumbersome at that volume of data.

With gigabytes of data from digital photography, plus gigabytes more of stored email, version-controlled source code and various forms of content I work with for clients, data protection is a big challenge for me, and no doubt for many others in similar jobs.

A good starting point for hard drives is to move to a RAID array. My primary home storage is a RAID 5 array; my new office desktop uses RAID 1. This protects me from data loss from the failure of any single hard drive, but gives no protection against file system bugs or accidental (or malicious) deletion. A more subtle vulnerability also exists: if a disk returns corrupted data, as opposed to returning an error, most implementations will accept the data without detecting any problem.
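Both of those properties fall out of how parity works. A minimal sketch (block contents and function names are mine, purely illustrative): RAID 5 stores an XOR parity block, so any one missing block can be rebuilt from the rest — but a block that comes back *wrong* rather than *missing* is simply passed through.

```python
def parity(blocks):
    """XOR all blocks together byte-by-byte to form the parity block."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            result[i] ^= b
    return bytes(result)

# Three data "drives" plus one parity drive.
data = [b"AAAA", b"BBBB", b"CCCC"]
p = parity(data)

# Simulate losing drive 1 entirely: rebuild it from the survivors + parity.
recovered = parity([data[0], data[2], p])
assert recovered == data[1]

# But if drive 1 silently returns corrupted bytes instead of failing,
# an ordinary read of that block alone has nothing to check against:
# the bad data is handed straight to the caller.
corrupted = b"BXBB"
assert corrupted != data[1]  # yet nothing in the read path would notice
```

The reconstruction only ever runs when a drive *reports* a failure, which is why silent corruption slips through.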

Sun’s ZFS and Network Appliance’s WAFL add checksums to detect disks returning corrupted data (as opposed to failing to return data, which is all a conventional parity-based RAID implementation will address). Both also offer snapshots, protecting against file deletion or alteration — but, as Joyent discovered with a week-long outage in early 2008, they offer no protection against file system bugs.
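The checksum idea is simple enough to sketch. This is a toy model, not ZFS’s actual on-disk format (all names here are hypothetical): store a checksum alongside each block and verify it on every read, so a disk that returns wrong bytes is caught instead of trusted.

```python
import hashlib

store = {}  # block_id -> (data, checksum)

def write_block(block_id, data):
    """Record the block together with a checksum of its contents."""
    store[block_id] = (data, hashlib.sha256(data).hexdigest())

def read_block(block_id):
    """Re-verify the checksum on every read; refuse to return bad data."""
    data, checksum = store[block_id]
    if hashlib.sha256(data).hexdigest() != checksum:
        raise IOError(f"checksum mismatch on block {block_id}")
    return data

write_block(0, b"important data")
assert read_block(0) == b"important data"

# Simulate silent corruption: the bytes change, the checksum does not.
_, checksum = store[0]
store[0] = (b"imp0rtant data", checksum)
try:
    read_block(0)
    detected = False
except IOError:
    detected = True
assert detected  # the bad read is caught, not silently accepted
```

In a real implementation the checksum lives with the block *pointer* rather than the block itself, so a corrupt block can’t vouch for its own integrity — but the read-time verification shown here is the core of it.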

At work, I was able to implement a network-based backup system for individual desktop machines: one Linux server with a large RAID 6 array running BackupPC which makes daily backups of all the client machines, using file level de-duplication and compression to pack backups of dozens of machines into a few cheap SATA hard drives. This has already saved several people’s bacon, pulling gigabytes of deleted files back in minutes. (ReiserFS issues have caused some downtime, so I have just switched to using ext3 instead.)
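File-level de-duplication is what makes packing dozens of machines onto a few drives feasible: identical files — the same OS, the same shared documents, the same file in successive nightly runs — are stored once. A minimal sketch of the idea (a content-addressed pool; the names are mine, and BackupPC’s actual pool layout differs):

```python
import hashlib

pool = {}      # content hash -> file data, stored exactly once
backups = {}   # (machine, path) -> content hash

def back_up(machine, path, data):
    """Store the file's contents in the pool only if they are new."""
    digest = hashlib.sha256(data).hexdigest()
    pool.setdefault(digest, data)        # skip the write if already pooled
    backups[(machine, path)] = digest    # always record the reference

# The same notes backed up from two different desktops...
back_up("desk1", "/home/a/notes.txt", b"lecture notes")
back_up("desk2", "/home/b/notes.txt", b"lecture notes")

# ...occupy the pool only once, while both backups remain restorable.
assert len(pool) == 1
assert len(backups) == 2
```

BackupPC implements the pool with hard links on the server’s file system, which is also why the choice of file system (ReiserFS vs ext3) matters so much to its reliability.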

For typical (non-server) environments, there are two clear options: traditional backups to an extra local disk or optical media, or online backup services such as Mozy or Carbonite. Apple’s Time Machine is perhaps the most high-profile example of the former, as well as being the most user-friendly. I’ve been using all three of these products in a variety of settings for a while now, encountering the good and bad points of each. Over the next three posts, I intend to examine each in turn.

