On Thu, Jul 24, 2003 at 04:10:00PM -0500, Adam Maloney wrote:
> I believe Carl is "the-man" on this subject, but I'll put in $.02

since I heard someone taking my name in vain, I suppose I ought to throw
in my opinion.

> The cost of your back-up solution should be reflective of the monetary
> value of the data.

first, most important rule, right there.
there are times that it's worth building a whole replicated datacenter
connected via private fiber and fiber-channel repeaters. some of the
companies who had offices in the World Trade Center are probably glad
they had something like that. a whole lot of them sure wish they had.
needless to say, if all you're backing up is your blog on your co-lo'ed
webserver, something less drastic is in order. :)

I don't know what kinds of data you're talking about, but keep in mind
that a lot of services are easily replicable; DNS and SMTP have failover
built into the protocol, and it's advantageous to have a DNS and a mail
server somewhere offsite.
I believe AFS has replication/failover built into it, but I could be
wrong. (Amy?) In any case, AFS is more trouble than most people want to
deal with. :)

> 70Gb burned to CD? Ick.

I once looked at the economics of an automated backup solution using a
CD or DVD autoloader. aside from the cost of the burner itself (not too
many $K), the cost of media ends up making it more expensive than tape
before too long. Tape is fast and reusable; CD-Rs are not. CD-RWs are
even slower, and another problem is the *huge* stack of CDs that you'll
need to back up your data. storing those things costs you money too.
DVDs hold more data, but they are marginally more expensive per byte.
70GB / 4.7GB (per DVD) = 15 discs.
looks like DVDs are down to less than $1/disc
(http://store.yahoo.com/blankcdcdr/dvdr-media-dvd-r.html), so I guess
the economics have changed a bit since I last looked; but even so,
spending $15 (plus the amortized cost of a $3000 DVD autoloader) per
backup is not something you'd want to do every night. I don't know how
long it would take to burn those 15 DVDs either, but I'm sure good tape
drives would be notably faster.
it's not a bad idea for occasional, long-term permanent storage tho.
(look at www.mondorescue.com).

> Also, transferring 70Gb to your off-site location might take awhile.
> Over a T-1 it will take more than 100 hours (70,000MByte = 560,000 MBit /
> 1.5 MBit = 373,333 sec = 103h).

this is why some sort of differential backup is a worthwhile thing.
I've built workable systems with rsync scripts, which only require one
full transfer of the data to the backup server (much like Nate described
in his post), and ever after (at least in theory) only need to transfer
the files that changed that night.

there are a couple of good pre-built systems that do this better than
what I've cobbled together. I took a good look at this one:
http://www.stearns.org/rsync-backup/
and found it's pretty good. it's client-side-initiated, so it would be
very good for backing up laptops and other occasionally-connected
devices. it makes a nice live filesystem that you can browse, and you
can even browse previous days' backups as a live filesystem (it uses
hardlinks to avoid replicating identical files).
some people didn't like it, because they believed that allowing the
clients to initiate the backups made the security weaker. it uses a
chroot'ed jail for each client's backup process tho; and in a lot of
ways I'd rather have the backup server exposed to a limited number of
clients than try to secure remote-initiation access to a large number
of clients.
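
to make the hardlink trick concrete, here's a rough Python sketch of the general idea. this is *not* how rsync-backup itself is written; the function name `snapshot` and the directory layout are made up for illustration, and it assumes the target directory is new, lives on the same filesystem as the previous snapshot, and that you don't care about symlinks or ownership. files that are identical to yesterday's copy get hardlinked instead of stored again, so every day looks like a full, browsable tree but only changed files cost disk space.

```python
import filecmp
import os
import shutil
from typing import Optional


def snapshot(source: str, previous: Optional[str], target: str) -> int:
    """Build a full-looking copy of `source` in `target`.

    Files identical to the same path under `previous` are hardlinked
    rather than copied. Returns the number of files actually copied.
    Hypothetical helper for illustration; `target` must not exist yet.
    """
    copied = 0
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        out_dir = os.path.normpath(os.path.join(target, rel))
        os.makedirs(out_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(out_dir, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                # unchanged since the last snapshot: share the same inode
                os.link(old, dst)
            else:
                # new or changed: store a fresh copy (with timestamps)
                shutil.copy2(src, dst)
                copied += 1
    return copied
```

you'd call it once per night, something like `snapshot("/home", "/backup/2003-07-23", "/backup/2003-07-24")`. note that this sketch only saves disk space on the backup server; it does nothing about network transfer, which is the part rsync handles. rsync's own `--link-dest` option gives you the same hardlinked-snapshot effect without writing any of this yourself.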

I haven't tried these yet:
http://rdiff-backup.stanford.edu/
http://stitch.bentlogic.net/
but they look pretty good. I've heard good things about rdiff-backup.

> DLT4 can do 35Gb raw/70Gb compressed on 1 tape. Tapes are about $60-$70
> each (last I bought them anyways). I think you can get DLT4 drives for
> under $1,000 now.

don't buy DLT. buy AIT.
AIT is *amazingly* fast to search, because it keeps an index of
filemarks in an NVRAM chip in the tape cartridge. this is
OS-independent, and makes your restores blazing fast. (which is handy
when the CEO deletes his spreadsheet by accident and wants it back 5
minutes ago, instead of 5 hours from now).
also, AIT uses spinning (helical-scan) read/write heads, so the tape
doesn't have to move as fast, which makes 'backhitching' or
'shoeshining' less of a problem, and is less wear on the tape.
last I knew, cost was comparable to DLT, but that might have changed.

> > 1. copy some files nightly to a central server (that is out of the
> > datacenter, but in the same building :) ) and burn them to cd every now
> > and then. Its about 70 gigs of data right now.

this is something like what I've done for one client in the past. it's a
good and workable scheme. just keep in mind (and I think you have it)
that you need *historical* backups as well as replication.
you can have differential historical backups on disk (like rsync-backup
uses); but if you want to take it offsite, something more durable than a
disk is desirable. that's what tape is still good for (still the
cheapest alternative for short-term reliable offsite backup).
then again, if you only do offsite backups once a week, and want them
for archival purposes, it may be worthwhile to get a DVD autoloader and
just burn yourself a stack of DVDs.

> > 2. Put tapes on each machine, get lots of tapes.

this is really expensive, considering how much tape drives cost relative
to the price of a computer now. it's very convenient tho.
possibly worthwhile for centralized servers at remote (netwise)
locations.

> >
> > 3. Get a nicer tapedrive that can backup several machines on one tape

considering the rate at which disk drives are growing (which makes
people sloppy about what they put on disk, which means the drives fill
up), this is becoming less and less viable.

> >
> > are there other options that we should look at?

I think rewriteable optical media will be the future of backups, but I
don't know if the big backup tool vendors are adding that capability
into their systems. I think we'll need the next generation of media
(50-90GB discs) before it becomes really viable for smaller operations.
certainly Plasmon is doing it right now, but their solutions are very
expensive. (albeit very fast and reliable, and with write-once media,
largely tamperproof, which has its advantages in some businesses).

Carl Soderstrom.
--
Systems Administrator (and sometimes backup administrator)
Real-Time Enterprises
www.real-time.com

_______________________________________________
TCLUG Mailing List - Minneapolis/St. Paul, Minnesota
http://www.mn-linux.org tclug-list at mn-linux.org
https://mailman.real-time.com/mailman/listinfo/tclug-list