Offsite Disaster Backups
Posted: November 8, 2010
To: IT Community
Subject: RE: [it] Offsite disaster backups
We have tested full datacenter recovery more than three times this year, and I am confident in, and very happy with, the system I am about to describe. Personnel who perform the recovery are given the backup media, the recovery system, and a checklist. For testing purposes, the personnel who perform the backups do not perform the recovery. For our next DR test, I expect someone outside of the IT department to perform the datacenter recovery. I tried doing this with BUE (Symantec Backup Exec) and failed miserably every time.
I don’t trust tape; I’ve been burned too many times. We scrapped BUE tapes and went to robocopy, FTP (NISC’s tarballs), and VSS/ntbackup scripts writing to a networked PC with removable 1.5TB hard drives. CDW sells these massive drives for $95 each. The hard drives are rotated to an offsite safe. Checklists, OS media, and everything one would need to recover all business systems are stored on this single drive.
I did something similar at another site that I am unable to visit routinely. I set up the same style of scripted backup server described above. Instead of relying on manually moving drives to a remote location, I subscribed to backup services with MozyOnline. This service offers unlimited storage/backup for a single PC. The company data is scripted to be backed up to this backup server (D2D), and that backup data is then streamed to Mozy’s offsite servers.
Offsite network backup – about $50 a year
Manual – 1.5TB disk – http://www.cdw.com/shop/products/default.aspx?EDC=1944172
Consider the RTO if going with Mozy or another internet-based backup solution. The speed of recovery depends on how fast your Internet connection is. With hard drives and other physical media, recovery is as fast as the machine can copy files (under 5 hours for a full datacenter recovery here).
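To put rough numbers on that RTO comparison, here is a back-of-the-envelope sketch in Python (illustrative only – the 80% link-efficiency factor, the 10 Mbit/s line, and the 80 MB/s local-disk figure are my assumptions, not measurements from this setup):

```python
def restore_hours(data_gb, mbit_per_s, efficiency=0.8):
    """Rough restore time for data_gb gigabytes over a link running at
    mbit_per_s, assuming only `efficiency` of the line rate is achieved."""
    bits = data_gb * 8 * 1000**3            # decimal GB -> bits
    seconds = bits / (mbit_per_s * 1e6 * efficiency)
    return seconds / 3600

# ~1 TB pulled back from an online service over a 10 Mbit/s line:
print(round(restore_hours(1000, 10), 1))        # 277.8 hours, i.e. weeks
# The same 1 TB copied from a local SATA drive at ~80 MB/s sustained:
print(round(restore_hours(1000, 80 * 8), 1))    # 4.3 hours
```

That difference is the whole argument for keeping a physical drive in the rotation even if you also stream to an online service.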
The purpose of this post is to answer several requests for information about how I created this backup system. After reading this, you should have enough know-how to physically set up a backup machine with removable hard drives and point your Windows servers to back up to it. If you don’t want to pull your drives and move them offsite, don’t worry: I will also describe two inexpensive methods for taking your backups offsite using online PC backup services.
This stuff is relatively easy compared to setting up something like BackupExec, and it is very simple to understand. After being burned by tape and all of the nifty software that goes along with it, I decided to go native: your operating system or application usually has everything you need to perform a backup and restore without special software or support. Using OS scheduling and some scripting, I was able to work around some idiosyncrasies in how the native ntbackup utility handles log files – something I think is absolutely necessary to review when performing a backup and restore.
What is really attractive to me about the native backup and restore utilities is that any plain-Jane systems administrator worth their salt will be able to restore from backup. Setting aside the underwhelming interface and the confusing mouse-click rituals one must go through to run a backup, BUE and its derivatives are very expensive – they even charge extra to back up to a network share! Symantec (formerly Seagate) BUE and the like were marketed when tape was king. Back in the day it was difficult to keep track of all the tapes floating around, and a centralized tape database was supposed to make life easier; in doing so, it introduced a single point of failure. That nice central backup index must itself be backed up, and the backup of the backup server is what usually gets forgotten. In the event of a disaster, you will have to call support to re-index your tapes, which is time-consuming and adds to an already stressful situation.
Ntbackup is excellent for backing up Exchange information stores and for files where you want to maintain a good backup history and keep the RPO down – for example, a group of accounting spreadsheets that are modified throughout the day. As I started using ntbackup, though, I found some shortcomings. Because .bkf files were designed to be streamed to tape, they are rather large, bulky files. Sometimes the file is so large that the utility locks up during a restore operation; this only happened to me when I backed up more than 400GB of data in a single run. When I ran my first recovery test, the bkf files failed to be read. That isn’t good, so I worked on finding a solution. Reading up on this phenomenon on some blogs, I found several people suggesting Volume Shadow Copy with Robocopy, a utility you can download from Microsoft if it isn’t already installed with your operating system. Robocopy is very robust and lets you mirror directory structures over network storage. It also makes backups really simple and really fast.
For files that ntbackup won’t manage well – such as users’ home directories and file shares that don’t change often – I use VSS snapshots with Robocopy to mirror the changes. Basically, a backup script creates a Volume Shadow Copy Service snapshot and “mounts” the snapshot to a drive letter on the system being backed up. Robocopy then mirrors the contents of the snapshot to the backup drive.
The one thing I will not go into in detail here is how I am backing up NISC’s nightly tarball using FTP. This required some hacking of their shell script that pushes our database to their backup servers every night, and I don’t want to endorse breaking your support agreements. I will give you a hint at the end, however.
Setting up the scripts is the most complicated part of this backup process. The recovery is very simple – simple enough for anyone who can follow a checklist to perform with no help from you, the administrator. So it’s well worth your time to configure and document.
The following flowchart will give you an idea of the steps that I go through in setting up this simple method for backing up my numerous servers.
Configure Your BACKUP Machine
The backup machine was purchased from CDW for under $500. It is an HP ProLiant server with a removable hard drive backplane from StarTech. The backplane simply takes the place of two 5.25” bays and holds three removable drives. Additional drive trays are about $25. The following are the exact parts I purchased from CDW:
1.5TB SATA drive: http://www.cdw.com/shop/products/default.aspx?EDC=1944172
SATA Backplane: http://www.cdw.com/shop/products/default.aspx?EDC=1929035
Controller card: http://www.cdw.com/shop/products/default.aspx?edc=954339
The computer/server type or style is unimportant; I just purchased the cheapest one I could find with enough 5.25” drive bays and a Gigabit NIC. For this backup server I used a trial version of Windows Server 2003, but there is no reason Windows XP Professional cannot work with this setup. Unless you are going to have more than five concurrent backups running, there really isn’t a need for a server OS. Incidentally, running your backup “server” on XP Pro is one way to get away with backing it up using Mozy’s offsite backup at personal-computer prices instead of paying expensive professional fees.
Because the drives are removable, it is helpful to label them. I labeled each drive Alpha through Tango and rotate the drives offsite. I also recommend setting the NTFS volume label to correspond with the physical label.
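If you want to take the guesswork out of which labeled drive goes in next, the rotation can be computed from the calendar week. This is a hypothetical sketch – the five-name pool and the week-number rule are my invention for illustration (my actual set runs Alpha through Tango):

```python
import datetime

# Hypothetical five-drive pool; the real set is larger (Alpha..Tango).
DRIVES = ["Alpha", "Bravo", "Charlie", "Delta", "Echo"]

def drive_for_week(day):
    """Return which labeled drive belongs in the backplane for day's week."""
    week = day.isocalendar()[1]       # ISO week number, 1..53
    return DRIVES[week % len(DRIVES)]

print(drive_for_week(datetime.date(2010, 11, 8)))   # Alpha
```

Any deterministic rule works; the point is that whoever swaps the drives can figure out the right one without asking you.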
Prepare Storage for Backup Data
After inserting your target backup drive, decide on a static drive letter to back up to. This requires a little manual work. I use F as the target drive for my backups, so all of the scripts I describe are set up to write to the F drive. All you have to do after inserting the drive is go to the Disk Management snap-in and change the drive letter to F. Now we can run the driveprep script.
The purpose of the driveprep script is to prepare your drive for the backup scripts you are about to run for the week. Because we move the drives offsite, the oldest drive will contain stale data that needs to be removed for some of the utilities. On the backup server, I keep all of my scripts, installation utilities, checklists, directions, flowcharts, and ISO images in the C:\ISO and C:\VMScripts folders. The idea is to delete and recopy this data to every backup drive, to help in a worst-case scenario when all you have during the recovery effort is a single backup drive. Any procedural changes should be noted here, and any new utilities required to perform a recovery should be placed here. In the end, all one has to do is plug this drive into a machine and read the c:\ISO\readme.txt file to get started with the recovery. So simple a caveman could do it…
On the desktop, create a text file called driveprep.cmd using Notepad, and paste in the following:
@echo off
rem ########################
rem This command script preps the backup drive to receive VCB files from ESXi hosts
rem Ian Fleming
rem 1/15/2010
rem ########################
Echo ****************************************************************
Echo *This will prep drive F: for backups                           *
Echo *Be sure you have the right drive installed before continuing  *
Echo ****************************************************************
Echo .
Echo Press Ctrl-C to cancel
pause
rem ### Begin
rem ### Prepare/clean VCB directories
echo cleaning backup folder. . .
If Exist F:\backups ( rmdir f:\BACKUPS /s /q )
If Exist F:\backupd ( rmdir f:\BACKUPd /s /q )
echo cleaning scripts folder. . .
If Exist F:\VMScripts ( rmdir F:\VMScripts /s /q )
echo cleaning ISO folder. . .
If Exist f:\ISO ( rmdir f:\ISO /s /q )
If Exist F:\leka ( Echo Exchange server backup folder exists ) else ( mkdir f:\leka )
echo Copying scripts folder. . .
echo Copying VMScripts and ISO folder from C:
xcopy C:\VMScripts f:\VMScripts\*.* /s /q > nul
xcopy c:\ISO f:\ISO\*.* /s /q > nul
echo Done!
echo .
echo *************************************************
echo * This drive is prepared to receive VCB backups *
echo *************************************************
pause
What this is doing:
I have two ESXi hosts that I like to back up using Veeam’s SCP copy. To keep the backups from becoming duplicated and stale, I have to delete the files in each directory that the scripts copy into. Generally, each directory on F contains the backups for one machine, so “If exist f:\backups (rmdir f:\BACKUPS /s /q)” just deletes the old backup and preps the medium to receive new backup files. The scripts and ISO folders are then recopied as well.
This is only an example of a driveprep script; you will have to modify it based on what you need to clean up before running your backups. Some of you script gods could modify each machine’s backup script to take care of this itself; however, I find it easier to just run a prep script before the actual backups.
So, you have your driveprep script configured. Insert the drive, assign it to F, and run the script. You are ready to do your backups!
Some of your systems will require the native NTBackup application. NTBackup is a pretty OK program even with its file-size limitations. If you have an Exchange server, you will need NTBackup to back up your information stores. For all other file-based backup systems (such as an MS SQL server), I recommend writing the backup to a file and building a robocopy script to back up those backup files instead of using NTBackup.
The bad thing about NTBackup is that the log files are difficult to manage. Because I like to keep the log files with my backups, I had to create an ugly script that retrieves the backup logs and puts them on the same medium as the target backup device (the F drive on my backup machine). The following script performs the NTBackup run and moves the corresponding log file to that medium:
@echo off
REM ####
REM Ian's ugly backup script (modified 7-13-2010)
REM This script is intended to be run on the server that you want to back up, using the command scheduler
REM It will move the logfiles off of the server and onto the target along with the backup file
REM Be sure that the final destination folder (f:\leka) exists before using this batch file. (See your driveprep script!)
REM This folder is used as the final resting place for the logs after copying and renaming them.
REM ####
If Exist F:\leka ( echo Leka Backup Folder Exists ) else ( mkdir f:\leka )
net use f: \\nec-backup\f$ /user:nec-backup\administrator password
mkdir C:\oldlogs
copy "C:\Documents and Settings\Administrator.NEC\Local Settings\Application Data\Microsoft\Windows NT\NTbackup\Data\backup*.log" c:\oldlogs\*.* /y
del "C:\Documents and Settings\Administrator.NEC\Local Settings\Application Data\Microsoft\Windows NT\NTbackup\Data\*.log"
REM ####
REM This is where the backup action happens...
REM C:\scripts\System.bks contains what NTBackup is going to back up
REM To change the backup, use the following switches:
REM /FU Enables a "file unbuffered" setting to bypass the cache manager
REM /m [normal] [copy] [differential] [incremental] [daily]
REM
REM ####
C:\WINDOWS\system32\ntbackup.exe backup "@C:\scripts\System.bks" /n "Leka System Backup" /FU /v:no /r:no /rs:yes /hc:off /m normal /j "Leka System Backup" /l:f /f "f:\leka\System.bkf"
REM ####
REM OK... Let's move those logfiles...
mkdir C:\templog
copy "C:\Documents and Settings\Administrator.NEC\Local Settings\Application Data\Microsoft\Windows NT\NTbackup\Data\backup*.log" c:\templog\*.* /y
copy "C:\Documents and Settings\Administrator.NEC\Local Settings\Application Data\Microsoft\Windows NT\NTbackup\Data\backup*.log" c:\oldlogs\*.* /y
FOR /F "usebackq tokens=1" %%n IN (`dir c:\templog /b`) DO @FOR /F "usebackq tokens=2,3,4 delims=/ " %%d IN (`date /t`) DO @FOR /F "usebackq tokens=1,2 delims=: " %%t IN (`time /t`) DO @ren c:\templog\%%n %%d%%e%%f-%%t%%u.log
REM ###
REM Change the following location to where you want the log files to be copied
REM ###
copy c:\templog\*.* f:\leka
REM ###
del c:\templog\*.log
rmdir /q c:\templog
del "C:\Documents and Settings\Administrator.NEC\Local Settings\Application Data\Microsoft\Windows NT\NTbackup\Data\*.log"
copy c:\oldlogs\*.log "C:\Documents and Settings\Administrator.NEC\Local Settings\Application Data\Microsoft\Windows NT\NTbackup\Data\*.*" /Y
del c:\oldlogs\*.log
rmdir /q c:\oldlogs
net use f: /delete
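If the triple-nested FOR loop is hard to read, all it does is rename the fresh NTBackup log with a date-time stamp before parking it next to the backup. Here is a rough Python equivalent of just that step (illustrative only – the batch file above is what actually runs, and the folder names are placeholders):

```python
import os
import shutil
import time

def stash_logs(src_dir, dest_dir):
    """Copy NTBackup's backup*.log files into dest_dir under a
    timestamped name, roughly what the batch FOR loop accomplishes."""
    os.makedirs(dest_dir, exist_ok=True)
    stamp = time.strftime("%m%d%Y-%H%M")          # e.g. 11082010-2330
    stashed = []
    for name in sorted(os.listdir(src_dir)):
        if name.startswith("backup") and name.endswith(".log"):
            new_name = f"{stamp}-{name}"
            shutil.copy2(os.path.join(src_dir, name),
                         os.path.join(dest_dir, new_name))
            stashed.append(new_name)
    return stashed
```

The stamp keeps a history of logs on the backup medium instead of letting NTBackup silently recycle them.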
I am backing up the company share drive (around 1TB) using this method. It takes an average of 10 minutes to run, depending on the number of changes on the production drive. In the following scripts, the server’s name is Kahuna, and the only drive I am interested in backing up is the O drive. The scripts run on Kahuna and are scheduled using the Windows command scheduler. I put all of my scripts in the c:\scripts folder to make them easy to find.
After enabling shadow copies on the O drive, I create a snapshot and mount it to the B drive using this script. Create the file c:\scripts\vss-exec.cmd and paste the following into it:
call vss-setvar.cmd
@ECHO OFF
dosdev B: %SHADOW_DEVICE_1%
net use f: \\nec-backup\f$ /user:nec-backup\administrator password
Robocopy B: f:\kahuna\O_Drive /copy:datso /b /sec /mir /s /r:0 /w:0 /XF *.pst /log+:F:\kahuna\O_Drive_Backup.txt /nfl /ndl /njh
net use f: /delete
dosdev -r -d B:
Dosdev.exe can be downloaded from http://www.ltr-data.se/opencode.html and is necessary to map the shadow to a logical drive letter. You will also need vshadow.exe which is available from the Microsoft MSDN site.
Schedule the following command to run on your server:
vshadow.exe -script=vss-setvar.cmd -exec=vss-exec.cmd o:
Once the snapshot is created and exposed, Robocopy with the mirror (/MIR) switch copies its contents to the backup drive. When vshadow runs, it generates vss-setvar.cmd with the snapshot’s device name in an environment variable; vss-exec.cmd calls that file to map the shadow to the B drive, and when the backup completes, it unmaps the drive so the shadow copy can be removed cleanly.
Note: The cool thing about the Robocopy script is that it copies only the changes and deletes old files – it doesn’t copy the entire drive. This makes updating your backup really fast compared with the other methods described above. The /MIR switch takes care of keeping the backup synchronized with the production file system. Other copy methods (like Veeam for my VM backups, xcopy, or NTBackup) require some directory cleanup work; if you use them and do no cleanup, duplicated and stale backup files will accumulate in the target directory and eventually eat all the disk space on your backup medium.
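To make the /MIR behavior concrete, here is a toy mirror in Python – copy files that are new or newer, delete anything that no longer exists in the source. (Illustration only: it handles a single flat directory and has none of Robocopy’s ACL handling, retries, or logging.)

```python
import os
import shutil

def mirror(src, dst):
    """Toy /MIR: propagate new/changed files from src to dst and delete
    extras from dst, so dst ends up an exact copy of src."""
    os.makedirs(dst, exist_ok=True)
    src_files = set(os.listdir(src))
    for name in src_files:
        s, d = os.path.join(src, name), os.path.join(dst, name)
        # Copy if missing in dst, or if the source copy is newer.
        if not os.path.exists(d) or os.path.getmtime(s) > os.path.getmtime(d):
            shutil.copy2(s, d)
    # Anything left in dst but gone from src gets deleted.
    for name in set(os.listdir(dst)) - src_files:
        os.remove(os.path.join(dst, name))
```

Because only the delta moves, a second run over an unchanged source does almost no work – which is exactly why the nightly robocopy pass finishes in minutes.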
In creating this method, I used a script called CreateShadow, sourced from an MSDN blog, and heavily modified it to suit my purpose.
This is an extremely efficient and robust backup system, completely free of license fees. I may improve it in the future by adding support for multiple backup sets; for now, I’m happy with the way it is working for me.
I’m not going to go into too much detail here. I currently use VCB to create a snapshot backup on an ESXi datastore, then use Veeam to SCP the files over to the drive. Veeam has a built-in copy scheduler and is a pretty fast way to move the VM snapshot onto removable media.
Ben and I will be going over this and other methods at the 2011 TechAdvantage conference in Orlando as part of our “Hitchhiker’s Guide to the Galaxy of Virtual Servers” presentation. Everything we will describe is free, which is what I like to focus on, working for non-profits.
For now, here is part of the script that must run on the backup server to kick off the VCB scripts on the ESXi hosts:
@echo off
rem Trigger ghettoVCB.sh on targeted ESXi hosts to prep for backup
rem Add more hosts by copying this line:
rem c:\ISO\plink.exe root@[EnterESXiIPHere] -pw [Password] "cd [Script location] && nohup ./ghettoVCB.sh -f vms_to_backup > backuplog.txt &"
echo Executing VCB backup on host . . .
c:\ISO\plink.exe root@VMHost1 -pw password "cd /vmfs/volumes/datastore1/scripts/ && nohup ./ghettoVCB.sh -f vms_to_backup -l backuplog.txt"
c:\ISO\plink.exe root@VMHost2 -pw password "cd /vmfs/volumes/datastore1/scripts/ && nohup ./ghettoVCB.sh -f vms_to_backup -l backupdell.txt"
rem Now, grab the backup logs and put them in F:\ISO...
echo .
echo Moving log file to NEC-BACKUP
echo .
winscp /console /command "option batch on" "open root:password@VMHost1" "get /vmfs/volumes/datastore1/scripts/backuplog.txt f:\ISO\backuplog.txt" "exit"
winscp /console /command "option batch on" "open root:password@VMHost2" "get /vmfs/volumes/datastore1/scripts/backupdell.txt f:\ISO\backupdell.txt" "exit"
I’m a real fan of Mozy’s backup service. It’s cheap and works really well for me, and the nice thing about Mozy is the unlimited storage for your backups. The idea we have been working toward here is centralizing all of the backups on the F drive of the backup server. Now that all of the files are on the F drive, all you have to do is install Mozy on the backup server and back up the F drive. That’s it! You might want to schedule driveprep to run every week or so to delete stale data; if you are only using the robocopy method, though, this is a perpetual backup system you will never need to touch. Note: to do this cheaply, your backup server needs to run a desktop OS, not a server OS.
There you have it. I hope this sheds some light on how this is accomplished. Do you have a better, simpler, or cheaper way to do this? Let us know!