How’s your data backup doing?
Whether your business application stores financial data such as invoices or personal information (PI) such as customer records, at the core of any business software is the DATA.
What changed…
Well, a lot has changed over the last decade as many IT departments have started using virtualization technology.
But be FOREWARNED: virtualizing your database server doesn’t add any data loss protection. In fact, it makes database backup slightly more difficult because physical and virtual server hardware differ.
What are the minimum backup requirements, now?
The same as before. At a minimum, you still need a backup and a restore plan, but the methods have surely changed (or some would say they have improved).
Basic Backup Plan Framework
A backup plan clearly documents how servers and database backups should happen to protect against data loss…
…and it should provide an overview of the backup procedure and staff responsibilities.
The plan should cover these tasks but can cover many more:
- who is responsible (assign one person to own and drive it)
- when the backup happens (create a clear schedule for when jobs run, for example as a Gantt chart)
- how often it happens (Daily, Weekly, Monthly)
- where the backup is stored (onsite, offsite, on tape, or on disk – for how long)
- who monitors the backups (daily monitoring)
- who re-runs the backup job if it fails (yes, failed jobs should be rerun)
All this should be in the basic backup plan to ensure clarity and accountability for backing up servers, data, and logs.
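One way to keep the plan honest is to record it as structured data and check that nothing has been left blank. The sketch below is only an illustration: the `BackupPlan` class, its field names, and the sample values are all assumptions made for this example, not part of any backup product.

```python
from dataclasses import dataclass, fields


@dataclass
class BackupPlan:
    """Hypothetical record of the plan items listed above."""
    owner: str             # who is responsible for the plan
    schedule: str          # when the job runs, e.g. "daily 01:00"
    frequency: str         # Daily, Weekly, or Monthly
    storage_location: str  # onsite, offsite, on tape, or on disk
    retention_days: int    # how long the backup is kept
    monitor: str           # who checks the job results every day
    rerun_owner: str       # who re-runs a failed job


def missing_items(plan: BackupPlan) -> list[str]:
    """Return the names of plan items that are still empty or zero."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name)]


# Sample plan with one gap: nobody has been assigned to monitor the jobs.
plan = BackupPlan(
    owner="dba-on-call",
    schedule="daily 01:00",
    frequency="Daily",
    storage_location="offsite disk",
    retention_days=30,
    monitor="",
    rerun_owner="dba-on-call",
)
print(missing_items(plan))  # ['monitor']
```

An empty result means every item in the plan has an answer, which is the clarity and accountability the plan is meant to provide.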
Restore Plan
A backup restore plan should have steps for restoring files and data from the backup archive. This process should be tested to ensure it works and is up-to-date.
Backup and restore plans should be reviewed regularly due to hardware and software changes.
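One small piece of a restore test can be automated: confirming that the file you restored is byte-for-byte identical to the backup archive it came from. The sketch below assumes the backup is a single file and uses a plain file copy as a stand-in for the real restore step. The function names and file names are made up for this example, and a checksum match does not replace opening the restored database and checking the data.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_is_intact(archive_path: str, restored_path: str) -> bool:
    """True when the restored file has the same checksum as the archive."""
    return sha256_of(archive_path) == sha256_of(restored_path)


with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "backup.bak")
    restored = os.path.join(tmp, "restored.bak")
    with open(archive, "wb") as handle:
        handle.write(b"invoice data")

    shutil.copyfile(archive, restored)  # stand-in for the real restore step
    intact = restore_is_intact(archive, restored)

    # Simulate a damaged restore by appending a stray byte.
    with open(restored, "ab") as handle:
        handle.write(b"!")
    damaged = restore_is_intact(archive, restored)

print(intact, damaged)  # True False
```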
4 Ways to Back Up a Database and a DB Virtual Server
Before we begin, let’s list the common causes of data loss for readers new to virtualization (Reference):
- Accidental Deletion – Doesn’t matter how – it happens all the time.
- Computer Viruses and Malware – Yes, I’ve seen this happen.
- Physical Damage – hardware failures still happen more often than people think due to badly architected infrastructure or old hardware that just dies.
- Accidental Formatting – Oops!
- Storage Crashes – even tier 1 storage can crash, I know from experience.
- Logical Errors – A code release was rolled out that damaged data tables or indexes, and now the data is out of whack.
- Continued Use After Signs Of Failure – shame on you!
- Power Failure – no comment.
- Firmware Corruption – rare but I could see it.
- Natural or Man-made Disasters – think 9/11.
- Single failure domain – a single point of failure.
- Malicious Attack – this is a big risk.
1. Data-only Backup (Data and Logs)
First, let’s begin with the basic data-only backup. This method will create a point-in-time backup of the data and logs that can be restored in the event something happens to the data.
The PRO is that restoring the data and logs, and only the data and logs, is quick.
The CON is that if the server hosting the database fails, a new server will need to be built with all the software installed and configured. This includes reapplying any patches, tweaks, and scheduled jobs, because everything stored outside the database backup has been lost.
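To make the idea concrete, here is a minimal sketch of a data-only backup. The article does not name a database engine, so the example uses SQLite purely as a stand-in because it ships with Python. The `backup_database` helper and the file and table names are assumptions for illustration. On a server database you would use the engine's own backup command instead.

```python
import os
import sqlite3
import tempfile


def backup_database(source_path: str, backup_path: str) -> None:
    """Copy a consistent point-in-time image of the source database."""
    src = sqlite3.connect(source_path)
    dst = sqlite3.connect(backup_path)
    try:
        src.backup(dst)  # SQLite online backup API (Python 3.7+)
    finally:
        dst.close()
        src.close()


with tempfile.TemporaryDirectory() as tmp:
    live = os.path.join(tmp, "live.db")
    copy = os.path.join(tmp, "backup.db")

    # Build a tiny "production" database with one invoice row.
    conn = sqlite3.connect(live)
    conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount REAL)")
    conn.execute("INSERT INTO invoices (amount) VALUES (19.99)")
    conn.commit()
    conn.close()

    backup_database(live, copy)

    # Confirm the backup can be opened and contains the data.
    check = sqlite3.connect(copy)
    rows = check.execute("SELECT COUNT(*) FROM invoices").fetchone()[0]
    check.close()

print(rows)  # 1
```

Note what the backup does not contain: the operating system, the database software, and anything configured outside the database file, which is exactly the CON described above.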
2. File-level Backup via Traditional Backup Software (Files and Data/Logs)
File-level backups normally require an agent installed on the virtual server. This type of backup is commonly found on physical servers but can make its way onto virtual servers too because of the costs associated with upgrading. Or someone did a P2V and just left the backup as it was on the physical server.
The PRO is that restoring from data loss can be easy if the backup job followed the steps needed to ensure the database was properly quiesced before it was backed up, and the correct files or directories were selected.
The CON is that most backup software requires an additional license to add the agent for backing up locked files. Without it, important files will not be backed up and the database backup will not be complete. (Always check with your vendor.)
Quiesce is used to describe pausing or altering the state of running processes on a computer, particularly those that might modify information stored on disk during a backup, in order to guarantee a consistent and usable backup. Ref. Wikipedia
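The toy sketch below illustrates the idea of quiescing, not how any backup product implements it. Writers and the backup share one lock, so no write can land while the backup is copying the data. The `QuiescableStore` class and its methods are invented for this example.

```python
import threading
from contextlib import contextmanager


class QuiescableStore:
    """Hypothetical in-memory store whose writes can be paused for a backup."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._records: list[str] = []

    def write(self, record: str) -> None:
        # Every writer takes the lock, so writers wait while a backup holds it.
        with self._lock:
            self._records.append(record)

    @contextmanager
    def quiesced(self):
        # Hold the lock for the whole backup window and hand out a copy
        # of the data that no writer can change underneath us.
        with self._lock:
            yield list(self._records)


store = QuiescableStore()
store.write("invoice-1")

with store.quiesced() as frozen:
    snapshot = frozen  # a real job would copy this to backup storage

store.write("invoice-2")  # writes resume once the backup window closes
print(snapshot)  # ['invoice-1']
```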
3. Front-end Block Level Backup of Full System State (OS plus Data)
VMware-aware backup software such as Veeam does a full system backup of the database and OS.
The PRO is this will allow for a complete point-in-time system state recovery that will speed up the restore time and get your database server back into operation. And as in our example of Veeam, it is agent-less and takes a full snapshot copy of the virtual server at a block-level via a vCenter API command, and then copies it to remote storage for quick recovery.
The backup job can be scheduled to run daily incrementals, which roll up into a full backup as each new daily is completed. At the end of the month, you can archive the last backup as your offsite monthly backup, while storing 30 days onsite for quick recoveries as needed. This method allows a quick full system state recovery not only in case of data loss, but also in the event of a system or hardware failure.
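The retention rule described here (30 days onsite, the month-end backup archived offsite) is simple date arithmetic, so it can be sketched in a few lines. This is not how Veeam or any other product is configured. The function name is made up, and the sketch assumes that "the last backup of the month" is the one taken on the last calendar day.

```python
import calendar
from datetime import date


def storage_locations(backup_date: date, today: date,
                      onsite_days: int = 30) -> list[str]:
    """Return where a backup taken on backup_date should be kept today."""
    locations = []

    # Recent backups stay onsite for quick recoveries.
    age_days = (today - backup_date).days
    if 0 <= age_days < onsite_days:
        locations.append("onsite")

    # Assumption: the backup from the last calendar day of a month
    # is the one archived offsite as the monthly backup.
    last_day = calendar.monthrange(backup_date.year, backup_date.month)[1]
    if backup_date.day == last_day:
        locations.append("offsite")

    return locations  # an empty list means the backup can be expired


today = date(2024, 3, 15)
print(storage_locations(date(2024, 3, 10), today))  # ['onsite']
print(storage_locations(date(2024, 2, 29), today))  # ['onsite', 'offsite']
print(storage_locations(date(2024, 1, 31), today))  # ['offsite']
print(storage_locations(date(2024, 1, 15), today))  # []
```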
The CON is it can be costly for software licenses, hardware, and storage to support this backup solution.
4. Back-end Block Level Backup using Storage Replication (OS plus Data)
Storage replication is another block-level backup that protects against data loss and provides quick recovery.
The PRO of this technology is it handles the backup on the back-end storage and completely replicates the storage LUN or datastore where the database VM files are stored.
This also happens at the block level and allows for a full system state recovery like Veeam. However, it is much faster because the recovery is handled within the storage network and normally doesn’t have to copy files. Depending on the storage vendor, there are also advanced features for encryption, deduplication, restoring, etc.
The CON of storage replication is the need for additional storage units and licenses.
Note: This list does not cover physical server backups. For a physical server, you would do full system state and data backups, and use a bare-metal restore to recover the complete system.
Summary:
It’s an understatement to say Data Loss can happen at any time and for many different reasons. The key takeaway, though, is to try to avoid the loss in the first place…
…and secondly to have a good backup and recovery solution and plan in place so you’re not scrambling to rebuild the data from the source files and transactions.
So whether your backup plan uses file-level, block-level, or something new, the best backup and recovery plan is the one that works.