Data protection - part 1
Even though I promised to publish tips and tricks to improve life with NetBackup, I would like to start with a more generic area: data protection in general. This article is for providers (so they know what to ask) but also for data owners (so they know what questions to expect).
At the beginning there is a discussion between the backup provider and the data owner (often just different departments of the same company).
The initial phase immediately jumps to questions like:
- What is the frequency and retention of backups?
- Can you run my backups every 15 minutes?
- I need my weekly backups in an offsite facility.
Everything is wrong. Really, really wrong.
The absolute first question for the data owner should be:
What is the value of your data?
I know the answer already:
We don't know.
This is the fundamental mistake in any negotiation about data protection.
Let's take an example. When you own a car, you know exactly what its value is and how much you are willing to pay to protect it: how expensive a car alarm you will buy, what insurance package you will select, whether to pay for a parking lot or park on the street.
With data, the owner very rarely knows the value, yet every price for protection seems too high.
Knowing the value of the data is a mandatory step. As I mentioned earlier, people usually just don't know it.
O.K. Ask the owner to split the data into a few categories by importance and ask some simple questions:
- How much data (expressed in days, hours, minutes) can you lose without affecting your business? How much money will you lose if your data from the past 10 minutes, 1 hour, or 1 day just disappears?
- How long can you continue your business without this kind of data? Many data owners believe that their data has to be available all the time. It is not true. Nobody has lost a business because invoices were sent 3 days late. Missing boring reports for the Monday management meeting really won't send you into bankruptcy. But the system tracking your goods deliveries is what keeps you alive. A late package delivery damages your reputation; a lost package will drive your customers away.
Many times companies invest more in the financial reporting system than in the system that generates the money.
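The two questions above can be turned into a rough number. The following is only a sketch: it assumes the loss grows linearly with the hours of lost data and the hours of outage, which is a simplification, and all names and figures in it are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    loss_per_hour_of_lost_data: float  # money lost per hour of data that disappears
    loss_per_hour_of_outage: float     # money lost per hour the data is unavailable


def incident_cost(ds: DataSet, hours_of_data_lost: float, hours_of_outage: float) -> float:
    """Estimated cost of one incident, under the (assumed) linear model."""
    return (ds.loss_per_hour_of_lost_data * hours_of_data_lost
            + ds.loss_per_hour_of_outage * hours_of_outage)


# Hypothetical figures, for illustration only.
tracking = DataSet("delivery tracking", 5000.0, 20000.0)
reports = DataSet("management reports", 0.0, 10.0)

for ds in (tracking, reports):
    # Same incident for both: the last 24 hours of data lost, 8 hours of outage.
    print(f"{ds.name}: {incident_cost(ds, 24, 8):,.0f}")
```

Even with rough guesses as input, a comparison like this gives the data owner something to set against the price of protection.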
The result of the initial discussion should be a categorization like the following:
- Most important data - we cannot lose any of it and it has to be available all the time
→ high availability solution based not only on backup. The price will be really high but still less than the possible financial loss.
- Less important data - we can survive losing the past 15 minutes of data and can survive 1 hour of outage
→ system with asynchronous data replication, snapshots, frequent backups of database transaction logs
- Not so important data - losing data from today or yesterday would really complicate our business
- Even less important data - we can recreate the system with a new installation, but it would be good to have the configuration at least from the past weekend
- Data where backup would be more expensive than the data itself - we can recreate the data from the production database overnight
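The categorization above can be sketched as a small function that takes the owner's two answers (how much data can be lost, how long an outage can last) and returns a category. This is a minimal illustration, not a definitive rule: the function name, the labels, and the one-day cut-off for the third category are my assumptions; only the 15 minutes and 1 hour thresholds come from the list above.

```python
from datetime import timedelta


def categorize(max_data_loss: timedelta,
               max_outage: timedelta,
               cheaper_to_recreate: bool = False) -> str:
    """Map the data owner's answers to one of the five categories."""
    if cheaper_to_recreate:
        # The data can be rebuilt from another source for less than a backup costs.
        return "backup more expensive than the data"
    if max_data_loss == timedelta(0) or max_outage == timedelta(0):
        # No loss or no outage tolerated: backup alone is not enough.
        return "most important"
    if max_data_loss <= timedelta(minutes=15) and max_outage <= timedelta(hours=1):
        return "less important"
    # Assumed cut-off: losing up to one day of data is painful but survivable.
    if max_data_loss <= timedelta(days=1):
        return "not so important"
    return "even less important"


print(categorize(timedelta(minutes=15), timedelta(hours=1)))
print(categorize(timedelta(days=7), timedelta(days=2)))
```

The point of the sketch is the order of the questions: the category follows from what the business can tolerate, and only then comes the choice of technology and backup schedule.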