If you’ve ever asked yourself (or management) what level of data loss is acceptable and how long recovery may take after a disaster, you were already thinking about RPO and RTO, even if you did not know these abbreviations.
RTO – Recovery Time Objective
Recovery Time Objective (RTO for short) is the target for how much time may pass from the moment a disaster occurs until systems are up and running again.
For example, if a database fails at noon and it takes a DBA 4 hours to restore a new instance, the recovery time is 4 hours, so an RTO shorter than 4 hours cannot be met with this setup (both RTO and RPO are measured in units of time). If AlwaysOn is used with automatic failover, the workload switches to a secondary replica without waiting for a restore, so the recovery time is much shorter.
Of course, various failure scenarios should be taken into consideration when setting and measuring the Recovery Time Objective.
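The noon example above can be written as a small calculation. This is only a minimal sketch: the function names `recovery_time` and `meets_rto`, the date, and the 1-hour target are hypothetical and chosen for illustration.

```python
from datetime import datetime, timedelta


def recovery_time(disaster_at: datetime, restored_at: datetime) -> timedelta:
    """Time elapsed between the disaster and systems being up again."""
    if restored_at < disaster_at:
        raise ValueError("restore cannot finish before the disaster happens")
    return restored_at - disaster_at


def meets_rto(actual: timedelta, rto: timedelta) -> bool:
    """True when the measured recovery time stays within the target."""
    return actual <= rto


# Disaster at noon, restore finished 4 hours later (the date is arbitrary).
disaster = datetime(2024, 1, 15, 12, 0)
restored = datetime(2024, 1, 15, 16, 0)

actual = recovery_time(disaster, restored)
print(actual)                                  # 4:00:00
print(meets_rto(actual, timedelta(hours=4)))   # True
print(meets_rto(actual, timedelta(hours=1)))   # False
```

In practice the two timestamps would come from monitoring or incident records rather than being typed in by hand.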
RPO – Recovery Point Objective
Recovery Point Objective (or RPO) is the maximum acceptable amount of data loss, expressed as the time gap between the last backup and the disaster.
For example, if the last database backup was made at 1:00 and the disaster happened at 6:00, 5 hours of data are lost, so this backup schedule cannot meet an RPO shorter than 5 hours.
To reduce this number, take backups more frequently.
If data is critical, the recommendation is to schedule transaction log backups every 5 minutes, or at another short interval, depending on the environment and the actual hardware.
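To see how backup frequency changes the data loss window, here is a rough sketch. The helper `data_loss_window`, the date, and the 6:03 disaster time in the second case are assumptions made for illustration; it also assumes every backup in the list completed successfully and is restorable.

```python
from datetime import datetime, timedelta


def data_loss_window(backup_times, disaster_at):
    """Gap between the most recent backup taken before the disaster and the disaster."""
    earlier = [t for t in backup_times if t <= disaster_at]
    if not earlier:
        raise ValueError("no backup exists before the disaster")
    return disaster_at - max(earlier)


day = datetime(2024, 1, 15)  # arbitrary date

# Case 1: a single backup at 1:00, disaster at 6:00.
single_backup = [day.replace(hour=1)]
print(data_loss_window(single_backup, day.replace(hour=6)))  # 5:00:00

# Case 2: log backups every 5 minutes starting at 1:00, disaster at 6:03.
log_backups = [day.replace(hour=1) + timedelta(minutes=5 * i) for i in range(100)]
print(data_loss_window(log_backups, day.replace(hour=6, minute=3)))  # 0:03:00
```

With a 5-minute schedule the loss can never exceed one full interval, which is why the interval is the number to compare against the RPO.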
Make a decision
Of course, nobody wants to lose any data, and everybody wants 100% uptime.
However, achieving this may not be easy or inexpensive. Therefore, it might be wise to reconsider business requirements to determine if 99.99999% uptime, rather than 99%, is truly necessary in your case.
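To make the difference between these percentages concrete, the sketch below converts an uptime percentage into the downtime it allows per year. The function name `allowed_downtime` and the 365-day year are assumptions for illustration.

```python
from datetime import timedelta

YEAR = timedelta(days=365)  # assumption: a non-leap year


def allowed_downtime(uptime_percent: float, period: timedelta = YEAR) -> timedelta:
    """Downtime budget left over by a given uptime percentage within the period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime must be between 0 and 100 percent")
    return period * (1 - uptime_percent / 100)


for pct in (99, 99.9, 99.99, 99.99999):
    seconds = allowed_downtime(pct).total_seconds()
    print(f"{pct}% uptime allows about {seconds:,.1f} seconds of downtime per year")
```

At 99% the budget is roughly 3.65 days per year, while at 99.99999% it shrinks to roughly 3 seconds, which gives a sense of how much more the stricter target demands.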
Another piece of advice: always test your disaster recovery setup periodically.
The numbers may look nice in theory.
However, reality may be different, and the actual recovery time and data loss may exceed the RTO and RPO you planned for, especially if the estimates were based on on-premises hardware while the disaster recovery plan involves restoring the server environment in the cloud.
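One way to keep track of such tests is to record what each drill actually measured and flag the ones that miss the targets. The sketch below is only an illustration: the `DrillResult` type, the `failed_drills` helper, and all of the numbers are made up.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class DrillResult:
    name: str
    recovery_time: timedelta  # measured time until systems were back up
    data_loss: timedelta      # measured gap between last usable backup and the simulated disaster


def failed_drills(results, rto, rpo):
    """Names of drills whose measured values exceed either target."""
    return [r.name for r in results if r.recovery_time > rto or r.data_loss > rpo]


# Hypothetical measurements from two drills.
drills = [
    DrillResult("on-premises restore", timedelta(hours=3), timedelta(minutes=5)),
    DrillResult("cloud restore", timedelta(hours=6), timedelta(minutes=5)),
]

print(failed_drills(drills, rto=timedelta(hours=4), rpo=timedelta(minutes=15)))
# ['cloud restore']
```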