SQL Server High Availability – *ALL* you need to know

You probably heard Service Level Agreement (SLA), Recovery Time Objective (RTO) and Recovery Point Objective (RPO) terminology.

Another common phrase is – High Availability – it means that the SQL Server databases will be reachable, having in mind the least possible downtime, after any server crash or failure.

It is closely related to high availability requirements – and Microsoft SQL Server has several possible solutions for that. That depends on a budget, SQL Server version, requirements for automation, scope, IT support level, and so on.

Possible solutions to make SQL Server more robust and reliable are:

  • Replication
  • Log Shipping
  • Mirroring
  • Windows Clustering
  • Always on Availability Groups

Replication

This technology can be divided into the following types:

  • Transactional: It depends on a synchronization process with the SQL Server Transaction Logs. This type of replication begins with a snapshot of the primary database that is applied to the subscriber, after that, all transactions that occur on the publisher will be escalated to the subscriber. This option provides the lowest latency, compared to other types of replication.
  • Snapshot: When data is being changed a little, it is best to use this type of replication. Snapshot is taken of the entire database and is copied over to the subscriber database. It is also used as an initial data set in some circumstances to start subsequent replication processes.
  • Merge: If subscriber and publisher are not connected all the time (like laptop and server), this type of mirroring suits the best. All the changes are being tracked, and when publisher and subscriber come online again – changes are being synchronized. 
  • Peer to Peer: This type of replication has the advantage of involving all nodes for transaction execution. In some cases, it may help to scale-out data on several instances for a better balance between them.

To sum up this technology depending on its type, it can help to scale-out data, all replicated databases can be accessed, and replication can run on both sides.

However, replication may get out of synchronization, talking about high availability it requires manual failover, and snapshot replication can be time-consuming in some cases.

Log shipping

This Microsoft solution is dependent on transaction logs. Needless to say, to use transaction logs database must be in the Full recovery model.

The main technique is to use SQL agent jobs, which will:

  • Generate transaction log backup on primary DB
  • Transfer log backup to secondary database/databases
  • Restore that log backup on the secondary database(s)

Before this works, primary database backup has to be restored on the secondary node with no recovery statement. SQL Agent jobs can be scheduled at an interval, depending on requirements. Probably the most commonly used schedule is 15 minutes.

Looking at the advantages of this technology, it is worth mentioning that it supports several secondary servers. Also, secondary instances may be queried, if the query is not running at the time of log backup restore.
The downside in High availability is that it requires manual failover and secondary database must be brought up to date by restoring all necessary missing log backups.

Mirroring

This method works as a high availability solution. To use this method, a minimum of two servers needed (Primary – called principal, and secondary – called mirror). The third server is recommended (witness), however, it is optional. Witness server monitors the connection between principal and mirror and can change role or perform automatic failover.

When a server acts as secondary – it has a database restore with no recovery option. During the mirroring process, new transactions are being copied to that database. If they are copied to secondary first and then committed on principal – mirroring runs in high safety mode (also called synchronous). If new transactions committed to a principal database without waiting for secondary – mirroring runs in high-performance mode (also called asynchronous).

Although this solution can have automatic failover, it has downsides. The Mirror database on the second server cannot be queried until it becomes primary. Additionally, one mirroring solution is limited to two servers only.

Windows clustering

There is another method for high availability – windows clustering. To use this method, shared storage and two (or more) servers needed. In fact, windows clustering works at the server level, not the database level. In case one physical machine fails, another physical one may take over the workload where the previous server left off. This is common in environments, where 99% or higher uptime required.

Windows clustering is useful in case of hardware failure, it is also useful during maintenance windows. The physical server may be failover to another one, then it is software/hardware that may be upgraded, and then it can be brought back, without any higher downtime on the databases.

However, this type of high availability solution has another medal side. It is complex to set up, it requires extra redundant hardware, which may be only rarely used (only then the particular node is active). Additionally, data itself is not being protected, because the same storage is being used, redundancy is only on the server-side.

Always On Availability Groups

Microsoft SQL Server feature, first introduced in SQL Server 2012 version, is called Always On Availability Groups. It is based on previously mentioned Windows Clustering and mirroring hybrid solution, however, it has more advantages.

This setup is made from one primary server, called Primary replica, and that hosts database copy for read and write operations. Another part is called secondary server that serves a read-only database copy. There might be up to 8 secondary servers, which can also be used, for example, reporting purposes.

Compared to Windows clustering, Availability Groups does not need shared storage – which makes the system more reliable, because each server uses it’s own storage drives.

Availability Groups runs not on the SQL instance level – they do it on individual databases. So some databases might be included in Always-On, others – might not.

Even if it looks like Always On is the best solution, it comes with a price – starting from SQL Server 2012, this feature comes with Enterprise Edition only. Later editions have offers to use it on Standard edition as well.

Conclusion

To sum up, various possibilities to make SQL Server more reliable and providing highly – available databases solution, all depend on particular requirements. Some solutions might fit on one situation, others – on different SQL Server setup. In short, each technology has its advantages and dark sides, and the most important ones are cost, complexity for setup, and the ability to failover automatically in case of a server crash.

Let our expert SQL Server Consulting team handle the technical complexities, so you can focus on driving your business forward with confidence.

Mark Varnas

Mark Varnas

Hey I'm Mark, one of the guys behind Red9. I make a living performance tuning SQL Servers and making them more stable. I channel my SQL into our SQL Managed Services, SQL Consulting and our internal database products.

Leave a Reply

Your email address will not be published. Required fields are marked *