How to Compare Performance of Two SQL Servers: Part 2

TL;DR

Before you migrate SQL Server, prove the new box is faster. Distributed Replay gives you a per-query OLD vs. NEW table of executions, CPU, storage reads, and duration, so you catch the queries that regressed while you can still fix them.

Capture 4 hours of real workload on your busiest day from the OLD server, then replay that exact workload on the NEW one.
Check the execution counts roughly match on OLD and NEW. If they diverge, you broke something in the capture, so the numbers don’t count and you redo the run.
Sort by the duration delta to find queries that got slower, then fix them with an index change or a stored-proc rewrite before you cut over.

What we are aiming for

In Part 1, I talked about:

Why you would compare the performance of two SQL Servers?
And the result that we want to get by doing this comparison.

In the big picture, this is what I am doing:

Capturing SQL workload on OLD. I will do this on the busiest day of the week and will capture 4 hours worth of workload

Re-run that exact same workload on the NEW

There will be a bunch of steps in between. But that’s what I am doing.

The end result

Here is the end result that I will get.

It shows:

SQL call made.

How many times the sucker was run on the OLD and NEW.

And then shows CPU, Storage and Duration numbers. OLD vs. NEW and the difference.

In reality, all these numbers are for me. Not for the customer.

The customer only cares about the yellow part: how long the SQL call takes to run.

But if I can reduce CPU and storage numbers, that means the server will have more capacity, storage will not be under so much pressure and there will be room to breathe.

Understanding the table

Let’s define what these columns mean:

Column A: “SQL Text”
This SQL that was executed. There are few things done to SQL Text, like removing parameters, all capitalized, spaces adjusted, etc. – I will be aggregating (group by) all numbers based on that. I need SP to call EXEC SP1 @lastname = ‘jones’, be the same as EXEC SP1 @lastname = ‘smith’.
Column B: “OLD – number of executions”
Since we rerunning the workload from OLD to NEW, this show how many times this SQL call was made. On OLD server.
Column C: “NEW – number of executions”
How many times this call was made on the NEW server? Columns B & C are sanity checks – they show several executions and if they are not the same or very close, it means I did something wrong and these numbers can not be trusted. Re-do.
Column D: “OLD – Avg CPU”
CPU cycles it took on average to process this SQL Text. On OLD Server.
Column E: “NEW – Avg CPU”
CPU cycles it took on average to process this SQL Text. On NEW Server.
Column F: “DELTA – Avg CPU”
The difference between OLD and NEW CPU cycle consumption.
Column G: “OLD – Avg Storage Reads”
Storage operations it took to process this SQL Text. On OLD.
Column H: “NEW – Avg Storage Reads”
Storage operations it took to process this SQL Text. On NEW.
Column I: “DELTA – Avg Storage Reads”
The difference in storage operations between OLD vs. NEW.
Column J: “OLD – Avg Duration”
The duration it took for this SQL Text to run. On OLD.
Column K: “NEW – Avg Duration”
The duration it took for this SQL Text to run. On NEW.
Column L: “DELTA – Avg Duration”
The difference in Duration OLD vs. NEW.

This is what the bottom of the Excel looks like:

Here we have TOTAL results of running workload on NEW vs. OLD.

Not bad for not changing a single line of code!

These are typical numbers we are seeing by the way.

Upgrade your SQL Server to the newest version (anything above SQL 2016) and that’s what you get.

Totals are for the big picture.

Will my SQL Server run faster or slower on the NEW server? Here, it’s a “Hell Yes!”

As an analytical DBA, I like the detail level. I want to sort this Excel based on the last column “DELTA – Avg Duration”. And find SQL calls that became worse. Prevent the disaster.

Then I take them and figure out what can be done to improve them. Maybe it’s simply an index change. Or stored procedure re-write. Either way, it needs to be addressed.

I can see database backups are a lot slower on the NEW. I will drill down and look into why.

Also (line 8) one stored process is considerably slower and it runs 481 times in 4h. So it’s a big deal. Will look into that next.

And that is how it’s done!

If you want to migrate the SQL Server and hope that things will just work on the NEW one. Sure, go for it.

If you have databases that are mission critical, where errors or slowdowns affect a lot of things, spending a bit of time on thoroughly testing may be worth it.

This method removes risk from SQL Server migrations and upgrades.

This is not for everybody though. Simpler SQL Servers don’t need this.

There you have it. Now you know what type of result you can get out of Distributed Replay.

Speak with a SQL Expert

In just 30 minutes, we will show you how we can eliminate your SQL Server headaches and provide  operational peace of mind

Schedule My Call Now ➜

Article by

Mark Varnas

Founder | CEO | SQL Veteran

Hey, I'm Mark, one of the guys behind Red9. I make a living performance tuning SQL Servers and making them more stable.

Comparing Performance of Two SQL Servers: Part 2

What we are aiming for

The end result

Understanding the table

Speak with a SQL Expert

Discover More

Discover what clients are saying about Red9