SQL Performance tuned!
Done! But wait…why cant I feel a (damn) difference?!
I ran into an issue recently.
After doing (an incredible!) job at tuning SQL Server Performance, I received this note from the client:
“the changes are made, but we don’t feel the difference”
[shortened to distill down to the point]
Which…umhh… really bothers me…
If I die this week, this probably had something to do with it… nah, just kidding…
The real problem
He then proceeded to say:
“The stats show a very large percentage improvement but what is the tangible results that you can point us to? Want to learn to better understand the value provided.”
I sent him this below. (I plan to point clients here in case this happens again.)
The only way to accurately measure T-SQL query performance is with BEFORE and AFTER numbers.
And it’s not by looking using SQL Server Management Studio right bottom corner to show how long query ran for. DON’T use that method.
Since I touched on that, lets see why.
If I leave work and start driving home, how long will it take? 15min? Or 60 min? If you live anywhere where there is traffic – you know that’s a big unknown.
Same with SQL.
You can query now. It may take 60 seconds. Run again and it completes in 15 sec. The only difference – the load on the server was different. There were different things running while you ran yours.
Add other components query travels through – from disk on that db server you querying down to your to your screen – and you will see that all of these components may be doing different things any second.
How to measure query performance properly?
Here are few ways:
Method 1 – SET STATISTICS IO, TIME ON
Method 2 – SQL Server Profiler
Method 3 – Extended Events
Lets start with the simplest – method1.
Simply add “SET STATISTICS IO, TIME ON” above your query:
“SET STATISTICS IO, TIME ON”
and then dont look at SSMS bottom right corner for timing.
Look at output this commands shows. I am using AdventuresWorks db.
Will run this query:
SELECT oh.CustomerID, SUM(LineTotal)
FROM Sales.SalesOrderDetail od
JOIN Sales.SalesOrderHeader oh
GROUP BY oh.CustomerID
Things to notice:
1. Query time in SSMS shown as zero. Wait… its free then? So if I ran 100K of these per minute, it would still be free? Of course not.
2. Logical reads – this is the the metric we want! This shows how many read operations SQL had to “consume” from disk drive (ignore SSDs, RAIDs, Azure or AWS for simplicity, disk act different there, but idea is still the same). Why? 95% of SQL Servers out there bottleneck on storage or disk drive throughput. So if you tune slowest component, things will get better!
3. CPU time – how many CPU cycles did query consume! Here we want “CPU time” – which in this case shows 140 ms SQL is saying “it took 125ms to get this query run”. “Elapsed time” shows = 345 milliseconds – that’s where SQL says, “since you are not the only thing that runs on CPU, and you compete with other things such as OS or other programs, in reality you had to wait 339 milliseconds to get your query back”.
Now we that we have some gas, some utensils, and a grill – we cooking with gasoline!! Lets prove a point now.
Now I have my TRUE cost of the query:
CPU & Storage
CPU = 140ms
Storage cost = 1247+689 = 1,936
This is our BEFORE number. Remember that.
Now you spent countless of hours how to tune same query and you came up with this cool method of indexed views. Lets see how that performs
CPU cost = 0
Storage = 92
This is our AFTER number.
Conclusion of this tuning:
1. CPU performance increased 140x
2. Storage performance increased 92x (or 2,104% faster)
Can I hear a “WHOOOP….WHOOOP”?!
So for this client numbers looked like this:
Pretty darn good. Wouldn’t you say?
In less technical terms…
Tuning is a bit like car driving through traffic. If seven highway lanes becomes one, traffic throughput will never be greater than one lane. Never.
Once that single lane problem is fixed, things will pile up at the next bottleneck! Which may be a spot with 2 lanes. Once you remove one bottleneck, you find the next one. And the next one.
“Damn it… I thought, once I tune, everything starts to fly” – you say… Nope. Not if you have a ton of things happening. Often SQL tuning gets the job done and is enough. But in some more rare cases, you discover another bottleneck. And all of the sudden its your application or web servers that is the bottleneck.
So until all road is at 7 lanes, from point A to B, traffic simply wont go at 7 lane speeds. Simple, right?
Back to technical world…
There could be lots of things happening.
It could be that app does multiple queries to return a single screen and queries that were improved is just a small small drop in the bucket of what application actually needs to show that single screen.
It could be multiple calls to db. So we tunes few, but there are others left.
It could slow network.
It could be app is waiting on slow client to consume rows sent.
It could bunch of other things.
The way to know is to measure from end to end. And in those cases database guy is a wrong person for this job. There are tools for this or app instrumentation (look into how StackOverflow does it)… And database latency is just a small component.
All database guys can do is to show you results of tuning using BEFORE vs. AFTER numbers. (and not by showing duration in SSMS right bottom)
And in this case, client tuning numbers were very solid. And we got it done using the cheapest method – indexes. Didnt even have to get into tuning TSQL codee or stored procs.
And dont get used to 35,000% improvements. Those wont last. Its great that we got them. But it just means you got a ton of low hanging fruit there. But often tuning of 200% is great.
It would be challenging to do DB tuning in any better way – I am not THE BEST, believe me, I know. But I also know, I am good. Numbers dont lie, so not gonna feel bad about saying it 🙂
So let’s come back to original question “What is the benefit of tuning like this if you can’t “feel the benefit” then?”
1. When queries consume less resources – server capacity increases.
2. You can now handle more queries on same hardware, at same cost. The best companies out there tune on on-going basis, so worst performing queries are handled periodically, capacity increased, server hardware gets more life, and things are faster in general.
3. Not to mention, your queries are faster.
It pains me (feels like more than that) that value is not seen. However, at the same time, I want you, the client, to be happy.
Speed is a feature. It took me a while to understand what that means. But reading how StackOverflow treat their code, really helped. Lots of respect to Nick Craver
there! I wish to be half as good that guy!
So moral of the story… I really feel like I provided an excellent service tuning.
And next time this issue is raised, I will point the person to this blog post and see if that helps.
I would love to hear how you handle issue like this? What could have been done different or better? Basically tell me where I suck!