Case study – High CPU Usage (Emergency call)

Problem Summary:

  1. The client was experiencing severe SQL Server performance issues;
  2. Multiple queries were taking 1, 5, 10 minutes or longer or timing out;
  3. The CPU usage was reaching  100% many times during the day long;

Detecting the problem

We started monitoring the SQL Server environment.

A comprehensive methodology was applied to look at various levels SQL depended on: CPU, RAM, networking, storage, VMWare setup, Windows Operating System settings, and few more areas.

It may seem like overkill to look at all the layers, but Red9 practices prove that comprehensive analysis is effective. Especially when stability and speed matter.

The CPU usage was low until 12:30 pm EST, when CPU usage was observed up to 70% until 13:30 EST.

Figure 1- High CPU usage detected.

After investigation, we detected 12 long-running sessions as per the image below.

Figure 2- TSQL calls.

Only 12 executions were able to pressure the CPU to 70%. 15-20 executions would reach 100% CPU usage (as the client told us).

Analyzing the execution plan, we find an index scan that can be the possible root cause for the high number of reads hence high CPU usage.

Figure 3- TSQL Query plan.

The table tbl_customer_list had 1,585,002 rows and 43 indexes, which looks like too many indexes (usually  5/6 should be enough).

Also, it was observed that the IO latency increased again due to the reads happening. Probably a secondary root cause of performance issues. Latency higher than 20ms is a problem.

What was done:

  1. We improved the database indexes (disabled 12 duplicated indexes only for this table);
  2. Changed instance-level settings Cost threshold for parallelism from 5 to 50M;
  3. Max degree of parallelism from 0 (unlimited) to 8;
  4. Optimize for ad hoc workloads from 0 to 1;

As a bonus, we deployed some monitors for:

  1. Disk latency
  2. Activity monitor (long-running queries);
  3. Single plan cache clean maintenance

After all the modifications, the CPU usage went down to 1%.

Figure 4- CPU usage after modifications.

Conclusion

Reviewing all relevant performance metrics as a unit, we have the best chance of choosing the most effective course of action. In this case, it was clear that initial efforts should be directed towards query tuning and the indexing strategy.

Make the performance of someone else’s problem. Hire a company like ours. We’d love to help! We do tuning every day and have done it for over a decade now.

See more SQL Server Performance Optimization examples here.

We tune slow SQL Servers every day. For over a DECADE now. Contact us!

Mark Varnas

Mark Varnas

I love making performance tuning SQL Servers fast and making them more stable. And I channel that obsession into our SQL Managed Services and new content here. When I'm not writing about SQL, I spend time outside hiking, skiing, mountain biking, or trying a new recipe.

Leave a Reply

Your email address will not be published.