SQL Server Performance Tuning

SQL Query Optimization: A Practical Guide for Database Admins and Developers

Written by
Mark Varnas

Introduction

SQL query optimization is one of the first problems you’ll face as soon as you start working with databases, and it remains the main one.

You may not notice the issue at first, but as data grows and the workload increases, queries that don’t scale start performing poorly and taking longer to complete. Why is that? There are many possible reasons, and this guide walks through the most common problems and their solutions so you can check for them and apply the fixes to your own code.

Keep in mind that reducing query execution resource usage to the appropriate level leads to lower infrastructure costs, either directly or indirectly—this is especially important in cloud environments.

You must remember that there is no magic bullet; each query requires a different approach involving one or multiple solutions. However, you don’t need to make a lot of changes at once—improving the top worst performers with an effective approach usually frees enough server resources to handle the remaining queries.

Fundamentals of SQL Query Performance

Many components are involved in query execution, but they can be summarized in a few: disk/storage, network, memory, motherboard, and CPU.

If we attach typical throughput figures to each of them, the list looks like this:

  • Disk/storage: between 200-550 MB/s (SSD), 80-160 MB/s (HDD)
  • Network: 125 MB/s (gigabit fiber optic), 32 MB/s (ethernet)
  • Memory: 17-25 GB/s
  • Motherboard: 984 MB/s – 15 GB/s
  • CPU: 175 GB/s

You also need to consider the entire process of retrieving data:

  1. The database engine first searches the memory.
  2. If the data isn’t found, it requests the data from storage.
  3. If there’s not enough memory space, the data in memory must be saved to storage.
  4. The new data is then loaded into memory.

The slowest part of a query is typically reading data from the disk and transmitting it over the network.

Here, you must account for the physical limitations of storage: delays from mechanical components, lack of a RAID 10 configuration, bandwidth limitations, signal interference, error checking, packet redundancy, and more.

This means the main goal in query optimization is to avoid going down to the storage. In simple terms, it can be achieved as described below.

Help the database engine choose the best execution plan

Perform a table scan (reading the whole table), which has a time complexity of O(n), only on small tables. For large tables, it’s faster to seek through a clustered or non-clustered index (a B-tree), which has a time complexity of O(log n).

Table partitioning allows reading data from multiple files at once or from multiple disks at once if configured separately.

Partition elimination skips unnecessary data and only scans the files and disks where the data is to be found.
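
As an illustration, here is a minimal partitioning sketch (the object and column names are hypothetical; in practice each partition would usually map to its own filegroup so reads can be spread across disks). A predicate on the partitioning column lets the engine eliminate the partitions that cannot contain the requested data:

CREATE PARTITION FUNCTION pfOrderDate (DATE)
AS RANGE RIGHT FOR VALUES ('2023-01-01', '2024-01-01', '2025-01-01')
GO

CREATE PARTITION SCHEME psOrderDate
AS PARTITION pfOrderDate ALL TO ([PRIMARY])
GO

-- Hypothetical table partitioned on its date column
CREATE TABLE dbo.Orders (
	OrderId INT NOT NULL
	,DateCreated DATE NOT NULL
	,Amount DECIMAL(18, 2) NOT NULL
	) ON psOrderDate(DateCreated)
GO

-- Partition elimination: only the 2024 partition is read
SELECT OrderId
	,Amount
FROM dbo.Orders
WHERE DateCreated >= '2024-01-01'
	AND DateCreated < '2025-01-01'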

Reduce the amount of information transmitted if it’s not for processing

If you don’t know your data distribution, this article can help you find that out, which allows you to modify your tables and queries to use appropriate data types and lengths.

Using SELECT * is discouraged since, typically, only a few columns actually need to be displayed or updated.

Using OFFSET and FETCH is encouraged unless you cache data at the application layer. Linked server queries can retrieve the whole remote table before filtering it, so it’s better to use a filtered OPENQUERY/OPENROWSET or PolyBase with pushdown. Partitioning very large or very busy tables helps to only process the needed data.
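
For example, with a linked server, a four-part name query can pull the whole remote table across the network before the filter is applied locally, while OPENQUERY sends the filter to the remote server so only matching rows travel back (RemoteServer and the remote table here are hypothetical):

-- Can retrieve the whole remote table and filter it locally
SELECT OrderId
	,Amount
FROM RemoteServer.Sales.dbo.Orders
WHERE DateCreated >= '2024-01-01'

-- Runs the filter on the remote server, so only matching rows cross the network
SELECT OrderId
	,Amount
FROM OPENQUERY(RemoteServer, 'SELECT OrderId, Amount FROM Sales.dbo.Orders WHERE DateCreated >= ''2024-01-01''')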

Another problem is outdated statistics, which cause the engine to misestimate the number of rows it needs to read. You can see this in the execution plan’s estimated number of rows and the accompanying warnings. The less data you retrieve from the disk, the more memory remains available for all queries, reducing the need to access the disk unless absolutely necessary.

Keep the most used data in memory

If you know which data is used the most (the data with the highest number of transactions and where most of the computations occur), it’s a good idea to keep it in memory.

Relying on the engine will only slow down the process and increase the number of operations moving data in and out of the storage, while you know it should be in memory all the time. This is achieved through memory-optimized tables, memory-optimized indexes, and columnstore indexes.
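
A minimal sketch of a memory-optimized table, assuming the database already has a MEMORY_OPTIMIZED_DATA filegroup (the table and column names are hypothetical):

-- Requires a MEMORY_OPTIMIZED_DATA filegroup in the database
CREATE TABLE dbo.SessionCache (
	SessionId INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000)
	,Payload NVARCHAR(4000) NOT NULL
	,LastUpdated DATETIME2 NOT NULL
	)
	WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)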

Another setting that keeps the most used data in memory is Lock Pages in Memory, which can improve performance by avoiding paging data to disk.

Configure the underlying data files appropriately

Many operators create objects in tempdb: Index Spool, Eager Spool, Lazy Spool, Spool, Table Spool, and Table-Valued Function. Although tempdb is volatile, it works like any other database, moving data in and out of disk. That’s why having multiple data files improves performance: more data can flow and more operations can occur at the same time. Newer versions of the SQL Server installer already account for this.
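
If you need to add tempdb data files manually, the statement looks like this (the path and sizes are examples only; one data file per CPU core, up to eight, is the usual starting point):

-- Example file name, path, and sizes; adjust to your environment
ALTER DATABASE tempdb
ADD FILE (
	NAME = N'tempdev2'
	,FILENAME = N'T:\TempDB\tempdev2.ndf'
	,SIZE = 8GB
	,FILEGROWTH = 1GB
	)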

Instant File Initialization is another data file setting that improves performance but carries a security caveat; if the environment is secure enough, you can enable it. Data files, log files, and tempdb files should be kept separate for optimal performance, and data file growth must be configured properly.

Use the right number of CPUs

Parallelism is important because it allows multiple operations to run at the same time, producing results faster. But it is also an extremely expensive operation, and you don’t want to waste valuable resources on the first query that requests it, especially if it doesn’t need them.

Other queries run many times per second, and the CPUs they reserve become unavailable to expensive queries that can’t run with fewer CPUs. In both cases, the overhead involved slows down your server’s response. The maximum degree of parallelism is another setting that newer versions of the SQL Server installer configure for you, but the cost threshold for parallelism must still be raised manually from its default of 5.
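
Both settings can be changed with sp_configure; the values below are only examples, and the right ones depend on your hardware and workload:

EXEC sys.sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Example values only; tune to your workload
EXEC sys.sp_configure 'cost threshold for parallelism', 50;
EXEC sys.sp_configure 'max degree of parallelism', 8;
RECONFIGURE;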

Filter data using JOINs without creating a Cartesian product

Let’s start with the simplest model: you have tables for categories, subcategories, and products. If you join the products to the subcategories and categories tables, you must use both keys in the join, as follows:

SELECT CategoryName
	,SubcategoryName
	,ProductName
FROM Products p
INNER JOIN Subcategories s ON s.SubcategoryId = p.SubcategoryId
	AND s.CategoryId = p.CategoryId
INNER JOIN Categories c ON c.CategoryId = s.CategoryId

If you forget either the subcategory or the category key in the join, a Cartesian product is produced, matching every product with every subcategory or category: with 1,000 subcategories and 10,000 products, the result has 10 million rows.

Not all models are this simple, and sometimes it’s hard to pinpoint this issue, so you must rely on execution plan analysis to determine how many rows go into and come out of each operation, or look at the overall I/O used.

Avoid the need to iterate results, think in sets instead

Programmers are often taught early on to write pseudo-code, detailing step by step how a computer should act. However, the sooner you learn how the database engine works, the better. A database isn’t just a file or a repository; you don’t instruct it on how to find the data, you simply tell it what you need, and the engine determines the best approach to retrieve it.

This is accomplished by using the Structured Query Language (SQL) and here you can find a very good introduction. Whenever you feel the urge to use a subquery, cursor, loop, etc., remember that you’re not thinking in terms of sets. Take a step back and try to express the same logic using appropriate set-based constructs from the language. It won’t be easy at first, but it will be rewarding. Over time, you’ll get used to this approach, which will set you apart from other programmers.

Avoid type conversions, as they prevent index usage

When you’re designing a database, it’s easy to forget what type or size you used in another table, but you will notice it during query execution because:

  1. The index created to improve performance is not used.
  2. The execution plan indicates an implicit type conversion warning.
  3. You will see a function applied to one of the sides of an expression.

This also occurs when you use Language-Integrated Query (LINQ) or an object-relational mapper (ORM), because these tools construct the queries for you, and sometimes the underlying data structure isn’t specified precisely enough.

This also happens if you apply any kind of cast or function to a table column. To avoid it, move the conversion to the other side of the comparison: adjust the data types and sizes of the parameter or literal (rather than the table column) so that they match the column exactly.
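
As a sketch, suppose Products has a hypothetical ProductCode column declared as VARCHAR(20). Comparing it to an NVARCHAR parameter can force an implicit conversion of the column and prevent an index seek; matching the parameter’s type to the column avoids it:

-- ProductCode is a hypothetical VARCHAR(20) column; the NVARCHAR parameter
-- can force CONVERT_IMPLICIT on the column and block an index seek
DECLARE @Code NVARCHAR(20) = N'AB-123'

SELECT ProductName
FROM Products
WHERE ProductCode = @Code

-- Declaring the parameter with the column's exact type keeps the index usable
DECLARE @Code2 VARCHAR(20) = 'AB-123'

SELECT ProductName
FROM Products
WHERE ProductCode = @Code2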

Reduce the number of records blocked during an update

This usually happens when you have heap tables, when there’s no index covering the rows to be updated, or when the index is not selective enough. Again, the execution plan can tell you how many rows are being read and locked, so you can act accordingly.

Reduce the amount of time being blocked during an update

Here are three key points:

  • Heap tables and indexing: If you have heap tables (without a clustered index) or an index that is not selective enough, the engine may lock most or even the entire table while searching for rows to update. During this time, no other queries can access data from those tables, causing significant delays.
  • Number of indexes: The more indexes you define on a table, the longer it takes the engine to update each one with new data. Therefore, it’s recommended to only have the minimum number of necessary indexes to optimize performance.
  • Transaction duration: Typically, records are locked at the start of a procedure and unlocked at the end. However, if the system fails, those records may remain locked until the issue is resolved. To avoid this, focus on atomic operations. Lock records only while updating and reading results, then release the lock immediately. While this requires more design time and additional code to validate the row state, it’s the most reliable long-term approach.

Avoid the creation of another execution plan if there is one already available

Every time the database engine needs to create an execution plan, it consumes CPU and memory and takes a few milliseconds; multiplied by the number of executions per second of a query, this can become a problem and cause slowness in all your other queries.

A brief explanation is described in this article, and ways to find them are described in this article, both from Brent Ozar. If you’re using Query Store, how to find them is described in this article from Erik Darling.

After you’ve identified the reason for having multiple plans, you need to either fix your code or parameterize your SQL properly; then the engine won’t have to redo its work every time.

Avoid extra delays and additional processing

In the previous topic, we’ve seen how to avoid redoing the work of creating an execution plan when one is already available. Another way to avoid extra processing is to reference objects by their two-part name (schema.object) so the engine doesn’t have to search the internal catalogs every time a query runs.

Also, returning rows in order is an expensive computation, as is removing duplicates with DISTINCT and formatting values for display: all of this is better done at the application layer.

Views that join multiple tables, return a large number of rows, or use linked servers can be resource-intensive, as they are re-evaluated each time they are queried. In such cases, it’s more efficient to use a materialized (indexed) view, which persists the data so it doesn’t need to be rebuilt on each query, reducing the overhead and improving performance.

Another source of extra computation looks like the following:

SELECT OrderName
FROM WorkOrders
WHERE DateCreated > GETDATE() - 1

In this case, the date subtraction operation is being performed for every row in the table.

However, if the result will be the same for all rows, why not calculate it in advance? This would prevent unnecessary computations.

The query can be rewritten as follows:

DECLARE @Date DATE

SET @Date = DATEADD(DAY, - 1, GETDATE())

SELECT OrderName
FROM WorkOrders
WHERE DateCreated > @Date

Another common pattern I’ve seen is converting DATETIME to DATE, selecting DATEPARTs of a date and then joining on them, converting CHAR types to VARCHAR, converting VARCHAR types to INTEGER, etc. Very often this is unnecessary, as you can rewrite the query with far fewer operations and still produce the same output. In this case, “less is more” truly applies.

Reduce the number of threads used

This query identifies cases where the number of used threads exceeds the number of reserved threads. This situation indicates potential blocking or malfunctioning hardware, which can lead to performance issues and must be addressed to prevent slowness.

Reduce the amount of memory used

This query identifies instances where memory usage exceeds the planned amount. Scarcity of memory can occur during concurrent executions or due to the high frequency of query executions.

When this happens, it’s a clear sign that you should review the execution plan to determine if an issue is causing the query to consume more memory than expected.

Top optimization techniques and best practices

Now that we’ve outlined the objectives of query optimization, let’s dive deeper into each one with detailed analysis and practical examples.

Don’t rely solely on indexes

Often, you’ll assume that indexes fix bad queries, which is not true:

  • If the code is bad, it’s not going to use the index no matter how hard you try.
  • There are many different index types, and you need to choose the most appropriate one. The least used is full-text, even though it accelerates searches on character fields; not everyone knows how it works.
  • Each index you add will consume space and reduce performance for inserts, updates, and deletes.
  • If the index is not selective enough (as a rule of thumb, a query should return well under 70% of the underlying table’s rows for the index to help), it’s not useful at all.
  • If the WHERE condition doesn’t use the index’s leading columns, the index is not used.
  • If the leading columns are compared with anything other than equality, the columns after them can’t be used to narrow the seek.

Before deciding whether to create a new index, use the rest of this guide to address other performance issues. After resolving those, evaluate if adding an index is necessary and beneficial.

It’s also important to review the current index design to see if an existing index can be expanded or optimized instead of creating a new one.

ORDER BY

You should ask yourself: why am I returning the data sorted? Sorting is a resource-intensive operation that consumes both memory and CPU, and for large tables, it also incurs significant I/O costs when using tempdb.

Since you’re paying for each CPU used in the SQL Server instance, those CPUs are better allocated to more critical tasks than sorting results.

Instead, consider sorting the data at the application layer, whether it’s in ASP.NET, PowerShell, PowerBI, Python, or any other platform. On the application server side, you can add as many CPUs and nodes as needed without additional licensing costs, and in cloud environments, you can scale up CPU resources during high workloads.

Appropriate data types

As mentioned earlier, this article can help you identify if you’re using the appropriate data types and lengths:

  • If you only have numbers, don’t use a string datatype. Even a BIGINT uses only 8 bytes, while a string uses one byte per character. Also, searching on numbers is faster and is always possible, while searching on strings is costly and is not always possible.
  • If the max value is 290, SMALLINT uses only 2 bytes while INT uses 4 bytes. If it is 255 or less, TINYINT uses only 1 byte.
  • If you only have dates, don’t use a DATETIME type (8 bytes), use a date type (3 bytes).
  • If the number of distinct rows is low, using a TINYINT as key is more scalable than having a string one. You can document your database about what each number means, or you can have a catalog table with TINYINT keys (where searches are performed) and string values (to get the meaning).

Appropriate indexes

As mentioned earlier, this article can help you identify if your indexes will be selective or not (see the sketches after this list):

  • If the mode value is more than 10% of the total rows, a filtered index on the mode value is the smallest and most selective.
  • If the mode value is more than 90% of the total rows, a filtered index excluding the mode value is the smallest and most selective.
  • If the number of distinct rows is small, an index on that column won’t make queries go faster.
  • If the number of missing values is high, a filtered index where IS NOT NULL is the smallest and most selective.
  • If the number of missing values is low, a filtered index where IS NULL is the smallest and most selective.
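
Following these rules, a filtered index might look like the sketches below (the DateCompleted and Discontinued columns are hypothetical):

-- Most values are NULL: index only the rows that have a value
CREATE NONCLUSTERED INDEX IX_WorkOrders_DateCompleted
ON dbo.WorkOrders (DateCompleted)
WHERE DateCompleted IS NOT NULL

-- Only a small fraction of rows match the filter value
CREATE NONCLUSTERED INDEX IX_Products_Discontinued
ON dbo.Products (ProductName)
WHERE Discontinued = 1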

Avoid using OR

This is a common operator in programming languages but doesn’t work the same in SQL language.

Instead of this query, which is not going to use either index:

SELECT ProductName
FROM Products
WHERE SubcategoryId = 1
	OR CategoryId = 2

Rewrite it to use the indexes without incurring overhead from removing duplicates:

SELECT ProductName
FROM Products
WHERE SubcategoryId = 1

UNION ALL

SELECT ProductName
FROM Products
WHERE CategoryId = 2

UNION ALL faster than UNION

As mentioned earlier, UNION sorts the results and compares them against the second table to remove any duplicates, so this overhead can be avoided if it’s not necessary.

Avoid correlated subqueries

A correlated subquery appears when you want to apply a filtering condition without using sets, but instead trying to tell the engine how to do its work (programmatically). See three distinct examples below:

SELECT ProductName
FROM Products p
WHERE ProductId NOT IN (
		SELECT ProductId
		FROM WorkOrders
		WHERE CategoryId = 1
		)

SELECT ProductId
	,DatePurchased
	,Quantity
FROM WorkOrders wo
WHERE DatePurchased = (
		SELECT MAX(DatePurchased)
		FROM WorkOrders wo2
		WHERE wo2.ProductId = wo.ProductId
		)

SELECT ProductName
	,(
		SELECT SubcategoryName
		FROM Subcategories s
		WHERE s.SubcategoryId = p.SubcategoryId
		) SubcategoryName
FROM Products p

  • In the first query, you’re trying to tell the engine to search all work orders that satisfy the condition, and then use that to filter the products.
  • In the second query, you’re trying to tell the engine to get the latest date and show the product details on that date.
  • In the third query, you’re explicitly requesting the engine to gather additional data row by row.

In every case, you’re thinking programmatically instead of in sets. All queries are slow because they execute the subquery part once for every row in the outer query.

Let’s convert the correlated subqueries to a set-based approach:

SELECT ProductName
FROM Products p
LEFT JOIN WorkOrders o ON o.ProductId = p.ProductId
	AND o.CategoryId = 1
WHERE o.ProductId IS NULL

SELECT wo.ProductId
	,wo.DatePurchased
	,wo.Quantity
FROM WorkOrders wo
INNER JOIN (
	SELECT ProductId
		,MAX(DatePurchased) AS DatePurchased
	FROM WorkOrders
	GROUP BY ProductId
	) wo2 ON wo2.ProductId = wo.ProductId
	AND wo2.DatePurchased = wo.DatePurchased

SELECT ProductName
	,SubcategoryName
FROM Products p
INNER JOIN Subcategories s ON s.SubcategoryId = p.SubcategoryId

Note that the magic word is the JOIN between the tables: the engine will truly do what you intend, but without doing it row by row.

Another common issue is using TOP with ORDER BY, which is computationally expensive. Instead, use a set operation to filter the rows you want to display (e.g., using MIN and MAX), which is more efficient.
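
For example, instead of sorting the whole set just to pick one row, an aggregate expresses the same intent as a set operation:

-- Sorts all matching rows only to keep one
SELECT TOP (1) DatePurchased
FROM WorkOrders
WHERE ProductId = 1
ORDER BY DatePurchased DESC

-- Same result expressed as a set operation
SELECT MAX(DatePurchased) AS DatePurchased
FROM WorkOrders
WHERE ProductId = 1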

Avoid using cursors

This goes in-line with the previous item: if you try to programmatically do things row by row, you’re putting a toll on CPU, memory, tempdb, etc. There’s always a set-based approach equivalent to what you’re trying to express, but it requires practice to get used to it.

The benefits in the long run are exceptional, and they’re worth it.
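
As a sketch (the Discontinued and ListPrice columns are hypothetical), here is a row-by-row cursor and its set-based equivalent:

-- Row-by-row: a cursor touches one product at a time
-- (Discontinued and ListPrice are hypothetical columns)
DECLARE @ProductId INT

DECLARE product_cursor CURSOR FOR
SELECT ProductId
FROM Products
WHERE Discontinued = 1

OPEN product_cursor

FETCH NEXT FROM product_cursor INTO @ProductId

WHILE @@FETCH_STATUS = 0
BEGIN
	UPDATE Products
	SET ListPrice = 0
	WHERE ProductId = @ProductId

	FETCH NEXT FROM product_cursor INTO @ProductId
END

CLOSE product_cursor
DEALLOCATE product_cursor

-- Set-based: one statement does the same work
UPDATE Products
SET ListPrice = 0
WHERE Discontinued = 1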

Avoid functions, procedures, views, and dynamic code

This, again, is the result of thinking programmatically.

  • You aim to encapsulate logic to avoid having multiple versions of the same code (maintainability).
  • You think that if the code is fast, it will perform well even when executed multiple times per second (scalability).
  • You want to call the same code from multiple places (reusability).

However, in many cases, the cost of this approach can’t be determined at compile time, so a fixed estimation of rows is used.

  • If you include such logic in a WHERE condition, an index may not be used.
  • If you add complex logic, set-based processing is bypassed, and the query is processed row by row. This also inhibits parallelism.

So, whenever you’re overthinking and planning to do it programmatically, step back and think in terms of sets – this is the language of the engine.

Do not SELECT *

It’s the easiest query to write without thinking.

If you specify only the columns that will actually be displayed or processed, you will:

  • Reduce I/O, especially by omitting string and BLOB columns
  • Save CPU and memory on columns not to be used
  • Use the appropriate index based on the information to be used; otherwise, the engine chooses to scan the whole table because that’s where all the columns you’re requesting exist.

Limit results with OFFSET and FETCH

This is especially useful in paginated queries, where you’re not going to display a thousand records to the user, so there’s no need to fetch every row each time the user refreshes the page. See the example below:

SELECT ProductId
	,ProductName
FROM Products
ORDER BY ProductName
OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY

In this case, the ORDER BY is required: OFFSET and FETCH can only be used with it, and it’s also what keeps the paging results consistent from one request to the next.

Handle ISNULL in the SELECT part

This aligns with the earlier recommendation to avoid using functions in the WHERE clause, as doing so can prevent the use of indexes. Remember:

  • NULL represents the absence of a value (similar to “does not exist”),
  • 0 indicates an exhausted quantity and an empty string means there is no label.

Additionally, avoid using the same procedure to return all rows and a filtered subset of rows (e.g., WHERE ISNULL(@Param1, 0) = 0) because the engine uses different optimized methods for each scenario.
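
One way to avoid the catch-all pattern is to give each scenario its own statement so the engine can optimize each one separately; this is only a sketch with a hypothetical procedure:

-- Hypothetical procedure: each branch gets its own, appropriately optimized plan
CREATE OR ALTER PROCEDURE dbo.GetWorkOrders @ProductId INT = NULL
AS
BEGIN
	SET NOCOUNT ON

	IF @ProductId IS NULL
		SELECT ProductId
			,DatePurchased
			,Quantity
		FROM WorkOrders
	ELSE
		SELECT ProductId
			,DatePurchased
			,Quantity
		FROM WorkOrders
		WHERE ProductId = @ProductId
END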

Use equality, not inequality

When you use the operators =, >, and <, the index defined on that column is utilized.

The only exception is the not-equal operator <>. This makes sense because, with <>, the engine must read all rows and discard those that don’t match, rather than efficiently determining which rows to retrieve based on the index.

Use LIKE ‘string%’, not LIKE ‘%string’

This is similar to the previous point: when you place a wildcard character at the beginning of a string (e.g., %value), the index cannot be used. However, if the wildcard is not the first character (e.g., value%), the index can be used to narrow down the search area. A more efficient approach, though, is to use full-text search, which allows searching for text in any position and order, providing better performance for such queries.
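
For example, assuming an index on ProductName:

-- Can seek on the index: the prefix narrows the search
SELECT ProductName
FROM Products
WHERE ProductName LIKE 'Chain%'

-- Must scan: the leading wildcard hides the prefix
SELECT ProductName
FROM Products
WHERE ProductName LIKE '%Chain'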

Use sp_executesql, not execute

This is because sp_executesql allows passing parameters to dynamic queries, which helps create a single execution plan for the query, reducing both resource usage and execution time. If different parameters significantly alter the execution plan, you can use dynamic queries within the same stored procedure to handle them separately, optimizing performance.
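
A minimal sketch of a parameterized dynamic query, reusing the tables from the earlier examples:

DECLARE @sql NVARCHAR(MAX) = N'SELECT ProductName FROM Products WHERE SubcategoryId = @SubcategoryId'

-- The parameter definition lets the engine reuse one plan across executions
EXEC sys.sp_executesql @sql
	,N'@SubcategoryId INT'
	,@SubcategoryId = 1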

Avoid using DISTINCT

Using DISTINCT forces an extra sort operation; instead, try to filter out the duplicate rows by using appropriate joins.

Use SET NOCOUNT ON

This is the easiest fix to apply and is most often used in stored procedures.

It won’t change the number of affected rows; it simply suppresses the “rows affected” messages, reducing extra network traffic when returning this information. When you multiply this by the number of executions per second, the optimization becomes more significant.

Avoid views

We’ve mentioned earlier one problem with views, but there are more:

  • The estimated number of rows is inaccurate
  • The resulting rows are calculated when the view is used, which has a cost and takes time
  • The view itself may not be used at all; the optimizer may expand it and use the underlying tables and indexes instead

So, if you’re facing performance issues with views, consider the following:

  1. Recreate the view using WITH SCHEMABINDING for an accurate estimated number of rows
  2. Convert it into a materialized (indexed) view by creating a unique clustered index, so the data is persisted all the time
  3. Reference the view using WITH (NOEXPAND) so it is used directly, but if that turns out to be slower, let the database engine decide what to use (see the sketch below)
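
A sketch of steps 1 to 3 combined, assuming WorkOrders.Quantity is declared NOT NULL (an aggregated indexed view requires COUNT_BIG(*), schema binding, and a unique clustered index):

CREATE VIEW dbo.vWorkOrdersByProduct
WITH SCHEMABINDING
AS
SELECT ProductId
	,COUNT_BIG(*) AS OrderCount
	-- Assumes Quantity is NOT NULL, which SUM requires in an indexed view
	,SUM(Quantity) AS TotalQuantity
FROM dbo.WorkOrders
GROUP BY ProductId
GO

CREATE UNIQUE CLUSTERED INDEX IX_vWorkOrdersByProduct ON dbo.vWorkOrdersByProduct (ProductId)
GO

SELECT ProductId
	,TotalQuantity
FROM dbo.vWorkOrdersByProduct WITH (NOEXPAND)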

Avoid complex queries

This is the best advice you can get, as it emphasizes that debugging, reusability, and maintainability are only achievable with simpler queries.

It’s recommended to break complex queries into smaller parts, storing intermediate results in table variables, temporary tables, or common table expressions.

This way, performance can be measured and optimized in those smaller parts, allowing for easier improvements when necessary.
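
For example, the latest purchase per product could be staged in a temporary table first and then joined, so each step can be measured and tuned on its own:

-- Step 1: stage the intermediate result
SELECT ProductId
	,MAX(DatePurchased) AS LastPurchase
INTO #LastPurchase
FROM WorkOrders
GROUP BY ProductId

-- Step 2: join against the staged result
SELECT p.ProductName
	,lp.LastPurchase
FROM Products p
INNER JOIN #LastPurchase lp ON lp.ProductId = p.ProductId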

Avoid large write operations

When updating large numbers of rows, the transaction log and tempdb can grow excessively until all operations are committed, causing slowness when they exceed available memory.

However, committing too frequently can make the query run even slower. It’s best to test different batch sizes to find the optimal balance for your data volume and the maximum size you want for the log and tempdb. Also, keep in mind that during these operations, the tables are locked, so it’s often necessary to run them during offline hours or a planned outage (e.g., a release window).
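
A minimal batching sketch (the archive table, cutoff date, and batch size are only placeholders to illustrate the pattern):

-- WorkOrdersArchive is a hypothetical table; each loop iteration commits one batch
DECLARE @BatchSize INT = 5000

WHILE 1 = 1
BEGIN
	DELETE TOP (@BatchSize)
	FROM WorkOrdersArchive
	WHERE DateCreated < '2020-01-01'

	IF @@ROWCOUNT < @BatchSize
		BREAK
END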

How to find slow queries in Microsoft SQL Server

There are several free tools used to find performance issues in SQL Server, but do you know how to use them?

Usually, what you’re looking for from these tools is:

  • Plan-level warnings
  • Operator-level warnings: columns with no statistics, spill to tempdb, no join predicate, etc.
  • Missing index notifications
  • Scans in general, like table scan
  • Seek predicates with a range of values
  • The cardinality estimation model version must be appropriate for the SQL Server version you’re running: 160 for Azure SQL and SQL Server 2022, 150 for SQL Server 2019, 140 for SQL Server 2017, 130 for SQL Server 2016
  • Appropriate trace flags: e.g., trace flag 4199 enables query optimizer hotfixes (non-default).
  • Wait statistics
  • Number of rows read vs actual number of rows
  • Estimated number of rows vs estimated number of rows to be read

SQL Server ships with several such tools, ranging from easier to harder to use; learning them just requires practicing with them.

Conclusion

You may have noticed that performance tuning is not an easy task; it requires both practice and a deep understanding of what to look for and how to measure improvements.

When you get a call in the middle of the night or during a holiday, resolving these issues can be especially challenging, as they often require code changes.

That’s why I recommend monitoring performance during normal workloads, so you can identify potential problems early and know what can be resolved quickly.

By regularly performing a database health checklist, you can address most general performance issues and focus solely on the real critical problems when they arise.
