There is a lot of literature out there about indexing and query optimization, but most of what I have seen is relatively abstract. Authors discuss the selectivity and cardinality of the data, the characteristics of clustered and nonclustered indexes, and so on, but they often do not show readers a practical way to apply that knowledge.
The purpose of this article is to describe one method of putting those fundamentals into practice.
Prerequisites
Query optimization is too broad a subject to be covered completely in an article of this size. I am going to assume that the reader has a baseline familiarity with SQL Server and indexing fundamentals.
We will use a query based on the AdventureWorks sample database from Microsoft. If you want to follow along, you can download AdventureWorks as well as the other sample databases.
What is Query Optimization?
Optimization is a general term applied to many different techniques for enhancing the performance of SQL Server. In this article, we will discuss only one aspect of optimization: improving query performance through the application of indexes.
How to Measure Query Cost
Before you can begin tuning a query, you must know how to quantify its cost so that you can measure progress as you try different optimizations.
There are three metrics in common use: execution time, CPU usage, and disk usage. I prefer to use disk usage, specifically logical page reads, and I will explain why I think this is the best metric to use.
Execution Time
While it is true that your ultimate goal is to reduce the execution time of the query, execution time itself is not the best metric to use when measuring query cost. There are too many things that can affect query duration. For example, whether the queried data is in the memory cache or has to be retrieved from disk makes a tremendous difference in execution time. Blocking and other concurrency issues also affect execution time, as does the general resource load on the hardware at the time of execution.
CPU
CPU usage is another way to measure query cost. Sometimes your goal is to reduce CPU usage rather than to reduce execution time. But again, CPU usage is not the best metric for measuring query cost when you are considering optimization through indexing.
Indexes affect page reads; they do not directly affect CPU use. Lower CPU usage is a side effect of reducing page reads.
Logical Page Reads
Logical page reads is the most stable metric for measuring query cost and the progress you are making. Logical reads include all pages read by the query, whether from memory or from physical disk.
Since physical reads are much more expensive than memory reads, you might think that physical reads is the metric we should use to measure query cost.
Reducing physical reads is an important goal, but physical reads is not a stable metric to use for optimization.
The first time you execute the query in question you may see a lot of physical reads. The next time you may see none. The number of physical reads is just an accident of whether the data is in memory cache or on disk. It cannot be relied on from one execution to the next.
Given the same query, the same data, the same indexing and same query plan, the number of logical reads will always remain the same. This gives you a stable metric and focuses you on what you are actually trying to do, which is to improve performance by changing and/or adding indexes to reduce logical page reads.
If you reduce logical page reads, you will also reduce physical page reads, CPU use, and execution time in roughly the same proportion.
Optimizing on a Test Server
Another benefit of using logical reads as your metric is that you can restore a copy of the database onto a test server and do your optimization there. The statement you are optimizing will probably run more quickly without other users on the server and without the other interference factors that affect it on the production server, so execution time will not be a reliable metric for judging progress.
But logical reads will be the same on this server as they are on the production server, because the data, indexing, and so on are the same. If you make a dramatic reduction in logical reads by creating indexes in this environment, you can be confident that you will see the same benefit when you create the indexes in the production environment.
When you are working on a test server, you also have the option of clearing all buffers between runs of the query you are optimizing. This will ensure that executions run under exactly the same conditions.
To clear all buffers, execute these two statements between runs of the query you are trying to optimize:
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
You should not attempt this on a production server, at least not while users are working. It will empty the data cache and force all query plans to be recompiled, which can seriously affect performance for a few minutes on a production system. However, this “worst-case” scenario is exactly what you want when you optimize.
Viewing Query Cost
There are a number of ways to capture the disk I/O, CPU use, and execution time of a query. What has proven simplest for me is to open a query window in Management Studio (SSMS) to use for optimization. I first execute the statement SET STATISTICS IO ON. This turns the Messages tab in SSMS into a record of the I/O used by any query executed from that window.
You can also execute SET STATISTICS TIME ON to view CPU usage and query execution time if you like. I normally don't do this because the CPU time and execution time it reports are not things that are directly changed by indexing, as I described above.
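For reference, here is what those two statements look like when run in the optimization window (the TIME statement being optional, as noted):

SET STATISTICS IO ON;    -- report logical, physical, and read-ahead reads per table on the Messages tab
SET STATISTICS TIME ON;  -- optional: also report CPU time and elapsed time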
Now after setting STATISTICS IO, if I execute the following query in this window:
select h.SalesOrderID, d.LineTotal
from Sales.SalesOrderHeader h
join Sales.SalesOrderDetail d
    on h.SalesOrderID = d.SalesOrderID
where d.UnitPriceDiscount = 0.30
    and h.Status = 5
The Messages tab in SSMS will look like this (word wrap in the newsletter format garbles it a bit):
(61 row(s) affected)
Table 'SalesOrderHeader'. Scan count 0, logical reads 195, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderDetail'. Scan count 1, logical reads 1240, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Normally, I also click the 'Include Actual Execution Plan' button on the toolbar or select it from the 'Query' menu. But examining query plans is a large subject in itself; we will tackle them in detail in another article. For now, we will ignore them except to mention that they are the best way to see which indexes a query is using.
Identify what needs to be optimized
Sometimes you know what code is causing the problem. For example, a scheduled job executes every four hours and every time it runs you see query timeouts and get complaints from your users. There is no mystery about the code that is causing the problem.
Then there is the more general case where everything is running slower than the users would like. To help you decide where to start, the query below aggregates execution statistics for the 50 queries that use the most resources on your SQL Server.
By default the results are ordered by total logical reads, from the highest resource users to the lowest. The alternative ORDER BY clauses are commented out; you can comment and uncomment them to get different views of resource usage by query.
USE MASTER;
GO

SELECT TOP 50 t.[text] AS [Batch],
    SUBSTRING(t.[text], (qs.[statement_start_offset]/2) + 1,
        ((CASE qs.[statement_end_offset]
            WHEN -1 THEN DATALENGTH(t.[text])
            ELSE qs.[statement_end_offset] END
          - qs.[statement_start_offset])/2) + 1) AS [Statement]
    , qs.[execution_count] AS [Count]
    , qs.[total_worker_time] AS [Tot_CPU]
    , (qs.[total_worker_time] / qs.[execution_count]) AS [Avg_CPU]
    , qs.[total_physical_reads] AS [Tot_Phys_Reads]
    , (qs.[total_physical_reads] / qs.[execution_count]) AS [Avg_Phys_Reads]
    , qs.[total_logical_writes] AS [Tot_Logic_Writes]
    , (qs.[total_logical_writes] / qs.[execution_count]) AS [Avg_Logic_Writes]
    , qs.[total_logical_reads] AS [Tot_Logic_Reads]
    , (qs.[total_logical_reads] / qs.[execution_count]) AS [Avg_Logic_Reads]
    , qs.[total_clr_time] AS [Tot_CLR_Time]
    , (qs.[total_clr_time] / qs.[execution_count]) AS [Avg_CLR_Time]
    , qs.[total_elapsed_time] AS [Tot_Elapsed_Time]
    , (qs.[total_elapsed_time] / qs.[execution_count]) AS [Avg_Elapsed_Time]
    , qs.[last_execution_time] AS [Last_Exec]
    , qs.[creation_time] AS [Creation Time]
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS t
-- ORDER BY [Tot_CPU] DESC
-- ORDER BY [Avg_CPU] DESC
-- ORDER BY [Tot_Phys_Reads] DESC
-- ORDER BY [Tot_Logic_Writes] DESC
ORDER BY [Tot_Logic_Reads] DESC
-- ORDER BY [Avg_Logic_Reads] DESC
-- ORDER BY [Tot_CLR_Time] DESC
-- ORDER BY [Tot_Elapsed_Time] DESC
-- ORDER BY [Count] DESC
The result set from this query is an ordered list of the SQL statements having the biggest overall impact on your performance. You can treat it as a to-do list.
Note that the first column, 'Batch', is the code for the entire stored procedure, trigger, or SQL batch. The 'Statement' column contains just the part of the code to which the statistics apply. This is very handy for quickly finding the problem code within a stored procedure.
Sorting to Find High Impact Code
The default sort by total logical reads shows the code with the most overall impact on the server. However, your problem may be less frequently run queries that have a huge impact on performance each time they run, rather than those with the highest cumulative cost.
For that, you might want to sort by average logical reads. This will highlight the show-stoppers.
A Quick Review of Optimization Basics
Just for a refresher, we will run through a list of often-overlooked optimization basics:
Cardinality is simply the number of distinct values that appear in a column; there is nothing difficult about that. The practical importance of cardinality is that we use it to calculate selectivity.
Selectivity is a measure of how many rows (what percentage of all the rows in the table) would be returned by a filter on a single value for the column. The average selectivity of a column can be calculated as (total number of rows in the table) / (number of unique values in the column), which gives the average number of rows returned for any one value.
If a column has low selectivity (any given value will return a high number of rows) then the optimizer may not use an index at all but might prefer a full table scan.
Selectivity is very important when considering a nonclustered index. If the selectivity is such that the filter will return more than a small percentage of the rows in the table, it is very unlikely the nonclustered index will ever be used.
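As a rough illustration, a query along these lines computes the total row count, cardinality, and average rows per value for a column. The example below uses the UnitPriceDiscount column from the AdventureWorks table we optimize later in this article, but any table and column can be substituted:

select count(*) as TotalRows,                                          -- total rows in the table
    count(distinct UnitPriceDiscount) as Cardinality,                  -- number of unique values
    count(*) / count(distinct UnitPriceDiscount) as AvgRowsPerValue    -- average rows returned per value
from Sales.SalesOrderDetail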
Optimization begins at the FROM clause. There is not much you can put in the SELECT clause (except a subquery) that will affect your optimization plans.
Since optimization begins at the FROM clause, you can optimize searched UPDATE and searched DELETE statements without actually deleting or updating rows in the table. Simply rewrite the query as a SELECT statement using the same joins and the same WHERE clause, as in the sketch below.
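Here is a minimal sketch of that rewrite. The UPDATE itself is invented for illustration against the AdventureWorks tables used later in this article:

-- Hypothetical searched UPDATE we want to tune (illustration only):
-- UPDATE d
-- SET d.UnitPriceDiscount = 0.00
-- FROM Sales.SalesOrderDetail d
-- JOIN Sales.SalesOrderHeader h ON h.SalesOrderID = d.SalesOrderID
-- WHERE d.UnitPriceDiscount = 0.30 AND h.Status = 5

-- Equivalent SELECT with the same FROM, JOIN, and WHERE clauses; tune this instead:
select d.SalesOrderDetailID
from Sales.SalesOrderDetail d
join Sales.SalesOrderHeader h on h.SalesOrderID = d.SalesOrderID
where d.UnitPriceDiscount = 0.30 and h.Status = 5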
If there are joins between two or more tables, check that the columns on both sides of the join are indexed. One side of the join is generally the primary key of a parent table. The PK always has a unique index, but the column on the child table side of the join (the foreign key) does not get an index created automatically the way the PK does. You should check to make sure the child side of the join is indexed, unless the table with the foreign key is very small.
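For example, here is a minimal sketch using hypothetical parent/child tables; the table, column, and index names are invented, and the point is simply that the foreign key column on the child side of the join gets its own nonclustered index:

-- Hypothetical child table dbo.OrderItems with foreign key column OrderID:
create nonclustered index IX_OrderItems_OrderID
    on dbo.OrderItems (OrderID);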
Indexing unindexed search arguments in the WHERE clause is generally the most important part of optimization and the part you look at first.
A Simple Example
We will use the following query for our optimization exercise:
select h.SalesOrderID, d.LineTotal
from Sales.SalesOrderHeader h
join Sales.SalesOrderDetail d
    on h.SalesOrderID = d.SalesOrderID
where d.UnitPriceDiscount = 0.30
    and h.Status = 5
This query is trivial because the tables in the sample databases are so small. You will have to exercise your imagination to view this query as a showstopper. But it does give us a chance to illustrate some of the techniques you might use for optimization. Let’s begin.
Using sp_helpindex, we find that neither of the search arguments, UnitPriceDiscount and Status, is indexed:
USE AdventureWorks;
EXEC sp_helpindex 'Sales.SalesOrderHeader';
EXEC sp_helpindex 'Sales.SalesOrderDetail';
This is a promising start but clearly we need a little more information about the tables before we go throwing indexes at the problem. This is where the more abstract concepts like selectivity and cardinality come into play.
Here is a quick way to get the cardinality of the Status column:
select count(distinct Status) from Sales.SalesOrderHeader
We find that there is only one distinct value for Status: every record in SalesOrderHeader contains the number 5. This means that an index on Status is useless for optimizing our query.
We turn to UnitPriceDiscount and run the same sort of query against SalesOrderDetail. We find that there are only 9 unique values in this column. That is better than Status, but still marginal. Will a nonclustered index help us?
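For reference, the cardinality query for this column looks like this:

select count(distinct UnitPriceDiscount) from Sales.SalesOrderDetail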
Using 'select count(*) from Sales.SalesOrderDetail', we find that there are 121,317 records in the table.
A bit of quick division gives us the average selectivity of the column. We divide 121,317 by 9 and find that, on average, a given value for UnitPriceDiscount will return about 13,479 rows. That is about 11% of the entire table, which is getting near the point where the optimizer will prefer a clustered index scan of the entire table over an index seek on a nonclustered index.
Fortunately, the optimizer doesn't rely on averages; it relies on statistics: histograms that give it much more granular information about the data distribution.
We can make a rough approximation of this granularity by executing:
select UnitPriceDiscount, count(*) as [Count]
from Sales.SalesOrderDetail
group by UnitPriceDiscount
order by count(*) desc
We get the number of rows returned for each of the 9 values in the table:
UnitPriceDiscount | Count of Rows
0.00              | 118035
0.02              | 1310
0.05              | 606
0.15              | 590
0.20              | 231
0.10              | 178
0.40              | 169
0.35              | 137
0.30              | 61
With this more detailed information, we can see that our query is very selective for the value 0.30: filtering on that value will return no more than 61 rows.
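To exploit that, we create a nonclustered index on the UnitPriceDiscount column. The index name below is just an illustration; use whatever fits your naming convention:

-- Index name is illustrative; any name will do.
create nonclustered index IX_SalesOrderDetail_UnitPriceDiscount
    on Sales.SalesOrderDetail (UnitPriceDiscount);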
This nonclustered index on UnitPriceDiscount gives us an efficient index seek on SalesOrderDetail for this query. Here is the information from the Messages tab after creating the index and executing the query again:
(61 row(s) affected)
Table 'SalesOrderHeader'. Scan count 0, logical reads 195, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderDetail'. Scan count 1, logical reads 198, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
You can see that the logical reads on the SalesOrderDetail table dropped from 1240 to 198. This is a very modest improvement in comparison with what we normally see when optimizing. However, it illustrates some important points.
We avoided creating a useless index by examining the cardinality of the data.
We found and created a useful index by digging more deeply into the selectivity of the column data.
We were able to measure the amount of improvement by comparing logical page reads before and after creating the index.
More About This Example
You rarely see a cardinality of 1 anywhere but in a sample database. If there are a number of possible values, you will usually find most of them represented in a production database.
However, it is common to see counts heavily skewed toward a single value, as we saw in the counts for UnitPriceDiscount. Well over 90% of the rows in the table had a discount of 0.00, resulting in very poor selectivity for that value. The rest of the values were very selective.
It is very common to find columns with a cardinality of 2. These columns, typically bit fields or yes/no type columns, are often overlooked for indexing because of the low cardinality. If you only take a high-level look at the cardinality and selectivity, you might not consider them for an index. But if you take a closer look at the distribution of values, as we did above for UnitPriceDiscount, you might find reasons to index them.
For example, let's consider a 'shipping status' column on an order table. It has two possible values, 'shipped' and 'not shipped'.
Most companies that have been in business for any significant amount of time will have many more shipped orders than unshipped orders. Even though the 'shipping status' column has a cardinality of only 2, the selectivity of the 'not shipped' value makes it well worth having an index on this column if you often search for unshipped orders.
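As a sketch, with hypothetical table and column names, such an index might look like the following. On SQL Server 2008 and later, a filtered index is another option that targets only the selective value:

-- Hypothetical table and column names, for illustration only:
create nonclustered index IX_Orders_ShippingStatus
    on dbo.Orders (ShippingStatus);

-- On SQL Server 2008 and later, a filtered index covers just the selective value:
-- create nonclustered index IX_Orders_NotShipped
--     on dbo.Orders (ShippingStatus)
--     where ShippingStatus = 'not shipped';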
Conclusion
Once more I have taken on a huge topic and have run out of space before doing it justice. I have tried to hit some of the more important points but there is much more to be said on this subject.
I hope that this much of it has been helpful to you. We will take up the subject again in a future newsletter.