The order in which tables are accessed by the query engine is a critical factor in query performance. Its importance is sometimes underestimated and join order is often overlooked when a query needs optimization. However, it can be argued that join order is the most important aspect of an execution plan. Mistakes in join order will result in a sub-optimal plan.
One reason for for overlooking join order while optimizing might be that most of the time the internal query optimizer in SQL Server gets the join order right.
The optimizer does not necessarily process joins in the order we place them in the FROM clause of the query. It is usually smart enough to re-arrange them in a more efficient order, at least for queries that only involve a few tables.
40,000 Ways to Get it Wrong
However, the number of possible join orders for a query increases factorially with the number of tables mentioned in the joins. For example, a query that joins 8 tables has 40,320 possible join orders. That rules out trial and error for all but the simplest queries.
The job of the query optimizer is not to come up with the most efficient execution plan. Its job is to come up with the most efficient execution plan that it can find in a very short amount of time. It cannot fully explore the cost of 40,000 different join orders in the time available. That is why it sometimes comes up with a sub-optimal join order in queries with a lot of joins.
The use of non-standard joins like those below will also pose big problems for the optimizer.
- Joining to a user defined table-valued function or table variable
- Joining to a subselect aliased as a table
- Using conditional logic in the join
Using any of these or any of a host of other join oddities can lead to a bad execution plan and a slow execution time. One reason is that the SQL Server optimizer is dependent on data distribution statistics and none of these pseudo-tables have statistics.
How Do I Change the Join Order?
There is nothing simple about dealing with bad join order. At a minimum you can avoid the use of non-standard joins as they are described above. That will work in many cases to allow the optimizer to come up with a better plan.
Of course you can force a join order with FORCEPLAN or a plan guide if you think the optimizer got it wrong. However, before you can do that you need to know what the optimal join order is.
Finding the correct order is neither easy nor fast on a complex query. At any rate, there is no room in a short article like this for a necessarily long tutorial. Instead I will point you to a source that I consider the best for understanding and fixing these critical join problems.
A few years ago, Dan Tow wrote SQL Tuning, a brilliant book on this subject. While a few years old, the book is as relevant now as it was when it was published and nothing is likely to change that any time soon.
Dan begins with three brief chapters that provide details specific to SQL Server, Oracle and DB2 ( I suspect that his editors made him do that to broaden the market for the book.) But the methodology he lays out in the rest of the book is platform independent. The heart of the book involves a process and an elegant system of notation that makes it much simpler to find the correct join order for the most complex query..
I have read a lot of SQL books, blogs, articles, etc. and only two stand out in my memory as being works of genius. This is one of them. It is not a book for beginners but if you have a basic understanding of database management systems this book will take you to another level. It costs about $18 for the eBook or $28 for the print book.
You can get it HERE.
Full Disclosure: I have no financial interest in the sale of this book. I recommend it on its merits.