Influencing the Execution Plan
I had a performance problem recently with SQL Server, and I went through the standard performance checklist, however it didn't solve the problem permanently. Sometimes it would perform well, but most times it was performing poorly. I knew the next step was to mess with the execution plan. This is something I really don't like.
You do not want to force SQL Server to use a particular execution plan, because SQL Server can pick different execution plans depending on how much data will be processed. When it processes a few rows, it will choose a plan that is optimized for a few rows (and typically use nested loops). If the same script processes a lot of rows, it will use a plan that is optimized for a lot of rows (and use merge joins or hash joins). By forcing SQL Server to use single execution plan, you prevent it from using the most efficient execution plan for different scenarios.
But what happens if SQL Server estimates the wrong number of rows? The worst thing it can do is estimate few rows, use an execution plan optimized for a few rows, and actually process a large number of rows. In this scenario, you will find a very slow query.
I found an easy fix for this situation. Use the OPTION(HASH JOIN, MERGE JOIN) modifier to any SELECT, INSERT, UPDATE, or DELETE statement. For instance:
UPDATE cust SET CustomerSourceID = th.SourceID FROM Customer cust INNER JOIN TransactionHeader th ON th.CustomerID = cust.CustomerID WHERE cust.CustomerSourceID IS NULL OPTION (HASH JOIN, MERGE JOIN)
The OPTION (HASH JOIN, MERGE JOIN) modifier does not allow SQL Server to use nested loops. Since nested loops are typically efficient for a small number of rows, this causes SQL Server to optimize your query for a large number of rows. Even if this query encounters a few rows, the plan will be moderately efficient.
The good thing about OPTION (HASH JOIN, MERGE JOIN) is
- It does not require a statement to be restructured.
- It will not likely introduce any bugs.
The bad thing about it is
- You prevent SQL Server from selecting the best execution plan for all scenarios. The plan will be optimized for a large number of rows.