Every once in a while I run into this SQL problem. I find that a database is designed with empty tables that are then loaded with data. That data is processed in certain ways according to the purpose of the application, and then it is removed from the table by deletion, archiving, or truncation, ready for another run of the same process. This is what I mean by using tables as buckets.
Sometimes this is a big performance problem and sometimes not, depending on a lot of things. First among those things is the code that does the processing.
When this bucket process is a problem, it is often one that is difficult or impossible to fix without going back to square one of your database design.
There are lots of sub-problems to work out, but often the worst ones involve statistics and indexes.
As you probably know, the optimizer's use of indexes depends on statistics: no usable stats, no well-chosen indexes, bad execution plan.
When the database option Auto_Update_Stats is enabled (it is enabled by default), the rule for automatic updating of stats works like this: once 20% of the rows in a table, plus 500 rows, have been modified, SQL Server considers the associated column and/or index statistics objects stale and won't use them as they are. Instead it updates each invalidated object automatically the next time the query optimizer needs that statistic.
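To make that threshold concrete: for a 10,000-row table the rule fires after 500 + 20% of 10,000 = 2,500 modified rows. The sketch below, which assumes a hypothetical bucket table called dbo.WorkQueue, shows one way to watch the modification counter per statistics object using sys.dm_db_stats_properties.

```sql
-- Watch how many modifications each statistics object on the table has accumulated.
-- dbo.WorkQueue is a hypothetical bucket table, not from the original post.
SELECT  s.name                   AS stats_name,
        sp.rows,
        sp.modification_counter, -- changes since the stats were last updated
        sp.last_updated
FROM    sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE   s.object_id = OBJECT_ID(N'dbo.WorkQueue');
```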
Let’s see what happens when this bucket process is being executed.
- The empty table is loaded with rows of data. That is a 100% change.
- Auto Update Stats kicks in.
- The data is processed. It is a good bet that this makes a lot of changes to the data.
- Auto Update Stats kicks in again.
- After the data is processed, it is removed from the table. Another 100% change.
- Auto Update Stats kicks in again.
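Here is that cycle as a minimal T-SQL sketch; dbo.WorkQueue and dbo.IncomingOrders are hypothetical names used only for illustration.

```sql
-- One run of the bucket cycle. Every step changes far more than
-- 20% of the rows plus 500, so every step invalidates the statistics.

-- Load the empty bucket: a 100% change.
INSERT INTO dbo.WorkQueue (OrderId, OrderStatus)
SELECT OrderId, 'NEW'
FROM   dbo.IncomingOrders;

-- Process the rows: typically another large fraction of the table changes.
UPDATE dbo.WorkQueue
SET    OrderStatus = 'PROCESSED'
WHERE  OrderStatus = 'NEW';

-- Empty the bucket for the next run: another 100% change.
TRUNCATE TABLE dbo.WorkQueue;
```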
You see the problem?
There are some things we can do that might help. First, optimize the SQL code that does the data processing as well as it can be optimized, given the situation.
We could disable Auto_Update_Stats, but then the processing step runs without fresh stats, well-used indexes, or a decent execution plan. A further problem is that the Auto_Update_Stats setting is database-wide, so you would be disabling it for all tables. Bad mojo.
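For reference, this is what flipping that switch looks like; [YourDatabase] is a placeholder, and the actual option name is AUTO_UPDATE_STATISTICS.

```sql
-- Disables automatic statistics updates for the whole database,
-- not just the bucket table. [YourDatabase] is a placeholder name.
ALTER DATABASE [YourDatabase] SET AUTO_UPDATE_STATISTICS OFF;
```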
Another option is to manually or programmatically update stats or rebuild the appropriate indexes at the appropriate points in this process.
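A minimal sketch of what that might look like right after the load step, again assuming the hypothetical dbo.WorkQueue table and an index name made up for the example:

```sql
-- Refresh the statistics on the bucket table after the load step.
UPDATE STATISTICS dbo.WorkQueue WITH FULLSCAN;

-- Or, heavier still, rebuild an index so both the index and its stats are fresh.
-- IX_WorkQueue_OrderStatus is a hypothetical index name.
ALTER INDEX IX_WorkQueue_OrderStatus ON dbo.WorkQueue REBUILD;
```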
If you think this might slow things down you are right.
There is a trace flag that changes the auto-update thresholds for large tables, but I have not had much luck with it in relation to this specific problem.
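Assuming the flag in question is trace flag 2371, which replaces the fixed 20% threshold with a dynamic one that shrinks as the table grows, enabling it looks like this:

```sql
-- Enable trace flag 2371 for the whole instance (the -1 means globally).
-- This only changes when auto update fires; it does not fix the bucket pattern.
DBCC TRACEON (2371, -1);
```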
In the final analysis the best way I know of to deal with this problem is not to have it. Be careful to avoid it in your database design.