The following example will prove that the missing index feature will not consider creating filtered indexes.
/************************************************************************** missing index feature - limitations It does not suggest filtered indexes **************************************************************************/ DROP TABLE IF EXISTS [dbo].[missingIndex]; GO -- Create a table CREATE TABLE [missingIndex] ( [col1] int NOT NULL , [col2] int NULL ); GO ; -- Insert 100.000 test records into the table WITH [cte] AS ( SELECT [numbers].[col1] FROM ( VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9) ) AS [numbers] ([col1]) ) INSERT [dbo].[missingIndex] ( [col1] , [col2] ) SELECT [c1].[col1] , CASE WHEN [c1].[col1] < 9 THEN NULL ELSE [c1].[col1] END AS [col2] FROM [cte] AS [c1] CROSS JOIN [cte] AS [c2] CROSS JOIN [cte] AS [c3] CROSS JOIN [cte] AS [c4] CROSS JOIN [cte] AS [c5]; GO
-- Uneven distribution -- 90,000 NULL -- 10,000 9 SELECT COUNT(*), [col2] FROM [dbo].[missingIndex] GROUP BY [col2]; GO
-- Index recommendation from missing index dmv -- Costs : 0,268838 -- Table 'missingIndex'. Scan count 1, logical reads 211 -- The Query Processor estimates that implementing the following index could -- improve the query cost by 73.2966%. SELECT * FROM [dbo].[missingIndex] WHERE [col2] = 9 OPTION (QUERYTRACEON 8757); GO -- Index recommendation from missing index dmv -- Create a non clustered index including col1 CREATE NONCLUSTERED INDEX [missing_index_feature] ON [dbo].[missingIndex] ([col2]) INCLUDE ([col1]); GO -- 272 pages are allocated for the non clustered index SELECT * FROM [sys].[dm_db_index_physical_stats]( DB_ID() , OBJECT_ID('missingIndex', 'U'), NULL , NULL, 'detailed' ) WHERE [index_id] > 1; GO -- Flush pages from buffer CHECKPOINT; DBCC DROPCLEANBUFFERS; GO -- Costs : 0,0336894 -- Index Seek -- Table 'missingIndex'. Scan count 1, logical reads 30 SELECT * FROM [dbo].[missingIndex] WHERE [col2] = 9; GO -- Get the pages in buffer pool for the table missingIndex -- 29 pages are loaded into cache SELECT DB_NAME(DB_ID()) AS [db_name] , OBJECT_SCHEMA_NAME([object_id], DB_ID()) AS [schema_name] , OBJECT_NAME([object_id]) AS [object_name] , [file_id] , [page_id] , [page_level] , [page_type] , [row_count] , ([free_space_in_bytes]) , ([read_microsec]) AS [read_microsec] FROM [sys].[dm_os_buffer_descriptors] AS [bd] INNER JOIN ( SELECT [object_id], [index_id], [allocation_unit_id] FROM [sys].[allocation_units] AS [au] INNER JOIN [sys].[partitions] AS [p] ON [au].[container_id] = [p].[hobt_id] AND ( [au].[type] = 1 OR [au].[type] = 3 ) UNION ALL SELECT [object_id], [index_id], [allocation_unit_id] FROM [sys].[allocation_units] AS [au] INNER JOIN [sys].[partitions] AS [p] ON [au].[container_id] = [p].[partition_id] AND [au].[type] = 2 ) AS [obj] ON [bd].[allocation_unit_id] = [obj].[allocation_unit_id] WHERE [database_id] = DB_ID() AND [object_id] = OBJECT_ID('missingIndex', 'U'); GO -- Drop index DROP INDEX IF EXISTS [missing_index_feature] ON [missingIndex]; GO
-- Create a filtered on col2 -- Col2 IS NOT NULL -- Improvement : 94,38% CREATE NONCLUSTERED INDEX [filtered_index] ON [dbo].[missingIndex] ([col2] ASC) INCLUDE ([col1]) WHERE ([col2] IS NOT NULL); GO -- 28 pages are allocated for the non clustered filtered index SELECT * FROM [sys].[dm_db_index_physical_stats]( DB_ID() , OBJECT_ID('missingIndex', 'U'), NULL , NULL, 'detailed' ) WHERE [index_id] > 1; GO -- Flush pages from buffer CHECKPOINT; DBCC DROPCLEANBUFFERS; GO -- Costs : 0,0156153 -- Index Seek -- Table 'missingIndex'. Scan count 1, logical reads 30 SELECT * FROM [dbo].[missingIndex] WHERE [col2] = 9; GO -- Get the pages in buffer pool for the table missingIndex -- 29 pages are loaded into cache SELECT DB_NAME(DB_ID()) AS [db_name] , OBJECT_SCHEMA_NAME([object_id], DB_ID()) AS [schema_name] , OBJECT_NAME([object_id]) AS [object_name] , [file_id] , [page_id] , [page_level] , [page_type] , [row_count] , ([free_space_in_bytes]) , ([read_microsec]) AS [read_microsec] FROM [sys].[dm_os_buffer_descriptors] AS [bd] INNER JOIN ( SELECT [object_id], [index_id], [allocation_unit_id] FROM [sys].[allocation_units] AS [au] INNER JOIN [sys].[partitions] AS [p] ON [au].[container_id] = [p].[hobt_id] AND ( [au].[type] = 1 OR [au].[type] = 3 ) UNION ALL SELECT [object_id], [index_id], [allocation_unit_id] FROM [sys].[allocation_units] AS [au] INNER JOIN [sys].[partitions] AS [p] ON [au].[container_id] = [p].[partition_id] AND [au].[type] = 2 ) AS [obj] ON [bd].[allocation_unit_id] = [obj].[allocation_unit_id] WHERE [database_id] = DB_ID() AND [object_id] = OBJECT_ID('missingIndex', 'U'); GO -- Housekeeping DROP TABLE IF EXISTS [dbo].[missingIndex]; GO
The table I created contains of 100,000 records but has an uneven distribution on col2. Only 10,000 or 10% of the records has a value for col2.
SQL Server recommends to create an index on col2 which would improve the execution by 73%.
When we create the index suggested from the missing index feature, we can see an improvement. The engine uses the index, which reduces logical reads from 211 to 30 pages and the costs are down from 0.268838 to 0.0336894.
The new index allocates 272 pages in the leaf of the index B-Tree, which is approx. 2MB.
Worth to mention that SQL Server only saved 29 pages in the buffer pool because the engine was able to apply a predicate push down.
Dropping the existing index and creating a covered filtered index instead will have some interesting effects.
If we execute the query with the support of the filtered index again we can see the same plan as before. SQL Server uses an Index Seek operator to satisfy the query as before.
The number of pages SQL Server reads is the same as before, namely 30. Also, the number of pages in cache is the same as before. So from that perspective there is no benefit to replace the recommended index with the new filtered one.
But as the filtered index only includes the records where col2 is not null, we will see that the index size is reduced dramatically. The number of pages in the leaf of the filtered index is 28 which is equal to approx. 0.22MB only. Comparing to the former index this is a reduction of approx. 90%.
The costs of the query with the filtered index is reduced to 0.0156153 comparing to 0.0336894.
The reason why the query with the filtered index is considered cheaper is because SQL Server assumes that the engine must always read the pages from the disk rather than the buffer pool and we only have 28 pages for the filtered index in the data file. But as we have the same number of pages in the buffer pool for both indexes the costs of the query should be the same.
In summary we can see that a filtered index can save a lot of disk space and the maintenance for these indexes is much cheaper. Nevertheless, you have to keep in mind that a filtered index can be used only for the predicate defined in the index creation. If we use a predicate which is not part of the index filter SQL Server has to use a table scan instead. So, be careful using filtered indexes without knowing the expected distribution of the data.
I could show more examples of the limitations of the missing index feature, but I think these blog post series have made it clear that this feature is to be used with caution. I strongly recommend that you do not apply the suggested indexes without reviewing them. You should understand the suggestions as an indication and instead analyze the plan cache and query execution statistics with sound mind. Focus on the most expensive queries and if you are not sure the new index will improve query execution you can apply hypothetical indexes to avoid creating indexes that slow down the production environment.
It is worth noting that the Database Tuning Advisor does a much better job than the missing index function. For all the examples I’ve shown including the above, DTA suggested the correct index, but DTA has its own limitations.
I am a Microsoft Most Valuable Professional for Data Platform from Germany specialized in Microsoft SQL Server performance tuning, troubleshooting and T-SQL programming. You can meet me at international and national user groups and conferences.
Related Posts
- The limitations of the missing index feature – It suggests creating indexes on imprecise and not persisted columns
- The limitations of the missing index feature – It reports only include columns for some queries
- The limitations of the missing index feature – It does not specify an order for columns to be used in an index
- The limitations of the missing index feature – It does not consider trivial plans
- The limitations of the missing index feature – It cannot gather statistics for more than 500 missing index groups
inside-sqlserver is a friend of Redgate
Share the passion