Implementing the right indexes will improve the execution time and IO reads for a Select, Update or Delete. Unfortunately sometimes it is not that easy to identify missing indexes without having a deep understanding about Query Execution Plans. That is why SQL Server gives us a hint about missing indexes in an Execution Plan.

Sounds beneficial but in this blog post I will proof that it is not always a good idea to rely on these index recommendations.

According to Microsoft, there are the several Limitations for Using the Missing Indexes Feature.

  • It is not intended to fine tune an indexing configuration.
  • It cannot gather statistics for more than 500 missing index groups.
  • It does not specify an order for columns to be used in an index.
  • For queries involving only inequality predicates, it returns less accurate cost information.
  • It reports only include columns for some queries, so index key columns must be manually selected.
  • It returns only raw information about columns on which indexes might be missing.
  • It can return different costs for the same missing index group that appears multiple times in XML Showplans.

First of all I am going to explain the first restriction which is the limit of 500 records in the sys.dm_db_missing_index_group_stats view.

/*******************************************************************************

	missing index feature - limitations
	It cannot gather statistics for more than 500 missing index groups.

********************************************************************************/

/*
	
	Statistics are gathered for a maximum of 500 missing index groups. 
	After this threshold is reached, SQL Server will flush a certain amount of
	missing index information from memory. 
	This threshold cannot be changed.

*/

USE AdventureWorks2017;
GO

-- Create a table with 600 columns
DROP TABLE IF EXISTS test;
GO

CREATE TABLE test
(
	col1 smallint NOT NULL
);
GO

DECLARE @i smallint = 2;
DECLARE @col nvarchar(200) = '';
WHILE @i <= 600
	BEGIN
		SET @col =
			'ALTER TABLE test ADD col' + CONVERT(varchar(4), @i)
			+ ' AS col1 % ' + CONVERT(varchar(4), @i) + ' PERSISTED NOT NULL';
		EXEC sp_executesql @col;
		SET @i += 1;
	END;
GO

SET NOCOUNT ON;
GO

-- Insert 1.000 records
INSERT test
	(
		col1
	)
SELECT TOP 1000
	ROW_NUMBER() OVER ( ORDER BY message_id ) AS col1
FROM sys.messages;
GO

-- Get the data 
SELECT TOP 1000 * FROM test;
GO

DECLARE @max smallint;
DECLARE @i smallint = 1;
DECLARE @sql nvarchar(4000) = '';
-- Get the number of missing index records
SELECT @max = 500 - COUNT(*) FROM sys.dm_db_missing_index_group_stats;

WHILE @i <= @max
	BEGIN
		SET @sql =
			N'SELECT TOP 1 col' + CONVERT(varchar(3), @i)
			+ N' FROM test WHERE col' + CONVERT(varchar(3), @i) + N' = '
			+ CONVERT(varchar(3), @i)
			+ N' OPTION (RECOMPILE, QUERYTRACEON 8757 );';
		EXEC sys.sp_executesql @sql;
		SET @sql = N'';
		SET @i += 1;
	END;
GO

The code above is generating a simple table with 600 columns and 1,000 records.
Finally we run a dynamic SQL statement n times, where n is the remaining number of missing index information SQL Server is capable to gather.

This is the first query SQL Server generates and executes:

SELECT col1 FROM test WHERE col1 = 1
OPTION ( RECOMPILE, QUERYTRACEON 8757 );
GO

This is the second query :

SELECT col2 FROM test WHERE col1 = 2
OPTION ( RECOMPILE, QUERYTRACEON 8757 );
GO

So I am sure you understand what the script is doing.

In the execution plan we can see that SQL Server complains about a missing index for each of the queries.

That is why we finally see 500 missing index records:


-- 500 missing index information
SELECT COUNT(*) FROM sys.dm_db_missing_index_group_stats 
GO

Now let us execute another 50 unique queries to exceed the threshold of 500 records.

-- Execute another 50 unique queries
DECLARE @max smallint;
DECLARE @i smallint;
DECLARE @sql nvarchar(4000) = '';
SELECT @i = COUNT(*) + 1 FROM sys.dm_db_missing_index_group_stats;
SELECT @max = @i + 50;

WHILE @i <= @max
	BEGIN
		SET @sql =
			N'SELECT TOP 1 col' + CONVERT(varchar(3), @i)
			+ N' FROM test WHERE col' + CONVERT(varchar(3), @i) + N' = '
			+ CONVERT(varchar(3), @i)
			+ N' OPTION (RECOMPILE, QUERYTRACEON 8757 );';
		EXEC sys.sp_executesql @sql;
		SET @sql = N'';
		SET @i += 1;
	END;
GO

For a specific time period you can see even 550 missing index records in sys.dm_db_missing_index_group_stats which is more than the limit of 500.


-- 550 missing index statistics
SELECT * FROM sys.dm_db_missing_index_group_stats 
GO

But after a while a background process flushes a specific portion of the memory to free up space for the next queries.


-- 480 missing index statistics
SELECT * FROM sys.dm_db_missing_index_group_stats 
GO

Consequently you will lose the information about these missing indexes and therefore the possibility to create these indexes.

The only way to avoid losing the information is to harden the missing index information to disk but even with this approach you will lose records because you cannot predict when the limit of 500 records is reached so it is impossible to determine the appropriate schedule for the job.

I am going to continue with the limitations of the missing index feature in future blogs and I am sure afterwards you will prefer other techniques to improve query execution.

Reader Rating: 0 (0 Votes)