Missing Indexes

Index plays an important role in performance tuning. A proper index can improve the performance and a bad index can hamper the performance.

What Are Missing Indexes?

SQL Server keeps index statistics behind the scenes.  It knows how an index has been used, how often it is used, or if it is even being used at all.  But SQL Server doesn’t just monitor the indexes that we currently have on a database, it even keeps up with statistics for indexes that it wishes we had!

When SQL Server is processing a query, it makes a suggestion for an index that it believes will help that query run faster.  These are known as Missing Index Suggestions. 

Guidance and Recommendations

When SQL Server recommends a missing index, it does so based on the execution of that query at that specific point in time.  It doesn’t take into consideration any other factors such as current server load, or any existing indexes, so you should consider these index recommendations as just that, recommendations.  This is just for guidance. Please note that you should not create all the missing indexes this script suggests.

  • Do not just blindly create whatever indexes SQL Server is recommending without any prior analysis.
  • You should not create more than 10 indexes per table.
  • The script is a good starting point. You should pay attention to Average User Impact and Average Total User Cost when you are going to create an index.

Script

The below query will show missing index suggestions for the specified database.  It pulls information from the [sys].[dm_db_missing_index_groups] [sys].[dm_db_missing_index_group_stats] and [sys].[dm_db_missing_index_details] DMVs. 

You can also narrow it down to a specified table by un-commenting the WHERE clause statement and specifying the table name.  This script also generates a CREATE INDEX script for each record, so that you can take a better look at the index and decide if you want to use it.

Here is the script, which we can use to identify missing indexes on a database.

SELECT db.[name] AS [DatabaseName]

    ,id.[object_id] AS [ObjectID]

       ,OBJECT_NAME(id.[object_id], db.[database_id]) AS [ObjectName]

    ,id.[statement] AS [FullyQualifiedObjectName]

    ,id.[equality_columns] AS [EqualityColumns]

    ,id.[inequality_columns] AS [InEqualityColumns]

    ,id.[included_columns] AS [IncludedColumns]

    ,gs.[unique_compiles] AS [UniqueCompiles]

    ,gs.[user_seeks] AS [UserSeeks]

    ,gs.[user_scans] AS [UserScans]

    ,gs.[last_user_seek] AS [LastUserSeekTime]

    ,gs.[last_user_scan] AS [LastUserScanTime]

    ,gs.[avg_total_user_cost] AS [AvgTotalUserCost]  — Average cost of the user queries that could be reduced by the index in the group.

    ,gs.[avg_user_impact] AS [AvgUserImpact]  — The value means that the query cost would on average drop by this percentage if this missing index group was implemented.

    ,gs.[system_seeks] AS [SystemSeeks]

    ,gs.[system_scans] AS [SystemScans]

    ,gs.[last_system_seek] AS [LastSystemSeekTime]

    ,gs.[last_system_scan] AS [LastSystemScanTime]

    ,gs.[avg_total_system_cost] AS [AvgTotalSystemCost]

    ,gs.[avg_system_impact] AS [AvgSystemImpact]  — Average percentage benefit that system queries could experience if this missing index group was implemented.

    ,gs.[user_seeks] * gs.[avg_total_user_cost] * (gs.[avg_user_impact] * 0.01) AS [IndexAdvantage]

    ,’CREATE INDEX [IX_’ + OBJECT_NAME(id.[object_id], db.[database_id]) + ‘_’ + REPLACE(REPLACE(REPLACE(ISNULL(id.[equality_columns], ”), ‘, ‘, ‘_’), ‘[‘, ”), ‘]’, ”) + CASE

        WHEN id.[equality_columns] IS NOT NULL

            AND id.[inequality_columns] IS NOT NULL

            THEN ‘_’

        ELSE ”

        END + REPLACE(REPLACE(REPLACE(ISNULL(id.[inequality_columns], ”), ‘, ‘, ‘_’), ‘[‘, ”), ‘]’, ”) + ‘_’ + LEFT(CAST(NEWID() AS [nvarchar](64)), 5) + ‘]’ + ‘ ON ‘ + id.[statement] + ‘ (‘ + ISNULL(id.[equality_columns], ”) + CASE

        WHEN id.[equality_columns] IS NOT NULL

            AND id.[inequality_columns] IS NOT NULL

            THEN ‘,’

        ELSE ”

        END + ISNULL(id.[inequality_columns], ”) + ‘)’ + ISNULL(‘ INCLUDE (‘ + id.[included_columns] + ‘)’, ”) AS [ProposedIndex]

    ,CAST(CURRENT_TIMESTAMP AS [smalldatetime]) AS [CollectionDate]

FROM [sys].[dm_db_missing_index_group_stats] gs WITH (NOLOCK)

INNER JOIN [sys].[dm_db_missing_index_groups] ig WITH (NOLOCK) ON gs.[group_handle] = ig.[index_group_handle]

INNER JOIN [sys].[dm_db_missing_index_details] id WITH (NOLOCK) ON ig.[index_handle] = id.[index_handle]

INNER JOIN [sys].[databases] db WITH (NOLOCK) ON db.[database_id] = id.[database_id]

WHERE  db.[database_id] = DB_ID()

–AND OBJECT_NAME(id.[object_id], db.[database_id]) = ‘YourTableName’

ORDER BY ObjectName, [IndexAdvantage] DESC

OPTION (RECOMPILE);

Note: – It’s important to mention that these Missing Index statistics get reset every time SQL Server is restarted, so if you’ve recently rebooted the server you may not have many index stats accumulated yet.

References

https://docs.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-db-missing-index-group-stats-transact-sql?view=sql-server-ver15

https://docs.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-db-missing-index-groups-transact-sql?view=sql-server-ver15

https://docs.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-db-missing-index-details-transact-sql?view=sql-server-ver15

Max_Grant_Percent & Min_Grant_Percent Query Hints

In this article, we will look closely at the new query memory grant options that are available min_grant_percent and max_grant_percent in SQL Server 2012.

MAX_GRANT_PERCENT

A percentage value that specifies the maximum amount of memory that can be granted to a query. Range is 0.0 to 100.0. Float value is valid.

If the size of this max memory limit is smaller than the required memory to run a query, the required memory is granted to the query.

MIN_GRANT_PERCENT

A percentage value that specifies the minimum amount of memory that should be granted to a query. Range is 0.0 to 100.0. Float value is valid.

Note: These two new query memory grant options aren’t available for index creation or rebuild.

Demo

Let’s run the following SELECT statement with the Actual Execution Plan enabled:

SELECT * FROM dbo.TAT_TIME
WHERE WIP_STATUS = 'B'
ORDER BY TK_EMPL_UNO

This query returned around 15 million records 15,287,989.

Next, go to the execution plan and right click and select “Show Execution Plan XML …” It will open the execution plan as XML. Let’s focus on the memory grant information in the current execution plan.

<MemoryGrantInfo DesiredMemory="9451520" RequestedMemory="9451520" GrantWaitTime="0" GrantedMemory="9451520"

As you can notice here the desired memory is 9.45 GB and the granted memory is 9.45 GB.

Let’s use the max_grant_percent query hint and execute the same query now.

SELECT * FROM dbo.TAT_TIME
WHERE WIP_STATUS = 'B'
ORDER BY TK_EMPL_UNO OPTION(max_grant_percent = 50)

We have specified to grant only 50 percent of the maximum amount of memory to the query.

Let’s see the memory grant information for this query in the current execution plan.

<MemoryGrantInfo DesiredMemory="9451520" RequestedMemory="5422824" GrantWaitTime="0" GrantedMemory="5422824"

As you can notice here the desired memory is 9.45 GB but the granted memory is 5.42 GB only.

So the query still desires about 9.45 GB of memory, but since we applied max_grant_percent query hint, the query receives 50% of the maximum memory grant.

References

https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-query?view=sql-server-2017

https://support.microsoft.com/en-ae/help/3107401/new-query-memory-grant-options-are-available-min-grant-percent-and-max

Best Practices for Creating Indexes

Poorly designed indexes and a lack of indexes are primary sources of database application bottlenecks. Designing efficient indexes is paramount to achieving good database and application performance. This article provides the best practices to help you design effective indexes to meet the needs of your application.

There are lot of factors that requires to be considered before creating/adding an index on a table.

  • The selection of the right indexes for a database and its workload is a complex balancing act between query speed and update cost. You may have to experiment with several different designs before finding the most efficient index.
  • SQL Server has to perform some maintenance when index keys are updated, so each additional index could impose a slight performance penalty on the performance of INSERT, UPDATE and DELETE statements. Always consider that the cost of index maintenance could be far less than the performance improvement you’ll reap for your SELECT statements. Always consider index read/write statistics. The performance gains should outweigh the losses.
  • Understand the characteristics of the database and the table itself. Consider the total number of writes and reads on the table before creating/adding an index.
  • Avoid over-indexing heavily updated tables and keep indexes narrow, that is, with as few columns as possible.
  • Indexing small tables (less than 64 kb or 100 rows) may not be optimal because it can take the query optimizer longer to traverse the index searching for data than to perform a simple table scan.
  • Peruse the built-in tools for analysing and recommending indexes but do not rely on them solely. Index tuning wizard, Database Engine Tuning Advisor (DETA), execution plans etc. are good tools for starting the index analysis. However, they only recommends indexes that are relevant for a particular workload you supply. These tools cannot reasonably estimate how frequently any particular query will be executed and certainly doesn’t know which queries are most important to your users. These tools allows you to choose from a number of tuning options and provides excellent recommendations for improving performance by creating indexes and statistics. Even so you still has to consider lot of other factors such as considering complete workload etc. before creating/adding an index.
  • While troubleshooting the queries performance and adding indexes, first understand the characteristics of the columns used in the queries. For example, an index is ideal for columns that have an integer data type and are also unique or non-null columns.
  • After you have determined that an index is appropriate for a query, you should select the type of index that best fits your situation.
  • Keep the length of the index key short for clustered indexes. Additionally, clustered indexes benefit from being created on unique or non-null columns.  Try to create the clustered index on the column which is used most frequently for retrieving data.
  • Create non-clustered indexes on the columns that are frequently used in predicates and join conditions in queries. However, you should avoid adding unnecessary columns. Adding too many index columns can adversely affect disk space and index maintenance performance.
  • Columns that are of the ntexttextimagevarchar(max)nvarchar(max), and varbinary(max)data types cannot be specified as index key columns. However, varchar(max)nvarchar(max)varbinary(max), and xml data types can participate in a non-clustered index as non-key index columns.
  • Consider using filtered indexes on columns that have well-defined subsets, for example sparse columns, columns with mostly NULL values, columns with categories of values, and columns with distinct ranges of values. A well-designed filtered index can improve query performance, reduce index maintenance costs, and reduce storage costs.
  • Consider indexing computed columns.
  • The referential integrity provided by foreign keys help provide the good data. Performance, on the other hand, can be hurt if foreign keys aren’t indexed properly. A foreign key ensures a child record such as an Order has a parent record such as a Customer. When you insert or update an Order the FK needs to verify that you associated it with a valid Customer, so it implicitly adds a join to the Customer table. When you update or delete a Customer then it has to verify that no Orders will be orphaned, so it implicitly adds a join to the Order table. These joins are where the performance complaints come from.
  • Indexes on views can provide significant performance gains when the view contains aggregations, table joins, or a combination of aggregations and joins.
  • Determine the optimal storage location for the index. A non-clustered index can be stored in the same file group as the underlying table, or on a different file group. The storage location of indexes can improve query performance by increasing disk I/O performance. For example, storing a non-clustered index on a file group that is on a different disk than the table file group can improve performance because multiple disks can be read at the same time.
  • Determine which index options might enhance performance when the index is created or maintained. For example, creating a clustered index on an existing large table would benefit from the ONLINE index option. The ONLINE option allows for concurrent activity on the underlying data to continue while the index is being created or rebuilt.
  • Consider the order of the columns if the index will contain multiple columns. The column that is used in the WHERE clause or participates in a join, should be placed first. Additional columns should be ordered based on their level of distinctness, that is, from the most distinct to the least distinct.
  • While defining indexes, you should consider whether the data for the index key column should be stored in ascending or descending order. Specifying the order in which key values are stored in an index is useful when queries referencing the table have ORDER BY clauses.
  • Don’t automatically accept the default value of 100 for the fill factor for your indexes. It may or may not best meet your needs. A high fill factor is good for seldom changed data, but highly Modified data needs a lower fill factor to reduce page splitting.
  • Analyse the existing indexes to see if there are superfluous ones due to the new ones that have been added. Be sure to eliminate duplicate indexes, that is, multiple indexes created on the same set of columns. Such indexes provide no additional benefit but use up extra disk space and might hinder performance of INSERT, UPDATE and DELETE statements.
  • Periodically, check and remove unused and redundant indexes.

References

https://docs.microsoft.com/en-us/sql/relational-databases/sql-server-index-design-guide

https://docs.microsoft.com/en-us/sql/relational-databases/performance/database-engine-tuning-advisor

https://docs.microsoft.com/en-us/sql/relational-databases/indexes/create-filtered-indexes

https://docs.microsoft.com/en-us/sql/relational-databases/indexes/indexes-on-computed-columns

https://docs.microsoft.com/en-us/sql/relational-databases/partitions/partitioned-tables-and-indexes

https://docs.microsoft.com/en-us/sql/relational-databases/indexes/set-index-options

https://docs.microsoft.com/en-us/sql/relational-databases/system-catalog-views/sys-indexes-transact-sql

https://docs.microsoft.com/en-us/sql/relational-databases/system-catalog-views/sys-index-columns-transact-sql

Choosing the Correct Column Order While Creating Indexes

In this article, we will look at the order of the columns in your index and how this order impacts query plans and performance.

Demo

For our demo, let’s take the dbo.BLT_BILL_AMT table which contains more than 10 million records of bill transaction amounts.

Say we want to count up all of the records for a specific accounting period and for a particular billing employee.

SELECT COUNT(1) FROM dbo.BLT_BILL_AMT WHERE PERIOD > 198003 AND BILL_EMPL_UNO = 65

Since there are no indexes on PERIOD and BILL_EMPL_UNO columns, SQL Server just chooses to do a clustered index scan as shown below.

23

Number of Rows Read: 10147854

Actual Number of Rows: 7374

Let’s create an index on PERIOD and BILL_EMPL_UNO columns and execute our SELECT query and observe the execution plan.

CREATE NONCLUSTERED INDEX [BLT_BILL_AMT0] ON [dbo].[BLT_BILL_AMT]
(
       [PERIOD] ASC,
       [BILL_EMPL_UNO] ASC
)
GO
SELECT COUNT(1) FROM dbo.BLT_BILL_AMT WHERE PERIOD > 198003 AND BILL_EMPL_UNO = 65
GO

24

SQL Server uses the new index. Let’s look at the properties on that index’s seek:

25

Number of Rows Read: 10146562

Actual Number of Rows: 7374

Because the data is organized by PERIOD first, SQL Server “seeks” to the PERIOD > 198003 – meaning, it seeks to PERIOD > 198003 (10146562 records) – but then it has to keep reading to find BILL_EMPL_UNO = 65 records.

From this we can conclude that, we have to put the most selective column first in the index, so that the number of possible matching rows is the smallest, i.e. we find the rows as quickly as possible.

So, let’s drop the index and re-create it with the column order as BILL_EMPL_UNO first and PERIOD as second.

DROP INDEX [BLT_BILL_AMT0] ON [dbo].[BLT_BILL_AMT]
GO
CREATE NONCLUSTERED INDEX [BLT_BILL_AMT0] ON [dbo].[BLT_BILL_AMT]
(
       [BILL_EMPL_UNO] ASC,
       [PERIOD] ASC
)
GO

And run our query again. SQL Server uses the new index. Let’s look at the properties on that index’s seek:

26

Number of Rows Read: 7374

Actual Number of Rows: 7374

That’s a nice, lightweight, fast seek.

In order to pick the right field order, you need to understand the selectivity of the data, the selectivity of your queries, and whether you’re doing equality searches or inequality searches, and what happens to the data after you retrieve it – like how it needs to be joined or ordered.

Summary

You can see from these simple examples above that the ordering of the columns in your index does have performance impact. The best practice that has been quite helpful when creating indexes is to make sure you are always working with the smallest result set possible. This means that your indexes should start with any columns in your WHERE clause followed by the columns in your JOIN condition. Additional columns should be ordered based on their level of distinctness, that is, from the most distinct to the least distinct.