Tsql when to use index?

Question

Ramshreya Bhagat · Answer

An index helps speed up SELECT queries and WHERE clauses, but it slows down data input, with UPDATE and INSERT statements. Indexes can be created or dropped with no effect on the data.

Esther xarwj · Answer

In this article, we will discuss the most important points that we should consider when designing an optimal SQL index. Before going through the index design procedure, let us revise the SQL Server index concept.

SQL index is considered as one of the most important factors in the SQL Server performance tuning field. It helps in speeding up the queries by providing swift access to the requested data, called index seek operation, instead of scanning the whole table to retrieve a few records. It works similar to the book’s index that helps in identifying the location of each unique word, by providing the page where you can find that word, rather than spending the whole weekdays reading the book to check a specific subject or identifying that word. In other words, the existence of that index will save time and resources.

SQL Server provides us with two main types of indexes. The clustered index that is used to store the whole table data based on a select index key consists of one or multiple columns, with the ability to create only one clustered index per each table. The clustered index existence covert the table from an unsorted heap table to a sorted clustered table.

The second main type of SQL Server indexes is the non-clustered index in which the leaf nodes of that index stores only the index key values with a pointer to the storage location of that rows in the main heap table or the clustered index, with the ability to create up to 99 non-clustered indexes per each table.

SQL Server provides us also with other special purposes SQL indexes, derived from the clustered and non-clustered types, that can help in improving the performance of the T-SQL queries. These indexes include the Unique index, Filtered index, Spatial Index, XML index, Clomunstore index, Full-Text index, and Hash index.

After creating the index, we need also to monitor that index usage to make sure that it still efficient and useful for us. This can be performed by gathering statistical information about the indexes and its usage then perform the proper maintenance operation on these indexes to keep it in a healthy state.

The target of the SQL Server index design task is to have an index that SQL Server Query Optimizer will choose to enhance the performance of the submitted queries. I used to describe the index, in all my articles and sessions, as a double-edged sword. In this way, I will make sure that we are not blaming the index for our mistakes. The index will be our superhero and improve the performance of our queries if we design it in a correct way. But if that index is poorly designed, it will cause performance degradation in our queries and slow down the data retrieval process. In other words, the absence of poorly designed indexes is better than its existence.

The process of selecting the right SQL Server index that fits your database and workload requirements is not an easy mission, but also not impossible mission. In this process, you need to balance between the index gain in the shape of speeding up the data retrieval operation and the index overhead on the data insertion and modification operations.

To help you in designing a proper index that SQL Server Query Optimizer will take advantage of to enhance your queries performance, we will discuss here the top five points that you need to consider when planning to create an index.

In order to design a proper index, you need to study the characteristics of the database on which the SQL Server index will be created. If the database is created to handle the Online Transaction Processing (OLTP) workload, with a large number of inserting and data modification queries, it is recommended not to overload the database with a large number of indexes. This is due to the fact that inserting, updating or deleting any row on the underlying table will also require reflecting the same changes to all related indexes in that table. So, you should create the minimum possible number of indexes in the OLTP tables with the least possible number of columns participating in the index’s key. In this way, we can take advantage of the created SQL indexes in speeding up the data retrieval process with minimal overhead on the data modification operations.

If the database is created to handle Online Analytical Processing (OLAP) workload, which is used in Data Warehouse as a part of the Business Intelligence structure, most of the workload will be in the shape of SELECT queries to retrieve a large amount of analytical data for analysis or reporting purposes, and a small number of data modification queries. In this case, you can create a large number of SQL Server indexes, adding all required columns as index key or non-key columns to enhance the performance of the SELECT queries and get the requested data faster.

Another thing to consider when indexing a database table is the size of the table. If the table is small with less than 1000 pages, no performance enhancement can be gained from indexing that table, as SQL Server Query Optimizer will prefer scanning the whole table rather than examining the SQL index and try to create the best possible plan. In other words, this index on the small table will not be used and will have overhead on the table as it should be maintained when the table is changed.

You need also to have a look at the database views and check the ones that contain multiple joins and aggregations and create indexes on these views to enhance the performance of reading from it.

Studying the queries that are hitting the database tables very frequently, by checking with the system developer or using the profiling tools such as SQL Profiler or Extended Events, will help in designing the SQL Server index that helps more in enhancing the overall system performance.

After getting statistics about the frequently executed queries, we should check the columns that are used in the predicates and join conditions in these queries and create the proper index, by adding all necessary columns to the index to cover the frequently executed query and avoid any unnecessary column, to speed up the data retrieval operation.

As a SQL Server developer, it is recommended to write data insertion and modification queries to insert, update or delete as many rows as possible in the same query, rather than writing multiple queries. This will help in reducing the overhead of the index on the data modification statement, where all these changes performed on the table will be replicated to the SQL index as one-shot when executed as a single query.

After studying the characteristic of the frequently executed query that we need to enhance and having a list of columns to participate in the index key, we need to consider some points when choosing which column we should add to the index.

The first point is the column characteristic. Not all data types are recommended to be used as an index key. For example, the best candidate data type for the SQL Server index is the integer column due to its small size. On the other hand, columns with text, ntext, image, varchar(max), nvarchar(max), and varbinary(max) data types cannot participate in the index key. However, most of it still can be added to the non-clustered non-key columns. A column with XML data type can be added only to an XML index. In addition, a column with UNIQUE and NOT NULL values will be a good candidate, due to its high selectivity level, as an index key column that makes the index more useful.

The second point is the location of the column in the query. For instance, the columns used in the WHERE clause, the JOIN prediction, LIKE and the ORDER BY clause are the best candidate columns to be indexed. In addition, indexing the computed columns and the foreign key columns will enhance the performance of the queries that read from these columns.

An important point to consider after selecting the proper columns to be involved in the index key is the order of the columns in the index key, especially when the key consists of multiple columns. Try to place the columns that are used in the query conditions first in the SQL Server index key. Also, try to define the columns sorting criteria, ascending or descending, in a way that matches the order used in the ORDER BY clause in your query. In this way, you will overcome the SORT operator high overhead, enhancing the performance of the query.

To that step, we have made a decision that we need to create an index in a specific database table to cover a specific query that is called very frequently, and we need to add candidate column(s) to the index key. We need now to decide which SQL index type fits the query requirements. In other words, we need to specify if we should create a clustered or non-clustered index, a unique or non-unique index, columnstore, or rowstore index. All these decisions will be made based on the query coverage and enhancements requirement.

As mentioned previously, SQL Server provides us with different types of special purposes indexes that we can use to enhance the performance of the queries. For example, try to use a Filtered index on the columns that have well-defined data subsets, such as Sparse columns with mostly NULL values.

It is recommended to start indexing the table by creating a clustered index, that covers the column(s) called very frequently, which will convert it from the heap table to a sorted clustered table, then create the required non-clustered indexes that cover the remaining queries in the system. In this way, the non-clustered indexes will be built over the SQL Server clustered index and the pointers on the leaf level nodes of the non-clustered indexes will also point to the location of the row in the sorted clustered index.

If the clustered index is created on a table with clustered indexes already exist, all the non-clustered indexes will be dropped and created again to change the pointers in the leaf level nodes that were pointing to the heap table to point to the newly-created clustered index. So that, creating it in the correct order will overcome the overhead of recreating the non-clustered index again.

When a SQL Server index is created, it will be stored in the same filegroup where the main table is created. The partitioned clustered index and the non-clustered index can be stored on the same filegroup as the main table or on a different filegroup.

Selecting the proper storage criteria for the index during the design phase will help in improving the query performance by increasing the I/O performance. For instance, creating the non-clustered index on a filegroup located in a different disk drive than the disk drive where the main table is created will improve the performance of the queries that use that non-clustered index, as it will not be affected by the concurrent reading of the data and SQL index pages, that are spread across different disks, which will be performed on different disk drives.

In addition, the clustered and non-clustered indexes, that are created over large tables, can be partitioned across multiple filegroups, with each filegroup stored on a separate disk drive, improving the concurrent data access and retrieval operations, due to the fact that the data is distributed over different disk drives within the SQL index and the Query Optimizer will process only the partitions that the query will access, excluding all other partitions.

Another important storage concept that should be considered also is the FILLFACTOR option, which is an option that can be defined when creating or rebuilding an index, with its value between 0 and 100, that specifies the percentage of space that will be filled on each leaf-level data page in the created index. For example, setting the FILLFACTOR value to 80% will leave 20% of each page empty during the SQL Server index creation or rebuilding process, and this 20% percent will help when a new data is inserted, or an existing data is modified but not fit in the current space, where the data will be inserted in that free space instead of splitting the current page into multiple pages causing index fragmentation issue that will degrade the index performance with time. Fill factor will help in enhancing the performance of the T-SQL queries and minimize the amount of index storage and index maintenance overhead.

It is also recommended to create narrow indexes with the least possible number of useful columns, rather than creating wide indexes with many unnecessary columns, as it requires less disk space and will have less SQL Server index maintenance overhead.

All the previously mentioned points will help in designing the most optimal index that enhance the performance of the T-SQL queries, but it is very important to test the index first on a development environment before creating it on the production environment and make sure that it is useful for your workload and keep monitoring it and marinating it once created on the production environment.

Roddy Salfate · Answer

Indexing makes columns faster to query by creating pointers to where data is stored within a database.

Imagine you want to find a piece of information that is within a large database. To get this information out of the database the computer will look through every row until it finds it. If the data you are looking for is towards the very end, this query would take a long time to run.

Visualization for finding the last entry:

If the table was ordered alphabetically, searching for a name could happen a lot faster because we could skip looking for the data in certain rows. If we wanted to search for “Zack” and we know the data is in alphabetical order we could jump down to halfway through the data to see if Zack comes before or after that row. We could then half the remaining rows and make the same comparison.

This took 3 comparisons to find the right answer instead of 8 in the unindexed data.

Indexes allow us to create sorted lists without having to create all new sorted tables, which would take up a lot of storage space.

An index is a structure that holds the field the index is sorting and a pointer from each record to their corresponding record in the original table where the data is actually stored. Indexes are used in things like a contact list where the data may be physically stored in the order you add people’s contact information but it is easier to find people when listed out in alphabetical order.

Let’s look at the index from the previous example and see how it maps back to the original Friends table:

We can see here that the table has the data stored ordered by an incrementing id based on the order in which the data was added. And the Index has the names stored in alphabetical order.

There are two types of databases indexes:

Both clustered and non-clustered indexes are stored and searched as B-trees, a data structure similar to a binary tree. A B-tree is a “self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time.” Basically it creates a tree-like structure that sorts data for quick searching.

Here is a B-tree of the index we created. Our smallest entry is the leftmost entry and our largest is the rightmost entry. All queries would start at the top node and work their way down the tree, if the target entry is less than the current node the left path is followed, if greater the right path is followed. In our case it checked against Matt, then Todd, and then Zack.

To increase efficiency, many B-trees will limit the number of characters you can enter into an entry. The B-tree will do this on it’s own and does not require column data to be restricted. In the example above the B-tree below limits entries to 4 characters.

Clustered indexes are the unique index per table that uses the primary key to organize the data that is within the table. The clustered index ensures that the primary key is stored in increasing order, which is also the order the table holds in memory.

The clustered index will be automatically created when the primary key is defined:

Once filled in, that table would look something like this:

The created table, “friends”, will have a clustered index automatically created, organized around the Primary Key “id” called “friends_pkey”:

When searching the table by “id”, the ascending order of the column allows for optimal searches to be performed. Since the numbers are ordered, the search can navigate the B-tree allowing searches to happen in logarithmic time.

However, in order to search for the “name” or “city” in the table, we would have to look at every entry because these columns do not have an index. This is where non-clustered indexes become very useful.

Non-clustered indexes are sorted references for a specific field, from the main table, that hold pointers back to the original entries of the table. The first example we showed is an example of a non-clustered table:

They are used to increase the speed of queries on the table by creating columns that are more easily searchable. Non-clustered indexes can be created by data analysts/ developers after a table has been created and filled.

Note: Non-clustered indexes are not new tables. Non-clustered indexes hold the field that they are responsible for sorting and a pointer from each of those entries back to the full entry in the table.

You can think of these just like indexes in a book. The index points to the location in the book where you can find the data you are looking for.

Non-clustered indexes point to memory addresses instead of storing data themselves. This makes them slower to query than clustered indexes but typically much faster than a non-indexed column.

You can create many non-clustered indexes. As of 2008, you can have up to 999 non-clustered indexes in SQL Server and there is no limit in PostgreSQL.

To create an index to sort our friends’ names alphabetically:

This would create an index called “friends_name_asc”, indicating that this index is storing the names from “friends” stored alphabetically in ascending order.

Note that the “city” column is not present in this index. That is because indexes do not store all of the information from the original table. The “id” column would be a pointer back to the original table. The pointer logic would look like this:

In PostgreSQL, the “\d” command is used to list details on a table, including table name, the table columns and their data types, indexes, and constraints.

The details of our friends table now look like this:

Query providing details on the friends table: \d friends;

Looking at the above image, the “friends_name_asc” is now an associated index of the “friends” table. That means the query plan, the plan that SQL creates when determining the best way to perform a query, will begin to use the index when queries are being made. Notice that “friends_pkey” is listed as an index even though we never declared that as an index. That is the clustered index that was referenced earlier in the article that is automatically created based off of the primary key.

We can also see there is a “friends_city_desc” index. That index was created similarly to the names index:

This new index will be used to sort the cities and will be stored in reverse alphabetical order because the keyword “DESC” was passed, short for “descending”. This provides a way for our database to swiftly query city names.

After your non-clustered indexes are created you can begin querying with them. Indexes use an optimal search method known as binary search. Binary searches work by constantly cutting the data in half and checking if the entry you are searching for comes before or after the entry in the middle of the current portion of data. This works well with B-trees because they are designed to start at the middle entry; to search for the entries within the tree you know the entries down the left path will be smaller or before the current entry and the entries to the right will be larger or after the current entry. In a table this would look like:

Comparing this method to the query of the non-indexed table at the beginning of the article, we are able to reduce the total number of searches from eight to three. Using this method, a search of 1,000,000 entries can be reduced down to just 20 jumps in a binary search.

Indexes are meant to speed up the performance of a database, so use indexing whenever it significantly improves the performance of your database. As your database becomes larger and larger, the more likely you are to see benefits from indexing.

When data is written to the database, the original table (the clustered index) is updated first and then all of the indexes off of that table are updated. Every time a write is made to the database, the indexes are unusable until they have updated. If the database is constantly receiving writes then the indexes will never be usable. This is why indexes are typically applied to databases in data warehouses that get new data updated on a scheduled basis(off-peak hours) and not production databases which might be receiving new writes all the time.

NOTE: The newest version of Postgres (that is currently in beta) will allow you to query the database while the indexes are being updated.

To test if indexes will begin to decrease query times, you can run a set of queries on your database, record the time it takes those queries to finish, and then begin creating indexes and rerunning your tests.

To do this, try using the EXPLAIN ANALYZE clause in PostgreSQL.:

Which on my small database yielded:

This output will tell you which method of search from the query plan was chosen and how long the planning and execution of the query took.

Only create one index at a time because not all indexes will decrease query time.

If adding an index does not decrease query time, you can simply remove it from the database.

To remove an index use the DROP INDEX command:

The outline of the database now looks like:

Which shows the successful removal of the index for searching names.

Lalitha Mookhey · Answer

Indexes are special lookup tables that the database search engine can use to speed up data retrieval. Simply put, an index is a pointer to data in a table. An index in a database is very similar to an index at the end of a book.

For example, if you want to reference all the pages in a book that discuss a certain topic, you first refer to the index, which lists all topics alphabetically and are then referred to one or more specific page numbers.

An index helps speed up SELECT queries and WHERE clauses, but it slows down data input, with UPDATE and INSERT statements. Indexes can be created or dropped with no effect on the data.

Creating an index involves the CREATE INDEX statement, which allows you to name the index, to specify the table and which column or columns to index, and to indicate whether the index is in ascending or descending order.

Indexes can also be unique, similar to the UNIQUE constraint, in that the index prevents duplicate entries in the column or combination of columns on which there's an index.

Following is the basic syntax of CREATE INDEX.

A single-column index is one that is created based on only one table column. Following is the basic syntax.

Unique indexes are used not only for performance, but also for data integrity. A unique index does not allow any duplicate values to be inserted into the table. Following is the basic syntax.

A composite index is an index on two or more columns of a table. Following is the basic syntax.

Whether to create a single-column index or a composite index, take into consideration the column(s) that you may use very frequently in a query's WHERE clause as filter conditions.

Should there be only one column used, a single-column index should be the choice. Should there be two or more columns that are frequently used in the WHERE clause as filters, the composite index would be the best choice.

Implicit indexes are indexes that are automatically created by the database server when an object is created. Indexes are automatically created for primary key constraints and unique constraints.

An index can be dropped using MS SQL SERVER DROP command. Care should be taken when dropping an index because performance may be slowed or improved.

Following is the basic syntax.

Although indexes are intended to enhance the performance of databases, there are times when they should be avoided. The following guidelines indicate when the use of an index should be reconsidered −

Alle Beaulieu · Answer

A SQL index is used to retrieve data from a database very fast. Indexing a table or view is, without a doubt, one of the best ways to improve the performance of queries and applications.

A SQL index is a quick lookup table for finding records users need to search frequently. An index is small, fast, and optimized for quick lookups. It is very useful for connecting the relational tables and searching large tables.

SQL indexes are primarily a performance tool, so they really apply if a database gets large. SQL Server supports several types of indexes but one of the most common types is the clustered index. This type of index is automatically created with a primary key. To make the point clear, the following example creates a table that has a primary key on the column “EmployeeId”:

You’ll notice in the create table definition for the “EmployeePhoto” table, the primary key at the end of “EmployeeId” column definition. This creates a SQL index that is specially optimized to get used a lot. When the query is executed, SQL Server will automatically create a clustered index on the specified column and we can verify this from Object Explorer if we navigate to the newly created table, and then the Indexes folder:

Notice that not only creating a primary key creates a unique SQL index. The unique constraint does the same on the specified columns. Therefore, we got one additional unique index for the “MyRowGuidColumn” column. There are no remarkable differences between the unique constraint and a unique index independent of a constraint. Data validation happens in the same manner and the query optimizer does not differentiate between a unique SQL index created by a constraint or manually created. However, a unique or primary key constraint should be created on the column when data integrity is the objective because by doing so the objective of the index will be clear.

So, if we use a lot of joins on the newly created table, SQL Server can lookup indexes quickly and easily instead of searching sequentially through potentially a large table.

SQL indexes are fast partly because they don’t have to carry all the data for each row in the table, just the data that we’re looking for. This makes it easy for the operating system to cache a lot of indexes into memory for faster access and for the file system to read a huge number of records simultaneously rather than reading them from the disk.

Additional indexes can be created by using the Index keyword in the table definition. This can be useful when there is more than one column in the table that will be searched often. The following example creates indexes within the Create table statement:

This time, if we navigate to Object Explorer, we’ll find the index on multiple columns:

We can right-click the index, hit Properties and it will show us what exactly this index spans like table name, index name, index type, unique/non-unique, and index key columns:

We must briefly mention statistics. As the name implies, statistics are stat sheets for the data within the columns that are indexed. They primarily measure data distribution within columns and are used by the query optimizer to estimate rows and make high-quality execution plans.

Therefore, any time a SQL index is created, stats are automatically generated to store the distribution of the data within that column. Right under the Indexes folder, there is the Statistics folder. If expanded, you’ll see the sheet with the same specified name as we previously did to our index (the same goes for the primary key):

There is not much for users to do on SQL Server when it comes to statistics because leaving the defaults is generally the best practice which ultimately auto-creates and updates statistics. SQL Server will do an excellent job with managing statistics for 99% of databases but it’s still good to know about them because they are another piece of the puzzle when it comes to troubleshooting slow running queries.

Also worth mentioning are selectivity and density when creating SQL indexes. These are just measurements used to measure index weight and quality:

These two are proportional one to another and are used to measure both index weight and quality. Essentially how this works in the real world can be explained in an artificial example. Let’s say that there’s an Employees table with 1000 records and a birth date column that has an index on it. If there is a query that hits that column often coming either from us or application and retrieves no more than 5 rows that means that our selectivity is 0.995 and density is 0.005. That is what we should aim for when creating an index. In the best-case scenario, we should have indexes that are highly selective which basically means that queries coming at them should return a low number of rows.

When creating SQL indexes, I always like to set SQL Server to display information of disk activity generated by queries. So the first thing we can do is to enable IO statistics. This is a great way to see how much work SQL Server has to do under the hood to retrieve the data. We also need to include the actual execution plan and for that, I like to use a SQL execution plan viewing and analysis tool called ApexSQL Plan. This tool will show us the execution plan that was used to retrieve the data so we can see what SQL indexes, if any, are used. When using ApexSQL Plan, we don’t really need to enable IO statistics because the application has advanced I/O reads stats like the number of logical reads including LOB, physical reads (including read-ahead and LOB), and how many times a database table was scanned. However, enabling the stats on SQL Server can help when working in SQL Server Management Studio. The following query will be used as an example:

Notice that we also have the CHECKPOINT and DBCC DROPCLEANBUFFERS that are used to test queries with a clean buffer cache. They are basically creating a clean system state without shutting down and restarting the SQL Server.

So, we got a table inside the sample AdventureWorks2014 database called “SalesOrderDetail”. By default, this table has three indexes, but I’ve deleted those for the testing purposes. If expanded, the folder is empty:

Next, let’s get the actual execution plan by simply pasting the code in ApexSQL Plan and clicking the Actual button. This will prompt the Database connection dialog first time in which we have to choose the SQL Server, authentication method and the appropriate database to connect to:

This will take us to the query execution plan where we can see that SQL Server is doing a table scan and it’s taking most resources (56.2%) relative to the batch. This is bad because it’s scanning everything in that table to pull a small portion of the data. To be more specific, the query returns only 1.021 rows out of 121.317 in total:

If we hover the mouse over the red exclamation mark, an additional tooltip will show the IO cost as well. In this case, 99.5 percent:

So, 1.021 rows out of 121.317 returned almost instantly on the modern machine but SQL Server still has to do a lot of work and as data fills up in the table the query could get slower and slower over time. So, the first thing we have to do is create a clustered index on the “SalesOrderDetail” table. Bear in mind that we should always choose the clustered index wisely. In this case, we are creating it on the “SalesOrderID” and “SalesOrderDetailID” because we’re expecting so much data on them. Let’s just go ahead and create this SQL index by executing the query from below:

Actually, before we do that. Let’s quickly switch over to the IO reads tab and take a shot from there just so we have this information before doing anything:

After executing the above query, we will have a clustered index created by a primary key constraint. If we refresh the Indexes folder in Object Explorer, we should see the newly created clustered, unique, primary key index:

Now, this isn’t going to improve performance a great deal. As a matter of fact, if we run the same query again it will just switch from the table scan to a clustered index scan:

However, we paved the way for the future nonclustered SQL indexes. So, without further ado let’s create a nonclustered index. Notice that ApexSQL Plan determines missing indexes and create queries for (re)creating them from the tooltip. Feel free to review and edit the default code or just hit Execute to create the index:

If we execute the query again, SQL Server is doing a nonclustered index seek instead of the previous scan. Remember seeks are always better than scans:

Don’t let the number fools you. Even though some numbers are higher relative to the batch compared to the previous runs this doesn’t necessarily mean that it’s a bad thing. If we switch over to IO reads again and compare them to the previous results, just look at those reads going drastically down from 1.237 to 349, and 1.244 to 136. The reason this was so efficient is that SQL Server used only the SQL indexes to retrieve the data:

Indexing strategy guidelines

Poorly designed SQL indexes and a lack of them are primary sources of database and application performance issues. Here are a few indexing strategies that should be considered when indexing tables:

I hope this article on the SQL indexing strategy has been informative and I thank you for reading.

Nodelman vrisplr Real · Answer

Indexes should not be used on small tables. Indexes should not be used on columns that return a high percentage of data rows when used as a filter condition in a query's WHERE clause. Tables that have frequent, large batch update jobs run can be indexed.

Raam Bernert · Answer

Watch this week's video on YouTube

One of the things that the SQL Server query optimizer does is determine how to retrieve the data requested by your query.

Usually it does a pretty good job, which is a great because if it didn't then we'd be spending most of our days programming sorting and joining algorithms instead of having fun actually working with our data.

Sometimes the query optimizer has a lapse in judgement and createds a less-than-efficient plan, requiring us to step in and save the day.

One way to "fix" a poor performing plan is to use an index hint. While we normally have no control over how SQL Server retrieves the data we requested, an index hint forces the query optimizer to use the index specified in the hint to retrieve the data (hence, it's really more of a "command" than a "hint").

Sometimes when I feel like I'm losing control I like using an index hint to show SQL Server who's boss. I occasionally will also use index hints when debugging poor performing queries because it allows me to confirm whether using an alternate index would improve performance without having to overhaul my code or change any other settings.

While I like using index hints for short-term debugging scenarios, that's about the only time they should be used because they can create some pretty undesirable outcomes.

For example, let's say I have this nice simple query and index here:

This index was specifically created for a different query running on the Posts table, but it will also get used by the simple query above.

Executing this query without any hints causes SQL Server to use it anyway (since it's a pretty good index for the query), and we get decent performance: only 1002 logical reads.

I wish all of my execution plans were this simple.

Let's pretend we don't trust the SQL Server optimizer to always choose this index, so instead we force it to use it by adding a hint:

With this hint, the index will perform exactly the same: 1002 logical reads, a good index seek, etc...

But what happens if in the future a better index gets added to the table?

If we run the query WITHOUT the index hint, we'll see that SQL Server actually chooses this new index because it's smaller and we can get the data we need in only 522 logical reads:

This execution plan looks the same, but you'll notice the smaller, more data dense index is being used.

If we had let SQL Server do it's job, it would have given us a great performing query! Instead, we decided to intervene and hint (ie. force) it to use a sub-optimal index.

The above example is pretty benign - sure, without the hint SQL Server would have read about half as many pages, but this isn't a drastic difference in this scenario.

What could be disastrous is if because of the hint, the query optimizer decides to make a totally different plan that isn't nearly as efficient. Or if one day someone drops the hinted index, causing the query with the hint to down right fail:

Ask Sawal

Tsql when to use index?

Related Questions

More Questions

Contact