Firstly, I should point out I don't have much knowledge on SQL Server indexes.
My situation is that I have an SQL Server 2008 database table that has a varchar(max) column usually filled with a lot of text.
My ASP.NET web application has a search facility that runs keyword searches against this column. Depending on the number of keywords searched for, there may be one or many LIKE '%keyword%' conditions in the SQL query.
My web application also allows searching by various other columns in this table, not just that one column. There are also a few joins to other tables.
My question is, is it worthwhile creating an index on this column to improve performance of these search queries? And if so, what type of index, and will just indexing the one column be enough or do I need to include other columns such as the primary key and other searchable columns?
4 Solutions
#1
7
It's not worthwhile creating a regular index if you're doing LIKE '%keyword%' searches. The reason is that an index works like searching a dictionary, where you start in the middle and then split the difference until you find the word. That wildcard query is like asking you to look up every word that contains the text "to" somewhere: the only way to find matches is to scan the whole dictionary.
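To make the dictionary comparison concrete, here is a minimal Python sketch. It is only an illustration, not how SQL Server is implemented: a sorted list stands in for the index, and the word list and function names are made up for the example. A prefix search can binary-search to its starting point, while a "contains" search has to look at every entry.

```python
from bisect import bisect_left

# Hypothetical sample data; the sorted list plays the role of the index.
words = sorted(["atom", "into", "stone", "today", "token", "tomato", "tool", "zebra"])

def prefix_search(sorted_words, prefix):
    """Like LIKE 'prefix%': binary-search to the first candidate, then read forward."""
    start = bisect_left(sorted_words, prefix)
    matches = []
    for word in sorted_words[start:]:
        if not word.startswith(prefix):
            break  # sorted order guarantees no later word can match
        matches.append(word)
    return matches

def substring_search(sorted_words, fragment):
    """Like LIKE '%fragment%': sort order is no help, so every entry is checked."""
    return [word for word in sorted_words if fragment in word]

prefix_hits = prefix_search(words, "to")        # words starting with "to"
substring_hits = substring_search(words, "to")  # words containing "to" anywhere
```

The prefix search stops as soon as it leaves the block of matching words. The substring search also finds "atom", "into" and "stone", and nothing about the ordering tells it where those are.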
You might consider a full-text search, however, which is meant for this kind of scenario (see here).
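For a rough idea of why full-text search suits keyword lookups, here is a toy inverted index in Python. This is a sketch of the general technique only; it makes no claim about SQL Server's actual full-text implementation, and the documents, `tokenize`, `build_inverted_index` and `search_all` are all invented for the illustration.

```python
import re
from collections import defaultdict

# Hypothetical documents standing in for rows with a large text column.
documents = {
    1: "Indexes speed up searches on sorted keys",
    2: "Full-text search breaks text into words",
    3: "A wildcard search scans every row",
}

def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search_all(index, keywords):
    """Return ids of documents containing every keyword (an AND search)."""
    id_sets = [index.get(word.lower(), set()) for word in keywords]
    return sorted(set.intersection(*id_sets)) if id_sets else []

index = build_inverted_index(documents)
one_keyword = search_all(index, ["search"])
two_keywords = search_all(index, ["search", "text"])
```

Each keyword becomes a direct lookup instead of a scan of all the text. Note that this toy version matches whole words only, so "search" does not match "searches" the way LIKE '%search%' would.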
#2
20
The best analogy I've ever seen for why an index won't help '%wildcard%' searches:
Take two people. Hand each one the same phone book. Say to the person on your left:
Tell me how many people are in this phone book with the last name "Smith."
Now say to the person on your right:
Tell me how many people are in this phone book with the first name "Simon."
An index is like a phone book. Very easy to seek for the thing that is at the beginning. Very difficult to scan for the thing that is in the middle or at the end.
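The phone book exercise can be sketched in a few lines of Python. The names and entries are hypothetical, and a sorted list of (last name, first name) tuples is only a stand-in for an index ordered by last name.

```python
from bisect import bisect_left

# Hypothetical phone book, sorted by (last name, first name).
phone_book = sorted([
    ("Jones", "Simon"),
    ("Smith", "Alice"),
    ("Smith", "Simon"),
    ("Smith", "Zoe"),
    ("Taylor", "Bob"),
    ("Young", "Simon"),
])

def count_last_name(book, last_name):
    """Seek: two binary searches find where the block of matches starts and ends."""
    lower = bisect_left(book, (last_name,))
    # last_name + "\0" is the smallest string that sorts after last_name,
    # so this position is just past the final entry with that last name.
    upper = bisect_left(book, (last_name + "\0",))
    return upper - lower

def count_first_name(book, first_name):
    """Scan: first names are scattered through the book, so every entry is read."""
    return sum(1 for _, first in book if first == first_name)

smiths = count_last_name(phone_book, "Smith")
simons = count_first_name(phone_book, "Simon")
```

Both questions have the same answer size here, but the first only touches a handful of entries however large the book gets, while the second always reads all of them.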
Every time I've repeated this in a session, I see light bulbs go on, so I thought it might be useful to share here.
#3
15
You cannot create a regular index with a varchar(max) column as its key. The maximum size of an index key is 900 bytes. If the column is declared larger than 900 bytes (varchar(2000), for example), you can create the index, but any insert with a value of more than 900 bytes will fail.
I suggest you read about full-text search. It should suit you in this case.
#4
0
The best way to find out is to create a bunch of test queries that resemble what would happen in real life and run them against your DB with and without the index. However, in general, if you are doing many SELECT queries and few UPDATE/DELETE queries, an index might make your queries faster.
However, if you do a lot of updates, the index might hurt your performance, so you have to know what kind of queries your DB will have to deal with before you make this decision.
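As a rough illustration of that trade-off, here is a small Python sketch. It is an in-memory stand-in, not a test against a real database: the table, the dictionary-based "index" and every function name are assumptions made for the example. It shows the shape of a with/without comparison and the extra work an update has to do once an index exists.

```python
import time

# Hypothetical table: row id -> category. The "index" maps category -> row ids.
rows = {i: f"cat{i % 100}" for i in range(20_000)}

def build_index(table):
    """Group row ids by category so lookups no longer need a scan."""
    index = {}
    for row_id, category in table.items():
        index.setdefault(category, set()).add(row_id)
    return index

def select_scan(table, category):
    """SELECT without an index: inspect every row."""
    return {row_id for row_id, value in table.items() if value == category}

def select_indexed(index, category):
    """SELECT with an index: one lookup, returned as a copy."""
    return set(index.get(category, ()))

def update_row(table, index, row_id, new_category):
    """UPDATE changes the row and must also fix up the index (the extra write cost)."""
    old_category = table[row_id]
    table[row_id] = new_category
    index[old_category].discard(row_id)
    index.setdefault(new_category, set()).add(row_id)

def timed(fn, *args, repeat=50):
    """Return (elapsed seconds, last result) for `repeat` calls of fn(*args)."""
    start = time.perf_counter()
    for _ in range(repeat):
        result = fn(*args)
    return time.perf_counter() - start, result

index = build_index(rows)
scan_seconds, scan_result = timed(select_scan, rows, "cat7")
index_seconds, index_result = timed(select_indexed, index, "cat7")
update_row(rows, index, 7, "cat8")
```

Comparing `scan_seconds` with `index_seconds` gives the read-side benefit, and `update_row` shows why writes get more expensive. The real numbers only mean something when measured on your own database with your own queries.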