I have a database table with more than 50 million records, and to improve searching I had to create a non-clustered index. Creating one takes 5 to 10 minutes, so I guess that in the background it sorts the data according to the index.
For example, before I added an index to my table, searching was awful and took a long time; once I added the non-clustered index, searching was fast.
But that was only after I already had 50 million records.
The question is: what if I had defined the index at the very beginning, when creating the table, before adding any data to it? Would it give the same search performance I am getting right now? Or do I have to delete and recreate the index every now and then to re-sort the data?
I am sorry if my question seems stupid; I just started learning about indexes and it is a confusing topic for me.
4 Solutions
#1
4
A non-clustered index keeps a copy of the indexed fields in a special structure optimised for searching. Creating an index on 50 million records obviously takes some time, because every row has to be read and its key sorted into that structure.
Once the index is created, it's maintained automatically as records are added, deleted or updated, so you should only need to reindex if you've had a serious crash of the system or the disk.
So generally, it's best to create the index at the time you create the table.
There is an operation called 'updating statistics' which helps the query optimiser choose better search plans. The details vary between database engines.
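As a rough illustration of what "updating statistics" looks like in practice, here is a minimal sketch using Python's built-in `sqlite3` module. SQLite is only a stand-in here, not the engine from the question: its `ANALYZE` command plays roughly the role that `UPDATE STATISTICS` plays in SQL Server. The table `orders`, the index `ix_orders_customer` and the helper `index_stats` are hypothetical names made up for the example.

```python
import sqlite3

# Hypothetical table and index, created before any data is loaded.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE INDEX ix_orders_customer ON orders (customer)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, f"cust{i % 1000}") for i in range(20000)],
)


def index_stats(conn):
    """Return {index_name: stat_string} from SQLite's statistics table, or {} if it does not exist yet."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE name = 'sqlite_stat1'"
    ).fetchone()
    if not exists:
        return {}
    return dict(conn.execute("SELECT idx, stat FROM sqlite_stat1 WHERE tbl = 'orders'"))


before = index_stats(conn)   # no statistics have been gathered yet
conn.execute("ANALYZE")      # SQLite's counterpart of "update statistics"
after = index_stats(conn)    # the optimiser now has row-count estimates for the index

print("before:", before)
print("after: ", after)
```

The index itself is kept up to date by every insert; what `ANALYZE` refreshes is only the summary information the optimiser uses when choosing a plan.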
#2
3
Database indexes work like those in books.
An index is essentially a set of pointers to the right rows in your table, ordered on a specific key (the column or columns on which you define the index).
So, basically, yes: if you create the index before inserting data, you should get the same search speed later on, once the table is loaded with lots of records.
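To make that concrete, the following sketch builds the same table twice, once with the index created before the rows are loaded and once after, and checks that the query planner uses the index in both cases. It uses Python's built-in `sqlite3` module purely as a stand-in (the question appears to be about SQL Server, where the details differ), and the table `orders`, the index `ix_orders_customer` and the helper functions are hypothetical names chosen for the example.

```python
import sqlite3


def build(index_first: bool) -> sqlite3.Connection:
    """Create and load a small table, creating the index before or after the load."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
    if index_first:
        conn.execute("CREATE INDEX ix_orders_customer ON orders (customer)")
    rows = [(i, f"cust{i % 1000}", i * 1.5) for i in range(20000)]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    if not index_first:
        conn.execute("CREATE INDEX ix_orders_customer ON orders (customer)")
    return conn


QUERY = "SELECT amount FROM orders WHERE customer = ?"


def plan(conn: sqlite3.Connection) -> str:
    """Return the planner's description of how it would run QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("cust42",)).fetchall()
    return " ".join(row[3] for row in rows)  # the fourth column holds the detail text


def lookup(conn: sqlite3.Connection) -> list:
    """Run QUERY and return the matching amounts in sorted order."""
    return sorted(r[0] for r in conn.execute(QUERY, ("cust42",)))


if __name__ == "__main__":
    for index_first in (True, False):
        conn = build(index_first)
        print("index first:" if index_first else "index last: ", plan(conn))
```

Either way the index ends up containing an entry for every row, so the same lookup is available to the planner. What differs is when the cost is paid: up front in one bulk build, or a little on every insert.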
However, since the index has to be updated each time you insert or delete a record (or update its key), inserting, deleting or updating large amounts of data will be a bit slower.
Indexes can get fragmented if you do a lot of inserts and deletes on the table. Thus, dropping and recreating (or rebuilding) them is usually part of a good maintenance plan.
#3
2
Check out the free scripts from Ola Hallengren. One of them covers index maintenance and statistics.
General rule of thumb:
Index fragmentation between 10 and 30%: reorganize.
Fragmentation >= 30%: rebuild.
After a reorganize, you need to update your statistics yourself.
A rebuild updates the index's statistics automatically.
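The rule of thumb above can be sketched as a small decision helper. This is only an illustration, not Ola Hallengren's script: the function name `maintenance_statements` and the table and index names in the usage line are hypothetical. The generated strings follow the T-SQL `ALTER INDEX ... REORGANIZE`, `ALTER INDEX ... REBUILD` and `UPDATE STATISTICS` forms; in SQL Server the fragmentation figure would typically come from `sys.dm_db_index_physical_stats` (`avg_fragmentation_in_percent`).

```python
def maintenance_statements(table: str, index: str, fragmentation_pct: float) -> list:
    """Pick T-SQL maintenance statements using the 10% / 30% rule of thumb.

    Names are inserted as given; this sketch does no quoting or validation.
    """
    if fragmentation_pct >= 30:
        # A rebuild recreates the index and refreshes its statistics as part of the work.
        return [f"ALTER INDEX {index} ON {table} REBUILD;"]
    if fragmentation_pct >= 10:
        # A reorganize only defragments, so statistics are updated separately.
        return [
            f"ALTER INDEX {index} ON {table} REORGANIZE;",
            f"UPDATE STATISTICS {table} {index};",
        ]
    # Below 10%: leave the index alone.
    return []


if __name__ == "__main__":
    for pct in (4.0, 17.5, 62.0):
        print(pct, maintenance_statements("dbo.Orders", "IX_Orders_Customer", pct))
```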
Indexing is a huge part of optimizing query performance.
- John
http://ola.hallengren.com/
#4
1
Indexes can be created before any data is inserted into the table in question. The index is simply updated every time rows are inserted or updated, assuming the update touches fields involved in the index.
When rows are inserted, the index may become fragmented so that it can maintain the desired logical order of rows. For instance, if a full index page holds rows like A, B, and E and you add a row containing C or D, the page would be split so the new row fits between B and E. This fragmentation can be repaired with Ola Hallengren's scripts, as Crafty DBA mentioned in his answer; however, depending on how your system storage is configured, this may be doing work for nothing.
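The A, B, E example can be turned into a toy simulation of a page split. This is a deliberately simplified model, not how SQL Server stores pages: `PAGE_CAPACITY`, `insert_key`, `logical_keys` and `out_of_order_pages` are hypothetical names, and real pages hold far more than three keys. The point it illustrates is that after a split the logical (key) order of pages no longer matches their physical (allocation) order, which is what fragmentation means here.

```python
from bisect import insort

PAGE_CAPACITY = 3  # hypothetical; chosen small so a split happens quickly


def insert_key(pages, order, key):
    """Insert key into the page that should hold it, splitting that page on overflow.

    pages: list of sorted key lists, indexed by physical page number.
    order: page numbers arranged in logical (key) order.
    """
    # First page, in logical order, whose largest key is >= key; otherwise the last page.
    pos = next(
        (i for i, p in enumerate(order) if pages[p] and pages[p][-1] >= key),
        len(order) - 1,
    )
    page_no = order[pos]
    insort(pages[page_no], key)
    if len(pages[page_no]) > PAGE_CAPACITY:
        # Page split: the upper half moves to a new page allocated at the physical end.
        keys = pages[page_no]
        mid = len(keys) // 2
        pages[page_no] = keys[:mid]
        pages.append(keys[mid:])
        order.insert(pos + 1, len(pages) - 1)


def logical_keys(pages, order):
    """Read all keys by following the logical page order."""
    return [k for p in order for k in pages[p]]


def out_of_order_pages(order):
    """Count logical neighbours where the next page is not the next physical page."""
    return sum(1 for a, b in zip(order, order[1:]) if b != a + 1)


if __name__ == "__main__":
    pages = [["A", "B", "E"], ["F", "G", "H"]]
    order = [0, 1]
    insert_key(pages, order, "C")  # page 0 is full, so it splits
    insert_key(pages, order, "D")  # fits on the new page without another split
    print(pages, order, logical_keys(pages, order), out_of_order_pages(order))
```

The keys still read back in sorted order by following `order`, but a scan now has to jump from physical page 0 to page 2 and back to page 1. Whether that jump costs anything in practice depends on the storage, which is the caveat raised above.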
Do yourself a favor and look at http://www.brentozar.com/sql/index-all-about-sql-server-indexes/ for some excellent info on SQL Server indexing.