Why and where to use INDEXes - pros and cons

Time: 2022-09-20 13:36:50

I'm quite new to database programming, and I'm wondering what the negative effects of indexes are. As far as I understand, indexes speed up operations that have to search the database for a specific value (for example, a SELECT).

Consider this example:

For the table Example, with an index on column user_name, the operation:

SELECT TestField FROM Example WHERE user_name=XXXX

will be faster as a result of the index.

My question is: what are the cons of using indexes? If an index only gives us pros (a performance gain), why aren't indexes set by default?

4 Answers

#1


13  

Well, you could probably fill books about indices, but in short, here are a few things to think about when creating an index:

While it (mostly) speeds up a select, it slows down inserts, updates and deletes, because the database engine has to write not only the data but the index, too.

An index needs space on the hard disk and (much more important) in RAM. An index that cannot be held in RAM is pretty useless.

An index on a column with only a few different values doesn't speed up selects, because it cannot filter out many rows (for example a column "gender", which usually has only two different values: male and female).

If you use MySQL, for example, you can check whether the engine uses an index by adding "EXPLAIN" before the select. For your example above:

EXPLAIN SELECT TestField FROM Example WHERE user_name=XXXX
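The same kind of check can be tried without a MySQL server. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in, so the statement is SQLite's EXPLAIN QUERY PLAN rather than MySQL's EXPLAIN, and the output format differs. The column types, the sample rows and the index name idx_example_user_name are assumptions made up for this illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL here; the column types, sample
# rows and the index name idx_example_user_name are invented for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Example (user_name TEXT, TestField TEXT)")
conn.executemany(
    "INSERT INTO Example VALUES (?, ?)",
    [(f"user{i}", f"value{i}") for i in range(1000)],
)

QUERY = "SELECT TestField FROM Example WHERE user_name = ?"


def query_plan(connection):
    """Return the plan text SQLite reports for QUERY."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ("user42",)).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


plan_without_index = query_plan(conn)
conn.execute("CREATE INDEX idx_example_user_name ON Example (user_name)")
plan_with_index = query_plan(conn)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

Before the index exists, the plan should describe a scan of the table; afterwards it should mention the index by name. The exact wording of the plan text varies between SQLite versions.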

#2


7  

What are indexes for, and what are they in a database?

Without an index on column user_name, the system would have to scan the entire Example table row by row to find all matching entries. If the data distribution in that table indicates that only a few rows match, this is clearly an inefficient way of obtaining those rows.

However, when using indexes, you redirect the search to a different structure, a tree, which has faster lookups and very small depth.
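To get a feeling for "very small depth", here is a rough back-of-the-envelope sketch. It assumes a balanced tree in which every node holds the same number of keys; the fanout of 100 is an illustrative assumption, not a figure taken from any particular database engine, and the helper name tree_depth is hypothetical.

```python
def tree_depth(row_count, fanout):
    """Number of levels a balanced tree needs to address row_count keys.

    Simplified model: every node has exactly `fanout` children, so each
    additional level multiplies the number of reachable keys by `fanout`.
    """
    depth, capacity = 0, 1
    while capacity < row_count:
        capacity *= fanout
        depth += 1
    return depth


# fanout=100 is an assumption chosen only for illustration.
for rows in (1_000, 1_000_000, 1_000_000_000):
    print(f"{rows:>13,} rows -> depth {tree_depth(rows, fanout=100)}")
```

Under these assumptions, going from a thousand rows to a billion rows only adds a few levels, which is why a lookup through the tree touches far fewer entries than a row-by-row scan.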

Please keep in mind that indexes are pure redundancy. A database index is just like the one in a telephone book, or any other index in a book you might want to read (probably only a part of it, so you need to find quickly what you're looking for).

If you are interested in a chapter of a book, the index lets you find it relatively quickly, so you don't have to skim through many pages to get to it.

Why aren't indexes created by default?

An index is a data structure that is created alongside a table and maintains itself whenever the table is changed. The very fact of its existence implies the use of storage.

If you indexed every column of a large table, the storage needed to keep the indexes would exceed the size of the table itself by far.
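The storage cost can be observed on a small scale. This is a minimal sketch using Python's sqlite3 module, again as a stand-in for whatever database you actually run; the extra age column, the generated rows and the index names are assumptions for illustration. It compares the number of database pages before and after indexing every column.

```python
import sqlite3

# Assumption: an in-memory SQLite database with invented sample data;
# the age column and the idx_* index names are not from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Example (user_name TEXT, TestField TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO Example VALUES (?, ?, ?)",
    [(f"user{i}", f"value{i}", i % 90) for i in range(10_000)],
)
conn.commit()


def page_count(connection):
    """Total number of pages the database currently occupies."""
    return connection.execute("PRAGMA page_count").fetchone()[0]


pages_table_only = page_count(conn)

# Index every column, as in the scenario described above.
for column in ("user_name", "TestField", "age"):
    conn.execute(f"CREATE INDEX idx_{column} ON Example ({column})")
conn.commit()

pages_with_indexes = page_count(conn)

print("pages, table only:  ", pages_table_only)
print("pages, with indexes:", pages_with_indexes)
```

How much the page count grows depends on the data and the engine, so treat the numbers as an illustration of the trend rather than a measurement that transfers to MySQL.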

Self-maintenance of an index structure also means that whenever an UPDATE, INSERT, or DELETE occurs, the index has to be updated, and that costs time.

There are situations when you need to retrieve most of the table, or the entire table. In this case a sequential scan of the whole table is more efficient than traversing the tree and following the leaf node chain.

#3


0  

The main reason we don't use indexes by default is the maintenance problem: whenever we modify (insert, delete, or update) a column that is indexed, the index must be updated dynamically, which is a somewhat time-consuming process. Maintaining the index therefore becomes an overhead.

#4


-2  

It depends on how you set up your indexes, but essentially they are unique identifiers for each table row, usually incremented by one. For example:

mytable {
 index | name  | m/f    | age
     1 | bob   | male   |  22
     2 | joe b | male   |  27
     3 | sam   | female |  42
     4 | bef   | female |  21
}

See how we can look up number 3 to get "sam" instead of going through each row and each column of the table.
