When a database table has been partitioned in MySQL, how are the individual partitions accessed / queried?
EDIT
In response to @Crack's comment.
So when a table is partitioned, I would still use a normal query. Where does the "pruning" come in: on the database side of the query? Is it essentially a complex stored WHERE clause that is applied to every query? And why are the partitions named if they are not accessed individually?
2 answers
#1
3
Ok, let's take this one part at a time.
So when a partition is in place in a table, then I still would use a normal query.
Yes. Partitioning is transparent to you. It is meant to optimize query performance (when used well) by dividing the physical storage of data and indexes into separate "bins".
Where does the "pruning" come in, at the database side of the query? Is it pretty much a complex stored Where clause that is applied to every query then?
Yes and no. Depending on the partitioning scheme, MySQL puts your data into disjoint "bins". Later it reads the WHERE clause of your query and works out which partitions it must check to answer it. The MySQL documentation has a few nice examples: Partition Pruning.
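As a rough illustration of pruning, here is a minimal sketch using a RANGE-partitioned table. The table name orders_demo, its columns, and the partition boundaries are all made up for this example; only the statements themselves are standard MySQL.

```sql
-- Hypothetical demo table: names and boundaries are illustrative only.
DROP TABLE IF EXISTS orders_demo;
CREATE TABLE orders_demo (
    id         INT NOT NULL,
    order_year INT NOT NULL,
    amount     DECIMAL(10,2)
)
PARTITION BY RANGE (order_year) (
    PARTITION p2010 VALUES LESS THAN (2011),
    PARTITION p2011 VALUES LESS THAN (2012),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- An ordinary query. Because the WHERE clause filters on the
-- partitioning column, the optimizer can skip partitions that
-- cannot contain matching rows.
EXPLAIN SELECT * FROM orders_demo WHERE order_year = 2011;
```

On MySQL 5.7 and later the EXPLAIN output has a partitions column listing the partitions that survive pruning; for this query I would expect it to show only p2011. On 5.6 and earlier the equivalent statement is EXPLAIN PARTITIONS SELECT ...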
Partitioning also lets you store each partition on a different physical storage device, which can spread I/O across devices, and it lets MySQL skip some partitions entirely instead of scanning them (see the examples in the link above).
Why are the partitions named if they are not individually accessed?
They are individually accessed, but you don't make this decision; the optimizer does. Partition names make it easier for you to manage them. You can find the possible operations in the documentation (Partition Management).
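To show where the names become useful, here is a small sketch of a few management statements. The table log_demo and its partition names are hypothetical; the ALTER TABLE forms and the INFORMATION_SCHEMA.PARTITIONS table are standard MySQL.

```sql
-- Hypothetical demo table for partition management.
DROP TABLE IF EXISTS log_demo;
CREATE TABLE log_demo (
    id       INT NOT NULL,
    log_year INT NOT NULL
)
PARTITION BY RANGE (log_year) (
    PARTITION p2010 VALUES LESS THAN (2011),
    PARTITION p2011 VALUES LESS THAN (2012)
);

-- Add a partition for newer data (possible here because the table
-- has no MAXVALUE partition at the end).
ALTER TABLE log_demo ADD PARTITION (PARTITION p2012 VALUES LESS THAN (2013));

-- Drop the oldest partition together with every row stored in it.
ALTER TABLE log_demo DROP PARTITION p2010;

-- List the remaining partitions by name.
SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'log_demo';
```

Dropping a partition is usually much cheaper than a DELETE over the same rows, which is one common reason to address partitions by name. Note that TABLE_ROWS is only an estimate for InnoDB tables.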
Since MySQL 5.6.2 you can select data from individual partitions, see Partition Selection. Just a piece of advice: don't use this syntax if you don't have to, because it ties your queries to the storage structure of your data (and don't use an unstable version of MySQL in production ;).
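For reference, explicit partition selection looks roughly like this. The table sel_demo is hypothetical, and the sketch assumes MySQL 5.6.2 or later; p0 and p1 are the names MySQL assigns by default when HASH partitions are not named explicitly.

```sql
-- Hypothetical demo table with two HASH partitions (default names p0, p1).
DROP TABLE IF EXISTS sel_demo;
CREATE TABLE sel_demo (
    id   INT NOT NULL,
    note VARCHAR(50)
)
PARTITION BY HASH (id) PARTITIONS 2;

INSERT INTO sel_demo VALUES (1, 'odd'), (2, 'even'), (3, 'odd');

-- Explicit partition selection: read only the rows stored in p1.
-- With HASH(id) and 2 partitions these are the rows where MOD(id, 2) = 1.
SELECT * FROM sel_demo PARTITION (p1);

-- Usually preferable: a normal query that leaves the choice to the optimizer.
SELECT * FROM sel_demo WHERE id = 1;
```

The first SELECT stops returning the same rows as soon as the number of partitions changes, which is the coupling to the storage structure warned about above.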
#2
3
Partitioning first comes into play when you insert data, because that is when MySQL decides which partition each row belongs to.
For example, assume I've partitioned a table by hash on id, an integer column, and my hash function simply checks whether the integer is odd or even. MySQL would effectively create two bins: the odd bin and the even bin.
When I insert id = 1, MySQL applies the hash function. Since the result is odd, the row is put in the odd bin. When I insert id = 2, the row goes to the even bin.
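The odd/even example maps fairly directly onto MySQL's HASH partitioning with two partitions. The sketch below uses a hypothetical table parity_demo; it relies on the documented rule that HASH partitioning places a row in partition number MOD(expr, number_of_partitions).

```sql
-- Hypothetical demo table: two HASH partitions act as the even and odd bins.
DROP TABLE IF EXISTS parity_demo;
CREATE TABLE parity_demo (
    id      INT NOT NULL,
    payload VARCHAR(50)
)
PARTITION BY HASH (id) PARTITIONS 2;

INSERT INTO parity_demo VALUES (1, 'a'), (2, 'b');

-- Recompute the partition number for each row:
-- 0 is the "even" bin (p0), 1 is the "odd" bin (p1).
SELECT id, MOD(id, 2) AS partition_number
FROM parity_demo
ORDER BY id;
```

Here id = 1 gives partition number 1 and id = 2 gives partition number 0, matching the odd and even bins described above.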
Querying doesn't remove any data, it just involves a bit of smart logic (this is the step the MySQL documentation calls partition pruning). MySQL knows from a query fired on this table that it could improve performance if it only had to look at one partition (half the data in our case). So it attempts to identify that partition.
When a query now involves the id column in a WHERE clause, MySQL again applies the hash function to the value passed. Suppose I say WHERE id = 2 AND <some other condition>: the hash returns even, so MySQL only looks at the even bin.
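One way to check this yourself is to compare the plans for a query that filters on id with one that does not. The table lookup_demo and its status column are invented for this sketch, and the partitions column in plain EXPLAIN output assumes MySQL 5.7 or later (use EXPLAIN PARTITIONS on older versions).

```sql
-- Hypothetical demo table with an "even" (p0) and an "odd" (p1) bin.
DROP TABLE IF EXISTS lookup_demo;
CREATE TABLE lookup_demo (
    id     INT NOT NULL,
    status VARCHAR(20)
)
PARTITION BY HASH (id) PARTITIONS 2;

INSERT INTO lookup_demo VALUES (1, 'open'), (2, 'open'), (3, 'closed');

-- Equality on the partitioning column: MySQL can hash the constant 2
-- and restrict the lookup to a single partition.
EXPLAIN SELECT * FROM lookup_demo WHERE id = 2 AND status = 'open';

-- No condition on id: nothing to hash, so every partition must be checked.
EXPLAIN SELECT * FROM lookup_demo WHERE status = 'open';
```

I would expect the partitions column to show p0 for the first query and p0,p1 for the second.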
In this trivial example, you can see that when querying or inserting data, only one half of the complete data set needs to be scanned or updated, improving performance by roughly a factor of 2 (let's discount the hashing overhead for now).