I read that
SELECT is a horizontal partition of the relation into two sets of tuples.
and
PROJECT is a vertical partition of the relation into two relations.
However, I don't understand what that means. Can you explain it in layman's terms?
4 solutions
#1
33
Not a complete answer to the question but it answers what is asked in the question title. So the general meaning of horizontal and vertical database partitioning is:
Horizontal partitioning involves putting different rows into different tables. Perhaps customers with ZIP codes less than 50000 are stored in CustomersEast, while customers with ZIP codes greater than or equal to 50000 are stored in CustomersWest. The two partition tables are then CustomersEast and CustomersWest, while a view with a union might be created over both of them to provide a complete view of all customers.
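A minimal sketch of that idea, using Python's built-in sqlite3 module so it can be run as-is. The table names and the 50000 boundary come from the paragraph above; the column names, the `Customers` view name, the `insert_customer` helper and the sample rows are assumptions made up for illustration, and storing ZIP codes as integers is a simplification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Same columns in both tables; only the rows differ.
    CREATE TABLE CustomersEast (
        id INTEGER PRIMARY KEY, name TEXT, zip INTEGER CHECK (zip < 50000)
    );
    CREATE TABLE CustomersWest (
        id INTEGER PRIMARY KEY, name TEXT, zip INTEGER CHECK (zip >= 50000)
    );
    -- A view with a union gives back the complete set of customers.
    CREATE VIEW Customers AS
        SELECT id, name, zip FROM CustomersEast
        UNION ALL
        SELECT id, name, zip FROM CustomersWest;
""")

def insert_customer(conn, cust_id, name, zip_code):
    """Hypothetical helper: route a row to the partition its ZIP belongs to."""
    table = "CustomersEast" if zip_code < 50000 else "CustomersWest"
    conn.execute(
        f"INSERT INTO {table} (id, name, zip) VALUES (?, ?, ?)",
        (cust_id, name, zip_code),
    )

insert_customer(conn, 1, "Alice", 10001)
insert_customer(conn, 2, "Bob", 94105)
insert_customer(conn, 3, "Carol", 50000)

east = conn.execute("SELECT name FROM CustomersEast ORDER BY id").fetchall()
west = conn.execute("SELECT name FROM CustomersWest ORDER BY id").fetchall()
all_customers = conn.execute("SELECT name FROM Customers ORDER BY id").fetchall()
print(east, west, all_customers)
```

Each partition table only ever has to scan its own rows, while the view keeps a single place to query everything.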
Vertical partitioning involves creating tables with fewer columns and using additional tables to store the remaining columns. Normalization also involves this splitting of columns across tables, but vertical partitioning goes beyond that and partitions columns even when already normalized.
See more details here.
#2
8
A projection creates a subset of attributes in a relation hence a "vertical partition"
A selection creates a subset of the tuples in a relation hence a "horizontal partition"
Given a table (r) such as:
a : b : c : d : e
-----------------
1 : 2 : 3 : 4 : 5
1 : 2 : 3 : 4 : 5
2 : 2 : 3 : 4 : 5
2 : 2 : 3 : 4 : 5
An expression such as
PROJECT a, b (SELECT a=1 (r))
-- SELECT a, b FROM r WHERE a=1
Would "do"
a : b | c : d : e
-----------------
1 : 2 | 3 : 4 : 5
1 : 2 | 3 : 4 : 5
================= <-- horizontal partition (by SELECTION)
2 : 2 | 3 : 4 : 5
2 : 2 | 3 : 4 : 5
      ^-- vertical partition (by PROJECTION)
Resulting in
a : b
------
1 : 2
1 : 2
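If you want to try this yourself, here is a small sketch that loads the table r from above into an in-memory SQLite database and runs the SQL form of the expression. One caveat worth knowing: in strict relational algebra a relation is a set, so duplicate tuples collapse into one, whereas SQL keeps duplicates unless you ask for DISTINCT. The variable names are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE r (a INTEGER, b INTEGER, c INTEGER, d INTEGER, e INTEGER)")
conn.executemany(
    "INSERT INTO r VALUES (?, ?, ?, ?, ?)",
    [(1, 2, 3, 4, 5), (1, 2, 3, 4, 5), (2, 2, 3, 4, 5), (2, 2, 3, 4, 5)],
)

# WHERE does the selection (rows), the column list does the projection (columns).
sql_rows = conn.execute("SELECT a, b FROM r WHERE a = 1").fetchall()

# DISTINCT gives the set semantics of relational algebra.
distinct_rows = conn.execute("SELECT DISTINCT a, b FROM r WHERE a = 1").fetchall()

print(sql_rows)       # [(1, 2), (1, 2)]
print(distinct_rows)  # [(1, 2)]
```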
#3
4
Necromancing.
I think the existing answers are too abstract.
So here is my attempt at a more practical explanation:
Partitioning from a developer's point of view is all about performance.
More exactly, it's about what happens when you have large amounts of data in your tables, and you still want to query the data fast.
Here are some excerpts from slides by Bill Karwin about what exactly horizontal partitioning is all about:
The above is bad, because:
The solution: HORIZONTAL PARTITIONING
Horizontal partitioning divides a table into multiple tables. Each table then contains the same number of columns, but fewer rows.
The difference: Query Performance and simplicity
Now, on the difference between horizontal and vertical partitioning:
"Tribbles" can also accumulate in columns. Example:
The solution to that problem is VERTICAL PARTITIONING
Proper normalization is ONE form of vertical partitioning
To quote TechNet:
Vertical partitioning divides a table into multiple tables that contain fewer columns.
The two types of vertical partitioning are normalization and row splitting:
Normalization is the standard database process of removing redundant columns from a table and putting them in secondary tables that are linked to the primary table by primary key and foreign key relationships.
Row splitting divides the original table vertically into tables with fewer columns. Each logical row in a split table matches the same logical row in the other tables as identified by a UNIQUE KEY column that is identical in all of the partitioned tables. For example, joining the row with ID 712 from each split table re-creates the original row. Like horizontal partitioning, vertical partitioning lets queries scan less data. This increases query performance. For example, a table that contains seven columns of which only the first four are generally referenced may benefit from splitting the last three columns into a separate table. Vertical partitioning should be considered carefully, because analyzing data from multiple partitions requires queries that join the tables.
Vertical partitioning also could affect performance if partitions are very large.
That sums it up nicely.
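To make the row-splitting part concrete, here is a rough sketch with sqlite3. It follows the quoted example of a seven-column table whose last three columns are moved out, and of the row with ID 712 being re-created by a join. The table names `product_core` and `product_extra`, all column names and the sample values are hypothetical, invented only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The four frequently used columns stay in the narrow table.
    CREATE TABLE product_core (
        id INTEGER PRIMARY KEY, name TEXT, price REAL, stock INTEGER
    );
    -- The three rarely used columns move out; the same key identifies the row.
    CREATE TABLE product_extra (
        id INTEGER PRIMARY KEY REFERENCES product_core(id),
        description TEXT, manual TEXT, notes TEXT
    );
    INSERT INTO product_core VALUES (712, 'Widget', 9.99, 40);
    INSERT INTO product_extra VALUES (712, 'A small widget', 'See box', 'Fragile');
""")

# The common query touches only the narrow table.
hot = conn.execute(
    "SELECT name, price FROM product_core WHERE id = 712"
).fetchone()

# Joining on the shared key re-creates the original seven-column row.
full = conn.execute("""
    SELECT c.id, c.name, c.price, c.stock, e.description, e.manual, e.notes
    FROM product_core AS c
    JOIN product_extra AS e ON e.id = c.id
    WHERE c.id = 712
""").fetchone()

print(hot)
print(full)
```

The join in the second query is exactly the cost the quote warns about: anything that needs all seven columns now has to read both tables.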
Now on SELECT vs. PROJECT:
This SO post describes the difference as such:
Select Operation: This operation is used to select rows from a table (relation) that satisfy a given logic, which is called a predicate. The predicate is a user-defined condition to select rows of the user's choice.
Project Operation: If the user is interested in selecting the values of a few attributes, rather than selecting all attributes of the table (relation), then one should go for the PROJECT operation.
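SELECT is also an actual SQL statement, while PROJECT is a term used only in relational algebra. Note that the relational-algebra SELECT corresponds to the WHERE clause in SQL, and PROJECT corresponds to the column list of the SQL SELECT statement.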
Judging from you posting this on SO and not on MathOverflow, I would suggest you don't read relational algebra books if you just want to learn SQL for developing applications.
If you are in dire need of a recommendation for a good book about (advanced) SQL, here is one:
SQL Antipatterns: Avoiding the Pitfalls of Database Programming
Bill Karwin
ISBN-13: 978-1934356555
ISBN-10: 1934356557
That's the one book about SQL worth reading.
Most other books about SQL that I've seen out there can be summed up by this cynical statement about Photoshop books:
There are more books about Photoshop than people actually using Photoshop.
#4
1
Consider a single table in a database; it has some rows and columns.
There are two ways you could pick data: you could pick some rows, or you could pick some columns (well OK, three ways: you could pick some rows, and within those pick some columns).
You can think of select as picking some rows - that's horizontal (and not picking the rest, hence partitioning)
You can think of project as picking some columns - that's vertical (and not picking the rest)
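As a rough illustration of why the word "partition" is used, here is a toy version in plain Python. The functions `select` and `project` are my own sketch, not a real library API: unlike the actual relational operators, which return only the picked part, these return both the picked part and the rest so the split is visible. The sample table is made up, and the sketch ignores the duplicate removal that a true relational projection would do.

```python
def select(rows, predicate):
    """Split rows horizontally: (rows matching the predicate, the rest)."""
    picked = [row for row in rows if predicate(row)]
    rest = [row for row in rows if not predicate(row)]
    return picked, rest

def project(rows, columns):
    """Split rows vertically: (the chosen columns, the remaining columns)."""
    picked = [{k: row[k] for k in columns} for row in rows]
    rest = [{k: v for k, v in row.items() if k not in columns} for row in rows]
    return picked, rest

table = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 2, "b": 2, "c": 3},
]

# Pick some rows, and within those pick some columns.
rows_kept, rows_dropped = select(table, lambda row: row["a"] == 1)
cols_kept, cols_dropped = project(rows_kept, ["a", "b"])

print(rows_kept)     # [{'a': 1, 'b': 2, 'c': 3}]
print(rows_dropped)  # [{'a': 2, 'b': 2, 'c': 3}]
print(cols_kept)     # [{'a': 1, 'b': 2}]
print(cols_dropped)  # [{'c': 3}]
```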