Order of results for a SELECT without a WHERE or ORDER BY clause

Time: 2022-06-04 09:28:50

I have a table with a PK clustered index as well as other indexes on it, both unique and non-unique. If I issue (exactly):

SELECT * FROM table_name

or

SELECT col1, col2 FROM table_name

in what order will the rows be returned?

This is the first question in an interview questionnaire a customer has forwarded us. Here are the instructions:

If the answer to this question is incorrect, terminate the interview immediately! The individual, regardless of their stated ability does not understand SQL-Based relational database management systems. This is SQL-101 logic for the past 25+ years. The correct answer is: “unknown/random/undetermined because no ORDER BY clause was specified as part of the query”.

I am somehow not convinced that this is actually correct. All comments welcome.

Thanks,

Raj

2 answers

#1


8  

Even if a table has a primary key/clustered index, you can't be sure about the order of rows. Although the execution plan will contain an index or heap scan, if the query is executed in parallel on many cores, the resulting dataset may not come back sorted, because of the plan step that merges the parallel streams.

You probably won't see it on small databases, but try creating one with many files on separate hard drives and running a simple query on a multicore machine. Most likely you'll get results "partially sorted" by ID, i.e. there will be blocks within which the rows are sorted, but the blocks themselves will be retrieved in semi-random order.
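To make the "partially sorted" effect concrete, here is a toy simulation in Python. It is only a sketch, not how any particular database engine works: the function name `parallel_scan`, the block size, and the seeded shuffle (standing in for the unpredictable order in which parallel workers deliver their blocks) are all assumptions made for illustration.

```python
import random


def parallel_scan(rows, block_size, seed=None):
    """Simulate a parallel scan (hypothetical model, not a real engine).

    Each block keeps its rows in key order, but the blocks reach the
    gather step in an arbitrary order.
    """
    blocks = [rows[i:i + block_size] for i in range(0, len(rows), block_size)]
    # Stand-in for nondeterministic arrival order of the parallel streams.
    random.Random(seed).shuffle(blocks)
    return [row for block in blocks for row in block]


ids = list(range(1, 13))  # rows stored in primary-key order
result = parallel_scan(ids, block_size=4, seed=1)

print(result)                 # runs of 4 ascending IDs, block order arbitrary
print(sorted(result) == ids)  # same rows; only an explicit sort fixes the order
```

Every row is still returned exactly once; what is lost is the global order, which is why only an explicit ORDER BY can be relied on.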

#2


5  

The instructions speak to SQL at a conceptual level, at which the result of a query is a relation, and relations are unordered. Moving from the conceptual to the actual, the reason the SQL standard defines no implicit ordering is so that RDBMSs are free to return rows in whatever order is most efficient for their implementation.
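A small sketch of the practical consequence, using Python's built-in sqlite3 module. SQLite is used here only because it ships with Python; it is an assumption for illustration and not the engine from the question, and the index name `idx_col2` and the sample rows are made up. The point is that the two result sets contain the same rows, but only the query with ORDER BY promises a particular order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (col1 INTEGER PRIMARY KEY, col2 TEXT)")
conn.execute("CREATE INDEX idx_col2 ON table_name (col2)")  # hypothetical secondary index
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [(3, "a"), (1, "c"), (2, "b")],
)

# No ORDER BY: the engine may pick any access path, so treat the order as unspecified.
unordered = conn.execute("SELECT col1, col2 FROM table_name").fetchall()

# ORDER BY: the only form whose row order is guaranteed.
ordered = conn.execute(
    "SELECT col1, col2 FROM table_name ORDER BY col1"
).fetchall()

print(unordered)
print(ordered)
```

Code that compares or paginates results should therefore either add an ORDER BY or compare the rows as an unordered collection, for example `sorted(unordered) == ordered`.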
