Order of a SQL Select statement without an Order By clause

Time: 2022-01-11 22:48:20

As far as I know, relational database theory says that a select statement without an order by clause should be considered to have no particular order. But in practice, in SQL Server and Oracle (I've tested on those 2 platforms), if I query a table without an order by clause multiple times, I always get the results in the same order. Can this behavior be relied on? Can anyone help explain a little?

4 answers

#1


34  

No, that behavior cannot be relied on. The order is determined by the way the query planner has decided to build up the result set. Simple queries like select * from foo_table are likely to be returned in the order they are stored on disk, which may be primary key order, the order they were created in, or some other seemingly random order. More complex queries, such as select * from foo where bar < 10, may instead be returned in the order of a different column if an index is read, or in table order if a table scan is used. Even more elaborate queries, with multiple where conditions, group by clauses, or unions, will come back in whatever order the planner decides is most efficient to generate.

The order could even change between two identical queries just because the data changed between them. A "where" clause may be satisfied with an index scan in one query, but later inserts could make that condition less selective, and the planner could decide to perform a subsequent query using a table scan.
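
A minimal sketch of that point, using Python's built-in sqlite3 module rather than SQL Server or Oracle (an assumption made only so the example is self-contained). The table foo, its data, and the index name foo_bar_idx are hypothetical. It runs the same query before and after an index exists: the rows are the same, but nothing promises that the two lists come back in the same order, and what you actually see depends on the engine and its planner.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, bar INTEGER)")
conn.executemany(
    "INSERT INTO foo (id, bar) VALUES (?, ?)",
    [(1, 9), (2, 3), (3, 7), (4, 1), (5, 42)],
)

query = "SELECT id FROM foo WHERE bar < 10"

# No index on bar yet: the engine has to scan the table.
before_index = [row[0] for row in conn.execute(query)]

# With an index on bar, the planner is free to read the index instead.
conn.execute("CREATE INDEX foo_bar_idx ON foo (bar)")
after_index = [row[0] for row in conn.execute(query)]

# Same rows either way; the order of the two lists is not promised to match.
print(before_index, after_index)

# With ORDER BY, the order is part of what was asked for.
ordered = [row[0] for row in conn.execute(query + " ORDER BY id")]
print(ordered)
```

Only the last query has an order you can count on. Whatever the first two happen to print is an accident of the access path chosen at that moment.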

To put a finer point on it: an RDBMS has the mandate to give you exactly what you asked for, as efficiently as possible. That efficiency can take many forms, including minimizing IO (both to disk and over the network when sending data to you), minimizing CPU, and keeping the size of its working set small (using methods that require minimal temporary storage).

Without an ORDER BY clause, you have not asked for any particular order, so the RDBMS will give you the rows in some order that (maybe) corresponds to some coincidental aspect of the query, based on whichever algorithm the RDBMS expects to produce the data fastest.

If you care about efficiency, but not order, skip the ORDER BY clause. If you care about the order but not efficiency, use the ORDER BY clause.

Since you actually care about BOTH, use ORDER BY and then carefully tune your query and database so that it is efficient.

#2


5  

No, you can't rely on getting the results back in the same order every time. I discovered that when working on a web page with a paged grid. When I went to the next page, and then back to the previous page, the previous page contained different records! I was totally mystified.

For predictable results, then, you should include an ORDER BY. Even then, if there are identical values in the specified columns, you can get different results. You may have to ORDER BY fields that you didn't really think you needed, just to get a predictable result.
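
Here is a rough sketch of that tie-breaker idea for a paged grid, again using Python's sqlite3 module as a stand-in (an assumption; the people table, its rows, and the fetch_page helper are all hypothetical). Sorting on last_name alone leaves the duplicates in no defined order, so the unique id column is added as a second sort key. LIMIT ... OFFSET is SQLite syntax; other databases spell paging differently.

```python
import sqlite3

# Hypothetical table with duplicate last names, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, last_name TEXT)")
conn.executemany(
    "INSERT INTO people (id, last_name) VALUES (?, ?)",
    [(1, "Smith"), (2, "Jones"), (3, "Smith"),
     (4, "Jones"), (5, "Smith"), (6, "Adams")],
)


def fetch_page(page, page_size=2):
    """Return one page of ids, ordered by last_name with id as tie-breaker."""
    rows = conn.execute(
        "SELECT id FROM people ORDER BY last_name, id LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    )
    return [row[0] for row in rows]


pages = [fetch_page(p) for p in range(3)]
print(pages)  # [[6, 2], [4, 1], [3, 5]]
```

Because (last_name, id) is unique for every row, each page always holds the same records, and going forward and then back shows the same page again.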

#3


3  

Tom Kyte has a pet peeve about this topic. For whatever reason, people are fascinated by this, and keep trying to come up with cases where you can rely upon a specific order without specifying ORDER BY. As others have stated, you can't. Here's another amusing thread on the topic on the AskTom website.

#4


0  

The Right Answer

This is a new answer added to correct the old one. I got an answer from Tom Kyte and am posting it here:

If you want rows sorted YOU HAVE TO USE AN ORDER. No if, and, or buts about it. period. http://tkyte.blogspot.ru/2005/08/order-in-court.html You need order by on that IOT. Rows are sorted in leaf blocks, but leaf blocks are not stored sorted. fast full scan=unsorted rows.

https://twitter.com/oracleasktom/status/625318150590980097

https://twitter.com/oracleasktom/status/625316875338149888

The Wrong Answer

(Attention! The original answer to the question is kept below only for the sake of history. It is the wrong answer. The right answer is above.)

As Tom Kyte wrote in the article mentioned before:

You should think of a heap organized table as a big unordered collection of rows. These rows will come out in a seemingly random order, and depending on other options being used (parallel query, different optimizer modes and so on), they may come out in a different order with the same query. Do not ever count on the order of rows from a query unless you have an ORDER BY statement on your query!

But note he only talks about heap-organized tables. There are also index-organized tables. In that case you can rely on the order of the select without ORDER BY, because the order is implicitly defined by the primary key. It is true for Oracle.

In SQL Server, clustered indexes (index-organized tables) are created by default. PostgreSQL can also store data aligned with an index. More information can be found here

UPDATE: I see that my answer is being voted down, so I will try to explain my point a little. In the section Overview of Index-Organized Tables there is this passage:

In an index-organized table, rows are stored in an index defined on the primary key for the table... Index-organized tables are useful when related pieces of data must be stored together or data must be physically stored in a specific order.

http://docs.oracle.com/cd/E25054_01/server.1111/e25789/indexiot.htm#CBBJEBIH

Because of the index, all data is stored in a specific order. I believe the same is true for Pg. http://www.postgresql.org/docs/9.2/static/sql-cluster.html

If you don't agree with me, please give me a link to the documentation. I'll be happy to know that there is something for me to learn.
