I need to fetch the latest user from the User table. Which of the queries below has the best performance in Postgres?
Select MAX(u.id) from User u;
or
Select u.id from User u order by u.id desc limit 1;
2 solutions
#1
0
This is an elaboration of the comment.
If you have an index on user(id), then both formulations should use that index. I'm pretty sure they would have essentially the same execution plan.
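One way to check this is to build a throwaway table and compare the plans yourself. The following is only a minimal sketch: the table app_user and its columns are hypothetical (the name avoids user, which is a reserved word in PostgreSQL and would need double quotes), and the row count is arbitrary.

```sql
-- Hypothetical test table; the primary key creates a unique b-tree index on id.
create table app_user (
    id   bigserial primary key,
    name text not null
);

-- Fill it with some rows so the planner has something to work with.
insert into app_user (name)
select 'user ' || g from generate_series(1, 100000) as g;

-- Refresh planner statistics and the visibility map before looking at plans.
vacuum analyze app_user;

-- Compare the two formulations.
explain select max(u.id) from app_user u;
explain select u.id from app_user u order by u.id desc limit 1;
```

If both plans show a backward scan of the primary key index under a Limit node, the two forms are doing essentially the same work.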
If you don't have a (b-tree) index, then I think the max() version will be faster. I think it will read the data once and extract the max() in a single pass. The order by will have to sort all the records.
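The no-index case can be checked the same way rather than guessed at. A rough sketch, again with a hypothetical table (app_user_noidx) that deliberately has no primary key or unique constraint, so that no index on id exists:

```sql
-- Hypothetical table with no index on id.
create table app_user_noidx (id bigint, name text);

insert into app_user_noidx (id, name)
select g, 'user ' || g from generate_series(1, 100000) as g;

analyze app_user_noidx;

-- Both queries now have to read the whole table; compare how each gets its answer.
explain analyze select max(id) from app_user_noidx;
explain analyze select id from app_user_noidx order by id desc limit 1;
```

In the order by plan, the Sort node's "Sort Method" line reports how the sort was actually carried out, which is the place to look before concluding that every record was fully sorted.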
Sometimes databases have some very specific optimizations that might apply (such as an optimization that might recognize a special case with limit and order by). I don't think any apply in this case.
#2
0
This may depend on your PostgreSQL version, but I tested the two approaches on a representative table (which is what you should do):
explain analyze select max(id) from versions;
                                                                             QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.21..0.21 rows=1 width=0) (actual time=0.034..0.034 rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.08..0.21 rows=1 width=4) (actual time=0.031..0.031 rows=1 loops=1)
           ->  Index Only Scan Backward using index_versions_on_id on versions  (cost=0.08..98474.35 rows=787172 width=4) (actual time=0.030..0.030 rows=1 loops=1)
                 Index Cond: (id IS NOT NULL)
                 Heap Fetches: 1
 Planning time: 0.143 ms
 Execution time: 0.062 ms
(8 rows)
explain analyze select id from versions order by id desc limit 1;
                                                                         QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.08..0.21 rows=1 width=4) (actual time=0.025..0.025 rows=1 loops=1)
   ->  Index Only Scan Backward using index_versions_on_id on versions  (cost=0.08..98080.76 rows=787172 width=4) (actual time=0.024..0.024 rows=1 loops=1)
         Heap Fetches: 1
 Planning time: 0.099 ms
 Execution time: 0.044 ms
(5 rows)
This was on PostgreSQL 9.4.5, using a unique index on a table with 860,000 rows.
This showed that the order by technique was marginally faster, but for me that is not enough to decide that you should use it -- performance is not everything, and I prefer the semantics of the max() approach.
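On the point about semantics: the two forms are not strictly interchangeable, which may matter more than a timing difference of this size. A small sketch with a hypothetical table t whose id column is nullable (for a primary key column only the empty-table case applies):

```sql
-- Hypothetical table: nullable column, no rows yet.
create table t (id integer);

-- Empty table: max() still returns one row (containing NULL),
-- while order by ... limit 1 returns no rows at all.
select max(id) from t;
select id from t order by id desc limit 1;

insert into t (id) values (1), (2), (null);

-- max() ignores NULLs, so this yields 2.
select max(id) from t;

-- In PostgreSQL, desc sorts NULLs first by default, so this yields NULL ...
select id from t order by id desc limit 1;

-- ... unless NULLs are pushed to the end explicitly, which yields 2 again.
select id from t order by id desc nulls last limit 1;
```

This is also why the max() plan above carries the extra condition Index Cond: (id IS NOT NULL) while the order by plan does not.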