I have a quite large, date-partitioned table, e.g. table1. There is a multicolumn index on (shop_id, g_id, check_date).
And I'm trying to run the query:
SELECT shop_id, g_id, max(check_date)
FROM table1
GROUP BY shop_id, g_id;
The execution is really slow: the plan shows a Seq Scan. How can I optimize or rewrite the query so that it can use the index? There is also a table that contains the unique G_IDs and another table with the unique SHOP_IDs.
1 solution
#1
You could rewrite this query using analytic functions, e.g.
SELECT
t.shop_id,
t.g_id,
t.check_date
FROM
(
SELECT shop_id, g_id, check_date,
DENSE_RANK() OVER (PARTITION BY shop_id, g_id ORDER BY check_date DESC) dr
FROM table1
) t
WHERE t.dr = 1;
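One caveat about the query above: DENSE_RANK() assigns rank 1 to every row that shares the latest check_date, so a (shop_id, g_id) pair can come back more than once, whereas the original GROUP BY returns one row per pair. A minimal sketch of a variant that keeps a single row per pair, assuming PostgreSQL and assuming that rows with a NULL check_date should sort last:

```sql
-- Sketch only: ROW_NUMBER() numbers tied rows 1, 2, 3, ...
-- so exactly one row per (shop_id, g_id) survives the filter.
SELECT
    t.shop_id,
    t.g_id,
    t.check_date
FROM
(
    SELECT shop_id, g_id, check_date,
           ROW_NUMBER() OVER (
               PARTITION BY shop_id, g_id
               ORDER BY check_date DESC NULLS LAST  -- assumption: NULL dates rank last
           ) AS rn
    FROM table1
) AS t
WHERE t.rn = 1;
```

Note that a window function still has to read every row of table1 before the outer filter applies, so this form is unlikely to avoid the full scan on its own; checking the plan with EXPLAIN is the only way to be sure.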
Add an index on both the shop_id and g_id columns to cover the entire partition:
CREATE INDEX your_idx ON table1 (shop_id, g_id);
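Since the question mentions separate tables holding the unique shop and goods IDs, another option may be to drive the query from those tables and probe the existing (shop_id, g_id, check_date) index once per pair. The sketch below assumes PostgreSQL 9.3 or later (for LATERAL); the table names shops and goods are hypothetical, because the question does not name them.

```sql
-- Sketch only. Hypothetical lookup tables:
--   shops(shop_id)  one row per shop
--   goods(g_id)     one row per goods ID
SELECT s.shop_id, g.g_id, m.check_date
FROM shops AS s
CROSS JOIN goods AS g
CROSS JOIN LATERAL (
    -- With equality on the two leading index columns, the newest
    -- check_date can be read from the end of the matching index range.
    SELECT t.check_date
    FROM table1 AS t
    WHERE t.shop_id = s.shop_id
      AND t.g_id = g.g_id
      AND t.check_date IS NOT NULL  -- max() ignores NULLs, so skip them here too
    ORDER BY t.check_date DESC
    LIMIT 1
) AS m;
```

Pairs with no rows in table1 drop out of the result, as they would with GROUP BY; a pair whose check_date values are all NULL is also dropped here, which differs from max(). This approach tends to pay off only when the number of shop and goods combinations is small compared with the row count of table1, and on a partitioned table each probe may have to visit every partition, so it is worth comparing plans with EXPLAIN ANALYZE before adopting it.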