I'm trying to port some old MySQL queries to PostgreSQL, but I'm having trouble with this one:
DELETE FROM logtable ORDER BY timestamp LIMIT 10;
PostgreSQL doesn't allow ordering or limits in its DELETE syntax, and the table doesn't have a primary key, so I can't use a subquery. Additionally, I want to preserve the behavior where the query deletes exactly the given number of records -- for example, if the table contains 30 rows but they all have the same timestamp, I still want to delete 10, although it doesn't matter which 10.
So, how do I delete a fixed number of rows with sorting in PostgreSQL?
Edit: No primary key means there's no log_id column or similar. Ah, the joys of legacy systems!
5 Answers
#1
104
You could try using the ctid:
DELETE FROM logtable
WHERE ctid IN (
SELECT ctid
FROM logtable
ORDER BY timestamp
LIMIT 10
)
The ctid is:
The physical location of the row version within its table. Note that although the ctid can be used to locate the row version very quickly, a row's ctid will change if it is updated or moved by VACUUM FULL. Therefore ctid is useless as a long-term row identifier.
There's also oid, but that only exists if you specifically ask for it when you create the table (and PostgreSQL 12 removed WITH OIDS, so it is not available for user tables in current versions).
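To check that the ctid approach handles the case from the question (30 rows with identical timestamps, exactly 10 deleted), here is a minimal sketch. The table definition and sample data are assumptions made up for illustration; only the DELETE itself comes from the answer above.

```sql
-- Hypothetical stand-in for the legacy table: no primary key, no log_id.
CREATE TABLE logtable (
    "timestamp" timestamp NOT NULL,
    message     text
);

-- 30 rows that all share the same timestamp (the worst case from the question).
INSERT INTO logtable ("timestamp", message)
SELECT timestamp '2020-01-01 00:00:00', 'entry ' || g
FROM generate_series(1, 30) AS g;

-- The subquery picks 10 physical row locations; ties in the ordering are
-- broken arbitrarily, but LIMIT still caps the result at 10 rows.
DELETE FROM logtable
WHERE ctid IN (
    SELECT ctid
    FROM logtable
    ORDER BY "timestamp"
    LIMIT 10
);

-- 20 rows remain.
SELECT count(*) FROM logtable;
```

Because the ctid is only read and used within a single statement, the warning about it being a poor long-term identifier does not get in the way here.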
#2
30
The Postgres docs recommend using an array instead of IN with a subquery. This should be much faster:
DELETE FROM logtable
WHERE id = any (array(SELECT id FROM logtable ORDER BY timestamp LIMIT 10));
This and some other tricks can be found here
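Since the table in the question has no id column, the array form can be combined with ctid instead. This is a sketch under that assumption, not something taken from the linked material; whether it is actually faster depends on the table and the PostgreSQL version, so compare plans with EXPLAIN before relying on it.

```sql
-- Same idea as above, but keyed on ctid because logtable has no id column.
-- ARRAY(SELECT ...) collects the 10 row locations into a tid[] value,
-- and "= ANY" matches each row's ctid against that array.
DELETE FROM logtable
WHERE ctid = ANY (
    ARRAY(
        SELECT ctid
        FROM logtable
        ORDER BY "timestamp"
        LIMIT 10
    )
);
```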
#3
10
delete from logtable where log_id in (
select log_id from logtable order by timestamp limit 10);
#4
2
Assuming you want to delete ANY 10 records (without the ordering), you could do this:
DELETE FROM logtable AS t1
WHERE t1.ctid < (
    SELECT t2.ctid
    FROM logtable AS t2
    WHERE (SELECT count(*) FROM logtable t3 WHERE t3.ctid < t2.ctid) = 10
    LIMIT 1
);
For my use case, deleting 10M records, this turned out to be faster.
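If any 10 rows really will do, a simpler variant is to drop the ORDER BY from the ctid subquery shown in the first answer. This is only a sketch: it makes no claim about matching the performance reported above, and which rows are removed is left entirely to the planner.

```sql
-- Delete an arbitrary 10 rows: no ordering, so the subquery can stop
-- as soon as it has produced 10 row locations.
DELETE FROM logtable
WHERE ctid IN (
    SELECT ctid
    FROM logtable
    LIMIT 10
);
```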
#5
1
You could write a procedure that loops over the delete for individual rows; the procedure could take a parameter specifying the number of rows you want to delete. But that's a bit overkill compared to MySQL.
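A rough sketch of such a procedure in PL/pgSQL follows. The function name delete_oldest_logs and its parameter n are hypothetical, and using ctid to pick out one row per iteration is an assumption borrowed from the first answer, since the table has no key to delete by.

```sql
-- Hypothetical helper: deletes up to n of the oldest rows, one per iteration,
-- and returns how many rows were actually removed.
CREATE OR REPLACE FUNCTION delete_oldest_logs(n integer)
RETURNS integer
LANGUAGE plpgsql
AS $$
DECLARE
    deleted integer := 0;
BEGIN
    FOR i IN 1..n LOOP
        -- Remove the single oldest row; ties are broken arbitrarily.
        DELETE FROM logtable
        WHERE ctid = (
            SELECT ctid
            FROM logtable
            ORDER BY "timestamp"
            LIMIT 1
        );

        -- FOUND is false when the DELETE matched nothing (table is empty).
        EXIT WHEN NOT FOUND;
        deleted := deleted + 1;
    END LOOP;

    RETURN deleted;
END;
$$;

-- Usage: delete the 10 oldest rows.
SELECT delete_oldest_logs(10);
```

Each iteration re-sorts the table, so this is far slower than a single set-based DELETE; it mainly shows what the looping approach would look like.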