How can I quickly delete a large number of records from a SQL database?

Time: 2021-08-30 22:13:05

We have a table with about 1.5 million records. This table has a lot of FK relations to and from other tables.

The problem is that about 1 million of those records are just duplicates that have to be deleted. We tried deleting 1000 records at a time, but it is a very slow process.

What I have in mind is to temporarily copy the records that have to stay into a new table, truncate the existing table, and then copy those records back, restoring the primary key and all relations to the other tables, so that nothing looks different from the client side.
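
For reference, the copy / truncate / copy-back idea can be sketched roughly as follows. This is only an illustration using Python's built-in sqlite3 module as a stand-in for the real database. The `customer` table, its `id` and `email` columns, and the rule "keep the lowest id per email" are all assumptions made for the example. SQLite has no TRUNCATE, so a plain DELETE without a WHERE clause takes its place here. Note also that some engines (SQL Server, for example) refuse TRUNCATE on a table that is referenced by a foreign key, and that child rows pointing at removed duplicates would have to be re-pointed or removed first.

```python
import sqlite3


def rebuild_without_duplicates(conn: sqlite3.Connection) -> int:
    """Keep one row per email in the hypothetical `customer` table.

    Returns the number of rows left in the table afterwards.
    """
    cur = conn.cursor()
    # 1. Copy the rows that must stay (lowest id per email) to a scratch table.
    cur.execute(
        "CREATE TEMP TABLE customer_keep AS "
        "SELECT * FROM customer "
        "WHERE id IN (SELECT MIN(id) FROM customer GROUP BY email)"
    )
    # 2. Empty the original table (stand-in for TRUNCATE TABLE).
    cur.execute("DELETE FROM customer")
    # 3. Copy the keepers back; primary key values are preserved as-is.
    cur.execute("INSERT INTO customer SELECT * FROM customer_keep")
    cur.execute("DROP TABLE customer_keep")
    conn.commit()
    return cur.execute("SELECT COUNT(*) FROM customer").fetchone()[0]


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT)")
    demo.executemany(
        "INSERT INTO customer VALUES (?, ?)",
        [(1, "a@x.test"), (2, "a@x.test"), (3, "b@x.test")],
    )
    demo.commit()
    print("rows kept:", rebuild_without_duplicates(demo))
```

Because the surviving rows keep their original primary key values, child rows that reference them stay valid after the rows are copied back.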

Not sure if it's an efficient way or not.

If it is, I would love to see a basic implementation of it that I can follow and apply to my case. If not, I would like to see an efficient way of doing it.

Thank you

2 Solutions

#1


2  

Our company has a lot of temporary data stored in databases. When we need to delete a large amount of it, we break the work up into chunks of a few hundred rows and delete one chunk at a time. We have an application whose sole purpose in life is to run a few queries like this over and over again:

with topFew as (select top (100) * from table_name) delete from topFew;

I suggest you whip up something simple like this, and just let it run for a few hours. Go work on something else while it's processing.
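
A minimal sketch of such a "run it over and over" job is shown below, using Python's built-in sqlite3 module as a stand-in for the real database. The `event` table and its `is_duplicate` flag are made up for the example, and SQLite's `LIMIT` plays the role of `TOP` here. The loop commits after every batch and stops once a batch deletes nothing.

```python
import sqlite3


def delete_in_batches(conn: sqlite3.Connection, batch_size: int = 100) -> int:
    """Delete flagged rows from the hypothetical `event` table in small batches.

    Returns the total number of rows deleted.
    """
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM event WHERE rowid IN ("
            "  SELECT rowid FROM event WHERE is_duplicate = 1 LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # one short transaction per batch
        if cur.rowcount == 0:
            break  # nothing left to delete
        total += cur.rowcount
    return total


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE event (id INTEGER PRIMARY KEY, is_duplicate INTEGER)")
    demo.executemany(
        "INSERT INTO event (is_duplicate) VALUES (?)",
        [(1,)] * 250 + [(0,)] * 50,
    )
    demo.commit()
    print("deleted:", delete_in_batches(demo, batch_size=100))
```

Keeping each batch in its own transaction is the point of the design: locks are held only briefly, so other work can continue while the cleanup runs.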

#2


1  

Performance of the delete can be improved by self-joining the table on rowid. It can be optimized further by using BULK COLLECT and FORALL:

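     DECLARE
       limit_in PLS_INTEGER := 10000;  -- batch size; can be changed based on performance

       -- Every row except the one with the lowest rowid in each duplicate group.
       -- primary_key stands for the column(s) that identify a duplicate.
       CURSOR c1 IS
         SELECT a.rowid
           FROM table_name a
          WHERE a.rowid > (SELECT MIN(b.rowid)
                             FROM table_name b
                            WHERE b.primary_key = a.primary_key);

       TYPE c1_rec IS TABLE OF ROWID INDEX BY PLS_INTEGER;
       c1_record c1_rec;
     BEGIN
       OPEN c1;
       LOOP
         FETCH c1 BULK COLLECT INTO c1_record LIMIT limit_in;
         EXIT WHEN c1_record.COUNT = 0;  -- stop once nothing is left to fetch

         FORALL indx IN 1 .. c1_record.COUNT
           DELETE FROM table_name WHERE rowid = c1_record(indx);

         COMMIT;  -- commit once per batch
       END LOOP;
       CLOSE c1;
     END;
     /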
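
The same keep-the-lowest-rowid idea can be tried out quickly in Python with the built-in sqlite3 module, which also exposes a `rowid`. This is only a sketch under assumptions: the `item` table and its `dup_key` column are hypothetical, `fetchall` stands in for BULK COLLECT (it reads all doomed rowids up front rather than in LIMIT-sized chunks), and `executemany` stands in for FORALL.

```python
import sqlite3


def delete_duplicates(conn: sqlite3.Connection, batch_size: int = 10000) -> int:
    """Delete every row of the hypothetical `item` table whose rowid is not
    the lowest one for its dup_key. Returns the number of rows deleted.
    """
    # Self-join on the duplicate key: a row is doomed if some row with the
    # same dup_key has a lower rowid.
    doomed = conn.execute(
        "SELECT a.rowid FROM item a "
        "WHERE a.rowid > (SELECT MIN(b.rowid) FROM item b "
        "                 WHERE b.dup_key = a.dup_key)"
    ).fetchall()

    deleted = 0
    for start in range(0, len(doomed), batch_size):
        batch = doomed[start:start + batch_size]
        conn.executemany("DELETE FROM item WHERE rowid = ?", batch)
        conn.commit()  # commit once per batch
        deleted += len(batch)
    return deleted


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE item (dup_key TEXT)")
    demo.executemany(
        "INSERT INTO item VALUES (?)",
        [("a",), ("a",), ("b",)],
    )
    demo.commit()
    print("deleted:", delete_duplicates(demo))
```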

The table being deleted from has child tables, so the delete will raise a constraint violation.

So before executing the code above, it is better to change the foreign key constraint to ON DELETE CASCADE. An existing constraint cannot be modified to add ON DELETE CASCADE, so the foreign key has to be dropped and recreated with it:

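    -- Drop the existing foreign key, then recreate it with ON DELETE CASCADE.
    ALTER TABLE child_table DROP CONSTRAINT fk_name;

    ALTER TABLE child_table
      ADD CONSTRAINT fk_name
      FOREIGN KEY (C1)
      REFERENCES parent_table (C2) ON DELETE CASCADE;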

ON DELETE CASCADE will then clean up the child tables as well.
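
To see what the cascade does before applying it to real data, here is a small sketch with Python's built-in sqlite3 module. It reuses the `parent_table` / `child_table` / `C1` / `C2` names from the snippet above. The `id` column, the sample rows, and the `cascade_demo` function name are invented for the example. Keep in mind that a cascade deletes the child rows that pointed at the removed parent rows; it does not re-point them at the surviving duplicate.

```python
import sqlite3


def cascade_demo() -> int:
    """Delete one parent row and return how many child rows are left."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE parent_table (c2 INTEGER PRIMARY KEY);
        CREATE TABLE child_table (
            id INTEGER PRIMARY KEY,
            c1 INTEGER REFERENCES parent_table (c2) ON DELETE CASCADE
        );
        INSERT INTO parent_table VALUES (1), (2);
        INSERT INTO child_table (c1) VALUES (1), (1), (2);
        """
    )
    # Removing parent 1 also removes the two child rows that reference it.
    conn.execute("DELETE FROM parent_table WHERE c2 = 1")
    conn.commit()
    remaining = conn.execute("SELECT COUNT(*) FROM child_table").fetchone()[0]
    conn.close()
    return remaining


if __name__ == "__main__":
    print("child rows left:", cascade_demo())
```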
