Database: to delete or not to delete records

Time: 2022-05-05 22:12:15

I don't think I am the only person wondering about this. What is your usual practice when it comes to deleting data? Do you prefer to physically delete a record from the database? Or is it better to just mark the record with a "deleted" flag, or a boolean column denoting whether the record is active or inactive?

8 solutions

#1


44  

It definitely depends on the actual content of your database. If you're using it to store session information, then by all means wipe it immediately when the session expires (or is closed). You don't want that garbage lying around, as it cannot really be used again for any practical purpose.

Basically, what you need to ask yourself is: might I need to restore this information? Deleted questions on SO, for example, should definitely just be marked 'deleted', as we're actively allowing an undelete. We also have the option to display them to select users, without much extra work.

If you're not actively seeking to fully restore the data, but you'd still like to keep it around for monitoring (or similar) purposes, I would suggest that you figure out (to the extent possible, of course) an aggregation scheme, and shove the aggregated data off to another table. This will keep your primary table clean of 'deleted' data, as well as keep your secondary table optimized for monitoring purposes (or whatever you had in mind).
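As a rough illustration of the aggregation idea, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` and `deleted_order_stats` tables, their columns, and the `delete_with_rollup` helper are all made up for this example; the point is only that the rows are folded into a summary table and then really deleted, in one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, day TEXT, amount_cents INTEGER);
    -- Hypothetical summary table: one row per day of deleted orders.
    CREATE TABLE deleted_order_stats (
        day TEXT PRIMARY KEY, n_orders INTEGER, total_cents INTEGER);
    INSERT INTO orders VALUES (1, '2022-05-01', 500), (2, '2022-05-01', 700),
                              (3, '2022-05-02', 900);
""")

def delete_with_rollup(conn, order_ids):
    """Fold the given orders into the per-day summary, then hard-delete them."""
    if not order_ids:
        return
    marks = ",".join("?" for _ in order_ids)
    with conn:  # one transaction: the rollup and the delete commit together
        rows = conn.execute(
            f"SELECT day, COUNT(*), SUM(amount_cents) FROM orders "
            f"WHERE id IN ({marks}) GROUP BY day",
            order_ids,
        ).fetchall()
        for day, n_orders, total_cents in rows:
            # Make sure the day has a summary row, then add to it.
            conn.execute(
                "INSERT OR IGNORE INTO deleted_order_stats VALUES (?, 0, 0)",
                (day,),
            )
            conn.execute(
                "UPDATE deleted_order_stats "
                "SET n_orders = n_orders + ?, total_cents = total_cents + ? "
                "WHERE day = ?",
                (n_orders, total_cents, day),
            )
        conn.execute(f"DELETE FROM orders WHERE id IN ({marks})", order_ids)

delete_with_rollup(conn, [1, 2])
stats = conn.execute(
    "SELECT day, n_orders, total_cents FROM deleted_order_stats").fetchall()
remaining = conn.execute("SELECT id FROM orders").fetchall()
```

The primary table ends up holding only live rows, while the summary table keeps just enough to answer monitoring questions. What to aggregate depends entirely on what you want to monitor.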

For temporal data, see: http://talentedmonkeys.wordpress.com/2010/05/15/temporal-data-in-a-relational-database/


#2


23  

Pros of using a delete flag:


  1. You can get the data back later if you need it.
  2. The delete operation (updating the flag) is probably quicker than really deleting the row.

Cons of using a delete flag:


  1. It is very easy to miss AND DeletedFlag = 'N' somewhere in your SQL.
  2. It is slower for the database to find the rows that you are interested in amongst all the crap.
  3. Eventually, you'll probably want to really delete it anyway (assuming your system is successful). What about when that record is 10 years old and was "deleted" 4 minutes after it was originally created?
  4. It can make it impossible to use a natural key. You may have one or more deleted rows with the natural key and a real row wanting to use that same natural key.
  5. There may be legal/compliance reasons why you are meant to actually delete data.
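The natural-key problem can sometimes be softened with a partial unique index, which enforces uniqueness only among rows that are not flagged. This is one possible mitigation, not something the answer itself proposes. The sketch below uses Python's sqlite3 module (SQLite 3.8.0 or later; PostgreSQL supports partial unique indexes too), and the `users` table, its `email` natural key, and the `deleted` column are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id      INTEGER PRIMARY KEY,
        email   TEXT NOT NULL,               -- the natural key in this example
        deleted INTEGER NOT NULL DEFAULT 0   -- 0 = active, 1 = flagged deleted
    );
    -- Uniqueness is enforced only among rows that are still active.
    CREATE UNIQUE INDEX users_email_active ON users (email) WHERE deleted = 0;
""")

conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
conn.execute("UPDATE users SET deleted = 1 WHERE email = 'a@example.com'")

# The key is free again, so a new active row may reuse it.
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")

# A second *active* row with the same key is still rejected.
try:
    conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The catch is that undeleting the old row will now fail while the new row holds the key, so the conflict is moved to undelete time rather than removed.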

#3


18  

As a complement to all posts...


If you plan to mark the record, it's good to consider making a view for active records. This would save you from writing, or forgetting, the flag in your SQL queries. You might consider a view for non-active records too, if you think that also serves a purpose.
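A minimal sketch of the view idea, using Python's sqlite3 module. The table name `posts_all`, the two view names, and the `deleted_flag` column with 'Y'/'N' values are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts_all (
        id           INTEGER PRIMARY KEY,
        title        TEXT NOT NULL,
        deleted_flag TEXT NOT NULL DEFAULT 'N'
    );
    -- Queries against these views never have to mention the flag.
    CREATE VIEW posts AS
        SELECT id, title FROM posts_all WHERE deleted_flag = 'N';
    CREATE VIEW posts_deleted AS
        SELECT id, title FROM posts_all WHERE deleted_flag = 'Y';

    INSERT INTO posts_all (id, title) VALUES (1, 'first'), (2, 'second');
    UPDATE posts_all SET deleted_flag = 'Y' WHERE id = 2;  -- soft delete
""")

active = [row[0] for row in conn.execute("SELECT title FROM posts ORDER BY id")]
trashed = [row[0] for row in conn.execute("SELECT title FROM posts_deleted ORDER BY id")]
```

Application code then reads from `posts` and only touches `posts_all` when it deletes or undeletes something.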

#4


10  

I am glad to have found this thread. I too was wondering what people thought about this issue. I have implemented the 'marked as deleted' approach for about 15 years on many systems. Whenever a user called to say something was accidentally deleted, it was certainly a lot easier to mark it un-deleted than to recreate it or restore it from a backup.

We are using PostgreSQL and Ruby on Rails. It looks like we could do this in one of two ways: modify Rails, or add an on-delete trigger that instead runs a PL/pgSQL function to mark the row as deleted. I am leaning toward the latter.
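To show the shape of the trigger approach, here is a sketch of the same idea in SQLite rather than PostgreSQL, so that it runs from Python's standard library: an INSTEAD OF DELETE trigger on a view turns every DELETE into a flag update. It is an analogue, not the PL/pgSQL function described above, and the `items_all` table, `items` view, and `deleted_at` column are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items_all (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        deleted_at TEXT              -- NULL means the row is live
    );
    CREATE VIEW items AS
        SELECT id, name FROM items_all WHERE deleted_at IS NULL;

    -- DELETE against the view marks the underlying row instead of removing it.
    CREATE TRIGGER items_soft_delete INSTEAD OF DELETE ON items
    BEGIN
        UPDATE items_all SET deleted_at = datetime('now') WHERE id = OLD.id;
    END;

    INSERT INTO items_all (id, name) VALUES (1, 'keep'), (2, 'oops');
""")

# The application issues an ordinary DELETE.
conn.execute("DELETE FROM items WHERE id = 2")
visible_after_delete = conn.execute("SELECT id FROM items").fetchall()
stored_rows = conn.execute("SELECT COUNT(*) FROM items_all").fetchone()[0]

# Undelete is a one-line update on the base table.
conn.execute("UPDATE items_all SET deleted_at = NULL WHERE id = 2")
visible_after_undelete = conn.execute("SELECT id FROM items ORDER BY id").fetchall()
```

The appeal of doing this in the database is that application code keeps issuing plain DELETE statements and cannot forget to soft-delete.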

As for performance hits, it will be interesting to see the results of EXPLAIN ANALYZE on large tables with few deleted items as well as with many deleted items.

In systems used over a long time, I have found that new users tend to do silly things like deleting things accidentally. When people are new in a position, they have all the access rights of the person previously in that position, but with zero experience. Being able to quickly recover from an accidental deletion gets everyone back to work quickly.

But as someone said, sometimes you may need that particular key back for some reason. At that point you would need to really delete the row and then re-create the record (or undelete it and modify the record).

#5


6  

There are also legal issues either way if personal data is involved. I think it greatly depends on where you are (or where the database is), and what the terms of use are.


In some cases people can ask to be removed from your system, in which case a hard delete is needed (or at least clearing out all of the personal information).
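The "clearing out all of the personal information" variant might look like the sketch below, written with Python's sqlite3 module. The `customers` and `orders` tables, the `erased` column, and the `erase_person` helper are assumptions for illustration. Whether scrubbing the row instead of deleting it is enough in your situation is exactly the kind of question to take to legal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id     INTEGER PRIMARY KEY,
        name   TEXT,
        email  TEXT,
        erased INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE orders (
        id           INTEGER PRIMARY KEY,
        customer_id  INTEGER REFERENCES customers(id),
        amount_cents INTEGER
    );
    INSERT INTO customers (id, name, email) VALUES (1, 'Ada', 'ada@example.com');
    INSERT INTO orders VALUES (10, 1, 2500);
""")

def erase_person(conn, customer_id):
    """Blank the personal columns but keep the row so related records still join."""
    with conn:
        conn.execute(
            "UPDATE customers SET name = NULL, email = NULL, erased = 1 "
            "WHERE id = ?",
            (customer_id,),
        )

erase_person(conn, 1)
scrubbed = conn.execute(
    "SELECT name, email, erased FROM customers WHERE id = 1").fetchone()
orders_kept = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE customer_id = 1").fetchone()[0]
```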


I would check with your legal department before you adopt a strategy either way if personal information is involved.


#6


5  

I mark them as deleted, and don't really delete. However, every once in a while I sweep out all the junk and archive it, so it doesn't kill performance.

#7


2  

If you are concerned about "dormant" records slowing down your database access, you may want to move those rows into another table acting as an "archive" table.
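A periodic sweep into an archive table could be sketched as follows with Python's sqlite3 module. The `notes` and `notes_archive` tables and the `archive_deleted` helper are hypothetical; the copy and the delete run in one transaction so a row is never lost or duplicated halfway through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE notes (
        id      INTEGER PRIMARY KEY,
        body    TEXT,
        deleted INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE notes_archive (id INTEGER PRIMARY KEY, body TEXT);
    INSERT INTO notes (id, body, deleted) VALUES
        (1, 'live', 0), (2, 'old draft', 1), (3, 'spam', 1);
""")

def archive_deleted(conn):
    """Move all flagged rows into the archive table; return how many moved."""
    with conn:  # commit both statements together, or roll both back
        conn.execute(
            "INSERT INTO notes_archive (id, body) "
            "SELECT id, body FROM notes WHERE deleted = 1"
        )
        cur = conn.execute("DELETE FROM notes WHERE deleted = 1")
    return cur.rowcount

moved = archive_deleted(conn)
live_ids = conn.execute("SELECT id FROM notes").fetchall()
archived_ids = conn.execute("SELECT id FROM notes_archive ORDER BY id").fetchall()
```

Restoring a row is then a copy in the other direction, which is slower than flipping a flag but keeps the main table small.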


#8


1  

For user-entered/managed data I've used the flag method you describe and given the user an "empty the trash" interface to actually delete items if they choose to.
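A sketch of what an "empty the trash" action might do behind the interface, again with Python's sqlite3 module. The `files` table, the `owner` and `in_trash` columns, and the `empty_trash` helper are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (
        id       INTEGER PRIMARY KEY,
        owner    TEXT NOT NULL,
        name     TEXT NOT NULL,
        in_trash INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO files (id, owner, name, in_trash) VALUES
        (1, 'alice', 'report.txt', 0),
        (2, 'alice', 'draft.txt',  1),
        (3, 'bob',   'notes.txt',  1);
""")

def empty_trash(conn, owner):
    """Hard-delete only this owner's flagged rows; return the number removed."""
    with conn:
        cur = conn.execute(
            "DELETE FROM files WHERE owner = ? AND in_trash = 1", (owner,)
        )
    return cur.rowcount

removed = empty_trash(conn, "alice")
left = conn.execute("SELECT id FROM files ORDER BY id").fetchall()
```

Bob's trashed file is untouched, since the real delete is scoped to the user who asked for it.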

