How do I insert multiple rows when some of them may duplicate rows that already exist?

So I have a checkbox form where users can select multiple values. They can then go back and select different values. Each value is stored as a row (UserID,value).

How do you do that INSERT when some rows might be duplicates of an already-existing row in the table?

Should I first delete the existing values and then INSERT the new values?

ON DUPLICATE KEY UPDATE seems tricky since I would be INSERTing multiple rows at once, so how would I define and separate just the ones that need UPDATING vs. the ones that need INSERTING?

For example, let's say a user makes his first-time selection:

INSERT INTO 
   Choices(UserID,value) 
VALUES 
   ('1','banana'),('1','apple'),('1','orange'),('1','cranberry'),('1','lemon')

What if the user goes back later and makes different choices which include SOME of the values in his original query which will thus cause duplicates?

How should I handle that best?

3 Solutions

#1

In my opinion, simply deleting the existing choices and then inserting the new ones is the best way to go. It may not be the most efficient overall, but it is simple to code and thus has a much better chance of being correct.
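
A minimal sketch of the delete-then-insert approach, for illustration only. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, and it assumes a composite primary key on (UserID, value), which the question does not state. The helper name `replace_choices` is hypothetical. Wrapping both statements in one transaction means a failed insert does not leave the user with no choices at all.

```python
import sqlite3

def replace_choices(conn, user_id, values):
    """Replace every stored choice for one user with the given values."""
    # "with conn" runs both statements in a single transaction:
    # it commits on success and rolls back if either statement raises.
    with conn:
        conn.execute("DELETE FROM Choices WHERE UserID = ?", (user_id,))
        conn.executemany(
            "INSERT INTO Choices (UserID, value) VALUES (?, ?)",
            [(user_id, v) for v in set(values)],  # set() drops repeats in the input
        )

conn = sqlite3.connect(":memory:")
# Assumed schema: one row per (UserID, value) pair.
conn.execute(
    "CREATE TABLE Choices (UserID TEXT, value TEXT, PRIMARY KEY (UserID, value))"
)

replace_choices(conn, "1", ["banana", "apple", "orange", "cranberry", "lemon"])
replace_choices(conn, "1", ["banana", "lemon", "kiwi"])  # user changes their mind

rows = sorted(
    r[0] for r in conn.execute("SELECT value FROM Choices WHERE UserID = ?", ("1",))
)
print(rows)  # ['banana', 'kiwi', 'lemon']
```

The same two statements should carry over to MySQL through any driver that supports transactions, but that is untested here.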

Otherwise it is necessary to find the intersection of the new choices and old choices. Then either delete the obsolete ones or change them to the new choices (and then insert/delete depending on if the new set of choices is bigger or smaller than the original set). The added risk of the extra complexity does not seem worth it.

Edit: As @Andrew points out in the comments, deleting the originals en masse may not be a good plan if these records happened to be "parent" records in a referential integrity definition. My thinking was that this seemed like an unlikely situation based on the OP's description. But it is definitely worth consideration.

#2

It's not clear to me when you would ever need to update a record in the database in your case.

It sounds like you need to maintain a set of choices per user, which the user may on occasion change. Therefore, each time the user provides a new set of choices, any prior set of choices should be discarded. So you would delete all old records, then insert any new ones.

You might consider carrying out a comparison of the prior and new choices - either in the server or client code - in order to calculate the minimum set of deletes and/or inserts needed to reduce database writes. But that smells like premature optimisation.
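
For what it's worth, the comparison itself is only a couple of set differences. The sketch below is plain Python with a hypothetical helper name, `diff_choices`; it only works out which values to delete and which to insert, and leaves the actual database writes to the caller.

```python
def diff_choices(old, new):
    """Return (to_delete, to_insert) needed to turn the old choices into the new ones."""
    old_set, new_set = set(old), set(new)
    to_delete = old_set - new_set  # stored before, no longer selected
    to_insert = new_set - old_set  # newly selected, not stored yet
    return to_delete, to_insert

old = ["banana", "apple", "orange", "cranberry", "lemon"]
new = ["banana", "lemon", "kiwi"]

to_delete, to_insert = diff_choices(old, new)
print(sorted(to_delete))  # ['apple', 'cranberry', 'orange']
print(sorted(to_insert))  # ['kiwi']
```

Values present in both sets ('banana' and 'lemon' here) are never touched, which is where the saving in writes comes from.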

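Putting all that to one side, if you want a re-insert to be ignored then you should use INSERT IGNORE. Rows that already exist will be quietly skipped and new ones will be inserted. This only works if the table has a primary or unique key covering (UserID, value); without one, nothing counts as a duplicate.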
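
A rough illustration of that behaviour, again using Python's sqlite3 as a stand-in. Note the assumptions: SQLite spells the statement `INSERT OR IGNORE`, whereas MySQL uses `INSERT IGNORE`, and the composite primary key and the helper name `add_choices` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: the primary key is what makes a repeated pair a "duplicate".
conn.execute(
    "CREATE TABLE Choices (UserID TEXT, value TEXT, PRIMARY KEY (UserID, value))"
)

def add_choices(conn, user_id, values):
    """Insert the given choices, silently skipping pairs that already exist."""
    with conn:
        conn.executemany(
            # SQLite syntax; the MySQL equivalent is INSERT IGNORE INTO ...
            "INSERT OR IGNORE INTO Choices (UserID, value) VALUES (?, ?)",
            [(user_id, v) for v in values],
        )

add_choices(conn, "1", ["banana", "apple"])
add_choices(conn, "1", ["banana", "kiwi"])  # 'banana' is skipped, 'kiwi' is added

stored = sorted(r[0] for r in conn.execute("SELECT value FROM Choices"))
print(stored)  # ['apple', 'banana', 'kiwi']
```

Notice that 'apple' is still stored after the second call. Ignoring duplicates only handles additions, so choices the user has deselected still have to be deleted separately.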

#3

I don't know much about MySQL, but in MS SQL 2000+ we can execute a stored procedure with XML as one of its parameters. This XML would contain a list of identity-value pairs. We would open this XML as a table using OPENXML and figure out which rows need to be deleted or inserted using a left or right outer join. As of SQL 2008 (I think) we have a new MERGE statement that lets us perform delete, update and insert row operations in one statement on ONE table. This way we can take advantage of set-based operations in SQL instead of looping through arrays in the application code.

You can also keep your select list retrieved from the database in session and compare the "old list" to the "newly selected list" in your application code. You would need to figure out which rows need to be deleted or added. You probably don't need to worry about updates because you are probably only keeping foreign keys in this table and the descriptions are in some kind of a reference table.

There is another way in SQL 2008 that involves using user defined data-types as custom tables but I don't know much about it.

Personally, I prefer the XML route because you just send the end-state into the sp and your sp automatically figures out which rows need to be deleted or inserted.

Hope this helps.
