在SQL服务器上插入或更新的解决方案

Assume a table structure of MyTable(KEY, datafield1, datafield2...).

假设MyTable的表结构(KEY、datafield1、datafield2…)。

Often I want to either update an existing record, or insert a new record if it doesn't exist.

通常，我要么更新一个现有记录，要么插入一个不存在的新记录。

Essentially:

从本质上讲:

IF (key exists)
  run update command
ELSE
  run insert command

What's the best performing way to write this?

写这个的最佳方式是什么?

21 个解决方案

#1

302

don't forget about transactions. Performance is good, but simple (IF EXISTS..) approach is very dangerous.
When multiple threads will try to perform Insert-or-update you can easily get primary key violation.

不要忘记事务。性能是好的，但是简单的方法(如果存在的话)是非常危险的。当多个线程尝试执行插入或更新时，您可以很容易地获得主键违背。

Solutions provided by @Beau Crawford & @Esteban show general idea but error-prone.

@Beau Crawford和@Esteban提供的解决方案显示了总体思路，但也容易出错。

To avoid deadlocks and PK violations you can use something like this:

为了避免死锁和PK违规，您可以使用以下方法:

begin tran
if exists (select * from table with (updlock,serializable) where key = @key)
begin
   update table set ...
   where key = @key
end
else
begin
   insert into table (key, ...)
   values (@key, ...)
end
commit tran

或

begin tran
   update table with (serializable) set ...
   where key = @key

   if @@rowcount = 0
   begin
      insert into table (key, ...) values (@key,..)
   end
commit tran

#2

349

See my detailed answer to a very similar previous question

请看我对一个非常相似的问题的详细回答

@Beau Crawford's is a good way in SQL 2005 and below, though if you're granting rep it should go to the first guy to SO it. The only problem is that for inserts it's still two IO operations.

@Beau Crawford在SQL 2005和下面是一个很好的方法，但是如果您授予代表，那么它应该属于第一个人。唯一的问题是，对于insert，它仍然是两个IO操作。

MS Sql2008 introduces merge from the SQL:2003 standard:

MS Sql2008引入了SQL:2003标准中的merge:

merge tablename with(HOLDLOCK) as target
using (values ('new value', 'different value'))
    as source (field1, field2)
    on target.idfield = 7
when matched then
    update
    set field1 = source.field1,
        field2 = source.field2,
        ...
when not matched then
    insert ( idfield, field1, field2, ... )
    values ( 7,  source.field1, source.field2, ... )

Now it's really just one IO operation, but awful code :-(

现在它只是一个IO操作，但是糟糕的代码:

#3

139

Do an UPSERT:

做一个插入:

UPDATE MyTable SET FieldA=@FieldA WHERE Key=@Key

IF @@ROWCOUNT = 0
   INSERT INTO MyTable (FieldA) VALUES (@FieldA)

http://en.wikipedia.org/wiki/Upsert

#4

Many people will suggest you use MERGE, but I caution you against it. By default, it doesn't protect you from concurrency and race conditions any more than multiple statements, but it does introduce other dangers:

许多人会建议你使用合并，但我提醒你不要这样做。默认情况下，它不会保护您不受并发和竞态条件的影响，但它确实引入了其他危险:

http://www.mssqltips.com/sqlservertip/3074/use-caution-with-sql-servers-merge-statement/

Even with this "simpler" syntax available, I still prefer this approach (error handling omitted for brevity):

即使有了这种“更简单”的语法，我还是更喜欢这种方法(为了简洁起见，省略了错误处理):

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
UPDATE dbo.table SET ... WHERE PK = @PK;
IF @@ROWCOUNT = 0
BEGIN
  INSERT dbo.table(PK, ...) SELECT @PK, ...;
END
COMMIT TRANSACTION;

A lot of folks will suggest this way:

很多人会这样建议:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
IF EXISTS (SELECT 1 FROM dbo.table WHERE PK = @PK)
BEGIN
  UPDATE ...
END
ELSE
  INSERT ...
END
COMMIT TRANSACTION;

But all this accomplishes is ensuring you may need to read the table twice to locate the row(s) to be updated. In the first sample, you will only ever need to locate the row(s) once. (In both cases, if no rows are found from the initial read, an insert occurs.)

但是，所有这一切都是为了确保您可能需要对表进行两次读取，才能找到要更新的行。在第一个示例中，您只需要定位行一次。(在这两种情况下，如果在初始读取中没有找到行，就会发生插入。)

Others will suggest this way:

其他人会这样建议:

BEGIN TRY
  INSERT ...
END TRY
BEGIN CATCH
  IF ERROR_NUMBER() = 2627
    UPDATE ...
END CATCH

However, this is problematic if for no other reason than letting SQL Server catch exceptions that you could have prevented in the first place is much more expensive, except in the rare scenario where almost every insert fails. I prove as much here:

然而，如果除了让SQL Server捕获您本来可以避免的异常之外，没有其他原因，这就会产生问题，除非是在几乎所有插入都失败的罕见场景中。我在这里证明了很多:

http://www.mssqltips.com/sqlservertip/2632/checking-for-potential-constraint-violations-before-entering-sql-server-try-and-catch-logic/
http://www.mssqltips.com/sqlservertip/2632/checking-for-potential-constraint-violations-before-entering-sql-server-try-and-catch-logic/
http://www.sqlperformance.com/2012/08/t-sql-queries/error-handling
http://www.sqlperformance.com/2012/08/t-sql-queries/error-handling

#5

IF EXISTS (SELECT * FROM [Table] WHERE ID = rowID)
UPDATE [Table] SET propertyOne = propOne, property2 . . .
ELSE
INSERT INTO [Table] (propOne, propTwo . . .)

Edit:

编辑:

Alas, even to my own detriment, I must admit the solutions that do this without a select seem to be better since they accomplish the task with one less step.

唉，即使对我自己不利，我也必须承认，在没有选择的情况下完成任务的解决方案似乎更好，因为它们完成任务的步骤更少。

#6

If you want to UPSERT more than one record at a time you can use the ANSI SQL:2003 DML statement MERGE.

如果希望每次更新多个记录，可以使用ANSI SQL:2003 DML语句合并。

MERGE INTO table_name WITH (HOLDLOCK) USING table_name ON (condition)
WHEN MATCHED THEN UPDATE SET column1 = value1 [, column2 = value2 ...]
WHEN NOT MATCHED THEN INSERT (column1 [, column2 ...]) VALUES (value1 [, value2 ...])

Check out Mimicking MERGE Statement in SQL Server 2005.

查看SQL Server 2005中的模拟合并语句。

#7

Although its pretty late to comment on this I want to add a more complete example using MERGE.

虽然现在对此发表评论已经很晚了，但我想使用MERGE添加一个更完整的示例。

Such Insert+Update statements are usually called "Upsert" statements and can be implemented using MERGE in SQL Server.

这样的Insert+Update语句通常被称为“Upsert”语句，可以在SQL Server中使用MERGE来实现。

A very good example is given here: http://weblogs.sqlteam.com/dang/archive/2009/01/31/UPSERT-Race-Condition-With-MERGE.aspx

这里有一个很好的例子:http://weblogs.sqlteam.com/dang/archive/2009/01/31/upsert - race - conditionwith-merge.aspx。

The above explains locking and concurrency scenarios as well.

上面也解释了锁和并发场景。

I will be quoting the same for reference:

我将引用同样的引用:

ALTER PROCEDURE dbo.Merge_Foo2
      @ID int
AS

SET NOCOUNT, XACT_ABORT ON;

MERGE dbo.Foo2 WITH (HOLDLOCK) AS f
USING (SELECT @ID AS ID) AS new_foo
      ON f.ID = new_foo.ID
WHEN MATCHED THEN
    UPDATE
            SET f.UpdateSpid = @@SPID,
            UpdateTime = SYSDATETIME()
WHEN NOT MATCHED THEN
    INSERT
      (
            ID,
            InsertSpid,
            InsertTime
      )
    VALUES
      (
            new_foo.ID,
            @@SPID,
            SYSDATETIME()
      );

RETURN @@ERROR;

#8

/*
CREATE TABLE ApplicationsDesSocietes (
   id                   INT IDENTITY(0,1)    NOT NULL,
   applicationId        INT                  NOT NULL,
   societeId            INT                  NOT NULL,
   suppression          BIT                  NULL,
   CONSTRAINT PK_APPLICATIONSDESSOCIETES PRIMARY KEY (id)
)
GO
--*/

DECLARE @applicationId INT = 81, @societeId INT = 43, @suppression BIT = 0

MERGE dbo.ApplicationsDesSocietes WITH (HOLDLOCK) AS target
--set the SOURCE table one row
USING (VALUES (@applicationId, @societeId, @suppression))
    AS source (applicationId, societeId, suppression)
    --here goes the ON join condition
    ON target.applicationId = source.applicationId and target.societeId = source.societeId
WHEN MATCHED THEN
    UPDATE
    --place your list of SET here
    SET target.suppression = source.suppression
WHEN NOT MATCHED THEN
    --insert a new line with the SOURCE table one row
    INSERT (applicationId, societeId, suppression)
    VALUES (source.applicationId, source.societeId, source.suppression);
GO

Replace table and field names by whatever you need. Take care of the using ON condition. Then set the appropriate value (and type) for the variables on the DECLARE line.

用任何需要的东西替换表和字段名。注意使用条件。然后为声明行上的变量设置适当的值(和类型)。

Cheers.

欢呼。

#9

You can use MERGE Statement, This statement is used to insert data if not exist or update if does exist.

可以使用MERGE语句，如果不存在则使用此语句插入数据，如果存在则使用update语句。

MERGE INTO Employee AS e
using EmployeeUpdate AS eu
ON e.EmployeeID = eu.EmployeeID`

#10

In SQL Server 2008 you can use the MERGE statement

在SQL Server 2008中，可以使用MERGE语句。

#11

If going the UPDATE if-no-rows-updated then INSERT route, consider doing the INSERT first to prevent a race condition (assuming no intervening DELETE)

如果执行UPDATE If -no-rows- updates然后插入路由，考虑先进行插入以防止竞争条件(假设没有插入删除)

INSERT INTO MyTable (Key, FieldA)
   SELECT @Key, @FieldA
   WHERE NOT EXISTS
   (
       SELECT *
       FROM  MyTable
       WHERE Key = @Key
   )
IF @@ROWCOUNT = 0
BEGIN
   UPDATE MyTable
   SET FieldA=@FieldA
   WHERE Key=@Key
   IF @@ROWCOUNT = 0
   ... record was deleted, consider looping to re-run the INSERT, or RAISERROR ...
END

Apart from avoiding a race condition, if in most cases the record will already exist then this will cause the INSERT to fail, wasting CPU.

除了避免竞争条件之外，如果在大多数情况下记录已经存在，那么这将导致插入失败，浪费CPU。

Using MERGE probably preferable for SQL2008 onwards.

对于SQL2008以后，最好使用MERGE。

#12

MS SQL Server 2008 introduces the MERGE statement, which I believe is part of the SQL:2003 standard. As many have shown it is not a big deal to handle one row cases, but when dealing with large datasets, one needs a cursor, with all the performance problems that come along. The MERGE statement will be much welcomed addition when dealing with large datasets.

MS SQL Server 2008引入了MERGE语句，我认为这是SQL:2003标准的一部分。正如许多人所指出的，处理一个行案例并不是什么大问题，但是在处理大型数据集时，需要一个游标，以及随之而来的所有性能问题。当处理大型数据集时，合并语句将是非常受欢迎的添加。

#13

Before everyone jumps to HOLDLOCK-s out of fear from these nafarious users running your sprocs directly :-) let me point out that you have to guarantee uniqueness of new PK-s by design (identity keys, sequence generators in Oracle, unique indexes for external ID-s, queries covered by indexes). That's the alpha and omega of the issue. If you don't have that, no HOLDLOCK-s of the universe are going to save you and if you do have that then you don't need anything beyond UPDLOCK on the first select (or to use update first).

在所有人都因为害怕这些不可靠的用户直接运行您的sprocs而跳转到holdlock之前:-)让我指出，您必须通过设计保证新的PK-s的唯一性(标识键、Oracle中的序列生成器、外部ID-s的惟一索引、索引覆盖的查询)。这就是问题的和。如果你没有的话，宇宙中没有锁将会拯救你，如果你有的话，那么你就不需要在第一个选择(或者使用update first)上使用UPDLOCK之外的任何东西了。

Sprocs normally run under very controlled conditions and with the assumption of a trusted caller (mid tier). Meaning that if a simple upsert pattern (update+insert or merge) ever sees duplicate PK that means a bug in your mid-tier or table design and it's good that SQL will yell a fault in such case and reject the record. Placing a HOLDLOCK in this case equals eating exceptions and taking in potentially faulty data, besides reducing your perf.

Sprocs通常在非常可控的条件下运行，并假设有一个可信的调用者(中间层)。这意味着，如果一个简单的upsert模式(update+insert或merge)曾经看到重复的PK，这意味着在中间层或表设计中出现了一个bug，那么在这种情况下，SQL会喊出错误并拒绝记录，这很好。在这种情况下放置一个HOLDLOCK就等于吃异常和接受潜在的错误数据，除了减少你的perf。

Having said that, Using MERGE, or UPDATE then INSERT is easier on your server and less error prone since you don't have to remember to add (UPDLOCK) to first select. Also, if you are doing inserts/updates in small batches you need to know your data in order to decide whether a transaction is appropriate or not. It it's just a collection of unrelated records then additional "enveloping" transaction will be detrimental.

话虽如此，使用MERGE或UPDATE然后INSERT在您的服务器上更容易，也更不容易出错，因为您不必记住在第一次选择时添加(UPDLOCK)。此外，如果您正在进行小批的插入/更新，您需要了解您的数据，以便确定事务是否合适。它只是一个不相关记录的集合，那么附加的“信封”事务将是有害的。

#14

Does the race conditions really matter if you first try an update followed by an insert? Lets say you have two threads that want to set a value for key key:

如果您首先尝试更新，然后进行插入，那么竞态条件真的重要吗?假设有两个线程希望为key key设置值:

Thread 1: value = 1
Thread 2: value = 2

线程1:value = 1线程2:value = 2

Example race condition scenario

示例竞态条件场景

key is not defined
关键是没有定义的
Thread 1 fails with update
线程1在更新时失败
Thread 2 fails with update
线程2在更新时失败
Exactly one of thread 1 or thread 2 succeeds with insert. E.g. thread 1
线程1或线程2中恰好有一个在插入时成功。例如,线程1
The other thread fails with insert (with error duplicate key) - thread 2.

另一个线程在insert(带有错误的重复键)- thread 2中失败。
- Result: The "first" of the two treads to insert, decides value.
- 结果:要插入的两个步骤中的“第一个”决定值。
- Wanted result: The last of the 2 threads to write data (update or insert) should decide value
- 想要的结果:写数据(更新或插入)的最后两个线程应该决定值

But; in a multithreaded environment, the OS scheduler decides on the order of the thread execution - in the above scenario, where we have this race condition, it was the OS that decided on the sequence of execution. Ie: It is wrong to say that "thread 1" or "thread 2" was "first" from a system viewpoint.

但;在多线程环境中，OS调度程序决定线程执行的顺序——在上面的场景中，我们有这个竞争条件，操作系统决定执行的顺序。Ie:从系统的角度来说，说“线程1”或“线程2”是“第一个”是错误的。

When the time of execution is so close for thread 1 and thread 2, the outcome of the race condition doesn't matter. The only requirement should be that one of the threads should define the resulting value.

当线程1和线程2的执行时间非常接近时，竞态条件的结果并不重要。唯一的要求应该是其中一个线程应该定义结果值。

For the implementation: If update followed by insert results in error "duplicate key", this should be treated as success.

对于实现:如果更新后插入结果出现错误“重复键”，则应将其视为成功。

Also, one should of course never assume that value in the database is the same as the value you wrote last.

此外，当然不应该假设数据库中的值与您上次编写的值相同。

#15

That depends on the usage pattern. One has to look at the usage big picture without getting lost in the details. For example, if the usage pattern is 99% updates after the record has been created, then the 'UPSERT' is the best solution.

这取决于使用模式。一个人必须从宏观的角度来看待它的使用，而不能迷失在细节中。例如，如果在创建记录之后，使用模式是99%的更新，那么“UPSERT”是最好的解决方案。

After the first insert (hit), it will be all single statement updates, no ifs or buts. The 'where' condition on the insert is necessary otherwise it will insert duplicates, and you don't want to deal with locking.

在第一次插入(命中)之后，它将全部是单个语句更新，没有ifs或buts。插入中的“where”条件是必需的，否则它将插入重复的内容，您不希望处理锁定。

UPDATE <tableName> SET <field>=@field WHERE key=@key;

IF @@ROWCOUNT = 0
BEGIN
   INSERT INTO <tableName> (field)
   SELECT @field
   WHERE NOT EXISTS (select * from tableName where key = @key);
END

#16

I had tried below solution and it works for me, when concurrent request for insert statement occurs.

我已经尝试了下面的解决方案，当出现对insert语句的并发请求时，它对我是有效的。

begin tran
if exists (select * from table with (updlock,serializable) where key = @key)
begin
   update table set ...
   where key = @key
end
else
begin
   insert table (key, ...)
   values (@key, ...)
end
commit tran

#17

-1

You can use this query. Work in all SQL Server editions. It's simple, and clear. But you need use 2 queries. You can use if you can't use MERGE

您可以使用这个查询。适用于所有SQL Server版本。这很简单,明了。但您需要使用两个查询。如果不能使用MERGE，则可以使用。

    BEGIN TRAN

    UPDATE table
    SET Id = @ID, Description = @Description
    WHERE Id = @Id

    INSERT INTO table(Id, Description)
    SELECT @Id, @Description
    WHERE NOT EXISTS (SELECT NULL FROM table WHERE Id = @Id)

    COMMIT TRAN

NOTE: Please explain answer negatives

注意:请解释答案否定

#18

-2

If you use ADO.NET, the DataAdapter handles this.

如果你使用ADO。NET, DataAdapter处理这个。

If you want to handle it yourself, this is the way:

如果你想自己处理，方法如下:

Make sure there is a primary key constraint on your key column.

确保键列上有主键约束。

Then you:

然后你:

Do the update
做更新
If the update fails because a record with the key already exists, do the insert. If the update does not fail, you are finished.
如果更新失败，因为已经存在键的记录，那么执行插入操作。如果更新没有失败，您就完成了。

You can also do it the other way round, i.e. do the insert first, and do the update if the insert fails. Normally the first way is better, because updates are done more often than inserts.

你也可以反过来做，比如先插入，如果插入失败就进行更新。通常情况下，第一种方法更好，因为更新通常比插入操作更频繁。

#19

-3

Doing an if exists ... else ... involves doing two requests minimum (one to check, one to take action). The following approach requires only one where the record exists, two if an insert is required:

如果存在的话……其他的……包括至少做两个请求(一个检查，一个采取行动)。下面的方法只需要记录存在的一个，如果需要插入，则需要两个:

DECLARE @RowExists bit
SET @RowExists = 0
UPDATE MyTable SET DataField1 = 'xxx', @RowExists = 1 WHERE Key = 123
IF @RowExists = 0
  INSERT INTO MyTable (Key, DataField1) VALUES (123, 'xxx')

#20

-3

I usually do what several of the other posters have said with regard to checking for it existing first and then doing whatever the correct path is. One thing you should remember when doing this is that the execution plan cached by sql could be nonoptimal for one path or the other. I believe the best way to do this is to call two different stored procedures.

我通常会按照其他几个海报上说的那样，先检查它是否存在，然后再做正确的事情。在执行此操作时，您应该记住的一件事是，sql缓存的执行计划可能对任何一条路径都不是最优的。我认为最好的方法是调用两个不同的存储过程。

FirstSP:
If Exists
   Call SecondSP (UpdateProc)
Else
   Call ThirdSP (InsertProc)

Now, I don't follow my own advice very often, so take it with a grain of salt.

现在，我不常听我自己的建议，所以带点盐吧。

#21

-6

Do a select, if you get a result, update it, if not, create it.

做一个选择，如果你得到一个结果，更新它，如果不是，创建它。

#1

302