Query to update a linked MySQL table from SQL Server

I have a MS SQL Server with a linked MySQL server. I need to partially synchronize a table between the two servers. This is done in three steps and based on a condition:

1. Delete all rows from the MySQL table that do not satisfy the condition
2. Insert all new rows in the MySQL table that satisfy the condition
3. Update all rows in the MySQL table that satisfy the condition and whose data differs between MySQL and SQL Server

Steps 1 and 2 always run without a problem, but step 3 fails whenever there is anything to update. The query fails with the following exception: The rowset was using optimistic concurrency and the value of a column has been changed after the containing row was last fetched or resynchronized.

This is the query that is executed:

update mysqlserver...subscribers
set Firstname = Voornaam,
  Middlename = Tussenvoegsel,
  Surname = Achternaam,
  email = [e-mail]
from mysqlserver...subscribers as b, tblkandidaat
where (b.kandidaatid = tblkandidaat.kandidaatid) and
  (tblkandidaat.kandidaatid in (
    select subsc.kandidaatid
    from mysqlserver...subscribers subsc inner join tblKandidaat
      on (subsc.kandidaatid=tblKandidaat.kandidaatid)
    where (subsc.list=1) and
      ((subsc.firstname COLLATE Latin1_General_CI_AI <> Voornaam)
      or (subsc.middlename COLLATE Latin1_General_CI_AI <> Tussenvoegsel)
      or (subsc.surname COLLATE Latin1_General_CI_AI <> tblKandidaat.Achternaam)
      or (subsc.email COLLATE Latin1_General_CI_AI <> tblKandidaat.[e-mail]))
  ));

Does anybody have an idea how to prevent this?

7 Solutions

#1

Try this query instead:

update b
set
   Firstname = k.Voornaam,
   Middlename = k.Tussenvoegsel,
   Surname = k.Achternaam,
   email = k.[e-mail]
from
   mysqlserver...subscribers b
   inner join tblkandidaat k on b.kandidaatid = k.kandidaatid
where
   b.list=1
   and (
      b.firstname COLLATE Latin1_General_CI_AI <> k.Voornaam
      or b.middlename COLLATE Latin1_General_CI_AI <> k.Tussenvoegsel
      or b.surname COLLATE Latin1_General_CI_AI <> k.Achternaam
      or b.email COLLATE Latin1_General_CI_AI <> k.[e-mail]
   )

- It's best practice to use ANSI joins and to keep JOIN conditions separate from WHERE conditions.
- It's more readable to use aliases for all your tables instead of repeating long table names throughout the query.
- It's best to qualify every column reference with its alias instead of leaving it bare. Not only is it a good habit that makes things clearer, it can also avoid some very nasty errors in inner-vs-outer table references.
- If performance is also an issue: linked server joins sometimes devolve to row-by-row processing in the DB data provider engine. I have found cases where breaking out part of a complex join across a linked server into a regular join followed by a cross apply hugely reduced the unneeded rows being fetched and greatly improved performance. (This was essentially doing a bookmark lookup, that is, a nonclustered index scan followed by a clustered index seek using those values.) While this may not mesh perfectly with how MySQL works, it's worth experimenting with. If you can run any kind of trace to see the actual queries being performed on the MySQL side, you might get insight into other ways to improve performance.
- Another performance-improving idea is to copy the remote data locally to a temp table and add an ActionRequired column. Then update the temp table so it looks the way it should, putting 'U', 'I', or 'D' in ActionRequired, and perform the merge/upsert across the linked server with a simple equijoin on the primary key, using ActionRequired. Careful attention to possible race conditions, where the remote database could be updated during processing, is in order.
- Beware of nulls... are all those columns you're comparing non-nullable?
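
The temp-table idea above could look roughly like the following sketch, limited to the update ('U') case. The #subs table name is made up for illustration, and nothing here is confirmed to avoid the concurrency error; it only reduces the cross-server statement to a plain join on the key.

```sql
-- Sketch only: #subs is a hypothetical local temp table.

-- 1. Pull the remote rows into a local temp table in one pass.
SELECT s.kandidaatid, s.firstname, s.middlename, s.surname, s.email,
       CAST(NULL AS char(1)) AS ActionRequired
INTO #subs
FROM mysqlserver...subscribers AS s
WHERE s.list = 1;

-- 2. Locally, overwrite the rows that differ and flag them with 'U'.
UPDATE t
SET firstname      = k.Voornaam,
    middlename     = k.Tussenvoegsel,
    surname        = k.Achternaam,
    email          = k.[e-mail],
    ActionRequired = 'U'
FROM #subs AS t
INNER JOIN tblkandidaat AS k ON k.kandidaatid = t.kandidaatid
WHERE ISNULL(t.firstname, '')  COLLATE Latin1_General_CI_AI <> ISNULL(k.Voornaam, '')
   OR ISNULL(t.middlename, '') COLLATE Latin1_General_CI_AI <> ISNULL(k.Tussenvoegsel, '')
   OR ISNULL(t.surname, '')    COLLATE Latin1_General_CI_AI <> ISNULL(k.Achternaam, '')
   OR ISNULL(t.email, '')      COLLATE Latin1_General_CI_AI <> ISNULL(k.[e-mail], '');

-- 3. Push only the flagged rows across the link, joining on the key alone.
UPDATE b
SET firstname  = t.firstname,
    middlename = t.middlename,
    surname    = t.surname,
    email      = t.email
FROM mysqlserver...subscribers AS b
INNER JOIN #subs AS t ON t.kandidaatid = b.kandidaatid
WHERE t.ActionRequired = 'U';

DROP TABLE #subs;
```

Rows flagged 'I' or 'D' would be handled with the same pattern. Between steps 1 and 3 the MySQL data can change, so the race-condition warning applies here as well.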

#2

You might try creating a second table in MySQL, inserting all changed rows from SQL Server into that empty table, and then doing step 3 between the two MySQL tables.
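
A rough sketch of that approach follows. It assumes a hypothetical, initially empty MySQL table named subscribers_staging with the same column names as subscribers, and it assumes RPC Out is enabled on the linked server so that EXEC ... AT can be used; neither is stated in the question.

```sql
-- 1. From SQL Server, copy the changed rows into the (hypothetical) staging table.
INSERT INTO mysqlserver...subscribers_staging
    (kandidaatid, firstname, middlename, surname, email)
SELECT k.kandidaatid, k.Voornaam, k.Tussenvoegsel, k.Achternaam, k.[e-mail]
FROM tblkandidaat AS k
INNER JOIN mysqlserver...subscribers AS b ON b.kandidaatid = k.kandidaatid
WHERE b.list = 1
  AND (b.firstname COLLATE Latin1_General_CI_AI <> k.Voornaam
    OR b.middlename COLLATE Latin1_General_CI_AI <> k.Tussenvoegsel
    OR b.surname COLLATE Latin1_General_CI_AI <> k.Achternaam
    OR b.email COLLATE Latin1_General_CI_AI <> k.[e-mail]);

-- 2. Let MySQL apply the changes itself; the string is MySQL syntax, not T-SQL.
EXEC ('UPDATE subscribers AS s
       INNER JOIN subscribers_staging AS t ON t.kandidaatid = s.kandidaatid
       SET s.firstname  = t.firstname,
           s.middlename = t.middlename,
           s.surname    = t.surname,
           s.email      = t.email') AT mysqlserver;

-- 3. Empty the staging table for the next run.
EXEC ('DELETE FROM subscribers_staging') AT mysqlserver;
```

The point of this design is that the update itself runs entirely inside MySQL, so SQL Server never holds an updatable rowset on the subscribers table.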

#3

Try not to use a subquery in your WHERE clause. The subquery may return more than one row, and then you get the error.

#4

Try creating a view that joins the linked tables and exposes source, destination, and has_changed columns. Then you can issue this query:

update vi_upd_linkedtable set destination=source where has_changed=1

#5

This is a shot in the dark, but try adding FOR UPDATE or LOCK IN SHARE MODE to the end of your subselect query. This will tell MySQL that you are trying to select stuff for an update within your transaction and should create a row level write lock during the select rather than during the update.

From the MySQL manual, section 13.2.8.3, "SELECT ... FOR UPDATE and SELECT ... LOCK IN SHARE MODE Locking Reads":

SELECT ... LOCK IN SHARE MODE sets a shared mode lock on the rows read. A shared mode lock enables other sessions to read the rows but not to modify them. The rows read are the latest available, so if they belong to another transaction that has not yet committed, the read blocks until that transaction ends.
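
These locking clauses are MySQL syntax, so from the SQL Server side the SELECT has to be passed through to MySQL unchanged. One way to try it is OPENQUERY, as in the sketch below; the exact SELECT is only an illustration.

```sql
-- The quoted string is sent to MySQL as-is, so MySQL-only clauses are allowed.
SELECT q.kandidaatid
FROM OPENQUERY(mysqlserver,
    'SELECT kandidaatid FROM subscribers WHERE list = 1 LOCK IN SHARE MODE') AS q;
```

Keep in mind that MySQL releases these locks when the transaction ends. With autocommit on, that is as soon as the SELECT finishes, so the lock only helps if the SELECT and the later UPDATE run in the same MySQL transaction.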

#6

For rows where the names are the same, the update is a no-op.

You don't save any work by trying to filter out the rows where they're the same, because the data still has to be compared across the link. So I can't see any benefit to the subquery.

Therefore the query can be simplified a lot:

update mysqlserver...subscribers
set Firstname = Voornaam,
  Middlename = Tussenvoegsel,
  Surname = Achternaam,
  email = [e-mail]
from mysqlserver...subscribers as b join tblkandidaat
  on b.kandidaatid = tblkandidaat.kandidaatid
where b.list = 1;

Eliminating the subquery might make the locking issue go away. MySQL does have some issues combining a select and an update on the same table in a given query.

#7

Try this. I wrote several of these today.

update b
set
   Firstname = k.Voornaam,
   Middlename = k.Tussenvoegsel,
   Surname = k.Achternaam,
   email = k.[e-mail]
from
   mysqlserver...subscribers b
   inner join tblkandidaat k on b.kandidaatid = k.kandidaatid
where
   b.list=1
   and (
      ISNULL(b.firstname,'') COLLATE Latin1_General_CI_AI <> ISNULL(k.Voornaam,'')
      or ISNULL(b.middlename,'') COLLATE Latin1_General_CI_AI <> ISNULL(k.Tussenvoegsel,'')
      or ISNULL(b.surname,'') COLLATE Latin1_General_CI_AI <> ISNULL(k.Achternaam,'')
      or ISNULL(b.email,'') COLLATE Latin1_General_CI_AI <> ISNULL(k.[e-mail],'')
   )

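Using ISNULL lets the comparison cope with nullable columns. Without it, a comparison against NULL evaluates to unknown, so a row where one side is NULL is never picked up as changed.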
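
A small standalone check of that behaviour, assuming the default ANSI_NULLS ON setting. The variable names and the sample value are made up for illustration.

```sql
-- Hypothetical values: the remote column is NULL, the local one is filled in.
DECLARE @remote varchar(50) = NULL;
DECLARE @local  varchar(50) = 'Jan';

SELECT
    -- NULL <> 'Jan' evaluates to UNKNOWN, so the ELSE branch is taken.
    CASE WHEN @remote <> @local
         THEN 'changed' ELSE 'not detected' END AS plain_compare,
    -- ISNULL turns the NULL into '', and '' <> 'Jan' is TRUE.
    CASE WHEN ISNULL(@remote, '') <> @local
         THEN 'changed' ELSE 'not detected' END AS isnull_compare;
```

One side effect of this pattern is that NULL and the empty string are treated as equal, so a row that differs only in NULL versus '' will not be updated.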
