使用子查询提高UPDATE查询的性能

时间:2020-11-29 07:41:41

I have a table of ZipCodes, that contains US and Canadian postal codes and related information - including Latitude and Longitude.

我有一个ZipCodes表,其中包含美国和加拿大的邮政编码和相关信息 - 包括纬度和经度。

I have records being inserted into a table from a separate application that I cannot modify. These records lack latitude and longitude data, so I'd like to populate the records accordingly. (Before anyone asks, doing a JOIN with the ZipCodes table at runtime didn't seem to be an option because of performance, which is why I've ended up where i am.)

我有记录从单独的应用程序插入到表中,我无法修改。这些记录缺少纬度和经度数据,因此我想相应地填充记录。 (在任何人问之前,在运行时使用ZipCodes表进行JOIN似乎不是一个选择,因为性能,这就是为什么我结束了我的位置。)

I have the query shown below, which is to be set to run as a scheduled task, that should update any records that have not yet had their geo data looked up. This query (100 at a time) takes about 1m 35s to run.

我有下面显示的查询,该查询将设置为作为计划任务运行,该任务应更新尚未查找其地理数据的任何记录。此查询(每次100个)大约需要1分35秒才能运行。

I am seeking any and all options to what i'm doing here, that may improve performance.

我正在寻找我在这里所做的任何和所有选择,这可能会提高性能。

I resorted to a scheduled task because I fear having the entire server being brought to its knees with INSTEAD OF INSERT or AFTER INSERT triggers to do the same thing... but that would have been my preference. Given the current performance issues, triggers seem out of the question.

我使用了预定的任务,因为我担心让整个服务器与INSTEAD OF INSERT或者INSERT INSERT触发器一起做同样的事情......但这本来是我的偏好。鉴于当前的性能问题,触发器似乎是不可能的。

The ZipCodes table has a non-clustered index on CityName and ProvinceAbbr, with additional included columns PostalCode, Latitude and Longitude. I have Allow Row Locks and Allow Page Locks set to false. The data in this table will change MAYBE once a quarter, so dirty reads are fine.

ZipCodes表在CityName和ProvinceAbbr上具有非聚集索引,其他包括PostalCode,Latitude和Longitude列。我允许行锁定和允许页面锁定设置为false。此表中的数据将每季度更改一次MAYBE,因此脏读是正常的。

I can provide execution plan results if necessary... but not sure how exactly so give me a hint as to how to generate copy/paste material. =D

如果需要,我可以提供执行计划结果......但不确定如何给我一个关于如何生成复制/粘贴材料的提示。 = d

The ZipCodes table has 947,172 records, and the LoadsAvail table has around 38k records at any given time... records are being inserted, updated, and deleted in real time/constantly, sometimes in larger batches (I'd say maximum of 20 inserted at a time, usually more like one or two at a time).

ZipCodes表有947,172条记录,LoadsAvail表在任何给定时间都有大约38,000条记录...记录正在实时/不断地插入,更新和删除,有时是大批量(我说最多插入20条)在一个时间,通常更像一次一两个)。

Once most of the records have their geo data in place, there will probably be less than a 100 records being updated every minute, but I'd like to know if there's just a better way to do all this.

一旦大多数记录都有他们的地理数据,每分钟可能会有不到100条记录被更新,但我想知道是否有更好的方法来完成所有这些。

UPDATE TOP (100)
    LoadsAvail

SET
    coordinatesChecked = 1,
    FromLatitude =  (   SELECT TOP (1) Latitude
                                        FROM ZipCodes AS ZipCodes_1 WITH(NOLOCK)
                                        WHERE (CityName = loadsavail.FromCity) AND (ProvinceAbbr = loadsavail.FromState)
                                    ),
    FromLongitude = (   SELECT TOP (1) Longitude
                                        FROM ZipCodes AS ZipCodes_2 WITH(NOLOCK)
                                        WHERE (CityName = loadsavail.FromCity) AND (ProvinceAbbr = loadsavail.FromState)
                                    ),
    ToLatitude =    (   SELECT TOP (1) Latitude
                                    FROM ZipCodes AS ZipCodes_3 WITH(NOLOCK)
                                    WHERE        (CityName = loadsavail.ToCity) AND (ProvinceAbbr = loadsavail.toState)
                                ),
    ToLongitude =   (   SELECT TOP (1) Longitude
                                    FROM ZipCodes AS ZipCodes_4 WITH(NOLOCK)
                                    WHERE        (CityName = loadsavail.ToCity) AND (ProvinceAbbr = loadsavail.toState)
                                )
WHERE
    coordinatesChecked = 0

1 个解决方案

#1


3  

You can use Joins instead of subqueries:

您可以使用联接而不是子查询:

UPDATE T1
SET T1.coordinatesChecked = 1,
    T1.FromLatitude =  T2.Latitude,
    T1.FromLongitude = T2.Longitude
    T1.ToLatitude =    T3.Latitude,
    T1.ToLongitude =   T3.Longitude
FROM            LoadsAvail AS T1 LEFT JOIN ZipCodes AS T2 ON T1.FromCity = T2.CityName AND T1.FromState = T2.ProvinceAbbr
                                 LEFT JOIN ZipCodes AS T3 ON T1.toCity = T3.CityName AND T1.toState = T3.ProvinceAbbr
WHERE    T1.coordinatesChecked = 0

#1


3  

You can use Joins instead of subqueries:

您可以使用联接而不是子查询:

UPDATE T1
SET T1.coordinatesChecked = 1,
    T1.FromLatitude =  T2.Latitude,
    T1.FromLongitude = T2.Longitude
    T1.ToLatitude =    T3.Latitude,
    T1.ToLongitude =   T3.Longitude
FROM            LoadsAvail AS T1 LEFT JOIN ZipCodes AS T2 ON T1.FromCity = T2.CityName AND T1.FromState = T2.ProvinceAbbr
                                 LEFT JOIN ZipCodes AS T3 ON T1.toCity = T3.CityName AND T1.toState = T3.ProvinceAbbr
WHERE    T1.coordinatesChecked = 0