SQL Server - Looping through a table and updating based on counts

Time: 2022-03-12 23:08:27

I have a SQL Server database. I need to loop through a table to get the count of each value in the column 'RevID'. Each value should appear in the table only a certain number of times, for example 125. If the count for a value is greater or less than 125, I need to update the column so that every value of RevID (there are over 25 distinct values) ends up with roughly 125 rows (being a few off is fine).

For example, if the count for RevID = 'A2' is 45 and the count for RevID = 'B2' is 165, I need to update RevID so that the 45 increases and the 165 decreases until both are close to 125.
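
A query along these lines could show how far each value is from the target before anything is updated. It is only a sketch: MyTable and RevID come from the question, while @Target, CurrentCount and Surplus are names assumed here for illustration.

```sql
-- Sketch only: @Target, CurrentCount and Surplus are assumed names.
DECLARE @Target INT = 125;

SELECT
    RevID
    ,COUNT(*) AS CurrentCount
    ,COUNT(*) - @Target AS Surplus  -- positive: rows to give away; negative: rows needed
FROM
    MyTable
GROUP BY
    RevID
ORDER BY
    Surplus DESC;
```

With the counts in the example above, A2 would show a Surplus of -80 and B2 a Surplus of 40.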

This is what I have so far:

DECLARE @i INT = 1,
        @RevCnt INT = SELECT RevId, COUNT(RevId) FROM MyTable group by RevId

WHILE(@RevCnt >= 50)
BEGIN
    UPDATE MyTable 
    SET RevID= (SELECT COUNT(RevID) FROM MyTable) 
    WHERE RevID < 50)

    @i = @i + 1       
END

I have also experimented with a cursor and an INSTEAD OF trigger. Any ideas on how to achieve this? Thanks for any input.

3 Solutions

#1


0  

Okay, I came back to this because I found it interesting, even though there are clearly some business rules/discussion that you, I, and others are not seeing. If you want to distribute the values evenly and arbitrarily, there are a few ways to do it: building recursive Common Table Expressions [CTE], building temp tables, and more. Here is the way I decided to try. I did use one temp table, because SQL Server was producing slightly inconsistent results about every 10th run when the main logic table was a CTE, and the temp table seems to have cleared that up. This spreads the rows evenly across the RevIds, picking the rows to change arbitrarily and randomly assigning any remainder (# of records % # of RevIds) to randomly chosen RevIds. The script also doesn't rely on a unique ID; it works dynamically over row numbers it creates. Just take out the test data and you have what you more than likely want, though rebuilding the table/values would probably be easier.

--Build Some Test Data
DECLARE @Table AS TABLE (RevId VARCHAR(10))
DECLARE @C AS INT = 1
WHILE @C <= 400
BEGIN

    IF @C <= 200
    BEGIN
       INSERT INTO @Table (RevId) VALUES ('A1')
    END

    IF @c <= 170
    BEGIN
       INSERT INTO @Table (RevId) VALUES ('B2')
    END

    IF @c <= 100
    BEGIN
       INSERT INTO @Table (RevId) VALUES ('C3')
    END

    IF @c <= 400
    BEGIN
       INSERT INTO @Table (RevId) VALUES ('D4')
    END

    IF @c <= 1
    BEGIN
       INSERT INTO @Table (RevId) VALUES ('E5')
    END

    SET @C = @C+ 1
END

--save starting counts of test data to temp table to compare with later
IF OBJECT_ID('tempdb..#StartingCounts') IS NOT NULL
    BEGIN
        DROP TABLE #StartingCounts
    END

SELECT
    RevId
    ,COUNT(*) as Occurences
    INTO #StartingCounts
FROM
    @Table
GROUP BY
    RevId
ORDER BY
    RevId


/************************ This is the main method **********************************/
--clear temp table that is the main processing logic
IF OBJECT_ID('tempdb..#RowNumsToChange') IS NOT NULL
    BEGIN
        DROP TABLE #RowNumsToChange
    END

--figure out how many records there are and how many there should be for each RevId
;WITH cteTargetNumbers AS (
    SELECT
       RevId
       --,COUNT(*) as RevIdCount
       --,SUM(COUNT(*)) OVER (PARTITION BY 1) / COUNT(*) OVER (PARTITION BY 1) +
          --CASE
             --WHEN ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY NEWID()) <= 
                --SUM(COUNT(*)) OVER (PARTITION BY 1) % COUNT(*) OVER (PARTITION BY 1)
             --THEN 1
             --ELSE 0
          --END as TargetNumOfRecords
       ,SUM(COUNT(*)) OVER (PARTITION BY 1) / COUNT(*) OVER (PARTITION BY 1) +
          CASE
             WHEN ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY NEWID()) <= 
                SUM(COUNT(*)) OVER (PARTITION BY 1) % COUNT(*) OVER (PARTITION BY 1)
             THEN 1
             ELSE 0
          END - COUNT(*) AS NumRecordsToUpdate
    FROM
       @Table
    GROUP BY
       RevId
)

, cteEndRowNumsToChange AS (
    SELECT *
       ,SUM(CASE WHEN NumRecordsToUpdate > 0 THEN NumRecordsToUpdate ELSE 0 END)
             OVER (PARTITION BY 1 ORDER BY RevId) AS ChangeEndRowNum
    FROM
       cteTargetNumbers
)

SELECT
    *
    ,LAG(ChangeEndRowNum,1,0) OVER (PARTITION BY 1 ORDER BY RevId) as ChangeStartRowNum
    INTO #RowNumsToChange
FROM
    cteEndRowNumsToChange


;WITH cteOriginalTableRowNum AS (
    SELECT
       RevId
       ,ROW_NUMBER() OVER (PARTITION BY RevId ORDER BY (SELECT 0)) as RowNumByRevId
    FROM
       @Table t
)

, cteRecordsAllowedToChange AS (
    SELECT
       o.RevId
       ,o.RowNumByRevId
       ,ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY (SELECT 0)) as ChangeRowNum
    FROM
       cteOriginalTableRowNum o
       INNER JOIN #RowNumsToChange t
       ON o.RevId = t.RevId
       AND t.NumRecordsToUpdate < 0
       AND o.RowNumByRevId <= ABS(t.NumRecordsToUpdate)
)

UPDATE o
    SET RevId = u.RevId
FROM
    cteOriginalTableRowNum o
    INNER JOIN cteRecordsAllowedToChange c
    ON o.RevId = c.RevId
    AND o.RowNumByRevId = c.RowNumByRevId
    INNER JOIN #RowNumsToChange u
    ON c.ChangeRowNum > u.ChangeStartRowNum
    AND c.ChangeRowNum <= u.ChangeEndRowNum
    AND u.NumRecordsToUpdate > 0

IF OBJECT_ID('tempdb..#RowNumsToChange') IS NOT NULL
    BEGIN
        DROP TABLE #RowNumsToChange
    END

/***************************** End of Main Method *******************************/

-- Compare the results and clean up

;WITH ctePostUpdateResults AS (
    SELECT
       RevId
       ,COUNT(*) as AfterChangeOccurences
    FROM
       @Table
    GROUP BY
       RevId
)

SELECT *
FROM
    #StartingCounts s
    INNER JOIN ctePostUpdateResults r
    ON s.RevId = r.RevId
ORDER BY
    s.RevId

IF OBJECT_ID('tempdb..#StartingCounts') IS NOT NULL
    BEGIN
        DROP TABLE #StartingCounts
    END
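
As a rough check on what the script aims for, the target per RevId is the integer quotient of total rows by distinct RevIds, with the remainder deciding how many RevIds get one extra row. The sketch below redoes that arithmetic for the test data above (200 + 170 + 100 + 400 + 1 rows); the variable and column names are assumptions made for illustration.

```sql
-- Totals taken from the test data above: 871 rows spread over 5 RevIds.
DECLARE @TotalRows INT = 871, @RevIdCount INT = 5;

SELECT
    @TotalRows / @RevIdCount AS BaseTarget        -- integer division: 871 / 5 = 174
    ,@TotalRows % @RevIdCount AS RevIdsWithExtra; -- remainder: 871 % 5 = 1, so one RevId ends at 175
```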

#2


0  

Since you've given no rules for how you'd like the balance to operate we're left to speculate. Here's an approach that would find the most overrepresented value and then find an underrepresented value that can take on the entire overage.

I have no idea how optimal this is and it will probably run in an infinite loop without more logic.

declare @balance int = 125;

declare @cnt_over  int;
declare @cnt_under int;
declare @revID_overrepresented  varchar(32);
declare @revID_underrepresented varchar(32);

declare @rowcount int = 1;

while @rowcount > 0
begin
    select top 1 @revID_overrepresented = RevID, @cnt_over = count(*)
    from T
    group by RevID
    having count(*) > @balance
    order by count(*) desc

    select top 1 @revID_underrepresented = RevID, @cnt_under = count(*)
    from T
    group by RevID
    having count(*) <= 2 * @balance - @cnt_over  -- room to absorb the whole overage (@cnt_over - @balance)
    order by count(*) desc

    update top (@cnt_over - @balance) T
    set RevId = @revID_underrepresented
    where RevId = @revID_overrepresented;

    set @rowcount = @@rowcount;
end
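
One way to keep a loop like this from spinning forever is to clear the picked values on every pass and stop as soon as either lookup comes back empty. The following is a self-contained sketch under assumptions of its own, not this answer's exact method: it uses a table variable @T with made-up sample rows (165 'B2' and 45 'A2', borrowed from the question), shortens the variable names, relies on sys.all_objects only as a convenient row source, and simply hands the overage to the least-represented value.

```sql
-- Hypothetical sample data: 165 rows of 'B2' and 45 rows of 'A2'.
declare @T table (RevID varchar(32));
insert into @T (RevID) select top (165) 'B2' from sys.all_objects;
insert into @T (RevID) select top (45)  'A2' from sys.all_objects;

declare @balance int = 125;
declare @over varchar(32), @under varchar(32), @cnt_over int;

while 1 = 1
begin
    -- clear the picks so an empty result cannot reuse values from the previous pass
    select @over = null, @under = null, @cnt_over = null;

    -- most overrepresented value
    select top (1) @over = RevID, @cnt_over = count(*)
    from @T
    group by RevID
    having count(*) > @balance
    order by count(*) desc;

    if @over is null break;   -- nothing is above the target any more

    -- least represented value below the target
    select top (1) @under = RevID
    from @T
    group by RevID
    having count(*) < @balance
    order by count(*) asc;

    if @under is null break;  -- no value left that can take rows

    -- move the overage; which rows are picked is arbitrary
    update top (@cnt_over - @balance) @T
    set RevID = @under
    where RevID = @over;
end

select RevID, count(*) as cnt
from @T
group by RevID
order by RevID;
```

On this sample the final SELECT should report A2 = 85 and B2 = 125: the 40 surplus B2 rows move to A2, and the loop then exits because nothing is above @balance. A receiving value can itself end up above the target, in which case a later pass redistributes its smaller surplus.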

#3


-1  

The problem is I don't even know what you mean by balance. You say it needs to be evenly represented, but it seems like you want it to be 125. 125 is not "even", it is just 125.

I can't tell what you are trying to do, but I'm guessing this is not really an SQL problem. But you can use SQL to help. Here is some helpful SQL for you. You can use this in your language of choice to solve the problem.

Find the rev values and their counts:

SELECT RevID, COUNT(*)
FROM MyTable
GROUP BY RevID

Update @X rows (with RevID of value @RevID) to a new value @NewValue

UPDATE TOP (@X) MyTable
  SET RevID = @NewValue
WHERE RevID = @RevID

Using these two queries you should be able to apply your business rules (which you never specified) in a loop or whatever to change the data.
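
For instance, a single step of such a loop might look like the sketch below, wrapped in a transaction so the effect can be inspected and rolled back. MyTable and RevID are from the question; the values 40, 'B2' and 'A2' are assumptions taken from the question's example (B2 is 40 over the target of 125), and VARCHAR(10) is a guess at the column type.

```sql
-- Assumed values: move 40 rows from 'B2' to 'A2'.
DECLARE @X INT = 40;
DECLARE @RevID VARCHAR(10) = 'B2';
DECLARE @NewValue VARCHAR(10) = 'A2';

BEGIN TRANSACTION;

-- TOP without an ordering picks an arbitrary set of matching rows
UPDATE TOP (@X) MyTable
SET RevID = @NewValue
WHERE RevID = @RevID;

SELECT @@ROWCOUNT AS RowsMoved;

-- inspect the counts after the move
SELECT RevID, COUNT(*) AS CurrentCount
FROM MyTable
GROUP BY RevID
ORDER BY RevID;

ROLLBACK TRANSACTION;  -- switch to COMMIT TRANSACTION once the counts look right
```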
