Group every N records in T-SQL

Time: 2021-09-10 01:27:37

I have some performance test results in a database, and I want to group every 1000 records (sorted in ascending order by date) and then aggregate each group with AVG.

I'm actually looking for a standard SQL solution, however any T-SQL specific results are also appreciated.

3 solutions

#1


17  

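WITH T AS (
  -- Number the rows in the desired order. RANK() gives tied ORDER BY values
  -- the same rank, so use ROW_NUMBER() instead if ID is not unique.
  SELECT RANK() OVER (ORDER BY ID) Rank,
    P.Field1, P.Field2, P.Value1, ...
  FROM P
)
-- Integer division maps ranks 1-1000 to group 0, 1001-2000 to group 1, etc.
SELECT (Rank - 1) / 1000 GroupID, AVG(...)
FROM T
GROUP BY ((Rank - 1) / 1000)
;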

Something like that should get you started. If you can provide your actual schema I can update as appropriate.
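
To make the idea concrete, here is a minimal runnable sketch of the same bucketing, run against an in-memory SQLite database from Python so it can be tried without a SQL Server instance. The table `results`, its columns `run_date` and `value`, and the helper `average_by_chunk` are hypothetical names chosen for illustration. It swaps in ROW_NUMBER() for RANK() and orders by date (with `id` as a tie-breaker), since the question sorts by date and dates may repeat. It assumes an SQLite build with window functions (3.25 or later).

```python
import sqlite3
from datetime import datetime, timedelta

# Number the rows by date, then integer-divide the row number into chunks.
CHUNK_SQL = """
WITH numbered AS (
    SELECT ROW_NUMBER() OVER (ORDER BY run_date, id) AS rn, value
    FROM results
)
SELECT (rn - 1) / :size AS group_id, COUNT(*) AS n, AVG(value) AS avg_value
FROM numbered
GROUP BY (rn - 1) / :size
ORDER BY group_id
"""


def average_by_chunk(conn, size):
    """Return (group_id, row_count, average) for every `size` rows by date."""
    return conn.execute(CHUNK_SQL, {"size": size}).fetchall()


conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per performance test result.
conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, run_date TEXT, value REAL)")
start = datetime(2021, 1, 1)
# 2,500 synthetic rows; value equals id so the averages are easy to check by hand.
conn.executemany(
    "INSERT INTO results (id, run_date, value) VALUES (?, ?, ?)",
    [(i, (start + timedelta(seconds=i)).isoformat(), float(i)) for i in range(1, 2501)],
)

groups = average_by_chunk(conn, 1000)
for group_id, n, avg_value in groups:
    print(group_id, n, avg_value)
# 0 1000 500.5
# 1 1000 1500.5
# 2 500 2250.5
```

The last group holds only the 500 leftover rows. Note that the bucketing relies on `/` performing integer division on integers, which holds in T-SQL and SQLite but not in every SQL dialect.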

#2


6  

I +1'd @Yuck, because I think that is a good answer. But it's worth mentioning NTILE().

Reason being, if you have 10,010 records (for example), then you'll have 11 groupings -- the first 10 with 1000 in them, and the last with just 10.
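
The arithmetic behind that can be checked with a few lines of Python. This is only a sketch of the group sizes that fixed-size bucketing produces; the function name `fixed_size_groups` is made up for illustration.

```python
def fixed_size_groups(total_rows, group_size):
    """Sizes of the groups produced by (row_number - 1) / group_size."""
    full, remainder = divmod(total_rows, group_size)
    # Every full group has group_size rows; any leftover rows form one short group.
    return [group_size] * full + ([remainder] if remainder else [])


sizes = fixed_size_groups(10_010, 1000)
print(len(sizes), sizes[0], sizes[-1])  # 11 1000 10
```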

If you're comparing averages between each group of 1000, then you should either discard the last group as it's not a representative group, or...you could make all the groups the same size.

NTILE() would make all groups the same size (or within one row of each other when the row count doesn't divide evenly); the only caveat is that you'd need to know how many groups you wanted.

So if your table had 25,250 records, you'd use NTILE(25), and your groupings would be approximately 1000 in size -- they'd actually be 1010 in size; the benefit being, they'd all be the same size, which might make them more relevant to each other in terms of whatever comparison analysis you're doing.

You could get the number of groups simply by

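-- Number of groups of roughly 1000 rows (integer division rounds down).
-- NTILE() requires a positive argument, so this assumes at least 1000 rows.
DECLARE @ntile int
SET @ntile = (SELECT COUNT(1) FROM dbo.myTable) / 1000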

And then modifying @Yuck's approach with the NTILE() substitution:

;WITH myCTE AS (
  -- NTILE(@ntile) splits the ordered rows into @ntile near-equal groups.
  SELECT NTILE(@ntile) OVER (ORDER BY id) myGroup,
    col1, col2, ...
  FROM dbo.myTable
)
-- One averaged row per group.
SELECT myGroup, AVG(col1), AVG(col2), ...
FROM myCTE
GROUP BY myGroup
;
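
As a rough check of how NTILE() sizes its groups, the sketch below runs the same pattern against an in-memory SQLite database from Python. The table `my_table`, the column `col1`, and the helper `ntile_group_sizes` are hypothetical names for illustration, and it assumes an SQLite build with window functions (3.25 or later). The group count is clamped to at least 1, an added guard for tables with fewer rows than the target size.

```python
import sqlite3


def ntile_group_sizes(total_rows, target_size=1000):
    """Build a table of total_rows rows and return the size of each NTILE group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, col1 REAL)")
    conn.executemany(
        "INSERT INTO my_table (id, col1) VALUES (?, ?)",
        [(i, float(i)) for i in range(1, total_rows + 1)],
    )
    (row_count,) = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()
    # Integer division rounds down; clamp to 1 so NTILE never receives 0.
    n_groups = max(row_count // target_size, 1)
    # n_groups is an int computed above, so interpolating it into the SQL is safe.
    sql = f"""
        WITH my_cte AS (
            SELECT NTILE({n_groups}) OVER (ORDER BY id) AS my_group, col1
            FROM my_table
        )
        SELECT my_group, COUNT(*) AS n, AVG(col1) AS avg_col1
        FROM my_cte
        GROUP BY my_group
        ORDER BY my_group
    """
    rows = conn.execute(sql).fetchall()
    conn.close()
    return [n for _, n, _ in rows]


print(ntile_group_sizes(25_250)[:3])  # [1010, 1010, 1010]
print(len(ntile_group_sizes(10_010)))  # 10
```

With 10,010 rows this yields 10 groups of 1,001 rows each, rather than 10 full groups plus a leftover group of 10.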

#3


5  

Credit for the answer goes to Yuck. I'm only posting this as an answer so that I can include a code block. I did a count test to see if it was grouping by 1000, and the first set was 999. The query below produced set sizes of 1,000. Great query, Yuck.

    WITH T AS (
      SELECT RANK() OVER (ORDER BY sID) Rank, sID
      FROM docSVsys
    )
    -- Count the rows in each group to verify the group sizes.
    SELECT (Rank-1) / 1000 GroupID, COUNT(sID)
    FROM T
    GROUP BY ((Rank-1) / 1000)
    ORDER BY GroupID
