在聚合函数SQL Server中选择N行

时间:2021-10-06 20:12:05

I have a table that looks like this:

我有一张这样的桌子:

+--------+----------+--------+------------+-------+
|   ID   | CHANNEL  | VENDOR | num_PERIOD | SALES |
+--------+----------+--------+------------+-------+
| 000001 | Business | Shop   |          1 | 40    |
| 000001 | Business | Shop   |          2 | 60    |
| 000001 | Business | Shop   |          3 | NULL  |
+--------+----------+--------+------------+-------+

With many combinations of ID, CHANNEL and VENDOR, and sales records for each of them over time (num_PERIOD).

有许多ID、通道和供应商的组合,以及每个组合的销售记录(num_PERIOD)。

The idea is to obtain a new column which returns the number of NULLS in SALES column, but in the first 111 registers according to num_PERIOD column.

我们的想法是获得一个新的列,该列返回SALES列中的NULLS数目,但是在前111个寄存器中根据num_PERIOD列返回。

I have been trying something like this:

我一直在尝试这样的东西:

SELECT ID,
       CHANNEL,
       VENDOR,
       sum(CASE
               WHEN SALES IS NULL THEN 1
               ELSE 0
           END) OVER (PARTITION BY ID,
                                   CHANNEL,
                                   VENDOR
                      ORDER BY num_PERIOD ROWS BETWEEN UNBOUNDED PRECEDING AND 111 FOLLOWING) AS NULL_SALES_SET
FROM TABLE
GROUP BY ID,
         CHANNEL,
         VENDOR

But I'm not obtaining what I'm looking for.

但是我没有得到我想要的东西。

So to obtain a table simillar to:

以便获得表明喻:

+--------+--------------+--------+----------------+
|   ID   |   CHANNEL    | VENDOR | NULL_SALES_SET |
+--------+--------------+--------+----------------+
| 000001 | Business     | Shop   |              1 |
| 000002 | Business     | Market |              0 |
| 000002 | Non Business | Shop   |              3 |
+--------+--------------+--------+----------------+

The difficulty comes when selecting these first 111 rows per ID, CHANNEL AND VENDOR ordered by num_PERIOD.

当根据num_PERIOD为每个ID、通道和供应商选择前111行时,会遇到困难。

2 个解决方案

#1


1  

Do you have to use the windowing function?

你需要使用窗口功能吗?

SELECT ID
     , CHANNEL
     , VENDOR
     , NULL_SALES_SET = SUM(CASE WHEN SALES IS NULL THEN 1 ELSE 0 END)
  FROM Table
 WHERE num_PERIOD <= 111
 GROUP BY ID, CHANNEL, VENDOR

Or are you looking for the first 111 num_PERIOD values allowing for gaps in the num_PERIOD column?

或者您正在寻找第一个111个num_PERIOD值,以允许num_PERIOD列中出现空白?

SELECT t.ID
     , t.CHANNEL
     , t.VENDOR
     , NULL_SALES_SET = SUM(CASE WHEN t.SALES IS NULL THEN 1 ELSE 0 END)
  FROM Table t
        INNER JOIN ( SELECT i.ID
                          , i.CHANNEL
                          , i.VENDOR
                          , i.num_PERIOD
                          , rowNum = ROW_NUMBER(PARTITION BY i.ID, i.CHANNEL, i.VENDOR ORDER BY i.num_PERIOD)
                       FROM Table i ) l
          ON t.ID = l.ID
         AND t.CHANNEL = l.CHANNEL
         AND t.VENDOR = l.VENDOR
         AND t.num_PERIOD = l.num_PERIOD
 WHERE l.rowNum <= 111
 GROUP BY ID, CHANNEL, VENDOR

Edit: Not sure how I overlooked it, but it is necessary to JOIN on the num_PERIOD column.

编辑:不知道我是如何忽略它的,但是有必要加入num_PERIOD列。

Edit: Add the number of distinct num_PERIOD per ID, Channel, Vendor without affecting the NULL_SALES_SET

编辑:在不影响NULL_SALES_SET的情况下,为每个ID、通道、供应商添加不同的num_PERIOD数

SELECT t.ID
     , t.CHANNEL
     , t.VENDOR
       -- Counts the NULL Sales when the num_PERIOD is in the 
       -- first 111 num_PERIODs
     , NULL_SALES_SET = SUM(CASE WHEN l.rowNum IS NOT NULL AND t.SALES IS NULL 
                                   THEN 1 
                                 ELSE 0 END)
       -- Counts the distinct num_PERIOD values
     , PERIOD_COUNT = COUNT(DISTINCT t.num_PERIOD)
  FROM Table t
        LEFT OUTER JOIN ( SELECT i.ID
                               , i.CHANNEL
                               , i.VENDOR
                               , i.num_PERIOD
                               , rowNum = ROW_NUMBER(PARTITION BY i.ID,
                                                                  i.CHANNEL,
                                                                  i.VENDOR
                                                     ORDER BY i.num_PERIOD)
                          FROM Table i ) l
          ON t.ID = l.ID
         AND t.CHANNEL = l.CHANNEL
         AND t.VENDOR = l.VENDOR
         AND t.num_PERIOD = l.num_PERIOD
         AND l.rowNum <= 111
 GROUP BY ID, CHANNEL, VENDOR

#2


2  

Use a CTE (Common Table Expression) with the ROW_NUMBER windowed function and you should be set:

使用带有ROW_NUMBER窗口函数的CTE(公共表表达式),应该设置为:

;WITH MyCTE AS
(
    SELECT
        id,
        channel,
        vendor,
        sales,
        ROW_NUMBER() OVER (PARTITION BY id, channel, vendor ORDER BY num_period) AS row_num
    FROM
        MyTable
)
SELECT
    id,
    channel,
    vendor,
    SUM(CASE WHEN sales IS NULL THEN 1 ELSE 0 END) AS null_sales_set
FROM
    MyCTE
WHERE
    row_num <= 111
GROUP BY
    id, channel, vendor

#1


1  

Do you have to use the windowing function?

你需要使用窗口功能吗?

SELECT ID
     , CHANNEL
     , VENDOR
     , NULL_SALES_SET = SUM(CASE WHEN SALES IS NULL THEN 1 ELSE 0 END)
  FROM Table
 WHERE num_PERIOD <= 111
 GROUP BY ID, CHANNEL, VENDOR

Or are you looking for the first 111 num_PERIOD values allowing for gaps in the num_PERIOD column?

或者您正在寻找第一个111个num_PERIOD值,以允许num_PERIOD列中出现空白?

SELECT t.ID
     , t.CHANNEL
     , t.VENDOR
     , NULL_SALES_SET = SUM(CASE WHEN t.SALES IS NULL THEN 1 ELSE 0 END)
  FROM Table t
        INNER JOIN ( SELECT i.ID
                          , i.CHANNEL
                          , i.VENDOR
                          , i.num_PERIOD
                          , rowNum = ROW_NUMBER(PARTITION BY i.ID, i.CHANNEL, i.VENDOR ORDER BY i.num_PERIOD)
                       FROM Table i ) l
          ON t.ID = l.ID
         AND t.CHANNEL = l.CHANNEL
         AND t.VENDOR = l.VENDOR
         AND t.num_PERIOD = l.num_PERIOD
 WHERE l.rowNum <= 111
 GROUP BY ID, CHANNEL, VENDOR

Edit: Not sure how I overlooked it, but it is necessary to JOIN on the num_PERIOD column.

编辑:不知道我是如何忽略它的,但是有必要加入num_PERIOD列。

Edit: Add the number of distinct num_PERIOD per ID, Channel, Vendor without affecting the NULL_SALES_SET

编辑:在不影响NULL_SALES_SET的情况下,为每个ID、通道、供应商添加不同的num_PERIOD数

SELECT t.ID
     , t.CHANNEL
     , t.VENDOR
       -- Counts the NULL Sales when the num_PERIOD is in the 
       -- first 111 num_PERIODs
     , NULL_SALES_SET = SUM(CASE WHEN l.rowNum IS NOT NULL AND t.SALES IS NULL 
                                   THEN 1 
                                 ELSE 0 END)
       -- Counts the distinct num_PERIOD values
     , PERIOD_COUNT = COUNT(DISTINCT t.num_PERIOD)
  FROM Table t
        LEFT OUTER JOIN ( SELECT i.ID
                               , i.CHANNEL
                               , i.VENDOR
                               , i.num_PERIOD
                               , rowNum = ROW_NUMBER(PARTITION BY i.ID,
                                                                  i.CHANNEL,
                                                                  i.VENDOR
                                                     ORDER BY i.num_PERIOD)
                          FROM Table i ) l
          ON t.ID = l.ID
         AND t.CHANNEL = l.CHANNEL
         AND t.VENDOR = l.VENDOR
         AND t.num_PERIOD = l.num_PERIOD
         AND l.rowNum <= 111
 GROUP BY ID, CHANNEL, VENDOR

#2


2  

Use a CTE (Common Table Expression) with the ROW_NUMBER windowed function and you should be set:

使用带有ROW_NUMBER窗口函数的CTE(公共表表达式),应该设置为:

;WITH MyCTE AS
(
    SELECT
        id,
        channel,
        vendor,
        sales,
        ROW_NUMBER() OVER (PARTITION BY id, channel, vendor ORDER BY num_period) AS row_num
    FROM
        MyTable
)
SELECT
    id,
    channel,
    vendor,
    SUM(CASE WHEN sales IS NULL THEN 1 ELSE 0 END) AS null_sales_set
FROM
    MyCTE
WHERE
    row_num <= 111
GROUP BY
    id, channel, vendor